CN113222328B - Air quality monitoring equipment point arrangement and site selection method based on road section pollution similarity - Google Patents
- Publication number
- CN113222328B (application CN202110320263.0A)
- Authority
- CN
- China
- Prior art keywords
- road
- sub
- pollution
- layer
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a method for siting air quality monitoring equipment based on road segment pollution similarity. The method comprises the following steps: constructing a graph convolutional network model and obtaining K exhaust pollution categories for the roads of an urban road network from the driving trajectories of motor vehicles in the network, combined with the topological connectivity of the urban road structure; then, according to the exhaust pollution category of each road, calculating the distribution information entropy of the roads in each category and marking their establishment priority, so that suitable monitoring points can be selected. The invention introduces a graph convolutional network, combines road topological connectivity, and integrates external factors such as meteorological conditions and point-of-interest (POI) distribution to extract spatial features. It makes full use of the topological structure and road connectivity information of the traffic network, together with easily obtained external traffic information, to derive a deployment priority for each road segment and to recommend deployment points to supervisory departments, which gives it broad applicability.
Description
Technical Field
The invention relates to the technical field of environmental monitoring, in particular to an air quality monitoring equipment point arrangement and site selection method based on road section pollution similarity.
Background
Accurately predicting the air quality distribution of urban areas is of great significance for government environmental governance and for the public's daily health protection. Siting the monitoring equipment is a prerequisite for predicting urban atmospheric pollutants: whether the air quality of a whole city can be predicted accurately with a small number of monitoring devices depends on whether those devices are reasonably placed.
Over the years, research on site selection has mainly used mathematical models and physical knowledge to solve the siting problem through computational simulation. These methods rely on a range of optimization techniques, including statistical methods such as correlation analysis and cluster analysis, or multi-objective optimization. To reach a steady state, the simulation requires complex system programming and consumes substantial computing power, and unrealistic assumptions and simplifications in the modeling further reduce model efficiency. In addition, these methods typically ignore the road network topology of urban areas and the external factors that influence the air quality distribution.
Disclosure of Invention
The invention provides a method for siting air quality monitoring equipment based on road segment pollution similarity, which addresses the technical problems described above.
To achieve this purpose, the invention adopts the following technical solution:
A method for siting air quality monitoring equipment based on road segment pollution similarity comprises the following steps:
constructing a graph convolutional network model and obtaining K exhaust pollution categories for the roads of the urban road network from the driving trajectories of motor vehicles in the network, combined with the topological connectivity of the urban road structure;
according to the exhaust pollution category of each road, calculating the distribution information entropy of the roads in each category and marking their establishment priority, so that suitable points are selected.
Further, obtaining the K exhaust pollution categories of the urban road network roads from the driving trajectories of motor vehicles in the network, combined with the topological connectivity of the urban road structure, specifically comprises the following steps:
S1, acquiring the driving trajectory data of motor vehicles in the urban road network and matching it to roads according to longitude and latitude information;
S2, dividing each trajectory into a series of sub-trajectories according to the distribution of road-segment intersections, i.e. the starting points of the road segments, and combining the sub-trajectories associated with the same road segment to form a set of initial clusters, expressed as $IC = \{ic_1, ic_2, \ldots, ic_n\}$, where $n$ is the total number of road segments and $ic_i = \{ic_{i1}, ic_{i2}, \ldots, ic_{iJ}\}$ indicates that the trajectory cluster of each road segment is composed of the sub-trajectories of all vehicle types;
S3, calculating the road pollution emission of the fleet, $X = \{ice_1, ice_2, \ldots, ice_N\}$, from localized emission factors according to the emission factor model, where $N$ is the total number of road segments, $ice_i = \{ice_{i1}, ice_{i2}, \ldots, ice_{id}\}$, and $d$ is the total number of time steps for which segment emissions are calculated; the total fleet emission of road segment $i$ at time $t$ is as follows:
where $Ef_m$ is the emission factor of the $m$-th vehicle type, $density(ic_{im})$ is the number of sub-trajectories of the $m$-th vehicle type on road segment $i$ at time $t$, $len(ic_i)$ is the length of road segment $i$, and $J$ is the number of vehicle types;
S3 further comprises a road-segment conversion that abstracts road segments into nodes, where a connection between two nodes represents the connectivity of the two corresponding road segments in the road network; in the road-segment connectivity graph $G = (V_0, \varepsilon, W)$, $V_0$ is a finite set of vertices made up of the $N$ road segments, $\varepsilon$ is the set of edges representing connectivity between road segments, and $W \in \mathbb{R}^{N \times N}$ is the adjacency weight matrix of $G$; each vertex $v_i \in V_0$ contains geographical location information; the adjacency matrix of the mobile-source pollution graph is calculated from the road-segment connectivity, and the final spatially weighted adjacency matrix can be written as:
where $dist_{ij}$ is the normalized distance between the geographical locations of road segments $v_i$ and $v_j$, and $link(i, j)$ indicates whether $v_i$ and $v_j$ are connected: if $link(i, j) = 1$, $v_i$ and $v_j$ are connected, otherwise they are not; $\theta^2 = 0.05$ is used to control the scale and sparsity of the adjacency matrix;
S3 further comprises inputting the exhaust emissions $X \in \mathbb{R}^{N \times d}$ of the urban road network into an end-to-end structural clustering framework to obtain the clustering result of the road segments, dividing the segments of the whole road network into several cluster sub-clusters $(X_1, X_2, \ldots, X_K)$, where $K$ is the number of clusters;
each row $x_i \in X$ represents the $i$-th sample, i.e. the $i$-th road segment, $N$ is the total number of road segments, and $d$ is the dimension, i.e. the total number of emission time steps.
Furthermore, the graph convolutional network model comprises a BERT module for extracting pollution emission patterns, a GCN module for extracting the road network structure, a DNN module for extracting external factors, and a dual self-supervision module that unifies emission-pattern feature extraction and road-network structure extraction.
Further, the BERT module extracts features from the exhaust emissions $X \in \mathbb{R}^{N \times d}$, using a Transformer encoder to iteratively compute, for the emissions $ice_i = \{ice_{i1}, ice_{i2}, \ldots, ice_{id}\}$ of each road segment, the hidden representation of each position $ice_{it}$ at each layer $l$; these representations are stacked to form the matrix $H_1^{(l)} \in \mathbb{R}^{d \times d_0}$, where $d_0$ is the feature dimension and $d$ is the length of the time series; each Transformer encoder layer is composed of two sublayers: a multi-head self-attention sublayer and a position-wise fully connected feed-forward network;
the multi-head attention sublayer projects $Q$, $K$, $V$ through $h$ different linear transformations and finally concatenates the attention results:
$MH(H_1^{(l)}) = [head_1; head_2; \ldots; head_h] W^O$
where the projection matrices of each head are all learnable parameters that are not shared between layers, and the Attention function uses scaled dot-product attention, in which the query $Q$, key $K$, and value $V$ are projected from the same matrix $H_1^{(l)}$, and a temperature term is introduced to produce a softer attention distribution and avoid extremely small gradients;
because the self-attention sublayer is based mainly on linear projections, the position-wise fully connected feed-forward network is applied to its output to give the model nonlinearity; the network is identical at each position and consists of two affine transformations with a Gaussian Error Linear Unit (GELU) activation in between:
$FFN(x) = GELU(xW^{(1)} + b^{(1)})W^{(2)} + b^{(2)}$
$GELU(x) = x\Phi(x)$, where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution, and $W^{(1)}$, $b^{(1)}$, $W^{(2)}$, $b^{(2)}$ are learnable parameters shared across all positions;
the Transformer sublayers are stacked to form a Transformer encoder layer, applying dropout to the output of each sublayer, a residual connection around each sublayer, and then layer normalization:
$Trm(H_1^{(l-1)}) = LN(A^{(l-1)} + Dropout(PFFN(A^{(l-1)})))$
$A^{(l-1)} = LN(H_1^{(l-1)} + Dropout(MH(H_1^{(l-1)})))$
where $LN$ is the layer normalization function;
position embeddings are added to the input, so that the input representation is constructed as the sum of the corresponding sequence value and its position embedding:
where $ice_{it} \in ice_i$ is the $d_0$-dimensional embedding of the emission at time $t$ and $p_t$ is the $d_0$-dimensional position embedding; fixed sinusoidal embeddings are used instead of learnable position embeddings;
in the decoder, an encoder-decoder attention sublayer is added between the multi-head self-attention sublayer and the position-wise feed-forward network; in the encoder-decoder attention, $Q$ comes from the output of the previous decoder layer while $K$ and $V$ come from the output of the encoder, and the computation is otherwise the same as in the encoder; the decoder outputs a reconstructed vector $X'$, with the following objective function:
Furthermore, the DNN module is used to extract external factors; the external factors, comprising meteorological data and urban point-of-interest data $Y \in \mathbb{R}^{N \times d'}$, provide additional information for feature extraction of road mobile-source emissions, where $d'$ is the dimension of the external factor data, and an autoencoder is used to learn the features of the external factors; assuming the autoencoder has $L$ layers in total and $l$ denotes a specific layer, the features learned by the $l$-th encoder layer are expressed as:
where $\phi$ is the activation function of the fully connected layer, such as the ReLU or Sigmoid function, the weight matrix and bias are those of the $l$-th encoder layer, and $H_2^{(0)}$ denotes the original data $Y$;
the encoder is followed by a decoder that reconstructs the input using several fully connected layers, expressed as:
where the weight matrix and bias are those of the decoder;
the decoder outputs the reconstructed external factor data $Y'$, with the following objective function:
Further, the graph convolutional network model also comprises a feature fusion module, which adds the external factor features and the emission pattern features of each layer to form the fused features:
$H^{(l)} = H_1^{(l)} + H_2^{(l)}$.
Furthermore, the GCN module is used to extract the road network structure: based on the road network connectivity graph $G = (V_0, \varepsilon, W)$, the fused features are integrated into the GCN so that the representation learned by the GCN accommodates three different kinds of information: the emission data itself, the external factors, and the relationships between data; the representation learned by the $l$-th GCN layer is obtained by the following convolution operation:
where $I$ is the identity matrix that adds a self-loop for each node to the adjacency matrix $W$, and $Z^{(l-1)}$ is propagated through the normalized adjacency matrix to obtain the new representation $Z^{(l)}$; since the fused features $H^{(l)}$ combine the time-series features of the emissions with the external factor features, $Z^{(l-1)}$ and $H^{(l-1)}$ are combined to give a stronger representation:
where $\epsilon$ is a balance coefficient; in this way, the BERT module, the DNN module, and the GCN module are connected layer by layer;
the fused feature $H^{(l-1)}$ is propagated through the normalized adjacency matrix; because the fused features of each layer carry different information, the fused features of each layer are passed to the corresponding GCN layer for information propagation, and the propagation operator is applied $L$ times in the whole model;
the input to the first GCN layer is the raw emission data $X$:
the last layer of the GCN module is a multi-class classification layer with a softmax function:
the result $z_{ij} \in Z$ is the probability that sample $i$ belongs to cluster center $j$, and $Z$ is a probability distribution.
Further, the probability distribution $Z$ is supervised using the target distribution $P$:
the overall loss function is:
where $\alpha$ is a coefficient that controls how strongly the GCN module disturbs the embedding space.
Further, calculating the distribution information entropy of the roads in each category according to the exhaust pollution categories of the roads, and marking the establishment priority so as to select suitable points, specifically comprises:
setting a cluster label for each road segment: the probability distribution $Z$ is taken as the final clustering result, and the cluster with the highest probability in $Z$ is assigned as the cluster label of road segment $i$:
that is, the road segments of the whole network can be divided into several cluster sub-clusters $(X_1, X_2, \ldots, X_K)$, where $K$ is the number of clusters; the sub-clusters $(X_1, X_2, \ldots, X_K)$ actually represent the $K$ categories of urban exhaust pollution;
for the sub-clusters $(X_1, X_2, \ldots, X_K)$ obtained by dividing the road segments of the whole road network, counting the number of roads $N_K$ contained in each sub-cluster $X_K$, and from it calculating the number $n_i$ of air quality monitoring stations to be built in each sub-cluster $X_K$:
for each sub-cluster $X_K$, calculating the distribution information entropy of the probability distribution $Z$ of each road $i$, defined as follows:
$H(Z) = -\sum [Z \log(Z) + (1 - Z) \log(1 - Z)]$
marking the establishment priority of each road segment $i$ according to the value of its distribution information entropy; the smaller the distribution information entropy, the higher the correlation between the road and the other roads of the sub-cluster $X_K$ to which it belongs, i.e. the better the road represents the corresponding pollution category sub-cluster $X_K$;
in each sub-cluster $X_K$, the establishment priority is marked according to the value of the distribution information entropy of each road $i$, and a road with a larger information entropy value is given a higher establishment priority;
in each sub-cluster $X_K$, building $n_i$ air quality monitoring stations according to the establishment priority of the roads.
Further, the balance coefficient $\epsilon$ is set to 0.5.
According to the above technical solution, the air quality monitoring equipment siting method based on road segment pollution similarity introduces, as its technical innovation, a graph convolutional network, combines road topological connectivity, and integrates external factors such as meteorological conditions and POI distribution to extract spatial features. The exhaust pollution categories of all roads in a city are obtained from the driving trajectories of motor vehicles in the urban road network; then, according to these categories, the distribution information entropy of the roads in each category is calculated and the establishment priority is marked, so as to recommend locations for newly built stations. The invention makes full use of the topological structure and road connectivity information of the traffic network, together with easily obtained external traffic information, to derive a deployment priority for each road segment and to recommend deployment points to supervisory departments, which gives it broad applicability.
Unlike traditional monitoring-station siting methods, the method combines complex external environmental features, defines the urban road network as a graph structure, introduces a graph convolutional network, and captures spatial features through road topological connectivity. From the pollution category obtained for each road in the urban road network, it derives a deployment priority for each road segment and recommends deployment points to supervisory departments, so that newly built air monitoring stations represent the air quality distribution of each category to the greatest extent while saving cost, which gives the method practical value.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic diagram of the structure of the present invention;
FIG. 3 is a diagram of an application example of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention.
As shown in FIG. 1 and FIG. 2, the main task of the air quality monitoring equipment siting method based on road segment pollution similarity according to this embodiment is to select K positions from candidate areas at which to build new air quality monitoring equipment, such that these K positions improve the characterization of the air quality distribution of the whole city to the greatest extent. First, K exhaust pollution categories of the urban road network are obtained from the driving trajectories of motor vehicles in the network, combined with the topological connectivity of the urban road structure; then the establishment priority of the roads in each category is calculated from the distribution information entropy of each road, so that suitable points are selected. The specific steps are as follows:
S1, acquiring the driving trajectory data of motor vehicles in the urban road network and matching it to roads according to longitude and latitude information;
S2, dividing each trajectory into a series of sub-trajectories according to the distribution of road-segment intersections, i.e. the starting points of the road segments, and combining the sub-trajectories associated with the same road segment to form a set of initial clusters, expressed as $IC = \{ic_1, ic_2, \ldots, ic_n\}$, where $n$ is the total number of road segments and $ic_i = \{ic_{i1}, ic_{i2}, \ldots, ic_{iJ}\}$ indicates that the trajectory cluster of each road segment is composed of the sub-trajectories of all vehicle types;
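Steps S1 and S2 can be illustrated with a minimal sketch. It assumes the GPS points have already been map-matched, so each point carries a road-segment identifier; the function name `build_initial_clusters` and the tuple layout are hypothetical and not part of the patent.

```python
from collections import defaultdict
from itertools import groupby


def build_initial_clusters(matched_points):
    """Split each vehicle track into sub-tracks and group them per segment.

    matched_points: iterable of (vehicle_id, vehicle_type, timestamp, segment_id)
    tuples, i.e. points already matched to road segments (step S1).
    Returns {segment_id: {vehicle_type: [sub_track, ...]}}, where each
    sub_track is the list of timestamps spent on that segment.
    """
    # Collect the points of every vehicle into its own track.
    tracks = defaultdict(list)
    for vehicle_id, vehicle_type, timestamp, segment_id in matched_points:
        tracks[(vehicle_id, vehicle_type)].append((timestamp, segment_id))

    clusters = defaultdict(lambda: defaultdict(list))
    for (_, vehicle_type), points in tracks.items():
        points.sort()  # order the track by time
        # Consecutive points on the same segment form one sub-track.
        for segment_id, run in groupby(points, key=lambda p: p[1]):
            clusters[segment_id][vehicle_type].append([ts for ts, _ in run])
    return clusters
```

Each entry `clusters[i]` then plays the role of $ic_i$, with one list of sub-trajectories per vehicle type.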
S3, calculating the road pollution emission of the fleet, $X = \{ice_1, ice_2, \ldots, ice_N\}$, from localized emission factors according to the emission factor model, where $N$ is the total number of road segments, $ice_i = \{ice_{i1}, ice_{i2}, \ldots, ice_{id}\}$, and $d$ is the total number of time steps for which segment emissions are calculated; the total fleet emission of road segment $i$ at time $t$ is as follows:
where $Ef_m$ is the emission factor of the $m$-th vehicle type, $density(ic_{im})$ is the number of sub-trajectories of the $m$-th vehicle type on road segment $i$ at time $t$, $len(ic_i)$ is the length of road segment $i$, and $J$ is the number of vehicle types.
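The emission formula itself is not legible in this text, so the sketch below assumes the simplest form consistent with the listed terms: the sum over vehicle types of emission factor times sub-trajectory count, multiplied by the segment length. That product form, the units, and the function names are assumptions made for illustration.

```python
def segment_emission(emission_factors, sub_track_counts, segment_length):
    """Fleet emission of one road segment at one time step.

    Assumed form: len(ic_i) * sum_m Ef_m * density(ic_im).
    emission_factors: {vehicle_type: emission per vehicle per unit length}
    sub_track_counts: {vehicle_type: number of sub-tracks at this time}
    """
    return segment_length * sum(
        emission_factors[m] * count for m, count in sub_track_counts.items()
    )


def emission_series(emission_factors, counts_per_time, segment_length):
    """Row ice_i of the emission matrix X: one value per time step."""
    return [
        segment_emission(emission_factors, counts, segment_length)
        for counts in counts_per_time
    ]
```

Stacking `emission_series` for all $N$ segments gives a matrix with one row per road segment and one column per time step, matching the shape of $X \in \mathbb{R}^{N \times d}$.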
Further, the road-segment conversion abstracts road segments into nodes, where a connection between two nodes represents the connectivity of the two corresponding road segments in the road network. In the road-segment connectivity graph $G = (V_0, \varepsilon, W)$, $V_0$ is a finite set of vertices made up of the $N$ road segments, $\varepsilon$ is the set of edges representing connectivity between road segments, and $W \in \mathbb{R}^{N \times N}$ is the adjacency weight matrix of $G$. Each vertex $v_i \in V_0$ contains geographical location information. The adjacency matrix of the mobile-source pollution graph is calculated from the road-segment connectivity, and the final spatially weighted adjacency matrix can be written as:
where $dist_{ij}$ is the normalized distance between the geographical locations of road segments $v_i$ and $v_j$, and $link(i, j)$ indicates whether $v_i$ and $v_j$ are connected: if $link(i, j) = 1$, $v_i$ and $v_j$ are connected, otherwise they are not. $\theta^2 = 0.05$ is used to control the scale and sparsity of the adjacency matrix.
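The weighting formula is not legible in this text. The sketch below assumes a Gaussian distance kernel gated by connectivity, a common choice for spatially weighted road graphs: $w_{ij} = \exp(-dist_{ij}^2 / \theta^2)$ when $link(i, j) = 1$ and $i \neq j$, and 0 otherwise. The kernel form and the function name `weighted_adjacency` are assumptions.

```python
import math


def weighted_adjacency(dist, link, theta_sq=0.05):
    """Spatially weighted adjacency matrix (assumed Gaussian kernel).

    dist: N x N normalized distances between segment locations.
    link: N x N 0/1 connectivity indicators, link[i][j] == 1 if connected.
    theta_sq: kernel width; 0.05 is the value stated in the text.
    """
    n = len(dist)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Only connected, distinct segments receive a non-zero weight.
            if i != j and link[i][j] == 1:
                weights[i][j] = math.exp(-dist[i][j] ** 2 / theta_sq)
    return weights
```

Under this assumption a small `theta_sq` makes weights decay quickly with distance, which is one way the parameter could control both scale and sparsity.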
Furthermore, the exhaust emissions $X \in \mathbb{R}^{N \times d}$ of the urban road network are input into an end-to-end structural clustering framework to obtain the clustering result of the road segments, dividing the segments of the whole road network into several cluster sub-clusters $(X_1, X_2, \ldots, X_K)$, where $K$ is the number of clusters. Each row $x_i \in X$ represents the $i$-th sample, i.e. the $i$-th road segment, $N$ is the total number of road segments, and $d$ is the dimension, i.e. the total number of emission time steps. Specifically, the framework comprises a BERT module for extracting pollution emission patterns, a GCN module for extracting the road network structure, a DNN module for extracting external factors, and a dual self-supervision module that unifies emission-pattern feature extraction and road-network structure extraction:
S4, the BERT module extracts features from the exhaust emissions $X \in \mathbb{R}^{N \times d}$, using a Transformer encoder to iteratively compute, for the emissions $ice_i = \{ice_{i1}, ice_{i2}, \ldots, ice_{id}\}$ of each road segment, the hidden representation of each position $ice_{it}$ at each layer $l$; these representations are stacked to form the matrix $H_1^{(l)} \in \mathbb{R}^{d \times d_0}$, where $d_0$ is the feature dimension and $d$ is the length of the time series. Each Transformer encoder layer is composed of two sublayers: a multi-head self-attention sublayer and a position-wise fully connected feed-forward network.
The multi-head attention sublayer projects $Q$, $K$, $V$ through $h$ different linear transformations and finally concatenates the attention results:
$MH(H_1^{(l)}) = [head_1; head_2; \ldots; head_h] W^O$
where the projection matrices of each head are all learnable parameters that are not shared between layers, and the Attention function uses scaled dot-product attention, in which the query $Q$, key $K$, and value $V$ are projected from the same matrix $H_1^{(l)}$, and a temperature term is introduced to produce a softer attention distribution and avoid extremely small gradients.
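As an illustration of a single attention head, the following sketch implements scaled dot-product attention on plain Python lists. The temperature is passed in explicitly; the usual Transformer choice is the square root of the per-head dimension, but the exact value used here is not legible in this text, so treat it as an assumption.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V, temperature):
    """softmax(Q K^T / temperature) V for row-major list matrices.

    Returns (output, attention_weights).
    """
    # Similarity of every query row with every key row, scaled down.
    scores = [
        [sum(q * k for q, k in zip(q_row, k_row)) / temperature for k_row in K]
        for q_row in Q
    ]
    weights = [softmax(row) for row in scores]
    # Each output row is the attention-weighted average of the value rows.
    output = [
        [sum(w * v_row[c] for w, v_row in zip(w_row, V)) for c in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights
```

A larger temperature flattens the scores before the softmax, which is the softening effect described above.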
Because the self-attention sublayer is based mainly on linear projections, the position-wise fully connected feed-forward network is applied to its output to give the model nonlinearity. The network is identical at each position and consists of two affine transformations with a Gaussian Error Linear Unit (GELU) activation in between:
$FFN(x) = GELU(xW^{(1)} + b^{(1)})W^{(2)} + b^{(2)}$
$GELU(x) = x\Phi(x)$
where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution, and $W^{(1)}$, $b^{(1)}$, $W^{(2)}$, $b^{(2)}$ are learnable parameters shared across all positions.
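The two formulas above can be written out directly. This is a minimal sketch for a single position vector, using the exact GELU via the error function; the helper names `affine` and `ffn` and the in-by-out weight layout are illustrative choices.

```python
import math


def gelu(x):
    """GELU(x) = x * Phi(x), with Phi the standard normal CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def affine(x, weight, bias):
    """Compute x W + b for a vector x and an (in x out) weight matrix."""
    return [
        sum(x[i] * weight[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def ffn(x, w1, b1, w2, b2):
    """FFN(x) = GELU(x W1 + b1) W2 + b2, applied to one position."""
    hidden = [gelu(v) for v in affine(x, w1, b1)]
    return affine(hidden, w2, b2)
```

Applying `ffn` with the same weights to every row of the attention output gives the position-wise behaviour described in the text.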
The Transformer sublayers are stacked to form a Transformer encoder layer, applying dropout to the output of each sublayer, a residual connection around each sublayer, and then layer normalization:
$Trm(H_1^{(l-1)}) = LN(A^{(l-1)} + Dropout(PFFN(A^{(l-1)})))$
$A^{(l-1)} = LN(H_1^{(l-1)} + Dropout(MH(H_1^{(l-1)})))$
where $LN$ is the layer normalization function.
Position embeddings are added to the input, so that the input representation is constructed as the sum of the corresponding sequence value and its position embedding:
where $ice_{it} \in ice_i$ is the $d_0$-dimensional embedding of the emission at time $t$ and $p_t$ is the $d_0$-dimensional position embedding. Fixed sinusoidal embeddings are used instead of learnable position embeddings.
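A sketch of the fixed sinusoidal embedding follows. It assumes the standard Transformer formulation (sine on even dimensions, cosine on odd dimensions, base 10000) and positions counted from 0; the text only says that fixed sinusoidal embeddings are used, so these details are assumptions.

```python
import math


def sinusoidal_position(t, d0):
    """d0-dimensional fixed sinusoidal embedding p_t for position t."""
    embedding = []
    for k in range(d0):
        pair = k // 2  # dimensions 2k and 2k+1 share one frequency
        angle = t / (10000 ** (2 * pair / d0))
        embedding.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
    return embedding


def add_position(embedded_sequence):
    """Input representation: value embedding plus position embedding.

    embedded_sequence: list of d0-dimensional vectors, one per time step.
    """
    d0 = len(embedded_sequence[0])
    return [
        [v + p for v, p in zip(vector, sinusoidal_position(t, d0))]
        for t, vector in enumerate(embedded_sequence)
    ]
```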
In the decoder, an encoder-decoder attention sublayer is added between the multi-head self-attention sublayer and the position-wise feed-forward network. In the encoder-decoder attention, $Q$ comes from the output of the previous decoder layer while $K$ and $V$ come from the output of the encoder; the computation is otherwise the same as in the encoder. The decoder outputs a reconstructed vector $X'$, with the following objective function:
S5, the DNN module extracts external factors. External factors such as meteorological data and urban point-of-interest data $Y \in \mathbb{R}^{N \times d'}$ can provide additional information for feature extraction of road mobile-source emissions, where $d'$ is the dimension of the external factor data, and an autoencoder is used to learn the features of the external factors. Assuming the autoencoder has $L$ layers in total and $l$ denotes a specific layer, the features learned by the $l$-th encoder layer are expressed as:
where $\phi$ is the activation function of the fully connected layer, such as the ReLU or Sigmoid function, the weight matrix and bias are those of the $l$-th encoder layer, and $H_2^{(0)}$ denotes the original data $Y$.
The encoder is followed by a decoder that reconstructs the input using several fully connected layers, expressed as:
where the weight matrix and bias are those of the decoder.
The decoder outputs the reconstructed external factor data $Y'$, with the following objective function:
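The autoencoder's forward pass can be sketched as follows for a single road segment's external-factor vector. The layer equations and the objective are not legible in this text, so the sketch assumes plain fully connected layers with ReLU, a linear final decoder layer, and a squared-error reconstruction loss; all names are illustrative.

```python
def dense(x, weight, bias, activation):
    """One fully connected layer: activation(W x + b), W stored as out x in."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weight, bias)
    ]


def relu(v):
    return max(0.0, v)


def identity(v):
    return v


def autoencode(y, encoder_layers, decoder_layers):
    """Run y through encoder then decoder.

    Each layer is a (weight, bias) pair. Returns (hidden, y_rec), where
    hidden[l] is the feature learned by encoder layer l + 1.
    """
    h = y
    hidden = []
    for weight, bias in encoder_layers:
        h = dense(h, weight, bias, relu)
        hidden.append(h)
    for index, (weight, bias) in enumerate(decoder_layers):
        last = index == len(decoder_layers) - 1
        h = dense(h, weight, bias, identity if last else relu)
    return hidden, h


def reconstruction_loss(y, y_rec):
    """Assumed objective: squared error between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(y, y_rec))
```

The per-layer `hidden` features are what the fusion step later adds to the emission features of the same layer.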
S6, feature fusion: the external factor features and the emission pattern features of each layer are added to form the fused features:
$H^{(l)} = H_1^{(l)} + H_2^{(l)}$
S7, the GCN module extracts the road network structure. Based on the road network connectivity graph $G = (V_0, \varepsilon, W)$, the fused features are integrated into the GCN so that the representation learned by the GCN accommodates three different kinds of information: the emission data itself, the external factors, and the relationships between data. The representation learned by the $l$-th GCN layer is obtained by the following convolution operation:
where $I$ is the identity matrix that adds a self-loop for each node to the adjacency matrix $W$, and $Z^{(l-1)}$ is propagated through the normalized adjacency matrix to obtain the new representation $Z^{(l)}$. Since the fused features $H^{(l)}$ combine the time-series features of the emissions with the external factor features, $Z^{(l-1)}$ and $H^{(l-1)}$ are combined to give a stronger representation:
where $\epsilon$ is a balance coefficient, which the invention sets to 0.5 throughout. In this way, the invention connects the BERT module, the DNN module, and the GCN module layer by layer.
The fused feature $H^{(l-1)}$ is propagated through the normalized adjacency matrix. Because the fused features of each layer carry different information, and in order to retain as much information as possible, the invention passes the fused features of each layer to the corresponding GCN layer for information propagation; as shown in FIG. 1, the propagation operator is applied $L$ times in the whole model.
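One propagation step can be sketched as below. The convolution and combination formulas are not legible in this text, so the sketch assumes the standard GCN rule with symmetric normalization, $\hat{A} = \tilde{D}^{-1/2}(W + I)\tilde{D}^{-1/2}$, a convex blend $(1 - \epsilon) Z^{(l-1)} + \epsilon H^{(l-1)}$, and a ReLU activation. These forms and all function names are assumptions.

```python
import math


def matmul(a, b):
    """Product of two row-major list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(weights):
    """Add self-loops to W, then apply symmetric degree normalization."""
    n = len(weights)
    with_loops = [
        [weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in with_loops]
    return [
        [with_loops[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(a_hat, z_prev, h_prev, layer_weight, eps=0.5):
    """Blend Z and H with balance coefficient eps, then propagate."""
    blended = [
        [(1.0 - eps) * z + eps * h for z, h in zip(z_row, h_row)]
        for z_row, h_row in zip(z_prev, h_prev)
    ]
    propagated = matmul(matmul(a_hat, blended), layer_weight)
    return [[max(0.0, v) for v in row] for row in propagated]  # ReLU
```

With `eps=0.5`, the value stated in the text, the graph representation and the fused features contribute equally to each layer's input.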
The input to the first GCN layer is the raw emission data $X$:
The last layer of the GCN module is a multi-class classification layer with a softmax function:
The result $z_{ij} \in Z$ is the probability that sample $i$ belongs to cluster center $j$, and $Z$ is a probability distribution.
The invention uses the target distribution $P$ to supervise the probability distribution $Z$:
Overall, the loss function of the framework of the invention is:
where $\alpha$ is a coefficient that controls how strongly the GCN module disturbs the embedding space.
S8, once training has produced a stable result, a cluster label is set for each road segment: the probability distribution $Z$ is taken as the final clustering result, and the cluster with the highest probability in $Z$ is assigned as the cluster label of road segment $i$:
that is, the road segments of the whole network can be divided into several cluster sub-clusters $(X_1, X_2, \ldots, X_K)$, where $K$ is the number of clusters.
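Assigning each road segment to its most probable cluster is an argmax over the rows of $Z$. A minimal sketch, with illustrative function names:

```python
from collections import Counter


def assign_cluster_labels(Z):
    """Label each road segment with the index of its most probable cluster."""
    return [max(range(len(row)), key=row.__getitem__) for row in Z]


def cluster_sizes(labels):
    """Number of road segments in each cluster sub-cluster."""
    return Counter(labels)
```

The sizes returned by `cluster_sizes` correspond to the per-cluster road counts used later when deciding how many stations each sub-cluster receives.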
Cluster son (X)1,X2,...,XK) Actually, K categories of urban tail gas pollution are shown, and the method only needs to find the roads which can represent the corresponding categories most among the K basic categories, and then builds the corresponding quantity of air quality monitoring equipment, namely, the K air quality monitoring equipment with the required construction can be used for reflecting the air quality distribution of the whole city to the maximum extent.
S9, for the cluster sub-clusters (X_1, X_2, ..., X_K) into which the road sections of the whole road network are divided, the number of roads N_K contained in each sub-cluster X_K is counted, and the number n_i of air quality monitoring stations to be built in each sub-cluster X_K is then calculated:
S10, for each sub-cluster X_K, the distribution information entropy of the probability distribution Z of each road i is calculated, defined as follows:
H(Z) = -∑ [Z log(Z) + (1 - Z) log(1 - Z)]
S11, the distribution information entropy calculated for each road i is used to mark the construction priority of each road section i. The smaller the distribution information entropy, the more closely the road is related to the other roads of the sub-cluster X_K to which it belongs, i.e. the better the road represents the corresponding pollution-category sub-cluster X_K. The purpose of choosing roads for new air quality monitoring stations is to improve the accuracy of the city-wide air quality distribution as much as possible, so new monitoring equipment should be built first on highly related roads, which are more representative and reflect the air quality distribution of the whole city more directly. Therefore, within each sub-cluster X_K, the construction priority is marked according to the distribution information entropy of each road i. The road with the larger information entropy value has the higher priority.
S12, in each sub-cluster X_K, n_i air quality monitoring stations are built according to the construction priority of the roads.
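Steps S8 and S10 to S12 can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the number of stations per sub-cluster is passed in as given input because its formula is not reproduced here, the probability values are made up, and the function names are hypothetical. The default ordering follows the rule as stated in the text (larger entropy, higher priority); the `larger_first` flag is a hypothetical switch for the opposite ordering.

```python
import math

def cluster_label(z_row):
    """S8: index of the most probable cluster for one road section."""
    return max(range(len(z_row)), key=lambda j: z_row[j])

def distribution_entropy(z_row):
    """S10: H(Z) = -sum(z*log(z) + (1-z)*log(1-z)) over the clusters of one road."""
    total = 0.0
    for z in z_row:
        for p in (z, 1.0 - z):
            if p > 0.0:  # treat 0 * log(0) as 0
                total -= p * math.log(p)
    return total

def select_sites(z, stations_per_cluster, larger_first=True):
    """S11-S12: rank roads inside each sub-cluster by entropy and keep the top n."""
    clusters = {}
    for i, row in enumerate(z):
        entry = (distribution_entropy(row), i)
        clusters.setdefault(cluster_label(row), []).append(entry)
    chosen = {}
    for k, members in clusters.items():
        members.sort(key=lambda e: e[0], reverse=larger_first)
        quota = stations_per_cluster.get(k, 0)
        chosen[k] = [road for _, road in members[:quota]]
    return chosen

# Made-up probability distribution Z for four road sections and two clusters.
Z = [
    [0.90, 0.10],
    [0.60, 0.40],
    [0.20, 0.80],
    [0.45, 0.55],
]
labels = [cluster_label(row) for row in Z]
sites = select_sites(Z, {0: 1, 1: 1})
sites_low_entropy = select_sites(Z, {0: 1, 1: 1}, larger_first=False)
```

With one station per sub-cluster, `sites` holds the road index picked for each category; changing the quota dictionary changes how many roads are taken from the top of each ranking.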
FIG. 3 shows the air quality monitoring sites recommended for Beijing. The green markers labelled g are the existing air quality monitoring sites in Beijing, and the orange markers labelled o, numbered 1, 2, 3, 4 and 5, are the optimal sites in the corresponding candidate areas recommended by this patent.
In summary, the advantages of the air quality monitoring equipment point arrangement and site selection method based on road section pollution similarity are as follows. Unlike traditional monitoring station location recommendation methods, the method incorporates complex external environmental features, defines the urban road network as a graph structure, introduces a graph convolution network, and uses road topology connectivity to capture spatial features. From the pollution category obtained for each road in the urban road network, the method derives the placement priority of road sections and recommends placement sites to supervisory departments, so that newly built air monitoring stations represent the air quality distribution of each category to the greatest extent, which saves cost and gives the method practical value.
In another aspect, the present invention discloses a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method as described above.
It is understood that the system provided by the embodiment of the present invention corresponds to the method provided by the embodiment of the present invention, and the explanation, the example and the beneficial effects of the related contents can refer to the corresponding parts in the method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (9)
1. A method for arranging and selecting sites of air quality monitoring equipment based on road section pollution similarity is characterized by comprising the following steps,
constructing a graph convolution network model, and obtaining K tail gas pollution categories of urban road network roads by using the driving tracks of the motor vehicles in the urban road network and combining the topological connectivity of the urban road structure;
according to the tail gas pollution categories of roads, calculating the distribution information entropy of the roads belonging to each category, and marking the construction priority level, thereby selecting a proper point location;
the method for obtaining K tail gas pollution categories of the urban road network road by utilizing the running track of the motor vehicle in the urban road network and combining the topological connectivity of the urban road structure specifically comprises the following steps:
S1, acquiring the driving trajectory data of motor vehicles in the urban road network, and performing road matching according to the longitude and latitude information;
S2, dividing each trajectory into a series of sub-trajectories according to the distribution of road-section intersections, i.e. the starting points of the road sections, and combining the sub-trajectories related to the same road section to form a set of initial clusters, expressed as IC = {ic_1, ic_2, ..., ic_n}, where n denotes the total number of road sections, and ic_i = {ic_i1, ic_i2, ..., ic_iJ} is the trajectory cluster of road section i, composed of the sub-trajectories of all vehicle types;
S3, calculating, according to the emission factor model and using localized emission factors, the road pollution emissions of the fleet X = {ice_1, ice_2, ..., ice_N}, where N denotes the total number of road sections, ice_i = {ice_i1, ice_i2, ..., ice_id}, and d denotes the total number of emission time steps calculated for a road section, wherein the total fleet emission of road section i at time t is:
wherein Ef_m denotes the emission factor of the m-th vehicle type, severity(ic_im) denotes the number of sub-trajectories of the m-th vehicle type on road section i at time t, len(ic_i) denotes the road length of section i, and J denotes the number of vehicle types;
S3 further comprises a road-section conversion that abstracts each road section into a node, where a connection between two nodes represents the connectivity of the two corresponding road sections in the road network; given the road-section connectivity graph G = (V_0, ε, W), where V_0 is a finite set of vertices consisting of the N road sections, ε is the set of edges representing connectivity between road sections, and W ∈ R^(N×N) is the adjacency weight matrix of graph G, the adjacency matrix of the mobile source pollution graph is calculated for the vertices v_i ∈ V_0 from road-section connectivity, and the final spatially weighted adjacency matrix can be written as:
wherein dist_ij denotes the normalized distance between the geographic locations of road sections v_i and v_j, and link(i, j) indicates whether road sections v_i and v_j are connected: if link(i, j) = 1, v_i and v_j are connected, otherwise they are not; θ^2 = 0.05 is used to control the scale and sparsity of the adjacency matrix;
S3 further comprises inputting the exhaust emissions X ∈ R^(N×d) of the urban road network into the end-to-end structural clustering framework to obtain the clustering result of the road sections, dividing the road sections of the whole road network into several cluster sub-clusters (X_1, X_2, ..., X_K), where K denotes the number of clusters;
each row x_i ∈ X represents the i-th sample, i.e. the i-th road section, N is the total number of road sections, and d is the dimension, i.e. the total number of emission time steps.
2. The air quality monitoring device point placement and site selection method based on road segment pollution similarity according to claim 1, characterized by comprising the following steps: the graph convolution network model comprises a BERT module for extracting pollution emission laws, a GCN module for extracting a road network structure, a DNN module for extracting external factors and a double self-supervision module for unifying emission law feature extraction and road network structure extraction.
3. The air quality monitoring device point placement and site selection method based on road segment pollution similarity according to claim 2, characterized by comprising the following steps:
the BERT module extracts the pollution emission law from the exhaust emissions X ∈ R^(N×d): a Transformer encoder iteratively computes, at each layer l, the hidden representation of every position ice_it in the emissions ice_i = {ice_i1, ice_i2, ..., ice_id} of each road section, and the hidden representations are stacked to form the matrix H_1^(l), where d_0 denotes the feature dimension and d denotes the length of the time series; the Transformer encoder consists of two sublayers: a multi-head self-attention sublayer and a position-wise fully connected feed-forward network;
the multi-head attention sublayer projects Q, K and V through h different linear transformations and finally concatenates the different attention results:
MH(H_1^(l)) = [head_1; head_2; ...; head_h] W^O
wherein the projection matrices of each head are all learnable parameters, the parameters are not shared between layers, and the Attention function uses the scaled dot-product attention mechanism:
wherein the query Q, key K and value V are projected from the same matrix H_1^(l), and a temperature term is introduced to produce a softer attention distribution and avoid extremely small gradients;
the self-attention sublayer is mainly based on linear projections; to endow the model with nonlinearity, the position-wise fully connected feed-forward network is applied to the output of the self-attention sublayer; it is identical at every position and consists of two affine transformations with a Gaussian Error Linear Unit (GELU) activation in between:
FFN(x) = GELU(x W^(1) + b^(1)) W^(2) + b^(2)
GELU(x) = x Φ(x)
where Φ(x) is the cumulative distribution function of the standard Gaussian distribution, and W^(1), b^(1), W^(2) and b^(2) are learnable parameters shared across all positions;
the Transformer sublayers are stacked to form a Transformer encoder layer, with dropout applied to the output of each sublayer and a residual connection around each sublayer, followed by layer normalization:
Trm(H_1^(l-1)) = LN(A^(l-1) + Dropout(PFFN(A^(l-1))))
A^(l-1) = LN(H_1^(l-1) + Dropout(MH(H_1^(l-1))))
wherein LN is the layer normalization function;
position embeddings are injected into the input, whose representation is constructed as the sum of the corresponding sequence value and the position embedding:
wherein ice_it ∈ ice_i is the d_0-dimensional embedding of the emission at time t, and p_t is the d_0-dimensional position embedding; fixed sinusoidal embeddings are used instead of learnable position embeddings;
in the decoder, an encoder-decoder attention sublayer is added between the multi-head self-attention sublayer and the position-wise fully connected feed-forward network; in the encoder-decoder attention, Q comes from the previous output of the decoder, K and V come from the output of the encoder, and the computation is the same as in the encoder; the decoder outputs the reconstructed vector X', with the following objective function:
4. The air quality monitoring device point placement and site selection method based on road segment pollution similarity according to claim 3, characterized by comprising the following steps:
the DNN module is used for extracting external factors, wherein the external factors comprise meteorological data and urban point-of-interest data Y ∈ R^(N×d′), which provide additional information for the feature extraction of road mobile source emissions, and d′ denotes the dimension of the external factor data; an autoencoder is used to learn the features of the external factors; assuming that the autoencoder has L layers in total and l denotes a specific layer, the features learned by the l-th layer of the encoder part are expressed as:
where φ is the activation function of the fully connected layer, such as the ReLU or Sigmoid function, the two parameters are respectively the weight matrix and the bias of the l-th layer of the encoder, and H_2^(0) denotes the raw data Y;
the encoder part is followed by a decoder that reconstructs the input through several fully connected layers, expressed as:
wherein the two parameters respectively denote the weight matrix and the bias of the decoder;
the decoder outputs reconstructed extrinsic factor data Y' with the following objective function:
5. The air quality monitoring device point placement and site selection method based on road segment pollution similarity according to claim 4, wherein the method comprises the following steps: the graph convolution network model further comprises a feature fusion module, and the feature fusion module is used for adding the external factor features and the emission law features of each layer to form the fused features:
H^(l) = H_1^(l) + H_2^(l).
6. The air quality monitoring device point placement and site selection method based on road segment pollution similarity according to claim 5, wherein the method comprises the following steps:
the GCN module is used for extracting the road network structure: based on the road network connectivity graph G = (V_0, ε, W), the fused features are integrated into the GCN, so that the representation learned by the GCN accommodates three different kinds of information: the emission data itself, the external factors, and the relationships among the data; the representation learned by the l-th GCN layer is obtained by the following convolution operation:
wherein I is the identity matrix that adds a self-loop to each node of the adjacency matrix W; Z^(l-1) is propagated through the normalized adjacency matrix to obtain the new representation Z^(l); considering that H^(l) combines the temporal features of the emissions with the external factor features, Z^(l-1) and H^(l-1) are combined to obtain a stronger representation:
wherein ϵ is a balance coefficient; in this way, the BERT module, the DNN module and the GCN module are connected layer by layer;
the fused feature H^(l-1) is propagated through the normalized adjacency matrix; because the information represented by the fused features differs from layer to layer, the fused features of each layer are passed to the corresponding GCN layer for information propagation, and the propagation operator is applied L times in the whole model;
the input to the first GCN layer is the raw emission data X:
the last layer of the GCN module is a multi-class classification layer with a softmax function:
the result z_ij ∈ Z is the probability that sample i belongs to cluster center j, and Z is a probability distribution.
7. The air quality monitoring device point placement and site selection method based on road segment pollution similarity according to claim 6, wherein the method comprises the following steps:
the probability distribution Z is supervised using the target distribution P:
the overall loss function is:
wherein α is a coefficient that controls the disturbance of the embedding space by the GCN module.
8. The air quality monitoring device point placement and site selection method based on road segment pollution similarity according to claim 7, characterized by comprising the following steps:
calculating the distribution information entropy of the roads belonging to each category according to the exhaust pollution categories of the roads and marking the construction priority, so as to select suitable sites, specifically comprises the following steps:
setting cluster labels for the road sections: the probability distribution Z is selected as the final clustering result, and for each road section i the cluster with the highest probability in Z is assigned as its cluster label:
that is, the road sections of the whole road network can be divided into several cluster sub-clusters (X_1, X_2, ..., X_K), where K denotes the number of clusters, and the cluster sub-clusters (X_1, X_2, ..., X_K) in effect represent K categories of urban exhaust pollution;
for the cluster sub-clusters (X_1, X_2, ..., X_K) obtained by dividing the road sections of the whole road network, counting the number of roads N_K contained in each sub-cluster X_K, and then calculating the number n_i of air quality monitoring stations to be built in each sub-cluster X_K:
for each sub-cluster X_K, calculating the distribution information entropy of the probability distribution Z of each road i, defined as follows:
H(Z) = -∑ [Z log(Z) + (1 - Z) log(1 - Z)]
marking the construction priority of each road section i according to the calculated distribution information entropy of each road i; the smaller the distribution information entropy, the more closely the road is related to the other roads of the sub-cluster X_K to which it belongs, i.e. the better the road represents the corresponding pollution-category sub-cluster X_K;
in each sub-cluster X_K, the construction priority is marked according to the distribution information entropy of each road i, and the larger the information entropy of a road, the higher its construction priority;
in each sub-cluster X_K, n_i air quality monitoring stations are built according to the construction priority of the roads.
9. The air quality monitoring device point placement and site selection method based on road segment pollution similarity according to claim 6, wherein the method comprises the following steps: the balance coefficient ϵ is set to 0.5 throughout.
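The position-wise feed-forward network recited in claim 3, FFN(x) = GELU(x W^(1) + b^(1)) W^(2) + b^(2) with GELU(x) = x Φ(x), can be illustrated with a short sketch. It is a plain-Python illustration rather than the implementation of the invention; the function names, the layer sizes and the weight values are hypothetical and chosen only to keep the example small.

```python
import math

def gelu(x):
    """GELU(x) = x * Phi(x), where Phi is the standard normal CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def affine(x, weight, bias):
    """Compute x @ weight + bias for one position vector x."""
    return [sum(xi * w for xi, w in zip(x, col)) + b
            for col, b in zip(zip(*weight), bias)]

def ffn(x, w1, b1, w2, b2):
    """FFN(x) = GELU(x W1 + b1) W2 + b2 for a single position."""
    hidden = [gelu(v) for v in affine(x, w1, b1)]
    return affine(hidden, w2, b2)

def position_wise_ffn(sequence, w1, b1, w2, b2):
    """Apply the same FFN, with shared parameters, to every position."""
    return [ffn(x, w1, b1, w2, b2) for x in sequence]

# Made-up parameters: feature dimension 2, hidden dimension 3.
W1 = [[0.5, -0.5, 1.0], [1.0, 0.5, -1.0]]
B1 = [0.0, 0.1, -0.1]
W2 = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
B2 = [0.0, 0.0]

# Three time steps of one road section; the first two are identical on purpose.
sequence = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
outputs = position_wise_ffn(sequence, W1, B1, W2, B2)
```

Because the parameters are shared across positions, identical inputs at different time steps produce identical outputs, which the first two entries of `outputs` show.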
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110320263.0A CN113222328B (en) | 2021-03-25 | 2021-03-25 | Air quality monitoring equipment point arrangement and site selection method based on road section pollution similarity |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110320263.0A CN113222328B (en) | 2021-03-25 | 2021-03-25 | Air quality monitoring equipment point arrangement and site selection method based on road section pollution similarity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113222328A CN113222328A (en) | 2021-08-06 |
CN113222328B true CN113222328B (en) | 2022-02-25 |
Family
ID=77084118
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110320263.0A Active CN113222328B (en) | 2021-03-25 | 2021-03-25 | Air quality monitoring equipment point arrangement and site selection method based on road section pollution similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113222328B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113538485B (en) * | 2021-08-25 | 2022-04-22 | 广西科技大学 | Contour detection method for learning biological visual pathway |
CN113886507A (en) * | 2021-08-27 | 2022-01-04 | 北京工业大学 | Rail transit station site selection prediction method based on dynamic grid division |
CN114050975B (en) * | 2022-01-10 | 2022-04-19 | 苏州浪潮智能科技有限公司 | Heterogeneous multi-node interconnection topology generation method and storage medium |
CN114818984B (en) * | 2022-05-31 | 2022-09-23 | 南京信息工程大学 | Refined urban ponding water level fitting method based on artificial intelligence |
CN115358904B (en) * | 2022-10-20 | 2023-02-03 | 四川国蓝中天环境科技集团有限公司 | Dynamic and static combined urban area air quality monitoring station site selection method |
CN116233865B (en) * | 2023-05-09 | 2023-07-04 | 北京建工环境修复股份有限公司 | Point distribution method and system for new pollutant monitoring equipment |
CN117129638B (en) * | 2023-10-26 | 2024-01-12 | 江西怡杉环保股份有限公司 | Regional air environment quality monitoring method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110163449A (en) * | 2019-05-31 | 2019-08-23 | 杭州电子科技大学 | A kind of motor vehicle blowdown monitoring node dispositions method based on active space-time diagram convolution |
CN111832814A (en) * | 2020-07-01 | 2020-10-27 | 北京工商大学 | Air pollutant concentration prediction method based on graph attention machine mechanism |
CN111949749A (en) * | 2020-07-30 | 2020-11-17 | 中国科学技术大学 | High-order graph convolution network-based air quality monitoring station position recommendation method |
CN112231569A (en) * | 2020-10-23 | 2021-01-15 | 中国平安人寿保险股份有限公司 | News recommendation method and device, computer equipment and storage medium |
WO2021022521A1 (en) * | 2019-08-07 | 2021-02-11 | 华为技术有限公司 | Method for processing data, and method and device for training neural network model |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105844401B (en) * | 2016-03-22 | 2019-04-12 | 北京工商大学 | The lake and reservoir water bloom of case-based reasoning administers complexity and dynamically associates model and decision-making technique |
CN112217197B (en) * | 2020-09-01 | 2022-04-12 | 广西大学 | Optimization method for economic dispatch of double-layer distributed multi-region power distribution network |
Non-Patent Citations (3)
Title |
---|
A Low-Cost Reconfigurable Nonlinear Core for Embedded DNN Applications;Yue Li.et.al.;《2020 International Conference on Field-Programmable Technology (ICFPT)》;20201111;全文 * |
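Personalized news recommendation based on an attention-enhanced graph convolutional neural network;杨宝生;《兰州文理学院学报(自然科学版)》;20200930;Vol. 34 (No. 5);full text * |
Design and development of a data center platform for remote sensing monitoring of motor vehicle exhaust;吴迪 et al.;《大气与环境光学学报》;20161130;Vol. 11 (No. 6);full text * |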
Also Published As
Publication number | Publication date |
---|---|
CN113222328A (en) | 2021-08-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||