CN113762351B - Air quality prediction method based on deep transition network - Google Patents

Publication number
CN113762351B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110923976.6A
Other languages
Chinese (zh)
Other versions
CN113762351A (en)
Inventor
欧阳继红
杨智尧
王艺蒙
曲延非
李嘉寅
毕夏旭
王兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Abstract

The invention discloses an air quality prediction method based on a deep transition network. It provides an air quality prediction model, AI-DTN, built on auxiliary information and a deep transition network. The model comprises two transition networks running in opposite directions, which extract feature information along the forward and reverse time-series directions respectively to strengthen feature extraction. Each transition network in AI-DTN consists of a gated recurrent unit that fuses auxiliary information (AI-GRU), used to extract spatial features, and existing transition gated recurrent units (T-GRU), used to extract temporal features. Of the two auxiliary-information gates in the AI-GRU, one controls how much auxiliary information flows into the unit and the other controls the degree of fusion between PM2.5 and the auxiliary information; this gating mechanism avoids mutual interference during information fusion.

Description

Air quality prediction method based on deep transition network
Technical Field
The invention relates to the technical field of data processing, in particular to an air quality prediction method based on a deep transition network.
Background
Many factors affect air quality: pollutants such as NO and CO, automobile exhaust, industrial emissions, and weather information such as wind speed, wind direction and rainfall. These are collectively referred to as auxiliary information. Using auxiliary information for air quality prediction is difficult for two reasons. First, it is hard to obtain all of it accurately: real-time automobile exhaust and industrial emission data are difficult to collect in full, and forecast weather information deviates from the actual weather, so using forecasts would accumulate errors. Current air quality prediction therefore mostly relies on pollutant information such as NO and CO together with past weather information. Second, complex interactions occur among pollutants and between pollutants and meteorological conditions, and not all of them can be modeled. How to make good use of the available auxiliary information and determine the different roles each kind plays in PM2.5 prediction is therefore of great significance.
Most current air quality prediction models either use a CNN to extract spatial features across the input features and an RNN to extract temporal features, or apply an attention mechanism directly for spatial feature extraction followed by an RNN or LSTM network for temporal feature extraction. Given auxiliary information, however, these methods lack sufficient capacity to extract latent features.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to provide an air quality prediction method based on a deep transition network so as to improve the accuracy of air quality prediction.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
An air quality prediction method based on a deep transition network comprises the following steps:
S1. Acquire air quality time-series data and preprocess it;
S2. Predict air quality using the air quality prediction model AI-DTN, which is based on auxiliary information and a deep transition network:
The air quality prediction model AI-DTN consists of a forward deep transition network, a reverse deep transition network and a fully connected layer. The forward and reverse deep transition networks extract spatial and temporal features, the results of the two networks are then concatenated, and the fully connected layer finally produces the output;
The depth of each deep transition network is L. The first layer of the deep transition network is a gated recurrent unit AI-GRU, which extracts the spatial features of the input. Layers 2 through L consist of transition gated recurrent units (T-GRU), and the output of the L-th layer T-GRU at time t is the input of the first-layer AI-GRU at time t+1;
The detailed calculation process of the air quality prediction model AI-DTN is as follows:
The input of the model is divided into two parts. The first part is the PM2.5 time series over a historical time window of size q, denoted X_t = {x_{t-q+1}, ..., x_t}, where X_t is a 1×q matrix. The second part is the auxiliary-information time series over a historical time window of size q, denoted A_t = {a_{t-q+1}, ..., a_t}, where A_t is an n×q matrix, each of a_{t-q+1}, ..., a_t is an n×1 matrix, and n is the number of features in the auxiliary information;
In the forward deep transition network, X_t and A_t are first input into the AI-GRU to obtain the hidden state of the first layer of the deep transition network
where L denotes the number of layers of the deep transition network. This hidden state is a weighted fusion of the PM2.5 information and the auxiliary information and represents the spatial feature information at time t. The hidden state is then passed to the T-GRU of the next layer within the same time step, whose hidden state is as follows:
where i denotes the current network depth. The T-GRU takes only the hidden state of the layer above it as input, and the hidden state of the last-layer T-GRU is passed as input to the AI-GRU of the next time step;
Likewise, the reverse deep transition network performs reverse feature extraction on the two time series X_t and A_t, yielding hidden states that represent reverse time-series information
The hidden states of the forward and reverse deep transition networks are then concatenated in time order:
where ; denotes the concatenation operation. E_t now contains the spatial and temporal feature information extracted by the deep transition networks from both the forward and the reverse time series. Finally, E_t is input to the fully connected layer to obtain the final output:
Y_t = W*E_t + b;
where W is the parameter matrix and b is the bias term.
Further, in step S1, the preprocessing proceeds as follows:
S1.1. Missing value processing: missing values in the original air quality time-series data are filled using Lagrange interpolation;
S1.2. Normalization: min-max normalization is applied as a linear transformation to the data after missing value processing, mapping the result values into [0, 1].
Further, in step S2, for time step t, the hidden state h_t of the AI-GRU is calculated as follows:
where ⊙ denotes element-wise multiplication; h_t uses the update gate z_t of the current time step to select and combine information from the hidden state h_{t-1} of the previous time step and the candidate hidden state h̃_t of the current time step;
z_t is the update gate, with values in (0, 1). The closer the value is to 0, the more historical information is discarded and the less new information is added at the current time step; the closer it is to 1, the less historical information is discarded and the more new information is added at the current time step. The update gate z_t is calculated as follows:
z_t = σ(W_{xz} x_t + W_{hz} h_{t-1} + W_{az} a_t);
W_{xz}, W_{az} and W_{hz} are weights, and h̃_t is the candidate value of the hidden state of the current time step. Through the gating mechanism, h̃_t selectively admits the PM2.5 information x_t of the current time step, the auxiliary information a_t and the hidden state h_{t-1} of the previous time step into the AI-GRU. The candidate hidden state h̃_t is calculated as follows:
r_t denotes the reset gate, l_t the gate on the linear transformation, g_t the gate on the auxiliary information, p_t the gate on the degree of fusion between the auxiliary information and the PM2.5 information, and H(x) the linear transformation of PM2.5. h̃_t scales the data to [-1, 1] through the tanh activation function, and the linearly transformed information is then added to obtain the result of h̃_t. r_t, l_t, g_t, p_t and H(x) are calculated as follows:
r_t = σ(W_{xr} x_t + W_{hr} h_{t-1}) (7);
l_t = σ(W_{xl} x_t + W_{hl} h_{t-1}) (8);
g_t = σ(W_{ag} a_t + W_{hg} h_{t-1}) (9);
p_t = σ(W_{ap} a_t + W_{hp} h_{t-1}) (10);
H(x_t) = W_x x_t (11);
In the above formulas, W_{xr}, W_{hr}, W_{xl}, W_{hl}, W_{ag}, W_{hg}, W_{ap}, W_{hp} and W_x are weights. r_t is the reset gate, which controls the historical information. In the calculation of h̃_t, r_t is multiplied element-wise with h_{t-1}; h_{t-1} contains all historical information up to the previous time step, and r_t takes values in (0, 1). The closer the value is to 0, the less historical information flows into the AI-GRU; the closer it is to 1, the more flows in, so that historical information irrelevant to prediction can be discarded in time;
g_t and p_t are nonlinear transformations of a_t and h_{t-1}. g_t extracts the auxiliary information useful for PM2.5 and controls how much auxiliary information flows into the AI-GRU; its values lie in (0, 1), where a value closer to 0 means less auxiliary information flows in and a value closer to 1 means more flows in. p_t fuses the auxiliary information with the PM2.5 information and controls their degree of fusion; its values lie in (0, 1), where a value closer to 0 means a lower degree of fusion and a value closer to 1 means a higher degree of fusion;
l_t is the gate on the linear transformation H(x) and controls how much linearly transformed PM2.5 information flows into the AI-GRU; its values lie in (0, 1), where a value closer to 0 means less PM2.5 information flows in and a value closer to 1 means more flows in. H(x) is a linear transformation of the PM2.5 information whose role is to let the AI-GRU attend to PM2.5 alone, so that it focuses more on the PM2.5 information;
Through the reset gate r_t, the auxiliary-information gate g_t, the fusion gate p_t and the linear-transformation gate l_t, the AI-GRU effectively controls the degree to which each kind of auxiliary information influences the air quality prediction. At the same time, the gating mechanism selectively admits the auxiliary information, PM2.5 information and historical information that benefit the prediction into the AI-GRU and discards information irrelevant to the prediction in time.
Further, the hidden state of the T-GRU is calculated as follows:
where ⊙ denotes element-wise multiplication and i denotes the depth of the current transition network. z_t is the update gate, with values in (0, 1): the closer the value is to 0, the more historical information is discarded and the less information the current network layer adds; the closer it is to 1, the less information from the previous network layer is discarded and the more information the current network layer adds. It is calculated as follows:
The candidate value of the T-GRU hidden state processes the hidden state of the previous network layer through the reset gate r_t and is calculated as follows:
r_t is the reset gate, which controls the historical information. Its values lie in (0, 1): the closer the value is to 0, the less historical information flows into the T-GRU; the closer it is to 1, the more flows in, so that historical information irrelevant to prediction can be cleared in time. It is calculated as follows:
The T-GRU receives only the hidden state passed from the layer above within the same time step, so it can learn the particular nonlinear relationship between consecutive hidden states and thereby obtain deeper state representations.
The beneficial effects of the invention are as follows. To extract deep spatial and temporal features from air quality data, the invention provides an air quality prediction model (AI-DTN) based on auxiliary information and deep transition networks. The model comprises two transition networks running in opposite directions, which extract feature information along the two time-series directions respectively to strengthen feature extraction. Each transition network in AI-DTN consists of a gated recurrent unit that fuses auxiliary information (AI-GRU), used to extract spatial features, and existing transition gated recurrent units (T-GRU), used to extract temporal features. Of the two auxiliary-information gates in the AI-GRU, one controls how much auxiliary information flows into the unit and the other controls the degree of fusion between PM2.5 and the auxiliary information; this gating mechanism avoids mutual interference during information fusion.
Drawings
FIG. 1 is a schematic diagram of a process according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the structure of an air quality prediction model AI-DTN according to an embodiment of the invention;
FIG. 3 is a block diagram of an AI-GRU in accordance with an embodiment of the invention;
FIG. 4 is a block diagram of a T-GRU in an embodiment of the invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings, and it should be noted that, on the premise of the present technical solution, the present embodiment provides a detailed implementation manner and a specific operation procedure, but the protection scope of the present invention is not limited to the present embodiment.
The embodiment provides an air quality prediction method based on a deep transition network, as shown in fig. 1, which specifically comprises the following steps:
S1. Acquire air quality time-series data and preprocess it. The preprocessing comprises the following steps:
S1.1. Missing value processing:
The original air quality time-series data contains a large amount of incomplete, inconsistent, abnormal and biased data, and such problem data affects the accuracy of air quality prediction. Data preprocessing is therefore indispensable, and missing value processing is one of its most common steps.
Missing value processing falls into two categories: deleting records with missing data, and data interpolation. The main limitation of deletion is that it trades a reduction in historical data for completeness, which wastes a great deal of information; especially when the data set is small, deleting records may directly affect the objectivity and accuracy of the analysis results. This embodiment therefore processes missing values with Lagrange interpolation.
Lagrange interpolation is defined as follows:
For a polynomial function, suppose k+1 value points (x_0, y_0), ..., (x_k, y_k) are given, where x_j is the position of the independent variable and y_j is the value of the function at that position.
Assuming that any two x_j are distinct, the Lagrange interpolation polynomial obtained by applying the Lagrange interpolation formula is:
L(x) = Σ_{j=0}^{k} y_j l_j(x)
where l_j(x) is the Lagrange basis polynomial (or interpolation basis function), given by:
l_j(x) = Π_{i=0, i≠j}^{k} (x - x_i) / (x_j - x_i)
The Lagrange basis polynomial l_j(x) takes the value 1 at x_j and the value 0 at every other point x_i, i ≠ j.
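The interpolation step can be illustrated with a minimal Python sketch. The function names `lagrange_interpolate` and `fill_missing`, the use of time indices as the x positions, and the choice of up to two known neighbours on each side (`window=2`) are assumptions made for illustration; the text does not state how many neighbouring points the method uses.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[j], ys[j]) at x."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(set(xs)) != len(xs):
        raise ValueError("interpolation nodes must be pairwise distinct")
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        # Basis polynomial l_j(x): 1 at x_j, 0 at every other node.
        basis = 1.0
        for i, xi in enumerate(xs):
            if i != j:
                basis *= (x - xi) / (xj - xi)
        total += yj * basis
    return total


def fill_missing(series, window=2):
    """Fill None entries from up to `window` known neighbours on each side.

    Hypothetical helper: the neighbour count is an assumption, not a value
    taken from the source.
    """
    known = [(i, v) for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for t, value in enumerate(series):
        if value is not None:
            continue
        before = [p for p in known if p[0] < t][-window:]
        after = [p for p in known if p[0] > t][:window]
        nodes = before + after
        if not nodes:
            continue  # nothing to interpolate from; leave the gap as None
        xs, ys = zip(*nodes)
        filled[t] = lagrange_interpolate(xs, ys, t)
    return filled
```

Keeping the window small is a deliberate choice in this sketch: high-degree Lagrange polynomials tend to oscillate between nodes, so interpolating from only a few nearby readings is usually safer than fitting the whole series.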
S1.2. Normalization:
Data normalization is basic groundwork before an air quality model is built. Different features often have different dimensions and units, which affects the results of data analysis; to eliminate the influence of dimension among features, the data must be normalized so that the data indices become comparable. After normalization, all features are on the same order of magnitude and are suitable for comprehensive comparison and evaluation. The method of this embodiment uses min-max normalization, a linear transformation of the data after missing value processing that maps the result values into [0, 1]. The conversion function is as follows:
x* = (x - min) / (max - min)
where x* is the normalized data, x is the data after missing value processing, max is the maximum of the data after missing value processing, and min is its minimum.
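As a small sketch of the min-max step, the following Python functions apply x* = (x - min) / (max - min) to one feature column and invert it again. The function names and the handling of a constant column (mapped to 0.0) are assumptions for illustration; the source does not discuss that edge case.

```python
def min_max_normalize(values):
    """Map values linearly into [0, 1]; also return (min, max) for inversion."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no scale, so map it to 0.0.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def min_max_denormalize(scaled, lo, hi):
    """Invert the normalization, e.g. to report predictions in original units."""
    return [s * (hi - lo) + lo for s in scaled]
```

In practice the minimum and maximum would normally be computed on the training portion of the data and reused for later data, so that predictions can be mapped back to PM2.5 units with the same pair of values.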
S2, air quality prediction:
To address the problem of distinguishing how strongly different auxiliary information contributes to predicting PM2.5, this embodiment proposes AI-DTN (Auxiliary Information-Deep Transition Network), an air quality prediction model based on auxiliary information and a deep transition network.
The air quality prediction model AI-DTN consists of a forward deep transition network, a reverse deep transition network and a fully connected layer. The forward and reverse deep transition networks extract spatial and temporal features, the results of the two networks are then concatenated, and the fully connected layer finally produces the output. The model structure is shown in FIG. 2.
The core of the AI-DTN model is its two deep transition networks, forward and reverse, each of depth L. For a feedforward neural network, network depth is the number of nonlinear layers between input and output; for a recurrent neural network (RNN), network depth is the number of nonlinear layers within one time step.
The first layer of the deep transition network is the gated recurrent unit AI-GRU, which extracts the spatial features of the input. Layers 2 through L consist of transition gated recurrent units (Transition-GRU, T-GRU); the T-GRU is an important component of the deep transition network and can extract deeper hidden-state information in a recurrent neural network. The output of the L-th layer T-GRU at time t is the input of the first-layer AI-GRU at time t+1.
AI-GRU and T-GRU are described in further detail below:
1. Gated recurrent unit fusing auxiliary information (AI-GRU)
To extract deeper spatial features and distinguish how important different auxiliary information is to PM2.5 prediction, this embodiment proposes the AI-GRU (Auxiliary Information-GRU), a gated recurrent unit that fuses auxiliary information. The AI-GRU is inspired by AGDT (reference: Liang Y, Meng F, Zhang J, et al. A novel aspect-guided deep transition model for aspect based sentiment analysis [J]. arXiv preprint arXiv:1909.00324, 2019.) and is a recurrent neural network unit that builds on the gating of the GRU and adds gates for auxiliary information. The AI-GRU takes PM2.5 information as input, uses auxiliary information to assist the air quality prediction, fuses the two kinds of information, and controls both how much auxiliary information is admitted and the degree of fusion between the PM2.5 information and the auxiliary information. Through its gating mechanism, the AI-GRU dynamically adjusts the weight of each kind of auxiliary information and thereby identifies how important each is to PM2.5 prediction.
The output structure of the AI-GRU is the same as that of the GRU, but the input structure differs: the AI-GRU adds an auxiliary-information input. The AI-GRU combines x_t and a_t of the current time step with the hidden state h_{t-1} of the previous time step to obtain the hidden state h_t of the current time step, which contains the relevant information of all previous time steps. The hidden state h_t is the output of the AI-GRU. The structure of the AI-GRU is shown in FIG. 3.
For time step t, the hidden state h_t of the AI-GRU is calculated by formula (4).
Here ⊙ denotes element-wise multiplication; h_t uses the update gate z_t of the current time step to select and combine information from the hidden state h_{t-1} of the previous time step and the candidate hidden state h̃_t of the current time step.
z_t is the update gate, with values in (0, 1). The closer the value is to 0, the more historical information is discarded and the less new information is added at the current time step; the closer it is to 1, the less historical information is discarded and the more new information is added at the current time step. The update gate z_t is calculated by formula (5):
z_t = σ(W_{xz} x_t + W_{hz} h_{t-1} + W_{az} a_t) (5);
W_{xz}, W_{az} and W_{hz} are weights, updated automatically during training by gradient descent. h̃_t is the candidate value of the hidden state of the current time step, that is, the value used to update the new hidden state. Through the gating mechanism, h̃_t selectively admits the PM2.5 information x_t of the current time step, the auxiliary information a_t and the hidden state h_{t-1} of the previous time step into the AI-GRU;
The candidate hidden state h̃_t is calculated by formula (6):
r_t denotes the reset gate, calculated by formula (7). l_t denotes the gate on the linear transformation, calculated by formula (8). g_t denotes the gate on the auxiliary information, calculated by formula (9). p_t denotes the gate on the degree of fusion between the auxiliary information and the PM2.5 information, calculated by formula (10). H(x) denotes the linear transformation of PM2.5, calculated by formula (11). h̃_t scales the data to [-1, 1] through the tanh activation function, and the linearly transformed information is then added to obtain the result of h̃_t.
r_t = σ(W_{xr} x_t + W_{hr} h_{t-1}) (7);
l_t = σ(W_{xl} x_t + W_{hl} h_{t-1}) (8);
g_t = σ(W_{ag} a_t + W_{hg} h_{t-1}) (9);
p_t = σ(W_{ap} a_t + W_{hp} h_{t-1}) (10);
H(x_t) = W_x x_t (11);
In the above formulas, W_{xr}, W_{hr}, W_{xl}, W_{hl}, W_{ag}, W_{hg}, W_{ap}, W_{hp} and W_x are weights, updated automatically during training by gradient descent. r_t is the reset gate, which controls the historical information. In the calculation of h̃_t, r_t is multiplied element-wise with h_{t-1}; h_{t-1} contains all historical information up to the previous time step, and r_t takes values in (0, 1). The closer the value is to 0, the less historical information flows into the AI-GRU; the closer it is to 1, the more flows in, so that historical information irrelevant to prediction can be discarded in time.
g_t and p_t are nonlinear transformations of a_t and h_{t-1}. g_t extracts the auxiliary information useful for PM2.5 and controls how much auxiliary information flows into the AI-GRU; its values lie in (0, 1), where a value closer to 0 means less auxiliary information flows in and a value closer to 1 means more flows in. p_t fuses the auxiliary information with the PM2.5 information and controls their degree of fusion; its values lie in (0, 1), where a value closer to 0 means a lower degree of fusion and a value closer to 1 means a higher degree of fusion.
l_t is the gate on the linear transformation H(x) and controls how much linearly transformed PM2.5 information flows into the AI-GRU; its values lie in (0, 1), where a value closer to 0 means less PM2.5 information flows in and a value closer to 1 means more flows in. H(x) is a linear transformation of the PM2.5 information whose role is to let the AI-GRU attend to PM2.5 alone, so that it focuses more on the PM2.5 information.
Through the reset gate r_t, the auxiliary-information gate g_t, the fusion gate p_t and the linear-transformation gate l_t, the AI-GRU effectively controls the degree to which each kind of auxiliary information influences the air quality prediction. At the same time, the gating mechanism selectively admits the auxiliary information, PM2.5 information and historical information that benefit the prediction into the AI-GRU and discards information irrelevant to the prediction in time.
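The gate computations in formulas (5) and (7) to (11) can be sketched in plain Python as follows. This is an illustrative sketch only: the hidden size `d`, the random initialisation in `make_params`, and all function names are assumptions, and the sketch stops at the gates because the formulas for the candidate state h̃_t and the hidden state h_t are not reproduced in this text.

```python
import math
import random


def sigmoid(vec):
    """Element-wise logistic function; every output lies in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in vec]


def matvec(W, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in W]


def vec_add(*vecs):
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vecs)]


def make_params(d, n, seed=0):
    """Random d x (input size) weights, keyed by the formula subscripts."""
    rng = random.Random(seed)
    in_dims = {"xz": 1, "hz": d, "az": n, "xr": 1, "hr": d, "xl": 1,
               "hl": d, "ag": n, "hg": d, "ap": n, "hp": d, "x": 1}
    return {key: [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                  for _ in range(d)]
            for key, cols in in_dims.items()}


def ai_gru_gates(W, x_t, a_t, h_prev):
    """Compute z_t, r_t, l_t, g_t, p_t and H(x_t) for one time step."""
    x = [x_t]  # PM2.5 is a single value per time step (X_t is 1 x q)
    return {
        "z": sigmoid(vec_add(matvec(W["xz"], x), matvec(W["hz"], h_prev),
                             matvec(W["az"], a_t))),                      # (5)
        "r": sigmoid(vec_add(matvec(W["xr"], x), matvec(W["hr"], h_prev))),   # (7)
        "l": sigmoid(vec_add(matvec(W["xl"], x), matvec(W["hl"], h_prev))),   # (8)
        "g": sigmoid(vec_add(matvec(W["ag"], a_t), matvec(W["hg"], h_prev))), # (9)
        "p": sigmoid(vec_add(matvec(W["ap"], a_t), matvec(W["hp"], h_prev))), # (10)
        "H": matvec(W["x"], x),  # (11): linear, no activation
    }
```

For example, `ai_gru_gates(make_params(d=4, n=3), 0.5, [0.1, 0.2, 0.3], [0.0] * 4)` returns five gate vectors of length 4 with entries strictly between 0 and 1, plus the linear term H(x_t). Note that the gates on the auxiliary information (g_t, p_t) never see x_t, and the gates on PM2.5 (r_t, l_t) never see a_t, which matches the separation described above.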
2. Transition gated recurrent unit (T-GRU)
The transition gated recurrent unit (T-GRU) (reference: Pascanu R, Gulcehre C, Cho K, et al. How to construct deep recurrent neural networks [J]. arXiv preprint arXiv:1312.6026, 2013.) is an important component of deep transition networks; T-GRUs generally come into use when the transition network depth is greater than 2. The input of the T-GRU is only the hidden state of the layer above within the same time step, and its output is the hidden state of the current network layer within the same time step. The structure of the T-GRU is shown in FIG. 4.
The hidden state of the T-GRU is calculated by formula (12).
Here ⊙ denotes element-wise multiplication and i denotes the depth of the current transition network. z_t is the update gate, with values in (0, 1): the closer the value is to 0, the more historical information is discarded and the less information the current network layer adds; the closer it is to 1, the less information from the previous network layer is discarded and the more information the current network layer adds. It is calculated by formula (13).
The candidate value of the T-GRU hidden state processes the hidden state of the previous network layer through the reset gate r_t; it is calculated by formula (14).
r_t is the reset gate, which controls the historical information. Its values lie in (0, 1): the closer the value is to 0, the less historical information flows into the T-GRU; the closer it is to 1, the more flows in, so that historical information irrelevant to prediction can be cleared in time. It is calculated by formula (15).
The T-GRU receives only the hidden state passed from the layer above within the same time step, so it can learn the particular nonlinear relationship between consecutive hidden states and thereby obtain deeper state representations.
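Because formulas (12) to (15) are not reproduced in this text, the following Python sketch does not claim to be the patent's exact T-GRU. It assumes a conventional GRU-style cell with no external input: both gates and the candidate are computed from the hidden state of the layer below, and the update gate mixes that state with the candidate. The weight names `"z"`, `"r"`, `"h"`, the omission of bias terms and the mixing direction (1 - z) ⊙ h_below + z ⊙ candidate are all assumptions.

```python
import math


def _matvec(W, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in W]


def _sigmoid(vec):
    """Element-wise logistic function with outputs in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in vec]


def t_gru_step(params, h_below):
    """One transition step: the only input is the state of the layer below.

    Assumed form (not confirmed by the source):
        z = sigmoid(W_z h_below)
        r = sigmoid(W_r h_below)
        candidate = tanh(r * (W_h h_below))
        h = (1 - z) * h_below + z * candidate
    """
    z = _sigmoid(_matvec(params["z"], h_below))   # update gate
    r = _sigmoid(_matvec(params["r"], h_below))   # reset gate
    projected = _matvec(params["h"], h_below)
    candidate = [math.tanh(ri * pi) for ri, pi in zip(r, projected)]
    return [(1.0 - zi) * hi + zi * ci
            for zi, hi, ci in zip(z, h_below, candidate)]
```

Under this assumed form, a value of z close to 1 lets the current layer contribute more new information, which agrees with the description of the update gate above; whether the patent uses exactly this mixing is not recoverable from the text.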
Further, the detailed calculation process of the air quality prediction model AI-DTN is as follows:
The input of the model is divided into two parts. The first part is the PM2.5 time series over a historical time window of size q, denoted X_t = {x_{t-q+1}, ..., x_t}, where X_t is a 1×q matrix. The second part is the auxiliary-information time series over a historical time window of size q, denoted A_t = {a_{t-q+1}, ..., a_t}, where A_t is an n×q matrix, each of a_{t-q+1}, ..., a_t is an n×1 matrix, and n is the number of features in the auxiliary information.
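The two inputs can be built from raw series with a simple sliding window. The sketch below is an assumption about data layout, not something the source specifies: `pm25` is a list of T readings, `aux` is a list of T rows with n auxiliary features each, and `make_windows` is a hypothetical helper name.

```python
def make_windows(pm25, aux, q):
    """Return a list of (X_t, A_t) pairs for t = q-1, ..., T-1.

    X_t is a list of q PM2.5 values (the 1 x q matrix).
    A_t is a list of n rows with q entries each (the n x q matrix).
    """
    if len(pm25) != len(aux):
        raise ValueError("pm25 and aux must cover the same time steps")
    windows = []
    for t in range(q - 1, len(pm25)):
        X_t = list(pm25[t - q + 1:t + 1])
        columns = aux[t - q + 1:t + 1]            # q vectors a_s, each of length n
        A_t = [list(row) for row in zip(*columns)]  # transpose to n x q
        windows.append((X_t, A_t))
    return windows
```

With four time steps, two auxiliary features and q = 3, this yields two windows; the first has X_t = [x_0, x_1, x_2] and an A_t whose two rows hold the three readings of each auxiliary feature.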
Since the forward and reverse deep transition networks work on the same principle, the calculation process is described below using the forward deep transition network as an example. First, $X_t$ and $A_t$ are input into the AI-GRU to obtain the hidden state of the first layer of the deep transition network:
where $L$ denotes the number of layers of the deep transition network. This hidden state is a weighted fusion of the PM2.5 information and the auxiliary information and represents the spatial feature information at time $t$. The hidden state is then passed to the T-GRU of the next layer within the same time step, whose hidden state is as follows:
where $i$ denotes the current network depth. The T-GRU takes as input only the hidden state output by the previous layer (the AI-GRU, in the case of the second layer), and the hidden state of the last-layer T-GRU is passed as input to the AI-GRU of the next time step.
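The control flow described above (one AI-GRU step, then the T-GRU layers, with the last-layer state carried to the next time step) can be sketched in Python as follows. The function `deep_transition_forward` is a hypothetical name, and the two lambdas are toy stand-ins for the real cells, chosen only so the data flow is easy to follow.

```python
def deep_transition_forward(xs, aux, h0, ai_gru_step, t_gru_steps):
    """Run one deep transition network over a window.

    ai_gru_step(x_t, a_t, h) is layer 1; each callable in t_gru_steps is one
    of layers 2..L and sees only the hidden state of the layer before it.
    """
    h = h0
    outputs = []
    for x_t, a_t in zip(xs, aux):
        h = ai_gru_step(x_t, a_t, h)   # layer 1: fuse PM2.5, auxiliary info, history
        for step in t_gru_steps:       # layers 2..L within the same time step
            h = step(h)
        outputs.append(h)              # last-layer state feeds the next time step
    return outputs

# Toy stand-ins (assumptions for illustration, not the real cells).
toy_ai_gru = lambda x_t, a_t, h: h + x_t + a_t
toy_t_gru = lambda h: 2 * h
states = deep_transition_forward([1, 2], [10, 20], 0, toy_ai_gru, [toy_t_gru, toy_t_gru])
```

Running the same function on the reversed window would give the reverse-direction states, assuming the reverse network simply reads the series backwards.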
Likewise, the reverse deep transition network works on the same principle as the forward deep transition network. The reverse deep transition network performs reverse feature extraction on the two time series $X_t$ and $A_t$ to obtain hidden states representing reverse temporal information. The hidden states of the forward and reverse deep transition networks are then concatenated in temporal order:
where the symbol ";" denotes the concatenation operation. $E_t$ thus contains the spatial feature information and temporal feature information extracted from the forward and reverse time series by the deep transition networks. Finally, $E_t$ is input to a fully connected layer to obtain the final prediction:
$Y_t = W E_t + b$ (19);
where $W$ is the parameter matrix and $b$ is the bias term.
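A minimal Python sketch of the output stage, assuming the forward and reverse hidden states are plain vectors that are concatenated before the linear map of formula (19). The function name `predict` and the toy values are hypothetical.

```python
import numpy as np

def predict(h_forward, h_backward, W, b):
    """Concatenate both directions into E_t, then apply Y_t = W E_t + b."""
    E_t = np.concatenate([h_forward, h_backward])
    return W @ E_t + b

h_forward = np.array([1.0, 2.0])        # toy forward hidden state
h_backward = np.array([3.0, 4.0])       # toy reverse hidden state
W = np.array([[0.5, 0.5, 0.5, 0.5]])    # 1 x 4 parameter matrix
b = np.array([1.0])                     # bias term
Y_t = predict(h_forward, h_backward, W, b)
```

With hidden size d per direction, `W` needs 2d columns; the number of rows sets how many values are predicted.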
Various modifications and variations of the present invention will be apparent to those skilled in the art in light of the foregoing teachings and are intended to be included within the scope of the following claims.

Claims (4)

1. An air quality prediction method based on a deep transition network, characterized by comprising the following steps:
S1. acquiring air quality time series data and preprocessing the data;
S2. performing air quality prediction using an air quality prediction model AI-DTN based on auxiliary information and a deep transition network:
the air quality prediction model AI-DTN consists of a forward deep transition network, a reverse deep transition network and a fully connected layer, wherein the forward and reverse deep transition networks are used to extract spatial features and temporal features, the results of the two transition networks are then concatenated, and the fully connected layer finally produces the output;
the depth of each deep transition network is $L$; the first layer of the deep transition network is a gated recurrent unit AI-GRU, which is used to extract the spatial features of the input; the second to $L$-th layers of the deep transition network consist of transition gated recurrent units T-GRU, and the output of the $L$-th layer T-GRU at time $t$ is the input of the first-layer AI-GRU at time $t+1$;
the detailed calculation process of the air quality prediction model AI-DTN is as follows:
the input of the model is divided into two parts: the first part is the PM2.5 time series over a historical time window of size $q$, denoted $X_t = \{x_{t-q+1}, \ldots, x_t\}$, where $X_t$ is a matrix of dimension $1 \times q$; the second part is the time series of auxiliary information over a historical time window of size $q$, denoted $A_t = \{a_{t-q+1}, \ldots, a_t\}$, where $A_t$ is a matrix of dimension $n \times q$ and each of $a_{t-q+1}, \ldots, a_t$ is an $n \times 1$ matrix, $n$ being the number of features in the auxiliary information;
in the forward deep transition network, $X_t$ and $A_t$ are first input into the AI-GRU to obtain the hidden state of the first layer of the deep transition network:
wherein $L$ denotes the number of layers of the deep transition network, and the hidden state is a weighted fusion of the PM2.5 information and the auxiliary information and represents the spatial feature information at time $t$; the hidden state is then passed to the T-GRU of the next layer within the same time step, whose hidden state is as follows:
wherein $i$ denotes the current network depth, the T-GRU takes only the hidden state of the previous layer as input, and the hidden state of the last-layer T-GRU is passed to the AI-GRU of the next time step as input;
likewise, the reverse deep transition network performs reverse feature extraction on the two time series $X_t$ and $A_t$ to obtain hidden states representing reverse temporal information;
the hidden states of the forward and reverse deep transition networks are then concatenated in temporal order:
wherein the symbol ";" denotes the concatenation operation; $E_t$ thus contains the spatial feature information and temporal feature information extracted from the forward and reverse time series by the deep transition networks; finally, $E_t$ is input to the fully connected layer to obtain the final prediction:
$Y_t = W E_t + b$;
wherein $W$ is the parameter matrix and $b$ is the bias term.
2. The method according to claim 1, wherein in step S1, the specific preprocessing procedure is:
S1.1. missing value processing: missing values in the original air quality time series data are filled using Lagrange interpolation;
S1.2. normalization: min-max normalization is applied to linearly transform the data after missing value processing, so that the resulting values are mapped to the interval [0, 1].
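For illustration only, the two preprocessing steps can be sketched in Python as below. The text does not say how many neighbouring samples the Lagrange interpolation uses, so the choice of up to `k` known points on each side is an assumption, as are the function names `lagrange_value`, `fill_missing` and `min_max`.

```python
def lagrange_value(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs, ys) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for m, xm in enumerate(xs):
            if m != j:
                term *= (x - xm) / (xj - xm)
        total += term
    return total

def fill_missing(series, k=2):
    """Replace None entries using up to k known neighbours on each side (assumed)."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values to interpolate from")
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            idx = [j for j in known if j < i][-k:] + [j for j in known if j > i][:k]
            filled[i] = lagrange_value(idx, [series[j] for j in idx], i)
    return filled

def min_max(values):
    """Linearly map values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

raw = [1.0, 2.0, None, 4.0, 5.0]   # toy series with one missing reading
filled = fill_missing(raw)
scaled = min_max(filled)
```

Keeping `k` small limits the polynomial degree, which avoids the oscillation that high-degree Lagrange polynomials can show on long gaps.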
3. The method according to claim 1, wherein in step S2, for time step $t$, the hidden state $h_t$ of the AI-GRU network is calculated as follows:
wherein $h_t$ uses the update gate $z_t$ of the current time step to select and combine information from the hidden state $h_{t-1}$ of the previous time step and the candidate hidden state of the current time step;
$z_t$ is the update gate, with values in the range (0, 1): the closer the value is to 0, the more historical information is discarded and the less information is newly added in the current time step; the closer the value is to 1, the less information from past time steps is discarded and the more information is newly added in the current time step; the update gate $z_t$ is calculated as follows:
$z_t = \sigma(W_{xz} x_t + W_{hz} h_{t-1} + W_{az} a_t)$;
$W_{xz}$, $W_{hz}$ and $W_{az}$ each denote weights; the candidate hidden state of the current time step uses the gating mechanism to selectively add the PM2.5 information $x_t$ of the current time step, the auxiliary information $a_t$ and the hidden state $h_{t-1}$ of the previous time step into the AI-GRU; the candidate hidden state is calculated as follows:
$r_t$ denotes the reset gate, $l_t$ denotes the linear transformation gate, $g_t$ denotes the auxiliary information gate, $p_t$ denotes the gate for the degree of fusion of the auxiliary information and the PM2.5 information, and $H(x_t)$ denotes a linear transformation of PM2.5; the candidate hidden state scales the data to [-1, 1] through the tanh activation function and then adds the linearly transformed information to obtain its result; $r_t$, $l_t$, $g_t$, $p_t$ and $H(x_t)$ are calculated as follows:
$r_t = \sigma(W_{xr} x_t + W_{hr} h_{t-1})$ (7);
$l_t = \sigma(W_{xl} x_t + W_{hl} h_{t-1})$ (8);
$g_t = \sigma(W_{ag} a_t + W_{hg} h_{t-1})$ (9);
$p_t = \sigma(W_{ap} a_t + W_{hp} h_{t-1})$ (10);
$H(x_t) = W_x x_t$ (11);
in the above formulas, $W_{xr}$, $W_{hr}$, $W_{xl}$, $W_{hl}$, $W_{ag}$, $W_{hg}$, $W_{ap}$, $W_{hp}$ and $W_x$ each denote weights; $r_t$ denotes the reset gate, which controls the historical information; in the calculation of the candidate hidden state, $r_t$ and $h_{t-1}$ are multiplied element-wise, where $h_{t-1}$ contains all historical information up to the previous time step and $r_t$ takes values in the range (0, 1): the closer the value is to 0, the less historical information flows into the AI-GRU; the closer the value is to 1, the more historical information flows into the AI-GRU, so that historical information irrelevant to the prediction can be discarded in time;
$g_t$ and $p_t$ apply a nonlinear transformation to $a_t$ and $h_{t-1}$; the function of $g_t$ is to extract the auxiliary information that is useful for PM2.5; it controls the extent to which the auxiliary information flows into the AI-GRU and takes values in the range (0, 1): the closer the value is to 0, the less auxiliary information flows into the AI-GRU; the closer the value is to 1, the more auxiliary information flows into the AI-GRU; the function of $p_t$ is to fuse the auxiliary information with the PM2.5 information; it controls the degree of fusion of the auxiliary information and the PM2.5 information and takes values in the range (0, 1): the closer the value is to 0, the smaller the degree of fusion; the closer the value is to 1, the greater the degree of fusion;
$l_t$ is the gate of the linear transformation $H(x_t)$; it controls how much of the linearly transformed PM2.5 information flows into the AI-GRU and takes values in the range (0, 1): the closer the value is to 0, the less PM2.5 information flows into the AI-GRU; the closer the value is to 1, the more PM2.5 information flows into the AI-GRU; $H(x_t)$ is the linear transformation of the PM2.5 information, which lets the AI-GRU attend to PM2.5 alone so that it focuses more on the PM2.5 information;
through the reset gate $r_t$, the auxiliary information gate $g_t$, the fusion gate $p_t$ for the auxiliary information and the PM2.5 information, and the linear transformation gate $l_t$, the AI-GRU effectively controls the degree to which the various kinds of auxiliary information influence the air quality prediction; at the same time, the gating mechanism selectively adds into the AI-GRU the auxiliary information, PM2.5 information and historical information that have a positive effect on the prediction, and discards in time the information that is irrelevant to the prediction.
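The gate formulas that survive in this text (the update gate and formulas (7) to (11)) can be written out directly, as in the Python sketch below. The formula for the candidate hidden state is not recoverable here, so the sketch takes the candidate as an argument rather than guessing it. The combination rule in `combine` is an assumption consistent with the statement that an update gate close to 1 admits more new information. The dictionary keys, sizes and random weights are hypothetical.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def ai_gru_gates(x_t, a_t, h_prev, W):
    """Gates of the AI-GRU: the update gate plus formulas (7) to (11)."""
    return {
        "z": sigmoid(W["xz"] @ x_t + W["hz"] @ h_prev + W["az"] @ a_t),  # update gate
        "r": sigmoid(W["xr"] @ x_t + W["hr"] @ h_prev),   # (7) reset gate
        "l": sigmoid(W["xl"] @ x_t + W["hl"] @ h_prev),   # (8) linear transformation gate
        "g": sigmoid(W["ag"] @ a_t + W["hg"] @ h_prev),   # (9) auxiliary information gate
        "p": sigmoid(W["ap"] @ a_t + W["hp"] @ h_prev),   # (10) fusion gate
        "H": W["x"] @ x_t,                                # (11) linear transform of PM2.5
    }

def combine(z, h_prev, h_cand):
    # Assumed convention: z close to 1 keeps more of the new candidate.
    return (1.0 - z) * h_prev + z * h_cand

rng = np.random.default_rng(0)
d, n = 4, 3                      # hidden size and number of auxiliary features
shapes = {"xz": (d, 1), "hz": (d, d), "az": (d, n), "xr": (d, 1), "hr": (d, d),
          "xl": (d, 1), "hl": (d, d), "ag": (d, n), "hg": (d, d),
          "ap": (d, n), "hp": (d, d), "x": (d, 1)}
W = {name: rng.standard_normal(shape) * 0.1 for name, shape in shapes.items()}
x_t, a_t, h_prev = np.array([0.3]), rng.standard_normal(n), np.zeros(d)
gates = ai_gru_gates(x_t, a_t, h_prev, W)
```

Note that only the update gate sees all three inputs; the reset and linear transformation gates see PM2.5 and history, while the auxiliary and fusion gates see auxiliary information and history.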
4. The method according to claim 1, wherein the hidden state of the T-GRU is calculated as follows:
wherein $\odot$ denotes element-wise multiplication and $i$ denotes the current depth of the transition network; the update gate takes values in the range (0, 1): the closer the value is to 0, the more historical information is discarded and the less information is newly added by the current network layer; the closer the value is to 1, the less information from the previous network layer is discarded and the more information is newly added by the current network layer; its calculation formula is as follows:
the candidate hidden state of the T-GRU is obtained by applying the reset gate to the hidden state of the previous network layer, and its calculation formula is as follows:
the reset gate controls the historical information; it takes values in the range (0, 1): the closer the value is to 0, the less historical information flows into the T-GRU; the closer the value is to 1, the more historical information flows into the T-GRU, so that historical information irrelevant to the prediction can be cleared in time; its calculation formula is as follows:
the T-GRU receives only the hidden state passed from the previous layer within the same time step, so it can learn the nonlinear relationship between consecutive hidden states and thereby obtain a deeper state representation.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110923976.6A CN113762351B (en) 2021-08-12 2021-08-12 Air quality prediction method based on deep transition network

Publications (2)

Publication Number Publication Date
CN113762351A CN113762351A (en) 2021-12-07
CN113762351B true CN113762351B (en) 2023-12-05

Family

ID=78789092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110923976.6A Active CN113762351B (en) 2021-08-12 2021-08-12 Air quality prediction method based on deep transition network

Country Status (1)

Country Link
CN (1) CN113762351B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103714261A (en) * 2014-01-14 2014-04-09 吉林大学 Intelligent auxiliary medical treatment decision supporting method of two-stage mixed model
CN111275168A (en) * 2020-01-17 2020-06-12 南京信息工程大学 Air quality prediction method of bidirectional gating circulation unit based on convolution full connection
CN112085163A (en) * 2020-08-26 2020-12-15 哈尔滨工程大学 Air quality prediction method based on attention enhancement graph convolutional neural network AGC and gated cyclic unit GRU
CN113095550A (en) * 2021-03-26 2021-07-09 北京工业大学 Air quality prediction method based on variational recursive network and self-attention mechanism

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10867595B2 (en) * 2017-05-19 2020-12-15 Baidu Usa Llc Cold fusing sequence-to-sequence models with language models

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
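Research on a hybrid stock index prediction model based on deep learning; 从筱卿; CNKI; full text *
Short-term wind power prediction model based on a deep gated recurrent unit neural network; 牛哲文; 余泽远; 李波; 唐文虎; Electric Power Automation Equipment (No. 05); full text *
Mango yield prediction based on a bidirectional gated recurrent unit with self-attention mechanism and a convolutional neural network; 林靖皓; 秦亮曦; 苏永秀; 秦川; Journal of Computer Applications (No. S1); full text *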

Also Published As

Publication number Publication date
CN113762351A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN111400620B (en) User trajectory position prediction method based on space-time embedded Self-orientation
CN111667092B (en) Short-time passenger flow prediction method and system for rail transit based on graph convolution neural network
CN105654729A (en) Short-term traffic flow prediction method based on convolutional neural network
CN113094860B (en) Industrial control network flow modeling method based on attention mechanism
JP3637412B2 (en) Time-series data learning / prediction device
US20230334981A1 (en) Traffic flow forecasting method based on multi-mode dynamic residual graph convolution network
CN114802296A (en) Vehicle track prediction method based on dynamic interaction graph convolution
CN108596470A (en) A kind of power equipments defect text handling method based on TensorFlow frames
CN109086892A (en) It is a kind of based on the visual problem inference pattern and system that typically rely on tree
CN115951014A (en) CNN-LSTM-BP multi-mode air pollutant prediction method combining meteorological features
CN110309537A (en) A kind of the intelligent health prediction technique and system of aircraft
CN107945210A (en) Target tracking algorism based on deep learning and environment self-adaption
CN113902007A (en) Model training method and device, image recognition method and device, equipment and medium
CN116168548A (en) Traffic flow prediction method of space-time attention pattern convolution network based on multi-feature fusion
CN113112791A (en) Traffic flow prediction method based on sliding window long-and-short term memory network
CN111612175A (en) Waste mobile phone intelligent pricing method based on fuzzy transfer learning
CN113255597B (en) Transformer-based behavior analysis method and device and terminal equipment thereof
CN113762351B (en) Air quality prediction method based on deep transition network
CN114241606A (en) Character interaction detection method based on adaptive set learning prediction
CN117116048A (en) Knowledge-driven traffic prediction method based on knowledge representation model and graph neural network
CN115438190B (en) Power distribution network fault auxiliary decision knowledge extraction method and system
CN115146844A (en) Multi-mode traffic short-time passenger flow collaborative prediction method based on multi-task learning
CN114548572A (en) Method, device, equipment and medium for predicting urban road network traffic state
CN113112792A (en) Multi-module traffic intensity prediction method based on semantic information
Yao et al. A unified neural network for panoptic segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant