CN114936530A - Multi-element air quality data missing value filling model based on TAM and construction method thereof

Info

Publication number
CN114936530A
CN114936530A CN202210714518.6A CN202210714518A CN114936530A CN 114936530 A CN114936530 A CN 114936530A CN 202210714518 A CN202210714518 A CN 202210714518A CN 114936530 A CN114936530 A CN 114936530A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210714518.6A
Other languages
Chinese (zh)
Inventor
马思远
宋伟
任晟岐
焦佳辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University
Original Assignee
Zhengzhou University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention relates to a TAM-based missing value filling model for multivariate air quality data and a construction method thereof. The Triple-Views Layer predicts missing readings from three perspectives: a timestamp view, a feature view, and a short-term historical data view. The Output Layer performs different calculations depending on the numerical type of the missing reading. When the missing reading is a continuous value, the Output Layer assigns a weight to the predicted value of each view and computes their weighted sum as the final prediction result. When the missing reading is a discrete value, the Output Layer concatenates the outputs of the timestamp view and the feature view, obtains the prediction probability of each category through linear mapping and normalization, and derives the predicted value from those probabilities. The model can predict and fill missing readings of different numerical types with improved prediction accuracy, and offers a new approach for subsequent work on filling missing values in multivariate air quality data.

Description

Multi-element air quality data missing value filling model based on TAM and construction method thereof
Technical Field
The invention relates to a TAM-based missing value filling model for multivariate air quality data and a construction method thereof.
Background
Air quality data is multi-dimensional time series data with geographic markers, and is characterized by timeliness, sequentiality, and seasonal periodicity. After many years of research and development, methods for handling missing data include the missing data deletion method, the mean value substitution method (MEAN), the previous value substitution method, the linear regression model LR (Linear Regression), the multi-layer perceptron MLP (Multi-Layer Perceptron), the K-nearest neighbor model KNN (K-Nearest Neighbors), and the recurrent neural network RNN (Recurrent Neural Network). In the big data era, data quality profoundly influences decision making and scientific development. Given this importance, and the influence of data on the field of machine learning, missing data handling remains an area that many researchers continue to explore.
The missing data deletion method is an early way of handling missing data. It directly deletes every record that contains a missing reading, so a large amount of information is lost and the structure of the data is damaged. For a data set with a high missing rate, there may even be no usable data left for analysis and processing.
When filling a missing value of a certain attribute, the mean value substitution method fills in the average of all observed values of that attribute, and the previous value substitution method fills in the value at the timestamp immediately before the missing value. Both methods ignore the variance of the data and the correlation between attributes, so their predictions are one-sided.
The linear regression model is a feedforward neural network (Feed Forward Neural Network) composed of a single fully connected layer (Fully Connected Layer), and the multi-layer perceptron extends the linear regression model with activation functions and additional fully connected layers. Compared with the mean value substitution method, both models retain the variance and covariance of the variables with missing data. However, all of their estimates follow a single regression curve, only the time series information of a single feature is considered, the structure of the data matrix is ignored, and the estimates cannot represent any intrinsic variation in the data.
The K-nearest neighbor model is a machine learning model with no parameters to learn. For a target sample containing missing values, KNN computes the "distance" between the target sample and the known samples according to a distance metric (such as Euclidean distance), and selects the K nearest samples to predict the missing values.
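The KNN filling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `knn_impute`, the sample values, and the choice to average the K nearest samples are hypothetical (the text says only that the nearest samples are used to predict the missing value), and the distance is computed over the attributes observed in the target sample.

```python
import math

def knn_impute(target, known, k=2):
    """Fill None entries of `target` using the k nearest rows of `known`."""
    observed = [j for j, v in enumerate(target) if v is not None]

    def distance(row):
        # Euclidean distance, restricted to the attributes observed in the target
        return math.sqrt(sum((target[j] - row[j]) ** 2 for j in observed))

    nearest = sorted(known, key=distance)[:k]
    # Assumption: the prediction is the plain average over the k nearest samples
    return [
        v if v is not None else sum(row[j] for row in nearest) / len(nearest)
        for j, v in enumerate(target)
    ]

# Hypothetical complete samples (one row per sample, one column per attribute)
known_rows = [
    [10.0, 20.0, 30.0],
    [11.0, 21.0, 33.0],
    [50.0, 60.0, 90.0],
]
filled = knn_impute([10.5, 20.5, None], known_rows, k=2)
print(filled)
```

Every prediction scans all known samples, which is the time cost that the background attributes to KNN.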
The recurrent neural network handles sequential problems, has a "memory" capability, and handles temporal correlation well. However, on the one hand, RNN forward propagation is sequential, which makes it difficult to exploit the parallelism of a GPU; processing data with a long time series takes a great deal of time, and computational efficiency is low. On the other hand, RNNs struggle with long-range dependence, so it is difficult to mine the information of all timestamps in data with a long time series.
TAM is a deep learning model based on the Attention mechanism (Attention). First, Attention explores latent interrelations in data well: for multi-feature air quality data, the Attention mechanism can extract the internal relations of the data from different angles, namely the correlation between timestamps and the correlation between features, and thereby achieve a better prediction effect. Second, the Attention mechanism handles long-range dependence well, because it attends to the information of all timestamps and therefore avoids the difficulty of relating timestamps that are far apart. Finally, the Attention mechanism makes good use of the parallel processing capability of the GPU and remains efficient when processing data with a long time series.
In conclusion, the Attention mechanism is well suited to the task of filling missing values in multivariate air quality data.
Nowadays, people pay more and more attention to air pollution because it constantly threatens human health and the sustainable development of society. More and more monitoring stations have therefore been established in cities to continuously acquire air quality data, meteorological data and the like, which provide the data basis for analyzing pollution sources, exploring the main pollution components and predicting air quality. However, due to shutdown for maintenance, damage, communication errors, unexpected interruptions (such as power failure) and the like of the monitoring equipment, the data obtained by the sensors contains missing values. Missing data not only affects real-time monitoring of pollutant values, but also interferes with data analysis and pollutant concentration prediction, so the validity of the data is very important for analyzing the data and for preventing and treating air pollution.
Air quality missing value filling (air quality data imputation) is an important branch of the urban air quality prediction task. It has important research significance and application value, and has attracted wide attention in the field of air quality data mining. Traditional data analysis methods (such as the mean value substitution method MEAN and the previous value substitution method) cannot meet the challenges of big data and are inefficient, whereas big data analysis technology helps to mine the data deeply, extract the patterns and rules of air quality change, and apply them to the task of filling missing data to obtain a better effect. Among the big-data-based methods, LR and MLP only consider the time series information of a single feature and ignore the structure of the data matrix; KNN must compare against all known data when predicting each missing reading, which is time-consuming, and the algorithm depends on the distance metric; and although RNN is well suited to time series prediction tasks, its operating efficiency is low and it cannot solve the long-range dependence problem.
Disclosure of Invention
The invention provides a TAM-based multivariate air quality data missing value filling model and a construction method thereof, to solve the technical problem that existing missing value filling models for urban multivariate air quality data have low prediction accuracy.
A construction method of a TAM-based multivariate air quality data missing value filling model comprises the following steps:
constructing a BatchNorm layer for normalizing the multivariate air quality data;
constructing an Input Layer, wherein the Input Layer comprises a multivariate time series position code and a fully connected linear layer, the multivariate time series position code is used for adding position information to the time series, and the fully connected linear layer is used for mapping the input data to a dense vector;
constructing a Triple-Views Layer, wherein the Triple-Views Layer comprises a timestamp view predictor, a feature view predictor and a short-term historical data view predictor, the timestamp view predictor predicts from the time dimension, the feature view predictor predicts from the multivariate feature dimension, and the short-term historical data view predictor predicts according to historical readings;
and constructing an Output Layer, wherein the Output Layer is divided into a Continuous Output Layer and a Discrete Output Layer according to the type of the missing data. For continuous missing data, the Continuous Output Layer first assigns and initializes weights and a bias for the timestamp view predictor, the feature view predictor and the short-term historical data view predictor, then maps the output matrices of the timestamp view predictor and the feature view predictor into predicted values, and finally computes the weighted sum of the predicted values according to the weights to obtain the final prediction result. For discrete missing data, the Discrete Output Layer first processes the output matrices of the timestamp view predictor and the feature view predictor into an output vector, then maps the output vector into the [0,1] interval so that all output vector values sum to 1, and finally selects the node with the maximum probability as the output node.
In one embodiment, constructing the BatchNorm layer comprises:
constructing a BatchNorm layer and normalizing the multivariate air quality data through a normalization function.
In one embodiment, a fully connected layer and a position code are arranged in the Input Layer.
In one embodiment, a multi-head attention mechanism, a ReLU activation function and Dropout (random deactivation) are arranged in the timestamp view predictor and the feature view predictor.
In one embodiment, assigning and initializing the weights and bias comprises:
assigning and initializing the weights and bias $\{w_1, w_2, w_3, b\}$ for the views, with initial values $\{0.33, 0.33, 0.33, 0\}$;
mapping the output matrices of the timestamp view predictor and the feature view predictor into predicted values comprises:
denoting the output matrices of the timestamp view predictor and the feature view predictor as $\mathrm{Matrix}_T$ and $\mathrm{Matrix}_F$ respectively, and mapping each output matrix through a one-layer fully connected neural network to obtain a predicted value;
computing the weighted sum of the predicted values according to the weights comprises:
denoting the predicted values of the feature view predictor, the timestamp view predictor and the short-term historical data view predictor as $\mathrm{pre}_F$, $\mathrm{pre}_T$ and $\mathrm{pre}_P$ respectively, and computing the weighted sum of the predicted values of the three views by the following formula:
$$\mathrm{Output} = w_1 \cdot \mathrm{pre}_F + w_2 \cdot \mathrm{pre}_T + w_3 \cdot \mathrm{pre}_P + b$$
wherein $\mathrm{Output}$ is the final prediction result;
processing the output matrices of the timestamp view predictor and the feature view predictor into an output vector comprises:
denoting the output matrices of the timestamp view predictor and the feature view predictor as $\mathrm{Matrix}_T$ and $\mathrm{Matrix}_F$ respectively, concatenating the output matrices to obtain the vector $\mathrm{Concatenate}(\mathrm{Matrix}_T, \mathrm{Matrix}_F)$, and then mapping it through a fully connected layer to obtain the output vector;
mapping the output vector into the [0,1] interval comprises:
$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}}$$
wherein $z_i$ is the $i$-th value of the output vector, and $C$ is the length of the output vector, i.e. the number of classes of the discrete data.
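The discrete branch described above (concatenate, fully connected mapping, normalization into [0,1], selection of the most probable node) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the row-major flattening used for concatenation, and all matrix and weight values are assumptions, and the normalization is assumed to be the standard Softmax function.

```python
import math

def softmax(z):
    """Map a vector into [0,1] so that its values sum to 1."""
    m = max(z)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def discrete_output(matrix_t, matrix_f, weight, bias):
    # Concatenate: flatten both output matrices into one vector
    # (row-major flattening is an assumption of this sketch)
    concat = [v for row in matrix_t for v in row] + [v for row in matrix_f for v in row]
    # Fully connected layer: one weight row and one bias per class
    z = [sum(w * x for w, x in zip(w_row, concat)) + b for w_row, b in zip(weight, bias)]
    probs = softmax(z)
    # The node with the maximum probability is the output node
    return probs, max(range(len(probs)), key=probs.__getitem__)

# Hypothetical predictor outputs and layer parameters for C = 3 classes
matrix_t = [[0.2, 0.1]]
matrix_f = [[0.4, 0.3]]
weight = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
bias = [0.0, 0.0, 0.0]
probs, predicted_class = discrete_output(matrix_t, matrix_f, weight, bias)
print(probs, predicted_class)
```

In a trained model the weight rows and biases would be learned; here they are fixed only so that the example can be followed by hand.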
A TAM-based multivariate air quality data missing value filling model comprises:
a BatchNorm layer for normalizing the multivariate air quality data;
an Input Layer, wherein the Input Layer comprises a multivariate time series position code and a fully connected linear layer, the multivariate time series position code is used for adding position information to the time series, and the fully connected linear layer is used for mapping the input data to a dense vector;
a Triple-Views Layer, wherein the Triple-Views Layer comprises a timestamp view predictor, a feature view predictor and a short-term historical data view predictor, the timestamp view predictor predicts from the time dimension, the feature view predictor predicts from the multivariate feature dimension, and the short-term historical data view predictor predicts according to historical readings;
and an Output Layer, wherein the Output Layer is divided into a Continuous Output Layer and a Discrete Output Layer according to the type of the missing data. For continuous missing data, the Continuous Output Layer first assigns and initializes weights and a bias for the timestamp view predictor, the feature view predictor and the short-term historical data view predictor, then maps the output matrices of the timestamp view predictor and the feature view predictor into predicted values, and finally computes the weighted sum of the predicted values according to the weights to obtain the final prediction result. For discrete missing data, the Discrete Output Layer first processes the output matrices of the timestamp view predictor and the feature view predictor into an output vector, then maps the output vector into the [0,1] interval so that all output vector values sum to 1, and finally selects the node with the maximum probability as the output node.
In one embodiment, constructing the BatchNorm layer comprises:
constructing a BatchNorm layer and normalizing the multivariate air quality data through a normalization function.
In one embodiment, a fully connected layer and a position code are arranged in the Input Layer.
In one embodiment, a multi-head attention mechanism, a ReLU activation function and Dropout (random deactivation) are arranged in the timestamp view predictor and the feature view predictor.
In one embodiment, assigning and initializing the weights and bias comprises:
assigning and initializing the weights and bias $\{w_1, w_2, w_3, b\}$ for the views, with initial values $\{0.33, 0.33, 0.33, 0\}$;
mapping the output matrices of the timestamp view predictor and the feature view predictor into predicted values comprises:
denoting the output matrices of the timestamp view predictor and the feature view predictor as $\mathrm{Matrix}_T$ and $\mathrm{Matrix}_F$ respectively, and mapping each output matrix through a one-layer fully connected neural network to obtain a predicted value;
computing the weighted sum of the predicted values according to the weights comprises:
denoting the predicted values of the feature view predictor, the timestamp view predictor and the short-term historical data view predictor as $\mathrm{pre}_F$, $\mathrm{pre}_T$ and $\mathrm{pre}_P$ respectively, and computing the weighted sum of the predicted values of the three views by the following formula:
$$\mathrm{Output} = w_1 \cdot \mathrm{pre}_F + w_2 \cdot \mathrm{pre}_T + w_3 \cdot \mathrm{pre}_P + b$$
wherein $\mathrm{Output}$ is the final prediction result;
processing the output matrices of the timestamp view predictor and the feature view predictor into an output vector comprises:
denoting the output matrices of the timestamp view predictor and the feature view predictor as $\mathrm{Matrix}_T$ and $\mathrm{Matrix}_F$ respectively, concatenating the output matrices to obtain the vector $\mathrm{Concatenate}(\mathrm{Matrix}_T, \mathrm{Matrix}_F)$, and then mapping it through a fully connected layer to obtain the output vector;
mapping the output vector into the [0,1] interval comprises:
$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}}$$
wherein $z_i$ is the $i$-th value of the output vector, and $C$ is the length of the output vector, i.e. the number of classes of the discrete data.
The invention provides a TAM-based multivariate air quality data missing value filling model and a construction method thereof. After the output of each view is obtained, the Output Layer is divided into a Continuous Output Layer and a Discrete Output Layer according to the data type of the missing value. For continuous missing data, the Continuous Output Layer first assigns and initializes weights and a bias for the timestamp view predictor, the feature view predictor and the short-term historical data view predictor, then maps the output matrices of the timestamp view predictor and the feature view predictor into predicted values, and finally computes the weighted sum of the predicted values according to the weights to obtain the final prediction result. For discrete missing data, the Discrete Output Layer first processes the output matrices of the timestamp view predictor and the feature view predictor into an output vector, then maps the output vector into the [0,1] interval so that all output vector values sum to 1, and finally selects the node with the maximum probability as the output node. Compared with traditional models and other models based on big data technology, the multivariate air quality data missing value filling model provided by the invention can achieve higher prediction accuracy, and opens up a new approach for subsequent work on filling missing values in multivariate air quality data.
Drawings
FIG. 1 is a flow chart of the construction method of the TAM-based multivariate air quality data missing value filling model provided by the invention;
FIG. 2 is a specific data execution flow chart of the TAM-based multivariate air quality data missing value filling model provided by the invention;
FIG. 3 is a schematic diagram of the overall structure of the TAM-based multivariate air quality data missing value filling model provided by the invention;
FIG. 4 is a specific network structure diagram of the TAM-based multivariate air quality data missing value filling model provided by the invention.
Detailed Description
Embodiment of the construction method of the TAM-based multivariate air quality data missing value filling model:
This embodiment provides a TAM-based multivariate air quality data missing value filling model and a construction method thereof. The hardware that executes the construction method may be a desktop computer, a notebook computer, a server device, or an intelligent mobile terminal (a tablet computer, a smart phone, etc.), which is not limited in this embodiment.
As shown in FIG. 1, the construction method includes:
Step 1: constructing a BatchNorm layer:
A BatchNorm layer is constructed, and the multivariate air quality data is normalized through a normalization function.
Normalization function:
$$\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}}$$
wherein $x$ is the input data, $\hat{x}$ is the normalized data, $\mu$ is the mean value of the input data, $\sigma^2$ is the variance of the input data, and $\epsilon$ is a minimal value that prevents the denominator from being 0.
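The normalization step can be illustrated for a single attribute column as below. This is a sketch under assumptions: the function name `batch_norm`, the value of `eps` and the sample readings are hypothetical, the variance is the population (biased) variance, and the learnable scale and shift parameters that BatchNorm layers in deep learning frameworks usually carry are omitted because the text does not mention them.

```python
import math

def batch_norm(column, eps=1e-5):
    """Normalize one attribute column to zero mean and (almost) unit variance."""
    mean = sum(column) / len(column)
    # Population (biased) variance of the column
    var = sum((x - mean) ** 2 for x in column) / len(column)
    # eps keeps the denominator away from 0 for constant columns
    return [(x - mean) / math.sqrt(var + eps) for x in column]

pm25 = [35.0, 42.0, 58.0, 61.0]  # hypothetical readings of one attribute
normalized = batch_norm(pm25)
print(normalized)
```

A constant column such as `[7.0, 7.0, 7.0]` normalizes to zeros instead of raising a division error, which is the purpose of the small constant in the denominator.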
Step 2: constructing an Input Layer, wherein the Input Layer comprises a multivariate time series position code and a fully connected linear layer, the multivariate time series position code is used for adding position information to the time series, and the fully connected linear layer is used for mapping the input data to a dense vector:
The input data is a multi-dimensional time series of a certain length. Its time span is small, and each value only represents the reading of one attribute at one timestamp, so a fully connected layer is added to map the sparse vector into a high-dimensional dense vector.
Since the self-attention mechanism cannot identify the temporal position information of a multi-dimensional time series, a position code is added to encode the time information.
Fully connected layer: $A = WX + B$
Position coding:
$$PE_{(pos,\,2i)} = \sin\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$
$$PE_{(pos,\,2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$
wherein $pos$ refers to the position of a certain feature in the multivariate air quality data, with value range [0, max_sequence_length], where max_sequence_length is the total number of features contained in the multivariate air quality data; $i$ refers to the time dimension index of the feature, with value range [0, embedding_dimension/2], where embedding_dimension is the dimension of the dense vector into which the multi-dimensional feature is mapped; and $d_{model}$ refers to the value of embedding_dimension.
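A position code of this kind can be sketched as follows. The sketch assumes the sinusoidal encoding commonly used with Transformer models (sine on even indices, cosine on odd indices, base 10000), which matches the symbols pos, i and d_model defined above; the function name and the small sizes are hypothetical, and an even embedding_dimension is assumed.

```python
import math

def positional_encoding(max_sequence_length, embedding_dimension):
    """Return a max_sequence_length x embedding_dimension table of position codes."""
    pe = [[0.0] * embedding_dimension for _ in range(max_sequence_length)]
    for pos in range(max_sequence_length):
        for i in range(embedding_dimension // 2):
            angle = pos / (10000 ** (2 * i / embedding_dimension))
            pe[pos][2 * i] = math.sin(angle)      # even index: sine
            pe[pos][2 * i + 1] = math.cos(angle)  # odd index: cosine
    return pe

pe = positional_encoding(max_sequence_length=4, embedding_dimension=6)
for row in pe:
    print([round(v, 4) for v in row])
```

The table would typically be added element-wise to the dense vectors produced by the fully connected layer, so that each position carries a distinct, deterministic signature.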
Step 3: constructing a Triple-Views Layer, wherein the Triple-Views Layer comprises a timestamp view predictor, a feature view predictor and a short-term historical data view predictor, the timestamp view predictor predicts from the time dimension, the feature view predictor predicts from the multivariate feature dimension, and the short-term historical data view predictor predicts according to historical readings:
First, multivariate air quality data is time series data, and predicting its values at a future time depends on how it varies over time. Second, multivariate air quality data has multiple feature dimensions, the data of each feature dimension is a univariate time series, and latent relations exist between different features. Finally, the values of a time series tend to be similar to short-term neighboring data. This embodiment designs a Triple-Views Layer framework in which the predictors of the timestamp view and the feature view compute the correlations among timestamps and among features, respectively, through an attention mechanism.
$$Q = X \cdot W_Q$$
$$K = X \cdot W_K$$
$$V = X \cdot W_V$$
wherein $Q$, $K$ and $V$ respectively represent the Query vector, Key vector and Value vector of each sequence.
The self-attention mechanism is as follows:
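$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
wherein $d_k$ is the dimension of the Key vector.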
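The self-attention computation can be sketched with plain Python lists as below. This assumes the standard scaled dot-product form, in which the Query-Key scores are divided by the square root of the Key dimension and normalized row by row with softmax; the helper names, the identity projection matrices and the input values are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    # Q = X·W_Q, K = X·W_K, V = X·W_V
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    # Scaled dot-product scores, then a row-wise softmax gives the weight matrix
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

# Hypothetical input: 3 sequence positions, 2 dimensions each
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned W_Q, W_K, W_V
output, weights = self_attention(x, identity, identity, identity)
print(weights)
print(output)
```

Each row of `weights` shows how strongly one position attends to every other position, and all rows are computed independently, which is why this calculation parallelizes well.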
The timestamp view predictor, the feature view predictor and the short-term historical data view predictor of the Triple-Views Layer are constructed as follows:
Timestamp view predictor (Timestamp View Predictor):
To attend to latent links between data at different timestamps, the model computes a correlation weight matrix between all timestamps, where a larger weight represents a higher correlation. The multi-head Attention mechanism computes correlations in different subspaces and concatenates the multiple Attention results as the output vector. The multi-head attention is immediately followed by a ReLU nonlinear activation function and Dropout (random deactivation): ReLU adds nonlinearity to the neural network, which speeds up learning and strengthens the fitting capability of the model, while random deactivation reduces overfitting.
As shown in FIG. 2, the data processing flow of the timestamp view predictor is as follows: first, a linear layer maps the normalized multivariate air quality data into dense vectors; next, a position code is added to the multivariate time series (dense vectors); then, the timestamp Q, K, V matrices are generated for correlation calculation, yielding the timestamp Attention matrix, i.e. the output matrix of the timestamp view predictor.
Feature view predictor (Feature View Predictor):
To attend to latent links between different features, the model computes a correlation weight matrix between the time series of all features, where a larger weight represents a higher correlation. The multi-head Attention mechanism computes correlations in different subspaces and concatenates the multiple Attention results as the output vector. The multi-head attention is immediately followed by a ReLU nonlinear activation function and Dropout (random deactivation): ReLU adds nonlinearity to the neural network, which speeds up learning and strengthens the fitting capability of the model, while random deactivation reduces overfitting.
As shown in FIG. 2, the data processing flow of the feature view predictor is as follows: first, the normalized multivariate air quality data is transposed; next, a linear layer maps the transposed data into dense vectors; then, a position code is added to the multivariate time series (dense vectors); finally, the feature Q, K, V matrices are generated for correlation calculation, yielding the feature Attention matrix, i.e. the output matrix of the feature view predictor.
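The difference between the two attention-based views comes down to a transpose, which the following sketch makes concrete. It is deliberately simplified and rests on assumptions: the linear layer, the position code, the learned Q, K, V projections, the multiple heads, ReLU and Dropout are all omitted, so the correlation weights are computed directly from the raw rows; the function names and the data window are hypothetical.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def correlation_weights(rows):
    """Row-wise softmax of scaled dot products between every pair of rows."""
    d = len(rows[0])
    return [
        softmax([sum(a * b for a, b in zip(r1, r2)) / math.sqrt(d) for r2 in rows])
        for r1 in rows
    ]

def transpose(m):
    return [list(col) for col in zip(*m)]

# Hypothetical normalized window: 4 timestamps (rows) x 3 features (columns)
window = [
    [0.1, -0.3, 0.5],
    [0.2, -0.1, 0.4],
    [0.0, 0.2, -0.6],
    [-0.3, 0.4, -0.2],
]
timestamp_weights = correlation_weights(window)           # 4 x 4: timestamp vs timestamp
feature_weights = correlation_weights(transpose(window))  # 3 x 3: feature vs feature
print(len(timestamp_weights), len(feature_weights))
```

Feeding the window as-is relates timestamps to each other, while feeding its transpose relates the per-feature time series to each other, which is the distinction between the timestamp view and the feature view.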
Short-term historical data view predictor (Previous View Predictor):
Multivariate air quality data is time series data, and the data at adjacent timestamps tends to be similar in the time dimension. To capture this characteristic, the short-term historical data view predictor takes, for the attribute (feature) in which the missing value (to be predicted) is located, the data at the preceding timestamp as the predicted value.
As shown in FIG. 2, the short-term historical data view predictor obtains, from the normalized multivariate air quality data, the data at the timestamp preceding the missing value in the same attribute as the predicted value.
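The short-term historical data view needs no learned parameters, as the following sketch shows. The function name, the data layout (one row per timestamp, one column per attribute) and the sample values are hypothetical, and returning `None` for the first timestamp is an assumption, since the text does not say how a missing value with no earlier timestamp is handled.

```python
def previous_view_predict(data, t, feature):
    """Return the reading of `feature` at timestamp t-1 as the predicted value."""
    if t == 0:
        # Assumption: no earlier timestamp exists, so no prediction is made
        return None
    return data[t - 1][feature]

# Hypothetical normalized readings: rows are timestamps, columns are attributes
readings = [
    [0.12, 0.80],
    [0.15, 0.78],
    [None, 0.75],  # the value of attribute 0 at timestamp 2 is missing
]
pre_p = previous_view_predict(readings, t=2, feature=0)
print(pre_p)
```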
And step 3: and constructing an Output Layer, wherein the Output Layer is divided into a Continuous Output Layer and a Discrete Output Layer according to the type of the missing data. Aiming at Continuous missing data, the Continuous Output Layer firstly distributes and initializes weight and bias for a timestamp visual angle predictor, a characteristic visual angle predictor and a short-term historical data visual angle predictor, secondly maps Output matrixes of the timestamp visual angle predictor and the characteristic visual angle predictor into a predicted value, and finally carries out weighting summation on the predicted value according to the weight to obtain a final prediction result; aiming at Discrete missing data, the Discrete Output Layer firstly processes Output matrixes of a timestamp visual angle predictor and a characteristic visual angle predictor into Output vectors, secondly maps the Output vectors into a [0,1] interval, enables the accumulation sum of all Output vector values to be 1, and finally selects a node with the maximum probability as an Output node:
Air quality data contains both continuous and discrete missing data. Filling continuous missing data is a prediction task, whereas filling discrete missing data is a classification task, so the two types of missing data should be handled differently. Most big-data-based prediction models are suited to continuous data but struggle to achieve good results on discrete missing data. Models based on the Attention mechanism can address this problem: Attention extracts data features from a multi-dimensional time series by attending to the relationships among different dimensions, and is suitable for both classification and prediction tasks.
When the missing value is continuous data, two linear layers in the Continuous Output Layer map the output matrices of the timestamp view predictor and the characteristic view predictor into the predicted values $pre_T$ and $pre_F$, respectively. The predicted value $pre_P$ of the short-term historical data view predictor is then obtained, and the final predicted value is computed by the weighted summation formula:
$$Output = w_1 \cdot pre_F + w_2 \cdot pre_T + w_3 \cdot pre_P + b$$
where $w_1$, $w_2$, $w_3$ and $b$ are initialized when the model is built and are updated by gradient descent during training, and $Output$ is the final predicted value.
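The weighted summation, and one gradient descent update of its parameters, can be sketched as follows. The initial values 0.33, 0.33, 0.33 and 0 are those stated in claim 5; the squared-error loss, the learning rate and the function names `continuous_output` and `sgd_step` are assumptions, since the text only says the parameters are learned by gradient descent.

```python
def continuous_output(pre_f, pre_t, pre_p, w=(0.33, 0.33, 0.33), b=0.0):
    """Output = w1 * pre_F + w2 * pre_T + w3 * pre_P + b."""
    return w[0] * pre_f + w[1] * pre_t + w[2] * pre_p + b


def sgd_step(preds, target, w, b, lr=0.1):
    """One gradient descent step on the loss 0.5 * (Output - target)^2.

    `preds` is the tuple (pre_F, pre_T, pre_P). The loss choice is an
    assumption made for this sketch.
    """
    err = continuous_output(*preds, w=w, b=b) - target
    # dLoss/dw_i = err * pre_i and dLoss/db = err
    new_w = tuple(wi - lr * err * p for wi, p in zip(w, preds))
    new_b = b - lr * err
    return new_w, new_b


# Example: combine three view predictions, then nudge the weights
# towards an observed (normalized) value of 0.7.
preds = (0.6, 0.3, 0.9)
before = continuous_output(*preds)
w, b = sgd_step(preds, target=0.7, w=(0.33, 0.33, 0.33), b=0.0)
after = continuous_output(*preds, w=w, b=b)
```

With these inputs the updated output moves closer to the target than the initial output, which is the behaviour expected from a single descent step with a small learning rate.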
When the missing value is discrete data, let $Matrix_T$ and $Matrix_F$ be the output matrices of the timestamp view predictor and the characteristic view predictor, respectively. The Discrete Output Layer first concatenates the output matrices to obtain the spliced vector $Concatenate(Matrix_T, Matrix_F)$, then maps the spliced vector through a fully connected layer to an output vector whose dimension equals the number of classification categories (discrete data categories), and finally maps the output vector into the $[0,1]$ interval with the Softmax function so that all output values sum to 1. The node with the maximum probability is selected as the output node. The Softmax function is:
$$Softmax(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}}$$
where $z_i$ is the $i$-th element of the output vector and $C$ is the length of the output vector, i.e. the number of classes of the discrete data.
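The discrete branch can be illustrated with the short sketch below: concatenate the two output matrices, apply one fully connected layer, apply Softmax and pick the most probable class. Flattening each matrix row by row before concatenation, the hand-picked weights and the function names are assumptions made for the example; the text does not specify how the matrices are flattened.

```python
import math


def flatten(matrix):
    """Flatten a matrix (list of rows) into one vector, row by row."""
    return [v for row in matrix for v in row]


def softmax(z):
    """Numerically stable Softmax: exp(z_i) / sum_c exp(z_c)."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def discrete_output(matrix_t, matrix_f, weights, bias):
    """Return (predicted class index, class probabilities).

    `weights` has one row per class and one column per element of the
    concatenated vector; `bias` has one entry per class.
    """
    concat = flatten(matrix_t) + flatten(matrix_f)   # Concatenate(Matrix_T, Matrix_F)
    logits = [sum(w * x for w, x in zip(row, concat)) + b
              for row, b in zip(weights, bias)]      # fully connected layer
    probs = softmax(logits)                          # values in [0, 1], summing to 1
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


# Example with three classes and illustrative (untrained) weights.
matrix_t = [[1.0, 0.0], [0.0, 1.0]]
matrix_f = [[0.5, 0.5]]
weights = [[0.0] * 6, [1.0] * 6, [-1.0] * 6]
bias = [0.0, 0.0, 0.0]
label, probs = discrete_output(matrix_t, matrix_f, weights, bias)
```

Here the logits are 0, 3 and -3, so the second class (index 1) has the largest probability and is selected as the output node.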
The model provided by the invention uses the Attention mechanism to discover latent relationships in the data from different angles, namely correlation among timestamps and correlation among features, while also taking the short-term historical data of the time series into account, so that a better prediction effect is obtained; the Attention mechanism also addresses the long-range dependence problem. Compared with other models based on big data technology, the model can achieve more accurate prediction on the multivariate air quality missing value filling task, and offers a new approach for subsequent multivariate air quality missing data filling tasks.
Embodiment of the TAM-based multivariate air quality data missing value filling model:
This embodiment provides a TAM-based multivariate air quality data missing value filling model, which corresponds to the construction method described above. As shown in fig. 3, the model includes:
a BatchNorm layer to normalize the multivariate air quality data;
an Input Layer, which comprises a multivariate time series position code and a fully connected linear layer, wherein the position code adds position information to the time series and the fully connected linear layer maps the input data to dense vectors;
a Triple-Views Layer, which comprises a timestamp view predictor, a characteristic view predictor and a short-term historical data view predictor, wherein the timestamp view predictor predicts from the time dimension, the characteristic view predictor predicts from the multivariate feature dimension, and the short-term historical data view predictor predicts from historical readings;
and an Output Layer, which is divided into a Continuous Output Layer and a Discrete Output Layer according to the type of the missing data. For continuous missing data, the Continuous Output Layer first assigns and initializes weights and a bias for the timestamp view predictor, the characteristic view predictor and the short-term historical data view predictor, then maps the output matrices of the timestamp view predictor and the characteristic view predictor into predicted values, and finally computes the weighted sum of the predicted values to obtain the final prediction result. For discrete missing data, the Discrete Output Layer first processes the output matrices of the timestamp view predictor and the characteristic view predictor into an output vector, then maps the output vector into the [0,1] interval so that all of its values sum to 1, and finally selects the node with the maximum probability as the output node.
For the specific implementation of each processing layer, refer to the above embodiment of the construction method of the TAM-based multivariate air quality data missing value filling model; details are not repeated here.
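As a rough illustration of the first component listed above, the normalization performed by a BatchNorm layer can be sketched as a per-feature standardization. The sketch omits the learnable scale and shift parameters and the running statistics that a trained BatchNorm layer normally carries, and it assumes the input contains no missing entries; the function name `batch_norm` and the value of `eps` are illustrative choices.

```python
import math


def batch_norm(x, eps=1e-5):
    """Standardize each feature (column) of x to zero mean and unit variance.

    x has shape (timestamps, features). `eps` avoids division by zero for
    constant columns. Learnable scale and shift are omitted in this sketch.
    """
    cols = list(zip(*x))
    means = [sum(c) / len(c) for c in cols]
    variances = [sum((v - m) ** 2 for v in c) / len(c) for c, m in zip(cols, means)]
    return [
        [(v - m) / math.sqrt(var + eps) for v, m, var in zip(row, means, variances)]
        for row in x
    ]


# Example: two pollutant readings on very different scales.
raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(raw)
```

After normalization both columns have mean 0 and variance close to 1, so features measured on different scales contribute comparably to the attention computations that follow.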
Fig. 4 is a detailed network structure diagram of the TAM-based multivariate air quality data missing value filling model.
The above embodiments merely illustrate the technical solutions of the present invention in specific embodiments. Any equivalent substitution, modification or partial substitution made without departing from the spirit and scope of the present invention shall be covered by the claims of the present invention.

Claims (10)

1. A method for constructing a TAM-based multivariate air quality data missing value filling model, characterized by comprising the following steps:
constructing a BatchNorm layer for normalizing multivariate air quality data;
constructing an Input Layer, wherein the Input Layer comprises a multivariate time series position code and a fully connected linear layer, the position code is used for adding position information to the time series, and the fully connected linear layer is used for mapping input data to dense vectors;
constructing a Triple-Views Layer, wherein the Triple-Views Layer comprises a timestamp view predictor, a characteristic view predictor and a short-term historical data view predictor, the timestamp view predictor is used for predicting from the time dimension, the characteristic view predictor is used for predicting from the multivariate feature dimension, and the short-term historical data view predictor is used for predicting from historical readings;
and constructing an Output Layer, wherein the Output Layer is divided into a Continuous Output Layer and a Discrete Output Layer according to the type of the missing data; for continuous missing data, the Continuous Output Layer first assigns and initializes weights and a bias for the timestamp view predictor, the characteristic view predictor and the short-term historical data view predictor, then maps the output matrices of the timestamp view predictor and the characteristic view predictor into predicted values, and finally computes the weighted sum of the predicted values according to the weights to obtain the final prediction result; for discrete missing data, the Discrete Output Layer first processes the output matrices of the timestamp view predictor and the characteristic view predictor into an output vector, then maps the output vector into the [0,1] interval so that all of its values sum to 1, and finally selects the node with the maximum probability as the output node.
2. The method for constructing the TAM-based multivariate air quality data missing value filling model according to claim 1, wherein the constructing the embedding layer comprises the following steps:
constructing a BatchNorm layer and normalizing the multivariate air quality data with a normalization function.
3. The method for constructing the TAM-based multivariate air quality data missing value filling model according to claim 1, wherein a fully connected layer and a position code are arranged in the Input Layer.
4. The method for constructing the TAM-based multivariate air quality data missing value filling model according to claim 1, wherein a multi-head attention mechanism, a ReLU activation function and Dropout (random deactivation) are arranged in the timestamp view predictor and the characteristic view predictor.
5. The method for constructing the TAM-based multivariate air quality data missing value filling model according to claim 1, wherein assigning and initializing the weights and bias comprises:
assigning and initializing the weights and bias $\{w_1, w_2, w_3, b\}$ for the views, with initial values $\{0.33, 0.33, 0.33, 0\}$;
mapping the output matrices of the timestamp view predictor and the characteristic view predictor into predicted values comprises:
denoting the output matrices of the timestamp view predictor and the characteristic view predictor as $Matrix_T$ and $Matrix_F$, respectively, and mapping each output matrix through one fully connected neural network layer to obtain the predicted values;
computing the weighted sum of the predicted values according to the weights comprises:
denoting the predicted values of the characteristic view predictor, the timestamp view predictor and the short-term historical data view predictor as $pre_F$, $pre_T$ and $pre_P$, respectively, and computing the weighted sum of the predicted values of the three views by the formula:
$$Output = w_1 \cdot pre_F + w_2 \cdot pre_T + w_3 \cdot pre_P + b$$
wherein $Output$ is the final prediction result;
processing the output matrices of the timestamp view predictor and the characteristic view predictor into an output vector comprises:
denoting the output matrices of the timestamp view predictor and the characteristic view predictor as $Matrix_T$ and $Matrix_F$, respectively, concatenating the output matrices to obtain the vector $Concatenate(Matrix_T, Matrix_F)$, and then mapping it through a fully connected layer to obtain the output vector;
mapping the output vector into the $[0,1]$ interval comprises:
$$Softmax(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}}$$
wherein $z_i$ is the $i$-th element of the output vector and $C$ is the length of the output vector, i.e. the number of classes of the discrete data.
6. A TAM-based multivariate air quality data missing value filling model, characterized by comprising:
a BatchNorm layer for normalizing the multivariate air quality data;
an Input Layer, which comprises a multivariate time series position code and a fully connected linear layer, wherein the position code is used for adding position information to the time series and the fully connected linear layer is used for mapping input data to dense vectors;
a Triple-Views Layer, which comprises a timestamp view predictor, a characteristic view predictor and a short-term historical data view predictor, wherein the timestamp view predictor is used for predicting from the time dimension, the characteristic view predictor is used for predicting from the multivariate feature dimension, and the short-term historical data view predictor is used for predicting from historical readings;
and an Output Layer, which is divided into a Continuous Output Layer and a Discrete Output Layer according to the type of the missing data; for continuous missing data, the Continuous Output Layer first assigns and initializes weights and a bias for the timestamp view predictor, the characteristic view predictor and the short-term historical data view predictor, then maps the output matrices of the timestamp view predictor and the characteristic view predictor into predicted values, and finally computes the weighted sum of the predicted values according to the weights to obtain the final prediction result; for discrete missing data, the Discrete Output Layer first processes the output matrices of the timestamp view predictor and the characteristic view predictor into an output vector, then maps the output vector into the [0,1] interval so that all of its values sum to 1, and finally selects the node with the maximum probability as the output node.
7. The TAM-based multivariate air quality data missing value filling model according to claim 6, wherein the BatchNorm layer is used to normalize the multivariate air quality data.
8. The TAM-based multivariate air quality data missing value filling model according to claim 6, wherein a fully connected layer and a position code are arranged in the Input Layer.
9. The TAM-based multivariate air quality data missing value filling model according to claim 6, wherein a multi-head attention mechanism, a ReLU activation function and Dropout (random deactivation) are arranged in the timestamp view predictor and the characteristic view predictor.
10. The TAM-based multivariate air quality data missing value filling model according to claim 6, wherein assigning and initializing the weights and bias comprises:
assigning and initializing the weights and bias $\{w_1, w_2, w_3, b\}$ for the views, with initial values $\{0.33, 0.33, 0.33, 0\}$;
mapping the output matrices of the timestamp view predictor and the characteristic view predictor into predicted values comprises:
denoting the output matrices of the timestamp view predictor and the characteristic view predictor as $Matrix_T$ and $Matrix_F$, respectively, and mapping each output matrix through one fully connected neural network layer to obtain the predicted values;
computing the weighted sum of the predicted values according to the weights comprises:
denoting the predicted values of the characteristic view predictor, the timestamp view predictor and the short-term historical data view predictor as $pre_F$, $pre_T$ and $pre_P$, respectively, and computing the weighted sum of the predicted values of the three views by the formula:
$$Output = w_1 \cdot pre_F + w_2 \cdot pre_T + w_3 \cdot pre_P + b$$
wherein $Output$ is the final prediction result;
processing the output matrices of the timestamp view predictor and the characteristic view predictor into an output vector comprises:
denoting the output matrices of the timestamp view predictor and the characteristic view predictor as $Matrix_T$ and $Matrix_F$, respectively, concatenating the output matrices to obtain the vector $Concatenate(Matrix_T, Matrix_F)$, and then mapping it through a fully connected layer to obtain the output vector;
mapping the output vector into the $[0,1]$ interval comprises:
$$Softmax(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}}$$
wherein $z_i$ is the $i$-th element of the output vector and $C$ is the length of the output vector, i.e. the number of classes of the discrete data.
CN202210714518.6A 2022-06-22 2022-06-22 Multi-element air quality data missing value filling model based on TAM and construction method thereof Pending CN114936530A (en)


Publications (1)

Publication Number Publication Date
CN114936530A true CN114936530A (en) 2022-08-23

Family

ID=82868409




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination