CN112580859A

CN112580859A - Haze prediction method based on global attention mechanism

Info

Publication number: CN112580859A
Application number: CN202011415067.3A
Authority: CN
Inventors: 薛晓军; 张春霞; 彭成; 牛振东; 薛涛; 鹿旸
Original assignee: Beijing Institute of Technology BIT; China University of Geosciences Beijing
Current assignee: Beijing Institute of Technology BIT; China University of Geosciences Beijing
Priority date: 2020-06-01
Filing date: 2020-12-03
Publication date: 2021-03-30
Anticipated expiration: 2040-12-03
Also published as: CN112580859B

Abstract

The invention relates to a haze prediction method based on a global attention mechanism and belongs to the technical field of artificial-intelligence information prediction. The method first obtains haze data from environmental monitoring points, processes the obtained data, trains a haze prediction model based on a global attention mechanism, and then uses the model to output the final prediction result. Introducing a global attention mechanism into the haze prediction task assigns different weights to different influencing factors and effectively addresses the problem of overly long information transmission distances. Introducing a bidirectional gated recurrent neural network brings the influence of earlier data on later data in the training data into the model and analyzes the correlation between them, which addresses the long-term dependence problem in haze data so that haze data at a future moment can be predicted accurately. The method is also extensible: the network structure can be changed dynamically according to the data characteristics of different regions to obtain a haze prediction method suited to the local region.

Description

Haze prediction method based on global attention mechanism

Technical Field

The invention relates to a haze prediction method based on a global attention mechanism, and belongs to the technical field of artificial intelligence information prediction.

Background

Haze is one of the important factors affecting air pollution. Haze is transmitted regionally: haze produced in one region can spread to other regions, and this regional transmission is correlated with time, so haze data at several past moments are correlated with haze data at a future moment. Therefore, the environmental quality of each region at a future moment can be predicted from the environmental monitoring point data of each region over several time periods.

In 2005, in the paper "A neural network forecast for daily average PM10 concentrations in Belgium" (Atmospheric Environment, 2005), Hooyberghs, Mensink et al. used a neural network to predict the next day's daily average PM10 concentration in Belgium, based on five years of measurements from ten Belgian monitoring points. In "Urban PM2.5 concentration spatial prediction based on BP artificial neural network" (2013), Wanmin et al. used a BP (Back Propagation) neural network to predict the spatial variation of PM2.5 concentration; their results show that, when predicting the PM2.5 concentration at a fixed location, the BP neural network is more accurate than the ordinary kriging interpolation method. In "Research on air pollution space-time forecasting model based on RNN" (Mapping Science, 2017), Fangxiang et al. used an RNN (Recurrent Neural Network) to predict haze time-series data containing missing values, building a deep learning model from a Long Short-Term Memory (LSTM) network and a fully connected layer and using it for spatio-temporal forecasting.

However, conventional haze prediction methods have difficulty expressing the association between data at different moments. When the feature dimension of the input data is large or the neural network is deep, information has to travel too far through the model and part of the useful information is often lost. The representation of the association between data at different moments therefore needs to be improved, and the problem of overly long information transmission distances needs to be solved.

Disclosure of Invention

The invention aims to solve the technical problem that, in the haze prediction task, useful information is difficult to obtain because the information transmission distance in the network is too long, and provides a haze prediction method based on a global attention mechanism.

The method combines a global attention mechanism with a bidirectional gated recurrent neural network and applies them to haze prediction, so that the influence of data at different moments on data at a future moment is taken into account effectively.

Recurrent neural networks perform well on sequence data and can extract the temporal information it contains. The bidirectional gated recurrent neural network is a type of recurrent neural network that not only performs well but also has a simpler structure than a long short-term memory network. A gated recurrent neural network is composed of gated recurrent units (GRU, Gated Recurrent Unit), each of which uses an update gate and a reset gate. The update gate indicates the degree of influence of the information at the previous moment on the state at the current moment, and the reset gate indicates the manner in which new input information is combined with previously memorized information. The bidirectional gated recurrent neural network is formed by stacking two layers of gated recurrent units, one representing the forward propagation state and the other the backward propagation state, and it represents the correlation among states at different moments.

The attention mechanism improves the performance of sequence learning tasks. It overcomes the problem that, in the traditional encoder-decoder structure, information must travel too far during encoding and decoding; with attention, the current state of the network can be influenced by a state from long before. By giving different learned weights to the states at different moments and taking a weighted sum of the hidden states at all moments, the network can directly obtain, at each moment, the information most relevant to the current moment (which may lie far from the current moment). A traditional method can only obtain the information of the previous moment at each step, and has difficulty obtaining useful information when the transmission distance is too long. The global attention mechanism computes a context vector as a weighted sum of the hidden vectors of the whole sequence. It uses a scoring function to assign different weights to the states at different time steps, so that the input sequence is learned selectively and different output sequences are generated according to the assigned weights, which improves the performance of the sequence learning task.

The technical scheme adopted by the invention is as follows:

A haze prediction method based on a global attention mechanism comprises the following steps:

Step 1: Acquire haze data from the environmental monitoring points.

The specific steps are as follows:

First, haze data of the environmental monitoring points are obtained. The haze data of each monitoring point comprise 12 monitoring items: monitoring point name, time, air quality index (AQI), air quality index category, primary pollutant, PM2.5 fine particulate matter, PM10 inhalable particulate matter, carbon monoxide, nitrogen dioxide, ozone 1-hour average, ozone 8-hour average, and sulfur dioxide. The acquired haze data are stored in a text file.

The haze data of each environmental monitoring point are recorded in units of hours. For an environmental monitoring point at time $t$, the haze data of the previous $m$ moments $\{X_{t-m+1}, \ldots, X_{t-1}, X_t\}$ are used to predict the haze data $Y_{t+n}$ of that monitoring point at time $t+n$, where $X_t, X_{t-1}, \ldots, X_{t-m+1}$ denote the haze data at times $t$, $t-1$, ..., $t-m+1$, respectively.

Here $X_t$ is a vector made up of the 12 monitoring items of the environmental monitoring point. For example, $X_t$ = ⟨New airport area, 2020/3/15/0:00, 70, good, particulate matter (PM10), 48, 90, 0.9, 64, 22, 76, 23⟩ represents: monitoring point name: New airport area; time: 2020/3/15/0:00; air quality index (AQI): 70; air quality index category: good; primary pollutant: particulate matter (PM10); PM2.5 fine particulate matter: 48; PM10 inhalable particulate matter: 90; carbon monoxide (CO): 0.9; nitrogen dioxide (NO₂): 64; ozone 1-hour average (O₃_1): 22; ozone 8-hour average (O₃_8): 76; sulfur dioxide (SO₂): 23.

$Y_{t+n}$ denotes the predicted data at the future time $t+n$, with $Y_{t+n} \in P$, where P = {air quality index, PM2.5 fine particulate matter, PM10 inhalable particulate matter, carbon monoxide (CO), nitrogen dioxide (NO₂), ozone 1-hour average (O₃_1), ozone 8-hour average (O₃_8), sulfur dioxide (SO₂)}; that is, $Y_{t+n}$ may be any index in the set P.

Step 2: Process the acquired haze data.

Step 2.1: De-duplicate the haze data.

To ensure continuity, haze data of the environmental monitoring points are collected at intervals (for example, every hour), so the haze data obtained in step 1 may contain duplicates. A temporary set is created to remove the duplicated haze data, and the de-duplicated data of the monitoring points are stored in a new file.

Step 2.2: Filter the de-duplicated haze data and handle missing values.

The user supplies a list of the environmental monitoring points to be analyzed, and the required haze data are selected according to that list. The acquired haze data may contain invalid records, such as incomplete or garbled data, so invalid haze data are removed during filtering. For haze data with missing values, the missing entries are completed with mean values.

Step 2.3: Unify the numerical range.

Because different environmental indexes have different numerical ranges, the numerical ranges of the acquired haze data are unified; this speeds up gradient descent toward the optimal solution and makes convergence easier.

The environmental monitoring point names are label-encoded, which converts them into numerical data. The ranges of all numerical data are then unified, i.e., all data are converted to the range [0, 1].
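
The two operations of step 2.3 can be sketched in Python as follows. This is only an illustration under stated assumptions: names receive integer codes in order of first appearance, each numeric column is scaled separately, and the function names `label_encode` and `min_max_scale` as well as the station names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def label_encode(names: Sequence[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct name to an integer code in order of first appearance."""
    codes: Dict[str, int] = {}
    encoded = []
    for name in names:
        if name not in codes:
            codes[name] = len(codes)
        encoded.append(codes[name])
    return encoded, codes


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Linearly rescale one column of values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical station names and AQI values.
station_codes, mapping = label_encode(["StationA", "StationB", "StationA"])
scaled_aqi = min_max_scale([70, 90, 110])
print(station_codes, scaled_aqi)  # [0, 1, 0] [0.0, 0.5, 1.0]
```

The minimum and maximum of each column need to be kept if the predictions are later converted back to their original units.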

Step 3: Train a haze prediction model based on the global attention mechanism.

Step 3.1: Construct data features and labels from the data processed in step 2.

The data features are the haze data of the previous m moments, and the label is the haze data to be predicted at the future time t+n; together they serve as the input to the haze training model.
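
The pairing of features and labels in step 3.1 can be illustrated with a sliding window. The sketch below assumes the records are already sorted by time and evenly spaced; the function name `make_windows` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[Sequence[float]],
    target: Sequence[float],
    m: int,
    n: int,
) -> Tuple[List[List[Sequence[float]]], List[float]]:
    """Pair each window of m consecutive records with the target n steps later."""
    features, labels = [], []
    # t is the index of the last record in the window (time t in the text).
    for t in range(m - 1, len(series) - n):
        features.append(list(series[t - m + 1 : t + 1]))  # X_{t-m+1}, ..., X_t
        labels.append(target[t + n])                      # Y_{t+n}
    return features, labels


# Toy hourly series: one feature per record; the target is that same feature.
records = [[float(h)] for h in range(6)]  # hours 0..5
aqi = [r[0] for r in records]
X, y = make_windows(records, aqi, m=3, n=1)
print(len(X), X[0], y)  # 3 [[0.0], [1.0], [2.0]] [3.0, 4.0, 5.0]
```

With m = 3 and n = 1, six records give three samples, because the first window needs three records and the last label needs one record after its window.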

Step 3.2: Generate hidden states at different time steps with a bidirectional gated recurrent neural network. The gated recurrent unit selectively passes information through the network by means of a gating mechanism that uses an update gate and a reset gate. The update gate indicates the degree of influence of the information at the previous moment on the state at the current moment, and the reset gate indicates the manner in which new input information is combined with previously memorized information. Each gate is defined as follows:

$p_t = \mathrm{sigmoid}(v_r \times [S_{t-1}, x_t])$ (1)

$q_t = \mathrm{sigmoid}(v_z \times [S_{t-1}, x_t])$ (2)

where $\tilde{S}_t$ denotes the candidate value, $S_t$ the activation state, $S_{t-1}$ the activation state at the previous moment, $p_t$ the reset gate, $q_t$ the update gate, and $x_t$ the input at time t; $v_r$, $v_z$ and $v$ are the parameter matrices to be trained. tanh() is the hyperbolic tangent function and sigmoid() is an activation function of the neural network.
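
A minimal Python sketch of one gated recurrent unit step and of the bidirectional pass is given below. The two gates follow equations (1) and (2). The candidate value and the state update are written in the standard GRU form, which is an assumption here, as are the omission of bias terms and the hypothetical names `gru_step` and `run_bidirectional`.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]
Params = Tuple[Matrix, Matrix, Matrix]  # (v_r, v_z, v)


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def matvec(mat: Matrix, vec: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, vec)) for row in mat]


def gru_step(s_prev: Vector, x_t: Vector, params: Params) -> Vector:
    """Compute S_t from S_{t-1} and x_t."""
    v_r, v_z, v = params
    concat = s_prev + x_t                            # [S_{t-1}, x_t]
    p_t = [sigmoid(a) for a in matvec(v_r, concat)]  # reset gate, eq. (1)
    q_t = [sigmoid(a) for a in matvec(v_z, concat)]  # update gate, eq. (2)
    # Assumed standard form: candidate = tanh(v x [p_t * S_{t-1}, x_t]).
    gated = [p * s for p, s in zip(p_t, s_prev)] + x_t
    cand = [math.tanh(a) for a in matvec(v, gated)]
    # Assumed standard form: S_t = (1 - q_t) * S_{t-1} + q_t * candidate.
    return [(1.0 - q) * s + q * c for q, s, c in zip(q_t, s_prev, cand)]


def run_bidirectional(xs: List[Vector], fwd: Params, bwd: Params, hidden: int) -> List[Vector]:
    """Run one GRU forward and one backward, then concatenate per time step."""
    def run(seq: List[Vector], params: Params) -> List[Vector]:
        s, states = [0.0] * hidden, []
        for x in seq:
            s = gru_step(s, x, params)
            states.append(s)
        return states

    forward = run(xs, fwd)
    backward = run(xs[::-1], bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]


# With all-zero weights both gates equal 0.5 and the candidate is 0,
# so the state is simply halved.
zeros = [[0.0] * 3 for _ in range(2)]  # hidden size 2, input size 1
print(gru_step([1.0, -2.0], [0.5], (zeros, zeros, zeros)))  # [0.5, -1.0]
```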

Step 3.3: Define the scoring function of the global attention mechanism, combine the hidden states with the global attention mechanism, and assign different weights to the hidden states.

The global attention mechanism assigns different weights to the data at different moments, giving higher weight to data with greater influence and lower weight to data with smaller influence, which improves the prediction accuracy of the method. Because the data of different environmental monitoring points have different characteristics, haze prediction for different monitoring points requires different scoring functions. In the present invention, the scoring function is a fully connected neural network.

Let $o_i$ denote the output of the bidirectional gated recurrent neural network at step i; $o_i$ is formed by concatenating the outputs of the forward and backward gated recurrent neural networks at step i. Let $t_j$ denote the state of the bidirectional gated recurrent neural network at step j. The weighted hidden state $context_j$ at step j is then calculated as:

$context_j = \sum_i w_{i,j} \, o_i$ (6)

where $w_{i,j}$ is the global attention weight; $e(o_i, t_j)$ is the scoring function, which uses a fully connected neural network; and exp() denotes the exponential function with the natural constant e as its base.
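
The weighted sum in equation (6) can be sketched in Python as follows. The sketch assumes that the weights $w_{i,j}$ are obtained by a softmax over the scores $e(o_i, t_j)$, and that the fully connected scoring network has one tanh hidden layer without bias; both are assumptions, and the names `score` and `attention_context` are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def score(o_i: Vector, t_j: Vector, w_hidden: Matrix, w_out: Vector) -> float:
    """Scoring function e(o_i, t_j): a small fully connected network."""
    joined = o_i + t_j
    hidden = [math.tanh(sum(w * x for w, x in zip(row, joined))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def attention_context(
    outputs: List[Vector], t_j: Vector, w_hidden: Matrix, w_out: Vector
) -> Tuple[Vector, Vector]:
    """Return (context_j, weights) for one state t_j over all outputs o_i."""
    scores = [score(o, t_j, w_hidden, w_out) for o in outputs]
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # assumed softmax normalisation
    dim = len(outputs[0])
    context = [sum(w * o[d] for w, o in zip(weights, outputs)) for d in range(dim)]
    return context, weights


# With all-zero scoring weights every score is 0, so the attention is uniform
# and the context is the plain average of the outputs.
outputs = [[1.0, 0.0], [3.0, 2.0]]
w_hidden = [[0.0] * 4 for _ in range(3)]  # 3 hidden nodes over [o_i, t_j]
w_out = [0.0] * 3
context, weights = attention_context(outputs, [0.5, 0.5], w_hidden, w_out)
print(context, weights)  # [2.0, 1.0] [0.5, 0.5]
```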

After processing by the global attention mechanism, a hidden state with weight information is obtained at each moment. The hidden states with weight information represent the association between information at different moments well. A random discard (dropout) operation is performed on the hidden states with weight information to prevent overfitting.
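
The random discard operation can be sketched as below. The text does not say how the surviving units are scaled; dividing by the retention probability ("inverted dropout") is an assumption, and the name `dropout` and the example values are hypothetical.

```python
import random
from typing import List, Optional


def dropout(hidden: List[float], keep_prob: float,
            rng: Optional[random.Random] = None) -> List[float]:
    """Zero each unit with probability 1 - keep_prob and rescale the survivors."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    rng = rng or random.Random()
    # Assumption: survivors are divided by keep_prob so the expected value is unchanged.
    return [h / keep_prob if rng.random() < keep_prob else 0.0 for h in hidden]


state = [0.4, -1.2, 0.8, 0.1]
dropped = dropout(state, keep_prob=0.5, rng=random.Random(0))
```

Dropout of this kind is applied only during training; at prediction time the retention probability is set to 1 so that the hidden state passes through unchanged.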

Step 3.4: Process the weighted hidden states through a fully connected layer to obtain a preliminary haze prediction result.

Specifically, the fully connected layer further extracts features from the weighted hidden states and converts them into a preliminary haze prediction result.

Step 3.5: Train the haze prediction model.

The specific steps are as follows:

Haze data of different environmental monitoring points are selected, and the selected data are divided into a training set and a test set according to a set ratio (for example, 8:2).
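
A division by a set ratio can be sketched as follows. The text does not state whether the samples are shuffled first; this sketch keeps their original order, which is an assumption, and the name `split_by_ratio` is hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_by_ratio(samples: Sequence[T], train_parts: int = 8,
                   test_parts: int = 2) -> Tuple[List[T], List[T]]:
    """Split samples into training and test sets by an integer ratio such as 8:2."""
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(samples) * train_parts // (train_parts + test_parts)
    return list(samples[:cut]), list(samples[cut:])


train_set, test_set = split_by_ratio(list(range(10)))
print(len(train_set), len(test_set))  # 8 2
```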

Training uses the mini-batch method, in which one small batch of data is trained at a time. In each training iteration the haze data are trained batch by batch. For each batch of training data, the placeholder-type variables are filled and the loss value, accuracy and predicted values are calculated. The model is trained on the training set by computing the loss between the predicted and true values, and the model parameters are updated continuously until training is finished.

On the test set, the trained haze prediction model is tested, and the root mean square error (RMSE) and mean absolute error (MAE) of the haze prediction model on the training set are calculated to examine the haze prediction performance.

Step 4: Output the final haze prediction result using the haze prediction model.

Because the numerical ranges were unified in step 2, the scaling of the preliminary haze prediction result is inverted to generate the final haze prediction result.
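
Inverting the scaling can be illustrated as below, assuming the range unification of step 2 was a per-column min-max scaling whose minimum and maximum were kept. The names `to_unit` and `from_unit` and the numbers are hypothetical.

```python
def to_unit(value: float, lo: float, hi: float) -> float:
    """Scale a raw value to [0, 1] using the column minimum and maximum."""
    return (value - lo) / (hi - lo)


def from_unit(scaled: float, lo: float, hi: float) -> float:
    """Map a value in [0, 1] back to the original units."""
    return scaled * (hi - lo) + lo


# Hypothetical AQI column range observed during preprocessing.
aqi_min, aqi_max = 30.0, 230.0
preliminary = to_unit(80.0, aqi_min, aqi_max)      # value as seen by the model
final = from_unit(preliminary, aqi_min, aqi_max)   # value reported to the user
print(preliminary, final)  # 0.25 80.0
```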

The test set is input into the trained haze prediction model, and the final haze prediction result is output.

Advantageous effects

The haze prediction method based on the global attention mechanism is applied to data from different environmental monitoring points and can predict haze data at a future moment more accurately.

Compared with the prior art, the method has the following characteristics:

(1) Haze data at a future moment are predicted more effectively from several past moments. The global attention mechanism assigns different weights to different influencing factors and effectively addresses the problem of overly long information transmission distances, so that the information most relevant to the current moment can be obtained directly at each moment; this reduces the root mean square error and the mean absolute error of the predicted data and improves haze prediction accuracy.

(2) The method is extensible: the network structure can be changed dynamically according to the data characteristics of different regions, yielding a haze prediction method better suited to the local region.

(3) The bidirectional gated recurrent neural network introduces the influence of earlier data on later data in the training data and analyzes the correlation between them, which better addresses the long-term dependence problem in haze prediction data, further determines the degree of correlation between data at different moments, and makes haze prediction more accurate.

Drawings

FIG. 1 is a schematic flow diagram of the process of the present invention.

Detailed Description

The method of the present invention will be further described and verified in detail with reference to the accompanying drawings and examples.

Examples

A haze prediction method based on a global attention mechanism, as shown in FIG. 1, includes the following steps:

Step 1: Acquire haze data from the environmental monitoring points.

Data from 1600 environmental monitoring points in cities across China, covering February 10, 2019 to April 23, 2019, were acquired using the Beautiful Soup library in Python. Each record contains twelve monitoring items: monitoring point name, time, air quality index (AQI), air quality index category, primary pollutant, PM2.5 fine particulate matter, PM10 inhalable particulate matter, carbon monoxide, nitrogen dioxide, ozone 1-hour average, ozone 8-hour average, and sulfur dioxide. Each city has data for several environmental monitoring points.

This embodiment is described using the environmental monitoring points in Beijing. They are Beijing Wanshou Xigong, Beijing Dongsi, Beijing Tiantan, Beijing Agricultural Exhibition Hall, Beijing Guanyuan, Beijing Wanliu, Beijing Shunyi New City, Beijing Huairou Town, Beijing Changping Town, Beijing Olympic Sports Center, and Beijing Gucheng, and the time range is February 10, 2019 to April 23, 2019.

Step 2: Process the acquired haze data.

Step 2.1: De-duplicate the haze data.

The acquired environmental monitoring point data are stored in a text file, which is read line by line. A new set is created to hold data temporarily, and each line read from the text file is compared with the data in the set. If the line differs from everything in the set, it is not a duplicate and is added to the set. This continues until all the data have been read, and the data in the set are finally saved to a file, giving the de-duplicated haze data.
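
The line-by-line de-duplication can be sketched in Python as follows. A Python set does not preserve insertion order, so this sketch keeps a separate list for the output; that choice, the function name `deduplicate_lines`, and the sample records are assumptions for illustration.

```python
from typing import Iterable, List


def deduplicate_lines(lines: Iterable[str]) -> List[str]:
    """Remove exact duplicate lines, keeping the first occurrence of each."""
    seen = set()   # temporary set used only for membership checks
    unique = []    # preserves the order in which lines were first read
    for line in lines:
        row = line.rstrip("\n")
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Hypothetical tab-separated records; the second line repeats the first.
raw = [
    "StationA\t2019-02-10 00:00:00\t70\n",
    "StationA\t2019-02-10 00:00:00\t70\n",
    "StationA\t2019-02-10 01:00:00\t72\n",
]
print(len(deduplicate_lines(raw)))  # 2
```

For a file on disk, the same function can be applied to the open file object, since iterating over a text file yields its lines.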

Step 2.2: Filter the haze data and handle missing values.

After de-duplication in step 2.1, the de-duplicated monitoring point data of all cities in the country are available. These data are matched against the monitoring point names of Beijing to obtain the data of the Beijing monitoring points. Invalid data are removed during matching. A record is considered valid when: splitting it on the tab character "\t" yields 12 columns; the time field (the first field after splitting) has length 19 and contains "2019"; and none of the fields is empty. After the invalid data are removed, the data are stored in chronological order.
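
The validity rules can be written as a small predicate. The sketch assumes the time field is the first field after splitting, as the description of the rule suggests, and that timestamps look like "2019-03-15 00:00:00" (19 characters); the name `is_valid_row` and the sample record are hypothetical.

```python
def is_valid_row(line: str, time_col: int = 0, year: str = "2019") -> bool:
    """Check one tab-separated record against the validity rules."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:              # must split into exactly 12 columns
        return False
    if any(f == "" for f in fields):   # no field may be empty
        return False
    stamp = fields[time_col]
    return len(stamp) == 19 and year in stamp


# Hypothetical record with the time in the first column.
good = "\t".join(["2019-03-15 00:00:00", "StationA", "70", "good", "PM10",
                  "48", "90", "0.9", "64", "22", "76", "23"])
print(is_valid_row(good))  # True
```

A field holding the missing-value marker "_" is not empty, so such records pass this filter and are completed in the next stage.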

The acquired data contain missing values; the character "_" in the data marks a missing value. Because the data are ordered chronologically for training, missing values must be completed before training. Mean completion is used: the data of each moment are stored in one row, the average of the values in the rows immediately before and after the missing value is calculated, and the missing value is replaced by that average. The completed data are stored in a csv file. In total, 17000 records were obtained from the 11 environmental monitoring points in the Beijing area.
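
Mean completion for one column can be sketched as follows. How the ends of the series and runs of consecutive missing values are handled is not stated; using the single available neighbour, and raising an error when neither neighbour is known, are assumptions, as is the name `fill_missing`.

```python
from typing import List, Optional

MISSING = "_"  # marker for a missing value in the source data


def fill_missing(column: List[str]) -> List[float]:
    """Replace each missing entry by the mean of the previous and next rows."""
    values: List[Optional[float]] = [
        None if v == MISSING else float(v) for v in column
    ]
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(v)
            continue
        neighbours = []
        if i > 0 and values[i - 1] is not None:
            neighbours.append(values[i - 1])
        if i + 1 < len(values) and values[i + 1] is not None:
            neighbours.append(values[i + 1])
        if not neighbours:
            # Assumption: without any known neighbour the gap cannot be filled.
            raise ValueError(f"cannot fill row {i}: no neighbouring value")
        filled.append(sum(neighbours) / len(neighbours))
    return filled


print(fill_missing(["48", "_", "52"]))  # [48.0, 50.0, 52.0]
```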

Step 2.3: Unify the numerical range.

The numerical ranges of the data are unified to the range 0 to 1 using MinMaxScaler from the sklearn module of Python. Different environmental indexes have different numerical ranges, and bringing the data into the same range speeds up gradient descent toward the optimal solution.

Step 3: Construct a haze prediction model based on the global attention mechanism.

Step 3.1: Load the haze data processed in step 2, delete the two columns for primary pollutant and air quality index category, and read the haze data of the environmental monitoring points from the file. The monitoring point names are encoded numerically (Chinese characters are converted into numbers), all data are converted to float type, and the data features and labels are constructed. That is, in each sample the features are the monitoring point data of the preceding hours, and the label is the air quality index or another environmental quality index at the next moment. The training data and labels are defined using placeholders.

Step 3.2: Define the bidirectional gated recurrent neural network using the bidirectional_dynamic_rnn function in the TensorFlow framework, and input the data features and labels into it to generate the hidden states.

Step 3.3: In the implementation of the global attention mechanism, a fully connected layer is used as the scoring function to calculate the weights and outputs. The global attention mechanism is defined as a function that returns the hidden states carrying weight information together with the weights. The output of the bidirectional gated recurrent neural network is passed as a parameter into the global attention mechanism for learning, so that hidden states at different moments are given different weights.

After processing by the global attention mechanism, a hidden state with weight information is obtained at each moment; these hidden states represent the association between information at different moments well. A random discard (dropout) operation is performed on the hidden states with weight information to prevent overfitting.

Step 3.4: Input the hidden states with weight information into the fully connected layer, which further extracts features and outputs the predicted value. The mean absolute error is used as the loss function to measure the difference between the predicted value and the true value.

Step 3.5: Train the haze prediction model.

The 17000 environmental monitoring point records from the Beijing area are divided into a training set and a test set at a ratio of 8:2, i.e., 13600 records in the training set and 3400 records in the test set.

Training uses the mini-batch method, with the batch size set to 256. The number of training iterations starts at 10 and is increased in increments of 5. In each training run, the training data samples, the training data labels, the retention probability of each node and so on are fed in dynamically, the loss value, accuracy and predicted values are calculated, and the label values and predicted values of the training set are recorded for evaluation. The model parameters are updated continuously until training is complete.
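
The batching arithmetic of this embodiment can be checked with a short sketch. Only the slicing into batches is shown, not the model update; the name `iter_batches` is hypothetical, and keeping a smaller final batch rather than dropping it is an assumption.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(samples: Sequence[T], batch_size: int = 256) -> Iterator[List[T]]:
    """Yield consecutive batches; the last batch may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield list(samples[start : start + batch_size])


training_set = list(range(13600))  # stand-in for the 13600 training records
batches = list(iter_batches(training_set))
print(len(batches), len(batches[0]), len(batches[-1]))  # 54 256 32
```

With 13600 training records and a batch size of 256, one pass over the training set consists of 53 full batches and a final batch of 32 records.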

In the experiments, the number of hidden-layer nodes was varied over [8, 256], the number of nodes in the global attention mechanism over [4, 128], the node retention probability over [0, 1], and the number of selected features over [10, 150].

On the test set, the trained haze prediction model is tested, and the root mean square error (RMSE) and mean absolute error (MAE) of the haze prediction model on the training set are calculated to examine the haze prediction performance.

Because the numerical ranges were unified in step 2, the scaling of the prediction result is inverted to obtain the final haze prediction result.

Comparison of experiments

To illustrate the haze prediction effect of the invention, air quality index prediction was carried out with three methods: (1) a deep recurrent neural network prediction method based on a two-layer long short-term memory network (LSTM); (2) a deep recurrent neural network prediction method based on a two-layer gated recurrent neural network (GRU); and (3) the haze prediction method based on the global attention mechanism. Besides the air quality index, any one of the environmental indexes {PM2.5 fine particulate matter, PM10 inhalable particulate matter, carbon monoxide (CO), nitrogen dioxide (NO2), ozone 1-hour average (O3_1), ozone 8-hour average (O3_8), sulfur dioxide (SO2)} can be predicted as needed.

Air quality index prediction is a regression problem, and the root mean square error (RMSE) and the mean absolute error (MAE) are used as evaluation indexes. The RMSE reflects the precision of haze prediction, and the MAE reflects the actual magnitude of the prediction error. The smaller the RMSE and MAE, the more accurate the haze prediction result.
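
The two evaluation indexes follow their usual definitions, RMSE = sqrt(mean((y - ŷ)²)) and MAE = mean(|y - ŷ|). A minimal Python sketch, with hypothetical function names and toy values that are not results from the experiments:

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy AQI values: the errors are 3, -4 and 0.
truth = [70.0, 80.0, 90.0]
predicted = [73.0, 76.0, 90.0]
print(round(rmse(truth, predicted), 3), round(mae(truth, predicted), 3))  # 2.887 2.333
```

Because the errors are squared before averaging, the RMSE penalizes large individual errors more heavily than the MAE and is never smaller than it.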

Experiments were performed using environmental monitoring point data from the Beijing area, and the air quality index for the next hour was predicted using the data from the previous 100 hours.

(1) Haze prediction with the deep recurrent neural network based on a two-layer long short-term memory network (LSTM) has an RMSE of 20.385 and an MAE of 11.825, with 256 hidden-layer nodes.

(2) Haze prediction with the deep recurrent neural network based on a two-layer gated recurrent neural network (GRU) has an RMSE of 20.294 and an MAE of 11.815, with 256 hidden-layer nodes.

(3) The haze prediction method based on the global attention mechanism has an RMSE of 18.832 and an MAE of 10.084, with 256 hidden-layer nodes and 128 nodes in the global attention mechanism.

According to these experimental results, the haze prediction method based on the global attention mechanism has the smallest RMSE and MAE of the three methods, so its haze prediction is the most accurate.

By combining the global attention mechanism with the bidirectional gated recurrent neural network, the method takes good account of the overly long information transmission distance in the training sample data and its effects. The network can directly obtain the information most relevant to the current moment at each moment, and the bidirectional gated recurrent neural network expresses the association between states at different moments. This improves haze prediction accuracy, and the method has broad application prospects in the fields of information recommendation and environmental prediction.

Claims

1. A haze prediction method based on a global attention mechanism is characterized by comprising the following steps:

step 1: acquiring haze data of an environment monitoring point, and storing the acquired haze data;

wherein the acquired haze data of the environmental monitoring points are in units of hours; for an environmental monitoring point at time $t$, the haze data of the previous $m$ moments $\{X_{t-m+1}, \ldots, X_{t-1}, X_t\}$ are used to predict the haze data $Y_{t+n}$ of the monitoring point at time $t+n$; wherein $X_t, X_{t-1}, \ldots, X_{t-m+1}$ denote the haze data at times $t$, $t-1$, ..., $t-m+1$, respectively; $X_t$ denotes a vector consisting of the monitoring data of the environmental monitoring point; $Y_{t+n}$ denotes the predicted data at the future time $t+n$, with $Y_{t+n} \in P$, P = {PM2.5 fine particulate matter, PM10 inhalable particulate matter, carbon monoxide, nitrogen dioxide, ozone 1-hour average, ozone 8-hour average, sulfur dioxide}, and $Y_{t+n}$ is any index in the set P;

step 2: processing the acquired haze data;

step 3: training a haze prediction model based on a global attention mechanism, specifically as follows:

step 3.1: constructing data features and labels from the data processed in step 2;

the data features are the haze data of the previous m moments of the environmental monitoring point, the label is the haze data to be predicted at the future time t+n, and they serve as the input of the haze training model;

step 3.2: generating hidden states at different time steps through a bidirectional gated recurrent neural network; the gated recurrent unit selectively passes information through the network by a gating mechanism using an update gate and a reset gate; the update gate represents the degree of influence of the information at the previous moment on the state at the current moment, and the reset gate represents the manner in which new input information is combined with previously memorized information;

step 3.3: defining a scoring function in the global attention mechanism, combining the hidden states with the global attention mechanism, and assigning different weights to the hidden states;

in the global attention mechanism, the scoring function uses a fully connected neural network; after processing by the global attention mechanism, a hidden state with weight information is obtained at each moment, and a random discard (dropout) operation is performed on the hidden states with weight information;

step 3.4: processing the hidden state with the weight through a full connection layer to obtain a preliminary haze prediction result;

step 3.5: training a haze prediction model;

2. The global attention mechanism-based haze prediction method according to claim 1, wherein the haze data of the environmental monitoring points acquired in step 1 comprise 12 monitoring items: monitoring point name, time, air quality index (AQI), air quality index category, primary pollutant, PM2.5 fine particulate matter, PM10 inhalable particulate matter, carbon monoxide, nitrogen dioxide, ozone 1-hour average, ozone 8-hour average, and sulfur dioxide.

3. The global attention mechanism-based haze prediction method according to claim 1, wherein the step 2 of processing the acquired haze data comprises:

step 2.1: de-duplicating the haze data;

4. The global attention mechanism-based haze prediction method according to claim 1, wherein processing the acquired haze data in step 2 comprises step 2.3: unifying the numerical range.

5. The global attention mechanism-based haze prediction method according to claim 4, wherein the numerical range is unified in step 2.3 as follows: the names of the environmental monitoring points are label-encoded and converted into numerical data, and the numerical ranges of all numerical data are unified, i.e., the data ranges are converted to [0, 1].

6. The global attention mechanism-based haze prediction method according to claim 1, wherein the method for training the haze prediction model in step 3.5 is as follows:

selecting haze data of different environment monitoring points, and dividing the selected data into a training set and a test set according to a set proportion;

training by using a small batch method, wherein a small batch of data is trained each time; in each iteration of training, carrying out batch training on haze data; filling placeholder type data in each batch of training data, and calculating a loss value, accuracy and a predicted value; training the model on a training set by calculating the loss between the predicted value and the true value, and continuously updating model parameters until the training is finished;

7. The global attention mechanism-based haze prediction method according to claim 1, wherein in step 4 the scaling of the preliminary haze prediction result is inverted to generate the final haze prediction result.