CN116400433A - Single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution

Info

Publication number: CN116400433A
Application number: CN202310069184.6A
Authority: CN
Inventors: 向黎; 曹增辉; 张滢; 杨道勇; 聂于棚; 袁阅; 吕苗; 刘宸钊; 甘思旧; 敬文慧; 张伟; 许磊; 张瑞林; 任燕
Original assignee: 63796 FORCES PLA
Current assignee: 63796 FORCES PLA
Priority date: 2023-01-16
Filing date: 2023-01-16
Publication date: 2023-07-07

Abstract

The invention relates to a single-station time-by-time (hourly) air temperature forecasting method based on data fusion and mixed convolution, and belongs to the field of weather forecasting. The method comprises: selecting important single-station meteorological observation elements to obtain a single-station historical observation sequence; extracting multi-element forecast data to obtain a numerical forecast space-time sequence; normalizing the data; constructing a three-dimensional convolution module that takes the normalized numerical forecast space-time sequence as input and produces a time sequence containing spatial features; constructing an interactive learning model ICM that processes the single-station historical observation sequence and the time sequence extracted by the three-dimensional convolution to generate time feature information for the two kinds of data; constructing an MCNN network model that fuses the time feature information of the two branches produced by the ICM and forms a prediction result through a further round of interactive learning and a fully connected layer; and performing air temperature prediction with the model. The time sequence prediction model built by the method improves stability and prediction accuracy, and the method is suitable for time-by-time air temperature prediction 72-120 hours ahead at a single station.

Description

Single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution

Technical Field

The invention belongs to the field of weather forecasting, and particularly relates to a single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution.

Background

Data from a meteorological station record historical and current changes in meteorological elements and can be used to predict the atmospheric conditions at the station, but the nonlinear behaviour of the atmosphere is difficult to capture from such data alone. Numerical weather prediction aims to simulate the evolution of the atmosphere and provide future weather conditions. However, numerical prediction is based on discrete numerical calculation, and its result is usually gridded data containing errors, so accurate local weather forecasts are difficult to obtain. Local weather conditions in mountain areas are also complex and changeable, which makes forecasting even harder. Numerical forecast data and historical data can therefore be fused, so that the advantages of both are fully exploited when forecasting the atmosphere at a station.

Disclosure of Invention

(I) Technical problem to be solved

The technical problem to be solved by the invention is how to provide a single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution. The method addresses the following problems: numerical forecasting is based on discrete numerical calculation, its result is usually gridded data containing errors, accurate local weather forecasts are difficult to obtain, and the local weather conditions of mountain areas are complex and changeable, all of which make forecasting challenging.

(II) Technical solution

In order to solve the technical problems, the invention provides a single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution, which comprises the following steps:

S1, data preprocessing:

S11, selecting important single-station meteorological observation elements to obtain a single-station historical observation sequence;

S12, extracting multi-element forecast data to obtain a numerical forecast space-time sequence;

S13, normalizing the data;

S2, three-dimensional convolution feature extraction:

constructing a three-dimensional convolution module that takes the normalized numerical forecast space-time sequence as input and produces a time sequence containing spatial features;

S3, interactive learning model ICM:

constructing an interactive learning model ICM; for the single-station historical observation sequence and for the time sequence extracted by the three-dimensional convolution, extracting the local correlations of each time sequence with a one-dimensional convolution filter cnn1d, building a hierarchical structure to extract sequences at different time scales, and generating time feature information for the two kinds of data;

S4, mixed convolution network model MCNN:

constructing an MCNN network model that fuses the time feature information of the two branches obtained in S3 and forms a prediction result through a further round of interactive learning and a fully connected layer; training the MCNN network model;

S5, predicting results:

using the trained MCNN network model, preprocessing the latest observation data of the station and the numerical model forecast data, feeding them into the model, and outputting time-by-time air temperature forecast data for 72-120 hours.

(III) Beneficial effects

The invention provides a single-station time-by-time air temperature prediction method based on data fusion and mixed convolution, referred to as MCNN (Mixed Convolution Neural Network). The method builds a time sequence prediction model that makes effective use of single-station historical data and numerical forecast data for time-by-time air temperature prediction, improving stability and prediction accuracy. The method is suitable for time-by-time air temperature prediction 72-120 hours ahead at a single station.

Drawings

FIG. 1 is the structure of the three-dimensional convolution model (CNN) of the present invention;

FIG. 2 is the structure of the Interactive Convolution Module (ICM) of the present invention;

FIG. 3 is the structure of the one-dimensional convolution operation (cnn1d) of the present invention;

FIG. 4 is the overall structure of the space-time convolutional network of the present invention;

FIG. 5 is an example of test set prediction results;

FIG. 6 is prediction result comparison one;

FIG. 7 is prediction result comparison two;

FIG. 8 is prediction result comparison three.

Detailed Description

To make the objects, contents and advantages of the present invention more apparent, the following detailed description of the present invention will be given with reference to the accompanying drawings and examples.

The invention relates to the technical field of time-by-time air temperature prediction in meteorology, and in particular to a single-station medium-term time-by-time air temperature prediction method based on data fusion and mixed convolution.

The technical problem to be solved by the invention is to design a model architecture based on a deep neural network, integrate information characteristics of site data and forecast data, and realize single-site time-by-time air temperature prediction for 72-120 hours.

To fuse numerical forecast data with historical station data and make full use of the strengths of both, the invention provides MCNN (Mixed Convolution Neural Network), a method based on data fusion and mixed convolution that builds a time sequence prediction model. The model makes effective use of single-station historical data and numerical forecast data for time-by-time air temperature prediction and improves stability and prediction accuracy. The method is suitable for time-by-time air temperature prediction 72-120 hours ahead at a single station.

The invention designs a deep neural network MCNN based on mixed convolution and builds a spatial convolution module and a time sequence module. Through feature fusion, it predicts the time-by-time air temperature at a single station out to 120 h and produces the whole prediction in a single pass.

The technical solution adopted by the invention is a mountain single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution, comprising the following steps:

S1, data preprocessing:

S11, selecting important single-station meteorological observation elements to obtain a single-station historical observation sequence.

S12, extracting multi-element forecast data to obtain a numerical forecast space-time sequence.

S13, data normalization.

S2, three-dimensional convolution feature extraction (CNN):

A three-dimensional convolution module is constructed, comprising a series of spatial convolutions, downsampling, activation functions, fully connected layers and so on. The normalized numerical forecast space-time sequence is input, and a time sequence containing spatial features is produced.

S3, interactive learning (ICM):

For the single-station historical observation sequence and for the time sequence extracted by the three-dimensional convolution, a one-dimensional convolution filter is used to extract the local correlations of each time sequence, a hierarchical structure is built to extract sequences at different time scales, and time feature information is generated for the two kinds of data.

S4, hybrid convolutional network (MCNN):

The time feature information of the two branches obtained in step S3 is fused, and a prediction result is formed through a further round of interactive learning and a fully connected layer.

S5, predicting results:

Using the trained model, the latest observation data of the station and the numerical model forecast data are preprocessed and fed into the model, which outputs time-by-time air temperature forecast data for 72-120 hours.

Example 1:

The mountain single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution mainly comprises the following steps: data preprocessing, three-dimensional convolution feature extraction, interactive learning, the space-time convolution network, and output of the prediction result.

S1, data preprocessing:

S11, selecting important single-station meteorological observation elements

The selected meteorological observation elements include temperature, humidity, air pressure, wind speed, wind direction, total cloud amount and the like, and together they give a single-station historical observation sequence S, where L is the time series length and N is the number of elements. Observation data of the station over a period of time (e.g., L = 240 hours) are taken as input. Before that, to reduce the complexity of the model, the elements are screened for importance: a random forest method is used with air temperature as the target, noise is added to each of the other elements in turn, and the larger the resulting loss, the more critical the element to which the noise was added. The results are shown in Table 1.

Table 1. Importance scores of the variables
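The noise-based screening idea can be illustrated with a minimal sketch. It is not the patent's random forest: `model` is a hypothetical stand-in for any fitted predictor of air temperature, and `noise_importance`, `noise_std` and the toy data are assumptions made only for illustration. The score of an element is the increase in mean squared error when Gaussian noise is added to that element alone.

```python
import random

def mse(model, rows, targets):
    """Mean squared error of `model` over the rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def noise_importance(model, rows, targets, noise_std=1.0, seed=0):
    """Loss increase when Gaussian noise is added to one element at a time."""
    rng = random.Random(seed)
    base = mse(model, rows, targets)
    scores = []
    for j in range(len(rows[0])):
        # Perturb only column j; all other elements stay untouched.
        noisy = [r[:j] + [r[j] + rng.gauss(0.0, noise_std)] + r[j + 1:] for r in rows]
        scores.append(mse(model, noisy, targets) - base)
    return scores

# Hypothetical stand-in for a fitted model: the target depends on element 0 only.
model = lambda r: 2.0 * r[0]
rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [model(r) for r in rows]
scores = noise_importance(model, rows, targets)
```

In this toy case the element the model ignores gets a score of zero, while the element it depends on gets a positive score, which is the ranking behaviour the screening step relies on.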

S12, extracting multi-element forecast data

Numerical model forecast data are extracted from the European fine-grid data, centred on the station, to obtain spatial data indexed by time. The spatial range of the data is 5 degrees, the resolution is 0.125 degrees, and the time interval is 3 or 6 hours. The MCNN model selects 2 m air temperature, surface air pressure, surface temperature, total cloud amount and terrain data as the elements of the forecast data set. A record of the data set is expressed as a numerical forecast space-time sequence P, where L' is the length of the time series, N' is the number of meteorological elements, and W × H is the size of the spatial grid.
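As a quick consistency check on the grid size, the sketch below counts the grid points along one axis. It assumes that "5 degrees" means the box extends 5 degrees on each side of the station; that reading is an assumption, chosen because it reproduces the 81×81 grid used later for the convolution module. The function name is hypothetical.

```python
def grid_points_per_side(half_width_deg: float, resolution_deg: float) -> int:
    """Number of grid points along one axis of a box centred on the station."""
    # Cells on each side of the centre point, plus the centre point itself.
    cells_per_side = round(half_width_deg / resolution_deg)
    return 2 * cells_per_side + 1

n = grid_points_per_side(5.0, 0.125)
print(n, "x", n)  # 81 x 81
```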

S13, data normalization

The two types of data from S11 and S12 are normalized with the zero-mean normalization method:

$S' = (S - S_m)/S_s$ (3)

$P' = (P - P_m)/P_s$ (4)

where S' and P' are the values after normalization, S and P are the values before normalization, $S_m$ is the mean of S, $S_s$ is the standard deviation of S, $P_m$ is the mean of P, and $P_s$ is the standard deviation of P.
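Equations (3) and (4) can be sketched for a single element as follows. The function name and sample values are illustrative, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
from statistics import fmean, pstdev

def zero_mean_normalize(values):
    """Return (normalized values, mean, standard deviation)."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (assumed)
    return [(v - mean) / std for v in values], mean, std

temps = [12.0, 14.0, 16.0, 18.0]
normalized, s_m, s_s = zero_mean_normalize(temps)
```

After normalization the series has zero mean and unit standard deviation, so elements with different units (for example pressure and temperature) enter the network on a comparable scale.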

S2, three-dimensional convolution feature extraction:

Convolutional neural networks play a vital role in computer vision. A convolution kernel has a local receptive field that captures local features, and spatial downsampling allows more detailed features to be extracted step by step. Convolution operations can likewise be applied to the forecast grid data to extract meteorological element information from it. The invention uses the local receptive field of the convolution kernel and spatial downsampling to build a module that extracts, from the forecast data near the station, a time sequence of feature information that can be matched with the historical data.

FIG. 1 shows the structure of the three-dimensional convolution model. A three-dimensional convolution module (CNN) is constructed, comprising a series of spatial convolutions, downsampling, activation functions, fully connected layers and so on, which finally forms a time sequence containing spatial features. The input of the convolution module is the normalized numerical forecast space-time sequence P', and the output is the time sequence $P_1$ containing spatial feature information.

For the station considered here, an 81×81 grid is selected, and European fine-grid data with forecast lead times of 0-120 hours are used as the space-time information around the station, i.e., P from step S12, normalized to P'. Features are extracted from the grid data along the time sequence with multi-layer convolution operations, LeakyRelu activation is applied after downsampling, and finally the output is expanded into one-dimensional information by fully connected layers.

The three-dimensional convolution module (CNN) network adopted by the invention comprises the following layers:
1. a three-dimensional spatial convolution layer Conv3d(5, 32, 3), raising the channel count to 32;
2. a three-dimensional spatial convolution layer Conv3d(32, 64, 3), raising the channel count to 64;
3. a first spatial downsampling layer MaxPool3d(3), reducing the length and width of the grid by a factor of 3;
4. a three-dimensional spatial convolution layer Conv3d(64, 32, 3), reducing the channel count to 32;
5. a three-dimensional spatial convolution layer Conv3d(32, 16, 3), reducing the channel count to 16;
6. a second spatial downsampling layer MaxPool3d(3), reducing the length and width of the grid by a factor of 3 again;
7. a three-dimensional spatial convolution layer Conv3d(16, 4, 3), reducing the channel count to 4;
8. a three-dimensional spatial convolution layer Conv3d(4, 1, 3), reducing the channel count to 1;
9. a first activation function layer LeakyRelu();
10. a first fully connected layer Linear();
11. a second activation function layer LeakyRelu();
12. a second fully connected layer Linear().
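To see how the layer stack shrinks the 81×81 grid, the sketch below traces one spatial dimension through the convolution and pooling layers. It reads Conv3d(a, b, 3) as (input channels, output channels, kernel size) and assumes stride 1 with no padding for the convolutions and a pooling stride equal to the pooling window. None of these settings is stated in the text, so the numbers are illustrative only. Only the spatial size is traced, not the time dimension, and the names are hypothetical.

```python
def trace_spatial_size(size, layers):
    """Follow one spatial dimension through conv and pooling layers."""
    sizes = [size]
    for kind, k in layers:
        if kind == "conv":      # kernel k, stride 1, no padding (assumed)
            size = size - k + 1
        elif kind == "pool":    # pooling window k with stride k (assumed)
            size = size // k
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        sizes.append(size)
    return sizes

LAYERS = [("conv", 3), ("conv", 3), ("pool", 3),
          ("conv", 3), ("conv", 3), ("pool", 3),
          ("conv", 3), ("conv", 3)]
# Channel count at the input and after each of the six convolutions.
CHANNELS = [5, 32, 64, 32, 16, 4, 1]

print(trace_spatial_size(81, LAYERS))
# [81, 79, 77, 25, 23, 21, 7, 5, 3]
```

Under these assumptions the final single-channel map is 3×3 per time step, which the fully connected layers would then flatten into the one-dimensional sequence.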

S3, an interactive learning model (ICM):

An interactive learning model ICM is constructed. For the single-station historical observation sequence and for the time sequence extracted by the three-dimensional convolution, a one-dimensional convolution operation (cnn1d) is used to extract the local correlations of each time sequence, and a hierarchical structure is built to extract sequences at different time scales.

The detailed structure of the ICM model is shown in FIG. 2. The input time sequence is layer 1; it is split by parity into a sequence pair that forms layer 2; each sequence of layer 2 is again split by parity into a sequence pair, and so on. The processing of each small sequence pair is called an Interactive Convolution Block (ICB), in which interactive learning is performed. The outputs of all ICBs of the last layer are spliced together, and the output of the ICM model is obtained through a fully connected layer.

Each small sequence is split and downsampled to generate one sub-sequence pair, as follows:

$X_{odd} = X_{2i-1},\quad i = 1, 2, \dots, L/2$ (5)

$X_{even} = X_{2i},\quad i = 1, 2, \dots, L/2$ (6)

where X denotes the sequences obtained by layer-by-layer splitting (Split), the initial sequence being either the single-station historical observation sequence S' or the time sequence $P_1$ output from the numerical forecast by step S2. $X_{odd}$ and $X_{even}$ denote the downsampled sub-sequence pair, namely the odd-term sequence and the even-term sequence, respectively. Downsampling the original sequence in the time dimension lets the module learn dynamic information at different time resolutions.
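The parity split of equations (5) and (6), applied layer by layer, can be sketched with plain list slicing. The function names are hypothetical, and the list index 0 corresponds to the term $X_1$.

```python
def parity_split(seq):
    """Split a sequence into its odd-position and even-position terms (1-based)."""
    x_odd = seq[0::2]   # X_1, X_3, X_5, ...
    x_even = seq[1::2]  # X_2, X_4, X_6, ...
    return x_odd, x_even

def split_levels(seq, depth):
    """Return the sub-sequences present at each layer of the hierarchy."""
    levels = [[seq]]  # levels[0] is the input sequence (layer 1 in the text)
    for _ in range(depth):
        levels.append([half for s in levels[-1] for half in parity_split(s)])
    return levels

hours = list(range(1, 9))
levels = split_levels(hours, 2)
# levels[1] -> [[1, 3, 5, 7], [2, 4, 6, 8]]
# levels[2] -> [[1, 5], [3, 7], [2, 6], [4, 8]]
```

Each additional layer halves the length of the sub-sequences and doubles their sampling interval, which is how the hierarchy exposes coarser time scales.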

The Interactive Convolution Block (ICB) processes each sub-sequence pair in two steps: 1) the sub-sequences $X_{odd}$ and $X_{even}$ are each passed through a one-dimensional convolution extraction network (cnn1d), the result is converted to exp form, and it is multiplied element-wise with the other original sub-sequence, giving a pair of scaled sub-sequences as in equation (7); 2) the scaled sub-sequences are each passed through a further one-dimensional convolution extraction network, projected to a hidden state, and added to the scaled sub-sequences, giving the outputs $X'_{odd}$ and $X'_{even}$ of the ICB module as in equation (8).

Here σ, ρ, τ and the fourth network used in equations (7) and (8) are all one-dimensional convolution extraction networks (cnn1d). As shown in FIG. 3, each comprises: one-dimensional padding ReplicationPad1d, one-dimensional convolution Conv1d, LeakyRelu activation, a Dropout layer, one-dimensional convolution Conv1d, and Tanh activation, where T denotes the sequence length, C the number of channels, and K the convolution kernel.
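The two ICB steps can be sketched as below, under stated assumptions. The four networks are passed in as plain functions (`net_a` to `net_d` are hypothetical names standing in for the cnn1d networks), and the pairing in step 2, where each scaled sub-sequence is updated with the projection of the other one, is an assumption about the interaction rather than something the text spells out.

```python
import math

def icb(x_odd, x_even, net_a, net_b, net_c, net_d):
    """One interactive convolution block on a sub-sequence pair."""
    # Step 1: scale each sub-sequence by exp(network(other sub-sequence)).
    s_odd = [o * math.exp(w) for o, w in zip(x_odd, net_a(x_even))]
    s_even = [e * math.exp(w) for e, w in zip(x_even, net_b(x_odd))]
    # Step 2: add the projection of the other scaled sub-sequence (assumed pairing).
    out_odd = [s + p for s, p in zip(s_odd, net_c(s_even))]
    out_even = [s + p for s, p in zip(s_even, net_d(s_odd))]
    return out_odd, out_even

def toy_net(scale):
    """Toy stand-in for a cnn1d network; tanh keeps outputs in (-1, 1)."""
    return lambda seq: [math.tanh(scale * v) for v in seq]

zero = lambda seq: [0.0] * len(seq)

# With all-zero networks the block leaves the pair unchanged.
identity_out = icb([1.0, 3.0], [2.0, 4.0], zero, zero, zero, zero)
mixed_out = icb([1.0, 3.0], [2.0, 4.0],
                toy_net(0.1), toy_net(0.1), toy_net(0.1), toy_net(0.1))
```

The all-zero case is a useful sanity check: exp(0) = 1 and the added projection is 0, so any change in the outputs comes from what the networks have learned about the other sub-sequence.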

Finally, after repeated downsampling, the time sequences of different resolutions are updated by the one-dimensional convolution extraction networks cnn1d and the interactive convolution blocks ICB to give $X'_{odd}$ and $X'_{even}$. The sub-sequences are then rearranged by reversing the parity split and concatenated into a new sequence, which is added to the original sequence through a residual connection. A fully connected layer then generates the final output. The specific operations are:

$X' = \mathrm{Reverse}(X'_{odd}, X'_{even})$ (9)

$X_{out} = \mathrm{Linear}(X + X')$ (10)

where X' denotes the new sequence representation, $X_{out}$ denotes the final output, and Linear denotes the fully connected layer.
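The reverse parity split of equation (9) and the residual sum inside equation (10) can be sketched as follows. The trained fully connected layer is left out, so only the argument X + X' of Linear is computed, and the function names are hypothetical.

```python
def reverse_parity_split(x_odd, x_even):
    """Interleave two equal-length sub-sequences back into one sequence."""
    merged = []
    for o, e in zip(x_odd, x_even):
        merged.extend([o, e])
    return merged

def residual_merge(x, x_odd_new, x_even_new):
    """Compute X + X', the input to the fully connected layer in equation (10)."""
    x_new = reverse_parity_split(x_odd_new, x_even_new)
    return [a + b for a, b in zip(x, x_new)]

restored = reverse_parity_split([1, 3, 5], [2, 4, 6])
summed = residual_merge([1, 2, 3, 4], [10, 30], [20, 40])
```

Interleaving restores the original time order, so the residual sum adds each updated term to the input term at the same time step.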

S4, hybrid convolutional network model (MCNN):

MCNN is the overall deep neural network of the invention. As shown in FIG. 4, it proceeds as follows:
1) the meteorological elements S that the random forest screening identifies as having a high contribution are normalized to S' and taken as input one, and the sequence P of European fine-grid numerical forecasts plus terrain data is normalized to P' and taken as input two;
2) the three-dimensional convolution module (CNN) extracts the spatial sequence feature $P_1$ from input two P';
3) the interactive learning model (ICM) is applied separately to input S' and to feature $P_1$ to obtain $Y_1$ and $Y_2$, at which point the intermediate loss functions $L_1$ and $L_2$ can be obtained;
4) the time sequences $Y_1$ and $Y_2$ are projected (one-dimensional convolution Conv1d) into two sequences of equal length, which are spliced (cat) to obtain $Y_3$; this is the fusion block;
5) the output $Y_3$ of the previous step is passed through the interactive learning model (ICM) again to obtain $Y_4$;
6) $Y_4$ is passed through a fully connected layer (Linear) to form the prediction result Y, from which the loss function $L_3$ is obtained.

The main operations are:

$Y_1 = \mathrm{ICM}(S')$ (11)

$Y_2 = \mathrm{ICM}(\mathrm{CNN}(P'))$ (12)

$Y_3 = \mathrm{CAT}(\alpha(Y_1), \beta(Y_2))$ (13)

$Y = \mathrm{Linear}(\mathrm{ICM}(Y_3))$ (14)

where S' denotes input one, P' denotes input two, CNN denotes the three-dimensional convolution module, and ICM denotes the interactive learning model; α and β denote one-dimensional convolutions (Conv1d), whose role is to adjust the sequence length; CAT denotes concatenation, whose role is to merge the features of the two branches; and Linear denotes the fully connected layer.

In addition, the loss function used to train the whole MCNN network model is constructed with intermediate supervision: the two intermediate results generated from the single-station historical data and from the numerical forecast data are supervised as well. The total loss function therefore includes the loss functions of the two branches before their information is fused, and is expressed as:

$L = aL_1 + bL_2 + cL_3$ (15)

where $L_1$, $L_2$ and $L_3$ denote the three loss functions (of type Loss1), and a, b and c are the weight parameters of the three loss functions. The weights sum to 1 and can be set, for example, to 0.5, 0.1 and 0.4.
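Equation (15) with the example weights can be sketched as below. The individual loss is taken to be the mean absolute error, which is an assumption because the text does not define the loss type precisely. Comparing both branch outputs with the same observed target is also an assumption, and the sample numbers are made up for illustration.

```python
def mean_absolute_error(pred, target):
    """Average absolute difference between prediction and target."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def total_loss(l1, l2, l3, a=0.5, b=0.1, c=0.4):
    """Weighted sum L = a*L1 + b*L2 + c*L3 with a + b + c = 1."""
    if abs(a + b + c - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return a * l1 + b * l2 + c * l3

target = [10.0, 12.0, 14.0]
l1 = mean_absolute_error([11.0, 12.0, 13.0], target)  # station branch output
l2 = mean_absolute_error([12.0, 14.0, 16.0], target)  # forecast branch output
l3 = mean_absolute_error([10.5, 12.0, 14.5], target)  # fused prediction
loss = total_loss(l1, l2, l3)
```

Because the branch losses enter the total, each branch receives a training signal of its own rather than only the gradient that flows back through the fusion block.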

S5, predicting results:

Using the trained model, the latest observation data and numerical forecast data are preprocessed and fed into the model, which predicts the temperature sequence.

Effect verification

FIG. 5 is an example from the test set, in which the blue line is the temperature measured at the station, the red line is the predicted temperature, the abscissa is the forecast time, and the ordinate is the temperature value. Verification statistics over 145 records show that, for the 120-hour forecast, the mean absolute error of the model's time-by-time air temperature prediction is 1.2 degrees, and the accuracy, counting errors of less than 3 degrees as correct, is 93 percent.
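The two verification statistics can be computed as in the following sketch. It assumes the accuracy is the fraction of forecast hours whose absolute error is below 3 degrees; the text does not say how results are aggregated across records, and the sample values are invented for illustration.

```python
def mean_absolute_error(pred, obs):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def accuracy_within(pred, obs, tol=3.0):
    """Fraction of hours whose absolute error is below `tol` degrees."""
    hits = sum(1 for p, o in zip(pred, obs) if abs(p - o) < tol)
    return hits / len(obs)

obs = [10.0, 12.0, 15.0, 11.0]
pred = [11.0, 12.5, 11.0, 10.0]
mae = mean_absolute_error(pred, obs)  # errors are 1, 0.5, 4 and 1
acc = accuracy_within(pred, obs)      # three of four hours are within 3 degrees
```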

FIGS. 6-8 are updated results from applying the time-by-time air temperature prediction, in which the blue line is the temperature measured at the station, the red line is the predicted temperature, the abscissa is the forecast time, the ordinate is the temperature value, and the bars show the prediction differences, with red marking absolute differences greater than 3; mes is the mean absolute error and ACC is the accuracy. Taken together, the results show that the invention performs well.

The time-by-time air temperature forecasting method MCNN (Mixed Convolution Neural Network) based on data fusion and mixed convolution effectively addresses the low accuracy of forecast products that rely solely on station data or solely on numerical model forecasts. It improves the accuracy of time-by-time air temperature forecasting, and the forecasting accuracy can reach 92% for an error tolerance of 3 degrees.

The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims

1. A single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution is characterized by comprising the following steps:

S1, data preprocessing:

S13, data normalization;

S2, three-dimensional convolution feature extraction:

S3, interactive learning model ICM:

S4, mixed convolution network model MCNN:

S5, predicting results:

2. The single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution as claimed in claim 1, wherein said step S11 specifically comprises: selecting meteorological observation elements to obtain a single-station historical observation sequence S, where L is the time series length and N is the number of elements; a random forest method is used with the temperature element as the target, noise is added to each of the other elements in turn to determine the importance of the elements, and the larger the loss, the more critical the element to which the noise was added.

3. The single-station time-by-time air temperature prediction method based on data fusion and mixed convolution as defined in claim 2, wherein the step S12 specifically includes: extracting numerical model forecast data from European fine-grid data, centred on the station, to obtain spatial data indexed by time, wherein the spatial range of the data is 5 degrees, the resolution is 0.125 degrees, and the time interval is 3 or 6 hours; and selecting meteorological elements to form a forecast data set, a record of the data set being represented as a numerical forecast space-time sequence P.

4. The single-station time-by-time air temperature prediction method based on data fusion and mixed convolution as defined in claim 3, wherein said step S13 specifically includes: normalizing the two types of data from S11 and S12 with a zero-mean normalization method,

$S' = (S - S_m)/S_s$ (3)

$P' = (P - P_m)/P_s$ (4)

5. The single-station time-by-time air temperature prediction method based on data fusion and mixed convolution as defined in claim 4, wherein said step S21 specifically comprises: constructing a three-dimensional convolution module CNN comprising spatial convolution, downsampling, activation functions and fully connected layers, wherein the input of the convolution module is the normalized numerical forecast space-time sequence P'; features are extracted from the grid data along the time sequence with multi-layer convolution operations, LeakyRelu activation is applied after downsampling, the output is finally expanded into one-dimensional information by the fully connected layers, and the time sequence $P_1$ containing spatial feature information is output.

6. The single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution as claimed in claim 5, wherein the three-dimensional convolution module CNN network selects an 81×81 grid and comprises: a three-dimensional spatial convolution layer Conv3d(5, 32, 3), raising the channel count to 32; a three-dimensional spatial convolution layer Conv3d(32, 64, 3), raising the channel count to 64; a first spatial downsampling layer MaxPool3d(3), reducing the length and width of the grid by a factor of 3; a three-dimensional spatial convolution layer Conv3d(64, 32, 3), reducing the channel count to 32; a three-dimensional spatial convolution layer Conv3d(32, 16, 3), reducing the channel count to 16; a second spatial downsampling layer MaxPool3d(3), reducing the length and width of the grid by a factor of 3 again; a three-dimensional spatial convolution layer Conv3d(16, 4, 3), reducing the channel count to 4; a three-dimensional spatial convolution layer Conv3d(4, 1, 3), reducing the channel count to 1; a first activation function layer LeakyRelu(); a first fully connected layer Linear(); a second activation function layer LeakyRelu(); and a second fully connected layer Linear().

7. The single-station time-by-time air temperature prediction method based on data fusion and mixed convolution according to claim 5 or 6, wherein in the step S3, the ICM model includes: the input time sequence is layer 1; it is split by parity into a sequence pair that forms layer 2; each sequence of layer 2 is again split by parity into a sequence pair, and so on; the processing of each small sequence pair is called an interactive convolution block ICB, in which interactive learning is performed; the outputs of all ICBs of the last layer are spliced together, and the output of the ICM model is obtained through a fully connected layer.

8. The single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution as claimed in claim 7, wherein each small sequence is split and downsampled to generate one sub-sequence pair, as follows:

$X_{odd} = X_{2i-1},\quad i = 1, 2, \dots, L/2$ (5)

$X_{even} = X_{2i},\quad i = 1, 2, \dots, L/2$ (6)

wherein X denotes the sequences obtained by layer-by-layer splitting, the initial sequence being either the single-station historical observation sequence S' or the time sequence $P_1$ output from the numerical forecast by step S2; $X_{odd}$ and $X_{even}$ denote the downsampled sub-sequence pair, namely the odd-term sequence and the even-term sequence, respectively;

the interactive convolution block ICB processes each sub-sequence pair in two steps: 1) the sub-sequences $X_{odd}$ and $X_{even}$ are each passed through a one-dimensional convolution extraction network cnn1d, the result is converted to exp form, and it is multiplied element-wise with the other original sub-sequence, giving a pair of scaled sub-sequences as in equation (7); 2) the scaled sub-sequences are each passed through a further one-dimensional convolution extraction network, projected to a hidden state, and added to the scaled sub-sequences, giving the outputs $X'_{odd}$ and $X'_{even}$ of the ICB module as in equation (8);

wherein σ, ρ, τ and the fourth network used in equations (7) and (8) are all one-dimensional convolution extraction networks (cnn1d), each comprising: one-dimensional padding ReplicationPad1d, one-dimensional convolution Conv1d, LeakyRelu activation, a Dropout layer, one-dimensional convolution Conv1d, and Tanh activation; wherein T denotes the sequence length;

finally, after repeated downsampling, the time sequences of different resolutions are updated by the one-dimensional convolution extraction networks cnn1d and the interactive convolution blocks ICB to give $X'_{odd}$ and $X'_{even}$; the sub-sequences are then rearranged by reversing the parity split and concatenated into a new sequence, which is added to the original sequence through a residual connection; a fully connected layer then generates the final output; the specific operations are:

$X' = \mathrm{Reverse}(X'_{odd}, X'_{even})$ (9)

$X_{out} = \mathrm{Linear}(X + X')$ (10)

9. The single-station time-by-time air temperature prediction method based on data fusion and mixed convolution according to claim 8, wherein in the step S4, the MCNN network model includes: 1) the meteorological elements S that the random forest screening identifies as having a high contribution are normalized to S' and taken as input one, and the sequence P of European fine-grid numerical forecasts plus terrain data is normalized to P' and taken as input two; 2) the three-dimensional convolution module CNN extracts the spatial sequence feature $P_1$ from input two P'; 3) the interactive learning model ICM is applied separately to input S' and to feature $P_1$ to obtain $Y_1$ and $Y_2$, at which point the intermediate loss functions $L_1$ and $L_2$ can be obtained; 4) the time sequences $Y_1$ and $Y_2$ are projected by one-dimensional convolution Conv1d into two sequences of equal length, which are spliced (CAT) to obtain $Y_3$; 5) the output $Y_3$ of the previous step is passed through the interactive learning model ICM to obtain $Y_4$; 6) $Y_4$ is passed through the fully connected layer to form the prediction result Y, from which the loss function $L_3$ is obtained; specifically:

$Y_1 = \mathrm{ICM}(S')$ (11)

$Y_2 = \mathrm{ICM}(\mathrm{CNN}(P'))$ (12)

$Y_3 = \mathrm{CAT}(\alpha(Y_1), \beta(Y_2))$ (13)

$Y = \mathrm{Linear}(\mathrm{ICM}(Y_3))$ (14)

wherein S' denotes input one, P' denotes input two, CNN denotes the three-dimensional convolution module, and ICM denotes the interactive learning model; α and β denote one-dimensional convolutions Conv1d, whose role is to adjust the sequence length; CAT denotes concatenation, whose role is to merge the features of the two branches; and Linear denotes the fully connected layer.

10. The single-station time-by-time air temperature forecasting method based on data fusion and mixed convolution as claimed in claim 9, wherein the loss function used to train the whole MCNN network model is constructed with intermediate supervision, the two intermediate results generated from the single-station historical data and from the numerical forecast data being supervised, and the total loss function includes the loss functions of the two branches before their information is fused, specifically:

$L = aL_1 + bL_2 + cL_3$ (15)

wherein $L_1$, $L_2$ and $L_3$ denote the three loss functions (of type Loss1), a, b and c respectively denote the weight parameters of the three loss functions, and the weight parameters sum to 1.