CN111967679B - Ionosphere total electron content forecasting method based on TCN model - Google Patents

Ionosphere total electron content forecasting method based on TCN model

Info

Publication number
CN111967679B
CN111967679B (application CN202010842098.0A)
Authority
CN
China
Prior art keywords
tec
tcn
data
model
ionosphere
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010842098.0A
Other languages
Chinese (zh)
Other versions
CN111967679A (en)
Inventor
唐丝语
黄智�
陈伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Normal University
Original Assignee
Jiangsu Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Normal University filed Critical Jiangsu Normal University
Priority to CN202010842098.0A priority Critical patent/CN111967679B/en
Publication of CN111967679A publication Critical patent/CN111967679A/en
Application granted granted Critical
Publication of CN111967679B publication Critical patent/CN111967679B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/01Satellite radio beacon positioning systems transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/03Cooperating elements; Interaction or communication between different cooperating elements or between cooperating elements and receivers
    • G01S19/07Cooperating elements; Interaction or communication between different cooperating elements or between cooperating elements and receivers providing data for correcting measured positioning data, e.g. DGPS [differential GPS] or ionosphere corrections
    • G01S19/072Ionosphere corrections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Business, Economics & Management (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Operations Research (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)

Abstract

The invention discloses a TCN model-based ionosphere total electron content forecasting method, and relates to the technical field of ionosphere total electron content forecasting. The forecasting method comprises the following steps: firstly, acquiring data, dividing the data into a training set and a test set, and carrying out normalization processing; then, determining the topological structure of the TCN neural network, initializing its weights, and adjusting the parameters in the model; next, inputting the training set into the ionosphere TEC forecast model in batches, calculating the output error over the effective history length, and back-propagating the error to update the TCN forecast model parameters until the optimal parameters are found, thereby completing training; and finally, inputting the test set data, obtaining the TEC prediction result, comparing the prediction result with the observation data, and evaluating the effectiveness of the prediction model. Owing to the convolutional characteristics of the TCN, the method can better capture the spatio-temporal characteristics of the ionospheric electron content, improves the accuracy of forecasting the total electron content of the ionosphere to a certain extent, and enhances real-time performance.

Description

Ionosphere total electron content forecasting method based on TCN model
Technical Field
The invention relates to the technical field of ionosphere total electron content forecasting, in particular to an ionosphere total electron content forecasting method based on a temporal convolutional network (TCN) model.
Background
The ionosphere is an important part of the earth's atmosphere, extending from about 60 km above the ground up to the magnetopause, and contains a large number of free electrons and ions that affect the propagation of radio waves.
The total electron content (TEC) of the ionosphere is one of its important parameters and determines the extent to which the ionosphere affects satellite signals. Accurate TEC information is therefore of great significance for radio-wave propagation correction and for ionospheric theory research.
Once the satellite signal frequency is known, the ionospheric delay can be calculated simply by determining the TEC along the propagation path. Dual-frequency or multi-frequency users can form linear combinations of their observations to eliminate or greatly weaken the ionospheric delay. However, because dual-frequency and multi-frequency GNSS receivers are far more expensive than single-frequency receivers, a large proportion of GNSS users in practice still rely on single-frequency equipment. Single-frequency users generally cannot derive the ionospheric delay from their own measurements and usually need an ionospheric TEC model to estimate and correct it.
Research on ionospheric TEC prediction models has gradually shifted from traditional statistical and econometric models, such as the International Reference Ionosphere (IRI), exponential smoothing, the autoregressive moving average (ARMA) model, similarity-based prediction and autocorrelation analysis, towards neural network models, which have a clear advantage in learning nonlinear input-output relationships and time-varying behaviour. In particular, with the rapid development of computer hardware and artificial intelligence in recent years, deep learning networks have replaced conventional neural networks represented by BP and have been applied to space weather prediction. Among them, the long short-term memory network LSTM and its variant GRU show good performance in ionospheric TEC forecasting thanks to their sequence learning and sequence conversion capabilities. However, LSTM and GRU process the sequence step by step in time, so the vanishing- or exploding-gradient problem cannot be completely avoided; they also require a large amount of memory to store the partial results of many units, which makes computation slow and real-time requirements hard to meet in practical engineering applications. Furthermore, the ionospheric TEC depends not only on time but also on spatial location. It is therefore necessary to further extract spatio-temporal variation characteristics to achieve more accurate and faster predictions.
Disclosure of Invention
In view of the above, the invention discloses a TCN model-based ionosphere total electron content forecasting method, which can not only provide ionospheric refraction correction parameters for the large population of single-frequency GNSS users, but also provide substantial data support for the analysis and study of the ionospheric space environment.
According to the invention, the ionosphere total electron content forecasting method based on the TCN model comprises the following steps:
step one: ionosphere TEC data, solar radiant flux values F10.7, and geomagnetic activity index Ap are acquired and TEC is converted to a high time resolution sequence.
Step two: dividing the converted TEC time sequence, solar radiation flux value F10.7 and geomagnetic activity index Ap into a training set and a testing set, and then carrying out zero-mean normalization processing on data in the training set.
Step three: and determining the topological structure of the TCN neural network, initializing the weight of the TCN neural network, and adjusting parameters in the model.
Step four: inputting the training set into the ionosphere TEC forecast model in batches, calculating the output error of the effective history length, and updating the weight parameter of the TCN forecast model by back propagation until the maximum iteration number or the accuracy and the error rate reach the optimal values;
step five: comparing the training results; if the error is within the set range, ending training and executing step six; if the error exceeds the set range, updating the TCN network parameters and training again until the optimal parameters are found, and determining the optimal model.
Step six: after model training is completed, the trained model is denoted TCN_TEC; the test set data are input into the TCN_TEC model, the obtained output is denoted Y_0, and Y_0 is then subjected to inverse Z-score normalization to obtain the final TEC prediction result.
Step seven: comparing the prediction result with the observation data, and evaluating the effectiveness of the model using the three performance indexes MAPE, MAD and RMSE.
Preferably, in step one, the high-resolution TEC sequence is obtained as follows: the downloaded ionospheric data are converted into VTEC data by spherical harmonic expansion, abnormal data outside the normal range of electron content are removed, and the ionospheric TEC data are filled by spline interpolation to obtain a TEC sequence with a time resolution of 1 h; F10.7 and Ap are processed into a data format consistent with the TEC time resolution.
Preferably, in step two, the TEC, solar radiation flux F10.7 and geomagnetic activity index Ap are normalized as follows: the TEC time series is denoted X_0 = {x_1, x_2, x_3, ..., x_n} and is divided so that training set X_tr = {x_1, x_2, ..., x_m} : test set X_te = {x_1, x_2, ..., x_l} = 4:1; after the data set is divided, each element x_t of the training set is zero-mean normalized to obtain the normalized training set X'_tr = {x'_1, x'_2, ..., x'_m}, using the formula x'_t = (x_t − μ)/σ, where μ is the mean of the original TEC time series, σ is its standard deviation, and 1 ≤ t ≤ m; the training and test sets of the solar radiation flux F10.7 and the geomagnetic activity index Ap are also divided in a 4:1 ratio, and their zero-mean normalization uses the same process and formula as that of the TEC data.
Preferably, in step three, the TCN network is designed as a three-layer structure comprising an input layer, a hidden layer and an output layer; the input layer takes the geomagnetic index Ap, the solar activity index F10.7 and the electron content TEC as input parameters; the hidden layer uses causal convolution and dilated convolution as the standard convolutional layers, every two convolutional layers together with an identity mapping are packaged into a residual module, the residual modules are stacked to form a deep network, and the last layers use fully convolutional layers instead of fully connected layers; the output target of the output layer is the ionospheric electron content for the next 24 hours.
Preferably, in step three, the TCN network weights are initialized with the following parameters: the number of iterations is 50, the dilation factors d are 1, 2 and 4, the convolution kernel size is 3, the learning rate is 0.001, and the number of hidden-layer neurons is 20.
Preferably, in step four, the training of the training set proceeds as follows: S1, a single batch of input data is split into two branches, and the output of the residual module is formed by adding the input to the result of a series of transformations F applied to it; S2, a one-dimensional convolution operation is applied to the input X to obtain the output of the residual module, which is then passed to the next residual module; S3, steps S1-S2 are repeated, and the output of the TCN network is obtained when the last residual module has been computed.
Preferably, in S1 the input data first passes through a dilated causal convolution, is then processed nonlinearly by the ReLU activation function, and a Dropout layer is added; the output of the residual module is obtained after performing this operation twice.
Preferably, in step seven, the three performance indexes MAPE, MAD and RMSE are respectively: MAPE = (1/n)·Σ_{i=1}^{n} |Y_i − Q_i|/Q_i × 100%, MAD = (1/n)·Σ_{i=1}^{n} |Y_i − Q_i|, and RMSE = √((1/n)·Σ_{i=1}^{n} (Y_i − Q_i)²), where Y_i is the forecast value of the ionospheric TEC for the i-th hour, Q_i is the observed value of the ionospheric TEC for the i-th hour, and n is the length of the time period; the smaller the three performance index values, the closer the predicted value is to the true value, the better the fit, and the smaller the error between the predicted and true values.
Compared with the prior art, the ionosphere total electron content forecasting method based on the TCN model has the following advantages:
(1) Owing to the causal characteristics between the layers of the TCN convolutional network, the spatio-temporal characteristics of the ionospheric electron content can be captured better, which improves the accuracy of forecasting the total electron content of the ionosphere to a certain extent and enhances real-time performance.
(2) The added residual convolution is a skip-layer convolution, and using larger dilation coefficients to change the receptive field size provides more flexibility, so the memory length of the prediction model can be controlled better, further improving the accuracy of forecasting the total electron content of the ionosphere.
(3) Because the back-propagation path of the TCN differs from the time direction of the sequence, the vanishing- and exploding-gradient problems are avoided, making the method superior to the traditional feed-forward network.
(4) Because the convolution kernels are shared, the model runs quickly and with little memory even when processing many years of total electron content data, which accelerates the electron content forecast.
Drawings
For a clearer description of embodiments of the invention or of the prior art, the drawings which are used in the description of the embodiments or of the prior art will be briefly described, it being evident that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is an overall flow chart of the present invention.
Fig. 2 is a diagram of a residual module structure in the TCN model according to the present invention.
Fig. 3 is a diagram illustrating the F (x) calculation process in the TCN network according to the present invention.
Detailed Description
The following is a brief description of embodiments of the present invention with reference to the accompanying drawings. It is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments, and that all other embodiments obtained by a person having ordinary skill in the art without making creative efforts based on the embodiments in the present invention are within the protection scope of the present invention.
Figures 1-3 illustrate a preferred embodiment of the present invention, which is described in detail.
The ionosphere total electron content forecasting method based on the TCN model shown in fig. 1 comprises the following steps:
step one: ionosphere TEC data, solar radiant flux values F10.7, and geomagnetic activity index Ap are acquired, and the TEC is converted into a time series of high time resolution by interpolation techniques. Specifically, TEC data is derived from European orbital center CODE, F10.7 is derived from Godade cosmic center of flight of NASA, and Ap is derived from geomagnetic and space magnetic data analysis centers. And (3) carrying out spherical harmonic function expansion calculation on the downloaded ionized layer data to obtain the vertical ionized layer content, removing abnormal data which do not accord with the normal range of the electronic content, and filling ionized layer TEC data by a spline interpolation method to obtain a TEC sequence with the time resolution of 1 h. F10.7 and Ap are processed into a data format consistent with TEC time resolution.
Step two: time series X of TEC after conversion 0 ={x 1 ,x 2 ,x 3 ,...,x n Dividing into training sets X tr ={x 1 ,x 2 ,...x m Sum test set X te ={x 1 ,x 2 ,...x l And the ratio is 4:1. After the data set is divided, the element x in the training set is divided m Zero-mean normalization processing is carried out to obtain a training set X 'after normalization' tr ={x′ 1 ,x′ 2 ,...x′ t Normalized formula:wherein: mu is the average value of the original TEC time sequence, sigma is the standard error of the original TEC time sequence, wherein t is more than or equal to 1 and less than or equal to m. The training set and test set division ratio of the solar radiation flux value F10.7 and the geomagnetic activity index Ap is also 4:1. The zero-mean normalization process and formula are the same as the zero-mean normalization of TEC data.
Step three: the topology structure of the TCN neural network is determined, the weight of the TCN neural network is initialized, parameters in the model are adjusted, the number of iterations is set to be 50, the expansion factors d are 1,2 and 4, the convolution kernel size is 3, the learning rate size is 0.001, the number of hidden layer neurons is 20 and other parameters. Specifically, the TCN network is designed as a three-layer network structure including an input layer, a hidden layer, and an output layer. Because the electronic content of the ionized layer is closely related to geomagnetic activity and solar activity, the input layer selects geomagnetic index Ap, solar activity index F10.7 and electronic content TEC as input parameters. The hidden layer uses causal convolution and expansion convolution as standard convolution layers, each two convolution layers and identity mapping are packaged into a residual error module, a deep network is stacked by the residual error module, and the full convolution layer is used for replacing the full connection layer in the last layers. The output layer outputs a target for the ionospheric electron content value for the next 24 hours.
Using the causal convolution of the CNN model, the sequence problem can be formulated as predicting y_1, y_2, ..., y_n from x_1, x_2, ..., x_n. By the definition of causal convolution, for a filter F = {f_1, f_2, ..., f_k} and a sequence X_0 = {x_1, x_2, x_3, ..., x_n}, the causal convolution at x_t is (F * X)(x_t) = Σ_{i=1}^{k} f_i·x_{t−k+i}. In the causal convolution part of the forecast model, assume the last two nodes of the input layer are x_{t−1} and x_t and the last node of the first hidden layer is y_t; for a filter F = (f_1, f_2), the formula gives y_t = f_1·x_{t−1} + f_2·x_t.
Dilated convolution inserts holes into the standard convolution to enlarge the receptive field, so that each convolution output contains information over a larger range. At x_t, the dilated convolution with dilation rate d is F(x_t) = Σ_{i=0}^{k−1} f_{k−i}·x_{t−d·i}, where d is the dilation coefficient and k is the convolution kernel size; the index (t − d·i) indicates which neuron of the upper layer is used when computing a neuron of the lower layer. The TCN model adds this dilated-convolution structure and expands the receptive field by skipping some existing samples; the number of skipped samples is (dilation − 1). In the dilated-convolution part of the TEC prediction model, assume the last five nodes of the first hidden layer are x_{t−4}, x_{t−3}, x_{t−2}, x_{t−1}, x_t and the last node of the second hidden layer is y_t; for a filter F = (f_1, f_2, f_3), the formula gives y_t = f_1·x_{t−2d} + f_2·x_{t−d} + f_3·x_t (d = 2).
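The following small numpy example reproduces the dilated causal convolution formula above and checks the worked case F = (f_1, f_2, f_3), d = 2; it is a numerical illustration only, not part of the forecasting pipeline.

```python
# Tiny numpy illustration of the dilated causal convolution:
# y_t = sum_i f_i * x_{t - (k - i)*d}, so with F = (f1, f2, f3) and d = 2
# the output is y_t = f1*x_{t-4} + f2*x_{t-2} + f3*x_t.
import numpy as np

def dilated_causal_conv(x: np.ndarray, f: np.ndarray, d: int) -> np.ndarray:
    """1-D dilated causal convolution with zero-padding on the left."""
    k = len(f)
    pad = (k - 1) * d
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        # taps reach back d, 2d, ... samples; the last tap multiplies the current sample
        y[t] = sum(f[i] * xp[pad + t - (k - 1 - i) * d] for i in range(k))
    return y

x = np.arange(1.0, 11.0)           # x_1 ... x_10
f = np.array([0.5, 1.0, 2.0])      # f1, f2, f3
y = dilated_causal_conv(x, f, d=2)
print(y[9])                        # 0.5*x_6 + 1.0*x_8 + 2.0*x_10 = 3 + 8 + 20 = 31.0
```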
To improve accuracy, the TCN network adds residual convolution through skip-layer connections together with 1×1 convolution operations. Residual convolution solves the training problem of deep networks through this special connection pattern and allows the number of network layers to increase greatly. The output of a residual block combines the input information with the output of the convolution operations applied to the input, so information can be transferred across layers; the residual module in the TCN architecture is shown in fig. 2. In the invention, a residual layer is constructed to replace a single convolutional layer, and the input passes through dilated convolution, weight normalization, an activation function and Dropout, where WeightNorm and Dropout regularize the network.
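A minimal PyTorch sketch of one such residual module is given below, following the sequence dilated causal convolution, weight normalization, ReLU and Dropout (applied twice) with a 1×1 convolution on the skip path when channel counts differ. Kernel size 3 and dilation factors 1, 2, 4 follow the parameters stated above; the dropout rate, the use of 20 channels for the 20 hidden-layer neurons, and the three input channels (TEC, F10.7, Ap) are assumptions for illustration.

```python
# Illustrative TCN residual module (not the patented implementation itself).
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm

class Chomp1d(nn.Module):
    """Trim the right-side padding so the convolution stays causal."""
    def __init__(self, chomp: int):
        super().__init__()
        self.chomp = chomp
    def forward(self, x):
        return x[:, :, :-self.chomp] if self.chomp > 0 else x

class ResidualBlock(nn.Module):
    def __init__(self, c_in: int, c_out: int, k: int = 3, d: int = 1, dropout: float = 0.2):
        super().__init__()
        pad = (k - 1) * d
        self.net = nn.Sequential(
            weight_norm(nn.Conv1d(c_in, c_out, k, padding=pad, dilation=d)),
            Chomp1d(pad), nn.ReLU(), nn.Dropout(dropout),
            weight_norm(nn.Conv1d(c_out, c_out, k, padding=pad, dilation=d)),
            Chomp1d(pad), nn.ReLU(), nn.Dropout(dropout),
        )
        # 1x1 convolution so the skip connection matches the channel count
        self.downsample = nn.Conv1d(c_in, c_out, 1) if c_in != c_out else nn.Identity()
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.net(x) + self.downsample(x))

# Stacking blocks with dilation factors 1, 2, 4 as stated in the text.
tcn = nn.Sequential(
    ResidualBlock(3, 20, d=1),   # 3 input channels: TEC, F10.7, Ap (assumed layout)
    ResidualBlock(20, 20, d=2),
    ResidualBlock(20, 20, d=4),
)
```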
Step four: the training set is input into the ionosphere TEC forecast model in batches, the output error of the effective history length is calculated, and in the process, the model automatically adjusts the weight and the bias parameter according to the output error until the maximum iteration number or the accuracy and the error rate reach the optimal values. Specifically, the training set comprises the following training steps: s1, dividing single batch of input data into two parts, and adding the two parts of input data through a series of changes F to form the output of the residual error module. And the F calculation process is shown in fig. 3, input data is subjected to hole causal convolution firstly, then nonlinear processing is carried out through an activation function Relu, meanwhile, a Dropout layer is added, and the output of a residual error module is finally obtained through the same operation twice. S2, carrying out one-dimensional convolution operation on the input X. 1*1 convolution is used for reducing the dimension, if the feature jump layer of the lower layer is connected to the upper layer, the number of the corresponding feature maps of each Cell is inconsistent, so that the addition operation of jump layer feature maps similar to Resnet cannot be directly performed, in order to make the two layers coincide with the number of the feature maps, 1*1 convolution is used for performing a dimension reduction operation, and finally, the output of the residual error module is obtained, and the output of the residual error module is calculated through the next residual error module; s3, repeating the steps S1-S2, and obtaining an output result of the TCN network when the last residual error module is calculated.
Step five: calculating errors by utilizing squares of differences, accumulating errors obtained by multiple batches of data, comparing the errors with an error threshold value, taking a Loss function infinitely close to 0 and a accuracy function infinitely close to 1 as model optimization targets, ending training if the errors are within a set range, executing a step six, executing a step three, continuously updating parameters such as network weights and the like through reverse error propagation, training again until optimal parameters are found, and determining an optimal model.
Step six: modelAfter training is completed, the trained model is marked as TCN TEC Inputting test set data into TCN TEC In the model, the obtained output result is marked as Y 0 Then Y is taken 0 And performing Z-score inverse standardization to obtain a final TEC prediction result.
Step seven: comparing the predicted result with the observed data, and adopting MAPE, MAD, RMSE three performance indexes to evaluate the effectiveness of the model, wherein the smaller the three performance index values are, the closer the predicted value and the true value are, the better the fitting degree is, and the smaller the error between the predicted value and the true value is. The performance index is defined as follows:
MAPE = (1/n)·Σ_{i=1}^{n} |Y_i − Q_i|/Q_i × 100%
MAD = (1/n)·Σ_{i=1}^{n} |Y_i − Q_i|
RMSE = √((1/n)·Σ_{i=1}^{n} (Y_i − Q_i)²)
where Y_i is the forecast value of the ionospheric TEC for the i-th hour, Q_i is the observed value of the ionospheric TEC for the i-th hour, and n is the length of the time period.
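Using the definitions above, the evaluation of step seven can be computed directly; the sketch below assumes Y and Q are aligned hourly arrays of forecasts and observations.

```python
# Sketch of the three evaluation metrics (Y: hourly TEC forecasts, Q: hourly observations).
import numpy as np

def evaluate(Y: np.ndarray, Q: np.ndarray) -> dict:
    err = Y - Q
    return {
        "MAPE": float(np.mean(np.abs(err) / Q) * 100.0),   # mean absolute percentage error
        "MAD":  float(np.mean(np.abs(err))),               # mean absolute deviation
        "RMSE": float(np.sqrt(np.mean(err ** 2))),         # root-mean-square error
    }
```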
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. The ionosphere total electron content forecasting method based on the TCN model is characterized by comprising the following steps of:
step one: acquiring ionosphere TEC data, solar radiation flux value F10.7 and geomagnetic activity index Ap, and converting the TEC into a high-time resolution sequence;
step two: dividing the converted TEC time sequence, solar radiation flux value F10.7 and geomagnetic activity index Ap into a training set and a testing set, and then carrying out zero-mean normalization processing on data in the training set;
step three: determining the topology structure of the TCN neural network, initializing the weight of the TCN neural network, and adjusting parameters in the model;
step four: inputting the training set into the ionosphere TEC forecast model in batches, calculating the output error of the effective history length, and updating the weight parameter of the TCN forecast model by back propagation until the maximum iteration number or the accuracy and the error rate reach the optimal values;
step five: comparing training results, ending training if the error is within the set range, executing the step six, if the error exceeds the set range, updating TCN network parameters, training again until the optimal parameters are found, and determining an optimal model;
step six: after model training is completed, the trained model is denoted TCN_TEC; the test set data are input into the TCN_TEC model, the obtained output result is denoted Y_0, and Y_0 is then subjected to inverse Z-score normalization to obtain the final TEC prediction result;
step seven: comparing the prediction result with the observation data, and evaluating the effectiveness of the model using the three performance indexes MAPE, MAD and RMSE.
2. The method for forecasting the total electron content of the ionosphere based on a TCN model according to claim 1, wherein in step one the high-resolution TEC sequence is obtained as follows: the downloaded ionospheric data are converted into VTEC data by spherical harmonic expansion, abnormal data outside the normal range of electron content are removed, and the ionospheric TEC data are filled by spline interpolation to obtain a TEC sequence with a time resolution of 1 h; F10.7 and Ap are processed into a data format consistent with the TEC time resolution.
3. The method for forecasting the total electron content of the ionosphere based on a TCN model according to claim 1, wherein in step two the TEC, solar radiation flux F10.7 and geomagnetic activity index Ap are normalized as follows: the TEC time series is denoted X_0 = {x_1, x_2, x_3, ..., x_n} and is divided so that training set X_tr = {x_1, x_2, ..., x_m} : test set X_te = {x_1, x_2, ..., x_l} = 4:1; after the data set is divided, each element x_t of the training set is zero-mean normalized to obtain the normalized training set X'_tr = {x'_1, x'_2, ..., x'_m}, using the formula x'_t = (x_t − μ)/σ, wherein μ is the mean of the original TEC time series, σ is its standard deviation, and 1 ≤ t ≤ m; the training and test sets of the solar radiation flux F10.7 and the geomagnetic activity index Ap are also divided in a 4:1 ratio, and their zero-mean normalization uses the same process and formula as that of the TEC data.
4. The method for forecasting the total electron content of the ionosphere based on a TCN model according to claim 1, wherein in step three the TCN network is designed as a three-layer structure comprising an input layer, a hidden layer and an output layer; the input layer takes the geomagnetic index Ap, the solar activity index F10.7 and the electron content TEC as input parameters; the hidden layer uses causal convolution and dilated convolution as the standard convolutional layers, every two convolutional layers together with an identity mapping are packaged into a residual module, the residual modules are stacked to form a deep network, and the last layers use fully convolutional layers instead of fully connected layers; the output target of the output layer is the ionospheric electron content for the next 24 hours.
5. The method for forecasting the total ionospheric electron content based on a TCN model according to claim 4, wherein in step three the TCN network weights are initialized with the following parameters: the number of iterations is 50, the dilation factors d are 1, 2 and 4, the convolution kernel size is 3, the learning rate is 0.001, and the number of hidden-layer neurons is 20.
6. The method for forecasting the total ionospheric electron content based on a TCN model according to claim 5, wherein in step four the training of the training set proceeds as follows: S1, a single batch of input data is split into two branches, and the output of the residual module is formed by adding the input to the result of a series of transformations F applied to it; S2, a one-dimensional convolution operation is applied to the input X to obtain the output of the residual module, which is then passed to the next residual module; S3, steps S1-S2 are repeated, and the output of the TCN network is obtained when the last residual module has been computed.
7. The method for forecasting the total ionospheric electron content based on a TCN model according to claim 6, wherein in S1 the input data first passes through a dilated causal convolution, is then processed nonlinearly by the ReLU activation function, and a Dropout layer is added; the output of the residual module is obtained after performing this operation twice.
8. The method for forecasting the total ionospheric electron content based on a TCN model according to claim 1, wherein in step seven the three performance indexes MAPE, MAD and RMSE are respectively: MAPE = (1/n)·Σ_{i=1}^{n} |Y_i − Q_i|/Q_i × 100%, MAD = (1/n)·Σ_{i=1}^{n} |Y_i − Q_i|, and RMSE = √((1/n)·Σ_{i=1}^{n} (Y_i − Q_i)²), wherein Y_i is the forecast value of the ionospheric TEC for the i-th hour, Q_i is the observed value of the ionospheric TEC for the i-th hour, and n is the length of the time period; the smaller the three performance index values, the closer the predicted value is to the true value, the better the fit, and the smaller the error between the predicted and true values.
CN202010842098.0A 2020-08-20 2020-08-20 Ionosphere total electron content forecasting method based on TCN model Active CN111967679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010842098.0A CN111967679B (en) 2020-08-20 2020-08-20 Ionosphere total electron content forecasting method based on TCN model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010842098.0A CN111967679B (en) 2020-08-20 2020-08-20 Ionosphere total electron content forecasting method based on TCN model

Publications (2)

Publication Number Publication Date
CN111967679A CN111967679A (en) 2020-11-20
CN111967679B true CN111967679B (en) 2023-12-08

Family

ID=73388691

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010842098.0A Active CN111967679B (en) 2020-08-20 2020-08-20 Ionosphere total electron content forecasting method based on TCN model

Country Status (1)

Country Link
CN (1) CN111967679B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112862159B (en) * 2021-01-13 2023-04-28 中铁第四勘察设计院集团有限公司 Method, device, equipment and storage medium for predicting total electron content of ionized layer
CN112734137B (en) * 2021-01-27 2022-12-16 国网电力科学研究院武汉能效测评有限公司 Short-term building power load prediction method and system based on intelligent electric meter data
CN113379107B (en) * 2021-05-26 2023-12-19 江苏师范大学 Regional ionosphere TEC forecasting method based on LSTM and GCN
CN113625144A (en) * 2021-08-11 2021-11-09 北京信息科技大学 IGBT fault prediction method and system
CN116341356B (en) * 2022-12-19 2024-01-12 湖北珞珈实验室 Ionosphere total electron content prediction system and method with additional constraint station
CN116050600A (en) * 2022-12-28 2023-05-02 中国电子科技集团公司第三十八研究所 CNN-GA-BP-based combined model spare part demand prediction method and system
CN115951430B (en) * 2023-03-15 2023-06-06 数字太空(北京)智能技术研究院有限公司 Geomagnetic index forecasting method based on LSTM
CN116882594B (en) * 2023-09-06 2023-12-01 数字太空(北京)智能技术研究院有限公司 Geomagnetic ap index medium-term forecasting method based on limited recursion and deep learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728411A (en) * 2019-10-18 2020-01-24 河海大学 High-low altitude area combined rainfall prediction method based on convolutional neural network
CN110942194A (en) * 2019-11-27 2020-03-31 徐州上若科技有限公司 Wind power prediction error interval evaluation method based on TCN
CN111540199A (en) * 2020-04-21 2020-08-14 浙江省交通规划设计研究院有限公司 High-speed traffic flow prediction method based on multi-mode fusion and graph attention machine mechanism

Also Published As

Publication number Publication date
CN111967679A (en) 2020-11-20


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant