CN111967679A - Ionospheric total electron content forecasting method based on TCN model

Info

Publication number: CN111967679A
Application number: CN202010842098.0A
Authority: CN
Inventors: 唐丝语; 黄智�; 陈伟
Original assignee: Jiangsu Normal University
Current assignee: Jiangsu Normal University
Priority date: 2020-08-20
Filing date: 2020-08-20
Publication date: 2020-11-20
Anticipated expiration: 2040-08-20
Also published as: CN111967679B

Abstract

The invention discloses an ionospheric total electron content (TEC) forecasting method based on a TCN (temporal convolutional network) model, and relates to the technical field of ionospheric total electron content forecasting. The forecasting method comprises the following steps: first, acquiring data, dividing the data into a training set and a test set, and normalizing the data; second, determining the topology of the TCN neural network, initializing the weights of the TCN neural network, and adjusting the parameters of the model; third, inputting the training set into the ionospheric TEC forecasting model in batches, calculating the output error over the effective history length, and back-propagating the error to update the parameters of the TCN prediction model until the optimal parameters are found, at which point training ends; and finally, inputting the test set, obtaining the TEC prediction result, comparing the prediction result with the observation data, and evaluating the effectiveness of the prediction model. Owing to the convolutional structure of the TCN, the method can better capture the spatiotemporal characteristics of the ionospheric electron content, improve the accuracy of ionospheric total electron content forecasts to a certain extent, and enhance real-time performance.

Description

Ionospheric total electron content forecasting method based on TCN model

Technical Field

The invention relates to the technical field of ionospheric total electron content forecasting, and in particular to an ionospheric total electron content forecasting method based on the TCN model, a type of convolutional neural network.

Background

The ionosphere, an important component of the Earth's atmosphere, extends from about 60 km above the ground up to the magnetopause and contains a large number of free electrons and ions that affect the propagation of radio waves.

Total electron content (TEC) is one of the most important parameters of the ionosphere and determines how strongly the ionosphere affects satellite signals. TEC is therefore of great significance for correcting radio wave propagation and for research on ionospheric theory.

When the satellite signal frequency is known, the ionospheric delay can be calculated once the TEC along the propagation path has been determined. Dual-frequency or multi-frequency users can form a linear combination of their satellite observations that eliminates or attenuates the ionospheric delay as far as possible. However, because dual-frequency and multi-frequency GNSS receivers are much more expensive than single-frequency receivers, many users in practice still rely on single-frequency receivers. These users generally cannot derive the ionospheric delay from their own measurements and usually need an ionospheric TEC model to estimate and correct it.
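As a rough illustration of the relationship between TEC and delay described above, the first-order ionospheric group delay on a signal of frequency f is commonly approximated as 40.3·TEC/f² metres, with TEC in electrons per square metre. This approximation, the function name, and the GPS L1 frequency used below are not taken from the patent text; they are standard values assumed here for a minimal sketch.

```python
TECU = 1.0e16  # electrons per square metre in one TEC unit (TECU)

def ionospheric_delay_m(tec_tecu: float, freq_hz: float) -> float:
    """First-order ionospheric group delay in metres (standard approximation)."""
    return 40.3 * tec_tecu * TECU / freq_hz ** 2

GPS_L1_HZ = 1575.42e6  # assumed example frequency (GPS L1), not from the patent
delay_per_tecu = ionospheric_delay_m(1.0, GPS_L1_HZ)
print(f"{delay_per_tecu:.3f} m of delay per TECU at L1")
```

Under these assumptions, a single-frequency user who is given a forecast TEC value can turn it directly into a range correction, which is the use case the background describes.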

Research on ionospheric TEC prediction models has gradually moved from traditional statistical and econometric approaches, such as the International Reference Ionosphere (IRI), exponential smoothing, the autoregressive moving average (ARMA) model, similarity-based forecasting, and autocorrelation analysis, to neural network models, which are better at learning nonlinear input-output relationships and at handling time-varying problems. In recent years in particular, with the rapid development of computer hardware and artificial intelligence, deep learning networks have replaced the traditional neural networks represented by the BP (back-propagation) network and have been applied to some extent in space weather forecasting. The long short-term memory (LSTM) network and its variant, the gated recurrent unit (GRU), perform well in ionospheric TEC prediction owing to their sequence learning and sequence transduction capabilities. However, LSTM and GRU process a sequence step by step in time and cannot completely avoid vanishing or exploding gradients. They also need a large amount of memory to store the partial results of many units and take a long time to compute, which makes real-time requirements hard to meet in engineering applications. In addition, ionospheric TEC depends not only on time but also closely on spatial position. The spatiotemporal variation characteristics therefore need to be extracted further to obtain more accurate and faster predictions.

Disclosure of Invention

In view of the above, the invention discloses a method for forecasting ionospheric total electron content based on a TCN model, which can provide ionospheric refraction correction parameters for the large number of single-frequency GNSS users and can also provide extensive data support for analysis and research of the ionospheric space environment.

The ionospheric total electron content forecasting method based on the TCN model comprises the following steps:

Step one: acquiring ionospheric TEC data, the solar radiation flux value F10.7, and the geomagnetic activity index Ap, and converting the TEC into a sequence with high time resolution.

Step two: dividing the converted TEC time series, the solar radiation flux value F10.7, and the geomagnetic activity index Ap into a training set and a test set, and then applying zero-mean standardization to the data in the training set.

Step three: determining the topology of the TCN neural network, initializing the weights of the TCN neural network, and adjusting the parameters of the model.

Step four: inputting the training set into the ionospheric TEC forecasting model in batches, calculating the output error over the effective history length, and updating the weight parameters of the TCN forecasting model by back-propagating the error until the maximum number of iterations is reached or the accuracy and error rate reach their optimal values.

Step five: comparing the training results; if the error is within the set range, ending the training and executing step six; if the error exceeds the set range, returning to step three, updating the TCN network parameters, and training again until the optimal parameters are found and the optimal model is determined.

Step six: after training is finished, denoting the trained model as TCN_TEC, inputting the test set data into the TCN_TEC model, denoting the output as Y₀, and applying Z-score denormalization to Y₀ to obtain the final TEC prediction result.

Step seven: comparing the prediction result with the observation data, and evaluating the effectiveness of the model with three performance indexes: the mean absolute percentage error (MAPE), the mean absolute deviation (MAD), and the root mean square error (RMSE).
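The Z-score denormalization in step six can be sketched as the inverse of zero-mean standardization, x = y·σ + μ. The function name and the numeric values of μ and σ below are illustrative assumptions, not values from the patent; in practice μ and σ would be the statistics used when the inputs were standardized.

```python
def zscore_denormalize(y0, mu, sigma):
    """Map standardized model outputs back to physical TEC values."""
    return [y * sigma + mu for y in y0]

mu, sigma = 20.0, 5.0          # hypothetical statistics of the TEC series
y0 = [-1.0, 0.0, 2.0]          # hypothetical standardized model outputs
tec_pred = zscore_denormalize(y0, mu, sigma)
print(tec_pred)
```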

Preferably, in step one, the TEC is converted into a high-resolution sequence as follows: the downloaded ionospheric data are converted into vertical TEC (VTEC) data by spherical harmonic expansion, abnormal values outside the normal range of electron content are removed, and the gaps in the ionospheric TEC data are filled by spline interpolation to obtain a TEC sequence with a time resolution of 1 h; F10.7 and Ap are processed into a format consistent with the time resolution of the TEC.
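A minimal sketch of the cleaning and gap-filling idea follows. It deliberately substitutes linear interpolation for the spline interpolation named in the text, to stay dependency-free. The `valid_range` bounds, the function name `to_hourly`, and the sample values are all assumptions made for illustration.

```python
from bisect import bisect_left

def to_hourly(samples, valid_range=(0.0, 200.0)):
    """samples maps hour -> TEC value (or None); returns one value per hour.

    Values outside valid_range are treated as abnormal and dropped, then
    every missing hour is filled by LINEAR interpolation (a stand-in for
    the spline interpolation described in the text).
    """
    lo, hi = valid_range
    good = sorted((h, v) for h, v in samples.items()
                  if v is not None and lo <= v <= hi)
    hours = [h for h, _ in good]
    values = [v for _, v in good]
    out = []
    for h in range(hours[0], hours[-1] + 1):
        j = bisect_left(hours, h)
        if hours[j] == h:                      # an observed, valid sample
            out.append(values[j])
        else:                                  # interpolate between neighbours
            h0, h1 = hours[j - 1], hours[j]
            w = (h - h0) / (h1 - h0)
            out.append(values[j - 1] + w * (values[j] - values[j - 1]))
    return out

raw = {0: 10.0, 2: 14.0, 4: -999.0, 6: 22.0}   # 2-hourly data with one bad value
hourly = to_hourly(raw)
print(hourly)
```

A cubic spline would give a smoother curve through the same valid points; the filtering and resampling structure stays the same.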

Preferably, in step two, the TEC, the solar radiation flux F10.7, and the geomagnetic activity index Ap are standardized as follows: the TEC time series is denoted X₀ = {x₁, x₂, x₃, ..., x_n} and is divided into a training set X_tr = {x₁, x₂, ..., x_m} and a test set X_te = {x₁, x₂, ..., x_l} in a ratio of 4:1; after the data set is divided, zero-mean standardization is applied to the elements of the training set to obtain the standardized training set X'_tr = {x′₁, x′₂, ..., x′_m}, and the standardization formula is:

$$x'_t = \frac{x_t - \mu}{\sigma}$$

where μ is the mean of the original TEC time series, σ is the standard deviation of the original TEC time series, and 1 ≤ t ≤ m; the solar radiation flux value F10.7 and the geomagnetic activity index Ap are likewise divided into a training set and a test set in a ratio of 4:1, and their zero-mean standardization follows the same procedure and formula as for the TEC data.
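The 4:1 split and the zero-mean standardization can be sketched as below. Two choices here are assumptions rather than statements from the patent: the statistics μ and σ are computed on the training portion only, and the population standard deviation is used. The function names are hypothetical.

```python
import statistics

def split_4_to_1(series):
    """Chronological split: first 4/5 for training, last 1/5 for testing."""
    m = len(series) * 4 // 5
    return series[:m], series[m:]

def zscore_fit(train):
    """Return (mu, sigma) of the training data (population std, an assumption)."""
    return statistics.fmean(train), statistics.pstdev(train)

def zscore(values, mu, sigma):
    """Zero-mean standardization x' = (x - mu) / sigma."""
    return [(x - mu) / sigma for x in values]

tec = [float(v) for v in range(1, 11)]     # toy TEC series, 10 samples
train, test = split_4_to_1(tec)
mu, sigma = zscore_fit(train)
train_std = zscore(train, mu, sigma)
```

The same three functions would be applied unchanged to the F10.7 and Ap series.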

Preferably, in step three, the TCN network is designed as a three-layer structure comprising an input layer, a hidden layer, and an output layer; the input parameters of the input layer are the geomagnetic index Ap, the solar activity index F10.7, and the electron content TEC; the hidden layer uses causal convolutions and dilated convolutions as its standard convolution layers, every two convolution layers together with an identity mapping are packaged into a residual module, the residual modules are stacked to form a deep network, and in the last layers fully convolutional layers replace fully connected layers; the target of the output layer is the ionospheric electron content value 24 hours in the future.

Preferably, in step three, the TCN network is initialized with the following settings: the number of iterations is 50, the dilation factors d are 1, 2, and 4, the convolution kernel size is 3, the learning rate is 0.001, and the number of hidden-layer neurons is 20.
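The settings above can be collected in a small configuration object, which also makes it easy to estimate how many past hours one output can see. The receptive-field formula below assumes one residual module per dilation factor with two convolutions each, as in common TCN implementations; that layout, the field names, and `receptive_field` are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TCNConfig:
    iterations: int = 50            # number of iterations from the text
    dilations: tuple = (1, 2, 4)    # dilation factors d
    kernel_size: int = 3            # convolution kernel size
    learning_rate: float = 0.001
    hidden_channels: int = 20       # "hidden layer neurons" in the text
    convs_per_block: int = 2        # assumption: two convolutions per residual module

def receptive_field(cfg: TCNConfig) -> int:
    """Number of input time steps that influence one output step."""
    return 1 + cfg.convs_per_block * (cfg.kernel_size - 1) * sum(cfg.dilations)

cfg = TCNConfig()
history_hours = receptive_field(cfg)
print(history_hours)
```

Under these assumptions the network looks back 29 hourly samples, which is one way to read the "effective history length" mentioned in step four.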

Preferably, in step four, training on the training set proceeds as follows: S1, a single batch of input data is routed along two paths, and the output of a residual module is formed by adding the input to the result of a series of transformations F applied to that input; S2, a one-dimensional convolution is applied to the input X to finally obtain the output of the residual module, which is then passed to the next residual module; and S3, steps S1 and S2 are repeated, and the output of the TCN is obtained once the last residual module has been computed.

Preferably, in S1, the input data pass through a dilated causal convolution and then through the nonlinear ReLU activation function, followed by a Dropout layer; this sequence of operations is performed twice to obtain the output of the residual module.

Preferably, in step seven, the three performance indexes MAPE, MAD, and RMSE are respectively defined as:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{Y_i - Q_i}{Q_i}\right| \times 100\%$$

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - Q_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - Q_i\right)^2}$$

where Y_i is the predicted ionospheric TEC value for hour i, Q_i is the observed ionospheric TEC value for hour i, and n is the length of the time period; the smaller the three index values, the closer the predicted values are to the true values, the better the fit, and the smaller the error between them.
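The three indexes can be computed as sketched below, using the usual textbook definitions of MAPE, MAD, and RMSE with Y_i as the prediction and Q_i as the observation. The function names and sample numbers are illustrative assumptions.

```python
import math

def mape(pred, obs):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((y - q) / q) for y, q in zip(pred, obs)) / len(obs)

def mad(pred, obs):
    """Mean absolute deviation between prediction and observation."""
    return sum(abs(y - q) for y, q in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    """Root mean square error."""
    return math.sqrt(sum((y - q) ** 2 for y, q in zip(pred, obs)) / len(obs))

pred = [10.0, 20.0, 30.0]   # hypothetical predicted TEC values Y_i
obs = [10.0, 25.0, 30.0]    # hypothetical observed TEC values Q_i
print(mape(pred, obs), mad(pred, obs), rmse(pred, obs))
```

MAPE is undefined when an observation is zero, so a real evaluation would need to guard against that case.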

Compared with the prior art, the ionosphere total electron content forecasting method based on the TCN model has the advantages that:

(1) Owing to the causal relationship between the layers of the TCN convolutional network, the spatiotemporal characteristics of the ionospheric electron content can be captured better, the accuracy of ionospheric total electron content prediction is improved to a certain extent, and real-time performance is enhanced.

(2) The skip-connected residual convolutions use larger dilation factors to change the size of the receptive field, which provides more flexibility, gives better control over the memory length of the prediction model, and further improves the prediction accuracy of the ionospheric total electron content.

(3) The back-propagation path of the TCN differs from the temporal direction of the sequence, which avoids the problem of vanishing or exploding gradients in the network and makes the method superior to the traditional feedforward network.

(4) Because the convolution kernels are shared, computation is fast and memory use is low even when processing many years of total electron content data, which speeds up electron content forecasting.

Drawings

To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is an overall flow chart of the present invention.

FIG. 2 is a diagram of the residual module in the TCN model of the present invention.

FIG. 3 shows the calculation process of F(x) in the TCN network of the present invention.

Detailed Description

The embodiments of the present invention are described below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person skilled in the art on the basis of these embodiments without inventive effort fall within the protection scope of the present invention.

Figs. 1-3 illustrate a preferred embodiment of the present invention, which is described in detail below.

Fig. 1 shows a method for forecasting total electron content in an ionosphere based on a TCN model, which includes the following steps:

Step one: acquiring ionospheric TEC data, the solar radiation flux value F10.7, and the geomagnetic activity index Ap, and converting the TEC into a time series with high time resolution by interpolation. Specifically, the TEC data come from the Center for Orbit Determination in Europe (CODE), F10.7 comes from NASA's Goddard Space Flight Center, and Ap comes from the Data Analysis Center for Geomagnetism and Space Magnetism. The downloaded ionospheric data are expanded with spherical harmonic functions to obtain the vertical ionospheric electron content, abnormal values outside the normal range of electron content are removed, and the gaps in the ionospheric TEC data are filled by spline interpolation to obtain a TEC sequence with a time resolution of 1 h. F10.7 and Ap are processed into a format consistent with the time resolution of the TEC.

Step two: dividing the converted TEC time series X₀ = {x₁, x₂, x₃, ..., x_n} into a training set X_tr = {x₁, x₂, ..., x_m} and a test set X_te = {x₁, x₂, ..., x_l} in a ratio of 4:1. After the data set is divided, zero-mean standardization is applied to the elements of the training set to obtain the standardized training set X'_tr = {x′₁, x′₂, ..., x′_m}. The standardization formula is:

$$x'_t = \frac{x_t - \mu}{\sigma}$$

where μ is the mean of the original TEC time series, σ is the standard deviation of the original TEC time series, and 1 ≤ t ≤ m. The solar radiation flux value F10.7 and the geomagnetic activity index Ap are likewise divided into a training set and a test set in a ratio of 4:1, and their zero-mean standardization uses the same formula as for the TEC data.

Step three: determining the topology of the TCN neural network, initializing the weights of the TCN network, and adjusting the parameters of the model, with the following settings: the number of iterations is 50, the dilation factors d are 1, 2, and 4, the convolution kernel size is 3, the learning rate is 0.001, and the number of hidden-layer neurons is 20. Specifically, the TCN network is designed as a three-layer structure comprising an input layer, a hidden layer, and an output layer. Because the ionospheric electron content is closely related to geomagnetic activity and solar activity, the input layer takes the geomagnetic index Ap, the solar activity index F10.7, and the electron content TEC as input parameters. The hidden layer uses causal convolutions and dilated convolutions as its standard convolution layers; every two convolution layers together with an identity mapping are packaged into a residual module, the residual modules are stacked to form a deep network, and in the last layers fully convolutional layers replace fully connected layers. The target of the output layer is the ionospheric electron content value 24 hours in the future.
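One way to turn the three hourly series into training samples for such a network is a sliding window, sketched below. The text does not spell out the sample layout, so this is an assumption: each sample pairs `history` past hours of TEC, F10.7, and Ap with the next 24 hourly TEC values as the target. The function name and window lengths are hypothetical.

```python
def make_windows(tec, f107, ap, history, horizon=24):
    """Build (inputs, target) pairs from aligned hourly series.

    inputs: three channels (TEC, F10.7, Ap), each of length `history`
    target: the next `horizon` hourly TEC values (assumed layout)
    """
    samples = []
    for start in range(len(tec) - history - horizon + 1):
        end = start + history
        inputs = [tec[start:end], f107[start:end], ap[start:end]]
        target = tec[end:end + horizon]
        samples.append((inputs, target))
    return samples

n = 60
tec = [float(i) for i in range(n)]   # toy series; real data would be standardized
f107 = [100.0] * n
ap = [5.0] * n
samples = make_windows(tec, f107, ap, history=29)
print(len(samples))
```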

Using a causal convolutional (CNN) model, the sequence problem can be restated as predicting y₁, y₂, ..., y_n from x₁, x₂, ..., x_n. Causal convolution is defined as follows: for a filter F = (f₁, f₂, ..., f_k) and a sequence X₀ = {x₁, x₂, x₃, ..., x_n}, the causal convolution at x_t is:

$$(F * X)(x_t) = \sum_{i=1}^{k} f_i \, x_{t-k+i}$$

In the causal convolution of the forecasting model, suppose the last two nodes of the input layer are x_{t-1} and x_t, the last node of the first hidden layer is y_t, and the filter is F = (f₁, f₂); according to the formula, y_t = f₁x_{t-1} + f₂x_t.
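The two-tap example above, y_t = f₁x_{t-1} + f₂x_t, can be checked with a small causal convolution in plain Python. Zero-padding on the left, so that the first outputs are defined, is an assumption of this sketch, as is the function name.

```python
def causal_conv(x, f):
    """Causal 1-D convolution: output t depends only on x[t-k+1..t].

    f is ordered from the oldest tap to the newest, matching F = (f1, ..., fk).
    The sequence is left-padded with zeros (an assumption of this sketch).
    """
    k = len(f)
    padded = [0.0] * (k - 1) + list(x)
    return [sum(f[i] * padded[t + i] for i in range(k)) for t in range(len(x))]

x = [1.0, 2.0, 3.0]
f = [0.5, 2.0]                 # f1 = 0.5 weights x_{t-1}, f2 = 2.0 weights x_t
y = causal_conv(x, f)
print(y)
```

The last output is 0.5·2 + 2·3 = 7, and no output uses a future input.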

Dilated (hole) convolution inserts holes into the standard convolution to enlarge the receptive field, so that each convolution output covers a larger range of information. The dilated convolution at x_t with dilation rate d is:

$$(F *_d X)(x_t) = \sum_{i=1}^{k} f_i \, x_{t-(k-i)\cdot d}$$

where d is the dilation factor, k is the convolution kernel size, and the index t − (k − i)·d indicates which neuron of the preceding layer is used when computing a neuron of the next layer. The TCN model adopts this dilated convolution structure and expands the receptive field by skipping some existing pixels; the number of skipped pixels is (dilation rate − 1), that is, d − 1. In the dilated convolution of the TEC forecasting model, suppose the last five nodes of the first hidden layer are x_{t-4}, x_{t-3}, x_{t-2}, x_{t-1}, and x_t, the last node of the second hidden layer is y_t, and the filter is F = (f₁, f₂, f₃); according to the formula, y_t = f₁x_{t-2d} + f₂x_{t-d} + f₃x_t with d = 2.
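The three-tap example with d = 2 can be reproduced with a dilated causal convolution in plain Python. As before, left zero-padding and the function name are assumptions of the sketch; the filter is ordered from the oldest tap to the newest.

```python
def dilated_causal_conv(x, f, d):
    """Dilated causal 1-D convolution with dilation factor d.

    Output t is sum_i f[i] * x[t - (k-1-i)*d] (0-based i), so with k = 3 and
    d = 2 it combines x[t-4], x[t-2] and x[t], skipping d - 1 samples
    between taps. Left zero-padding is an assumption of this sketch.
    """
    k = len(f)
    pad = (k - 1) * d
    padded = [0.0] * pad + list(x)
    return [sum(f[i] * padded[t + i * d] for i in range(k)) for t in range(len(x))]

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # plays the role of x_{t-4} ... x_t
f = [1.0, 10.0, 100.0]          # (f1, f2, f3)
y = dilated_causal_conv(x, f, d=2)
print(y[-1])
```

The last output is 1·1 + 10·3 + 100·5 = 531, which uses x_{t-4}, x_{t-2}, and x_t exactly as in the example. Setting d = 1 recovers the ordinary causal convolution.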

The TCN network adds residual skip connections and a 1 × 1 convolution to improve accuracy. Residual convolution solves the training problem of deep networks through a special connection pattern, allowing the number of network layers to be increased greatly. The output of a residual block combines the block's input with the output of the convolution operations applied to that input, so that information can be passed across layers; the residual module in the TCN architecture is shown in FIG. 2. In the invention, a residual layer is constructed to replace a single convolution layer, and the input passes through dilated convolution, weight normalization, an activation function, and Dropout, where WeightNorm and Dropout regularize the network.

Step four: inputting the training set into the ionospheric TEC forecasting model in batches and calculating the output error over the effective history length; during this process the model automatically adjusts its weights and bias parameters according to the output error until the maximum number of iterations is reached or the accuracy and error rate reach their optimal values. Specifically, training on the training set proceeds as follows. S1: a single batch of input data is routed along two paths, and the output of a residual module is formed by adding the input to the result of a series of transformations F applied to that input. The calculation of F is shown in FIG. 3: the input data pass through a dilated causal convolution and then through the nonlinear ReLU activation function, followed by a Dropout layer, and this sequence of operations is performed twice. S2: a one-dimensional convolution is applied to the input X. The 1 × 1 convolution is used for dimension reduction: when lower-layer features are connected to an upper layer through a skip connection, the numbers of feature maps of the two layers may differ, so the ResNet-style addition of skip-layer feature maps cannot be performed directly; the 1 × 1 convolution makes the numbers of feature maps of the two layers consistent. The output of the residual module is thus obtained and is passed to the next residual module. S3: steps S1 and S2 are repeated, and the output of the TCN is obtained once the last residual module has been computed.
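The forward pass through stacked residual modules (S1 to S3) can be sketched for a single channel in plain Python. Several simplifications are assumptions of this sketch: weight normalization is omitted, the 1 × 1 convolution on the skip path reduces to a scalar multiply because there is only one channel, Dropout is inactive unless `training=True`, and all function names are hypothetical.

```python
import random

def dilated_causal_conv(x, f, d):
    """Dilated causal convolution with left zero-padding (oldest tap first)."""
    k = len(f)
    padded = [0.0] * ((k - 1) * d) + list(x)
    return [sum(f[i] * padded[t + i * d] for i in range(k)) for t in range(len(x))]

def relu(x):
    return [max(0.0, v) for v in x]

def dropout(x, p, training, rng):
    """Inverted dropout; acts as the identity when not training."""
    if not training or p == 0.0:
        return list(x)
    return [0.0 if rng.random() < p else v / (1.0 - p) for v in x]

def residual_block(x, f1, f2, d, skip_w=1.0, p=0.2, training=False, rng=None):
    """S1: F(x) = two rounds of (dilated causal conv -> ReLU -> Dropout).
    S2: skip path = 1x1 convolution of x (a scalar multiply for one channel).
    Output = F(x) + skip(x)."""
    rng = rng or random.Random(0)
    h = x
    for f in (f1, f2):
        h = dropout(relu(dilated_causal_conv(h, f, d)), p, training, rng)
    skip = [skip_w * v for v in x]
    return [a + b for a, b in zip(h, skip)]

def tcn_forward(x, blocks):
    """S3: feed each residual module's output into the next one."""
    for f1, f2, d in blocks:
        x = residual_block(x, f1, f2, d)
    return x

identity = [0.0, 0.0, 1.0]      # kernel size 3 filter that passes x_t through
blocks = [(identity, identity, d) for d in (1, 2, 4)]
out = tcn_forward([1.0, 2.0, 3.0, 4.0], blocks)
print(out)
```

With identity filters each module returns x + x, so three stacked modules scale the input by 8, which makes the data flow easy to verify by hand.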

Step five: calculating the error as the squared difference, accumulating the errors obtained over multiple batches of data, and comparing the accumulated error with an error threshold, with the optimization target that the loss function approaches 0 and the accuracy function approaches 1. If the error is within the set range, training ends and step six is executed; if the error exceeds the set range, step three is executed again, the network weights and other parameters continue to be updated by error back-propagation, and training is repeated until the optimal parameters are found and the optimal model is determined.
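The stopping logic of steps four and five (accumulate the squared error over batches, stop when it falls within a threshold or when the maximum number of iterations is reached) can be illustrated with a toy model. The one-parameter linear model y = w·x below stands in for the TCN purely to keep the sketch short. The threshold, the learning rate used in the example call, and all names are assumptions.

```python
def train(batches, lr=0.001, max_iterations=50, threshold=1e-3):
    """Gradient descent on a toy model y = w * x with squared-error loss.

    Returns (w, iterations_used, accumulated_error_of_last_iteration).
    """
    w = 0.0
    total = float("inf")
    for it in range(1, max_iterations + 1):
        total = 0.0
        for xs, ys in batches:
            grad = 0.0
            for x, y in zip(xs, ys):
                err = w * x - y
                total += err ** 2          # accumulate squared error over batches
                grad += 2.0 * err * x
            w -= lr * grad / len(xs)       # update after each batch
        if total <= threshold:             # error within the set range: stop
            return w, it, total
    return w, max_iterations, total        # iteration budget exhausted

# Toy data generated from y = 2x, split into two batches.
batches = [([1.0, 2.0], [2.0, 4.0]), ([3.0, 4.0], [6.0, 8.0])]
w, used, total = train(batches, lr=0.02)
print(w, used, total)
```

If the budget is exhausted while the error is still above the threshold, the text's procedure is to return to step three and adjust the network settings before training again.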

Step seven: comparing the prediction result with the observation data, and evaluating the effectiveness of the model with three performance indexes, MAPE, MAD, and RMSE; the smaller the three index values, the closer the predicted values are to the true values, the better the fit, and the smaller the error between them. The performance indexes are defined as:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{Y_i - Q_i}{Q_i}\right| \times 100\%$$

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - Q_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - Q_i\right)^2}$$

where Y_i is the predicted ionospheric TEC value for hour i, Q_i is the observed ionospheric TEC value for hour i, and n is the length of the time period.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An ionospheric total electron content forecasting method based on a TCN model is characterized by comprising the following steps:

step one: acquiring ionospheric TEC data, the solar radiation flux value F10.7, and the geomagnetic activity index Ap, and converting the TEC into a sequence with high time resolution;

step two: dividing the converted TEC time sequence, the solar radiation flux value F10.7 and the geomagnetic activity index Ap into a training set and a testing set, and then carrying out zero-mean standardization processing on data in the training set;

step three: determining a topological structure of the TCN neural network, initializing the weight of the TCN neural network, and adjusting parameters in the model;

step five: comparing training results, if the error is within a set range, ending the training, executing the step six, if the error exceeds the set range, executing the step three, updating TCN network parameters, performing the training again until the optimal parameters are found, and determining an optimal model;

step six: after training is finished, denoting the trained model as TCN_TEC, inputting the test set data into the TCN_TEC model, denoting the output as Y₀, and applying Z-score denormalization to Y₀ to obtain the final TEC prediction result;

2. The method for forecasting ionospheric total electron content based on the TCN model according to claim 1, wherein in step one the TEC is converted into a high-resolution sequence as follows: the downloaded ionospheric data are converted into VTEC data by spherical harmonic expansion, abnormal values outside the normal range of electron content are removed, and the gaps in the ionospheric TEC data are filled by spline interpolation to obtain a TEC sequence with a time resolution of 1 h; F10.7 and Ap are processed into a format consistent with the time resolution of the TEC.

3. The method for forecasting ionospheric total electron content based on the TCN model according to claim 1, wherein in step two the TEC, the solar radiation flux F10.7, and the geomagnetic activity index Ap are standardized as follows: the TEC time series is denoted X₀ = {x₁, x₂, x₃, ..., x_n} and is divided into a training set X_tr = {x₁, x₂, ..., x_m} and a test set X_te = {x₁, x₂, ..., x_l} in a ratio of 4:1; after the data set is divided, zero-mean standardization is applied to the elements of the training set to obtain the standardized training set X'_tr = {x′₁, x′₂, ..., x′_m}, and the standardization formula is:

$$x'_t = \frac{x_t - \mu}{\sigma}$$

where μ is the mean of the original TEC time series, σ is the standard deviation of the original TEC time series, and 1 ≤ t ≤ m; the solar radiation flux value F10.7 and the geomagnetic activity index Ap are likewise divided into a training set and a test set in a ratio of 4:1, and their zero-mean standardization follows the same procedure and formula as for the TEC data.

4. The method for forecasting ionospheric total electron content based on the TCN model according to claim 1, wherein in step three the TCN network is designed as a three-layer structure comprising an input layer, a hidden layer, and an output layer; the input parameters of the input layer are the geomagnetic index Ap, the solar activity index F10.7, and the electron content TEC; the hidden layer uses causal convolutions and dilated convolutions as its standard convolution layers, every two convolution layers together with an identity mapping are packaged into a residual module, the residual modules are stacked to form a deep network, and in the last layers fully convolutional layers replace fully connected layers; the target of the output layer is the ionospheric electron content value 24 hours in the future.

5. The method for forecasting ionospheric total electron content based on the TCN model according to claim 4, wherein in step three the TCN network is initialized with the following settings: the number of iterations is 50, the dilation factors d are 1, 2, and 4, the convolution kernel size is 3, the learning rate is 0.001, and the number of hidden-layer neurons is 20.

6. The method for forecasting ionospheric total electron content based on the TCN model according to claim 5, wherein in step four training on the training set proceeds as follows: S1, a single batch of input data is routed along two paths, and the output of a residual module is formed by adding the input to the result of a series of transformations F applied to that input; S2, a one-dimensional convolution is applied to the input X to finally obtain the output of the residual module, which is then passed to the next residual module; and S3, steps S1 and S2 are repeated, and the output of the TCN is obtained once the last residual module has been computed.

7. The method for forecasting ionospheric total electron content based on the TCN model according to claim 6, wherein in S1 the input data pass through a dilated causal convolution and then through the nonlinear ReLU activation function, followed by a Dropout layer, and this sequence of operations is performed twice to obtain the output of the residual module.

8. The method for forecasting ionospheric total electron content based on the TCN model according to claim 1, wherein in step seven the three performance indexes MAPE, MAD, and RMSE are respectively:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{Y_i - Q_i}{Q_i}\right| \times 100\%$$

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - Q_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - Q_i\right)^2}$$

where Y_i is the predicted ionospheric TEC value for hour i, Q_i is the observed ionospheric TEC value for hour i, and n is the length of the time period; the smaller the three index values, the closer the predicted values are to the true values, the better the fit, and the smaller the error between them.