WO2019214143A1

WO2019214143A1 - Server, financial time sequence data processing method and storage medium

Info

Publication number: WO2019214143A1
Application number: PCT/CN2018/107678
Authority: WO
Inventors: 李正洋; 李海疆
Original assignee: 平安科技（深圳）有限公司
Priority date: 2018-05-10
Filing date: 2018-09-26
Publication date: 2019-11-14
Also published as: JP6812573B2; JP2020522774A; CN108615096A

Abstract

The present application relates to a server, a financial time series data processing method and a storage medium. The method comprises: configuring sliding windows having different predetermined time steps, sliding the windows over financial time series data that contains no missing values to obtain a plurality of window data, and sampling each window data to obtain sample data; using each set of sample data to train a predetermined recurrent neural network model, and taking each trained model as a prediction model; obtaining financial time series data containing a missing value, obtaining the position of the missing value in the financial time series data and the number of missing data points, intercepting the financial time series data preceding the position of the missing value according to that position and number, and using the intercepted data as the data to be input; and inputting the data to be input into each prediction model, and taking the average of the predicted values output by the prediction models as the fill-in value for the missing value. The present application can predict missing values accurately and objectively.

Description

Server, financial time series data processing method and storage medium

Priority claim

The present application claims priority to the Chinese patent application entitled "Server, Method for Processing Financial Time Series Data, and Storage Medium", filed on May 10, 2018 with application number CN2018104414146, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of data processing technologies, and in particular, to a server, a method for processing financial time series data, and a storage medium.

Background technique

Financial time series data has the statistical characteristics of a time series and falls into many categories. For example, price-type financial time series data includes the opening price, closing price, highest price, lowest price, and trading volume of stocks, futures, foreign exchange, and the like. Financial time series data of derived indicators includes the spread between the ChinaBond government bond yield to maturity and the ChinaBond corporate bond yield to maturity, the risk premium, the dividend rate, the CR indicator, the large-cap to small-cap ratio, the RSRS indicator, the CSI 300 premium/discount rate, and the CSI 300 active buying amount. In practice, financial time series data may be missing for various reasons, such as: 1. the suspension of trading in a listed company's stock leaves the opening price, closing price, highest price, lowest price, trading volume, and similar information missing for that day; 2. the corresponding financial time series data cannot be obtained from public platforms; 3. the financial time series data obtained from public platforms deviates significantly from the actual values; and so on.

Traditional methods for handling missing values include manual filling, special-value filling, mean filling, nearest-value filling, cluster filling, and so on. However, because financial time series data depends on time, these simple methods yield inaccurate fill values and cannot closely reproduce the distribution of the real financial time series data, which easily causes information loss and affects subsequent research on the data.

Summary of the invention

The purpose of the present application is to provide a server, a method for processing financial time series data, and a storage medium, which aim to predict missing values accurately and objectively.

To achieve the above object, the present application provides a server including a memory and a processor coupled to the memory, the memory storing a processing system operable on the processor, wherein the processing system, when executed by the processor, implements the following steps:

Setting sliding windows with different predetermined time steps, sliding the set windows over financial time series data without missing values to obtain multiple window data, and sampling each window data to obtain sample data corresponding to each predetermined time step;

Training a predetermined recurrent neural network model separately with the sample data corresponding to each predetermined time step, and taking the trained model corresponding to each predetermined time step as a prediction model;

Obtaining financial time series data with a missing value, obtaining the position of the missing value in the financial time series data and the number of missing data points, intercepting the financial time series data preceding the position of the missing value according to that position and number, and using the intercepted data as the data to be input;

Inputting the data to be input into each prediction model, obtaining the predicted value output by each prediction model, and taking the average of the predicted values as the filling value of the missing value.

To achieve the above object, the present application further provides a method for processing financial time series data, the method including:

S1: setting sliding windows with different predetermined time steps, sliding the set windows over financial time series data without missing values to obtain a plurality of window data, and sampling each window data to obtain sample data corresponding to each predetermined time step;

S2: training a predetermined recurrent neural network model separately with the sample data corresponding to each predetermined time step, and taking the trained model corresponding to each predetermined time step as a prediction model;

S3: obtaining financial time series data with a missing value, obtaining the position of the missing value in the financial time series data and the number of missing data points, intercepting the financial time series data preceding the position of the missing value according to that position and number, and using the intercepted data as the data to be input;

S4: inputting the data to be input into each prediction model, obtaining the predicted value output by each prediction model, and taking the average of the predicted values as the filling value of the missing value.

The present application further provides a computer readable storage medium having a processing system stored thereon, wherein the processing system, when executed by a processor, implements the steps of the above method for processing financial time series data.

The beneficial effects of the present application are as follows: the present application uses a recurrent neural network model to process and predict missing values in financial time series data, and can capture the dependencies between earlier and later values of the series. Because the filling value of a missing value is given by the average of multiple models, it is more objective and accurate, and the overall distribution of the real financial time series data can be restored as closely as possible.

DRAWINGS

FIG. 1 is a schematic diagram of the hardware architecture of an embodiment of a server according to the present application;

FIG. 2 is a schematic structural view of an LSTM model;

FIG. 3 is a schematic structural view of the LSTM model shown in FIG. 2 after modification;

FIG. 4 is a schematic flowchart of an embodiment of a method for processing financial time series data according to the present application.

Detailed description

In order to make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present application without creative effort fall within the scope of protection of the present application.

It should be noted that descriptions such as "first" and "second" in the present application are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, but only on the basis that the combination can be realized by those of ordinary skill in the art; when a combination of technical solutions is contradictory or cannot be implemented, that combination should be considered not to exist and does not fall within the scope of protection claimed by the present application.

FIG. 1 is a schematic diagram of the hardware architecture of an embodiment of a server according to the present application. The server 1 is a device capable of automatically performing numerical calculation and/or information processing in accordance with instructions that are set or stored in advance. The server 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a type of distributed computing: a super virtual computer consisting of a group of loosely coupled computers.

In the present embodiment, the server 1 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13 communicably connected to each other through a system bus, and the memory 11 stores a processing system operable on the processor 12. It should be noted that FIG. 1 only shows the server 1 with the components 11-13, but it should be understood that not all illustrated components are required, and more or fewer components may be implemented instead.

The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the server 1. The readable storage medium may be, for example, a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), or a non-volatile storage medium such as a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, or the like. In some embodiments, the readable storage medium may be an internal storage unit of the server 1, such as a hard disk of the server 1; in other embodiments, the readable storage medium may also be an external storage device of the server 1, for example, a plug-in hard disk provided on the server 1, a smart media card (SMC), a Secure Digital (SD) card, a flash card, and the like. In this embodiment, the readable storage medium of the memory 11 is generally used to store the operating system installed on the server 1 and various types of application software, such as the program code of the processing system in an embodiment of the present application. Further, the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.

The processor 12 may, in some embodiments, be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. The processor 12 is typically used to control the overall operation of the server 1, such as performing control and processing related to data interaction or communication with other devices. In this embodiment, the processor 12 is configured to run the program code or process the data stored in the memory 11, for example, to run the processing system.

The network interface 13 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the server 1 and other electronic devices. In this embodiment, the network interface 13 is mainly used to connect the server 1 with one or more terminal devices 2, and establish a data transmission channel and a communication connection between the server 1 and one or more terminal devices 2.

The processing system is stored in the memory 11 and includes at least one computer readable instruction stored in the memory 11, the at least one computer readable instruction being executable by the processor 12 to implement the methods of various embodiments of the present application; The at least one computer readable instruction can be classified into different logic modules depending on the functions implemented by its various parts.

In an embodiment, when the processing system is executed by the processor 12, the following steps are implemented:

The predetermined time steps include 6 time units, 11 time units, and 16 time units. A time unit is the granularity unit of the financial time series data: for example, for financial time series data with daily granularity the time unit is one day, for high-frequency financial time series data with minute granularity the time unit is one minute, and so on.

For a sliding window of 6 time units, each window contains 6 data points, and the sample data obtained by sampling also has 6 data points. For a sliding window of 11 time units, each window contains 11 data points and the sampled data has 6 data points; for example, the sample data is (x1, x3, x5, x7, x9, x11), i.e., the 1st, 3rd, 5th, 7th, 9th, and 11th data points of the window. For a sliding window of 16 time units, each window contains 16 data points and the sampled data has 6 data points; for example, the sample data is (x1, x4, x7, x10, x13, x16), i.e., the 1st, 4th, 7th, 10th, 13th, and 16th data points of the window.
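The window-and-sample scheme described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `sample_windows`, the one-unit slide between consecutive windows, and the toy series are assumptions made here.

```python
from typing import Dict, List, Sequence


def sample_windows(series: Sequence[float], window: int,
                   sample_len: int = 6) -> List[List[float]]:
    """Slide a window of `window` time units over `series` and keep
    `sample_len` evenly spaced points from each window."""
    if (window - 1) % (sample_len - 1) != 0:
        raise ValueError("window - 1 must be a multiple of sample_len - 1")
    # Spacing between kept points: 1, 2, 3 for windows of 6, 11, 16 units.
    stride = (window - 1) // (sample_len - 1)
    samples = []
    # Assumption: the window advances one time unit at a time.
    for start in range(len(series) - window + 1):
        win = series[start:start + window]
        samples.append(list(win[::stride]))
    return samples


# Toy series x1..x20 standing in for real prices (hypothetical data).
series = list(range(1, 21))
samples_by_step: Dict[int, List[List[float]]] = {
    w: sample_windows(series, w) for w in (6, 11, 16)
}
```

With this toy series, the first sample for the 11-unit window is (x1, x3, x5, x7, x9, x11) and the first sample for the 16-unit window is (x1, x4, x7, x10, x13, x16), matching the examples in the text, while every sample keeps a length of 6.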

The purpose of setting sliding windows with different predetermined time steps is to extend the time span and the relationships captured by the samples without changing the length of the sample data. The financial time series data without missing values is sampled to obtain sample data, and training on this sample data yields a model with higher accuracy.

The predetermined recurrent neural network model is a hybrid model of two or more recurrent neural networks, preferably a hybrid model composed of a Long Short-Term Memory (LSTM) model and a Gated Recurrent Unit (GRU) model. Both the LSTM model and the GRU model can be used to capture the dependencies between earlier and later values of a time series.

In an embodiment, the training step includes: dividing the sample data corresponding to each predetermined time step into a training set of a first ratio and a test set of a second ratio, and training the predetermined recurrent neural network model with the training set corresponding to each predetermined time step, wherein the sum of the first ratio and the second ratio is less than or equal to 1; extracting a predetermined number of sample data from the training set corresponding to each predetermined time step as a validation set, using the validation set to test the parameters of the recurrent neural network model during training, and terminating the training when the test error is greater than or equal to a predetermined error threshold to obtain the trained recurrent neural network model; testing the accuracy of the trained recurrent neural network model with the test set; if the accuracy is greater than or equal to a predetermined accuracy threshold, using the trained recurrent neural network model as a prediction model; and if the accuracy is less than the predetermined accuracy threshold, modifying the hidden layer structure of the recurrent neural network model and retraining to obtain a prediction model whose accuracy is greater than or equal to the predetermined accuracy threshold.

Since the sample data corresponding to each predetermined time step can be regarded as independent and identically distributed, the training set and the test set are obtained by random sampling, with the training set accounting for 70% and the test set for 30%; for example, the training set includes 70,000 sample data and the test set includes 30,000 sample data.
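A random 70%/30% division of this kind might look like the sketch below. The helper name `split_samples` and the fixed seed are assumptions introduced here for reproducibility; the text does not prescribe a particular sampling routine.

```python
import random
from typing import List, Sequence, Tuple


def split_samples(samples: Sequence, train_ratio: float = 0.7,
                  seed: int = 0) -> Tuple[List, List]:
    """Shuffle the samples and split them into a training set and a test set."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible (an assumption, not
    # something the text requires).
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# 100 hypothetical samples, identified here only by their index.
train_set, test_set = split_samples(list(range(100)))
```

Shuffling before cutting is reasonable only because the samples are treated as independent and identically distributed, as the text states; each sample still preserves the time order of its own data points.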

Preferably, training on the training set is performed by cross-validation: the sample data in the training set is divided into 10 parts, and each time 9 parts are used for training and the remaining part is used as the validation set, with which the parameters of the recurrent neural network model are tested during training. Training is performed on the training data and the test result is obtained on the validation set. As the number of training iterations increases, if the test error on the validation set rises, that is, the test error is greater than or equal to the predetermined error threshold, training is stopped, and the resulting trained recurrent neural network model is used as the model to be tested on the test set as described below. This effectively avoids over-fitting of the model.
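The 10-part cross-validation and the stopping rule can be sketched as below. The callables `step` and `val_error` are hypothetical placeholders for one round of training and one evaluation on the validation part; the `max_epochs` cap is an assumption added so that the loop always terminates.

```python
from typing import Callable, Iterator, List, Tuple


def k_fold(samples: List, k: int = 10) -> Iterator[Tuple[List, List]]:
    """Yield (training part, validation part) pairs; each of the k parts
    serves once as the validation set."""
    folds = [samples[i::k] for i in range(k)]
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, folds[i]


def train_with_early_stopping(step: Callable[[], None],
                              val_error: Callable[[], float],
                              error_threshold: float,
                              max_epochs: int = 100) -> int:
    """Train one epoch at a time and stop as soon as the validation error is
    greater than or equal to the threshold. Returns the number of epochs run."""
    for epoch in range(1, max_epochs + 1):
        step()
        if val_error() >= error_threshold:
            return epoch
    return max_epochs
```

For example, with validation errors of 0.1, 0.2, 0.5 on successive epochs and a threshold of 0.5, training stops after the third epoch.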

Specifically, the training set is used to train the LSTM model, and the LSTM model may adopt a bi-directional LSTM structure. The sample data of the training set includes (X1, X2, X3, X4, X5, X6). As shown in FIG. 2, (X1, X2, X3, X4, X5) are the inputs, A is the hidden layer, and St is the output. The hidden layer A is the memory unit of the LSTM model and carries the parameters of the model; it is calculated from the current input and the hidden-layer output of the previous step. When the test set is used to test the accuracy of the trained LSTM model, the output St is compared with X6 in the sample data, and the test result indicates how well the model characterizes the distribution of the financial time series data. If the accuracy of the LSTM model is greater than or equal to a predetermined accuracy threshold (e.g., 0.9), the LSTM model meets the requirements and the trained LSTM model is used as a prediction model; if the accuracy of the LSTM model is less than the predetermined accuracy threshold, the LSTM model does not meet the requirements and its hidden layer structure is modified. As shown in FIG. 3, in this embodiment the hidden layer corresponding to the input sample data is modified from a single hidden layer to a stacked structure of two hidden layers, and the model is retrained to obtain a prediction model whose accuracy is greater than or equal to the predetermined accuracy threshold.
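The accept-or-modify decision described above can be outlined as a small control loop. This is only a sketch under stated assumptions: `build_and_train` and `evaluate` are hypothetical callables standing in for model construction, training, and accuracy testing; the text does not define how accuracy is computed, and the cap of three hidden layers is an assumption made here.

```python
from typing import Any, Callable, Tuple


def select_prediction_model(build_and_train: Callable[[int], Any],
                            evaluate: Callable[[Any], float],
                            accuracy_threshold: float = 0.9,
                            max_hidden_layers: int = 3) -> Tuple[Any, int]:
    """Train with one hidden layer first; if the test accuracy is below the
    threshold, stack one more hidden layer and retrain."""
    for n_layers in range(1, max_hidden_layers + 1):
        model = build_and_train(n_layers)
        if evaluate(model) >= accuracy_threshold:
            return model, n_layers
    raise RuntimeError("no structure reached the accuracy threshold")


# Stub example: the "model" is just its layer count, and only the
# two-layer structure passes the 0.9 threshold (hypothetical numbers).
model, layers = select_prediction_model(
    build_and_train=lambda n: n,
    evaluate=lambda m: 0.8 if m == 1 else 0.95,
)
```

In the stub example the single-layer structure is rejected and the stacked two-layer structure is accepted, mirroring the change from FIG. 2 to FIG. 3.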

The structure of the GRU model is similar to that of the LSTM model, except that the structure of its hidden layer differs: a GRU unit has two gates (update and reset) and no separate cell state, which makes it simpler than an LSTM unit. The GRU model is trained with the same training set as above. The process of training the GRU model is basically the same as that of training the LSTM model, and extracting part of the sample data from the training set as a validation set likewise helps avoid over-fitting. After training, the trained GRU model is tested with the test set to check that its accuracy is greater than or equal to the predetermined accuracy threshold. If the accuracy of the GRU model is less than the accuracy threshold, the structure of the GRU model is modified in a manner similar to that used for the LSTM model.

Through the above training and testing process, a hybrid model composed of LSTM model + GRU model corresponding to each predetermined time step is obtained as a prediction model.

In this embodiment, the position of the missing value is located first. Since financial time series data is ordered in time, the position of a missing value can be located by the time point at which it occurs. The number of missing data points at each position is then determined, for example, 1 or 2. The number of data points of financial time series data to be input into the model is determined according to the number of missing data points to be predicted, and that many data points preceding the missing value are intercepted as the data to be input.

The number of missing data points is generally 1 or 2, and the data to be input preferably has 5, 6, or 7 data points. Fewer than 5 or more than 7 data points usually makes it difficult to achieve good results, because fewer than 5 data points capture too little temporal information, while more than 7 span a longer period and introduce greater information deviation. Preferably, the correspondence between the number of missing data points and the number of data points to be input is as shown in Table 1 below:
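Locating missing values and counting how many occur in a row can be sketched as a single pass over the series. Representing a missing value as `None` and the function name `find_missing_runs` are assumptions made for this illustration.

```python
from typing import List, Optional, Sequence, Tuple


def find_missing_runs(series: Sequence[Optional[float]]) -> List[Tuple[int, int]]:
    """Return (start index, run length) for every run of consecutive
    missing values, where a missing value is represented by None."""
    runs = []
    i = 0
    while i < len(series):
        if series[i] is None:
            start = i
            # Advance to the end of this run of missing values.
            while i < len(series) and series[i] is None:
                i += 1
            runs.append((start, i - start))
        else:
            i += 1
    return runs


# Hypothetical daily closing prices with one single gap and one double gap.
prices = [1.0, 2.0, None, 4.0, None, None, 7.0]
missing_runs = find_missing_runs(prices)
```

Here the index plays the role of the time point, so each returned pair gives the position of a gap and the number of missing data points in it.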

Number of missing data points	Number of data points to be input
1	5
1	6
2	6
1	7
2	7

Table 1

In Table 1, if the number of missing data points is 1, the number of data points to intercept is determined to be 5, 6, or 7, and the 5, 6, or 7 data points of financial time series data preceding the position of the missing value are intercepted and used as the data to be input; if the number of missing data points is 2, the number of data points to intercept is determined to be 6 or 7, and the 6 or 7 data points of financial time series data preceding the position of the missing value are intercepted and used as the data to be input.
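The interception rule of Table 1 can be illustrated with a short sketch. The dictionary name, the function name `intercept_input`, and the checks for insufficient or incomplete history are assumptions added here; the text itself only specifies the permitted length combinations.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Table 1: number of missing data points -> permitted input lengths.
ALLOWED_INPUT_LENGTHS: Dict[int, Tuple[int, ...]] = {1: (5, 6, 7), 2: (6, 7)}


def intercept_input(series: Sequence[Optional[float]], position: int,
                    n_missing: int, input_len: int) -> List[float]:
    """Return the `input_len` data points immediately preceding `position`."""
    if input_len not in ALLOWED_INPUT_LENGTHS.get(n_missing, ()):
        raise ValueError("combination not listed in Table 1")
    if position < input_len:
        raise ValueError("not enough history before the missing value")
    window = list(series[position - input_len:position])
    # Assumption: the intercepted history must itself be complete.
    if any(v is None for v in window):
        raise ValueError("history contains missing values")
    return window


# Hypothetical series with two missing points starting at index 7.
data = [10, 11, 12, 13, 14, 15, 16, None, None, 19]
to_input = intercept_input(data, position=7, n_missing=2, input_len=6)
```

In this example the six values 11 through 16 that precede the gap become the data to be input.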

In this embodiment, the data to be input is input into each prediction model, each being a hybrid model composed of a GRU model and an LSTM model; that is, it is input into the hybrid model corresponding to 6 time units, the hybrid model corresponding to 11 time units, and the hybrid model corresponding to 16 time units. The predicted values V1, V2, and V3 output by the three hybrid models are obtained, and the filling value of the missing value is calculated as V=(V1+V2+V3)/3. When the number of missing data points is 2, the average of the predicted values output for each corresponding position is likewise calculated. The filling value V captures the dependencies between earlier and later values of the financial time series data and, being the average of the three hybrid models, is more objective and accurate.

Compared with the prior art, the present application sets sliding windows with different time steps to intercept data from financial time series data without missing values, samples the intercepted data to obtain sample data corresponding to the different time steps, and divides the sample data into a training set and a test set to train a predetermined recurrent neural network model, thereby obtaining prediction models corresponding to the different time steps. For financial time series data with a missing value, the position of the missing value is located and the number of missing data points is determined; the financial time series data preceding the position of the missing value is intercepted according to that position and number and input into each prediction model; the predicted value output by each prediction model is obtained, and the average of the predicted values is used as the filling value of the missing value. The present application thus uses recurrent neural network models to process and predict missing values in financial time series data and can capture the dependencies between earlier and later values of the series. Because the filling value is given by the average of multiple models, it is more objective and accurate, and the overall distribution of the real financial time series data can be restored as closely as possible.

FIG. 4 is a schematic flowchart of an embodiment of a method for processing financial time series data according to the present application. The method for processing financial time series data includes the following steps:
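The averaging V=(V1+V2+V3)/3, applied position by position when two values are missing, can be sketched as follows. The prediction models are represented by hypothetical stub callables that return one predicted value per missing data point; real models would be the trained hybrid models.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[Sequence[float]], List[float]]


def fill_values(models: Sequence[Predictor],
                inputs: Sequence[float]) -> List[float]:
    """Average the predictions of all models, position by position."""
    predictions = [model(inputs) for model in models]
    n_missing = len(predictions[0])
    return [
        sum(p[i] for p in predictions) / len(predictions)
        for i in range(n_missing)
    ]


# Stub predictors standing in for the hybrid models trained on the
# 6-, 11-, and 16-unit windows (hypothetical outputs for two missing points).
stub_models = [
    lambda x: [1.0, 2.0],
    lambda x: [2.0, 4.0],
    lambda x: [3.0, 6.0],
]
filled = fill_values(stub_models, [11, 12, 13, 14, 15, 16])
```

With these stub outputs the first missing point is filled with (1.0+2.0+3.0)/3 and the second with (2.0+4.0+6.0)/3.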

Step S1: setting sliding windows with different predetermined time steps, sliding the set windows over financial time series data without missing values to obtain multiple window data, and sampling each window data to obtain sample data corresponding to each predetermined time step;

Step S2: training a predetermined recurrent neural network model separately with the sample data corresponding to each predetermined time step, and taking the trained model corresponding to each predetermined time step as a prediction model;

The predetermined recurrent neural network model is a hybrid model of two or more recurrent neural networks, preferably a hybrid model composed of a Long Short-Term Memory (LSTM) model and a Gated Recurrent Unit (GRU) model. Both the LSTM model and the GRU model can be used to capture the dependencies between earlier and later values of a time series.

In an embodiment, step S2 includes: dividing the sample data corresponding to each predetermined time step into a training set of a first ratio and a test set of a second ratio, and training the predetermined recurrent neural network model with the training set corresponding to each predetermined time step, wherein the sum of the first ratio and the second ratio is less than or equal to 1; extracting a predetermined number of sample data from the training set corresponding to each predetermined time step as a validation set, using the validation set to test the parameters of the recurrent neural network model during training, and terminating the training when the test error is greater than or equal to a predetermined error threshold to obtain the trained recurrent neural network model; testing the accuracy of the trained recurrent neural network model with the test set; if the accuracy is greater than or equal to a predetermined accuracy threshold, using the trained recurrent neural network model as a prediction model; and if the accuracy is less than the predetermined accuracy threshold, modifying the hidden layer structure of the recurrent neural network model and retraining to obtain a prediction model whose accuracy is greater than or equal to the predetermined accuracy threshold.

Step S3: obtaining financial time series data with a missing value, obtaining the position of the missing value in the financial time series data and the number of missing data points, intercepting the financial time series data preceding the position of the missing value according to that position and number, and using the intercepted data as the data to be input;

The number of missing data points is generally 1 or 2, and the data to be input preferably has 5, 6, or 7 data points. Fewer than 5 or more than 7 data points usually makes it difficult to achieve good results, because fewer than 5 data points capture too little temporal information, while more than 7 span a longer period and introduce greater information deviation. Preferably, the correspondence is as shown in Table 1 above.

Step S4: inputting the data to be input into each prediction model, obtaining the predicted value output by each prediction model, and taking the average of the predicted values as the filling value of the missing value.

The present application also provides a computer readable storage medium having stored thereon a processing system, the processing system being executed by a processor to implement the steps of the processing method of the financial time series data described above.

The serial numbers of the embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the foregoing embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the various embodiments of the present application.

The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims

1. A server, comprising a memory and a processor coupled to the memory, the memory storing a processing system operable on the processor, wherein the processing system, when executed by the processor, implements the following steps:

Setting sliding windows with different predetermined time steps, sliding the set windows over financial time series data without missing values to obtain multiple window data, and sampling each window data to obtain sample data corresponding to each predetermined time step;

Training a predetermined recurrent neural network model separately with the sample data corresponding to each predetermined time step, and taking the trained model corresponding to each predetermined time step as a prediction model;

Obtaining financial time series data with a missing value, obtaining the position of the missing value in the financial time series data and the number of missing data points, intercepting the financial time series data preceding the position of the missing value according to that position and number, and using the intercepted data as the data to be input;

Inputting the data to be input into each prediction model, obtaining the predicted value output by each prediction model, and taking the average of the predicted values as the filling value of the missing value.
The server according to claim 1, wherein the step of training a predetermined recurrent neural network model separately with the sample data corresponding to each predetermined time step, and taking the trained model corresponding to each predetermined time step as a prediction model, comprises:

Dividing the sample data corresponding to each predetermined time step into a training set of a first ratio and a test set of a second ratio, and training the predetermined recurrent neural network model separately with the training set corresponding to each predetermined time step, wherein the sum of the first ratio and the second ratio is less than or equal to 1;

Extracting a predetermined number of sample data from the training set corresponding to each predetermined time step as a validation set, using the validation set to check the parameters of the recurrent neural network model during training, and ending the training when the check error is greater than or equal to a predetermined error threshold, to obtain a trained recurrent neural network model;

Using the test set to test the accuracy of the trained recurrent neural network model;

If the accuracy is greater than or equal to a predetermined accuracy threshold, taking the trained recurrent neural network model as a prediction model;

If the accuracy is less than the predetermined accuracy threshold, modifying the hidden layer structure of the recurrent neural network model and training again, to obtain a prediction model whose accuracy is greater than or equal to the predetermined accuracy threshold.
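
As a hedged illustration of the data division recited above, the following Python sketch splits the samples for one time step into a training set, a validation set and a test set. The chronological (unshuffled) split, the choice to hold the validation samples out of the data used for fitting, and the name `split_samples` are assumptions of this sketch and are not taken from the claim.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(samples: Sequence[T], first_ratio: float, second_ratio: float,
                  n_validation: int) -> Tuple[List[T], List[T], List[T]]:
    """Return (training_set, validation_set, test_set) for one time step."""
    if first_ratio < 0 or second_ratio < 0 or first_ratio + second_ratio > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    n_train = int(len(samples) * first_ratio)
    n_test = int(len(samples) * second_ratio)
    training = list(samples[:n_train])
    test_set = list(samples[n_train:n_train + n_test])
    if not 0 <= n_validation <= len(training):
        raise ValueError("validation size must fit inside the training set")
    # Assumption: the validation samples are the last ones of the training set
    # and are removed from the data used for fitting.
    cut = len(training) - n_validation
    return training[:cut], training[cut:], test_set


# Made-up example: 40 samples, 75% training, 25% test, 5 validation samples.
samples = list(range(40))
training_set, validation_set, test_set = split_samples(samples, 0.75, 0.25, n_validation=5)
```

Because the two ratios may sum to less than 1, any samples beyond the training and test portions are simply left unused in this sketch.
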
The server according to claim 1, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, specifically comprises:

Determining the number of positions of the data to be intercepted according to the number of positions of the missing value, intercepting the determined number of positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The server according to claim 2, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, specifically comprises:

Determining the number of positions of the data to be intercepted according to the number of positions of the missing value, intercepting the determined number of positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The server according to claim 3, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, further comprises:

If the missing value occupies 1 position, determining that the intercepted data occupies 5, 6 or 7 positions, intercepting the 5, 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input;

If the missing value occupies 2 positions, determining that the intercepted data occupies 6 or 7 positions, intercepting the 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
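
The length rule recited above can be illustrated with a minimal Python sketch. It assumes that the intercepted data are the values immediately before the missing position and that missing entries are represented by `None`; the names `ALLOWED_LENGTHS` and `intercept` are hypothetical and serve this example only.

```python
from typing import List, Optional, Sequence

# Allowed lengths of the intercepted data, keyed by the number of missing positions,
# as recited in the claim: 1 missing -> 5, 6 or 7; 2 missing -> 6 or 7.
ALLOWED_LENGTHS = {1: (5, 6, 7), 2: (6, 7)}


def intercept(series: Sequence[Optional[float]], position: int,
              n_missing: int, length: int) -> List[Optional[float]]:
    """Return the `length` values that precede the missing value at `position`."""
    allowed = ALLOWED_LENGTHS.get(n_missing)
    if allowed is None:
        raise ValueError(f"no length rule recited for {n_missing} missing positions")
    if length not in allowed:
        raise ValueError(f"length {length} not allowed for {n_missing} missing positions")
    if position < length:
        raise ValueError("not enough data before the missing value")
    # Assumption: the slice ends directly at the missing position.
    return list(series[position - length:position])


# Made-up series with one missing value at index 7.
prices = [10.0, 10.2, 10.1, 10.4, 10.3, 10.5, 10.6, None, 10.9]
data_to_input = intercept(prices, position=7, n_missing=1, length=5)
```

Rejecting lengths outside the recited sets is a design choice of the sketch; a real system might instead pick a default length for each case.
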
The server according to claim 4, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, further comprises:

If the missing value occupies 1 position, determining that the intercepted data occupies 5, 6 or 7 positions, intercepting the 5, 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input;

If the missing value occupies 2 positions, determining that the intercepted data occupies 6 or 7 positions, intercepting the 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The server according to claim 1 or 2, wherein the predetermined time steps are 6 time units, 11 time units and 16 time units, and the predetermined recurrent neural network model is a hybrid model composed of a long short-term memory (LSTM) network model and a gated recurrent unit (GRU) model.
A method for processing financial time series data, characterized in that the method comprises:

S1: setting sliding windows with different predetermined time steps, sliding the set windows over financial time series data that contains no missing values to obtain a plurality of window data, and sampling each window data to obtain sample data corresponding to each predetermined time step;

S2: training a predetermined recurrent neural network model separately with the sample data corresponding to each predetermined time step, and taking the trained model corresponding to each predetermined time step as a prediction model;

S3: obtaining financial time series data containing a missing value, obtaining the position of the missing value in the financial time series data and the number of positions of the missing value, intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input;

S4: inputting the data to be input into each prediction model, obtaining the predicted value output by each prediction model, and taking the average of the predicted values as the fill-in value for the missing value.
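
Steps S3 and S4 can be pictured with the following Python sketch, offered only as an illustration under stated assumptions. Missing entries are assumed to be `None`, only the first gap is filled, and the three predictors are simple stand-ins (last value, mean, linear extrapolation) used in place of trained recurrent neural network models; all function names are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Optional, Sequence, Tuple

Predictor = Callable[[List[float], int], List[float]]


def find_missing_run(series: Sequence[Optional[float]]) -> Optional[Tuple[int, int]]:
    """Return (position, number of missing positions) of the first gap, if any."""
    for start, value in enumerate(series):
        if value is None:
            end = start
            while end < len(series) and series[end] is None:
                end += 1
            return start, end - start
    return None


def fill_first_gap(series: Sequence[Optional[float]], input_length: int,
                   predictors: Sequence[Predictor]) -> List[Optional[float]]:
    """Fill the first gap with the average of all predictors' outputs."""
    run = find_missing_run(series)
    if run is None:
        return list(series)
    position, n_missing = run
    if position < input_length:
        raise ValueError("not enough data before the missing value")
    inputs = list(series[position - input_length:position])
    if any(v is None for v in inputs):
        raise ValueError("intercepted data must not contain missing values")
    predictions = [predict(inputs, n_missing) for predict in predictors]
    # Average the predictors position by position.
    averaged = [mean(column) for column in zip(*predictions)]
    filled = list(series)
    filled[position:position + n_missing] = averaged
    return filled


# Stand-in predictors, NOT the trained models of step S2.
def last_value(x: List[float], n: int) -> List[float]:
    return [x[-1]] * n

def mean_value(x: List[float], n: int) -> List[float]:
    return [mean(x)] * n

def linear(x: List[float], n: int) -> List[float]:
    return [x[-1] + (k + 1) * (x[-1] - x[-2]) for k in range(n)]


series = [1.0, 2.0, 3.0, 4.0, 5.0, None, 7.0]
filled = fill_first_gap(series, input_length=5, predictors=[last_value, mean_value, linear])
```

In practice each stand-in would be replaced by the prediction model trained for one of the predetermined time steps.
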
The method for processing financial time series data according to claim 8, wherein the step S2 comprises:

Dividing the sample data corresponding to each predetermined time step into a training set of a first ratio and a test set of a second ratio, and training the predetermined recurrent neural network model separately with the training set corresponding to each predetermined time step, wherein the sum of the first ratio and the second ratio is less than or equal to 1;

Extracting a predetermined number of sample data from the training set corresponding to each predetermined time step as a validation set, using the validation set to check the parameters of the recurrent neural network model during training, and ending the training when the check error is greater than or equal to a predetermined error threshold, to obtain a trained recurrent neural network model;

Using the test set to test the accuracy of the trained recurrent neural network model;

If the accuracy is greater than or equal to a predetermined accuracy threshold, taking the trained recurrent neural network model as a prediction model;

If the accuracy is less than the predetermined accuracy threshold, modifying the hidden layer structure of the recurrent neural network model and training again, to obtain a prediction model whose accuracy is greater than or equal to the predetermined accuracy threshold.
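
The accept-or-retrain logic at the end of step S2 might be sketched as below. This is a control-flow illustration only: the `train` and `accuracy_on_test` callables, the list of candidate hidden layer structures and the accuracy figures are all hypothetical placeholders, and no actual neural network is trained here.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

HiddenLayers = Tuple[int, ...]
Model = Dict[str, HiddenLayers]


def train_until_accurate(hidden_layer_options: Iterable[HiddenLayers],
                         train: Callable[[HiddenLayers], Model],
                         accuracy_on_test: Callable[[Model], float],
                         accuracy_threshold: float) -> Optional[Tuple[Model, float]]:
    """Try hidden layer structures in turn until the test accuracy meets the threshold."""
    for hidden_layers in hidden_layer_options:
        model = train(hidden_layers)          # train (or retrain) with this structure
        accuracy = accuracy_on_test(model)    # evaluate on the test set
        if accuracy >= accuracy_threshold:
            return model, accuracy            # accepted as the prediction model
    return None                               # no candidate structure was accurate enough


# Placeholder training and evaluation, with made-up accuracy figures.
FAKE_ACCURACY = {(8,): 0.80, (16,): 0.88, (16, 8): 0.93}

def fake_train(hidden_layers: HiddenLayers) -> Model:
    return {"hidden_layers": hidden_layers}

def fake_accuracy(model: Model) -> float:
    return FAKE_ACCURACY[model["hidden_layers"]]


result = train_until_accurate([(8,), (16,), (16, 8)], fake_train, fake_accuracy, 0.90)
```

Returning `None` when every candidate fails is a choice made for the sketch; the claim itself only recites retraining until the threshold is met.
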
The method for processing financial time series data according to claim 8, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, comprises:

Determining the number of positions of the data to be intercepted according to the number of positions of the missing value, intercepting the determined number of positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The method for processing financial time series data according to claim 9, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, comprises:

Determining the number of positions of the data to be intercepted according to the number of positions of the missing value, intercepting the determined number of positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The method for processing financial time series data according to claim 10, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, further comprises:

If the missing value occupies 1 position, determining that the intercepted data occupies 5, 6 or 7 positions, intercepting the 5, 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input;

If the missing value occupies 2 positions, determining that the intercepted data occupies 6 or 7 positions, intercepting the 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The method for processing financial time series data according to claim 11, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, further comprises:

If the missing value occupies 1 position, determining that the intercepted data occupies 5, 6 or 7 positions, intercepting the 5, 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input;

If the missing value occupies 2 positions, determining that the intercepted data occupies 6 or 7 positions, intercepting the 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The method for processing financial time series data according to claim 8 or 9, wherein the predetermined time steps are 6 time units, 11 time units and 16 time units, and the predetermined recurrent neural network model is a hybrid model composed of a long short-term memory (LSTM) network model and a gated recurrent unit (GRU) model.
A computer readable storage medium, wherein the computer readable storage medium stores a processing system which, when executed by a processor, implements the following steps:

Setting sliding windows with different predetermined time steps, sliding the set windows over financial time series data that contains no missing values to obtain a plurality of window data, and sampling each window data to obtain sample data corresponding to each predetermined time step;

Training a predetermined recurrent neural network model separately with the sample data corresponding to each predetermined time step, and taking the trained model corresponding to each predetermined time step as a prediction model;

Obtaining financial time series data containing a missing value, obtaining the position of the missing value in the financial time series data and the number of positions of the missing value, intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input;

Inputting the data to be input into each prediction model, obtaining the predicted value output by each prediction model, and taking the average of the predicted values as the fill-in value for the missing value.
The computer readable storage medium according to claim 15, wherein the step of training a predetermined recurrent neural network model separately with the sample data corresponding to each predetermined time step, and taking the trained model corresponding to each predetermined time step as a prediction model, comprises:

Dividing the sample data corresponding to each predetermined time step into a training set of a first ratio and a test set of a second ratio, and training the predetermined recurrent neural network model separately with the training set corresponding to each predetermined time step, wherein the sum of the first ratio and the second ratio is less than or equal to 1;

Extracting a predetermined number of sample data from the training set corresponding to each predetermined time step as a validation set, using the validation set to check the parameters of the recurrent neural network model during training, and ending the training when the check error is greater than or equal to a predetermined error threshold, to obtain a trained recurrent neural network model;

Using the test set to test the accuracy of the trained recurrent neural network model;

If the accuracy is greater than or equal to a predetermined accuracy threshold, taking the trained recurrent neural network model as a prediction model;

If the accuracy is less than the predetermined accuracy threshold, modifying the hidden layer structure of the recurrent neural network model and training again, to obtain a prediction model whose accuracy is greater than or equal to the predetermined accuracy threshold.
The computer readable storage medium according to claim 15, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, comprises:

Determining the number of positions of the data to be intercepted according to the number of positions of the missing value, intercepting the determined number of positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The computer readable storage medium according to claim 16, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, comprises:

Determining the number of positions of the data to be intercepted according to the number of positions of the missing value, intercepting the determined number of positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The computer readable storage medium according to claim 17, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, further comprises:

If the missing value occupies 1 position, determining that the intercepted data occupies 5, 6 or 7 positions, intercepting the 5, 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input;

If the missing value occupies 2 positions, determining that the intercepted data occupies 6 or 7 positions, intercepting the 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.
The computer readable storage medium according to claim 18, wherein the step of intercepting the financial time series data preceding the position of the missing value according to the position of the missing value and the number of positions of the missing value, and taking the intercepted data as the data to be input, further comprises:

If the missing value occupies 1 position, determining that the intercepted data occupies 5, 6 or 7 positions, intercepting the 5, 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input;

If the missing value occupies 2 positions, determining that the intercepted data occupies 6 or 7 positions, intercepting the 6 or 7 positions of financial time series data preceding the position of the missing value, and taking the intercepted data as the data to be input.