WO2019192136A1 - Electronic device, financial data processing method and system, and computer-readable storage medium - Google Patents

Electronic device, financial data processing method and system, and computer-readable storage medium

Info

Publication number
WO2019192136A1
WO2019192136A1 PCT/CN2018/102226 CN2018102226W WO2019192136A1 WO 2019192136 A1 WO2019192136 A1 WO 2019192136A1 CN 2018102226 W CN2018102226 W CN 2018102226W WO 2019192136 A1 WO2019192136 A1 WO 2019192136A1
Authority
WO
WIPO (PCT)
Prior art keywords
factor data
data
processed
prediction model
factor
Application number
PCT/CN2018/102226
Other languages
French (fr)
Chinese (zh)
Inventor
李正洋
毛小豪
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2019192136A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G06Q40/06 Asset management; Financial planning or analysis

Definitions

  • the present application relates to the field of computer technologies, and in particular, to an electronic device, a financial data processing method, a system, and a computer readable storage medium.
  • Financial data includes stock factor data, bond factor data, futures factor data, fund factor data, and so on.
  • factor analysis based on stock factor data is often used as an important basis for judging the future trend of the stock.
  • There are various types of factor data, and different types are calculated in different ways. Because of how it is collected, or because of its spatiotemporal nature, observed factor data is generally regarded as unclean (i.e., noisy) data. For example, because of delay, a positive momentum factor observed at a given time does not mean that the stock price will keep rising; momentum reversal may also occur. Analysis results based on noisy data are therefore not accurate.
  • the main purpose of the present application is to solve the problem that the observed factor data is full of noise, and the factor analysis result based on the factor data is not accurate.
  • the electronic device includes a memory and a processor; the memory stores a financial data processing system operable on the processor, and the financial data processing system implements the following steps when executed by the processor:
  • the present application provides a financial data processing method, the method comprising the steps of:
  • the present application provides a financial data processing system, where the financial data processing system includes:
  • a first acquiring module configured to acquire an observation value of each to-be-processed factor data in the first preset time interval
  • a query module configured to query, in a pre-established factor data prediction model library, a factor data prediction model corresponding to each of the to-be-processed factor data according to a mapping relationship between the predetermined factor data and the factor data prediction model;
  • a second acquiring module configured to acquire, according to the factor data prediction model corresponding to each of the to-be-processed factor data, a predicted value of each of the to-be-processed factor data in the first preset time interval;
  • a data processing module configured to input the acquired observation value of each of the to-be-processed factor data and the predicted value of each of the to-be-processed factor data into a predetermined data processing model to obtain the corrected factor data.
  • the present application provides a computer readable storage medium, wherein the computer readable storage medium stores a financial data processing system, and the financial data processing system is executable by at least one processor to cause the at least one processor to perform the following steps:
  • the present application determines the factor data prediction model corresponding to the to-be-processed factor data, calculates a predicted value of the to-be-processed factor data according to that model, and inputs the observed value and the predicted value of the factor data in the first preset time interval into the data processing model to obtain the denoised factor data.
  • compared with the prior art, the present application takes into account the differences in characteristics between the various factor data; using a different factor data prediction model for each factor data makes the predicted value more accurate and more in line with the real situation.
  • denoising by the data processing model eliminates the noise in the observed values of the factor data, improves the accuracy of the factor data, and lays a good foundation for the subsequent analysis of the factor data.
  • FIG. 1 is a schematic flowchart of a first embodiment of a method for processing financial data according to the present application
  • FIG. 2 is a schematic flowchart of a second embodiment of a method for processing financial data according to the present application
  • FIG. 3 is a schematic diagram of an operating environment of a first embodiment of a financial data system of the present application
  • FIG. 4 is a block diagram of a program of a first embodiment of the financial data system of the present application.
  • FIG. 5 is a block diagram of a program of a second embodiment of the financial data system of the present application.
  • FIG. 1 is a schematic flowchart of a first embodiment of a method for processing financial data according to the present application.
  • the method includes:
  • the above factor data may include the stock's day gain, moving average, stock trading volume, and the like.
  • the user can set the first preset time interval as needed, for example, setting the first preset time interval from January 2, 2018 to January 22, 2018.
  • how often the observed values of the to-be-processed factor data in the first preset time interval are acquired may be set as needed: for example, the acquiring step may be performed in real time, at a fixed time interval (for example, 1 day), or whenever an acquisition instruction issued by the user is received.
  • It should be noted that if some factor data (for example, net profit growth, operating profit growth, etc.) needs to be calculated, the calculation step may be performed in advance and the resulting factor data stored in a preset storage space. When step S10 is performed, the required factor data may be read from that storage space. In some embodiments, step S10 may also include the calculation step of the factor data.
  • a corresponding factor data prediction model is established in advance for each factor data. Since the characteristics of each factor data are different, the method for establishing its factor data prediction model also differs. The established factor data prediction models are stored in the factor data prediction model library, and the mapping relationship between the factor data and the factor data prediction models is saved (for example, as a mapping table).
  • if the predicted value of the to-be-processed factor data at time K is to be obtained, the observed value of the to-be-processed factor data at time K-1 is input into the factor data prediction model, and the output of the model is the predicted value of the to-be-processed factor data at time K.
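The lookup and one-step prediction described above can be sketched roughly as follows. This is a minimal illustration, not the implementation of the application: the factor names, the two toy models, and the names `MODEL_LIBRARY`, `FACTOR_TO_MODEL`, `query_model`, and `predict` are all hypothetical.

```python
from typing import Callable, Dict

# A prediction model is assumed here to map the observed value at time K-1
# to a predicted value at time K. Both toy models below are placeholders.
Model = Callable[[float], float]


def persistence_model(prev_obs: float) -> float:
    """Predict that the factor keeps its previous observed value."""
    return prev_obs


def damped_model(prev_obs: float) -> float:
    """Predict a value shrunk toward zero (hypothetical damping of 0.9)."""
    return 0.9 * prev_obs


# Hypothetical "factor data prediction model library".
MODEL_LIBRARY: Dict[str, Model] = {
    "persistence": persistence_model,
    "damped": damped_model,
}

# Hypothetical mapping table between factor data and prediction models.
FACTOR_TO_MODEL: Dict[str, str] = {
    "moving_average": "persistence",
    "daily_gain": "damped",
}


def query_model(factor_name: str) -> Model:
    """Look up the prediction model mapped to the given factor."""
    return MODEL_LIBRARY[FACTOR_TO_MODEL[factor_name]]


def predict(factor_name: str, obs_at_k_minus_1: float) -> float:
    """Return the predicted value at time K from the observation at K-1."""
    return query_model(factor_name)(obs_at_k_minus_1)
```

Keeping the mapping table separate from the model library means a factor can be re-mapped to a different model without touching the models themselves.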
  • the embodiment determines the factor data prediction model corresponding to the to-be-processed factor data, calculates a predicted value of the to-be-processed factor data according to that model, and inputs the observed value and the predicted value of the factor data in the first preset time interval into the data processing model to obtain the denoised factor data.
  • compared with the prior art, the present embodiment takes into account the characteristics of each factor data; using a different factor data prediction model for each factor data makes the predicted value more accurate and more in line with the real situation.
  • denoising by the data processing model eliminates the noise in the observed values of the factor data, improves the accuracy of the factor data, and lays a good foundation for the subsequent analysis of the factor data.
  • step S40 may be specifically:
  • the observation matrix of the to-be-processed factor data and the prediction matrix of the to-be-processed factor data are input to a predetermined Kalman filter model to obtain the corrected factor data.
  • the step of generating the observation matrix of the to-be-processed factor data may be performed immediately after the step S10 is performed; likewise, the step of generating the prediction matrix of the to-be-processed factor data may be performed immediately after the step S30 is performed.
  • the data processing model is a Kalman filter model, which can effectively eliminate the noise of the factor data so that the factor data is closer to its true value.
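As a rough illustration of how a Kalman filter can combine a predicted value with a noisy observed value, the sketch below applies the standard scalar Kalman update at each time step. It is a simplification under stated assumptions: a single factor with a scalar state, the prediction supplied by an external model, and hypothetical noise variances `q` (process) and `r` (observation). The application itself speaks of observation and prediction matrices and does not give these parameters.

```python
from typing import List, Sequence


def kalman_correct(
    observed: Sequence[float],
    predicted: Sequence[float],
    q: float = 1e-2,   # assumed process-noise variance (hypothetical)
    r: float = 1.0,    # assumed observation-noise variance (hypothetical)
    p0: float = 1.0,   # assumed initial error variance (hypothetical)
) -> List[float]:
    """Blend model predictions with observations, one time step at a time."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    corrected: List[float] = []
    p = p0
    for z, x_pred in zip(observed, predicted):
        p_pred = p + q                    # error variance after prediction
        gain = p_pred / (p_pred + r)      # Kalman gain, always in (0, 1)
        x = x_pred + gain * (z - x_pred)  # corrected factor value
        p = (1.0 - gain) * p_pred         # error variance after update
        corrected.append(x)
    return corrected
```

Each corrected value lies between the prediction and the observation; a larger `r` pulls it toward the prediction, and a larger `q` pulls it toward the observation.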
  • FIG. 2 is a schematic flowchart of a second embodiment of a method for processing financial data according to the present application.
  • the present embodiment is based on the first embodiment.
  • the method further includes:
  • S50: classify each factor data according to a predetermined classification rule;
  • S60: establish the factor data prediction model corresponding to each factor data according to the type of that factor data;
  • S70: store the mapping relationship between each established factor data prediction model and its corresponding factor data.
  • step S60 includes the following steps:
  • the factor data prediction model corresponding to the factor data is established based on the following operation expression:
  • X(K+1) is a predicted value of the factor data at time K+1
  • Z(K) is an observed value of the factor data at time K.
  • the method for establishing a factor data prediction model corresponding to the factor data of the first data type is applied to low-frequency factor data, and the so-called low-frequency factor data is factor data that changes slowly with price.
  • step S60 further includes the following steps:
  • the method for establishing the factor data prediction model corresponding to the factor data is:
  • the factor data is pre-processed (e.g., normalized);
  • using the pre-processed factor data, a factor data prediction model corresponding to the factor data is constructed based on a Long Short-Term Memory (LSTM) neural network.
  • the step of constructing the factor data prediction model corresponding to the factor data based on the Long Short-Term Memory (LSTM) neural network is specifically as follows:
  • the pre-processed factor data is divided into a training set, an evaluation set, and a test set (e.g., 70% of the factor data as the training set, 10% as the evaluation set, and 20% as the test set).
  • the training set is input to the LSTM neural network model and trained based on a gradient descent method (eg, a stochastic gradient descent method).
  • the evaluation set is used to validate the LSTM neural network model during training, and the test set is input into the trained LSTM neural network model to verify it.
  • when the trained LSTM neural network model satisfies the first preset verification condition (for example, the difference between its output and the verification result is less than a preset threshold), training is complete, and the trained LSTM neural network model is set as the factor data prediction model of the factor data.
  • the above steps of dividing the factor data into the training set, the evaluation set and the test set based on the cross-validation method may be replaced by dividing the factor data into a training set and a test set based on the cross-validation method.
  • the amount of factor data in the training set, the evaluation set, and the test set can be set as needed and is not limited to the examples above.
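The pre-processing and data-splitting steps above can be sketched as follows. This is only an illustration under assumptions: it uses a plain chronological split rather than the cross-validation-based division mentioned in the text, min-max scaling as the normalization, and the function names `chronological_split` and `min_max_normalize` are hypothetical. The LSTM training itself is not shown.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    values: Sequence[float], train_pct: int = 70, eval_pct: int = 10
) -> Tuple[List[float], List[float], List[float]]:
    """Split a time-ordered series into training, evaluation and test sets.

    Percentages are integers so the boundaries are computed exactly;
    whatever remains after the first two sets becomes the test set.
    """
    n = len(values)
    train_end = n * train_pct // 100
    eval_end = train_end + n * eval_pct // 100
    data = list(values)
    return data[:train_end], data[train_end:eval_end], data[eval_end:]


def min_max_normalize(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Scale values using the given bounds; lo maps to 0 and hi maps to 1."""
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

In this sketch the bounds passed to `min_max_normalize` would typically come from the training set only, so that no information from the evaluation or test set leaks into training.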
  • the method for establishing a factor data prediction model corresponding to the factor data of the second data type is applicable to high-frequency factor data, that is, factor data that changes rapidly with price, because the LSTM neural network has more accurate predictive power for high-frequency factor data.
  • the method for judging whether a factor data belongs to low frequency factor data or high frequency factor data can be referred to the following example:
  • the factor data is determined as the high frequency factor data
  • the factor data is determined as the low frequency factor data
  • the preset standard change frequency value may be set as a change frequency value of the closing price.
  • the factor data in a time interval is taken and differenced over time to obtain a factor data curve; the X-axis of the curve is the sampling time, and the Y-axis is the value of the differenced factor data;
  • the number of intersections between the factor data curve and the mean line is counted, where the mean is the average of the differenced factor data;
  • the factor data corresponding to the factor data curve is determined as the high-frequency factor data
  • the factor data corresponding to the factor data curve is determined as the low-frequency factor data
  • the preset standard number of intersections may be set to the number of intersections counted on the closing-price curve after the closing price is differenced.
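The intersection-counting idea above can be sketched as follows. The text as extracted does not state the comparison that separates the two classes, so the rule used here (more crossings than the standard number means high-frequency) is an assumption, as are the function names `count_mean_crossings` and `classify_factor`.

```python
from typing import List, Sequence


def count_mean_crossings(values: Sequence[float]) -> int:
    """Difference the series over time and count crossings of its mean line."""
    diffs: List[float] = [b - a for a, b in zip(values, values[1:])]
    if not diffs:
        return 0
    mean = sum(diffs) / len(diffs)
    # Track on which side of the mean line each differenced point lies;
    # points exactly on the line are skipped.
    sides = [d > mean for d in diffs if d != mean]
    return sum(1 for s, t in zip(sides, sides[1:]) if s != t)


def classify_factor(values: Sequence[float], standard_crossings: int) -> str:
    """Assumed rule: more crossings than the standard means high-frequency."""
    if count_mean_crossings(values) > standard_crossings:
        return "high_frequency"
    return "low_frequency"
```

Following the text, `standard_crossings` could be obtained by running `count_mean_crossings` on the closing-price series for the same time interval.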
  • the pre-processed factor data is divided into a training set and a test set according to a preset rule (for example, dividing in chronological order, with 80% of the factor data as the training set and 20% as the test set);
  • according to a predetermined selection rule, one prediction model is selected from the plurality of prediction models as the factor data prediction model of the factor data.
  • the predetermined selection rule may be to select the prediction model whose output is closest to the verification result.
  • this method of establishing the factor data prediction model is applicable to all kinds of factor data (the factor data need not be classified): multiple prediction models are trained at the same time on one factor data, and the best of them is taken as the factor data prediction model of that factor data.
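The train-several-and-pick-the-best scheme above can be sketched as follows. The source does not name the candidate models or the error measure, so the two toy candidates, the use of mean absolute error as "closest to the verification result", and the names `fit_persistence`, `fit_mean`, and `select_model` are all assumptions.

```python
from typing import Callable, Dict, Sequence, Tuple

Model = Callable[[float], float]          # maps value at K-1 to prediction at K
Fitter = Callable[[Sequence[float]], Model]


def fit_persistence(train: Sequence[float]) -> Model:
    """Toy candidate: predict that the next value equals the previous one."""
    return lambda prev: prev


def fit_mean(train: Sequence[float]) -> Model:
    """Toy candidate: always predict the training-set mean."""
    mean = sum(train) / len(train)
    return lambda prev: mean


def mean_abs_error(model: Model, series: Sequence[float]) -> float:
    """Average one-step-ahead absolute error of the model over a series."""
    errors = [abs(model(prev) - nxt) for prev, nxt in zip(series, series[1:])]
    return sum(errors) / len(errors)


def select_model(
    values: Sequence[float], candidates: Dict[str, Fitter], train_pct: int = 80
) -> Tuple[str, Dict[str, float]]:
    """Fit every candidate on the training part and pick the lowest test error."""
    split = len(values) * train_pct // 100
    train, test = values[:split], values[split:]
    scores = {name: mean_abs_error(fit(train), test) for name, fit in candidates.items()}
    best = min(scores, key=lambda name: scores[name])
    return best, scores
```

For a steadily rising series such as `[0.0, 1.0, ..., 19.0]`, the persistence candidate has a much smaller test error than the mean candidate, so it would be the one selected.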
  • the present application also proposes a financial data processing system.
  • FIG. 3 is a schematic diagram of an operating environment of a preferred embodiment of the financial data processing system 10 of the present application.
  • the financial data processing system 10 is installed and operated in the electronic device 1.
  • the electronic device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a server.
  • the electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a display 13.
  • Figure 3 shows only the electronic device 1 with components 11-13, but it should be understood that not all illustrated components may be implemented, and more or fewer components may be implemented instead.
  • the memory 11 may be an internal storage unit of the electronic device 1 in some embodiments, such as a hard disk or memory of the electronic device 1.
  • the memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in hard disk equipped on the electronic device 1, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc.
  • the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device.
  • the memory 11 is used to store application software and various types of data installed in the electronic device 1, such as program codes of the financial data processing system 10.
  • the memory 11 can also be used to temporarily store data that has been output or is about to be output.
  • the processor 12, in some embodiments, may be a Central Processing Unit (CPU), a microprocessor, or another data processing chip for running program code or processing data stored in the memory 11, such as executing the financial data processing system 10.
  • the display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch sensor, or the like in some embodiments.
  • the display 13 is used to display information processed in the electronic device 1 and to display a visual user interface.
  • the components 11-13 of the electronic device 1 communicate with one another via a system bus.
  • FIG. 4 is a program block diagram of a preferred embodiment of the financial data processing system 10 of the present application.
  • the financial data processing system 10 can be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to implement the present application.
  • the financial data processing system 10 can be divided into a first acquisition module 101, a query module 102, a second acquisition module 103, and a data processing module 104.
  • a module referred to in this application refers to a series of computer program instruction segments capable of performing a specific function, and is more suitable than the program for describing the execution process of the financial data processing system 10 in the electronic device 1, wherein:
  • the first obtaining module 101 is configured to acquire an observation value of each to-be-processed factor data in the first preset time interval;
  • the query module 102 is configured to query, in a pre-established factor data prediction model library, a factor data prediction model corresponding to each of the to-be-processed factor data according to a mapping relationship between the predetermined factor data and the factor data prediction model;
  • a second obtaining module 103 configured to acquire, according to the factor data prediction model corresponding to each of the to-be-processed factor data, a predicted value of each of the to-be-processed factor data in the first preset time interval;
  • the data processing module 104 is configured to input the acquired observation value of each of the to-be-processed factor data and the predicted value of each of the to-be-processed factor data into a predetermined data processing model to obtain the corrected factor data.
  • the above factor data may include the stock's day gain, moving average, stock trading volume, and the like.
  • the user can set the first preset time interval as needed, for example, setting the first preset time interval from January 2, 2018 to January 22, 2018.
  • how often the observed values of the to-be-processed factor data in the first preset time interval are acquired may be set as needed: for example, the acquiring step may be performed in real time, at a fixed time interval (for example, 1 day), or whenever an acquisition instruction issued by the user is received.
  • It should be noted that if some factor data (for example, net profit growth, operating profit growth, etc.) needs to be calculated, the calculation step may be performed in advance and the resulting factor data stored in a preset storage space.
  • the first obtaining module 101 may then read the required factor data from that storage space; in some embodiments, the first obtaining module 101 can also perform the calculation step of the factor data.
  • the financial data processing system 10 pre-establishes a corresponding factor data prediction model for each factor data. Since the characteristics of each factor data are different, the method for establishing a corresponding factor data prediction model is different.
  • the established factor data prediction models are stored in the factor data prediction model library, and the mapping relationship between the factor data and the factor data prediction model is saved (for example, a mapping table reflecting the mapping relationship between the factor data and the factor data prediction model).
  • the step in which the second acquiring module 103 acquires the predicted values of the to-be-processed factor data in the first preset time interval may be exemplified as follows: if the predicted value of the to-be-processed factor data at time K is to be obtained, the observed value of the to-be-processed factor data at time K-1 is input into the factor data prediction model, and the output of the model is the predicted value of the to-be-processed factor data at time K.
  • the embodiment determines the factor data prediction model corresponding to the to-be-processed factor data, calculates a predicted value of the to-be-processed factor data according to that model, and inputs the observed value and the predicted value of the factor data in the first preset time interval into the data processing model to obtain the denoised factor data.
  • compared with the prior art, the present embodiment takes into account the characteristics of each factor data; using a different factor data prediction model for each factor data makes the predicted value more accurate and more in line with the real situation.
  • denoising by the data processing model eliminates the noise in the observed values of the factor data, improves the accuracy of the factor data, and lays a good foundation for the subsequent analysis of the factor data.
  • the data processing module 104 is further configured to:
  • the observation matrix of the to-be-processed factor data and the prediction matrix of the to-be-processed factor data are input to a predetermined Kalman filter model to obtain the corrected factor data.
  • the step of generating the observation matrix of the to-be-processed factor data may be performed by the first obtaining module 101 immediately after performing the step of acquiring the observed values of the to-be-processed factor data in the first preset time interval;
  • the step of generating the prediction matrix of the to-be-processed factor data may be performed by the second obtaining module 103 immediately after performing the step of acquiring the predicted values of the to-be-processed factor data in the first preset time interval.
  • the data processing model is a Kalman filter model, which can effectively eliminate the noise of the factor data so that the factor data is closer to its true value.
  • FIG. 5 is a program module diagram of a second embodiment of the financial data processing system of the present application.
  • the present embodiment is based on the first embodiment, the financial data processing system 10 further includes a model building module 105, and the model building module 105 is configured to:
  • the model building module 105 is further configured to:
  • the factor data prediction model corresponding to the factor data is established based on the following operation expression:
  • X(K+1) is a predicted value of the factor data at time K+1
  • Z(K) is an observed value of the factor data at time K.
  • the method for establishing a factor data prediction model corresponding to the factor data of the first data type is applied to low-frequency factor data, and the so-called low-frequency factor data is factor data that changes slowly with price.
  • the model building module 105 is further configured to:
  • the method for establishing the factor data prediction model corresponding to the factor data is:
  • the factor data is pre-processed (e.g., normalized);
  • using the pre-processed factor data, a factor data prediction model corresponding to the factor data is constructed based on a Long Short-Term Memory (LSTM) neural network.
  • the step of constructing the factor data prediction model corresponding to the factor data based on the Long Short-Term Memory (LSTM) neural network is specifically as follows:
  • the pre-processed factor data is divided into a training set, an evaluation set, and a test set (e.g., 70% of the factor data as the training set, 10% as the evaluation set, and 20% as the test set).
  • the training set is input to the LSTM neural network model and trained based on a gradient descent method (eg, a stochastic gradient descent method).
  • the evaluation set is used to validate the LSTM neural network model during training, and the test set is input into the trained LSTM neural network model to verify it.
  • when the trained LSTM neural network model satisfies the first preset verification condition (for example, the difference between its output and the verification result is less than a preset threshold), training is complete, and the trained LSTM neural network model is set as the factor data prediction model of the factor data.
  • the above steps of dividing the factor data into the training set, the evaluation set and the test set based on the cross-validation method may be replaced by dividing the factor data into a training set and a test set based on the cross-validation method.
  • the amount of factor data in the training set, the evaluation set, and the test set can be set as needed and is not limited to the examples above.
  • the method for establishing a factor data prediction model corresponding to the factor data of the second data type is applicable to high-frequency factor data, that is, factor data that changes rapidly with price, because the LSTM neural network has more accurate predictive power for time-series factor data.
  • model building module 105 described above is further configured to:
  • the pre-processed factor data is divided into a training set and a test set according to a preset rule (for example, dividing in chronological order, with 80% of the factor data as the training set and 20% as the test set);
  • according to a predetermined selection rule, one prediction model is selected from the plurality of prediction models as the factor data prediction model of the factor data.
  • the predetermined selection rule may be to select the prediction model whose output is closest to the verification result.
  • this method of establishing the factor data prediction model is applicable to all kinds of factor data (the factor data need not be classified): multiple prediction models are trained at the same time on one factor data, and the best of them is taken as the factor data prediction model of that factor data.
  • the present application further provides a computer readable storage medium storing a financial data processing system, the financial data processing system being executable by at least one processor to cause the at least one processor to perform the financial data processing method of any of the above embodiments.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Disclosed in the present application are an electronic device, a financial data processing method and system, and a computer-readable storage medium. According to the present application, a factor data prediction model corresponding to factor data to be processed is determined, a predicted value of said factor data is calculated according to the factor data prediction model, and the observed value and predicted value of the factor data within a first preset time interval are input to a data processing model to obtain denoised factor data. Compared with the prior art, the present application eliminates the noise in the observed value of the factor data, improves the accuracy of the factor data, and lays a good foundation for the subsequent analysis of the factor data.

Description

电子装置、金融数据处理方法、系统和计算机可读存储介质Electronic device, financial data processing method, system and computer readable storage medium
优先权申明Priority claim
本申请基于巴黎公约申明享有2018年04月03日递交的申请号为CN 201810298032.2、名称为“电子装置、金融数据处理方法和计算机可读存储介质”中国专利申请的优先权,该中国专利申请的整体内容以参考的方式结合在本申请中。The present application is based on the priority of the Chinese Patent Application entitled "Electronic Device, Financial Data Processing Method, and Computer Readable Storage Media", filed on Apr. 3, 2018, the disclosure of which is incorporated herein by reference. The entire content is incorporated herein by reference.
Technical field
The present application relates to the field of computer technology, and in particular to an electronic device, a financial data processing method and system, and a computer-readable storage medium.
Background
Financial data includes stock factor data, bond factor data, futures factor data, fund factor data, and so on. For example, factor analysis based on stock factor data is commonly used as an important basis for judging the future trend of a stock. Factor data comes in many types, and each type is calculated in its own way. Because of how the data is collected, or because of its temporal and spatial characteristics, observed factor data is generally regarded as unclean, that is, noisy. For example, because of latency, a positive momentum factor observed at a given moment does not mean that the stock price will keep rising; a momentum reversal may also occur. Analysis based on noisy data therefore yields results of limited accuracy.
Summary of the invention
The main purpose of the present application is to solve the problem that observed factor data is noisy, so that factor analysis based on that data yields results of limited accuracy.
To achieve the above purpose, the present application proposes an electronic device. The electronic device includes a memory and a processor, the memory stores a financial data processing system that can run on the processor, and the financial data processing system, when executed by the processor, implements the following steps:
S10: obtain the observed values of each item of factor data to be processed within a first preset time interval;
S20: in a pre-established library of factor data prediction models, look up the factor data prediction model corresponding to each item of factor data to be processed, according to a predetermined mapping between factor data and factor data prediction models;
S30: obtain the predicted values of each item of factor data to be processed within the first preset time interval, according to the factor data prediction model corresponding to that factor data;
S40: input the obtained observed values and predicted values of each item of factor data to be processed into a predetermined data processing model to obtain corrected factor data.
In addition, to achieve the above purpose, the present application provides a financial data processing method, the method including the steps of:
S10: obtain the observed values of each item of factor data to be processed within a first preset time interval;
S20: in a pre-established library of factor data prediction models, look up the factor data prediction model corresponding to each item of factor data to be processed, according to a predetermined mapping between factor data and factor data prediction models;
S30: obtain the predicted values of each item of factor data to be processed within the first preset time interval, according to the factor data prediction model corresponding to that factor data;
S40: input the obtained observed values and predicted values of each item of factor data to be processed into a predetermined data processing model to obtain corrected factor data.
In addition, to achieve the above purpose, the present application provides a financial data processing system, the financial data processing system including:
a first acquisition module, configured to obtain the observed values of each item of factor data to be processed within a first preset time interval;
a query module, configured to look up, in a pre-established library of factor data prediction models, the factor data prediction model corresponding to each item of factor data to be processed, according to a predetermined mapping between factor data and factor data prediction models;
a second acquisition module, configured to obtain the predicted values of each item of factor data to be processed within the first preset time interval, according to the factor data prediction model corresponding to that factor data;
a data processing module, configured to input the obtained observed values and predicted values of each item of factor data to be processed into a predetermined data processing model to obtain corrected factor data.
In addition, to achieve the above purpose, the present application provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores a financial data processing system, and the financial data processing system can be executed by at least one processor to cause the at least one processor to perform the following steps:
S10: obtain the observed values of each item of factor data to be processed within a first preset time interval;
S20: in a pre-established library of factor data prediction models, look up the factor data prediction model corresponding to each item of factor data to be processed, according to a predetermined mapping between factor data and factor data prediction models;
S30: obtain the predicted values of each item of factor data to be processed within the first preset time interval, according to the factor data prediction model corresponding to that factor data;
S40: input the obtained observed values and predicted values of each item of factor data to be processed into a predetermined data processing model to obtain corrected factor data.
The present application determines the factor data prediction model corresponding to the factor data to be processed, calculates the predicted values of that factor data with the model, and inputs the observed values and predicted values of the factor data within the first preset time interval into a data processing model to obtain denoised factor data. Compared with the prior art, the present application takes into account that different factor data have different characteristics; using a different factor data prediction model for each makes the predicted values more accurate and closer to the real situation. In addition, denoising by the data processing model removes the noise from the observed values of the factor data, which improves the accuracy of the factor data and lays a good foundation for subsequent analysis of it.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from the structures they show without creative effort.
FIG. 1 is a schematic flowchart of a first embodiment of the financial data processing method of the present application;
FIG. 2 is a schematic flowchart of a second embodiment of the financial data processing method of the present application;
FIG. 3 is a schematic diagram of the operating environment of a first embodiment of the financial data processing system of the present application;
FIG. 4 is a program module diagram of the first embodiment of the financial data processing system of the present application;
FIG. 5 is a program module diagram of a second embodiment of the financial data processing system of the present application.
The realization of the purpose, the functional features, and the advantages of the present application are further described below with reference to the embodiments and the accompanying drawings.
Detailed description
The principles and features of the present application are described below with reference to the accompanying drawings. The examples given serve only to explain the present application and are not intended to limit its scope.
FIG. 1 is a schematic flowchart of the first embodiment of the financial data processing method of the present application.
In this embodiment, the method includes:
S10: obtain the observed values of each item of factor data to be processed within a first preset time interval;
Taking stock factor data as an example, the factor data may include the stock's gain for the day, its moving average price, its trading volume, and so on.
The user can set the first preset time interval as needed, for example from January 2, 2018 to January 22, 2018.
The frequency at which the observed values of the factor data to be processed within the first preset time interval are obtained can be set as needed: for example, in real time, at a fixed interval (for example, one day), or upon receiving an acquisition instruction issued by the user. Note that some factor data (for example, net profit growth or operating profit growth) has to be calculated. The calculation may be performed in advance and its result stored in a preset storage space, so that step S10 only needs to read the required factor data from that storage space. In some embodiments, step S10 may itself include the calculation of the factor data.
S20: in a pre-established library of factor data prediction models, look up the factor data prediction model corresponding to each item of factor data to be processed, according to a predetermined mapping between factor data and factor data prediction models;
A corresponding factor data prediction model is established in advance for each item of factor data. Because different factor data have different characteristics, the method used to build the corresponding prediction model also differs. Each established factor data prediction model is stored in the factor data prediction model library, and the mapping between factor data and factor data prediction models is saved at the same time (for example, as a mapping table that records which model corresponds to which factor data).
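The lookup in step S20 can be pictured as two dictionaries: a model library and a mapping table. The following is a minimal sketch only. The names `MODEL_LIBRARY`, `FACTOR_TO_MODEL`, `lookup_model`, the factor keys, and the two placeholder models are all hypothetical and do not come from the application, which does not specify how the library or the mapping table is stored.

```python
from typing import Callable, Dict

# Assumed interface: a prediction model maps the observation at time K-1
# to a predicted value for time K.
PredictionModel = Callable[[float], float]


def hold_last_value(z_prev: float) -> float:
    """Placeholder model: predict that the value stays unchanged."""
    return z_prev


def damped_value(z_prev: float) -> float:
    """Placeholder model: predict a slightly damped value (illustrative only)."""
    return 0.9 * z_prev


# Hypothetical model library, keyed by model name.
MODEL_LIBRARY: Dict[str, PredictionModel] = {
    "hold_last_value": hold_last_value,
    "damped_value": damped_value,
}

# Hypothetical mapping table: factor name -> model name.
FACTOR_TO_MODEL: Dict[str, str] = {
    "moving_average_price": "hold_last_value",
    "daily_gain": "damped_value",
}


def lookup_model(factor_name: str) -> PredictionModel:
    """Return the prediction model mapped to the given factor."""
    try:
        model_name = FACTOR_TO_MODEL[factor_name]
    except KeyError:
        raise KeyError(f"no prediction model mapped to factor {factor_name!r}") from None
    return MODEL_LIBRARY[model_name]
```

Keeping the mapping table separate from the library lets several factors share one model without duplicating it.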
S30: obtain the predicted values of each item of factor data to be processed within the first preset time interval, according to the factor data prediction model corresponding to that factor data;
For example, to obtain the predicted value of the factor data to be processed at time K, the observed value of that factor data at time K-1 is input into the factor data prediction model; the output of the model is the predicted value at time K.
S40: input the obtained observed values and predicted values of each item of factor data to be processed into a predetermined data processing model to obtain corrected factor data.
This embodiment determines the factor data prediction model corresponding to the factor data to be processed, calculates the predicted values of that factor data with the model, and inputs the observed values and predicted values of the factor data within the first preset time interval into a data processing model to obtain denoised factor data. Compared with the prior art, this embodiment takes into account that different factor data have different characteristics; using a different factor data prediction model for each makes the predicted values more accurate and closer to the real situation. In addition, denoising by the data processing model removes the noise from the observed values of the factor data, which improves the accuracy of the factor data and lays a good foundation for subsequent analysis of it.
Preferably, in this embodiment, step S40 may specifically be:
sorting the obtained observed values of the factor data to be processed in chronological order to generate an observation matrix of the factor data to be processed;
sorting the obtained predicted values of the factor data to be processed in chronological order to generate a prediction matrix of the factor data to be processed;
inputting the observation matrix and the prediction matrix of the factor data to be processed into a predetermined Kalman filter model to obtain the corrected factor data.
In this embodiment, the step of generating the observation matrix may also be performed immediately after step S10; likewise, the step of generating the prediction matrix may be performed immediately after step S30.
In this embodiment the data processing model is a Kalman filter model, which can efficiently remove noise from the factor data and bring the factor data closer to its true value.
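The application does not give the parameters or the state-space form of its Kalman filter model. The sketch below is therefore an assumption: a scalar Kalman update that treats the model's predicted value as the prior estimate and blends it with the observed value. The function name `kalman_correct`, the process noise `q`, the measurement noise `r`, and the initial variance are hypothetical.

```python
from typing import List, Tuple


def kalman_correct(observed: List[Tuple[int, float]],
                   predicted: List[Tuple[int, float]],
                   q: float = 1e-2,
                   r: float = 1e-1) -> List[float]:
    """Blend predicted and observed values with a scalar Kalman update.

    Each input is a list of (timestamp, value) pairs. q and r are assumed
    process and measurement noise variances, not values from the application.
    """
    # Sort both series chronologically, mirroring the observation and
    # prediction "matrices" described in the text.
    obs = [value for _, value in sorted(observed)]
    pred = [value for _, value in sorted(predicted)]
    if len(obs) != len(pred):
        raise ValueError("observed and predicted series must have equal length")

    p = 1.0  # assumed initial variance of the estimate
    corrected = []
    for z, x_prior in zip(obs, pred):
        p_prior = p + q                           # variance grows by process noise
        gain = p_prior / (p_prior + r)            # Kalman gain, between 0 and 1
        x_post = x_prior + gain * (z - x_prior)   # move the prior toward the observation
        p = (1.0 - gain) * p_prior                # variance after the update
        corrected.append(x_post)
    return corrected
```

With a large `r` the corrected value stays close to the prediction; with a small `r` it follows the observation. Choosing those variances is outside what the source describes.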
FIG. 2 is a schematic flowchart of the second embodiment of the financial data processing method of the present application.
In the second embodiment of the financial data processing method, which builds on the first embodiment, the method further includes, before step S20:
S50: classify each item of factor data according to a predetermined classification rule;
S60: establish the factor data prediction model corresponding to each item of factor data according to the type of that factor data;
S70: store the mapping between each established factor data prediction model and its corresponding factor data.
Preferably, in this embodiment, step S60 includes the following step:
when the type of an item of factor data is a first data type, the factor data prediction model corresponding to that factor data is established based on the following expression:
X(K+1) = Z(K)
where X(K+1) is the predicted value of the factor data at time K+1, and Z(K) is the observed value of the factor data at time K.
This method of establishing the prediction model for the first data type is suitable for low-frequency factor data, that is, factor data that changes slowly as the price changes.
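The expression X(K+1) = Z(K) amounts to shifting the observed series by one step. A minimal sketch, where the function name `persistence_forecast` is hypothetical:

```python
from typing import List


def persistence_forecast(observations: List[float]) -> List[float]:
    """Apply X(K+1) = Z(K) to a chronologically ordered series.

    The returned list holds predictions for times 1..n-1; there is no
    prediction for time 0 because no earlier observation exists.
    """
    return list(observations[:-1])
```

For the series `[1.0, 2.0, 3.0]` this returns `[1.0, 2.0]`, the predictions for the second and third samples.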
Preferably, in this embodiment, step S60 further includes the following steps:
when the type of an item of factor data is a second data type, the factor data prediction model corresponding to that factor data is established as follows:
collect the factor data within a second preset time interval;
preprocess the collected factor data (for example, normalize it);
build the factor data prediction model corresponding to the factor data from the preprocessed factor data, based on a long short-term memory neural network.
The step of building the factor data prediction model based on a long short-term memory (LSTM) neural network is specifically:
Based on cross-validation, the preprocessed factor data is divided into a training set, an evaluation set, and a test set (for example, 70% of the factor data as the training set, 10% as the evaluation set, and 20% as the test set). The training set is input into the LSTM neural network model, which is trained by a gradient descent method (for example, stochastic gradient descent). The evaluation set is used to validate the LSTM neural network model during training. The test set is input into the trained LSTM neural network model to verify it. When the trained LSTM neural network model satisfies a first preset verification condition (for example, the difference from the verification result is less than a preset threshold), training is complete, and the trained LSTM neural network model is set as the factor data prediction model for that factor data.
Note that the step of dividing the factor data into a training set, an evaluation set, and a test set based on cross-validation may be replaced by dividing the factor data into a training set and a test set only. The amount of factor data in the training, evaluation, and test sets can be set as needed and is not limited to the example above.
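The preprocessing and the 70/10/20 division described above can be sketched without any machine learning library. This sketch covers only data preparation, not LSTM training. The function names, the choice of min-max normalization, and the decision to split in sequence order are assumptions; the source names normalization only as an example and does not say how samples are assigned to each set.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_dataset(samples: Sequence[float],
                  train_pct: int = 70,
                  eval_pct: int = 10) -> Tuple[List[float], List[float], List[float]]:
    """Split samples in order into training, evaluation, and test sets.

    Percentages are integers to avoid floating-point rounding; the test set
    receives whatever remains (20% with the defaults).
    """
    n = len(samples)
    n_train = n * train_pct // 100
    n_eval = n * eval_pct // 100
    train = list(samples[:n_train])
    evaluation = list(samples[n_train:n_train + n_eval])
    test = list(samples[n_train + n_eval:])
    return train, evaluation, test
```

Setting `eval_pct=0` gives the two-way training and test division that the text mentions as an alternative.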
This method of establishing the prediction model for the second data type is suitable for high-frequency factor data, that is, factor data that changes rapidly as the price changes, because an LSTM neural network predicts high-frequency factor data more accurately.
Whether an item of factor data is low-frequency or high-frequency factor data can be determined as in the following examples:
Variance method:
take the factor data within a time interval and difference it with respect to time;
calculate the relative standard deviation of the differenced factor data as its change frequency value;
if the change frequency value of an item of factor data is greater than or equal to a preset standard change frequency value, the factor data is determined to be high-frequency factor data;
if the change frequency value of an item of factor data is less than the preset standard change frequency value, the factor data is determined to be low-frequency factor data;
where, taking stock factor data as an example, the preset standard change frequency value may be set to the change frequency value of the closing price.
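The variance method can be sketched as follows, taking "relative standard deviation" to mean the population standard deviation divided by the absolute mean of the differenced series. That reading, the function names, and the handling of a zero mean are assumptions.

```python
import statistics
from typing import List


def change_frequency(values: List[float]) -> float:
    """Relative standard deviation of the first differences of a series."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    mean = statistics.mean(diffs)
    if mean == 0:
        # The ratio is undefined here; the source does not cover this case.
        raise ValueError("relative standard deviation is undefined for zero mean")
    return statistics.pstdev(diffs) / abs(mean)


def classify_by_variance(factor_values: List[float],
                         closing_prices: List[float]) -> str:
    """Return 'high' or 'low' using the closing price as the reference."""
    threshold = change_frequency(closing_prices)
    return "high" if change_frequency(factor_values) >= threshold else "low"
```

Because the relative standard deviation is scale-free, the factor and the closing price can be compared even when they are measured in different units.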
Intersection method:
take the factor data within a time interval and difference it with respect to time to obtain a factor data curve, whose X axis is the sampling time and whose Y axis is the value of the differenced factor data;
within the time interval, count the number of intersections between the factor data curve and the mean line, where the mean is the average of the differenced factor data;
if the number of intersections between a factor data curve and the mean line is greater than or equal to a preset standard number of intersections, the factor data corresponding to that curve is determined to be high-frequency factor data;
if the number of intersections between a factor data curve and the mean line is less than the preset standard number of intersections, the factor data corresponding to that curve is determined to be low-frequency factor data;
where, taking stock factor data as an example, the preset standard number of intersections may be set to the number of intersections of the closing price curve obtained by differencing the closing price.
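The intersection count can be approximated on sampled data by counting sign changes of the differenced series around its mean. This is a sketch under that assumption: a sample lying exactly on the mean line is not counted as a crossing, which the source does not address. The function names are hypothetical.

```python
from typing import List


def count_mean_crossings(values: List[float]) -> int:
    """Count how often the differenced series crosses its own mean."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    mean = sum(diffs) / len(diffs)
    centered = [d - mean for d in diffs]
    crossings = 0
    for prev, curr in zip(centered, centered[1:]):
        # A strict sign change between neighbours means the curve crossed the mean line.
        if prev * curr < 0:
            crossings += 1
    return crossings


def classify_by_crossings(factor_values: List[float],
                          closing_prices: List[float]) -> str:
    """Return 'high' or 'low' using the closing price curve as the reference."""
    threshold = count_mean_crossings(closing_prices)
    return "high" if count_mean_crossings(factor_values) >= threshold else "low"
```

Both series should cover the same time interval with the same sampling rate, otherwise the crossing counts are not comparable.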
In addition, in this embodiment, steps S50, S60, and S70 may be replaced by the following steps:
collect the factor data within a third preset time interval;
preprocess the collected factor data;
divide the preprocessed factor data into a training set and a test set according to a preset rule (for example, in chronological order, 80% of the factor data as the training set and 20% as the test set);
train multiple prediction models (for example, a neural network model, a random forest model, a linear regression model, a logistic regression model, and so on) on the training set, verify the trained prediction models on the test set, and select one of them as the factor data prediction model for the factor data according to a predetermined selection rule.
The predetermined selection rule may be to select the prediction model whose output is closest to the verification result.
This method of establishing the factor data prediction model applies to all types of factor data and does not require the factor data to be classified. Multiple prediction models are trained on the same factor data at the same time, and the best one is chosen as the factor data prediction model for that factor data.
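The train, verify, and select procedure can be sketched as follows. The two candidates here are deliberately trivial stand-ins for the neural network, random forest, and regression models named in the text, and the selection rule is assumed to be the lowest mean absolute error on the test set. All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Assumed interface: a fitted predictor maps the previous observation to the next value.
Predictor = Callable[[float], float]
Fitter = Callable[[List[float]], Predictor]


def fit_persistence(train: List[float]) -> Predictor:
    """Stand-in candidate: predict that the next value equals the previous one."""
    return lambda z_prev: z_prev


def fit_training_mean(train: List[float]) -> Predictor:
    """Stand-in candidate: always predict the mean of the training set."""
    mean = sum(train) / len(train)
    return lambda z_prev: mean


def mean_absolute_error(predictor: Predictor, series: List[float]) -> float:
    """Average one-step-ahead error of a predictor over a series."""
    errors = [abs(predictor(prev) - actual)
              for prev, actual in zip(series, series[1:])]
    return sum(errors) / len(errors)


def select_model(series: List[float],
                 candidates: Dict[str, Fitter],
                 train_pct: int = 80) -> Tuple[str, Predictor]:
    """Fit every candidate on the training part and keep the best on the test part."""
    n_train = len(series) * train_pct // 100
    train, test = series[:n_train], series[n_train:]
    fitted = {name: fit(train) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: mean_absolute_error(fitted[name], test))
    return best, fitted[best]
```

The test part needs at least two samples so that a one-step-ahead error can be computed. On a steadily rising series the persistence candidate is selected.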
In addition, the present application also proposes a financial data processing system.
FIG. 3 is a schematic diagram of the operating environment of a preferred embodiment of the financial data processing system 10 of the present application.
In this embodiment, the financial data processing system 10 is installed and runs in the electronic device 1. The electronic device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a server. The electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a display 13. FIG. 3 shows only the electronic device 1 with components 11-13, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
In some embodiments the memory 11 may be an internal storage unit of the electronic device 1, such as its hard disk or internal memory. In other embodiments the memory 11 may be an external storage device of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 is used to store the application software installed on the electronic device 1 and various types of data, such as the program code of the financial data processing system 10. The memory 11 may also be used to temporarily store data that has been output or is about to be output.
In some embodiments the processor 12 may be a Central Processing Unit (CPU), a microprocessor, or another data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the financial data processing system 10.
In some embodiments the display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display 13 is used to display the information processed in the electronic device 1 and to display a visual user interface. The components 11-13 of the electronic device 1 communicate with one another via a system bus.
FIG. 4 is a program module diagram of a preferred embodiment of the financial data processing system 10 of the present application. In this embodiment, the financial data processing system 10 may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present application. For example, in FIG. 4 the financial data processing system 10 may be divided into a first acquisition module 101, a query module 102, a second acquisition module 103, and a data processing module 104. A module as referred to in the present application is a series of computer program instruction segments capable of performing a specific function, and is better suited than a program to describing how the financial data processing system 10 executes in the electronic device 1, where:
第一获取模块101,用于获取第一预设时间区间内各待处理因子数据的观测值;The first obtaining module 101 is configured to acquire an observation value of each to-be-processed factor data in the first preset time interval;
查询模块102,用于在预先建立的因子数据预测模型库中,根据预先确定的各因子数据与因子数据预测模型之间的映射关系,查询各所述待处理因子数据对应的因子数据预测模型;The query module 102 is configured to query, in a pre-established factor data prediction model library, a factor data prediction model corresponding to each of the to-be-processed factor data according to a mapping relationship between the predetermined factor data and the factor data prediction model;
第二获取模块103,用于根据各所述待处理因子数据对应的所述因子数据预测模型,获取所述第一预设时间区间内各所述待处理因子数据的预测值;a second obtaining module 103, configured to acquire, according to the factor data prediction model corresponding to each of the to-be-processed factor data, a predicted value of each of the to-be-processed factor data in the first preset time interval;
数据处理模块104,用于将获取的各所述待处理因子数据的观测值及各所述待处理因子数据的预测值输入预先确定的数据处理模型,以获得修正后的因子数据。The data processing module 104 is configured to input the acquired observation value of each of the to-be-processed factor data and the predicted value of each of the to-be-processed factor data into a predetermined data processing model to obtain the corrected factor data.
Taking stock factor data as an example, the factor data may include a stock's daily gain, moving average price, trading volume, and the like.
The user may set the first preset time interval as needed, for example, from January 2, 2018 to January 22, 2018.
The frequency at which the observed values of the to-be-processed factor data within the first preset time interval are acquired may be set as needed: for example, in real time, at a fixed time interval (for example, 1 day), or upon receiving an acquisition instruction issued by the user. Note that if some factor data (for example, net profit growth or operating profit growth) must be obtained by calculation, the calculation step may be performed in advance and its results stored in a preset storage space; when the step of acquiring the observed values is performed, the first acquisition module 101 then only needs to read the required factor data from the storage space. In some embodiments, the first acquisition module 101 may also perform the calculation step itself.
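The precompute-then-read arrangement for calculated factors can be sketched as follows. This is only an illustration: the function names, the in-memory dictionary standing in for the preset storage space, and the growth formula (current minus previous, divided by the absolute previous value) are assumptions, not details given in the application.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the "preset storage space":
# (factor name, period index) -> stored factor value.
factor_store: Dict[Tuple[str, int], float] = {}


def precompute_growth(name: str, raw_values: List[float]) -> None:
    """Compute period-over-period growth in advance and store each result."""
    for i in range(1, len(raw_values)):
        previous, current = raw_values[i - 1], raw_values[i]
        # Assumed definition of growth; the application does not specify one.
        # A zero previous value would need separate handling.
        factor_store[(name, i)] = (current - previous) / abs(previous)


def read_observations(name: str, start: int, end: int) -> List[float]:
    """Read stored values for an inclusive interval, as the first
    acquisition module would when it does not compute them itself."""
    return [factor_store[(name, i)] for i in range(start, end + 1)]


precompute_growth("net_profit_growth", [100.0, 110.0, 99.0])
observed = read_observations("net_profit_growth", 1, 2)
```

Separating the two functions mirrors the text: the calculation can run on its own schedule, while acquisition is reduced to a lookup.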
Before the query module 102 performs the query step, the financial data processing system 10 establishes in advance a corresponding factor data prediction model for each kind of factor data. Because the characteristics of each kind of factor data differ, the method used to establish its prediction model also differs. The established factor data prediction models are stored in the factor data prediction model library, and the mapping relationship between factor data and factor data prediction models is saved at the same time (for example, as a mapping table reflecting that relationship).
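One way to picture the model library and mapping table is as two dictionaries, with the query step being a two-stage lookup. The sketch below is hypothetical: the factor names, model identifiers, and the two placeholder models are invented for illustration and are not taken from the application.

```python
from typing import Callable, Dict

# A prediction model maps the observation at time K-1 to a prediction for time K.
PredictionModel = Callable[[float], float]

# Hypothetical model library: model identifier -> prediction model.
model_library: Dict[str, PredictionModel] = {
    "persistence": lambda z: z,    # next value equals the last observation
    "damped": lambda z: 0.9 * z,   # illustrative placeholder only
}

# Hypothetical mapping table: factor name -> model identifier.
mapping_table: Dict[str, str] = {
    "moving_average": "persistence",
    "daily_gain": "damped",
}


def query_model(factor_name: str) -> PredictionModel:
    """Look up the prediction model registered for a factor."""
    try:
        return model_library[mapping_table[factor_name]]
    except KeyError as exc:
        raise LookupError(
            f"no prediction model registered for {factor_name!r}"
        ) from exc


model = query_model("moving_average")
```

Keeping the mapping table separate from the library lets several factors share one model without duplicating it.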
As a specific example of how the second acquisition module 103 acquires the predicted values of the to-be-processed factor data within the first preset time interval: to obtain the predicted value of the to-be-processed factor data at time K, the observed value of that factor data at time K-1 is input into the factor data prediction model, and the output of the model is the predicted value at time K.
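The one-step-ahead scheme just described can be sketched as a loop over the observed series. The function name and the handling of the first time point (which has no K-1 observation, so no prediction is produced) are assumptions made for this example, as is the trivial model used to exercise it.

```python
from typing import Callable, List, Optional


def predict_interval(
    observed: List[float], model: Callable[[float], float]
) -> List[Optional[float]]:
    """Return predictions aligned with `observed`.

    Entry K is model(observed[K-1]); entry 0 is None because no earlier
    observation exists inside the interval.
    """
    if not observed:
        return []
    predictions: List[Optional[float]] = [None]
    for k in range(1, len(observed)):
        predictions.append(model(observed[k - 1]))
    return predictions


def persistence(z: float) -> float:
    """Trivial example model: repeat the last observation."""
    return z


predicted = predict_interval([10.0, 10.5, 10.2], persistence)
```

In practice the first entry could be filled from an observation just before the interval; the text does not say how this edge is treated.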
This embodiment determines the factor data prediction model corresponding to the to-be-processed factor data, calculates the predicted value of the to-be-processed factor data according to that model, and inputs the observed values and predicted values of the factor data within the first preset time interval into the data processing model to obtain denoised factor data. Compared with the prior art, this embodiment takes into account that different factor data have different characteristics; using a different factor data prediction model for each makes the predicted values more accurate and closer to the real situation. In addition, denoising by the data processing model removes the noise in the observed values, which improves the accuracy of the factor data and lays a good foundation for subsequent analysis of the factor data.
Preferably, in this embodiment, the data processing module 104 is further configured to:
sort the acquired observed values of each piece of to-be-processed factor data in chronological order to generate an observation matrix of the to-be-processed factor data;
sort the acquired predicted values of each piece of to-be-processed factor data in chronological order to generate a prediction matrix of the to-be-processed factor data;
input the observation matrix and the prediction matrix of the to-be-processed factor data into a predetermined Kalman filter model to obtain the corrected factor data.
In this embodiment, the step of generating the observation matrix of the to-be-processed factor data may alternatively be performed by the first acquisition module 101 immediately after it completes the step of acquiring the observed values of the to-be-processed factor data within the first preset time interval; likewise, the step of generating the prediction matrix of the to-be-processed factor data may alternatively be performed by the second acquisition module 103 immediately after it completes the step of acquiring the predicted values of the to-be-processed factor data within the first preset time interval.
In this embodiment, the data processing model is a Kalman filter model, which can efficiently remove noise from the factor data and bring the factor data closer to its true value.
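The application does not give the filter equations or noise parameters, so the following is only a minimal sketch: a standard scalar Kalman update for a single factor in which the model's prediction serves as the prior estimate at each step. The function name, the (timestamp, value) input format, and the values of `q`, `r`, and the initial error variance are all assumptions.

```python
from typing import List, Tuple


def kalman_correct(
    observed: List[Tuple[int, float]],
    predicted: List[Tuple[int, float]],
    q: float = 1e-2,  # assumed process-noise variance
    r: float = 1e-1,  # assumed measurement-noise variance
) -> List[float]:
    """Blend chronologically sorted predictions and observations."""
    # Sort both series by timestamp to form the observation and
    # prediction sequences (one-column "matrices" for a single factor).
    obs = [value for _, value in sorted(observed)]
    pred = [value for _, value in sorted(predicted)]
    if len(obs) != len(pred):
        raise ValueError("observed and predicted series must have equal length")

    p = 1.0  # assumed initial estimate-error variance
    corrected: List[float] = []
    for z, x_prior in zip(obs, pred):
        p_prior = p + q                   # propagate uncertainty
        gain = p_prior / (p_prior + r)    # Kalman gain, between 0 and 1
        corrected.append(x_prior + gain * (z - x_prior))
        p = (1.0 - gain) * p_prior        # updated uncertainty
    return corrected


corrected = kalman_correct(
    observed=[(2, 10.4), (1, 10.0), (3, 10.1)],  # deliberately out of order
    predicted=[(1, 10.0), (2, 10.0), (3, 10.4)],
)
```

Because the gain lies strictly between 0 and 1 when `r` is positive, each corrected value falls between the prediction and the observation; a larger `r` trusts the prediction more, a larger `q` trusts the observation more.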
FIG. 5 is a program module diagram of a second embodiment of the financial data processing system of the present application.
The second embodiment of the financial data processing system of the present application builds on the first embodiment: the financial data processing system 10 further includes a model building module 105, which is configured to:
classify each kind of factor data according to a predetermined classification rule; establish, according to the type of each kind of factor data, the factor data prediction model corresponding to that factor data; and store the mapping relationship between each established factor data prediction model and its corresponding factor data.
Preferably, in this embodiment, the model building module 105 is further configured to:
when the type of a piece of factor data is a first data type, establish the factor data prediction model corresponding to the factor data based on the following expression:
X(K+1)=Z(K)
where X(K+1) is the predicted value of the factor data at time K+1, and Z(K) is the observed value of the factor data at time K.
The above method of establishing the prediction model for factor data of the first data type is suitable for low-frequency factor data, that is, factor data that changes slowly as the price changes.
Preferably, in this embodiment, the model building module 105 is further configured to:
when the type of a piece of factor data is a second data type, establish the factor data prediction model corresponding to the factor data as follows:
collect the factor data within a second preset time interval;
preprocess the collected factor data (for example, by normalization);
construct, using the preprocessed factor data and based on a long short-term memory neural network, the factor data prediction model corresponding to the factor data.
The step of constructing the factor data prediction model based on a long short-term memory (LSTM) neural network is specifically as follows:
Based on cross-validation, the preprocessed factor data is divided into a training set, an evaluation set, and a test set (for example, 70% of the factor data as the training set, 10% as the evaluation set, and 20% as the test set). The training set is input into the LSTM neural network model, which is trained using a gradient descent method (for example, stochastic gradient descent). The evaluation set is used to validate the LSTM neural network model during training. The test set is then input into the trained LSTM neural network model to verify it; when the trained model satisfies a first preset verification condition (for example, the difference from the verification result is less than a preset threshold), training is complete, and the trained LSTM neural network model is set as the factor data prediction model for that factor data.
Note that the above step of dividing the factor data into a training set, an evaluation set, and a test set based on cross-validation may be replaced by dividing the factor data into only a training set and a test set based on cross-validation. The amounts of factor data in the training set, evaluation set, and test set may be set as needed and are not limited to the example above.
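The preprocessing and partitioning steps can be sketched without any deep-learning library. The example below assumes min-max normalization as the preprocessing and a single in-order 70/10/20 partition; the function names are hypothetical, the application's cross-validation procedure is not reproduced, and the LSTM training itself is omitted.

```python
from typing import List, Tuple


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values linearly into [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_dataset(
    values: List[float], train_pct: int = 70, eval_pct: int = 10
) -> Tuple[List[float], List[float], List[float]]:
    """Split in order into training, evaluation, and test sets.

    Integer percentages avoid floating-point rounding in the set sizes;
    whatever remains after the first two sets becomes the test set.
    """
    n_train = len(values) * train_pct // 100
    n_eval = len(values) * eval_pct // 100
    return (
        values[:n_train],
        values[n_train:n_train + n_eval],
        values[n_train + n_eval:],
    )


data = min_max_normalize([float(v) for v in range(10, 30)])  # 20 samples
train_set, eval_set, test_set = split_dataset(data)
```

For simplicity the whole series is normalized before splitting; computing the minimum and maximum on the training set only would avoid leaking information from the evaluation and test sets.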
The above method of establishing the prediction model for factor data of the second data type is suitable for high-frequency factor data, that is, factor data that changes rapidly as the price changes, because an LSTM neural network predicts time-series factor data more accurately.
In addition, in this embodiment, the model building module 105 is further configured to:
collect factor data within a third preset time interval;
preprocess the collected factor data;
divide the preprocessed factor data into a training set and a test set according to a preset rule (for example, in chronological order, with 80% of the factor data as the training set and 20% as the test set);
train a plurality of prediction models (for example, a neural network model, a random forest model, a linear regression model, a logistic regression model, and the like) using the training set, verify the trained prediction models using the test set, and select, according to a predetermined selection rule, one of the prediction models as the factor data prediction model for that factor data.
The predetermined selection rule may be to select the prediction model whose output is closest to the verification result as the factor data prediction model.
The above method of establishing a factor data prediction model is suitable for all kinds of factor data (this scheme does not require the factor data to be classified): a plurality of prediction models are trained on the same factor data, and the best-performing one is taken as the factor data prediction model for that factor data.
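The train-several-then-pick-one procedure can be sketched as below. To keep the example self-contained, three very simple one-step predictors stand in for the neural network, random forest, and regression models named in the text; these stand-ins, the function names, and the use of mean squared error on the test set as the selection rule are all assumptions.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[float], float]       # observation at K-1 -> prediction at K
Fitter = Callable[[List[float]], Model]


def fit_persistence(train: List[float]) -> Model:
    return lambda z: z                 # repeat the last observation


def fit_mean(train: List[float]) -> Model:
    mean = sum(train) / len(train)
    return lambda z: mean              # always predict the training mean


def fit_drift(train: List[float]) -> Model:
    drift = (train[-1] - train[0]) / (len(train) - 1)  # average step size
    return lambda z: z + drift


def select_model(
    series: List[float], candidates: Dict[str, Fitter], train_pct: int = 80
) -> Tuple[str, Dict[str, float]]:
    """Fit each candidate on the earliest data, score it on the rest."""
    n_train = len(series) * train_pct // 100
    train, test = series[:n_train], series[n_train:]
    inputs = [train[-1]] + test[:-1]   # K-1 observation for each test point
    errors: Dict[str, float] = {}
    for name, fit in candidates.items():
        model = fit(train)
        squared = [(model(z) - y) ** 2 for z, y in zip(inputs, test)]
        errors[name] = sum(squared) / len(test)
    return min(errors, key=errors.get), errors


series = [float(i) for i in range(10)]  # steadily rising example data
best_name, test_errors = select_model(
    series,
    {"persistence": fit_persistence, "mean": fit_mean, "drift": fit_drift},
)
```

Splitting in chronological order, as the text suggests, keeps the test set strictly later than the training set, so each candidate is scored on data it could not have seen.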
Further, the present application also provides a computer-readable storage medium storing a financial data processing system, the financial data processing system being executable by at least one processor to cause the at least one processor to perform the financial data processing method of any of the above embodiments.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structural transformation made using the contents of the specification and drawings of the present application under its inventive concept, or any direct or indirect application in other related technical fields, is included in the scope of patent protection of the present application.

Claims (20)

  1. An electronic device, characterized in that the electronic device comprises a memory and a processor, the memory storing a financial data processing system operable on the processor, and the financial data processing system, when executed by the processor, implementing the following steps:
    S10, acquiring an observed value of each piece of to-be-processed factor data within a first preset time interval;
    S20, querying, in a pre-established factor data prediction model library, the factor data prediction model corresponding to each piece of to-be-processed factor data according to a predetermined mapping relationship between factor data and factor data prediction models;
    S30, acquiring, according to the factor data prediction model corresponding to each piece of to-be-processed factor data, a predicted value of each piece of to-be-processed factor data within the first preset time interval;
    S40, inputting the acquired observed value and predicted value of each piece of to-be-processed factor data into a predetermined data processing model to obtain corrected factor data.
  2. The electronic device according to claim 1, characterized in that, before step S20, the processor is further configured to execute the financial data processing system to implement the following steps:
    S50, classifying each kind of factor data according to a predetermined classification rule;
    S60, establishing, according to the type of each kind of factor data, the factor data prediction model corresponding to that factor data;
    S70, storing the mapping relationship between each established factor data prediction model and its corresponding factor data.
  3. The electronic device according to claim 2, characterized in that step S60 comprises:
    when the type of a piece of factor data is a first data type, establishing the factor data prediction model corresponding to the factor data based on the following expression:
    X(K+1)=Z(K)
    where X(K+1) is the predicted value of the factor data at time K+1, and Z(K) is the observed value of the factor data at time K.
  4. The electronic device according to claim 2, characterized in that step S60 further comprises:
    when the type of a piece of factor data is a second data type, establishing the factor data prediction model corresponding to the factor data by a method comprising:
    collecting the factor data within a second preset time interval;
    preprocessing the collected factor data;
    constructing, using the preprocessed factor data and based on a long short-term memory neural network, the factor data prediction model corresponding to the factor data.
  5. The electronic device according to any one of claims 1 to 4, characterized in that step S40 comprises:
    sorting the acquired observed values of each piece of to-be-processed factor data in chronological order to generate an observation matrix of the to-be-processed factor data;
    sorting the acquired predicted values of each piece of to-be-processed factor data in chronological order to generate a prediction matrix of the to-be-processed factor data;
    inputting the observation matrix and the prediction matrix of the to-be-processed factor data into a predetermined Kalman filter model to obtain the corrected factor data.
  6. A financial data processing method, characterized in that the method comprises the steps of:
    S10, acquiring an observed value of each piece of to-be-processed factor data within a first preset time interval;
    S20, querying, in a pre-established factor data prediction model library, the factor data prediction model corresponding to each piece of to-be-processed factor data according to a predetermined mapping relationship between factor data and factor data prediction models;
    S30, acquiring, according to the factor data prediction model corresponding to each piece of to-be-processed factor data, a predicted value of each piece of to-be-processed factor data within the first preset time interval;
    S40, inputting the acquired observed value and predicted value of each piece of to-be-processed factor data into a predetermined data processing model to obtain corrected factor data.
  7. The financial data processing method according to claim 6, characterized in that, before step S20, the financial data processing method further comprises:
    S50, classifying each kind of factor data according to a predetermined classification rule;
    S60, establishing, according to the type of each kind of factor data, the factor data prediction model corresponding to that factor data;
    S70, storing the mapping relationship between each established factor data prediction model and its corresponding factor data.
  8. The financial data processing method according to claim 7, characterized in that step S60 comprises:
    when the type of a piece of factor data is a first data type, establishing the factor data prediction model corresponding to the factor data based on the following expression:
    X(K+1)=Z(K)
    where X(K+1) is the predicted value of the factor data at time K+1, and Z(K) is the observed value of the factor data at time K.
  9. The financial data processing method according to claim 7, characterized in that step S60 further comprises:
    when the type of a piece of factor data is a second data type, establishing the factor data prediction model corresponding to the factor data by a method comprising:
    collecting the factor data within a second preset time interval;
    preprocessing the collected factor data;
    constructing, using the preprocessed factor data and based on a long short-term memory neural network, the factor data prediction model corresponding to the factor data.
  10. The financial data processing method according to any one of claims 6 to 9, characterized in that step S40 comprises:
    sorting the acquired observed values of each piece of to-be-processed factor data in chronological order to generate an observation matrix of the to-be-processed factor data;
    sorting the acquired predicted values of each piece of to-be-processed factor data in chronological order to generate a prediction matrix of the to-be-processed factor data;
    inputting the observation matrix and the prediction matrix of the to-be-processed factor data into a predetermined Kalman filter model to obtain the corrected factor data.
  11. A financial data processing system, characterized in that the financial data processing system comprises:
    a first acquisition module, configured to acquire an observed value of each piece of to-be-processed factor data within a first preset time interval;
    a query module, configured to query, in a pre-established factor data prediction model library, the factor data prediction model corresponding to each piece of to-be-processed factor data according to a predetermined mapping relationship between factor data and factor data prediction models;
    a second acquisition module, configured to acquire, according to the factor data prediction model corresponding to each piece of to-be-processed factor data, a predicted value of each piece of to-be-processed factor data within the first preset time interval;
    a data processing module, configured to input the acquired observed value and predicted value of each piece of to-be-processed factor data into a predetermined data processing model to obtain corrected factor data.
  12. The financial data processing system according to claim 11, characterized in that the financial data processing system further comprises a model building module, the model building module being configured to:
    classify each kind of factor data according to a predetermined classification rule;
    establish, according to the type of each kind of factor data, the factor data prediction model corresponding to that factor data;
    store the mapping relationship between each established factor data prediction model and its corresponding factor data.
  13. The financial data processing system according to claim 12, characterized in that the model building module is further configured to:
    when the type of a piece of factor data is a first data type, establish the factor data prediction model corresponding to the factor data based on the following expression:
    X(K+1)=Z(K)
    where X(K+1) is the predicted value of the factor data at time K+1, and Z(K) is the observed value of the factor data at time K.
  14. The financial data processing system according to claim 12, characterized in that the model building module is further configured to:
    when the type of a piece of factor data is a second data type, establish the factor data prediction model corresponding to the factor data by a method comprising:
    collecting the factor data within a second preset time interval;
    preprocessing the collected factor data;
    constructing, using the preprocessed factor data and based on a long short-term memory neural network, the factor data prediction model corresponding to the factor data.
  15. The financial data processing system according to any one of claims 11 to 14, characterized in that the data processing module is specifically configured to:
    sort the acquired observed values of each piece of to-be-processed factor data in chronological order to generate an observation matrix of the to-be-processed factor data;
    sort the acquired predicted values of each piece of to-be-processed factor data in chronological order to generate a prediction matrix of the to-be-processed factor data;
    input the observation matrix and the prediction matrix of the to-be-processed factor data into a predetermined Kalman filter model to obtain the corrected factor data.
  16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a financial data processing system, the financial data processing system being executable by at least one processor to cause the at least one processor to perform the following steps:
    S10, acquiring an observed value of each piece of to-be-processed factor data within a first preset time interval;
    S20, querying, in a pre-established factor data prediction model library, the factor data prediction model corresponding to each piece of to-be-processed factor data according to a predetermined mapping relationship between factor data and factor data prediction models;
    S30, acquiring, according to the factor data prediction model corresponding to each piece of to-be-processed factor data, a predicted value of each piece of to-be-processed factor data within the first preset time interval;
    S40, inputting the acquired observed value and predicted value of each piece of to-be-processed factor data into a predetermined data processing model to obtain corrected factor data.
  17. The computer-readable storage medium according to claim 16, characterized in that, before step S20, the processor is further configured to execute the financial data processing system to implement the following steps:
    S50, classifying each kind of factor data according to a predetermined classification rule;
    S60, establishing, according to the type of each kind of factor data, the factor data prediction model corresponding to that factor data;
    S70, storing the mapping relationship between each established factor data prediction model and its corresponding factor data.
  18. The computer-readable storage medium according to claim 17, characterized in that step S60 comprises:
    when the type of a piece of factor data is a first data type, establishing the factor data prediction model corresponding to the factor data based on the following expression:
    X(K+1)=Z(K)
    where X(K+1) is the predicted value of the factor data at time K+1, and Z(K) is the observed value of the factor data at time K.
  19. The computer-readable storage medium according to claim 17, characterized in that step S60 further comprises:
    when the type of a piece of factor data is a second data type, establishing the factor data prediction model corresponding to the factor data by a method comprising:
    collecting the factor data within a second preset time interval;
    preprocessing the collected factor data;
    constructing, using the preprocessed factor data and based on a long short-term memory neural network, the factor data prediction model corresponding to the factor data.
  20. The computer-readable storage medium according to any one of claims 16 to 19, characterized in that step S40 comprises:
    sorting the acquired observed values of each piece of to-be-processed factor data in chronological order to generate an observation matrix of the to-be-processed factor data;
    sorting the acquired predicted values of each piece of to-be-processed factor data in chronological order to generate a prediction matrix of the to-be-processed factor data;
    inputting the observation matrix and the prediction matrix of the to-be-processed factor data into a predetermined Kalman filter model to obtain the corrected factor data.
PCT/CN2018/102226 2018-04-03 2018-08-24 Electronic device, financial data processing method and system, and computer-readable storage medium WO2019192136A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810298032.2 2018-04-03
CN201810298032.2A CN108734335A (en) 2018-04-03 2018-04-03 Electronic device, finance data processing method and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2019192136A1 true WO2019192136A1 (en) 2019-10-10

Family

ID=63941290

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/102226 WO2019192136A1 (en) 2018-04-03 2018-08-24 Electronic device, financial data processing method and system, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN108734335A (en)
WO (1) WO2019192136A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180053102A1 (en) * 2016-08-16 2018-02-22 Toyota Jidosha Kabushiki Kaisha Individualized Adaptation of Driver Action Prediction Models
CN107730087A (en) * 2017-09-20 2018-02-23 平安科技(深圳)有限公司 Forecast model training method, data monitoring method, device, equipment and medium
CN107798604A (en) * 2017-09-28 2018-03-13 平安科技(深圳)有限公司 Become a shareholder when selecting method and terminal device based on machine learning
CN107766888A (en) * 2017-10-24 2018-03-06 众安信息技术服务有限公司 Data processing method and device

Also Published As

Publication number Publication date
CN108734335A (en) 2018-11-02

Similar Documents

Publication Publication Date Title
WO2021184554A1 (en) Database exception monitoring method and device, computer device, and storage medium
US11694109B2 (en) Data processing apparatus for accessing shared memory in processing structured data for modifying a parameter vector data structure
US11636487B2 (en) Graph decomposition for fraudulent transaction analysis
US9940166B2 (en) Allocating field-programmable gate array (FPGA) resources
US20180330261A1 (en) Auto-selection of hierarchically-related near-term forecasting models
CN111125529A (en) Product matching method and device, computer equipment and storage medium
WO2021257395A1 (en) Systems and methods for machine learning model interpretation
JP2016099915A (en) Server for credit examination, system for credit examination, and program for credit examination
CN112182250A (en) Construction method of checking relation knowledge graph, and financial statement checking method and device
WO2019214142A1 (en) Electronic device, research report data-based prediction method, program, and computer storage medium
CN114090601B (en) Data screening method, device, equipment and storage medium
CN111078500A (en) Method and device for adjusting operation configuration parameters, computer equipment and storage medium
US20160063394A1 (en) Computing Device Classifier Improvement Through N-Dimensional Stratified Input Sampling
WO2018205391A1 (en) Method, system and apparatus for evaluating accuracy of information retrieval, and computer-readable storage medium
CN111861757A (en) Financing matching method, system, equipment and storage medium
WO2019192136A1 (en) Electronic device, financial data processing method and system, and computer-readable storage medium
CN113780675B (en) Consumption prediction method and device, storage medium and electronic equipment
US20230022253A1 (en) Fast and accurate prediction methods and systems based on analytical models
CN114944204A (en) Methods, apparatus, devices and media for managing molecular predictions
US9892462B1 (en) Heuristic model for improving the underwriting process
US20210182696A1 (en) Prediction of objective variable using models based on relevance of each model
EP3163463A1 (en) A correlation estimating device and the related method
WO2019209571A1 (en) Proactive data modeling
WO2019192135A1 (en) Electronic device, bond yield analysis method, system, and storage medium
CN116719519B (en) Generalized linear model training method, device, equipment and medium in banking field

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18913634; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 18913634; Country of ref document: EP; Kind code of ref document: A1)