WO2019062006A1

WO2019062006A1 - Time selection admission method based on machine learning, device and terminal equipment therefor

Info

Publication number: WO2019062006A1
Application number: PCT/CN2018/077242
Authority: WO
Inventors: 王健宗; 黄章成; 吴天博; 肖京
Original assignee: 平安科技（深圳）有限公司
Priority date: 2017-09-28
Filing date: 2018-02-26
Publication date: 2019-04-04
Also published as: CN107798604A

Abstract

The invention is applicable for the technical field of computers, and a time selection admission method on the basis of machine learning, a device and a terminal equipment therefor are provided. The method comprises the following steps: inputting preset index data of each stock into a preset stock picking model, and outputting a stock combination; independently obtaining feature data of each stock in the stock combination, wherein the feature data of the stock comprises the stock market transaction data of the stock or the technical index data of the stock; and inputting the feature data of each stock in the stock combination into a pre-trained long-and-short-term memory network, and outputting a price prediction result about each stock in the stock combination. By use of the method, the whole prediction process fully considers the behavior characteristics of a financial market, and a deviation between the prediction result and the subsequent practical price tendency of the stock is effectively reduced. A user can more reasonably carry out the investment behaviors of stock selection and time selection admission on the basis of the price prediction result, and the investment risk of the user is effectively lowered.

Description

Method, device and terminal device for timing acquisition based on machine learning

This application claims priority to Chinese Patent Application No. 201710899893.1, filed with the Chinese Patent Office on September 28, 2017 and entitled "Machine-based time-based stock-in method and terminal equipment", the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of computer technology, and in particular, to a method, device, and terminal device for timing-based stock purchase based on machine learning.

Background technique

Stock prices fluctuate in real time. In stock trading, stock picking and buying are often driven by a person's subjective judgment, or are triggered simply because a stock's price has fallen. Such stock picking is not based on a forecast of the stock's subsequent price trend, so it may carry a large investment risk. In order to build and adopt an appropriate portfolio strategy and achieve a more stable and rational investment method, the application of machine learning technology in the field of securities investment, especially in portfolio selection and in determining the timing of entering the market, has attracted extensive attention from researchers. Stock selection and timed purchase based on predictions of stock price fluctuations have been applied to the decision-making process of stock purchase behavior.

However, the above technology performs stock selection and timed purchase only from the perspective of machine learning. The forecasting process does not fully consider the behavior characteristics of financial markets, resulting in a large deviation between the forecast results and the subsequent actual price movements of the stocks.

technical problem

In view of this, the embodiments of the present application provide a machine learning based method, device, and terminal device for timing purchase, so as to solve the problem that the calculation process of existing machine learning based prediction models does not fully consider the behavior characteristics of the financial market, resulting in a large deviation between the forecast results of stock selection and timed purchase and the subsequent actual price movements of the stocks.

Technical solution

A first aspect of the embodiments of the present application provides a machine learning based method for timing purchase, comprising:

Inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock combination;

Acquiring feature data of each stock in the stock combination, where the feature data of the stock includes stock market transaction data of the stock or technical indicator data of the stock;

Pre-training a long short-term memory network, inputting the feature data of each stock in the stock combination into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock combination, so that the user can determine a timing purchase strategy based on the price prediction result.

A second aspect of the embodiments of the present application provides a machine learning based timing purchase device, comprising:

a stock selection unit for inputting preset index data of each stock into a preset stock selection model, and outputting a stock combination;

The obtaining unit is configured to respectively obtain feature data of each stock in the stock portfolio, wherein the feature data of the stock includes stock market trading data of the stock or technical indicator data of the stock;

The prediction unit is configured to pre-train the long short-term memory network, input the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and output the price prediction result of each stock in the stock portfolio, so that the user can determine a timing purchase strategy based on the price prediction result.

A third aspect of the embodiments of the present application provides a terminal device including a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer readable instructions:

A fourth aspect of an embodiment of the present application provides a computer readable storage medium storing computer readable instructions that, when executed by at least one processor, implement the following steps:

Beneficial effect

In the embodiments of the present application, a stock portfolio suitable for investment is selected based on the preset index data of each stock, and the feature data of each stock in the portfolio is then extracted from various data sources that affect stock fluctuations, so that the long short-term memory network can calculate the price prediction results of these stocks. The whole forecasting process fully considers the behavior characteristics of the financial market and effectively reduces the deviation between the forecast result and the subsequent actual price trend of the stock. As a result, the user can carry out stock selection and timed purchase more reasonably based on the price prediction result, thereby effectively reducing the user's investment risk.

DRAWINGS

FIG. 1 is a flowchart of an implementation of a machine learning based timing acquisition method provided by an embodiment of the present application;

FIG. 2 is a flowchart of a specific implementation of a machine learning based timing acquisition method S101 provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of a process of acquiring stock-related social media data provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of a process of acquiring stock-related news data provided by an embodiment of the present application;

FIG. 5 is a flowchart of a specific implementation of a machine learning based timing acquisition method S103 provided by an embodiment of the present application;

FIG. 6 is an operational diagram of a simple memory cell of the LSTM provided by the embodiment of the present application;

FIG. 7 is a structural block diagram of a machine learning based timing purchase device provided by an embodiment of the present application;

FIG. 8 is a schematic diagram of a terminal device according to an embodiment of the present application.

Embodiments of the invention

In the following description, for purposes of illustration rather than limitation, specific details are set forth in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the application.

In order to explain the technical solutions described in the present application, the following description will be made by way of specific embodiments.

FIG. 1 is a flowchart showing an implementation process of the machine learning based timing purchase method provided by an embodiment of the present application, which is described in detail as follows:

S101: input preset indicator data of each stock into a preset stock selection model, and output a stock combination.

In the embodiment of the present application, the preset indicator data of a stock may be data extracted from the three major financial statements of the listed company, including but not limited to valuation indicators, financial indicators, scale indicators, growth indicators, and technical indicators. Each type of indicator is a collection of multiple similar indicator data. Specifically, the valuation indicators include price-earnings ratio, price-to-book ratio, and market-to-acquisition rate; the financial indicators include return on equity, return on assets, and quick ratio; the scale indicators include minimum market value and ratio of market capitalization; the growth indicators include year-on-year growth of total assets, net asset growth, operating profit growth, and year-on-year growth of net assets; the technical indicators include 60-day average volume, relative strength index, and volatility. It should be noted that, to make it easier to feed the preset index data into the stock selection model, the value of each dimension in each type of index needs to be standardized after the preset index data has been extracted. In the embodiment of the present application, the standardization formula

$$\tilde{x} = \frac{x - \bar{X}}{\sigma}$$

can be used to standardize, where $\tilde{x}$ is the index value after standardization, $x$ is the original index value, $\bar{X}$ is the mean value of the index sequence, and $\sigma$ is the standard deviation of the index sequence.
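The standardization step can be illustrated with a minimal sketch. It assumes the population standard deviation is intended (the text does not say whether the population or sample form is used), and the function name `standardize` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / standard deviation of the sequence."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # A constant sequence carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical price-earnings ratios of three stocks.
pe_ratios = [10.0, 20.0, 30.0]
z_scores = standardize(pe_ratios)
print(z_scores)
```

After standardization, every indicator has zero mean and unit variance, so indicators measured in different units can be compared by the stock selection model.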

The traditional multi-factor stock selection model calculates the contribution of each factor and, according to a multi-factor comprehensive score, finally selects suitable stock selection factors. In the embodiment of the present application, a machine learning method is adopted: the preset indicator data is used as the factors, and the prediction target of the stock selection model is related to the total return over a predetermined time period. Here, the total return is divided into two classes, and the gap between the two classes of total return is large. Based on this prediction target, the importance of each factor (i.e., each type of preset indicator data) is calculated, and the importance is used as the basis for stock selection. In the embodiments of the present application, machine learning algorithms that can be employed include logistic regression, support vector machines, neural networks, and the like. Preferably, the support vector machine can be used as the machine learning algorithm.

Because a machine learning algorithm is used, the stock selection model does not simply calculate the contribution of each factor based on econometrics and statistics. Instead, it treats the different preset indicator data, that is, the valuation, financial, scale, growth, and technical indicators extracted in the previous step, as the factors used to select the set of stocks suitable for investment.

Figure 2 illustrates a specific implementation of S101 in detail:

S201: Set the M stocks whose full-year return in a preset year ranks in the top M as the first category, set the N stocks whose full-year return ranks in the bottom N as the second category, and train an initial stock selection model on them to obtain the preset stock selection model.

Generally, stocks whose returns are already known are used to train the stock selection model; for example, the full-year returns of stocks in the year preceding the current year can be used. As mentioned above, the stocks are divided into two categories according to the level of full-year return: the first category ranks in the top M and the second category ranks in the bottom N. By setting the values of M and N reasonably, the gap in full-year return between the first category and the second category is widened, so that a suitable stock selection model can be trained.

S202: Based on a decision tree algorithm, select the P types of preset indicator data that contribute most to the full-year return of the preset year.

As described above, each type of preset indicator is composed of several similar indicator data. Here, the decision tree algorithm is used to select, from all the preset indicators, the several types that contribute most to the full-year return as the basis for stock selection.

S203: Input the P types of preset index data of each stock into the preset stock selection model, and calculate a comprehensive score of each stock on the P types of preset index data.

For the selected P types of preset indicator data, the corresponding data of each stock is obtained and input into the preset stock selection model, which outputs the comprehensive score of each stock on the P types of preset indicator data.

S204: Output the stocks whose comprehensive scores rank in the top Q as the stock combination.

Wherein, the above M, N, P and Q are all positive integers.
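The factor selection in S202 can be sketched as follows. This is only an illustration under a stated assumption: the text does not specify the decision tree variant or the importance measure, so "contribution" is approximated here by the largest Gini impurity reduction of a single-threshold split (a depth-one tree). All names and data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(values, labels):
    """Largest Gini impurity reduction from one threshold split on a factor."""
    parent = gini(labels)
    best = 0.0
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        best = max(best, parent - weighted)
    return best


def top_factors(factor_table, labels, p):
    """Return the p factor names with the largest split gain."""
    gains = {name: best_split_gain(vals, labels) for name, vals in factor_table.items()}
    return sorted(gains, key=gains.get, reverse=True)[:p]


# Hypothetical standardized factor values for four stocks (1 = top group, 0 = bottom group).
labels = [1, 1, 0, 0]
factor_table = {
    "return_on_equity": [0.9, 0.8, 0.1, 0.2],
    "price_earnings": [0.5, 0.1, 0.5, 0.1],
}
selected = top_factors(factor_table, labels, p=1)
print(selected)
```

A full decision tree accumulates such gains over many splits; the depth-one version is used here only to keep the example short.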

For example, all the constituent stocks of the CSI 300 (Shanghai and Shenzhen 300) Index in 2016 are placed in the pool of candidate stocks. Based on the full-year returns of these constituents, the 60 stocks with the highest full-year returns form one category and are labeled 1, and the 60 stocks with the lowest full-year returns form the other category and are labeled 0. The model is trained on these 120 stocks with their various indicator data as factors, and the three most important types of indicator data are then selected as three factors based on the decision tree algorithm. Finally, on the first day of each forecast month, the comprehensive score of each stock on these three factors is calculated, and the 10 top-ranked stocks are selected as the stock portfolio.
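The labeling in S201 and the ranking in S203 and S204 can be sketched as below. The sketch assumes a plain weighted sum as a stand-in for the trained stock selection model (the text prefers a support vector machine); the stock codes, factor names, and weights are hypothetical.

```python
def label_by_return(annual_returns, m, n):
    """Label the top-m stocks 1 and the bottom-n stocks 0; other stocks are dropped."""
    ranked = sorted(annual_returns, key=annual_returns.get, reverse=True)
    labels = {code: 1 for code in ranked[:m]}
    labels.update({code: 0 for code in ranked[-n:]})
    return labels


def composite_score(factors, weights):
    """Weighted sum of the selected factors (stand-in for the trained model)."""
    return sum(weights[name] * factors[name] for name in weights)


def select_portfolio(stock_factors, weights, q):
    """Return the q stock codes with the highest composite score."""
    scores = {code: composite_score(f, weights) for code, f in stock_factors.items()}
    return sorted(scores, key=scores.get, reverse=True)[:q]


# Hypothetical full-year returns of five stocks.
annual_returns = {"A": 0.42, "B": 0.10, "C": -0.05, "D": -0.30, "E": 0.25}
labels = label_by_return(annual_returns, m=2, n=2)

# Hypothetical standardized factor values and weights.
weights = {"roe": 0.5, "pe": -0.2, "growth": 0.3}
stock_factors = {
    "A": {"roe": 1.0, "pe": 0.5, "growth": 0.2},
    "B": {"roe": -0.5, "pe": -1.0, "growth": 1.0},
    "C": {"roe": 0.0, "pe": 1.0, "growth": -1.0},
}
portfolio = select_portfolio(stock_factors, weights, q=2)
print(labels, portfolio)
```

In the example from the text, m and n would both be 60, three factors would be kept, and q would be 10.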

S102: Obtain feature data of each stock in the stock portfolio respectively, wherein the feature data of the stock includes stock market trading data of the stock or technical indicator data of the stock.

First, the feature data of the investment target is acquired. In the embodiment of the present application, the investment target is a stock, and the feature data of the stock is used to reflect the valuation, scale, growth, and financial status of the listed company. For the feature data of the stock, the original data source may include stock market transaction data of the stock, such as the opening price, the lowest price, the highest price, the closing price, the transaction amount, or the profit rate of the stock; or the original data source may include technical indicator data of the stock, such as the moving average convergence/divergence (MACD), on-balance volume (OBV), Bollinger Bands, psychological line (PSY), or triple exponential average (TRIX). Both the stock market transaction data and the technical indicator data can be obtained from financial securities programs through the application programming interface (API) of such programs. Because the original stock market transaction data and technical indicator data are already quantified, they can be used directly as feature data of the stock without feature extraction.

Preferably, in addition to stock market transaction data or technical indicator data, the feature data of the stock may also include social media data related to the stock, sourced from Web 2.0 user-generated content on social platforms and media platforms. FIG. 3 shows the process of acquiring stock-related social media data:

S301: Collect and store user generated content on a network social platform.

Specifically, a distributed web crawler may be used to collect user-generated content from the servers of each social network platform. Such user-generated content includes but is not limited to: content posted by a user on a blog or a microblog, comments posted by a user under a news item or a picture, posts or replies published by a user on a forum, and so on.

S302: Extract user-generated content that matches the search keyword, wherein the search keyword includes a stock name or a stock code.

As a specific implementation, when matching and extracting user-generated content, the stock name or stock code may be used as the search keyword, so as to capture the user-generated content that matches it. For example, if a user publishes a trend forecast for a stock on Weibo, the stock name and/or stock code is bound to appear in that content, so matching user-generated content can be obtained based on the stock name and/or stock code. Furthermore, based on the extracted user-generated content, statistics may be collected on the number of user-generated content items matching the stock, the number of followers of the social accounts of the users who published them, the account registration time of those users, the type of terminal from which the content was published, and so on.
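The keyword matching in S302 can be sketched as a simple substring filter. The record layout (`user`, `text`) and the sample posts are assumptions made for illustration; a production system would likely add tokenization and alias handling.

```python
def match_posts(posts, stock_name, stock_code):
    """Keep only posts whose text mentions the stock name or the stock code."""
    keywords = (stock_name, stock_code)
    return [post for post in posts if any(k in post["text"] for k in keywords)]


# Hypothetical crawled posts.
posts = [
    {"user": "u1", "text": "Ping An (601318) looks strong this week"},
    {"user": "u2", "text": "Nice weather today"},
    {"user": "u3", "text": "Thinking of selling 601318 tomorrow"},
]
matched = match_posts(posts, "Ping An", "601318")
print([post["user"] for post in matched])
```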

S303: Analyze the extracted user generated content to obtain social media data of the stock.

Illustratively, the stock feature data acquired based on the social media data may include the following types:

1. Emotional valence (sentimentValence):

Emotional valence reflects the emotional tendency toward a stock on users' social platforms. Illustratively, the emotional valence can be calculated by a logarithmic formula, where P is the number of user-generated content items expressing positive emotion and N is the number of user-generated content items expressing negative emotion. Behavioral finance research has found that fluctuations in investor sentiment can cause irrational fluctuations in stock prices. Therefore, the closer the emotional valence is to log(1) = 0, the smaller the emotional volatility of investors and the smaller the irrational fluctuations that the corresponding stock price may produce; conversely, the greater the emotional volatility of investors, the greater the irrational fluctuations that may occur in the price of the corresponding stock.

2. Attention heat:

Attention heat reflects the degree of attention a stock receives on users' social platforms. Behavioral finance has also found that investors' excessive attention to a stock causes irrational fluctuations in its price. In the embodiment of the present application, attention heat may be characterized by the number of user-generated content items: the more user-generated content related to the stock, the higher the attention heat and the more irrational fluctuations the corresponding stock price may produce; the less user-generated content related to the stock, the lower the attention heat and the fewer irrational fluctuations that may occur in the price of the corresponding stock.

3. Participating user influence:

Participating user influence refers to the influence, on the social networking platform and even on the entire Internet, of the users who publish user-generated content related to the stock. It is generally believed that more influential users are more likely to be stock market operators, while the true identity of less influential users is more likely to be amateur stock investors. From this point of view, the more influential a user is, the more likely that user's remarks are to cause irrational fluctuations in the price of the corresponding stock; conversely, the less influential the user, the less likely the remarks are to cause such fluctuations. Illustratively, participating user influence can be quantified from the number of accounts the user follows on the social networking platform, the number of accounts that follow the user, and the total number of user-generated content items published by the user; these are added, or weighted and added, to obtain the quantified participating user influence.

4. Participating user registration duration:

The participating user registration duration can be characterized by the ratio of new users to old users among those who publish user-generated content about the stock. By setting a registration duration threshold, users whose registration duration is below the threshold are classified as new users, and users whose registration duration is above the threshold are classified as old users. It follows that the higher the ratio of new users to old users, the smaller the quantified participating user registration duration, the more junior the user group publishing user-generated content about the stock, and the less likely their remarks are to cause irrational fluctuations in the price of the corresponding stock. The lower the ratio of new users to old users, the greater the quantified participating user registration duration, the more senior the user group publishing user-generated content about the stock, the more likely their true identity is to be mature stock market investors, and the more likely their remarks are to cause irrational fluctuations in the price of the corresponding stock.

5. Participating user publishing terminal:

The participating user publishing terminal can be characterized by the ratio of the mobile count to the PC count, where the mobile count is the number of user-generated content items published from mobile terminals and the PC count is the number published from PCs. It is generally believed that users who publish stock market discussion through a PC are more likely to be mature stock market investors, while users who publish through mobile terminals are more likely to be immature or junior stock traders. Accordingly, the higher the ratio of the mobile count to the PC count, the more junior the user group publishing user-generated content about the stock, and the less likely their remarks are to cause irrational fluctuations in the price of the corresponding stock. Conversely, the lower the ratio of the mobile count to the PC count, the more senior the user group, and the more likely their remarks are to cause irrational fluctuations in the price of the corresponding stock.

6. Average word count of user-generated content:

Here, it is considered that users whose individual stock comments have a high average word count are mostly professional stock investors, while users whose individual stock comments have a low average word count are mostly non-professional stock investors. Accordingly, the higher the average word count of the user-generated content, the more senior the user group participating in the stock discussion, and the more likely their remarks are to cause irrational fluctuations in the price of the corresponding stock; conversely, the lower the average word count, the more junior the user group participating in the stock discussion, and the less likely their remarks are to cause irrational fluctuations in the price of the corresponding stock.

Several kinds of stock feature data obtained from social media data are listed above, and different feature data correspond to different quantization methods. Therefore, by quantitatively analyzing the extracted user-generated content matching the stock, the stock feature data based on social media data can be obtained. When the feature data further includes social media data related to the stock, the feature data is used as model input to complete the machine learning based stock picking process, so that the influence of user-generated content on the stock price is taken into account in the stock picking process, making it possible to predict stock price movements more accurately.
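The six features above can be quantified along the following lines. This is a sketch under several assumptions that are not stated in the text: the valence formula is taken to be the smoothed log ratio log((1 + P) / (1 + N)), which equals log(1) = 0 when P = N; influence is an unweighted sum; ratios are counted per post rather than per unique user; and words are split on whitespace. All field names are hypothetical.

```python
import math


def sentiment_valence(positive, negative):
    """Assumed form: log((1 + P) / (1 + N)); equals log(1) = 0 when P == N."""
    return math.log((1 + positive) / (1 + negative))


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def social_features(posts, new_user_days=365):
    """Quantify the six social media features for the posts matched to one stock."""
    count = len(posts)
    positive = sum(1 for p in posts if p["sentiment"] > 0)
    negative = sum(1 for p in posts if p["sentiment"] < 0)
    new_users = sum(1 for p in posts if p["account_age_days"] < new_user_days)
    mobile = sum(1 for p in posts if p["terminal"] == "mobile")
    return {
        "valence": sentiment_valence(positive, negative),
        "attention_heat": count,
        # Unweighted sum of following, follower and post totals of each author.
        "influence": sum(p["following"] + p["followers"] + p["post_total"] for p in posts),
        "new_to_old_ratio": ratio(new_users, count - new_users),
        "mobile_to_pc_ratio": ratio(mobile, count - mobile),
        "avg_words": ratio(sum(len(p["text"].split()) for p in posts), count),
    }


# Hypothetical matched posts for one stock.
posts = [
    {"sentiment": 1, "account_age_days": 30, "terminal": "mobile",
     "following": 10, "followers": 5, "post_total": 20, "text": "buy now"},
    {"sentiment": -1, "account_age_days": 2000, "terminal": "pc",
     "following": 100, "followers": 5000, "post_total": 900,
     "text": "valuation looks stretched after earnings"},
]
features = social_features(posts)
print(features)
```

For Chinese-language posts, the whitespace split would need to be replaced by character counting or word segmentation.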

Preferably, in addition to stock market transaction data or technical indicator data, the feature data of the stock may also include news data related to the stock, which may be derived from mainstream news clients or news websites. FIG. 4 shows the process of acquiring stock-related news data:

S401: Collect news data on a news client or a news website and store it.

Specifically, distributed web crawlers can be used to collect news data from the servers of various news clients or news websites. Preferably, given the large amount of data published on a news client or news website, crawling can be restricted to the financial section or the stock market section to improve the collection efficiency of such news data.

S402: Extracting news data that matches the search keyword, wherein the search keyword includes a person name or a company name associated with the listed company.

The main purpose of capturing news data from news clients or news websites is to capture news related to listed companies, since such news can reflect the price fluctuation of the listed company's stock over a certain period in the future. Therefore, the company name of the listed company, or the name of a person associated with the listed company such as the person in charge, is used as the search keyword, so that news data related to the stock can be extracted.

S403: Analyze the extracted news data to obtain the news data of the stock.

Exemplarily, for the stock feature data acquired from news data, reference may be made to the types of stock feature data acquired from social media data above, for example, the emotional valence related to the stock in the news data, the attention heat, the influence of the news release, the length of the news data, and so on. Different feature data correspond to different quantization methods. Therefore, by quantitatively analyzing the extracted news data matching the stock, the stock feature data based on news data can be obtained. When the feature data further includes news data related to the stock, the feature data is used as model input to complete the machine learning based stock picking process, so that the influence of listed company news on the stock price is taken into account in the stock selection process, enabling more accurate prediction of stock price movements.

In the embodiment of the present application, after the relevant data is collected and before the feature data of the stock is extracted from it, the data may be denoised and missing values may be filled in, so as to further improve the efficiency of acquiring the feature data.

S103: Pre-train the long short-term memory (LSTM) network, input the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and output a price prediction result for each stock in the stock portfolio, so that the user can determine a timing purchase strategy based on the price prediction result.

After the stock portfolio is determined, the timing of entering the market needs to be determined by predicting the trend of the stock price in a future time window. For example, if one day is the time window, the prediction task is to predict, based on the previous day's data, whether the price rises or falls on the following day, or to predict the following day's closing price. Building on the portfolio selected by the stock selection model, the embodiment of the present application combines the feature extraction step with a deep learning algorithm to obtain the prediction and the predicted return.
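The one-day prediction task described above can be turned into training samples as follows. This is a minimal sketch that assumes closing prices are the only feature; the function name `make_samples` and the `window` parameter are hypothetical.

```python
def make_samples(closes, window=1):
    """Pair each window of past closes with an up (1) or down (0) label for the next day."""
    samples = []
    for t in range(window, len(closes)):
        features = closes[t - window:t]
        label = 1 if closes[t] > closes[t - 1] else 0
        samples.append((features, label))
    return samples


# Hypothetical daily closing prices.
closes = [10.0, 10.5, 10.2, 10.8]
samples = make_samples(closes, window=1)
print(samples)
```

For the closing-price variant of the task, the label would simply be `closes[t]` instead of the up/down flag.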

In an LSTM, the conventional neuron, that is, a unit that applies a sigmoid activation to a linear combination of its inputs, is replaced by a memory cell. Each memory cell is associated with an input gate, an output gate, and an internal state that is fed back into itself without interference. In the embodiment of the present application, the feature data of each stock in the stock combination needs to be input into the pre-trained LSTM. LSTM is a variant of the recurrent neural network (RNN), characterized by gate ("valve") nodes added to each layer on top of the RNN structure. There are three types of gates: the forget gate, the input gate, and the output gate. These gates can open or close, and determine whether the memory state of the LSTM at the layer output reaches a threshold and is therefore added to the calculation of the current layer. A gate node applies the sigmoid function to the memory state of the network as its input; if the output reaches the threshold, the gate output is multiplied by the calculation result of the current layer and used as the input of the next layer; if the threshold is not reached, the output is forgotten. The weights of each layer, including those of the gate nodes, are updated in each backpropagation training pass of the model.

LSTM training can be optimized by adjusting many parameters, such as the activation function, the number of LSTM layers, and the dimensions of the input and output variables. To minimize the training error, gradient descent, for example with backpropagation through time (BPTT), can be used to adjust each weight according to the error. In a standard RNN, the error gradient vanishes exponentially with the length of time between events. With LSTM blocks, when the error is propagated back from the output, it is retained in the memory of the block and is fed back to each of the gates until they learn to cut the value off. Therefore, regular backpropagation is an effective way to train an LSTM block to remember long-term values.
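The gate mechanism can be illustrated with one time step of a scalar LSTM cell. The sketch follows the widely used textbook formulation, in which each gate outputs a value between 0 and 1 that scales a signal, rather than the hard threshold described above; it is not taken from the application, and the weight names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM cell; returns the new hidden and cell state."""
    forget = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])      # how much old memory to keep
    input_ = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])      # how much new content to write
    output = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])      # how much memory to expose
    candidate = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # proposed new content
    c = forget * c_prev + input_ * candidate
    h = output * math.tanh(c)
    return h, c


# With all weights zero, every gate outputs 0.5 and the candidate is 0.
zero_weights = {name: 0.0 for name in
                ("wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=zero_weights)
print(h, c)
```

In practice the scalars become vectors and matrices, and the weights are learned by backpropagation through time.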

Due to various accidental factors in the financial market, financial data, and financial time series in particular, contain noise. This noise seriously affects the analysis and processing results of financial data, so the feature data of each stock in the stock portfolio obtained in S102 needs to be denoised before it is input to the LSTM. However, since financial time series are non-stationary, non-linear, and highly noisy, existing denoising methods are often unsuitable. Therefore, as an embodiment of the present application, wavelet denoising is used to filter the original time series, and the various hidden periodic and nonlinear components of the time series are extracted and separated by the denoising process. The multi-scale characteristics of the wavelet decomposition sequence and the decomposed data are thereby put to full use in the calculation process of the LSTM neural network model.

As shown in Figure 5, the specific implementation of S103 is as follows:

S1031: Perform feature denoising processing on feature data of each stock in the stock combination.

S1032: Input feature data of each stock in the stock combination after denoising processing to a long-short-term memory network that completes pre-training, and output a price prediction result about each stock in the stock portfolio.
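Steps S1031 and S1032 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the Haar filters follow the wavelet basis named in this application, but the soft-threshold rule, the `threshold` and `levels` values, and the `model` callable that stands in for the pre-trained LSTM are all hypothetical. The series length is assumed to be divisible by 2 to the power `levels`.

```python
import math

ROOT2 = math.sqrt(2.0)

def haar_dwt(x):
    """One Haar analysis step: low-pass (approximation) and high-pass (detail)."""
    approx = [(x[i] + x[i + 1]) / ROOT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / ROOT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / ROOT2, (a - d) / ROOT2])
    return x

def soft_threshold(v, thr):
    """Shrink a coefficient toward zero by thr (assumed denoising rule)."""
    return math.copysign(max(abs(v) - thr, 0.0), v)

def denoise(series, levels=2, threshold=0.5):
    """S1031: decompose, shrink the detail coefficients, reconstruct."""
    approx, details = list(series), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append([soft_threshold(v, threshold) for v in d])
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

def predict_prices(features_by_stock, model, **denoise_args):
    """S1032: feed each denoised series to `model` (stand-in for the LSTM)."""
    return {code: model(denoise(series, **denoise_args))
            for code, series in features_by_stock.items()}

# Hypothetical usage: the "model" here simply returns the last denoised value.
print(predict_prices({"000001": [2.0] * 8}, model=lambda s: s[-1]))
```

Setting `threshold` to zero leaves the series unchanged, which is a convenient check that the decomposition and reconstruction are exact inverses.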

Specifically, the embodiment of the present application adopts the Haar function as the wavelet basis function, which not only decomposes the time series effectively in both the time domain and the frequency domain, but also significantly reduces the processing time of the data in the LSTM. In the embodiment of the present application, the wavelet function of the continuous wavelet transform, with time t as the variable, is defined as:

$$\varphi_{a,\tau}(t) = \frac{1}{\sqrt{a}}\,\varphi\!\left(\frac{t-\tau}{a}\right)$$

where a is the scale factor, τ is the translation factor, and φ(t) is a reference wavelet that satisfies the wavelet admissibility condition. The wavelet admissibility condition is defined as:

$$C_{\varphi} = \int_{0}^{\infty} \frac{|\Phi(\omega)|^{2}}{\omega}\, d\omega < \infty$$

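where Φ(ω) is a function of the frequency ω and is the Fourier transform of φ(t). If x(t) is a square-integrable function (x(t) ∈ L²(R)), the continuous wavelet transform of x(t) with wavelet φ can be defined as:

$$CWT_x(a,\tau) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \varphi^{*}\!\left(\frac{t-\tau}{a}\right) dt$$

where φ*(t) is the complex conjugate of φ(t). The inverse of the wavelet transform can then be defined, with $C_{\varphi}$ denoting the admissibility constant, as:

$$x(t) = \frac{1}{C_{\varphi}} \int_{0}^{\infty} \frac{da}{a^{2}} \int_{-\infty}^{\infty} CWT_x(a,\tau)\, \frac{1}{\sqrt{a}}\, \varphi\!\left(\frac{t-\tau}{a}\right) d\tau$$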

Because the wavelet bases of the continuous wavelet transform described above are not orthogonal, the information obtained after transforming the signal is redundant. Therefore, the embodiment of the present application uses the Mallat algorithm, a signal decomposition algorithm built on an orthogonal wavelet basis. The algorithm implements the discrete wavelet transform by filtering the time series with a high-pass filter and a low-pass filter; specifically, the father wavelet φ(t) describes the low-frequency components of the time series, and the mother wavelet ψ(t) describes its high-frequency components.

The father wavelet φ(t) and the mother wavelet ψ(t) integrate to 1 and 0, respectively:

$$\int \varphi(t)\,dt = 1, \qquad \int \psi(t)\,dt = 0$$

The mother wavelet and the father wavelet at level j can be written as:

$$\psi_{j,k}(t) = 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right), \qquad \varphi_{j,k}(t) = 2^{-j/2}\,\varphi\!\left(2^{-j}t - k\right)$$
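The properties of the father and mother wavelets can be checked numerically for the Haar basis adopted in this application. The sketch below is illustrative only: it assumes the standard Haar father wavelet (1 on [0, 1)) and mother wavelet (+1 on [0, 0.5), -1 on [0.5, 1)), assumes the usual level-j form 2^(-j/2) w(2^(-j) t - k), and uses a simple midpoint sum; the helper names are hypothetical.

```python
def father(t):
    """Haar father (scaling) wavelet: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0

def mother(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def at_level(wavelet, j, k):
    """Level-j, shift-k version: 2^(-j/2) * wavelet(2^(-j) * t - k)."""
    scale = 2.0 ** (-j / 2.0)
    return lambda t: scale * wavelet(2.0 ** (-j) * t - k)

def integrate(f, lo=0.0, hi=8.0, n=8000):
    """Midpoint Riemann sum of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

psi_2_0 = at_level(mother, 2, 0)
psi_1_0 = at_level(mother, 1, 0)
print("integral of father wavelet:", integrate(father))
print("integral of mother wavelet:", integrate(mother))
print("energy of psi_{2,0}:", integrate(lambda t: psi_2_0(t) ** 2))
print("inner product of psi_{2,0} and psi_{1,0}:",
      integrate(lambda t: psi_2_0(t) * psi_1_0(t)))
```

The last two quantities illustrate why the dilated and shifted Haar wavelets form an orthonormal family, which is what removes the redundancy of the continuous transform.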

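With the multiresolution indices k ∈ {0, 1, 2, ...} and j ∈ {0, 1, 2, ..., J}, the father wavelet and the mother wavelet can reconstruct the financial time series. The orthogonal wavelet series approximation of the time series x(t) is defined as:

$$x(t) = \sum_k s_{J,k}\,\varphi_{J,k}(t) + \sum_k d_{J,k}\,\psi_{J,k}(t) + \sum_k d_{J-1,k}\,\psi_{J-1,k}(t) + \dots + \sum_k d_{1,k}\,\psi_{1,k}(t)$$

where the expansion coefficients $s_{J,k}$ and $d_{j,k}$ are given by:

$$s_{J,k} = \int \varphi_{J,k}(t)\,x(t)\,dt$$

$$d_{j,k} = \int \psi_{j,k}(t)\,x(t)\,dt, \qquad j = 1, 2, \dots, J$$

The multiscale approximations of the time series x(t) are:

$$S_J(t) = \sum_k s_{J,k}\,\varphi_{J,k}(t)$$

$$D_j(t) = \sum_k d_{j,k}\,\psi_{j,k}(t)$$

The orthogonal wavelet series approximation can therefore be written in the simplified form:

$$x(t) = S_J(t) + D_J(t) + D_{J-1}(t) + \dots + D_1(t)$$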

where $S_J(t)$ is the coarsest approximation of the input time series x(t), and the multiresolution decomposition of x(t) is the sequence $\{S_J(t), D_J(t), D_{J-1}(t), \dots, D_1(t)\}$. When the financial time series is very rough, the discrete wavelet transform can be applied repeatedly to reduce the risk in the process.
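The multiresolution decomposition can be sketched with Haar filters as follows. This is an illustrative sketch rather than the implementation of the embodiment: it assumes a discrete series whose length is divisible by 2 to the power J, and the function names are hypothetical. Each component is obtained by reconstructing one set of coefficients with all others set to zero, so the components sum back to the original series.

```python
import math

SQRT2 = math.sqrt(2.0)

def analysis_step(x):
    """One Mallat step with Haar filters: low-pass and high-pass, downsampled by 2."""
    s = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return s, d

def synthesis_step(s, d):
    """Inverse of analysis_step."""
    x = []
    for a, b in zip(s, d):
        x.extend([(a + b) / SQRT2, (a - b) / SQRT2])
    return x

def to_time_domain(coeffs, level, is_detail):
    """Reconstruct one coefficient set back to full length, others set to zero."""
    zeros = [0.0] * len(coeffs)
    cur = synthesis_step(zeros, coeffs) if is_detail else synthesis_step(coeffs, zeros)
    for _ in range(level - 1):
        cur = synthesis_step(cur, [0.0] * len(cur))
    return cur

def multiresolution(x, J):
    """Return [S_J, D_J, D_{J-1}, ..., D_1], each as long as x."""
    s, details = list(x), []
    for _ in range(J):
        s, d = analysis_step(s)
        details.append(d)  # details[j - 1] holds the level-j detail coefficients
    components = [to_time_domain(s, J, False)]
    for j in range(J, 0, -1):
        components.append(to_time_domain(details[j - 1], j, True))
    return components

# Hypothetical sample series, used only to exercise the functions.
sample = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 4.0, 8.0]
parts = multiresolution(sample, 3)
rebuilt = [sum(column) for column in zip(*parts)]
print(parts[0])   # coarsest approximation S_J
print(rebuilt)    # sum of all components
```

For a series of length 8 and J = 3, the coarsest approximation reduces to the mean of the series repeated at every position, and the detail components carry the remaining structure.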

After the feature data of each stock in the stock combination has been denoised, it is input to the LSTM, which outputs the price prediction result for each stock in the stock combination. In the LSTM, each neuron is a memory cell with an input gate, a forget gate and an output gate. One key to the LSTM model is its forget gate, which controls the convergence of the gradients during training while preserving long-term memory. Figure 6 shows the operation of a simple LSTM memory cell. Referring to Figure 6, the main mathematical symbols involved in a simple memory cell are as follows:

1. $x_t$ is the input vector to the memory cell at time t;

2. $W_i$, $W_f$, $W_c$, $W_o$, $U_i$, $U_f$, $U_c$, $U_o$ and $V_o$ are the network weight matrices;

3. $b_i$, $b_f$, $b_c$ and $b_o$ are the network bias vectors;

4. $h_t$ is the value of the memory cell at time t;

5. $i_t$ (the input gate shown in Figure 6) and $\tilde{C}_t$ are the input gate and the candidate state of the memory cell at time t, calculated as:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$

$$\tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$

6. $f_t$ (the forget gate shown in Figure 6) and $C_t$ are the forget gate and the new state of the memory cell at time t, calculated as:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$

$$C_t = i_t * \tilde{C}_t + f_t * C_{t-1}$$

7. $o_t$ (the output gate shown in Figure 6) and $h_t$ are the output gate and the value of the memory cell at time t, calculated as:

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + V_o C_t + b_o)$$

$$h_t = o_t * \tanh(C_t)$$
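One time step of such a memory cell can be sketched in plain Python. The sketch is illustrative and makes assumptions: the candidate state and the state update take the standard LSTM forms tanh(W_c x_t + U_c h_{t-1} + b_c) and i_t * candidate + f_t * C_{t-1}, products between gates and states are element-wise, and the parameter dictionary `p` and the helper `zero_params` are hypothetical names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, x, U, h, b):
    """Return W x + U h + b, with matrices given as lists of rows."""
    return [sum(w * xi for w, xi in zip(W_row, x))
            + sum(u * hi for u, hi in zip(U_row, h)) + bi
            for W_row, U_row, bi in zip(W, U, b)]

def lstm_step(x_t, h_prev, C_prev, p):
    """Compute (h_t, C_t) from x_t, h_{t-1} and C_{t-1}."""
    i_t = [sigmoid(z) for z in affine(p["W_i"], x_t, p["U_i"], h_prev, p["b_i"])]
    C_cand = [math.tanh(z) for z in affine(p["W_c"], x_t, p["U_c"], h_prev, p["b_c"])]
    f_t = [sigmoid(z) for z in affine(p["W_f"], x_t, p["U_f"], h_prev, p["b_f"])]
    # New cell state: gated candidate plus gated previous state.
    C_t = [i * c + f * cp for i, c, f, cp in zip(i_t, C_cand, f_t, C_prev)]
    # The output gate also sees the new cell state through V_o.
    pre_o = affine(p["W_o"], x_t, p["U_o"], h_prev, p["b_o"])
    peep = [sum(v * c for v, c in zip(V_row, C_t)) for V_row in p["V_o"]]
    o_t = [sigmoid(a + b) for a, b in zip(pre_o, peep)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, C_t)]
    return h_t, C_t

def zero_params(n_in, n_hidden):
    """All-zero parameters, useful only for checking the data flow."""
    p = {}
    for g in "ifco":
        p["W_" + g] = [[0.0] * n_in for _ in range(n_hidden)]
        p["U_" + g] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p["b_" + g] = [0.0] * n_hidden
    p["V_o"] = [[0.0] * n_hidden for _ in range(n_hidden)]
    return p

h_t, C_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], zero_params(1, 2))
print(h_t, C_t)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate state to 0, so the new cell state is half the previous one; this makes the role of the forget gate easy to see.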

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The order of execution of each process should be determined by its function and internal logic, and should not be construed as limiting the implementation process of the embodiments of the present application.

Corresponding to the machine learning-based timing share-purchase method described in the above embodiments, FIG. 7 is a structural block diagram of a machine learning-based timing share-purchase device provided by an embodiment of the present application. For convenience of description, only the parts related to the embodiments of the present application are shown.

Referring to Figure 7, the apparatus includes:

The stock selection unit 71: inputs preset indicator data of each stock into a preset stock selection model, and outputs a stock combination.

The obtaining unit 72: respectively obtains feature data of each stock in the stock combination, wherein the feature data of the stock includes stock market trading data of the stock or technical indicator data of the stock.

The prediction unit 73: pre-trains the long short-term memory network, inputs the feature data of each stock in the stock combination into the pre-trained long short-term memory network, and outputs the price prediction result of each stock in the stock combination, so that the user can determine a timing share-purchase strategy based on the price prediction result.

Optionally, the feature data further includes social media data related to the stock, and the obtaining unit 72 includes:

The first collection sub-unit: collects and stores user-generated content on the network social platform.

The first extraction subunit: extracts user generated content matching the retrieval keyword, wherein the retrieval keyword includes a stock name or a stock code of each stock in the stock combination.

The first analysis subunit: analyzes the extracted user-generated content, and respectively obtains social media data of each stock in the stock combination.
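The extraction and analysis sub-units can be illustrated with a small sketch. It is only an assumption-laden example: the posts, stock names and stock codes are hypothetical, matching is a plain case-insensitive substring test, and the "analysis" is reduced to counting mentions per stock, whereas the embodiment does not specify the analysis method.

```python
def extract_matching(posts, keywords):
    """Keep the posts that mention any search keyword (stock name or stock code)."""
    keys = [k.lower() for k in keywords]
    return [p for p in posts if any(k in p.lower() for k in keys)]

def mention_counts(posts, stocks):
    """Count matching posts per stock; `stocks` maps stock code -> stock name."""
    return {code: len(extract_matching(posts, [code, name]))
            for code, name in stocks.items()}

# Hypothetical user-generated content and stock identifiers.
posts = [
    "Example Bank (600000) reported strong results",
    "Thinking about buying 600001 next week",
    "Weather is nice today",
    "example bank looks overvalued",
]
stocks = {"600000": "Example Bank", "600001": "Sample Steel"}
print(mention_counts(posts, stocks))
```

In practice the per-stock values would be replaced by whatever social media features the analysis sub-unit derives, such as sentiment scores.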

Optionally, the feature data further includes message data related to the stock, and the obtaining unit 72 includes:

The first collection sub-unit: collects and stores news data on a news client or a news website.

The first extracting subunit: extracting news data matching the search keyword, wherein the search keyword includes a person name or a company name related to the listed company corresponding to each stock in the stock portfolio.

The first analysis subunit: analyzes the extracted news data to obtain message data of each stock in the stock portfolio.

Optionally, the stock selection unit 71 includes:

Training sub-unit: sets the M stocks whose full-year return in a preset year ranks in the top M as the first category, sets the N stocks whose full-year return in the preset year ranks in the bottom N as the second category, and trains an initial stock selection model to obtain the preset stock selection model;

Selection sub-unit: based on a decision tree algorithm, selects the P classes of preset indicator data that contribute most to the full-year return of the preset year;

Calculation sub-unit: inputs the P classes of preset indicator data of each stock into the preset stock selection model, and calculates the comprehensive score of each stock on the P classes of preset indicator data;

The first output sub-unit: outputs the stocks whose comprehensive scores rank in the top Q as the stock combination;

Wherein, the M, N, P and Q are all positive integers.
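The labelling, indicator selection and top-Q output can be sketched as follows. This is a simplified illustration under explicit assumptions: a single-threshold split scored by Gini impurity reduction stands in for the decision tree algorithm, M + N is assumed not to exceed the number of stocks, the comprehensive scores are taken as given, and all function names, stock codes and values are hypothetical.

```python
def label_by_return(returns, M, N):
    """Top-M stocks by full-year return -> class 1, bottom-N -> class 0."""
    ranked = sorted(returns, key=returns.get, reverse=True)
    labels = {code: 1 for code in ranked[:M]}
    labels.update({code: 0 for code in ranked[-N:]})
    return labels

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split_gain(values, labels):
    """Largest impurity reduction from one threshold split (a depth-1 tree)."""
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    parent, n, best = gini(ys), len(ys), 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates equal values
        left, right = ys[:i], ys[i:]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def top_indicators(indicators, labels, P):
    """indicators: {name: {code: value}}; return the P names with the largest gain."""
    codes = list(labels)
    gains = {name: best_split_gain([vals[c] for c in codes],
                                   [labels[c] for c in codes])
             for name, vals in indicators.items()}
    return sorted(gains, key=gains.get, reverse=True)[:P]

def top_q(scores, Q):
    """Return the Q stock codes with the highest comprehensive score."""
    return sorted(scores, key=scores.get, reverse=True)[:Q]

# Hypothetical data for four stocks and two indicators.
returns = {"A": 0.30, "B": 0.20, "C": -0.10, "D": -0.25}
labels = label_by_return(returns, M=2, N=2)
indicators = {
    "roe": {"A": 0.20, "B": 0.18, "C": 0.05, "D": 0.02},
    "noise": {"A": 1.0, "B": 4.0, "C": 2.0, "D": 3.0},
}
print(top_indicators(indicators, labels, P=1))
print(top_q({"A": 0.9, "B": 0.4, "C": 0.7}, Q=2))
```

A full decision tree would rank indicators by their accumulated impurity reduction over all splits; the single split used here keeps the example short while showing the same idea.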

Optionally, the prediction unit 73 includes:

Transform subunit: performing characteristic denoising processing on each feature data of each stock in the stock combination;

The second output subunit: input the feature data of each stock in the stock combination after the denoising process to the long-short-term memory network that completes the pre-training, and output a price prediction result about each stock in the stock portfolio.

FIG. 8 is a schematic diagram of a terminal device according to an embodiment of the present application. As shown in FIG. 8, the terminal device 8 of this embodiment includes a processor 80, a memory 81, and computer readable instructions 82 stored in the memory 81 and operable on the processor 80, such as a machine learning-based timing share-purchase program. When executing the computer readable instructions 82, the processor 80 implements the steps in each of the method embodiments described above, such as steps S101 to S103. Alternatively, when executing the computer readable instructions 82, the processor 80 implements the functions of the units in the device embodiments described above, such as the functions of units 71 to 73 shown in FIG. 7.

Illustratively, the computer readable instructions 82 may be partitioned into one or more units, which are stored in the memory 81 and executed by the processor 80 to carry out the present application. The one or more units may be a series of computer readable instruction segments capable of performing particular functions, the instruction segments being used to describe the execution of the computer readable instructions 82 in the terminal device 8. For example, the computer readable instructions 82 can be divided into a stock selection unit, an acquiring unit and a prediction unit, and the specific functions of each unit are as follows:

Stock selection unit: input the preset index data of each stock into the preset stock selection model and output the stock portfolio.

Acquiring unit: respectively obtaining feature data of each stock in the stock portfolio, wherein the feature data of the stock includes stock market trading data of the stock or technical index data of the stock.

Prediction unit: pre-trains the long short-term memory network, inputs the feature data of each stock in the stock combination into the pre-trained long short-term memory network, and outputs the price prediction result of each stock in the stock combination, so that the user can determine a timing share-purchase strategy based on the price prediction result.

Optionally, where the feature data further includes social media data or message data related to the stock, the acquiring unit includes the collection, extraction and analysis sub-units described above for the obtaining unit 72.

Optionally, the stock selection unit includes the training, selection, calculation and first output sub-units described above for the stock selection unit 71, wherein M, N, P and Q are all positive integers.

Optionally, the prediction unit includes the transform sub-unit and the second output sub-unit described above for the prediction unit 73.

The terminal device 8 can be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server. The terminal device 8 may include, but is not limited to, a processor 80, a memory 81. It will be understood by those skilled in the art that FIG. 8 is merely an example of the terminal device 8, and does not constitute a limitation on the terminal device 8, and may include more or less components than those illustrated, or combine some components, or different components. For example, the terminal device 8 may further include an input/output device, a network access device, a bus, and the like.

The processor 80 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.

The memory 81 may be an internal storage unit of the terminal device 8, such as a hard disk or a memory of the terminal device 8. The memory 81 may also be an external storage device of the terminal device 8, such as a plug-in hard disk, a smart memory card (SMC), a Secure Digital (SD) card or a flash card provided on the terminal device 8. Further, the memory 81 may include both an internal storage unit of the terminal device 8 and an external storage device. The memory 81 is used to store the computer readable instructions and other programs and data required by the terminal device. The memory 81 can also be used to temporarily store data that has been output or is about to be output.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium. Based on this understanding, all or part of the processes in the foregoing method embodiments of the present application may also be implemented by computer readable instructions, which may be stored in a computer readable storage medium. The computer readable instructions, when executed by a processor, may implement the steps of the various method embodiments described above. The computer readable instructions comprise computer readable instruction code, which may be in the form of source code, object code, an executable file or some intermediate form. The computer readable medium can include any entity or device capable of carrying the computer readable instruction code, such as a recording medium, a USB flash drive, a removable hard drive, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, or a software distribution medium. It should be noted that the content contained in the computer readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer readable media do not include electrical carrier signals and telecommunication signals.

The above-mentioned embodiments are only used to explain the technical solutions of the present application and are not intended to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features; such modifications or replacements do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and should be included within the scope of protection of this application.

Claims

A machine learning-based timing share-purchase method, characterized in that it comprises:

inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock combination;

acquiring feature data of each stock in the stock combination respectively, wherein the feature data of the stock includes stock market trading data of the stock or technical indicator data of the stock;

pre-training a long short-term memory network, inputting the feature data of each stock in the stock combination into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock combination, so that a user can determine a timing share-purchase strategy based on the price prediction result.
The machine learning-based timing share-purchase method according to claim 1, wherein the feature data further comprises social media data related to the stock, and the acquiring feature data of each stock in the stock combination respectively comprises:

Collecting and storing user-generated content on a social networking platform;

Extracting the user-generated content that matches the search keyword, the search keyword including a stock name or a stock code of each stock in the stock portfolio;

The extracted user generated content is analyzed to obtain the social media data of each stock in the stock combination.
The machine learning-based timing share-purchase method according to claim 1, wherein the feature data further comprises message data related to the stock, and the acquiring feature data of each stock in the stock combination respectively comprises:

Collect and store news data on news clients or news sites;

Extracting the news data that matches the search keyword, the search keyword including a person name or a company name related to the listed company corresponding to each stock in the stock portfolio;

The extracted news data is analyzed to obtain message data of each stock in the stock combination.
The machine learning-based timing share-purchase method according to claim 1, wherein the inputting preset indicator data of each stock into a preset stock selection model and outputting a stock combination comprises:

setting the M stocks whose full-year return in a preset year ranks in the top M as a first category, setting the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and training an initial stock selection model to obtain the preset stock selection model;

determining, based on a decision tree algorithm, the P classes of preset indicator data that contribute most to the full-year return of the preset year;

inputting the P classes of preset indicator data of each stock into the preset stock selection model, and calculating a comprehensive score of each stock on the P classes of preset indicator data;

outputting the stocks whose comprehensive scores rank in the top Q as the stock combination;

Wherein, the M, N, P and Q are all positive integers.
The machine learning-based timing share-purchase method according to claim 1, wherein the inputting the feature data of each stock in the stock combination into the pre-trained long short-term memory network and outputting a price prediction result for each stock in the stock combination comprises:

De-noising the feature data of each stock in the stock portfolio;

The feature data of each stock in the stock combination after the denoising process is input to the long-short-term memory network that completes the pre-training, and the price prediction result about each stock in the stock portfolio is output.
A machine learning-based timing share-purchase device, characterized in that it comprises:

a stock selection unit, configured to input preset indicator data of each stock into a preset stock selection model and output a stock combination;

an obtaining unit, configured to respectively obtain feature data of each stock in the stock combination, wherein the feature data of the stock includes stock market trading data of the stock or technical indicator data of the stock;

a prediction unit, configured to pre-train a long short-term memory network, input the feature data of each stock in the stock combination into the pre-trained long short-term memory network, and output a price prediction result for each stock in the stock combination, so that a user can determine a timing share-purchase strategy based on the price prediction result.
The machine learning-based timing share-purchase device according to claim 6, wherein the feature data further comprises social media data related to the stock, and the obtaining unit comprises:

a first collection subunit for collecting and storing user generated content on a network social platform;

a first extracting subunit, configured to extract user generated content that matches the search keyword, wherein the search keyword includes a stock name or a stock code of each stock in the stock combination;

The first analysis subunit is configured to analyze the extracted user generated content, and respectively obtain social media data of each stock in the stock combination.
The machine learning-based timing share-purchase device according to claim 6, wherein the feature data further comprises message data related to the stock, and the obtaining unit comprises:

a first collection subunit for collecting and storing news data on a news client or a news website;

a first extracting subunit, configured to extract news data that matches the search keyword, wherein the search keyword includes a person name or a company name related to the listed company corresponding to each stock in the stock portfolio;

The first analysis subunit is configured to analyze the extracted news data to obtain message data of each stock in the stock combination.
The machine learning-based timing share-purchase device according to claim 6, wherein the stock selection unit comprises:

a training subunit, configured to set the M stocks whose full-year return in a preset year ranks in the top M as a first category, set the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and train an initial stock selection model to obtain the preset stock selection model;

a selection subunit, configured to select, based on a decision tree algorithm, the P classes of preset indicator data that contribute most to the full-year return of the preset year;

a calculation subunit, configured to input the P classes of preset indicator data of each stock into the preset stock selection model, and calculate a comprehensive score of each stock on the P classes of preset indicator data;

a first output subunit, configured to output the stocks whose comprehensive scores rank in the top Q as the stock combination;

Wherein, the M, N, P and Q are all positive integers.
The machine learning-based timing share-purchase device according to claim 6, wherein the prediction unit comprises:

a transformation subunit, configured to perform denoising processing on the feature data of each stock in the stock combination;

a second output subunit, configured to input the denoised feature data of each stock in the stock combination into the pre-trained long short-term memory network, and output a price prediction result for each stock in the stock combination.
A terminal device comprising a memory, a processor, and computer readable instructions stored in the memory and operable on the processor, wherein the processor, when executing the computer readable instructions, implements the following steps:

inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock combination;

acquiring feature data of each stock in the stock combination respectively, wherein the feature data of the stock includes stock market trading data of the stock or technical indicator data of the stock;

pre-training a long short-term memory network, inputting the feature data of each stock in the stock combination into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock combination, so that a user can determine a timing share-purchase strategy based on the price prediction result.
The terminal device according to claim 11, wherein the feature data further comprises social media data related to the stock, and the step of respectively acquiring feature data of each stock in the stock combination comprises:

Collecting and storing user-generated content on a social networking platform;

Extracting the user-generated content that matches the search keyword, the search keyword including a stock name or a stock code of each stock in the stock portfolio;

The extracted user generated content is analyzed to obtain the social media data of each stock in the stock combination.
The terminal device according to claim 11, wherein the feature data further comprises message data related to the stock, and the step of respectively acquiring feature data of each stock in the stock combination comprises:

Collect and store news data on news clients or news sites;

Extracting the news data that matches the search keyword, the search keyword including a person name or a company name related to the listed company corresponding to each stock in the stock portfolio;

The extracted news data is analyzed to obtain message data of each stock in the stock combination.
The terminal device according to claim 11, wherein the step of inputting preset indicator data of each stock into a preset stock selection model and outputting a stock combination comprises:

setting the M stocks whose full-year return in a preset year ranks in the top M as a first category, setting the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and training an initial stock selection model to obtain the preset stock selection model;

determining, based on a decision tree algorithm, the P classes of preset indicator data that contribute most to the full-year return of the preset year;

inputting the P classes of preset indicator data of each stock into the preset stock selection model, and calculating a comprehensive score of each stock on the P classes of preset indicator data;

outputting the stocks whose comprehensive scores rank in the top Q as the stock combination;

Wherein, the M, N, P and Q are all positive integers.
The terminal device according to claim 11, wherein the step of inputting the feature data of each stock in the stock combination into the pre-trained long short-term memory network and outputting a price prediction result for each stock in the stock combination comprises:

De-noising the feature data of each stock in the stock portfolio;

The feature data of each stock in the stock combination after the denoising process is input to the long-short-term memory network that completes the pre-training, and the price prediction result about each stock in the stock portfolio is output.
A computer readable storage medium storing computer readable instructions, wherein the computer readable instructions, when executed by at least one processor, implement the following steps:

Inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock combination;

acquiring feature data of each stock in the stock combination respectively, wherein the feature data of the stock includes stock market trading data of the stock or technical indicator data of the stock;

pre-training a long short-term memory network, inputting the feature data of each stock in the stock combination into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock combination, so that a user can determine a timing share-purchase strategy based on the price prediction result.
The computer readable storage medium according to claim 16, wherein the feature data further comprises social media data related to the stock, and the acquiring feature data of each stock in the stock combination respectively comprises:

Collecting and storing user-generated content on a social networking platform;

Extracting the user-generated content that matches the search keyword, the search keyword including a stock name or a stock code of each stock in the stock portfolio;

The extracted user generated content is analyzed to obtain the social media data of each stock in the stock combination.
The computer readable storage medium according to claim 16, wherein the feature data further comprises message data related to the stock, and the obtaining feature data of each stock in the stock combination respectively comprises:

Collect and store news data on news clients or news sites;

Extracting the news data that matches the search keyword, the search keyword including a person name or a company name related to the listed company corresponding to each stock in the stock portfolio;

The extracted news data is analyzed to obtain message data of each stock in the stock combination.
The computer readable storage medium according to claim 16, wherein the inputting the preset indicator data of each stock into a preset stock selection model and outputting the stock combination comprises:

setting the M stocks whose full-year return in a preset year ranks in the top M as a first category, setting the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and training an initial stock selection model to obtain the preset stock selection model;

determining, based on a decision tree algorithm, the P classes of preset indicator data that contribute most to the full-year return of the preset year;

inputting the P classes of preset indicator data of each stock into the preset stock selection model, and calculating a comprehensive score of each stock on the P classes of preset indicator data;

outputting the stocks whose comprehensive scores rank in the top Q as the stock combination;

Wherein, the M, N, P and Q are all positive integers.
The computer readable storage medium according to claim 16, wherein the inputting the feature data of each stock in the stock combination into the pre-trained long short-term memory network and outputting a price prediction result for each stock in the stock combination comprises:

De-noising the feature data of each stock in the stock portfolio;

The feature data of each stock in the stock combination after the denoising process is input to the long-short-term memory network that completes the pre-training, and the price prediction result about each stock in the stock portfolio is output.