WO2019062006A1 - Time selection admission method based on machine learning, device and terminal equipment therefor - Google Patents

Time selection admission method based on machine learning, device and terminal equipment therefor

Info

Publication number
WO2019062006A1
WO2019062006A1 PCT/CN2018/077242 CN2018077242W WO2019062006A1 WO 2019062006 A1 WO2019062006 A1 WO 2019062006A1 CN 2018077242 W CN2018077242 W CN 2018077242W WO 2019062006 A1 WO2019062006 A1 WO 2019062006A1
Authority
WO
WIPO (PCT)
Prior art keywords
stock
data
preset
combination
feature data
Prior art date
Application number
PCT/CN2018/077242
Other languages
French (fr)
Chinese (zh)
Inventor
王健宗
黄章成
吴天博
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019062006A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • The present application relates to the field of computer technology, and in particular to a machine-learning-based timed stock purchase method, device, and terminal device.
  • Stock prices fluctuate in real time.
  • Investors often pick and buy stocks based on subjective judgment, or simply buy when the stock price falls.
  • Such stock picking is not based on a forecast of the stock's subsequent price trend, so it may carry a large investment risk.
  • The application of machine learning technology in the field of securities investment, especially in portfolio selection and in determining the timing of market entry, has attracted widespread attention from researchers, and stock selection and timed stock purchase based on predicted stock price fluctuations have been applied to the decision-making process of stock purchases.
  • The embodiments of the present application provide a machine-learning-based timed stock purchase method, device, and terminal device, to solve the problem that the calculation process of existing machine-learning-based prediction models does not fully consider the behavioral characteristics of the financial market, which results in a large deviation between the predictions used for stock selection and purchase timing and the stocks' actual subsequent price movements.
  • A first aspect of the embodiments of the present application provides a machine-learning-based timed stock purchase method, comprising:
  • inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
  • obtaining feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock;
  • pre-training a long short-term memory (LSTM) network, inputting the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and outputting a price prediction result for each stock in the stock portfolio, so that the user can determine a purchase-timing strategy based on the price prediction results.
  • A second aspect of the embodiments of the present application provides a machine-learning-based timed stock purchase device, comprising:
  • a stock selection unit, configured to input preset indicator data of each stock into a preset stock selection model and output a stock portfolio;
  • an obtaining unit, configured to obtain feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock;
  • a prediction unit, configured to pre-train a long short-term memory (LSTM) network, input the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and output a price prediction result for each stock in the stock portfolio, so that the user can determine a purchase-timing strategy based on the price prediction results.
  • A third aspect of the embodiments of the present application provides a terminal device including a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer readable instructions:
  • inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
  • obtaining feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock;
  • pre-training a long short-term memory (LSTM) network, inputting the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and outputting a price prediction result for each stock in the stock portfolio, so that the user can determine a purchase-timing strategy based on the price prediction results.
  • A fourth aspect of the embodiments of the present application provides a computer readable storage medium storing computer readable instructions that, when executed by at least one processor, implement the following steps:
  • inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
  • obtaining feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock;
  • pre-training a long short-term memory (LSTM) network, inputting the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and outputting a price prediction result for each stock in the stock portfolio, so that the user can determine a purchase-timing strategy based on the price prediction results.
  • In the embodiments of the present application, a stock portfolio suitable for investment is selected first, and the feature data of each stock in the portfolio is then extracted from the various data sources that affect stock fluctuations, so that the long short-term memory network can calculate price prediction results for these stocks.
  • The whole forecasting process fully considers the behavioral characteristics of the financial market and effectively reduces the deviation between the forecast result and the actual price trend of the stock. As a result, the user can make stock selection and purchase-timing decisions more reasonably based on the price prediction results, which effectively reduces the user's investment risk.
  • FIG. 1 is a flowchart of an implementation of the machine-learning-based timed stock purchase method provided by an embodiment of the present application;
  • FIG. 2 is a flowchart of a specific implementation of step S101 of the machine-learning-based timed stock purchase method provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of the process of acquiring stock-related social media data provided by an embodiment of the present application;
  • FIG. 4 is a schematic diagram of the process of acquiring stock-related news data provided by an embodiment of the present application;
  • FIG. 5 is a flowchart of a specific implementation of step S103 of the machine-learning-based timed stock purchase method provided by an embodiment of the present application;
  • FIG. 6 is a schematic diagram of the operation of a simple memory cell of the LSTM provided by an embodiment of the present application;
  • FIG. 7 is a structural block diagram of the machine-learning-based timed stock purchase device provided by an embodiment of the present application;
  • FIG. 8 is a schematic diagram of a terminal device according to an embodiment of the present application.
  • FIG. 1 shows the implementation flow of the machine-learning-based timed stock purchase method provided by an embodiment of the present application, which is described in detail as follows:
  • S101: Input preset indicator data of each stock into a preset stock selection model, and output a stock portfolio.
  • the preset indicator data of the stock may be data extracted from the three major financial statements of the listed company, including but not limited to valuation indicators, financial indicators, scale indicators, growth indicators, and technical indicators.
  • each type of indicator is a collection of multiple similar indicator data.
  • The valuation indicators include price-earnings ratio, price-to-book ratio, and market-to-acquisition rate; the financial indicators include return on equity, return on assets, and quick ratio; the scale indicators include minimum market value and ratio of market capitalization; the growth indicators include year-on-year growth of total assets, net asset growth, operating profit growth, and year-on-year growth of net assets; the technical indicators include 60-day average volume, relative strength index, and volatility.
  • A traditional multi-factor stock selection model calculates the contribution of each factor and finally selects suitable stock selection factors according to a multi-factor composite score.
  • In the embodiments of the present application, a machine learning method is adopted: the preset indicator data serve as the factors, and the prediction target of the stock selection model is tied to the total return over a predetermined time period.
  • The total returns are divided into two classes, with a large gap between the total returns of the two classes.
  • In this way, the importance of each factor (i.e., each type of preset indicator data) can be determined.
  • Machine learning algorithms that can be employed include logistic regression, support vector machines, neural networks, and the like.
  • For example, a support vector machine can be used as the machine learning algorithm.
  • The stock selection model does not simply compute the contribution of each factor using econometrics and statistics. Instead, it treats the different preset indicator data (that is, the valuation, financial, scale, growth, and technical indicators extracted in the previous step) as the factors and uses them to select the set of stocks suitable for investment.
  • FIG. 2 illustrates a specific implementation of S101 in detail:
  • The stock selection model is trained on stocks whose returns are already known; for example, the full-year returns of stocks in the year preceding the current year can be used to train the stock selection model.
  • The stocks are divided into two classes: the first class consists of the stocks whose full-year returns rank in the top M, and the second class consists of the stocks whose full-year returns rank in the bottom N. By setting the values of M and N appropriately, the gap in full-year returns between the first class and the second class is widened, so that a suitable stock selection model can be trained.
  • Each type of preset indicator is composed of several similar indicator data.
  • A decision tree algorithm is used to select, from all the preset indicators, the P types of preset indicator data that contribute most to the full-year return, and these serve as the basis for stock selection.
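As a non-limiting illustration, ranking indicators by their contribution to the class label can be sketched in Python as follows. The sketch does not build a full decision tree: it scores each factor by the Gini impurity decrease of its best single split (a depth-one tree), which is a simplification assumed here for brevity. The function names `gini`, `best_split_gain`, and `top_p_factors` and the factor names in the usage note are hypothetical and do not come from the application.

```python
from typing import Dict, List, Sequence


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split_gain(values: Sequence[float], labels: Sequence[int]) -> float:
    """Largest Gini impurity decrease achievable by one threshold split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini([label for _, label in pairs])
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal factor values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        gain = parent - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
        best = max(best, gain)
    return best


def top_p_factors(factor_table: Dict[str, Sequence[float]],
                  labels: Sequence[int], p: int) -> List[str]:
    """Names of the p factors with the largest single-split gain."""
    gains = {name: best_split_gain(vals, labels) for name, vals in factor_table.items()}
    return sorted(gains, key=gains.get, reverse=True)[:p]
```

For example, with labels `[1, 1, 0, 0]`, a factor whose values are `[1, 2, 3, 4]` separates the two classes perfectly and is ranked ahead of a factor whose values are `[1, 3, 2, 4]`. A full decision tree would aggregate such gains over all of its splits.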
  • S203: Input the aforementioned P types of preset indicator data of each stock into the preset stock selection model, and calculate a composite score for each stock on the P types of preset indicator data.
  • The P types of preset indicator data are obtained for each stock and input to the preset stock selection model, which outputs each stock's composite score on the P types of preset indicator data.
  • For example, the constituent stocks of the 2016 CSI 300 (Shanghai-Shenzhen 300) Index are placed into the pool of candidate stocks.
  • Based on the full-year returns of these constituents, the 60 stocks with the highest full-year returns form one class, and the 60 stocks with the lowest full-year returns form the other class.
  • The 60 stocks ranked in the top 60 by 2016 full-year return are labeled 1 and the 60 stocks ranked in the bottom 60 are labeled 0, and the indicator data of these 120 stocks are used as the factors for training.
  • Then, based on the decision tree algorithm, the three most important types of indicator feature data are selected as the three factors. Finally, on the first day of each forecast month, each stock's composite score on these three factors is calculated, and the 10 top-ranked stocks are selected as the stock portfolio.
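The labeling and final selection steps of this example can be sketched as follows, purely for illustration. Training the classifier and computing the composite scores are outside the sketch; `label_by_return` and `top_q` are hypothetical helper names, and the stock codes in the usage note are made up.

```python
from typing import Dict, List


def label_by_return(annual_return: Dict[str, float], m: int, n: int) -> Dict[str, int]:
    """Label the top-m stocks by full-year return as 1 and the bottom-n as 0.

    Stocks in the middle of the ranking are left out of the training set,
    which widens the gap between the two classes.
    """
    ranked = sorted(annual_return, key=annual_return.get, reverse=True)
    if m + n > len(ranked):
        raise ValueError("m + n must not exceed the number of stocks")
    labels = {code: 1 for code in ranked[:m]}
    labels.update({code: 0 for code in ranked[len(ranked) - n:]})
    return labels


def top_q(scores: Dict[str, float], q: int) -> List[str]:
    """Return the q stock codes with the highest composite score."""
    return sorted(scores, key=scores.get, reverse=True)[:q]
```

With the numbers of the example above, `label_by_return(returns_2016, 60, 60)` would yield the 120 labeled training stocks and `top_q(scores, 10)` the monthly portfolio, where `returns_2016` and `scores` are assumed to map stock codes to full-year returns and composite scores respectively.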
  • S102 Obtain feature data of each stock in the stock portfolio respectively, wherein the feature data of the stock includes stock market trading data of the stock or technical indicator data of the stock.
  • the feature data of the investment target is acquired.
  • the investment target is the stock
  • the characteristic data of the stock is used to reflect the valuation, scale, growth and financial status of the listed company.
  • The original data source may include stock market trading data of the stock, such as its opening price, lowest price, highest price, closing price, transaction amount, or profit rate. Alternatively, the original data source may include technical indicator data of the stock, such as its moving average convergence/divergence (MACD), on-balance volume (OBV), Bollinger Bands, psychological line (PSY), or triple exponential average (TRIX).
  • Both the stock market trading data and the technical indicator data can be obtained from financial securities programs through the application programming interface (API) of such programs. Because the raw trading data and technical indicator data are already quantified, they can be used directly as feature data for the stock without feature extraction.
  • The feature data of the stock may also include social media data related to the stock, the source of which is user-generated content from Web 2.0 services such as social platforms and media platforms.
  • Figure 3 shows the process of acquiring stock-related social media data:
  • S301 Collect and store user generated content on a network social platform.
  • a distributed web crawler may be used to collect user-generated content on a server related to each social network platform.
  • User-generated content includes but is not limited to: content posted by a user on a blog or a microblog, comments posted by a user under a news item or a picture, posts or replies published by a user on a forum, and so on.
  • S302 Extract user-generated content that matches the search keyword, wherein the search keyword includes a stock name or a stock code.
  • The name of the stock or the stock code may be used as a search keyword to capture the user-generated content that matches it. For example, if a user publishes a trend forecast for a stock on Weibo, the stock name and/or stock code is bound to appear in the content of the post, so matching user-generated content can be retrieved based on the stock name and/or stock code.
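A minimal sketch of this keyword matching is given below, assuming the collected user-generated content is already available as plain strings. The function name `match_posts` is hypothetical, and plain substring matching is an assumption; the application does not specify the matching rule.

```python
from typing import List


def match_posts(posts: List[str], stock_name: str, stock_code: str) -> List[str]:
    """Keep the posts that mention the stock name or the stock code."""
    keywords = [k for k in (stock_name, stock_code) if k]
    return [post for post in posts if any(k in post for k in keywords)]
```

Substring matching can over-match, for instance when a short stock code appears inside a longer number, so a production system would likely tokenize the text or match on word boundaries.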
  • For example, the following may be counted: the number of user-generated content items matching the stock, the number of followers of the social account of each user who published such content, the user's account registration time, the type of terminal from which the user posted the content, and so on.
  • the stock feature data acquired based on the social media data may include the following types:
  • The degree of attention may be characterized by the number of user-generated content items: the more user-generated content related to the stock, the more attention the stock receives and the more likely its price is to fluctuate irrationally.
  • Participating-user influence refers to the influence, on the social networking platform and even across the entire Internet, of the users who publish user-generated content related to the stock. It is generally believed that more influential users are more likely to be stock market operators, while less influential users are more likely to be amateur stock investors. From this point of view, the more influential the user, the more likely their remarks are to cause irrational fluctuations in the price of the corresponding stock; conversely, the less influential the user, the less likely their remarks are to do so.
  • Participating-user influence can be quantified from three quantities: the number of accounts the user follows on the social networking platform, the number of accounts that follow the user, and the total number of user-generated content items the user has published. These three quantities are added, or weighted and added, to obtain the quantified participating-user influence.
  • Participating-user registration duration can be characterized by the ratio of new users to old users among those who publish user-generated content about the stock. By setting a registration duration threshold, users whose registration duration is below the threshold are classified as new users, and users whose registration duration is above it are classified as old users. The higher the ratio of new users to old users, the smaller the quantified participating-user registration duration, the less experienced the user group publishing content about the stock, and the less likely their remarks are to cause irrational fluctuations in the price of the corresponding stock. Conversely, the lower the ratio of new users to old users, the greater the quantified participating-user registration duration, the more experienced the user group, the more likely its members are to be mature stock market investors, and the more likely their remarks are to cause irrational fluctuations in the price of the corresponding stock.
  • The mobile-to-PC ratio is the ratio of the mobile count to the PC count, where the mobile count is the number of user-generated content items published from mobile terminals and the PC count is the number published from PCs.
  • It is assumed that users who publish stock market discussion through a PC are more likely to be mature stock market investors, whereas users who publish through mobile terminals are more likely to be immature or novice stock traders. Accordingly, the higher the mobile-to-PC ratio, the less experienced the user group publishing content about the stock, and the less likely their remarks are to cause irrational fluctuations in the price of the corresponding stock.
  • Conversely, the lower the mobile-to-PC ratio, the more experienced the user group publishing content about the stock, and the more likely their remarks are to cause irrational fluctuations in the price of the corresponding stock.
  • Users whose individual stock comments have a high average word count are mostly professional stock investors, while users whose comments have a low average word count are mostly non-professional stock investors.
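The quantification of these social media features can be illustrated with the following sketch. The `Post` record, its field names, the equal default weights, and the 365-day threshold for a new user are all assumptions made for the example and are not values given in the application. The sketch also treats each post as coming from a distinct user, which simplifies the new-to-old user ratio.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass
class Post:
    following: int        # accounts the author follows
    followers: int        # accounts that follow the author
    total_posts: int      # user-generated content items published by the author
    registered_days: int  # age of the author's account in days
    from_mobile: bool     # True if posted from a mobile terminal, False if from a PC
    word_count: int       # length of this post in words


def influence(post: Post, weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the three influence quantities."""
    w_following, w_followers, w_posts = weights
    return (w_following * post.following
            + w_followers * post.followers
            + w_posts * post.total_posts)


def safe_ratio(num: int, den: int) -> float:
    """num / den, with explicit handling of an empty denominator."""
    if den == 0:
        return float("inf") if num > 0 else 0.0
    return num / den


def social_features(posts: Sequence[Post], new_user_days: int = 365) -> Dict[str, float]:
    """Aggregate the per-stock social media features from matched posts."""
    n = len(posts)
    new = sum(1 for p in posts if p.registered_days < new_user_days)
    mobile = sum(1 for p in posts if p.from_mobile)
    return {
        "attention": float(n),
        "mean_influence": sum(influence(p) for p in posts) / n if n else 0.0,
        "new_to_old_ratio": safe_ratio(new, n - new),
        "mobile_to_pc_ratio": safe_ratio(mobile, n - mobile),
        "mean_word_count": sum(p.word_count for p in posts) / n if n else 0.0,
    }
```

The resulting dictionary could be appended to the stock's trading and technical indicator data to form one feature vector per stock and time step.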
  • The feature data of the stock may also include news data related to the stock, which may be obtained from various mainstream news clients or news websites. FIG. 4 shows the process of acquiring stock-related news data:
  • S401 Collect news data on a news client or a news website and store it.
  • distributed web crawlers can be used to collect news data on various news clients or news site related servers.
  • Crawling can be restricted to the financial section or the stock market section to improve the collection efficiency of such news data.
  • S402 Extracting news data that matches the search keyword, wherein the search keyword includes a person name or a company name associated with the listed company.
  • The main purpose of collecting news data from news clients or news websites is to capture news related to listed companies, since such news can reflect the price fluctuation of the listed company's stock over a certain period in the future.
  • The company name of the listed company, or the name of a person associated with it such as its person in charge, is used as the search keyword, so that news data related to the stock can be extracted.
  • For the types of stock feature data obtained from news data, reference may be made to the types of stock feature data obtained from social media data described above, for example the emotional valence toward the stock in the news data, the degree of attention, the influence of the news publisher, the length of the news item, and so on.
  • Different feature data corresponds to different quantization methods. Therefore, by performing quantitative analysis on the extracted news data matching the stock, the stock feature data acquired based on the news data can be obtained.
  • When the feature data further includes news data related to the stock, the feature data is used as model input to complete the machine-learning-based stock picking process, so that the process takes into account the influence of listed-company news on the stock price and the stock's price movement can be predicted more accurately.
  • After the relevant data is collected and before the feature data of the stock is extracted from it, the data may be denoised and missing values filled in, so as to further improve the efficiency of feature data acquisition.
  • S103: Pre-train the long short-term memory (LSTM) network, input the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and output a price prediction result for each stock in the stock portfolio, so that the user can determine a purchase-timing strategy based on the price prediction results.
  • the embodiment of the present application comprehensively adopts the feature extraction step, and obtains the prediction effect and the predicted return based on the deep learning algorithm.
  • In an LSTM, the conventional neuron, i.e., a unit that applies a sigmoid (S-shaped) activation to a linear combination of its inputs, is replaced by a memory cell.
  • Each memory cell is associated with an input gate, an output gate, and an internal state that is fed back into itself unperturbed across time steps.
  • the feature data of each stock in the stock combination needs to be input into the pre-trained LSTM.
  • LSTM is a variant of the recurrent neural network (RNN), characterized by the addition of gate ("valve") nodes to each layer on top of the RNN structure.
  • A valve node applies the sigmoid function to the memory state of the network, which it takes as input. If the output reaches the threshold, the valve output is multiplied by the calculation result of the current layer and passed as an input to the next layer; if the threshold is not reached, the output is forgotten.
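The thresholded valve behavior described here can be sketched as follows. The function name `valve` and the default threshold of 0.5 are assumptions for illustration. Note that common LSTM implementations use soft gating, i.e., they always multiply by the sigmoid output without a hard threshold, so this sketch follows the description in the text rather than any particular library.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + math.exp(-x))


def valve(memory_state: float, layer_output: float, threshold: float = 0.5) -> float:
    """Pass the gated layer output on, or forget it if the gate stays below threshold."""
    gate = sigmoid(memory_state)
    if gate >= threshold:
        return gate * layer_output  # gate output scales the current layer's result
    return 0.0  # below threshold: the result is forgotten
```

For instance, `valve(0.0, 2.0)` opens the valve exactly at the default threshold and passes on half of the layer output, whereas `valve(-1.0, 2.0)` discards it.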
  • the weight of each layer, including the valve nodes, is updated during each model backpropagation training.
  • LSTM training can be optimized by adjusting many parameters, such as the activation function, the number of LSTM layers, and the variable dimensions of the input and output.
  • Gradient descent, for example by applying backpropagation through time (BPTT), can be used for training. In a standard RNN, however, the error gradient vanishes exponentially with the length of time between events.
  • Wavelet denoising is used to filter the original time series; the denoising process extracts and separates the various hidden periodicities and nonlinearities of the time series.
  • The multi-scale characteristics of the wavelet-decomposed series and data can thereby be fully exploited in the calculation process of the LSTM neural network model.
  • S1031 Perform feature denoising processing on feature data of each stock in the stock combination.
  • S1032 Input feature data of each stock in the stock combination after denoising processing to a long-short-term memory network that completes pre-training, and output a price prediction result about each stock in the stock portfolio.
  • The embodiment of the present application adopts the Haar function as the wavelet basis function, which not only decomposes the time series effectively into the time domain and the frequency domain but also significantly reduces the processing time of the data in the LSTM.
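  • The wavelet function of the continuous wavelet transform, with time $t$ as the variable, is defined as:

$$\psi_{a,\tau}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-\tau}{a}\right)$$

  • where $a$ is the scale factor, $\tau$ is the translation factor, and $\psi(t)$ is the basis wavelet, which obeys the wavelet admissibility condition.
  • The wavelet admissibility condition is defined as:

$$C_\psi = \int_0^{\infty} \frac{|\Psi(\omega)|^2}{\omega}\,d\omega < \infty$$

  • where $\Psi(\omega)$ is a function of the frequency $\omega$ and is the Fourier transform of $\psi(t)$. If $x(t)$ is a square-integrable function ($x(t) \in L^2(\mathbb{R})$), then the continuous wavelet transform with wavelet $\psi$ can be defined as:

$$CWT_x(a,\tau) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\,\psi^{*}\!\left(\frac{t-\tau}{a}\right) dt$$

  • The inverse transform of the wavelet transform can be defined as:

$$x(t) = \frac{1}{C_\psi} \int_0^{\infty} \frac{da}{a^2} \int_{-\infty}^{\infty} CWT_x(a,\tau)\,\psi_{a,\tau}(t)\,d\tau$$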
  • The continuous wavelet transform above is redundant: because the wavelet bases are not orthogonal, the information obtained after transforming the signal is redundant. Therefore, the embodiment of the present application uses the Mallat algorithm, a signal decomposition algorithm built on an orthogonal wavelet basis. It uses a high-pass filter and a low-pass filter to implement the discrete wavelet transform when filtering the time series. Specifically, the father wavelet $\varphi(t)$ describes the low-frequency components of the time series, and the mother wavelet $\psi(t)$ describes the high-frequency components.
  • The father wavelet and the mother wavelet at level $j$ can be written as:

$$\varphi_{j,k}(t) = 2^{-j/2}\,\varphi\!\left(2^{-j}t - k\right), \qquad \psi_{j,k}(t) = 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right)$$

  • The father and mother wavelets with multilevel indices $k \in \{0,1,2,\dots\}$ and $j \in \{0,1,2,\dots,J\}$ can reconstruct the financial time series.
  • The orthogonal wavelet series approximation of the time series $x(t)$ is defined as:

$$x(t) = \sum_k s_{J,k}\,\varphi_{J,k}(t) + \sum_k d_{J,k}\,\psi_{J,k}(t) + \sum_k d_{J-1,k}\,\psi_{J-1,k}(t) + \dots + \sum_k d_{1,k}\,\psi_{1,k}(t)$$

  • where $s_{J,k}$ are the smooth (approximation) coefficients and $d_{j,k}$ are the detail coefficients.
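As an illustration of denoising with the Haar wavelet basis, the following sketch performs one level of the discrete Haar transform (a low-pass and a high-pass filter pair), shrinks the detail coefficients, and reconstructs the series. The single decomposition level, the soft-thresholding rule, and the function names are assumptions made for the example; the application does not state how the coefficients are thresholded.

```python
import math
from typing import List, Tuple


def haar_step(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx: List[float], detail: List[float]) -> List[float]:
    """Invert one level of the Haar DWT."""
    s = math.sqrt(2.0)
    out: List[float] = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs: List[float], t: float) -> List[float]:
    """Shrink each coefficient toward zero by t, zeroing the small ones."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def haar_denoise(signal: List[float], threshold: float) -> List[float]:
    """Denoise by thresholding the high-frequency (detail) coefficients."""
    approx, detail = haar_step(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the series is reconstructed unchanged, and with a large threshold each adjacent pair of values is replaced by its mean. A multi-level decomposition would apply `haar_step` repeatedly to the approximation coefficients.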
  • each neuron is a memory cell with an input gate, a forget gate, and an output gate.
  • One of the keys to the LSTM model is its forget gate, which controls the convergence of the gradients during training while maintaining long-term memory.
  • Figure 6 shows an operational diagram of a simple memory cell of the LSTM. Referring to Figure 6, the main mathematical symbols involved in a simple memory cell are as follows:
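  • $x_t$ is the input vector to the memory cell at time $t$;
  • $W_i$, $W_f$, $W_c$, $W_o$, $U_i$, $U_f$, $U_c$, $U_o$ and $V_o$ are the network weight matrices;
  • $b_i$, $b_f$, $b_c$ and $b_o$ are the network bias vectors;
  • $h_t$ is the value of the memory cell at time $t$;
  • $i_t$ and $\tilde{C}_t$ are the input gate and the candidate state of the memory cell at time $t$, calculated as:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$

$$\tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$

  • $f_t$ (that is, the forget gate shown in FIG. 6) and $C_t$ are the forget gate and the state of the memory cell at time $t$, calculated as:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$

$$C_t = i_t * \tilde{C}_t + f_t * C_{t-1}$$

  • $o_t$ (that is, the output gate shown in FIG. 6) and $h_t$ are the output gate and the value of the memory cell at time $t$, calculated as:

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + V_o C_t + b_o)$$

$$h_t = o_t * \tanh(C_t)$$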
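A single forward step of such a memory cell can be sketched in plain Python as follows. The output-gate line follows the formula above, including the $V_o C_t$ term; the input gate, candidate state, forget gate, and cell update are written in the standard LSTM form using the weight and bias symbols listed above, which is an assumption where the text does not spell them out. The parameter dictionary layout and the helper names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def matvec(m: List[Vector], v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def add(*vs: Vector) -> Vector:
    """Element-wise sum of several vectors."""
    return [sum(xs) for xs in zip(*vs)]


def sigmoid_vec(v: Vector) -> Vector:
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def tanh_vec(v: Vector) -> Vector:
    return [math.tanh(x) for x in v]


def hadamard(a: Vector, b: Vector) -> Vector:
    """Element-wise product."""
    return [x * y for x, y in zip(a, b)]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              p: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One time step of the memory cell; returns (h_t, C_t)."""
    i_t = sigmoid_vec(add(matvec(p["W_i"], x_t), matvec(p["U_i"], h_prev), p["b_i"]))
    c_tilde = tanh_vec(add(matvec(p["W_c"], x_t), matvec(p["U_c"], h_prev), p["b_c"]))
    f_t = sigmoid_vec(add(matvec(p["W_f"], x_t), matvec(p["U_f"], h_prev), p["b_f"]))
    c_t = add(hadamard(i_t, c_tilde), hadamard(f_t, c_prev))
    o_t = sigmoid_vec(add(matvec(p["W_o"], x_t), matvec(p["U_o"], h_prev),
                          matvec(p["V_o"], c_t), p["b_o"]))
    h_t = hadamard(o_t, tanh_vec(c_t))
    return h_t, c_t
```

To process a feature sequence, `lstm_step` would be called once per time step, feeding the returned `h_t` and `C_t` back in as `h_prev` and `c_prev`. Training the weights, for example with BPTT, is outside the scope of this sketch.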
  • In the embodiments of the present application, a stock portfolio suitable for investment is selected first, and the feature data of each stock in the portfolio is then extracted from the various data sources that affect stock fluctuations, so that the long short-term memory network can calculate price prediction results for these stocks.
  • The whole forecasting process fully considers the behavioral characteristics of the financial market and effectively reduces the deviation between the forecast result and the actual price trend of the stock. As a result, the user can make stock selection and purchase-timing decisions more reasonably based on the price prediction results, which effectively reduces the user's investment risk.
  • FIG. 7 is a structural block diagram of the machine-learning-based timed stock purchase device provided by an embodiment of the present application. For convenience of description, only the parts relevant to the embodiments of the present application are shown.
  • The apparatus includes:
  • a stock selection unit 71, configured to input preset indicator data of each stock into a preset stock selection model and output a stock portfolio;
  • an obtaining unit 72, configured to obtain feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock;
  • a prediction unit 73, configured to pre-train the long short-term memory (LSTM) network, input the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and output a price prediction result for each stock in the stock portfolio, so that the user can determine a purchase-timing strategy based on the price prediction results.
  • When the feature data further includes social media data related to the stock, the obtaining unit 72 includes:
  • a first collection sub-unit, configured to collect and store user-generated content on network social platforms;
  • a first extraction sub-unit, configured to extract the user-generated content that matches the search keyword, wherein the search keyword includes the stock name or stock code of each stock in the stock portfolio;
  • a first analysis sub-unit, configured to analyze the extracted user-generated content and obtain the feature data of each stock in the stock portfolio.
  • When the feature data further includes news data related to the stock, the obtaining unit 72 includes:
  • a first collection sub-unit, configured to collect and store news data on news clients or news websites;
  • a first extraction sub-unit, configured to extract the news data that matches the search keyword, wherein the search keyword includes a person name or company name related to the listed company corresponding to each stock in the stock portfolio;
  • a first analysis sub-unit, configured to analyze the extracted news data and obtain the news data of each stock in the stock portfolio.
  • The stock selection unit 71 includes:
  • a training sub-unit, configured to set the M stocks whose full-year returns in a preset year rank in the top M as the first class and the N stocks whose full-year returns in the preset year rank in the bottom N as the second class, and to train the stock selection model to obtain the preset stock selection model;
  • a selection sub-unit, configured to select, based on a decision tree algorithm, the P types of preset indicator data that contribute most to the full-year return of the preset year;
  • a calculation sub-unit, configured to input the P types of preset indicator data of each stock into the preset stock selection model and calculate a composite score for each stock on the P types of preset indicator data;
  • a first output sub-unit, configured to output the stocks whose composite scores rank in the top Q as the stock portfolio;
  • where M, N, P, and Q are all positive integers.
  • the prediction unit 73 includes:
  • Transform subunit performing characteristic denoising processing on each feature data of each stock in the stock combination
  • the second output subunit input the feature data of each stock in the stock combination after the denoising process to the long-short-term memory network that completes the pre-training, and output a price prediction result about each stock in the stock portfolio.
  • FIG. 8 is a schematic diagram of a terminal device according to an embodiment of the present application.
  • the terminal device 8 of this embodiment includes a processor 80, a memory 81, and computer readable instructions 82 stored in the memory 81 and executable on the processor 80, such as a machine-learning-based timing stock purchase program.
  • when executing the computer readable instructions 82, the processor 80 implements the steps of the embodiments of the machine-learning-based timing stock purchase method described above, such as steps S101 to S103 shown in FIG. 1.
  • alternatively, when executing the computer readable instructions 82, the processor 80 implements the functions of the units of the apparatus embodiments described above, such as the functions of units 71 to 73 shown in FIG. 7.
  • the computer readable instructions 82 may be partitioned into one or more units, which are stored in the memory 81 and executed by the processor 80 to carry out the present application.
  • the one or more units may be a series of computer readable instruction segments capable of performing particular functions, and these segments describe the execution of the computer readable instructions 82 in the terminal device 8.
  • the computer readable instructions 82 can be divided into a stock selection unit, an acquiring unit, and a prediction unit, and the specific functions of each unit are as follows:
  • the stock selection unit inputs the preset indicator data of each stock into the preset stock selection model and outputs the stock portfolio.
  • the acquiring unit separately obtains feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock.
  • the prediction unit pre-trains the long short-term memory network, inputs the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputs a price prediction result for each stock in the stock portfolio, so that the user can determine a purchase-timing strategy according to the price prediction result.
  • the feature data further includes social media data related to the stock.
  • the acquiring unit includes:
  • a first collection subunit, which collects and stores user-generated content on social networking platforms.
  • a first extraction subunit, which extracts user-generated content matching a search keyword, wherein the search keyword includes the stock name or stock code of each stock in the stock portfolio.
  • a first analysis subunit, which analyzes the extracted user-generated content to obtain social media data of each stock in the stock portfolio.
  • the feature data further includes message data related to the stock.
  • the acquiring unit includes:
  • a first collection subunit, which collects and stores news data from news clients or news websites.
  • a first extraction subunit, which extracts news data matching a search keyword, wherein the search keyword includes a person's name or a company name related to the listed company corresponding to each stock in the stock portfolio.
  • a first analysis subunit, which analyzes the extracted news data to obtain message data of each stock in the stock portfolio.
  • the stock selection unit comprises:
  • a training subunit, which sets the M stocks whose full-year returns for a preset year rank in the top M as a first category, sets the N stocks whose full-year returns for the preset year rank in the bottom N as a second category, and trains an initial stock selection model to obtain the preset stock selection model.
  • a selection subunit, which selects, based on a decision tree algorithm, the P types of preset indicator data that contribute most to the full-year return of the preset year.
  • a calculation subunit, which inputs the P types of preset indicator data of each stock into the preset stock selection model and calculates the composite score of each stock on the P types of preset indicator data.
  • a first output subunit, which outputs the stocks whose composite scores rank in the top Q as the stock portfolio.
  • M, N, P and Q are all positive integers.
  • the prediction unit comprises:
  • a transform subunit, which performs feature denoising on each item of feature data of each stock in the stock portfolio.
  • a second output subunit, which inputs the denoised feature data of each stock in the stock portfolio into the pre-trained long short-term memory network and outputs a price prediction result for each stock in the stock portfolio.
  • the terminal device 8 can be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server.
  • the terminal device 8 may include, but is not limited to, a processor 80 and a memory 81. It will be understood by those skilled in the art that FIG. 8 is merely an example of the terminal device 8 and does not constitute a limitation on it; the terminal device 8 may include more or fewer components than those illustrated, may combine some components, or may use different components.
  • the terminal device 8 may further include an input/output device, a network access device, a bus, and the like.
  • the processor 80 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the memory 81 may be an internal storage unit of the terminal device 8, such as a hard disk or a memory of the terminal device 8.
  • the memory 81 may also be an external storage device of the terminal device 8, such as a plug-in hard disk provided on the terminal device 8, a smart media card (SMC), a Secure Digital (SD) card, a flash card, and so on.
  • the memory 81 may also include both an internal storage unit of the terminal device 8 and an external storage device.
  • the memory 81 is used to store the computer readable instructions and other programs and data required by the terminal device.
  • the memory 81 can also be used to temporarily store data that has been output or is about to be output.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • all or part of the processes in the foregoing method embodiments may also be implemented by computer readable instructions, which may be stored in a computer readable storage medium.
  • the computer readable instructions, when executed by a processor, may implement the steps of the various method embodiments described above.
  • the computer readable instructions comprise computer readable instruction code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like.
  • the computer readable medium can include any entity or device capable of carrying the computer readable instruction code, a recording medium, a USB flash drive, a removable hard drive, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), electrical carrier signals, telecommunications signals, and software distribution media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Technology Law (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention is applicable to the technical field of computers and provides a machine-learning-based time selection admission (timing stock purchase) method, together with a device and terminal equipment therefor. The method comprises the following steps: inputting preset index data of each stock into a preset stock picking model, and outputting a stock combination; separately obtaining feature data of each stock in the stock combination, wherein the feature data of a stock comprises the stock market transaction data of the stock or the technical index data of the stock; and inputting the feature data of each stock in the stock combination into a pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock combination. With this method, the whole prediction process fully considers the behavioral characteristics of the financial market, and the deviation between the prediction result and the actual subsequent price trend of the stock is effectively reduced. A user can carry out stock selection and purchase timing more rationally on the basis of the price prediction result, and the user's investment risk is effectively lowered.

Description

Method, Device and Terminal Device for Timing Stock Purchases Based on Machine Learning
This application claims priority to Chinese Patent Application No. 201710899893.1, filed with the Chinese Patent Office on September 28, 2017 and entitled "Machine-Learning-Based Method and Terminal Device for Timing Stock Purchases", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, and in particular to a machine-learning-based method, device and terminal device for timing stock purchases.
Background
Stock prices fluctuate in real time. In stock trading, stock selection and purchases are often made on the basis of subjective human decisions or when a stock's price falls. Such stock selection is not based on a prediction of the stock's subsequent price trend and may therefore carry considerable investment risk. To construct and adopt an appropriate portfolio strategy and thereby achieve a more robust, rational way of investing, the application of machine learning to securities investment, especially to portfolio selection and the timing of market entry, has attracted wide attention from researchers. Machine learning selects stocks and times purchases by predicting stock price fluctuations, and has been applied in the decision-making process for stock purchases.
However, the above techniques perform stock selection and purchase-timing prediction purely from a machine learning perspective. The prediction process does not fully consider the behavioral characteristics of the financial market, so the prediction results deviate considerably from the stocks' actual subsequent price trends.
Technical Problem
In view of this, the embodiments of the present application provide a machine-learning-based method, device and terminal device for timing stock purchases, to solve the problem that the computation of existing machine-learning-based prediction models does not fully consider the behavioral characteristics of the financial market, so that the predictions for stock selection and purchase timing deviate considerably from the stocks' actual subsequent price trends.
Technical Solution
A first aspect of the embodiments of the present application provides a machine-learning-based method for timing stock purchases, comprising:
inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
separately acquiring feature data of each stock in the stock portfolio, the feature data of a stock including stock market trading data of the stock or technical indicator data of the stock;
pre-training a long short-term memory network, inputting the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio, so that a user can determine a purchase-timing strategy according to the price prediction result.
A second aspect of the embodiments of the present application provides a machine-learning-based device for timing stock purchases, comprising:
a stock selection unit, configured to input preset indicator data of each stock into a preset stock selection model and output a stock portfolio;
an acquiring unit, configured to separately acquire feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock;
a prediction unit, configured to pre-train a long short-term memory network, input the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and output a price prediction result for each stock in the stock portfolio, so that a user can determine a purchase-timing strategy according to the price prediction result.
A third aspect of the embodiments of the present application provides a terminal device, the terminal device including a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer readable instructions:
inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
separately acquiring feature data of each stock in the stock portfolio, the feature data of a stock including stock market trading data of the stock or technical indicator data of the stock;
pre-training a long short-term memory network, inputting the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio, so that a user can determine a purchase-timing strategy according to the price prediction result.
A fourth aspect of the embodiments of the present application provides a computer readable storage medium storing computer readable instructions which, when executed by at least one processor, implement the following steps:
inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
separately acquiring feature data of each stock in the stock portfolio, the feature data of a stock including stock market trading data of the stock or technical indicator data of the stock;
pre-training a long short-term memory network, inputting the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio, so that a user can determine a purchase-timing strategy according to the price prediction result.
Beneficial Effects
In the embodiments of the present application, a stock portfolio suitable for investment is screened out based on the preset indicator data of each stock, and feature data of each stock in the portfolio is then extracted from multiple data sources that affect share price fluctuations, so that price prediction results for these stocks are computed by a long short-term memory network. The whole prediction process fully considers the behavioral characteristics of the financial market and effectively reduces the deviation between the prediction results and the stocks' actual subsequent price trends. As a result, the user can select stocks and time purchases more rationally on the basis of the price prediction results, which effectively lowers the user's investment risk.
Brief Description of the Drawings
FIG. 1 is a flowchart of an implementation of the machine-learning-based method for timing stock purchases provided by an embodiment of the present application;
FIG. 2 is a flowchart of a specific implementation of step S101 of the machine-learning-based method for timing stock purchases provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the process of acquiring stock-related social media data provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the process of acquiring stock-related news data provided by an embodiment of the present application;
FIG. 5 is a flowchart of a specific implementation of step S103 of the machine-learning-based method for timing stock purchases provided by an embodiment of the present application;
FIG. 6 is an operation diagram of a simple LSTM memory cell provided by an embodiment of the present application;
FIG. 7 is a structural block diagram of the machine-learning-based device for timing stock purchases provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the terminal device provided by an embodiment of the present application.
Embodiments of the Invention
In the following description, specific details such as particular system structures and techniques are set forth for purposes of illustration rather than limitation, so that the embodiments of the present application can be thoroughly understood. However, it will be apparent to those skilled in the art that the present application may also be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present application.
To illustrate the technical solutions described in the present application, specific embodiments are described below.
FIG. 1 shows the implementation flow of the machine-learning-based method for timing stock purchases provided by an embodiment of the present application, detailed as follows:
S101: Input the preset indicator data of each stock into the preset stock selection model, and output a stock portfolio.
In the embodiments of the present application, the preset indicator data of a stock may be data extracted from the three major financial statements of the listed company, including but not limited to valuation indicators, financial indicators, scale indicators, growth indicators, and technical indicators. Each type of indicator is a collection of several indicator data items of the same kind. Specifically, valuation indicators include the price-to-earnings ratio, price-to-book ratio, and price-to-cash-flow ratio; financial indicators include return on equity, return on assets, and the quick ratio; scale indicators include minimum market capitalization and the earnings-to-market-value ratio; growth indicators include year-on-year growth in total assets, net asset growth rate, year-on-year growth in operating profit, and year-on-year growth in net assets; technical indicators include 60-day average trading volume, the relative strength index, and volatility. It should be noted that, so that the preset indicator data can conveniently be input into the stock selection model for calculation, the value of every dimension of every indicator type must be standardized once the preset indicator data has been extracted. In the embodiments of the present application, the standardization formula

$$\tilde{x} = \frac{x - X}{\sigma}$$

may be used, where $\tilde{x}$ is the standardized indicator value, $x$ is the original indicator value, $X$ is the mean of the indicator series, and $\sigma$ is the standard deviation of the indicator series.
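The standardization step can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function name `standardize`, the sample price-to-earnings values, and the use of the population standard deviation are all assumptions, since the source does not say which standard deviation estimator is used.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return (x - mean) / std for every value of one indicator series."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # A constant series carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical price-to-earnings ratios of three stocks.
pe_ratios = [10.0, 20.0, 30.0]
z_scores = standardize(pe_ratios)
```

After standardization every indicator series has zero mean and unit standard deviation, so indicators measured on very different scales (ratios, volumes, growth rates) can be fed to the same model.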
A traditional multi-factor stock selection model calculates the contribution of each factor and, on that basis, uses a composite multi-factor score to arrive at suitable stock selection factors. The embodiments of the present application instead take a machine learning approach: each type of preset indicator data is treated as a factor, and the prediction target of the stock selection model is related to the total return over a given period. Here, total returns are divided into two classes with a large gap in return between them. Based on this prediction target, the importance of each factor (that is, of each type of preset indicator data) is calculated, and this importance serves as the basis for stock selection. In the embodiments of the present application, the machine learning algorithms that can be used include logistic regression, support vector machines, neural networks, and the like. Preferably, a support vector machine is used as the machine learning algorithm.
Because a machine learning algorithm is used, the stock selection model does not simply calculate the contribution of each factor with econometric and statistical methods. Instead, it treats the factors as different preset indicator data, namely the valuation, financial, scale, growth, and technical indicators extracted in the previous step, in order to select a set of stocks suitable for investment.
FIG. 2 details one specific implementation of S101:
S201: Set the M stocks whose full-year returns for a preset year rank in the top M as the first category, set the N stocks whose full-year returns for the preset year rank in the bottom N as the second category, and train an initial stock selection model to obtain the preset stock selection model.
In general, the stocks used to train the stock selection model are stocks with known returns; for example, the full-year returns of stocks in the year preceding the current year can be used for training. As described above, the stocks are divided into two categories according to their full-year returns: the first category consists of stocks whose full-year returns rank in the top M, and the second category consists of stocks whose full-year returns rank in the bottom N. By choosing M and N appropriately, the gap in full-year return between the first and second categories is widened, which makes it easier to train a suitable stock selection model.
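The labelling described in S201 can be sketched as follows. The function name `label_by_annual_return`, the stock codes, and the return figures are hypothetical; the labels 1 and 0 follow the worked example given later in this description. Stocks that fall in neither category are simply left out of the training set.

```python
def label_by_annual_return(returns, m, n):
    """Label the top-m stocks 1 and the bottom-n stocks 0.

    `returns` maps a stock code to its full-year return for the preset year.
    Stocks ranked in the middle are omitted from the result.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive integers")
    if m + n > len(returns):
        raise ValueError("m + n must not exceed the number of stocks")
    ranked = sorted(returns, key=returns.get, reverse=True)
    labels = {code: 1 for code in ranked[:m]}       # first category
    labels.update({code: 0 for code in ranked[-n:]})  # second category
    return labels

# Hypothetical full-year returns for five stocks.
annual_returns = {"A": 0.30, "B": 0.10, "C": -0.05, "D": 0.22, "E": -0.20}
labels = label_by_annual_return(annual_returns, m=2, n=2)
```

Dropping the middle-ranked stocks is what widens the return gap between the two classes, which is the stated purpose of choosing M and N carefully.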
S202: Based on a decision tree algorithm, select the P types of preset indicator data that contribute most to the full-year return of the preset year.
Each of the preset indicator types mentioned above consists of several indicator data items of the same kind. Here, a decision tree algorithm is used to pick, from all preset indicator types, the ones that contribute most to the full-year return; these serve as the basis for stock selection.
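The source does not state how the decision tree measures contribution. One common choice is the reduction in Gini impurity achieved by splitting on an indicator, and the sketch below ranks indicators by the best single-threshold split (a depth-one tree). That criterion, the function names, and the sample data are all assumptions made for illustration; a full decision tree would aggregate such gains over many splits.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 * p1 - (1.0 - p1) ** 2

def best_split_gain(feature, labels):
    """Largest Gini impurity reduction over all thresholds on one indicator."""
    parent = gini(labels)
    pairs = sorted(zip(feature, labels))
    n = len(pairs)
    best = 0.0
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold separates equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def top_p_indicators(columns, labels, p):
    """Return the names of the p indicators with the largest split gain."""
    gains = {name: best_split_gain(col, labels) for name, col in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)[:p]

# Hypothetical indicator columns for six labelled stocks.
y = [1, 1, 1, 0, 0, 0]
indicator_columns = {
    "pe": [5, 6, 7, 20, 21, 22],                   # separates the classes cleanly
    "roe": [0.2, 0.1, 0.3, 0.15, 0.05, 0.25],      # weakly informative
}
selected = top_p_indicators(indicator_columns, y, p=1)
```

In this toy data the `pe` column splits the two classes perfectly, so it is ranked first.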
S203: Input the above P types of preset indicator data of each stock into the preset stock selection model, and calculate the composite score of each stock on the P types of preset indicator data.
For the selected P types of preset indicator data, the data for these P types is obtained for each stock. For each stock, its P types of preset indicator data are input into the preset stock selection model, which outputs the composite score of that stock on these P types of preset indicator data.
S204: Output the stocks whose composite scores rank in the top Q as the stock portfolio.
Here, M, N, P and Q are all positive integers.
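Steps S203 and S204 amount to scoring every stock and keeping the Q best. In the sketch below a weighted sum stands in for the trained stock selection model; that linear form, the weights, the stock codes, and the indicator values are assumptions for illustration only, since the source leaves the scoring function to the trained model (for example a support vector machine).

```python
def composite_score(indicators, weights):
    """Weighted sum of standardized indicator values (stand-in for the model)."""
    return sum(weights[name] * value for name, value in indicators.items())

def select_portfolio(stock_indicators, weights, q):
    """Return the codes of the q stocks with the highest composite score."""
    scores = {
        code: composite_score(indicators, weights)
        for code, indicators in stock_indicators.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:q]

# Hypothetical weights and standardized indicator values.
weights = {"pe": -0.5, "roe": 1.0}
stock_indicators = {
    "600000": {"pe": 1.0, "roe": 0.5},
    "000001": {"pe": -1.0, "roe": 1.0},
    "000002": {"pe": 0.0, "roe": -1.0},
}
portfolio = select_portfolio(stock_indicators, weights, q=2)
```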
For example, all constituent stocks of the 2016 CSI 300 Index are placed in the candidate stock pool. Based on the full-year returns of these constituents, the 60 stocks with the highest 2016 full-year returns form one category and are labeled 1, and the 60 stocks with the lowest 2016 full-year returns form the other category and are labeled 0. The indicator data of these 120 stocks is used as the factors for training, and the decision tree algorithm then selects the three indicator features that contribute most, which serve as three factors. Next, the composite score of each stock on these three factors is calculated on the first day of the month to be predicted, and the 10 top-scoring stocks are selected as the portfolio.
S102: Separately acquire feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock.
First, feature data of the investment target is acquired. In the embodiments of the present application, the investment target is a stock, and the feature data of a stock reflects the valuation, scale, growth, and financial condition of the listed company to which the stock belongs. The raw data for a stock's feature data may come from the stock's market trading data, such as its opening price, lowest price, highest price, closing price, trading amount, or rate of return. Alternatively, it may come from the stock's technical indicator data, such as its moving average convergence divergence (MACD), on-balance volume (OBV), Bollinger Bands, psychological line (PSY), or triple exponential average (TRIX). Both the stock market trading data and the technical indicator data can be sourced from financial securities programs and obtained through the application programming interface (API) of such programs. Because the raw stock market trading data and technical indicator data are already quantified, they can be used directly as the stock's feature data without feature extraction.
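As an illustration of how a technical indicator is derived from trading data, the sketch below computes Bollinger Bands from a series of closing prices. The source obtains such indicators ready-made through a securities API, so this calculation is only an assumption about what that API returns; the 20-period window and the band width of two standard deviations are the conventional defaults, and the sample prices are hypothetical.

```python
from statistics import mean, pstdev

def bollinger_bands(closes, window=20, k=2.0):
    """Return (middle, upper, lower) bands for the latest `window` closes."""
    if len(closes) < window:
        raise ValueError("not enough closing prices for the chosen window")
    recent = closes[-window:]
    middle = mean(recent)       # simple moving average
    spread = k * pstdev(recent)  # k standard deviations
    return middle, middle + spread, middle - spread

# Hypothetical closing prices; a short window keeps the example small.
closes = [10.0, 11.0, 12.0, 11.0, 10.0]
middle, upper, lower = bollinger_bands(closes, window=5)
```

The three band values can then be appended to the day's trading data to form one feature row for the stock.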
Preferably, in addition to market trading data or technical indicator data, the feature data of a stock may also include social media data related to the stock. This data comes from Web 2.0 user-generated content on social platforms, self-media platforms and the like. Figure 3 shows the process of obtaining stock-related social media data:
S301: Collect and store user-generated content from online social platforms.
Specifically, a distributed web crawler may be used to collect user-generated content from the servers of each online social platform. Such content includes, but is not limited to, content that users publish on blogs or microblogs, comments that users post under news items or pictures, and posts or replies that users publish on forums.
S302: Extract the user-generated content that matches the search keywords, where the search keywords include a stock name or a stock code.
As a specific implementation, when matching and extracting user-generated content, the name or code of a stock may be used as the search keyword, so that the user-generated content matching that keyword is captured. For example, if a user publishes a forecast of a stock's trend on a microblog, the published content is bound to contain the stock's name and/or code, so the matching user-generated content can be retrieved by that name and/or code. Further, based on the extracted content, it is also possible to count the number of items of user-generated content matching the stock, the number of followers of the social account of each user who published such content, the registration time of that user's account, the type of terminal from which the user published the content, and so on.
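The keyword matching in S302 can be sketched as a simple substring filter. The `Post` record, the sample posts and the stock name and code below are hypothetical; a production system would more likely query a search index than scan posts in memory.

```python
from dataclasses import dataclass

@dataclass
class Post:
    user_id: str
    text: str

def match_posts(posts, stock_name, stock_code):
    """Keep only the posts whose text mentions the stock name or the stock code."""
    keywords = (stock_name, stock_code)
    return [p for p in posts if any(k in p.text for k in keywords)]

# Hypothetical crawled content, for illustration only.
posts = [
    Post("u1", "I expect 600000 to rise next week"),
    Post("u2", "Nice weather today"),
    Post("u3", "ExampleBank looks overvalued"),
]
matched = match_posts(posts, stock_name="ExampleBank", stock_code="600000")
matched_count = len(matched)  # also usable as the number of matching items
```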
S303: Analyze the extracted user-generated content to obtain the social media data of the stock.
By way of example, the stock feature data obtained from social media data may include the following types:
1. Sentiment valence (sentimentValence):
Sentiment valence reflects the sentiment tendency toward a stock on social platforms. By way of example, sentiment valence may be calculated by the formula
Figure PCTCN2018077242-appb-000003
where P is the number of items of user-generated content with positive sentiment and N is the number of items with negative sentiment. Research in behavioral finance has found that swings in investor sentiment cause irrational fluctuations in stock prices. Therefore, the closer the sentiment valence is to log(1) = 0, the smaller the swing in investor sentiment and the smaller the irrational fluctuation that the price of the corresponding stock may undergo; conversely, the further it is from 0, the larger the swing in investor sentiment and the larger the possible irrational price fluctuation.
2. Attention level:
The attention level reflects how much attention a stock receives on social platforms. Research in behavioral finance has also found that excessive investor attention to a stock causes irrational fluctuations in its price. In the embodiments of the present application, the attention level may be represented by the number of items of user-generated content: the more stock-related user-generated content there is, the higher the attention level and the larger the irrational fluctuation that the price of the corresponding stock may undergo; the less there is, the lower the attention level and the smaller the possible irrational fluctuation.
3. Participating-user influence:
Participating-user influence refers to the influence, on the social network platform or even across the whole Internet, of the users who publish stock-related user-generated content. It is generally considered that a more influential user is more likely to be a stock market manipulator, whereas a less influential user is more likely to be an amateur stock investor. On this view, the more influential a user is, the more easily that user's statements cause irrational fluctuations in the price of the corresponding stock; conversely, the less influential the user, the less likely such fluctuations are. By way of example, participating-user influence may be quantified by adding, or weighting and adding, three quantities: the number of accounts the user follows on the social network platform, the number of accounts that follow the user, and the total number of items of user-generated content the user has published.
4. Participating-user registration duration:
The participating-user registration duration may be represented by the ratio of new users to old users among those who publish user-generated content about the stock. A registration-duration threshold is set; users whose registration duration is below the threshold are classed as new users and users whose registration duration is above it as old users. The higher the ratio of new users to old users, the smaller the quantified registration duration, which indicates that the user group publishing content about the stock is less experienced and that its statements are less likely to cause irrational fluctuations in the stock's price. Conversely, the lower the ratio, the larger the quantified registration duration, which indicates that the user group is more experienced and more likely to consist of mature stock market investors, whose statements more easily cause irrational fluctuations in the stock's price.
5. Participating-user publishing terminal:
The participating-user publishing terminal may be quantified as the ratio of the mobile-terminal count to the PC count, where the mobile-terminal count is the number of items of user-generated content published from mobile terminals and the PC count is the number published from PCs. It is generally considered that users who publish stock market discussion from a PC are more likely to be mature stock market investors, whereas users who publish from a mobile terminal are more likely to be immature or less experienced traders. Accordingly, the higher the ratio of the mobile-terminal count to the PC count, the less experienced the user group publishing content about the stock, and the less likely its statements are to cause irrational fluctuations in the stock's price. Conversely, the lower this ratio, the more experienced the user group, and the more easily its statements cause irrational fluctuations in the stock's price.
6. Average length of user-generated content:
Here it is considered that users whose individual stock comments have a high average character count are mostly professional stock investors, whereas users whose comments have a low average character count are mostly non-professional investors. Accordingly, the higher the average character count of the user-generated content, the more experienced the user group discussing the stock, and the more easily its statements cause irrational fluctuations in the stock's price. Conversely, the lower the average character count, the less experienced the user group, and the less likely its statements are to cause such fluctuations.
Several types of stock feature data obtained from social media data are listed above, and different feature data correspond to different quantification methods. The stock feature data is therefore obtained by applying the relevant quantitative analysis to the extracted user-generated content that matches the stock. When the feature data also includes stock-related social media data, using it as model input for the machine-learning-based stock selection process incorporates the influence of user-generated content on the stock price, so that the stock price trend can be predicted more accurately.
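The quantification rules described above can be sketched as a handful of small functions. This is only a sketch under stated assumptions. The sentiment valence formula appears in the source as an image, so the add-one log ratio used here is an assumed form that merely agrees with the remark that balanced sentiment gives log(1) = 0. The function names, default weights and the treatment of users exactly at the threshold are also assumptions.

```python
import math

def sentiment_valence(positive, negative):
    """Assumed form: log of the smoothed positive/negative ratio; 0 when P == N."""
    return math.log((1 + positive) / (1 + negative))

def user_influence(following, followers, posts, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of accounts followed, follower count and published items."""
    w_following, w_followers, w_posts = weights
    return w_following * following + w_followers * followers + w_posts * posts

def ratio(numerator, denominator):
    """Plain ratio; infinity when the denominator is zero."""
    return numerator / denominator if denominator else float("inf")

def new_to_old_ratio(registration_days, threshold_days):
    """Users registered for fewer days than the threshold count as new, the rest as old."""
    new = sum(1 for d in registration_days if d < threshold_days)
    return ratio(new, len(registration_days) - new)

def average_length(texts):
    """Mean character count per item of user-generated content."""
    return sum(len(t) for t in texts) / len(texts) if texts else 0.0

# Hypothetical counts for one stock, for illustration only.
features = {
    "sentiment_valence": sentiment_valence(positive=12, negative=4),
    "attention": 16,  # number of matching items
    "influence": user_influence(following=10, followers=200, posts=50),
    "new_to_old": new_to_old_ratio([30, 100, 400, 800], threshold_days=365),
    "mobile_to_pc": ratio(9, 7),
    "average_length": average_length(["abc", "abcde"]),
}
```

Each value in `features` is a single number, so the dictionary can be flattened into the model's input vector next to the trading and indicator data.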
Preferably, in addition to market trading data or technical indicator data, the feature data of a stock may also include message data related to the stock, which may come from mainstream news clients or news websites. Figure 4 shows the process of obtaining stock-related message data:
S401: Collect and store news data from news clients or news websites.
Specifically, a distributed web crawler may be used to collect news data from the servers of each news client or news website. Preferably, given the massive amount of data published there, the crawler may target only the finance or stock market sections, to improve the efficiency of collecting such news data.
S402: Extract the news data that matches the search keywords, where the search keywords include the name of a person or company associated with a listed company.
The main purpose of crawling news data from news clients or news websites is to capture news related to listed companies, since such news can to some extent reflect how the price of the listed stock will fluctuate over a coming period. The listed company's name, or the name of its legal representative or other responsible person, is therefore used as the search keyword, so that stock-related news data can be extracted.
S403: Analyze the extracted news data to obtain the message data of the stock.
By way of example, the stock feature data obtained from news data may follow the types of stock feature data obtained from social media data above, for example the stock-related sentiment valence, attention level, publisher influence and length of the news data. Different feature data correspond to different quantification methods, so the stock feature data is obtained by applying the relevant quantitative analysis to the extracted news data that matches the stock. When the feature data also includes stock-related news data, using it as model input for the machine-learning-based stock selection process incorporates the influence of listed-company news on the stock price, so that the stock price trend can be predicted more accurately.
In the embodiments of the present application, after the relevant data has been collected and before the feature data of the stock is extracted from it, the data may undergo optimization such as denoising and filling of missing values, to further improve the efficiency of obtaining the feature data.
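The application does not say how missing values are filled. One common choice for daily market series, shown here purely as an assumption, is to carry the last observed value forward:

```python
def forward_fill(values):
    """Replace each missing entry (None) with the most recent observed value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None until the first observation arrives
    return filled

# Hypothetical series with gaps, e.g. days on which a stock was suspended.
raw = [None, 10.0, None, None, 10.4, None]
clean = forward_fill(raw)
```

Leading gaps remain `None` because there is nothing to carry forward, so a caller would usually drop those rows before training.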
In S103, a long short-term memory (LSTM) network is pre-trained, the feature data of each stock in the stock portfolio is input into the pre-trained LSTM network, and a price prediction result for each stock in the portfolio is output, so that the user can determine a timed stock purchase strategy according to the price prediction result.
After the stock portfolio has been determined, the timing of the purchase operation is determined by predicting the trend of the stock price over a given future time window. For example, with a daily time window, the prediction task amounts to predicting the next day's rise or fall signal, or the next day's closing price, from the previous day's data. On the basis of the portfolio chosen by the stock selection model, the embodiments of the present application combine the feature extraction step with a deep learning algorithm to obtain the prediction performance and the predicted return.
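With a daily window, the prediction task described above pairs each day's features with the following day's outcome. The sketch below builds both targets mentioned in the text, the rise or fall signal and the next closing price. The function name and the sample data are hypothetical.

```python
def make_daily_samples(features, closes):
    """Pair the features of day t with the rise/fall signal and close of day t + 1."""
    samples = []
    for t in range(len(closes) - 1):
        rose = 1 if closes[t + 1] > closes[t] else 0  # 1 = rise, 0 = flat or fall
        samples.append((features[t], rose, closes[t + 1]))
    return samples

# Hypothetical one-dimensional features and closing prices for three days.
features = [[0.1], [0.2], [0.3]]
closes = [10.0, 11.0, 10.5]
samples = make_daily_samples(features, closes)
```

The last day has no following day and therefore produces no training sample.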
In an LSTM, the conventional neuron, that is, a unit that applies a sigmoid activation to a linear combination of its inputs, is replaced by a memory cell. Each memory cell is associated with an input gate, an output gate, and an internal state that feeds into itself across time steps without interference. In the embodiments of the present application, the feature data of each stock in the stock portfolio is input into the pre-trained LSTM. The LSTM is a variant of the recurrent neural network (RNN), characterized by gate nodes added to each layer on top of the RNN structure. There are three kinds of gate: the forget gate, the input gate and the output gate. These gates can open or close, and are used to decide whether the output of the LSTM's memory state at a layer reaches a threshold and should therefore be included in the computation of the current layer. A gate node applies the sigmoid function to the network's memory state as input. If the output reaches the threshold, the gate output is multiplied by the current layer's result and passed as input to the next layer; if not, the output is forgotten. The weights of every layer, including those of the gate nodes, are updated in each backpropagation training pass of the model.
LSTM training can be optimized by tuning many parameters, such as the activation function, the number of LSTM layers, and the dimensions of the input and output variables. To minimize the training error, gradient descent, for example backpropagation through time (BPTT), can be used to adjust the weights according to the error at each step. In a standard RNN the error gradient vanishes exponentially with the time lag between events. When LSTM blocks are used, the error is propagated back from the output to each gate of the input stage until the value is filtered out. Ordinary backpropagation is therefore an effective way to train LSTM blocks to remember values over long durations.
Because of the many incidental factors at work in financial markets, financial data, and financial time series in particular, contain noise that seriously affects the analysis and processing of the data. It is therefore necessary to denoise the feature data of each stock in the portfolio obtained in S102 before it is input into the LSTM. However, because financial time series are non-stationary, non-linear and have a high signal-to-noise ratio, existing denoising methods are often unsuitable. As an embodiment of the present application, wavelet denoising is therefore used to filter the original time series. The denoising extracts and separates the hidden periodicities and non-linearities of the time series, and the characteristics of the wavelet decomposition sequences, together with the rule that the amount of decomposed data halves each time the scale doubles, are fully exploited in the computation of the LSTM neural network model.
As shown in Figure 5, S103 is implemented as follows:
S1031: Denoise the feature data of each stock in the stock portfolio separately.
S1032: Input the denoised feature data of each stock in the stock portfolio into the pre-trained LSTM network, and output a price prediction result for each stock in the portfolio.
Specifically, the embodiments of the present application use the Haar function as the wavelet basis function. This not only decomposes the time series effectively into the time domain and the frequency domain, but also significantly shortens the processing time, which in turn reduces the time the data spends in the LSTM. In the embodiments of the present application, the wavelet function of the continuous wavelet transform, with time t as the variable, is defined as:
Figure PCTCN2018077242-appb-000004
where a is the scale factor, τ is the translation factor, and φ(t) is a basic wavelet that satisfies the wavelet admissibility condition. The wavelet admissibility condition is defined as:
Figure PCTCN2018077242-appb-000005
where Φ(ω) is a function of the frequency ω and is the Fourier transform of φ(t). If x(t) is a square-integrable function, $x(t) \in L^2(\mathbb{R})$, then the continuous wavelet transform with wavelet φ can be defined as:
Figure PCTCN2018077242-appb-000006
where
Figure PCTCN2018077242-appb-000007
is the complex conjugate of φ(t). The inverse of the wavelet transform can then be defined as:
Figure PCTCN2018077242-appb-000008
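The formula images for the continuous wavelet transform are not reproduced in this text. As a reading aid, the standard textbook forms that fit the definitions given here (scale factor a, translation factor τ, basic wavelet φ with Fourier transform Φ) are sketched below. The names $\varphi_{a,\tau}$, $C_\varphi$ and $CWT_x$ are assumed notation, and the figures in the application may differ in detail.

```latex
\begin{align}
% scaled and translated wavelet
\varphi_{a,\tau}(t) &= \frac{1}{\sqrt{a}}\,\varphi\!\left(\frac{t-\tau}{a}\right) \\
% admissibility condition
C_\varphi &= \int_{0}^{\infty} \frac{|\Phi(\omega)|^{2}}{\omega}\,d\omega < \infty \\
% continuous wavelet transform of x(t)
CWT_x(a,\tau) &= \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\,
  \overline{\varphi\!\left(\frac{t-\tau}{a}\right)}\,dt \\
% inverse transform
x(t) &= \frac{1}{C_\varphi} \int_{0}^{\infty} \frac{da}{a^{2}}
  \int_{-\infty}^{\infty} CWT_x(a,\tau)\,\varphi_{a,\tau}(t)\,d\tau
\end{align}
```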
Because the wavelet basis of the continuous wavelet transform above is not orthogonal, the information it yields after transforming a signal is redundant. In the embodiments of the present application, an orthogonal wavelet basis is therefore constructed with the Mallat algorithm, a signal decomposition algorithm over an orthogonal wavelet basis. When filtering the time series, the algorithm uses a high-pass filter and a low-pass filter to implement the discrete wavelet transform. Specifically, the father wavelet
Figure PCTCN2018077242-appb-000009
describes the low-frequency components of the time series, and the mother wavelet ψ(t) describes its high-frequency components.
The father wavelet
Figure PCTCN2018077242-appb-000010
and the mother wavelet ψ(t) integrate to 1 and 0 respectively, and are defined as follows:
Figure PCTCN2018077242-appb-000011
The mother wavelet and the father wavelet at level j can be expressed respectively as:
Figure PCTCN2018077242-appb-000012
Figure PCTCN2018077242-appb-000013
With the multi-level indices k ∈ {0, 1, 2, ...} and j ∈ {0, 1, 2, ..., J}, the father wavelet and the mother wavelet can reconstruct the financial time series. The orthogonal wavelet series approximation of the time series x(t) is defined as:
Figure PCTCN2018077242-appb-000014
where the expansion coefficients $s_{J,k}$ and $d_{j,k}$ are given by:
Figure PCTCN2018077242-appb-000015
$$d_{j,k} = \int \psi_{j,k}(t)\,x(t)\,dt$$
The multi-scale approximation of the time series x(t) is given by:
Figure PCTCN2018077242-appb-000016
Figure PCTCN2018077242-appb-000017
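The formulas of the discrete wavelet decomposition appear in this text only as image placeholders. The standard forms that agree with the surrounding definitions, including the coefficient formula for $d_{j,k}$, are sketched below as a reading aid. Writing the father wavelet as $\phi$ is an assumption, and the figures in the application may use different notation.

```latex
\begin{align}
% father and mother wavelets integrate to 1 and 0
\int \phi(t)\,dt &= 1, & \int \psi(t)\,dt &= 0 \\
% dilated and translated versions at level j
\psi_{j,k}(t) &= 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right), &
\phi_{j,k}(t) &= 2^{-j/2}\,\phi\!\left(2^{-j}t - k\right) \\
% expansion coefficients
s_{J,k} &= \int \phi_{J,k}(t)\,x(t)\,dt, &
d_{j,k} &= \int \psi_{j,k}(t)\,x(t)\,dt \\
% smooth and detail components
S_J(t) &= \sum_{k} s_{J,k}\,\phi_{J,k}(t), &
D_j(t) &= \sum_{k} d_{j,k}\,\psi_{j,k}(t)
\end{align}

% orthogonal wavelet series approximation
\begin{equation}
x(t) = \sum_{k} s_{J,k}\,\phi_{J,k}(t) + \sum_{k} d_{J,k}\,\psi_{J,k}(t)
     + \sum_{k} d_{J-1,k}\,\psi_{J-1,k}(t) + \cdots + \sum_{k} d_{1,k}\,\psi_{1,k}(t)
\end{equation}
```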
Therefore, the simplified form of the orthogonal wavelet series approximation can be written as:
$$x(t) = S_J(t) + D_J(t) + D_{J-1}(t) + \cdots + D_1(t)$$
where $S_J(t)$ is the coarsest approximation of the input time series x(t), and the multi-resolution decomposition of x(t) is the sequence $\{S_J(t), D_J(t), D_{J-1}(t), \ldots, D_1(t)\}$. When the financial time series is very rough, the discrete wavelet transform can be applied repeatedly to reduce the risk in the process.
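A multi-level Haar decomposition of the kind described above can be sketched in a few lines. Each step applies a low-pass and a high-pass filter and halves the length of the series, which is the halving rule mentioned earlier. The text does not say how the detail coefficients are treated before reconstruction; the soft thresholding used here is a common choice and is an assumption, as are the function names and the sample series.

```python
import math

def haar_step(signal):
    """One Mallat step: low-pass (approximation) and high-pass (detail) outputs,
    each half as long as the input. The input length must be even."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Exact inverse of haar_step."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out

def haar_denoise(signal, levels, threshold):
    """Decompose `levels` times, soft-threshold the details, then reconstruct.
    len(signal) must be divisible by 2 ** levels."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)  # details[0] is the finest scale

    def soft(x):
        # Shrink the magnitude by `threshold`, never past zero.
        return math.copysign(max(abs(x) - threshold, 0.0), x)

    for d in reversed(details):  # rebuild from the coarsest scale upward
        approx = haar_inverse_step(approx, [soft(x) for x in d])
    return approx

# Hypothetical closing prices, for illustration only.
prices = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
smoothed = haar_denoise(prices, levels=2, threshold=0.5)
```

With a threshold of zero the series is reconstructed exactly, which is a convenient check that the forward and inverse steps match.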
After the feature data of each stock in the stock portfolio has been denoised, it is input into the LSTM, which outputs a price prediction result for each stock in the portfolio. In an LSTM, each neuron is a memory cell containing an input gate, a forget gate and an output gate. One key to the LSTM model is the forget gate, which controls the convergence of the gradient during training while also preserving long-term memory. Figure 6 shows the computation graph of a simple LSTM memory cell. With reference to Figure 6, the main mathematical symbols involved in the computation of a simple memory cell are as follows:
1. $x_t$ is the input vector to the memory cell at time t;
2. $W_i$, $W_f$, $W_c$, $W_o$, $U_i$, $U_f$, $U_c$, $U_o$ and $V_o$ are square network weight matrices;
3. $b_i$, $b_f$, $b_c$ and $b_o$ are network bias vectors;
4. $h_t$ is the value of the memory cell at time t;
5. $i_t$ (the input gate shown in Figure 6) and
Figure PCTCN2018077242-appb-000018
are the input gate and the candidate state of the memory cell at time t, computed as:
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$
Figure PCTCN2018077242-appb-000019
6. $f_t$ (the forget gate shown in Figure 6) and $C_t$ are the forget gate and the cell state of the memory cell at time t, computed as:
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$
Figure PCTCN2018077242-appb-000020
7. $o_t$ (the output gate shown in Figure 6) and $h_t$ are the output gate and the value of the memory cell at time t, computed as:
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + V_o C_t + b_o)$$
$$h_t = o_t * \tanh(C_t)$$
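The symbols listed above can be put together into one forward step of a memory cell. The gate formulas for $i_t$, $f_t$, $o_t$ and $h_t$ follow the text. The candidate state and the cell-state update are given only as figures, so the sketch assumes their standard forms, $\tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$ and $C_t = i_t * \tilde{C}_t + f_t * C_{t-1}$. The helper names and the parameter dictionary are hypothetical.

```python
import math

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def add(*vectors):
    """Element-wise sum of equally long vectors."""
    return [sum(parts) for parts in zip(*vectors)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def lstm_step(x_t, h_prev, c_prev, p):
    """One time step of a memory cell; returns (h_t, C_t)."""
    i_t = sigmoid(add(matvec(p["W_i"], x_t), matvec(p["U_i"], h_prev), p["b_i"]))
    f_t = sigmoid(add(matvec(p["W_f"], x_t), matvec(p["U_f"], h_prev), p["b_f"]))
    # Assumed standard candidate state (shown only as a figure in the source).
    pre_c = add(matvec(p["W_c"], x_t), matvec(p["U_c"], h_prev), p["b_c"])
    c_hat = [math.tanh(x) for x in pre_c]
    # Assumed standard cell-state update (shown only as a figure in the source).
    c_t = [i * g + f * c for i, g, f, c in zip(i_t, c_hat, f_t, c_prev)]
    # The output gate also sees the new cell state through V_o, as in the text.
    o_t = sigmoid(add(matvec(p["W_o"], x_t), matvec(p["U_o"], h_prev),
                      matvec(p["V_o"], c_t), p["b_o"]))
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy parameters: all-zero 2x2 weights and biases, so every gate outputs 0.5.
zero_m = [[0.0, 0.0], [0.0, 0.0]]
names = ["W_i", "W_f", "W_c", "W_o", "U_i", "U_f", "U_c", "U_o", "V_o"]
params = {name: zero_m for name in names}
params.update({b: [0.0, 0.0] for b in ["b_i", "b_f", "b_c", "b_o"]})

h_t, c_t = lstm_step([1.0, 2.0], [0.0, 0.0], [1.0, -2.0], params)
```

Because the text states that the weight matrices are square, the input vector and the hidden state have the same dimension in this sketch.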
In the embodiments of the present application, a stock portfolio suitable for investment is screened out based on the preset indicator data of each stock, and the feature data of each stock in the portfolio is then extracted from multiple data sources that affect share price fluctuations, so that price prediction results for these stocks are computed by the LSTM network. The whole prediction process takes full account of the behavioral characteristics of financial markets and effectively reduces the deviation between the prediction results and the subsequent actual price trend. The user can thus make stock selection and timed purchase decisions more reasonably on the basis of the price prediction results, which effectively lowers the user's investment risk.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present application in any way.
Corresponding to the machine-learning-based timed stock purchase method described in the above embodiments, Figure 7 shows a structural block diagram of the machine-learning-based timed stock purchase apparatus provided by an embodiment of the present application. For ease of description, only the parts relevant to the embodiments of the present application are shown.
Referring to Figure 7, the apparatus includes:
Stock selection unit 71: inputs the preset indicator data of each stock into a preset stock selection model and outputs a stock portfolio.
Obtaining unit 72: obtains feature data for each stock in the stock portfolio, where the feature data of a stock includes the stock's market trading data or its technical indicator data.
Prediction unit 73: pre-trains an LSTM network, inputs the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and outputs a price prediction result for each stock in the portfolio, so that the user can determine a timed stock purchase strategy according to the price prediction result.
Optionally, the feature data further includes social media data related to the stock, in which case the obtaining unit 72 includes:
First collection subunit: collects and stores user-generated content from online social platforms.
First extraction subunit: extracts the user-generated content that matches the search keywords, where the search keywords include the stock name or stock code of each stock in the stock portfolio.
First analysis subunit: analyzes the extracted user-generated content to obtain the feature data of each stock in the stock portfolio.
Optionally, the feature data further includes message data related to the stock, in which case the obtaining unit 72 includes:
First collection subunit: collects and stores news data from news clients or news websites.
First extraction subunit: extracts the news data that matches the search keywords, where the search keywords include the name of a person or company associated with the listed company corresponding to each stock in the stock portfolio.
First analysis subunit: analyzes the extracted news data to obtain the message data of each stock in the stock portfolio.
Optionally, the stock selection unit 71 includes:
Training subunit: sets the M stocks whose full-year return in a preset year ranks in the top M as a first category, sets the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and trains an initial stock selection model to obtain the preset stock selection model;
Selection subunit: selects, based on a decision tree algorithm, the P types of preset indicator data that contribute most to the full-year return of the preset year;
Calculation subunit: inputs the P types of preset indicator data of each stock into the preset stock selection model and calculates each stock's composite score on the P types of preset indicator data;
First output subunit: outputs the stocks whose composite scores rank in the top Q as the stock portfolio;
where M, N, P and Q are all positive integers.
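The labeling performed by the training subunit and the top-Q output can be sketched as below. The stock codes, returns and scores are made-up values, and the composite scores are supplied directly because the text does not specify how the preset stock selection model computes them.

```python
def label_by_annual_return(annual_returns, m, n):
    """Label the top-m stocks 1 (first category) and the bottom-n stocks 0 (second category)."""
    ranked = sorted(annual_returns, key=annual_returns.get, reverse=True)
    if m + n > len(ranked):
        raise ValueError("m + n must not exceed the number of stocks")
    labels = {code: 1 for code in ranked[:m]}
    # Slice from an explicit index so that n == 0 yields an empty slice.
    labels.update({code: 0 for code in ranked[len(ranked) - n:]})
    return labels

def top_q(scores, q):
    """Return the q stock codes with the highest composite score."""
    return sorted(scores, key=scores.get, reverse=True)[:q]

# Hypothetical full-year returns for a preset year.
annual_returns = {"A": 0.42, "B": 0.10, "C": -0.05, "D": 0.31, "E": -0.20}
labels = label_by_annual_return(annual_returns, m=2, n=2)

# Hypothetical composite scores, as if produced by a trained selection model.
scores = {"A": 0.9, "B": 0.4, "C": 0.3, "D": 0.7, "E": 0.1}
portfolio = top_q(scores, q=2)
```

Stocks ranked between the top M and bottom N receive no label and are simply left out of the training set in this sketch.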
Optionally, the prediction unit 73 includes:
Transform subunit: performs denoising on the feature data of each stock in the stock portfolio;
Second output subunit: inputs the denoised feature data of each stock in the stock portfolio into the pre-trained long short-term memory (LSTM) network and outputs a price prediction result for each stock in the stock portfolio.
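As a rough illustration of the denoising step, the sketch below smooths a price series with a trailing moving average. This passage does not state which transform the transform subunit applies, so the moving average is a swapped-in stand-in chosen for simplicity, and the window size and sample prices are assumptions.

```python
def moving_average(series, window=3):
    """Smooth a series with a trailing moving average (uses no future values)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(series):
        running += value
        if i >= window:
            # Drop the value that just left the trailing window.
            running -= series[i - window]
        # The first few points average over however many values exist so far.
        smoothed.append(running / min(i + 1, window))
    return smoothed

# Hypothetical closing prices.
closing_prices = [10.0, 12.0, 11.0, 15.0, 14.0]
denoised = moving_average(closing_prices, window=3)
```

A trailing window is used so that each smoothed value depends only on past and current observations, which matters when the output feeds a forecasting model.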
FIG. 8 is a schematic diagram of a terminal device provided by an embodiment of the present application. As shown in FIG. 8, the terminal device 8 of this embodiment includes a processor 80, a memory 81, and computer-readable instructions 82 that are stored in the memory 81 and executable on the processor 80, for example a machine-learning-based timing-based stock purchase program. When executing the computer-readable instructions 82, the processor 80 implements the steps of the method embodiments described above, for example steps 101 to 103 shown in FIG. 1. Alternatively, when executing the computer-readable instructions 82, the processor 80 implements the functions of the units in the apparatus embodiments described above, for example the functions of units 71 to 73 shown in FIG. 7.
By way of example, the computer-readable instructions 82 may be divided into one or more units, which are stored in the memory 81 and executed by the processor 80 to carry out the present application. The one or more units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer-readable instructions 82 in the terminal device 8. For example, the computer-readable instructions 82 may be divided into a stock selection unit, an acquisition unit and a prediction unit, whose specific functions are as follows:
Stock selection unit: inputs the preset indicator data of each stock into the preset stock selection model and outputs a stock portfolio.
Acquisition unit: acquires the feature data of each stock in the stock portfolio, where the feature data of a stock includes the stock's market trading data or its technical indicator data.
Prediction unit: pre-trains a long short-term memory (LSTM) network, inputs the feature data of each stock in the stock portfolio into the pre-trained LSTM network, and outputs a price prediction result for each stock in the stock portfolio, so that the user can determine a timing-based stock purchase strategy according to the price prediction result.
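The way the three units hand data to one another can be sketched as a small pipeline. Only the wiring follows the text; every function and value below is a hypothetical stand-in, and in particular `predict_stub` returns the last observed value instead of running an LSTM.

```python
from typing import Callable, Dict, List, Sequence

def run_pipeline(
    indicators: Dict[str, Sequence[float]],
    select: Callable[[Dict[str, Sequence[float]]], List[str]],
    acquire: Callable[[str], List[float]],
    predict: Callable[[List[float]], float],
) -> Dict[str, float]:
    """Chain the three units: selection -> acquisition -> prediction."""
    portfolio = select(indicators)                       # stock selection unit
    features = {code: acquire(code) for code in portfolio}  # acquisition unit
    return {code: predict(series) for code, series in features.items()}  # prediction unit

# Hypothetical preset indicator data and price histories.
indicators = {"A": [0.5, 0.7], "B": [0.1, 0.2], "C": [0.9, 0.4]}
history = {"A": [10.0, 10.5], "B": [5.0, 5.1], "C": [20.0, 19.5]}

def select_stub(table):
    """Placeholder selection model: keep the two stocks with the largest indicator sum."""
    return sorted(table, key=lambda code: sum(table[code]), reverse=True)[:2]

def acquire_stub(code):
    """Placeholder acquisition: look up stored trading data."""
    return history[code]

def predict_stub(series):
    """Placeholder predictor: repeat the last observed value (not an LSTM)."""
    return series[-1]

forecasts = run_pipeline(indicators, select_stub, acquire_stub, predict_stub)
```

Passing the units in as callables mirrors the statement that the instructions may be divided into separate units with distinct functions.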
Optionally, if the feature data further includes social media data related to the stock, the acquisition unit includes:
First collection subunit: collects and stores user-generated content from online social platforms.
First extraction subunit: extracts the user-generated content that matches the search keywords, where the search keywords include the stock name or stock code of each stock in the stock portfolio.
First analysis subunit: analyzes the extracted user-generated content to obtain the social media data of each stock in the stock portfolio.
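The text does not say how the analysis subunit turns matched posts into social media data. One possible reading, sketched below purely as an assumption, is a per-stock sentiment score computed from a small word lexicon; the lexicon, posts, stock names and codes are all invented for the example.

```python
# Hypothetical sentiment lexicon.
POSITIVE = {"surge", "beat", "strong"}
NEGATIVE = {"plunge", "miss", "weak"}

def sentiment_score(posts):
    """Average per-post score: +1 for each positive word, -1 for each negative word."""
    if not posts:
        return 0.0
    total = 0
    for post in posts:
        words = post.lower().split()
        total += sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return total / len(posts)

def social_features(posts, names_and_codes):
    """Match posts by stock name or code (case-insensitive), then score each stock."""
    features = {}
    for code, keywords in names_and_codes.items():
        matched = [p for p in posts if any(k.lower() in p.lower() for k in keywords)]
        features[code] = sentiment_score(matched)
    return features

# Hypothetical user-generated content.
posts = [
    "ACME earnings beat expectations, strong quarter",
    "Acme Corp shares plunge after weak guidance",
    "Nothing to see here",
    "GLBX looks weak today",
]
names_and_codes = {"ACME": ["ACME", "Acme Corp"], "GLBX": ["GLBX", "Globex"]}
social_media_data = social_features(posts, names_and_codes)
```

In this toy data the two ACME posts cancel out, while the single GLBX post yields a negative score.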
Optionally, if the feature data further includes message data related to the stock, the acquisition unit includes:
First collection subunit: collects and stores news data from news clients or news websites.
First extraction subunit: extracts the news data that matches the search keywords, where the search keywords include the names of persons or companies associated with the listed company corresponding to each stock in the stock portfolio.
First analysis subunit: analyzes the extracted news data to obtain the message data of each stock in the stock portfolio.
Optionally, the stock selection unit includes:
Training subunit: sets the M stocks whose full-year return in a preset year ranks in the top M as a first category, sets the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and trains an initial stock selection model to obtain the preset stock selection model;
Selection subunit: selects, based on a decision tree algorithm, the P types of preset indicator data that contribute most to the full-year return of the preset year;
Calculation subunit: inputs the P types of preset indicator data of each stock into the preset stock selection model and calculates each stock's composite score on the P types of preset indicator data;
First output subunit: outputs the stocks whose composite scores rank in the top Q as the stock portfolio;
where M, N, P and Q are all positive integers.
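One way to read "contribution" in the decision-tree-based selection of P indicators is the impurity reduction an indicator achieves when used as a split. The sketch below ranks indicators by the best Gini gain of a single threshold split (a depth-one tree). The contribution measure, the indicator names and the sample values are assumptions, since the text names only "a decision tree algorithm".

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split_gain(values, labels):
    """Largest Gini impurity reduction from one threshold split on a single indicator."""
    parent = gini(labels)
    best = 0.0
    # The largest value is skipped because splitting there leaves the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        best = max(best, parent - weighted)
    return best

def select_indicators(table, labels, p):
    """Return the p indicator names with the largest single-split Gini gain."""
    gains = {name: best_split_gain(column, labels) for name, column in table.items()}
    return sorted(gains, key=gains.get, reverse=True)[:p]

# Hypothetical indicator columns for four stocks; labels: 1 = first category, 0 = second.
table = {
    "pe_ratio": [8.0, 9.0, 30.0, 35.0],
    "turnover": [1.0, 3.0, 2.0, 4.0],
    "roe": [0.20, 0.18, 0.05, 0.02],
}
labels = [1, 1, 0, 0]
chosen = select_indicators(table, labels, p=2)
```

In this toy table `pe_ratio` and `roe` each separate the two categories perfectly, whereas `turnover` does not, so the first two are kept.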
Optionally, the prediction unit includes:
Transform subunit: performs denoising on the feature data of each stock in the stock portfolio;
Second output subunit: inputs the denoised feature data of each stock in the stock portfolio into the pre-trained long short-term memory (LSTM) network and outputs a price prediction result for each stock in the stock portfolio.
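To make the role of the long short-term memory network concrete, the sketch below runs the standard LSTM cell equations over a short series, using a single scalar input and a single hidden unit. The weights are arbitrary and untrained, and no output layer maps the hidden state to a price, so this shows only the recurrence and is not the applicant's network.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; w maps gate name -> (w_x, w_h, bias)."""
    def gate(name, activation):
        w_x, w_h, bias = w[name]
        return activation(w_x * x + w_h * h_prev + bias)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much of the candidate to write
    g = gate("candidate", math.tanh)  # candidate cell content
    o = gate("output", sigmoid)       # how much of the cell state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

def run_lstm(series, w):
    """Feed a series through the cell and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in series:
        h, c = lstm_step(x, h, c, w)
    return h

# Arbitrary, untrained weights (assumed values for illustration).
GATES = ("forget", "input", "candidate", "output")
weights = {name: (0.5, 0.5, 0.0) for name in GATES}
final_hidden = run_lstm([0.1, 0.2, 0.3], weights)
```

In practice the weights would come from the pre-training step, and the final hidden state would pass through a trained output layer to produce the price prediction.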
The terminal device 8 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer or a cloud server. The terminal device 8 may include, but is not limited to, the processor 80 and the memory 81. Those skilled in the art will understand that FIG. 8 is merely an example of the terminal device 8 and does not limit it; the terminal device 8 may include more or fewer components than shown, combine certain components, or use different components. For example, the terminal device 8 may further include input/output devices, network access devices, a bus, and the like.
The processor 80 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 81 may be an internal storage unit of the terminal device 8, such as a hard disk or internal memory of the terminal device 8. The memory 81 may also be an external storage device of the terminal device 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the terminal device 8. Further, the memory 81 may include both an internal storage unit and an external storage device of the terminal device 8. The memory 81 is used to store the computer-readable instructions and other programs and data required by the terminal device. The memory 81 may also be used to temporarily store data that has been output or is about to be output.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may each exist physically on their own, or two or more of them may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as a standalone product, it may be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the method embodiments described above may also be completed by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a computer-readable storage medium and, when executed by a processor, implement the steps of the method embodiments described above. The computer-readable instructions include computer-readable instruction code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be expanded or narrowed as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The embodiments described above are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and they shall all fall within the scope of protection of the present application.

Claims (20)

  1. A timing-based stock purchase method based on machine learning, characterized by comprising:
    inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
    acquiring feature data of each stock in the stock portfolio, the feature data of a stock including stock market trading data of the stock or technical indicator data of the stock;
    pre-training a long short-term memory network, inputting the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio, so that a user determines a timing-based stock purchase strategy according to the price prediction result.
  2. The timing-based stock purchase method based on machine learning according to claim 1, characterized in that the feature data further includes social media data related to the stock, and the acquiring feature data of each stock in the stock portfolio comprises:
    collecting and storing user-generated content from online social platforms;
    extracting the user-generated content that matches search keywords, the search keywords including a stock name or a stock code of each stock in the stock portfolio;
    analyzing the extracted user-generated content to obtain the social media data of each stock in the stock portfolio.
  3. The timing-based stock purchase method based on machine learning according to claim 1, characterized in that the feature data further includes message data related to the stock, and the acquiring feature data of each stock in the stock portfolio comprises:
    collecting and storing news data from news clients or news websites;
    extracting the news data that matches search keywords, the search keywords including names of persons or companies associated with the listed company corresponding to each stock in the stock portfolio;
    analyzing the extracted news data to obtain the message data of each stock in the stock portfolio.
  4. The timing-based stock purchase method based on machine learning according to claim 1, characterized in that the inputting preset indicator data of each stock into a preset stock selection model and outputting a stock portfolio comprises:
    setting the M stocks whose full-year return in a preset year ranks in the top M as a first category, setting the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and training an initial stock selection model to obtain the preset stock selection model;
    selecting, based on a decision tree algorithm, the P types of the preset indicator data that contribute most to the full-year return of the preset year;
    inputting the P types of the preset indicator data of each stock into the preset stock selection model, and calculating a composite score of each stock on the P types of the preset indicator data;
    outputting the stocks whose composite scores rank in the top Q as the stock portfolio;
    wherein M, N, P and Q are all positive integers.
  5. The timing-based stock purchase method based on machine learning according to claim 1, characterized in that the inputting the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network and outputting a price prediction result for each stock in the stock portfolio comprises:
    performing denoising on the feature data of each stock in the stock portfolio;
    inputting the denoised feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio.
  6. A timing-based stock purchase apparatus based on machine learning, characterized by comprising:
    a stock selection unit, configured to input preset indicator data of each stock into a preset stock selection model and output a stock portfolio;
    an acquisition unit, configured to acquire feature data of each stock in the stock portfolio, wherein the feature data of a stock includes stock market trading data of the stock or technical indicator data of the stock;
    a prediction unit, configured to pre-train a long short-term memory network, input the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and output a price prediction result for each stock in the stock portfolio, so that a user determines a timing-based stock purchase strategy according to the price prediction result.
  7. The timing-based stock purchase apparatus based on machine learning according to claim 6, characterized in that the feature data further includes social media data related to the stock, and the acquisition unit comprises:
    a first collection subunit, configured to collect and store user-generated content from online social platforms;
    a first extraction subunit, configured to extract the user-generated content that matches search keywords, wherein the search keywords include a stock name or a stock code of each stock in the stock portfolio;
    a first analysis subunit, configured to analyze the extracted user-generated content to obtain the social media data of each stock in the stock portfolio.
  8. The timing-based stock purchase apparatus based on machine learning according to claim 6, characterized in that the feature data further includes message data related to the stock, and the acquisition unit comprises:
    a first collection subunit, configured to collect and store news data from news clients or news websites;
    a first extraction subunit, configured to extract the news data that matches search keywords, wherein the search keywords include names of persons or companies associated with the listed company corresponding to each stock in the stock portfolio;
    a first analysis subunit, configured to analyze the extracted news data to obtain the message data of each stock in the stock portfolio.
  9. The timing-based stock purchase apparatus based on machine learning according to claim 6, characterized in that the stock selection unit comprises:
    a training subunit, configured to set the M stocks whose full-year return in a preset year ranks in the top M as a first category, set the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and train an initial stock selection model to obtain the preset stock selection model;
    a selection subunit, configured to select, based on a decision tree algorithm, the P types of preset indicator data that contribute most to the full-year return of the preset year;
    a calculation subunit, configured to input the P types of preset indicator data of each stock into the preset stock selection model and calculate a composite score of each stock on the P types of preset indicator data;
    a first output subunit, configured to output the stocks whose composite scores rank in the top Q as the stock portfolio;
    wherein M, N, P and Q are all positive integers.
  10. The timing-based stock purchase apparatus based on machine learning according to claim 6, characterized in that the prediction unit comprises:
    a transform subunit, configured to perform denoising on the feature data of each stock in the stock portfolio;
    a second output subunit, configured to input the denoised feature data of each stock in the stock portfolio into the pre-trained long short-term memory network and output a price prediction result for each stock in the stock portfolio.
  11. A terminal device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor, when executing the computer-readable instructions, implements the following steps:
    inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
    acquiring feature data of each stock in the stock portfolio, the feature data of a stock including stock market trading data of the stock or technical indicator data of the stock;
    pre-training a long short-term memory network, inputting the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio, so that a user determines a timing-based stock purchase strategy according to the price prediction result.
  12. The terminal device according to claim 11, characterized in that the feature data further includes social media data related to the stock, and the step of acquiring feature data of each stock in the stock portfolio comprises:
    collecting and storing user-generated content from online social platforms;
    extracting the user-generated content that matches search keywords, the search keywords including a stock name or a stock code of each stock in the stock portfolio;
    analyzing the extracted user-generated content to obtain the social media data of each stock in the stock portfolio.
  13. The terminal device according to claim 11, characterized in that the feature data further includes message data related to the stock, and the step of acquiring feature data of each stock in the stock portfolio comprises:
    collecting and storing news data from news clients or news websites;
    extracting the news data that matches search keywords, the search keywords including names of persons or companies associated with the listed company corresponding to each stock in the stock portfolio;
    analyzing the extracted news data to obtain the message data of each stock in the stock portfolio.
  14. The terminal device according to claim 11, characterized in that the step of inputting preset indicator data of each stock into a preset stock selection model and outputting a stock portfolio comprises:
    setting the M stocks whose full-year return in a preset year ranks in the top M as a first category, setting the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and training an initial stock selection model to obtain the preset stock selection model;
    selecting, based on a decision tree algorithm, the P types of the preset indicator data that contribute most to the full-year return of the preset year;
    inputting the P types of the preset indicator data of each stock into the preset stock selection model, and calculating a composite score of each stock on the P types of the preset indicator data;
    outputting the stocks whose composite scores rank in the top Q as the stock portfolio;
    wherein M, N, P and Q are all positive integers.
  15. The terminal device according to claim 11, characterized in that the step of inputting the feature data of each stock in the stock portfolio into the long short-term memory network and outputting a price prediction result for each stock in the stock portfolio comprises:
    performing denoising on the feature data of each stock in the stock portfolio;
    inputting the denoised feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio.
  16. A computer-readable storage medium storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by at least one processor, implement the following steps:
    inputting preset indicator data of each stock into a preset stock selection model, and outputting a stock portfolio;
    acquiring feature data of each stock in the stock portfolio, the feature data of a stock including stock market trading data of the stock or technical indicator data of the stock;
    pre-training a long short-term memory network, inputting the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio, so that a user determines a timing-based stock purchase strategy according to the price prediction result.
  17. The computer-readable storage medium according to claim 16, characterized in that the feature data further includes social media data related to the stock, and the acquiring feature data of each stock in the stock portfolio comprises:
    collecting and storing user-generated content from online social platforms;
    extracting the user-generated content that matches search keywords, the search keywords including a stock name or a stock code of each stock in the stock portfolio;
    analyzing the extracted user-generated content to obtain the social media data of each stock in the stock portfolio.
  18. The computer-readable storage medium according to claim 16, characterized in that the feature data further includes message data related to the stock, and the acquiring feature data of each stock in the stock portfolio comprises:
    collecting and storing news data from news clients or news websites;
    extracting the news data that matches search keywords, the search keywords including names of persons or companies associated with the listed company corresponding to each stock in the stock portfolio;
    analyzing the extracted news data to obtain the message data of each stock in the stock portfolio.
  19. The computer-readable storage medium according to claim 16, characterized in that the inputting preset indicator data of each stock into a preset stock selection model and outputting a stock portfolio comprises:
    setting the M stocks whose full-year return in a preset year ranks in the top M as a first category, setting the N stocks whose full-year return in the preset year ranks in the bottom N as a second category, and training an initial stock selection model to obtain the preset stock selection model;
    selecting, based on a decision tree algorithm, the P types of the preset indicator data that contribute most to the full-year return of the preset year;
    inputting the P types of the preset indicator data of each stock into the preset stock selection model, and calculating a composite score of each stock on the P types of the preset indicator data;
    outputting the stocks whose composite scores rank in the top Q as the stock portfolio;
    wherein M, N, P and Q are all positive integers.
  20. The computer-readable storage medium according to claim 16, characterized in that the inputting the feature data of each stock in the stock portfolio into the pre-trained long short-term memory network and outputting a price prediction result for each stock in the stock portfolio comprises:
    performing denoising on the feature data of each stock in the stock portfolio;
    inputting the denoised feature data of each stock in the stock portfolio into the pre-trained long short-term memory network, and outputting a price prediction result for each stock in the stock portfolio.
PCT/CN2018/077242 2017-09-28 2018-02-26 Time selection admission method based on machine learning, device and terminal equipment therefor WO2019062006A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710899893.1 2017-09-28
CN201710899893.1A CN107798604A (en) 2017-09-28 2017-09-28 Become a shareholder when selecting method and terminal device based on machine learning

Publications (1)

Publication Number Publication Date
WO2019062006A1 true WO2019062006A1 (en) 2019-04-04

Family

ID=61533874

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/077242 WO2019062006A1 (en) 2017-09-28 2018-02-26 Time selection admission method based on machine learning, device and terminal equipment therefor

Country Status (2)

Country Link
CN (1) CN107798604A (en)
WO (1) WO2019062006A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108734335A (en) * 2018-04-03 2018-11-02 平安科技(深圳)有限公司 Electronic device, finance data processing method and computer readable storage medium
CN108804564A (en) * 2018-05-22 2018-11-13 深圳壹账通智能科技有限公司 The combined recommendation method and terminal device of financial product
CN108764718B (en) * 2018-05-28 2021-05-18 王春宁 College entrance examination score estimation and volunteering selection method and system based on deep learning algorithm
CN108985501B (en) * 2018-06-29 2022-04-29 平安科技(深圳)有限公司 Index feature extraction-based stock index prediction method, server and storage medium
CN110880163B (en) * 2018-09-05 2022-08-19 南京大学 Low-light color imaging method based on deep learning
CN109389426A (en) * 2018-09-26 2019-02-26 深圳壹账通智能科技有限公司 Acquisition methods, system, computer equipment and the storage medium of commodity price level
CN109271971B (en) * 2018-11-02 2022-06-14 广东工业大学 Noise reduction method for time sequence financial data
CN111178498B (en) * 2019-12-09 2023-08-22 北京邮电大学 Stock fluctuation prediction method and device
CN111222051B (en) * 2020-01-16 2023-09-12 深圳市华海同创科技有限公司 Training method and device for trend prediction model
CN111681113B (en) * 2020-05-29 2023-07-18 泰康保险集团股份有限公司 System and server for configuring foundation product object
CN112561699A (en) * 2020-12-11 2021-03-26 山证科技(深圳)有限公司 Method, system and storage medium for processing dealer client data
CN113159941A (en) * 2021-02-02 2021-07-23 上海卡方信息科技有限公司 Intelligent streaming transaction execution method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106384288A (en) * 2016-08-31 2017-02-08 苗青 Decision system and method based on multiple-factor quantization time selection model
CN106952161A (en) * 2017-03-31 2017-07-14 洪志令 A kind of recent forward prediction method of stock based on shot and long term memory depth learning network

Also Published As

Publication number Publication date
CN107798604A (en) 2018-03-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18863767

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC , EPO FORM 1205A DATED 28.09.2020.

122 Ep: pct application non-entry in european phase

Ref document number: 18863767

Country of ref document: EP

Kind code of ref document: A1