CN106022522A - Method and system for predicting stocks based on big data published by internet - Google Patents

Publication number
CN106022522A
Authority
CN
China
Prior art keywords
stock
data
days
day
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610338598.4A
Other languages
Chinese (zh)
Inventor
马健
俞扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201610338598.4A
Publication of CN106022522A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 - Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 - Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a method and system for predicting stocks based on big data published on the Internet. The method comprises the following steps: crawling stock-related information from before the trading day; extracting features from the crawled data, constructing a training dataset, and training a prediction model using Group Lasso, where the model is evaluated by the return obtained over a period of time under a trading scheme in which, at each day's open, the stocks bought on the previous trading day are sold and the stocks recommended for the current trading day are bought; and constructing a new test set from the data crawled on the trading day and predicting with the model trained in the previous step to obtain the final recommended stocks. The invention provides a new, useful and reliable information source for quantitative stock selection or stock prediction; combined with traditional information, this information better reflects the market. On this basis, the stock prediction model obtained with machine learning better captures the internal operating mechanism of the market and can effectively improve investors' returns.

Description

A method and system for predicting stocks based on big data published on the Internet
Technical field
The present invention relates to a big-data stock prediction method, and in particular to a big-data stock prediction method and system based on data published on the Internet, such as retail investors' operations, analyst predictions, investor comments, news, announcements, historical stock prices, fund flows and fundamentals.
Background
Before the 1970s, equity investment was a matter of qualitative analysis that made no use of data; it was a subjective art. As computers became widespread, many people began to study the patterns that drive stock price changes. Traditional fundamental research methods were replaced by models, the concepts of the P/E ratio and HSBC were born, and quantitative investment arose.
The move from subjective judgment to quantitative investment was a shift from art to science. Before the 1970s a fundamental researcher could follow only 20 to 50 stocks, so coverage was very limited. Quantitative models made it possible to cover all stocks, which was a big leap. In addition, as computer processing power grew, the consumption of information also changed by leaps. In the past, looking at three indicators was enough; now more and more indicators are consulted, and predictions become more and more accurate.
With the arrival of the 21st century, quantitative investment ran into a new bottleneck: homogeneous competition. The quantitative models of different institutions have become increasingly similar, so their investment results rise and fall together. "Can patterns be found in larger data before the reported data are seen?" This is the problem that big-data strategy entrepreneurs are trying to solve.
The investment model designed by Robert Shiller, the 2013 Nobel laureate in economics, is still well regarded in the industry. His model mainly considers three variables: the planned cash flow of the investment project, the estimated cost of the company's capital, and the stock market's reaction to the investment (market sentiment). He holds that the market itself carries subjective judgment, that investor sentiment affects investment behavior, and that investment behavior directly affects asset prices. Computers can extract useful information from news, research reports, social information, search behavior and the like using natural language processing, and analyze it with machine learning. In the past, quantitative investment could cover only dozens of strategies, whereas big-data investment can cover thousands.
It follows that traditional stock prediction is based on the historical trend of stock prices, fund flows, and information such as each stock's market value and P/E ratio. Now that the Internet deeply affects many traditional industries, it offers, in contrast to the decades before it was invented or became widespread, a great deal of data about stocks beyond the traditional stock data, including publicly available records of investors' actual operations, analyst predictions, investor comments, news, announcements and so on. To some extent this information reflects the current stock market and also reveals expectations about the future stock market. The present invention attempts to use these new data together with traditional data to build a big-data stock prediction model with technologies such as natural language processing and machine learning.
Summary of the invention:
Object of the invention: In view of the problems in the prior art, the present invention proposes a big-data quantitative stock selection method and system based on the operation behavior of retail investors and analysts published on the Internet, to serve as an investment reference for investors, fund companies and the like.
Technical solution: The present invention proposes a method for predicting stocks based on big data published on the Internet, comprising the following steps:
1) Crawl stock-related information from before the trading day;
The specific crawling method is: first crawl a number of proxy IPs, then use the Scrapy framework to crawl data from the relevant websites, convert the data to JSON format and store it in a MongoDB database;
The information crawled includes, from websites such as Snowball, Gold Compass, Stock, Phoenix Finance and Sina Finance, investors' stock operations, analyst predictions, investor comments, news and announcements concerning each stock, as well as each stock's historical price data, market value, return on net assets, return on assets, earnings-per-share growth rate, current liabilities ratio, enterprise value multiple, year-on-year net profit growth rate, equity concentration, free-float market value, and its price-to-earnings ratio and volatility over the most recent month.
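The crawl-and-store step above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the proxy addresses, the site label `"snowball"`, the stock code, the timestamp and the record fields are all hypothetical, and the Scrapy requests and the MongoDB write are left out so that the sketch runs on its own.

```python
import itertools
import json

# Hypothetical proxy addresses; a real crawler would harvest these first.
PROXIES = ["http://10.0.0.1:8080", "http://10.0.0.2:8080", "http://10.0.0.3:8080"]


def proxy_rotation(proxies):
    """Return an endless round-robin iterator so consecutive requests use different IPs."""
    return itertools.cycle(proxies)


def to_json_document(site, stock_code, kind, payload, crawled_at):
    """Serialize one crawled record to a JSON string ready for a document store."""
    record = {
        "site": site,
        "stock_code": stock_code,
        "kind": kind,  # e.g. "operation", "comment", "news", "announcement"
        "payload": payload,
        "crawled_at": crawled_at,
    }
    # ensure_ascii=False keeps Chinese text readable in the stored document.
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


rotation = proxy_rotation(PROXIES)
assigned = [next(rotation) for _ in range(4)]  # the fourth request reuses the first proxy
doc = to_json_document("snowball", "600000", "comment", {"text": "看涨"}, "2016-05-19T20:00:00")
```

In a Scrapy project the selected proxy would typically be passed through `request.meta["proxy"]`, and the parsed document could be written with PyMongo's `insert_one`; both are omitted here to keep the sketch free of third-party dependencies.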
2) Extract features from the data crawled in step 1, construct a training dataset, and train the prediction model using Group Lasso;
Construction of the training dataset: the dataset consists of the data of the 5 trading days in the week before the current trading day. For each of these 5 trading days, each stock is represented by features and a class label. The features are a vector obtained by processing the relevant information, and the label indicates whether the stock's price rises on the next trading day (1 if it rises, 0 otherwise). This yields the initial training matrix. Because the data contain redundancy, this step first filters out samples with insufficient information; the specific criterion is to drop samples for which, in the crawled data, investors performed fewer than 10 operations on the stock that day.
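As a minimal sketch of this labeling and filtering rule, the snippet below builds training rows from per-stock samples. The field names (`features`, `operation_count`), the stock codes and the numbers are made up for illustration; only the threshold of 10 operations and the 1/0 label come from the text.

```python
MIN_OPERATIONS = 10  # filter threshold stated in the text


def build_training_rows(samples, next_day_change):
    """Return one row per kept sample: the feature values followed by the 0/1 label.

    samples: list of dicts with 'code', 'day', 'features', 'operation_count'.
    next_day_change: dict mapping (code, day) to the price change on the
    following trading day.
    """
    rows = []
    for sample in samples:
        if sample["operation_count"] < MIN_OPERATIONS:
            continue  # too few investor operations: not enough information
        change = next_day_change[(sample["code"], sample["day"])]
        label = 1 if change > 0 else 0
        rows.append(sample["features"] + [label])
    return rows


# Hypothetical samples for one trading day.
samples = [
    {"code": "600000", "day": "2016-05-16", "features": [0.2, 1.0], "operation_count": 25},
    {"code": "600001", "day": "2016-05-16", "features": [0.1, 0.0], "operation_count": 3},
    {"code": "600002", "day": "2016-05-16", "features": [0.7, 2.0], "operation_count": 12},
]
next_day_change = {
    ("600000", "2016-05-16"): 0.35,
    ("600001", "2016-05-16"): 0.10,
    ("600002", "2016-05-16"): -0.20,
}
rows = build_training_rows(samples, next_day_change)
```

The second sample is dropped because it has only 3 operations; the other two keep their features and receive labels 1 and 0 respectively.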
The vector characterizing a stock is extracted as follows. For investor operation data, investors are divided into 10 groups according to their return in the previous month; for each group and for each time window (such as the previous 1, 3, 7, 15 and 30 days), features such as the group's number of buys, number of sells, holdings, change in position and average return for the stock are extracted;
For analyst prediction data, features such as the number of buy predictions and the number of sell predictions that analysts made for the stock in each time window (such as the previous 1, 3, 7, 15 and 30 days) are extracted;
For investor comment data, features such as the number of comments on the stock in each time window (such as the previous 1, 3, 7, 15 and 30 days) and the mean and variance of the sentiment values of the comments are extracted;
For news data, features such as the number of news items about the stock in each time window (such as the previous 1, 3, 7, 15 and 30 days) and the mean and variance of the sentiment values of the news items are extracted;
For announcement data, features such as the number of announcements about the stock in each time window (such as the previous 1, 3, 7, 15 and 30 days) and the total number of occurrences, in each announcement, of words from the announcement keyword library are extracted;
For historical stock price data, features such as the ratios of the stock's opening, closing, highest and lowest prices in each time window (such as the previous 1, 3, 7, 15 and 30 days) to the price 30 days earlier, and the slopes of the 3-day, 7-day, 10-day, 15-day and 30-day lines, are extracted;
For fund flow data, features such as the ratio of the inflow to the outflow of the stock's main-force funds in each time window (such as the previous 1, 3, 7, 15 and 30 days) are extracted;
For other information, features such as the stock's current market value, return on net assets, return on assets, earnings-per-share growth rate, current liabilities ratio, enterprise value multiple, year-on-year net profit growth rate, equity concentration, free-float market value, and its price-to-earnings ratio and volatility over the most recent month are extracted.
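For the historical-price features listed above, one possible reading can be sketched as follows. The text does not define how an "N-day line slope" is computed; the sketch assumes a least-squares slope over the last N closing prices, and the function names and the synthetic price series are illustrative only.

```python
def price_ratio(prices, lookback=30):
    """Latest price divided by the price `lookback` trading days earlier."""
    return prices[-1] / prices[-1 - lookback]


def trend_slope(prices, n):
    """Least-squares slope of the last n prices against the day index 0..n-1."""
    window = prices[-n:]
    x_mean = (n - 1) / 2
    y_mean = sum(window) / n
    numerator = sum((i - x_mean) * (p - y_mean) for i, p in enumerate(window))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


# Synthetic closing prices rising by 0.5 per day over 31 trading days.
closes = [10 + 0.5 * day for day in range(31)]
slopes = {n: trend_slope(closes, n) for n in (3, 7, 10, 15, 30)}
ratio_30 = price_ratio(closes)
```

On this perfectly linear series every slope equals the daily increment, which makes the helper easy to sanity-check before applying it to real prices.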
For text data such as investor comments, news and announcements, the text is first segmented into words using natural language processing techniques based on two dictionaries: a financial sentiment dictionary and an announcement keyword library. The sentiment value of each investor comment, news item and so on is then computed from the financial sentiment words that appear in the text, together with the number of occurrences of the corresponding keywords in each announcement. The financial sentiment dictionary lists stock sentiment keywords and the sentiment score of each keyword; the announcement keyword library lists keywords relevant to announcements. Both dictionaries were obtained by manual crowdsourced annotation.
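The dictionary-based scoring described above might look like the following sketch. The lexicon entries, their scores and the keyword set are invented English stand-ins, word segmentation is assumed to have been done already, and summing the scores of matched words is one simple choice that the text does not prescribe.

```python
from statistics import mean, pvariance

# Hypothetical dictionary entries; the source's dictionaries are crowdsourced.
SENTIMENT_LEXICON = {"surge": 1.0, "profit": 0.5, "loss": -0.5, "plunge": -1.0}
ANNOUNCEMENT_KEYWORDS = {"dividend", "restructuring", "suspension"}


def sentiment_value(tokens):
    """Sum the scores of the sentiment words found in one segmented text."""
    return sum(SENTIMENT_LEXICON.get(token, 0.0) for token in tokens)


def text_features(tokenized_texts):
    """Count, mean and variance of sentiment values over the texts in one window."""
    if not tokenized_texts:
        return {"count": 0, "mean": 0.0, "variance": 0.0}
    values = [sentiment_value(tokens) for tokens in tokenized_texts]
    return {"count": len(values), "mean": mean(values), "variance": pvariance(values)}


def keyword_hits(tokens):
    """Total occurrences of announcement keywords in one segmented announcement."""
    return sum(1 for token in tokens if token in ANNOUNCEMENT_KEYWORDS)


comments = [["profit", "surge"], ["plunge"]]
comment_features = text_features(comments)
hits = keyword_hits(["dividend", "plan", "dividend"])
```

The population variance (`pvariance`) is used so that a window containing a single text still yields a defined value of zero.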
In feature extraction, investors are divided into 10 groups according to their return in the previous month when processing the operation data, so each of these groups corresponds to one group (Group). Features within a group are strongly correlated, while features in different groups are only weakly correlated. During model training we want the features of a group to be considered as a whole. The Group Lasso algorithm from machine learning accounts for this well, so Group Lasso is chosen.
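A small sketch of this grouping and of the per-group window counts follows. Equal-sized rank bins and cumulative windows (an operation 2 days ago counts toward the 3-, 7-, 15- and 30-day windows) are assumptions; the text says only that investors are split into 10 groups by previous-month return. Investor ids and operations are hypothetical.

```python
WINDOWS = (1, 3, 7, 15, 30)  # look-back windows in days, as listed in the text


def assign_groups(returns_by_investor, n_groups=10):
    """Rank investors by last-month return and split the ranking into equal bins.

    Group 0 holds the lowest returns and group n_groups - 1 the highest.
    """
    ranked = sorted(returns_by_investor, key=returns_by_investor.get)
    n = len(ranked)
    return {inv: min(i * n_groups // n, n_groups - 1) for i, inv in enumerate(ranked)}


def window_counts(operations, groups, n_groups=10):
    """Count buys and sells per investor group and per look-back window.

    operations: list of (investor, days_ago, side) with side 'buy' or 'sell'.
    """
    feats = {g: {w: {"buy": 0, "sell": 0} for w in WINDOWS} for g in range(n_groups)}
    for investor, days_ago, side in operations:
        group = groups[investor]
        for window in WINDOWS:
            if 1 <= days_ago <= window:
                feats[group][window][side] += 1
    return feats


# Twenty hypothetical investors with distinct last-month returns.
returns = {f"u{i}": i / 100 for i in range(20)}
groups = assign_groups(returns)
operations = [("u19", 2, "buy"), ("u19", 10, "sell"), ("u0", 1, "buy")]
feats = window_counts(operations, groups)
```

Flattening `feats` group by group gives contiguous index blocks, which is the layout a group-wise penalty expects.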
The Group Lasso algorithm is expressed as follows:
$$\hat{\beta}_{\lambda} = \arg\min_{\beta} \left( \| Y - X\beta \|_2^2 + \lambda \sum_{g=1}^{G} \| \beta_{I_g} \|_2 \right)$$
where $\hat{\beta}_{\lambda}$ is the model training result, $X$ is the training sample matrix, $Y$ is the vector of sample class labels, $I_g$ denotes the indices of the features belonging to the $g$-th group, with $g = 1, \ldots, G$, and $\beta_{I_g}$ denotes the trained weights corresponding to the feature indices of the $g$-th group.
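To make the objective concrete, here is a minimal pure-Python sketch that minimizes it by proximal gradient descent with block soft-thresholding. The solver choice, the step-size bound, the iteration count and the toy data are assumptions; the text does not say how the optimization is carried out, and the groups are assumed to partition all feature indices.

```python
import math


def group_soft_threshold(z, threshold):
    """Shrink a block of weights toward zero; zero it entirely if its norm is small."""
    norm = math.sqrt(sum(v * v for v in z))
    if norm <= threshold:
        return [0.0] * len(z)
    scale = 1.0 - threshold / norm
    return [scale * v for v in z]


def objective(X, y, beta, groups, lam):
    """Squared error plus lam times the sum of Euclidean norms of the group weights."""
    residual = [sum(x * b for x, b in zip(row, beta)) - t for row, t in zip(X, y)]
    penalty = sum(math.sqrt(sum(beta[j] ** 2 for j in idx)) for idx in groups)
    return sum(r * r for r in residual) + lam * penalty


def fit_group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient descent on the Group Lasso objective."""
    n_features = len(X[0])
    beta = [0.0] * n_features
    # 2 * squared Frobenius norm bounds the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * sum(v * v for row in X for v in row))
    for _ in range(n_iter):
        residual = [sum(x * b for x, b in zip(row, beta)) - t for row, t in zip(X, y)]
        grad = [2.0 * sum(X[i][j] * residual[i] for i in range(len(X)))
                for j in range(n_features)]
        z = [b - step * g for b, g in zip(beta, grad)]
        for idx in groups:
            shrunk = group_soft_threshold([z[j] for j in idx], step * lam)
            for j, value in zip(idx, shrunk):
                beta[j] = value
    return beta


# Toy data: identity design, two groups, only the first group carries signal.
X = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
y = [3.0, 4.0, 0.0, 0.0]
groups = [[0, 1], [2, 3]]
beta = fit_group_lasso(X, y, groups, lam=2.0)
beta_all_zero = fit_group_lasso(X, y, groups, lam=100.0)
```

The penalty acts on whole groups: with `lam=2.0` the second group is set exactly to zero while the first is only shrunk, and a large enough `lam` removes every group.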
During model training, cross-validation is used. In each round, the test set is sorted in descending order of predicted probability and the stocks with the highest predicted probability are selected. The model parameters are then tuned according to the total return over two weeks of the following trading scheme: at each day's open, sell the stocks bought on the previous trading day and buy the stocks recommended for the current trading day.
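The return-based evaluation above can be sketched as a small open-to-open backtest. Equal weighting of the selected stocks and the absence of transaction costs are assumptions, and the stock codes, prices and probabilities are invented for illustration.

```python
def top_k(probabilities, k):
    """Return the k stock codes with the highest predicted probability of rising."""
    return sorted(probabilities, key=probabilities.get, reverse=True)[:k]


def strategy_return(open_prices, daily_picks):
    """Total return of buying each day's picks at the open and selling at the next open.

    open_prices: list with one dict per trading day mapping code to opening price.
    daily_picks: list of pick lists for every day except the last.
    """
    wealth = 1.0
    for day, picks in enumerate(daily_picks):
        if not picks:
            continue  # nothing recommended: stay in cash for that day
        growth = sum(open_prices[day + 1][c] / open_prices[day][c] for c in picks)
        wealth *= growth / len(picks)
    return wealth - 1.0


# Three hypothetical trading days and one pick per day.
opens = [{"A": 10.0, "B": 20.0}, {"A": 11.0, "B": 20.0}, {"A": 11.0, "B": 22.0}]
picks = [["A"], ["B"]]
total_return = strategy_return(opens, picks)
best_two = top_k({"A": 0.9, "B": 0.2, "C": 0.7}, 2)
```

Comparing `strategy_return` across candidate parameter settings (for example different values of λ) is one way to carry out the tuning the text describes.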
3) Construct a new test set from the data crawled on the trading day, and predict with the prediction model trained in step 2 to obtain the final recommended stocks.
The present invention also proposes a system for predicting stocks based on big data published on the Internet, comprising a data crawling and storage module, a prediction model training module and a stock prediction module. The data crawling and storage module crawls and stores stock-related information; the prediction model training module constructs a training dataset from the data crawled before the trading day and trains the prediction model using Group Lasso; the stock prediction module constructs a new test set from the data crawled on the trading day and uses the trained prediction model to predict the final recommended stocks.
The system for predicting stocks based on big data published on the Internet further comprises a display module for presenting the stock prediction results to the client.
Beneficial effects: The present invention provides a new, useful and reliable information source for quantitative stock selection or stock prediction. Data such as investors' operations, analyst predictions, news, announcements and research reports are novel data sources compared with traditional data such as historical stock prices and fund flows; to some extent this information reflects the current stock market and also reveals expectations about the future stock market. Because of the large amount of text data, crawling and analyzing these data in real time is harder than crawling and processing traditional stock data. The present invention uses technologies such as a Scrapy-based crawler and natural language processing to crawl and process these types of data in real time, and combining them with traditional data such as historical stock prices and fund flows better reflects the market. Because part of the extracted features is obtained by dividing investors into multiple groups according to their previous-month return, features within a group are strongly correlated while features in different groups are only weakly correlated; during model training it is therefore desirable to consider the features of a group as a whole. The Group Lasso algorithm from machine learning accounts for this well, so the resulting stock prediction model better captures the internal operating mechanism of the market and substantially increases the returns brought to investors.
Brief description of the drawings
Fig. 1 is the overall architecture diagram of the stock prediction system of the present invention;
Fig. 2 is the architecture diagram of the data crawling and storage module of the present invention;
Fig. 3 is the architecture diagram of the prediction model training module of the present invention;
Fig. 4 is the architecture diagram of the stock prediction module of the present invention.
Detailed description of the invention
The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope; after reading the present invention, modifications of various equivalent forms made by those skilled in the art all fall within the scope defined by the claims of this application.
Fig. 1 shows the overall framework of the stock prediction system of the present invention, which comprises four modules: the data crawling and storage module, the stock prediction model training module, the stock prediction module and the display module. The invention is implemented in Python, and the database is MongoDB.
The data crawling and storage module is shown in Fig. 2. The crawler uses the Scrapy framework. Scrapy is a fast, high-level web crawling framework developed in Python, mainly used to visit websites automatically and extract structured data from their pages. Scrapy uses the efficient Twisted asynchronous networking library to handle network communication; the overall Scrapy architecture is shown in Fig. 3.
In the crawler, to get around the anti-crawling measures of websites such as Snowball, a number of proxy IPs are crawled first; the Scrapy framework is then used to crawl data from websites such as Snowball, Gold Compass, Stock, Phoenix Finance, Sina Finance and Juchao Information, and the data are converted to JSON format and stored in the MongoDB database. From Snowball, data such as some investors' operations, investor comments, news and announcements can be crawled; from Gold Compass, data such as analyst predictions; from Stock, data such as investor comments; from Phoenix Finance and Sina Finance, news and stock data such as historical prices, fund flows and fundamentals; and from Juchao Information, data such as announcements.
The stock prediction model training module is shown in Fig. 4. It first constructs the training dataset for machine learning, which consists of the data of the 5 trading days in the week before the current trading day. For each of these 5 trading days, each of the 2780 A-share stocks is represented by features and a class label. The features are represented by a vector of about 700 dimensions, and the label indicates whether the stock's price rises on the next trading day (1 if it rises, 0 otherwise). This gives a matrix of about (5 × 2780) × 701, that is, one row per stock per trading day and about 700 feature columns plus one label column. This is the initial training set.
Table 1: Composition of the roughly 700-dimensional feature vector
Because little data is crawled for some stocks on some days, describing those samples with the original 700-dimensional vector may be distorted, so the stock prediction model training module filters out samples with insufficient information. The specific filter criterion can be adjusted according to the evaluation criterion; at present the invention filters out samples for which, in the crawled data, investors performed fewer than 10 operations on the stock that day. This yields the filtered training set.
The model is then trained with the Group Lasso algorithm from machine learning, with statistics of the same type forming one group. Unlike conventional machine learning problems, the criterion for model quality here is not accuracy, F1 or the like, but the return obtained over the period by recommending 8 stocks per day according to the model and, at each day's open, selling the stocks bought on the previous trading day and buying the stocks recommended for the current trading day. The model parameters are tuned accordingly. The Group Lasso algorithm is expressed as follows:
$$\hat{\beta}_{\lambda} = \arg\min_{\beta} \left( \| Y - X\beta \|_2^2 + \lambda \sum_{g=1}^{G} \| \beta_{I_g} \|_2 \right)$$
where $\hat{\beta}_{\lambda}$ is the model training result, $X$ is the training sample matrix, $Y$ is the vector of sample class labels, $I_g$ denotes the indices of the features belonging to the $g$-th group, with $g = 1, \ldots, G$, and $\beta_{I_g}$ denotes the trained weights corresponding to the feature indices of the $g$-th group.
This yields the big-data stock prediction model. The invention trains that day's model about 10 hours before the market opens on each trading day.
The prediction module of the big-data stock prediction model is shown in Fig. 4. Features are extracted from the data crawled that day to obtain the test dataset, which gives 2780 samples for the 2780 A-share stocks. Samples with little information are then removed using the same filtering method as for the training data, giving the filtered test set. Finally, the trained big-data stock prediction model predicts on the filtered test set, and the 8 stocks with the highest output probability are selected as the recommended stocks for the next trading day.

Claims (9)

1. A method for predicting stocks based on big data published on the Internet, comprising the following steps:
1) crawling stock-related data from before the trading day;
2) extracting features from the data crawled in step 1, constructing a training set, and training a big-data stock prediction model using the Group Lasso algorithm;
3) constructing a new test set from the data crawled on the trading day, and predicting with the prediction model trained in step 2 to obtain the final recommended stocks.
2. The method for predicting stocks based on big data published on the Internet according to claim 1, wherein the stock information in step 1 is obtained as follows: first crawl a number of proxy IPs, then use the Scrapy framework to crawl data from the relevant websites, convert the data to JSON format and store it in a MongoDB database.
3. The method for predicting stocks based on big data published on the Internet according to claim 1, wherein the information crawled in step 1 includes, from websites such as Snowball, Gold Compass, Stock, Phoenix Finance and Sina Finance, investors' stock operations, analyst predictions, investor comments, news, announcements and research reports concerning each stock, as well as each stock's historical price data, market value, return on net assets, return on assets, earnings-per-share growth rate, current liabilities ratio, enterprise value multiple, year-on-year net profit growth rate, equity concentration, free-float market value, and its price-to-earnings ratio and volatility over the most recent month.
4. The method for predicting stocks based on big data published on the Internet according to claim 1, wherein step 2 filters out samples with insufficient information, the specific criterion being to drop samples for which, in the crawled data, investors performed fewer than 10 operations on the stock that day.
5. The method for predicting stocks based on big data published on the Internet according to claim 1, wherein the training dataset constructed in step 2 consists of the data of the 5 trading days in the week before the current trading day; for each of these 5 trading days, each stock is represented by features and a class label, the features being a vector obtained by processing the relevant information and the label indicating whether the stock's price rises on the next trading day (1 if it rises, 0 otherwise), which yields the initial training matrix.
6. The method for predicting stocks based on big data published on the Internet according to claim 5, wherein the vector characterizing a stock is extracted as follows:
for investor operation data, investors are divided into 10 groups according to their return in the previous month, and for each group and each time window (such as the previous 1, 3, 7, 15 and 30 days), features such as the group's number of buys, number of sells, holdings, change in position and average return for the stock are extracted;
for analyst prediction data, features such as the number of buy predictions and the number of sell predictions that analysts made for the stock in each time window (such as the previous 1, 3, 7, 15 and 30 days) are extracted;
for investor comment data, features such as the number of comments on the stock in each time window (such as the previous 1, 3, 7, 15 and 30 days) and the mean and variance of the sentiment values of the comments are extracted;
for news data, features such as the number of news items about the stock in each time window (such as the previous 1, 3, 7, 15 and 30 days) and the mean and variance of the sentiment values of the news items are extracted;
for announcement data, features such as the number of announcements about the stock in each time window (such as the previous 1, 3, 7, 15 and 30 days) and the total number of occurrences, in each announcement, of words from the announcement keyword library are extracted;
for historical stock price data, features such as the ratios of the stock's opening, closing, highest and lowest prices in each time window (such as the previous 1, 3, 7, 15 and 30 days) to the price 30 days earlier, and the slopes of the 3-day, 7-day, 10-day, 15-day and 30-day lines, are extracted;
for fund flow data, features such as the ratio of the inflow to the outflow of the stock's main-force funds in each time window (such as the previous 1, 3, 7, 15 and 30 days) are extracted;
for other information, features such as the stock's current market value, return on net assets, return on assets, earnings-per-share growth rate, current liabilities ratio, enterprise value multiple, year-on-year net profit growth rate, equity concentration, free-float market value, and its price-to-earnings ratio and volatility over the most recent month are extracted;
for text data such as investor comments, news and announcements, the text is first segmented into words using natural language processing techniques based on two dictionaries, a financial sentiment dictionary and an announcement keyword library; the sentiment value of each investor comment, news item and so on is then computed from the financial sentiment words that appear in the text, together with the number of occurrences of the corresponding keywords in each announcement; the financial sentiment dictionary lists stock sentiment keywords and the sentiment score of each keyword, the announcement keyword library lists keywords relevant to announcements, and both dictionaries are obtained by manual crowdsourced annotation.
7. The method for predicting stocks based on big data published on the Internet according to claim 6, wherein, because investors are divided into 10 groups according to their return in the previous month when the operation data are processed in feature extraction, each of these groups corresponds to one group; features within a group are strongly correlated, while features in different groups are only weakly correlated; so that the features of a group are considered as a whole, the prediction model is trained with the Group Lasso algorithm from machine learning, which is expressed as follows:
$$\hat{\beta}_{\lambda} = \arg\min_{\beta} \left( \| Y - X\beta \|_2^2 + \lambda \sum_{g=1}^{G} \| \beta_{I_g} \|_2 \right)$$
where $\hat{\beta}_{\lambda}$ is the model training result, $X$ is the training sample matrix, $Y$ is the vector of sample class labels, $I_g$ denotes the indices of the features belonging to the $g$-th group, with $g = 1, \ldots, G$, and $\beta_{I_g}$ denotes the trained weights corresponding to the feature indices of the $g$-th group;
during model training, cross-validation is used; in each round, the test set is sorted in descending order of predicted probability and the stocks with the highest predicted probability are selected, and the model parameters are tuned according to the total return over two weeks of a trading scheme in which, at each day's open, the stocks bought on the previous trading day are sold and the stocks recommended for the current trading day are bought.
8. A system for predicting stocks based on big data published by the Internet, comprising a data crawling and storage module, a prediction model training module, and a stock prediction module; wherein the data crawling and storage module is configured to crawl and store stock-related information; the prediction model training module constructs a training data set from the data crawled before the trading day and trains the prediction model using Group Lasso; and the stock prediction module constructs a new test set from the data crawled on the trading day and uses the trained prediction model to predict the stocks that are finally recommended.
9. The system for predicting stocks based on big data published by the Internet according to claim 8, further comprising a display module configured to present the stock prediction results to the client.
CN201610338598.4A 2016-05-20 2016-05-20 Method and system for predicting stocks based on big data published by internet Pending CN106022522A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610338598.4A CN106022522A (en) 2016-05-20 2016-05-20 Method and system for predicting stocks based on big data published by internet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610338598.4A CN106022522A (en) 2016-05-20 2016-05-20 Method and system for predicting stocks based on big data published by internet

Publications (1)

Publication Number Publication Date
CN106022522A true CN106022522A (en) 2016-10-12

Family

ID=57096593

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610338598.4A Pending CN106022522A (en) 2016-05-20 2016-05-20 Method and system for predicting stocks based on big data published by internet

Country Status (1)

Country Link
CN (1) CN106022522A (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897932A (en) * 2017-01-19 2017-06-27 沃民高新科技(北京)股份有限公司 Data method of replacing and device
CN107689000A (en) * 2017-08-16 2018-02-13 北京国新汇金股份有限公司 A kind of financial information management system
CN108446984A (en) * 2018-03-20 2018-08-24 张家林 A kind of investment data management method and device
CN108596765A (en) * 2018-04-28 2018-09-28 国信优易数据有限公司 A kind of Electronic Finance resource recommendation method and device
CN108629690A (en) * 2018-04-28 2018-10-09 福州大学 Futures based on deeply study quantify transaction system
CN108647823A (en) * 2018-05-10 2018-10-12 北京航空航天大学 Stock certificate data analysis method based on deep learning and device
CN108830722A (en) * 2018-06-27 2018-11-16 东莞市波动赢机器人科技有限公司 Based on transaction machine people recommended method, electronic equipment and the storage medium to liquidate
CN108921444A (en) * 2018-07-12 2018-11-30 李俊山 Based on block chain technology distribution formula number exchange stock index sample acquisition system
CN109087205A (en) * 2018-08-10 2018-12-25 北京字节跳动网络技术有限公司 Prediction technique and device, the computer equipment and readable storage medium storing program for executing of public opinion index
CN109102319A (en) * 2018-06-27 2018-12-28 众安信息技术服务有限公司 Plate index preparation method, device and the server of block chain cryptographic assets
CN109146166A (en) * 2018-08-09 2019-01-04 南京安链数据科技有限公司 A kind of personal share based on the marking of investor's content of the discussions slumps prediction model
CN109255714A (en) * 2018-08-27 2019-01-22 深圳市利讯互联网金融服务有限公司 Machine learning fund optimum decision system and its preferred method
CN109272405A (en) * 2018-09-30 2019-01-25 大唐碳资产有限公司 Carbon transaction in assets method and system
WO2019019346A1 (en) * 2017-07-25 2019-01-31 上海壹账通金融科技有限公司 Asset allocation strategy acquisition method and apparatus, computer device, and storage medium
WO2019041520A1 (en) * 2017-08-31 2019-03-07 平安科技(深圳)有限公司 Social data-based method of recommending financial product, electronic device and medium
CN109739895A (en) * 2018-12-07 2019-05-10 中国联合网络通信集团有限公司 A kind of virtual article trading prediction technique and device
CN110163758A (en) * 2019-06-03 2019-08-23 成都慧财智科技有限公司 Artificial intelligence Stock investment analysis system
WO2019192135A1 (en) * 2018-04-03 2019-10-10 平安科技(深圳)有限公司 Electronic device, bond yield analysis method, system, and storage medium
WO2019205378A1 (en) * 2018-04-26 2019-10-31 平安科技(深圳)有限公司 Method and apparatus for selecting investment stocks based on public sentiment factor, and storage medium
CN110400225A (en) * 2019-07-29 2019-11-01 北京北信源软件股份有限公司 A kind of market value of stock management method
WO2019218517A1 (en) * 2018-05-16 2019-11-21 平安科技(深圳)有限公司 Server, method for processing text data and storage medium
WO2019242143A1 (en) * 2018-06-21 2019-12-26 平安科技(深圳)有限公司 Stocks selling early-warning method and apparatus, and computer-readable storage medium
CN110809778A (en) * 2018-03-30 2020-02-18 加藤宽之 Stock price prediction support system and method
TWI692735B (en) * 2018-10-12 2020-05-01 台北富邦商業銀行股份有限公司 Exposure management system of corporate finance

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897932A (en) * 2017-01-19 2017-06-27 沃民高新科技(北京)股份有限公司 Data method of replacing and device
WO2019019346A1 (en) * 2017-07-25 2019-01-31 上海壹账通金融科技有限公司 Asset allocation strategy acquisition method and apparatus, computer device, and storage medium
CN107689000A (en) * 2017-08-16 2018-02-13 北京国新汇金股份有限公司 A kind of financial information management system
WO2019041520A1 (en) * 2017-08-31 2019-03-07 平安科技(深圳)有限公司 Social data-based method of recommending financial product, electronic device and medium
CN108446984A (en) * 2018-03-20 2018-08-24 张家林 A kind of investment data management method and device
CN110809778A (en) * 2018-03-30 2020-02-18 加藤宽之 Stock price prediction support system and method
WO2019192135A1 (en) * 2018-04-03 2019-10-10 平安科技(深圳)有限公司 Electronic device, bond yield analysis method, system, and storage medium
WO2019205378A1 (en) * 2018-04-26 2019-10-31 平安科技(深圳)有限公司 Method and apparatus for selecting investment stocks based on public sentiment factor, and storage medium
CN108596765A (en) * 2018-04-28 2018-09-28 国信优易数据有限公司 A kind of Electronic Finance resource recommendation method and device
CN108629690A (en) * 2018-04-28 2018-10-09 福州大学 Futures based on deeply study quantify transaction system
CN108647823A (en) * 2018-05-10 2018-10-12 北京航空航天大学 Stock certificate data analysis method based on deep learning and device
WO2019218517A1 (en) * 2018-05-16 2019-11-21 平安科技(深圳)有限公司 Server, method for processing text data and storage medium
WO2019242143A1 (en) * 2018-06-21 2019-12-26 平安科技(深圳)有限公司 Stocks selling early-warning method and apparatus, and computer-readable storage medium
CN108830722A (en) * 2018-06-27 2018-11-16 东莞市波动赢机器人科技有限公司 Based on transaction machine people recommended method, electronic equipment and the storage medium to liquidate
CN109102319A (en) * 2018-06-27 2018-12-28 众安信息技术服务有限公司 Plate index preparation method, device and the server of block chain cryptographic assets
CN108921444A (en) * 2018-07-12 2018-11-30 李俊山 Based on block chain technology distribution formula number exchange stock index sample acquisition system
CN109146166A (en) * 2018-08-09 2019-01-04 南京安链数据科技有限公司 A kind of personal share based on the marking of investor's content of the discussions slumps prediction model
CN109087205A (en) * 2018-08-10 2018-12-25 北京字节跳动网络技术有限公司 Prediction technique and device, the computer equipment and readable storage medium storing program for executing of public opinion index
CN109087205B (en) * 2018-08-10 2020-09-18 北京字节跳动网络技术有限公司 Public opinion index prediction method and device, computer equipment and readable storage medium
CN109255714A (en) * 2018-08-27 2019-01-22 深圳市利讯互联网金融服务有限公司 Machine learning fund optimum decision system and its preferred method
CN109272405A (en) * 2018-09-30 2019-01-25 大唐碳资产有限公司 Carbon transaction in assets method and system
TWI692735B (en) * 2018-10-12 2020-05-01 台北富邦商業銀行股份有限公司 Exposure management system of corporate finance
CN109739895A (en) * 2018-12-07 2019-05-10 中国联合网络通信集团有限公司 A kind of virtual article trading prediction technique and device
CN110163758A (en) * 2019-06-03 2019-08-23 成都慧财智科技有限公司 Artificial intelligence Stock investment analysis system
CN110400225A (en) * 2019-07-29 2019-11-01 北京北信源软件股份有限公司 A kind of market value of stock management method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161012
