CN105956770A - Stock market risk prediction platform and text excavation method thereof - Google Patents

Publication number
CN105956770A
Authority
CN
China
Prior art keywords
data
text
stock market
module
risk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610283046.8A
Other languages
Chinese (zh)
Inventor
吴德胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Chinese Academy of Sciences
Original Assignee
University of Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Chinese Academy of Sciences
Priority to CN201610283046.8A
Publication of CN105956770A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Educational Administration (AREA)
  • Technology Law (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a stock market risk prediction platform and a text mining method thereof. The platform comprises a data collection module, a data preprocessing module, a text mining module, a stock market prediction module, a risk evaluation module, and a result output module. The text mining method converts unstructured text data into structured data in order to analyze the viewpoints, attitudes, or sentiment contained in the text. The design is reasonable: the stock market risk level is evaluated from the results of the data analysis, and the resulting risk level helps investors make decisions while also giving the government a basis for formulating related policies and enterprises a basis for implementing corresponding strategies.

Description

Stock market risk prediction platform and text mining method thereof
Technical field
The invention belongs to the field of stock market prediction and risk identification, and specifically relates to a stock market risk prediction platform and its text mining method.
Background art
The stock market is a barometer of a country's economic and financial activity, and an important means of corporate financing and investor asset allocation. Predictive research on the stock market can not only give governments, enterprises, and investors a basis for making decisions, but can also help avoid financial risk and promote the stable, healthy development of the stock market.
Existing stock market prediction methods include securities analysis, mathematical statistical models, nonlinear dynamics, neural networks, and support vector machines. These methods all assume that investors are rational and trade according to the principle of utility maximization. Stock market activity has become more complex and changeable, and with the continuing discovery of financial anomalies such as herd behavior, overreaction, and underreaction, the shortcomings of traditional prediction methods have become increasingly apparent.
In addition, with the development of information technology, the Internet holds a vast amount of information: not only stock market trading information, but also content that strongly affects the stock market, such as macroeconomic news and related policies. It has become an irreplaceable channel through which investors obtain information. Moreover, with the emergence of self-media and discussion platforms such as forums and microblogs, investors publish their own views online on market trends, macroeconomic policy, and investment intentions and exchange information with one another, so the Internet has become an important carrier for mining investor sentiment.
Existing stock market prediction platforms are mostly built on traditional stock market prediction methods. Their shortcomings fall mainly into the following three aspects:
First, they ignore the influence of investor sentiment and behavior on the stock market, so their predictions cannot reflect real market dynamics.
Second, they focus on information such as stock market trading data and neglect data such as Internet news and forum posts.
Third, they lack a risk evaluation module. The purpose of stock market prediction is not only to guide investor decisions and obtain investment returns, but more importantly to identify financial market risk, prevent systemic risk, and maintain the stability of the financial market and national financial security.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the above shortcomings by providing a stock market risk prediction platform and a text mining method thereof. The design is reasonable: unstructured text data is converted into structured data in order to analyze the viewpoints, attitudes, or sentiment contained in documents, and the stock market risk level is evaluated from the results of the data analysis. The risk level can not only serve investor decisions but also provide a basis for the government to formulate related policies and for enterprises to implement corresponding strategies.
To solve the above problems, the technical solution adopted by the present invention is:
A stock market risk prediction platform, characterized in that it comprises:
a data collection module, for automatically collecting stock market trading data and multi-source Internet text data;
a data preprocessing module, for preprocessing the data obtained by the data collection module, including data cleaning, data integration, data transformation, and data reduction, to prepare the data for building the stock market prediction model;
a text mining module, for analyzing and processing Internet text data to mine investor sentiment and construct a sentiment index, comprising six steps: text segmentation, part-of-speech tagging, sentiment polarity tagging, sentiment index calculation, sentiment index adjustment, and sentiment index integration;
a stock market prediction module, which combines text mining, machine learning, and mathematical statistics methods to predict and analyze the stock market;
a risk evaluation module, which assigns risk levels in real time to the monitored stocks and to the overall market trend according to the results of the stock market prediction module;
a result output module, for outputting the risk level of the stocks an investor follows, while also outputting the risk level of the whole market and providing real-time early warning.
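The six modules listed above form a linear data flow. The following Python sketch shows one way such a flow could be wired together. It is only an illustration: the function names, the dictionary keys, and the hard-coded inputs are hypothetical placeholders and are not taken from the patent, which does not disclose an implementation.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, object]
Stage = Tuple[str, Callable[[Context], Context]]


def collect(ctx: Context) -> Context:
    # Placeholder for the crawler: hypothetical hard-coded inputs.
    return {**ctx, "texts": ["  policy support lifts market  ", ""],
            "closes": [100.0, 101.5]}


def preprocess(ctx: Context) -> Context:
    # Minimal cleaning: trim whitespace and drop empty documents.
    cleaned = [t.strip() for t in ctx["texts"] if t.strip()]
    return {**ctx, "texts": cleaned}


def mine_text(ctx: Context) -> Context:
    # Placeholder: a real module would run the six text mining steps.
    return {**ctx, "sentiment": [0.0 for _ in ctx["texts"]]}


def predict(ctx: Context) -> Context:
    # Placeholder forecast: carry the last closing value forward.
    return {**ctx, "forecast": ctx["closes"][-1]}


def evaluate_risk(ctx: Context) -> Context:
    # Placeholder: always report the middle of the five risk levels.
    return {**ctx, "risk_level": 3}


def output(ctx: Context) -> Context:
    return {**ctx, "report": f"risk level {ctx['risk_level']}"}


PIPELINE: List[Stage] = [
    ("data collection", collect),
    ("data preprocessing", preprocess),
    ("text mining", mine_text),
    ("stock market prediction", predict),
    ("risk evaluation", evaluate_risk),
    ("result output", output),
]


def run_pipeline(stages: List[Stage], ctx: Context = None) -> Context:
    """Run each stage in order and record the order in ctx['trace']."""
    ctx = dict(ctx or {})
    for name, stage in stages:
        ctx = stage(ctx)
        ctx["trace"] = [*ctx.get("trace", []), name]
    return ctx
```

Calling `run_pipeline(PIPELINE)` returns the final context, whose `trace` entry lists the six stages in the order the platform describes them.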
The present invention also provides a text mining method for the stock market risk prediction platform, which converts unstructured text data into structured data in order to analyze the viewpoints, attitudes, or sentiment contained in documents;
The Internet text database used by the text mining method covers three kinds of data: policy information, financial news, and forum data. Policy information reveals the attitude and inclination of the government, financial news provides comprehensive socioeconomic information, and forum data allows investor sentiment to be extracted most directly;
The text mining module of the stock market risk prediction platform applies the text mining method to analyze and process text data from the Internet, thereby extracting investors' viewpoints, attitudes, and sentiment, and the calculated sentiment index is then used as an input variable of the stock market prediction module.
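As an illustration of how a sentiment index could enter the prediction module as an input variable, the sketch below aligns several daily series and attaches their lagged values to each row, which is the kind of input a lag-based model consumes. The function name `lagged_rows`, the series names, and the sample numbers are hypothetical and are not part of the patent.

```python
from typing import Dict, List, Sequence


def lagged_rows(series: Dict[str, Sequence[float]], lags: int) -> List[Dict[str, float]]:
    """Build one row per day holding each series' value and its lagged values.

    All series must cover the same days in the same order. The first `lags`
    days are dropped because their lagged values are not available.
    """
    lengths = {len(values) for values in series.values()}
    if len(lengths) != 1:
        raise ValueError("all series must have the same length")
    n = lengths.pop()
    if lags < 1 or lags >= n:
        raise ValueError("lags must be at least 1 and smaller than the series length")

    rows: List[Dict[str, float]] = []
    for t in range(lags, n):
        row: Dict[str, float] = {}
        for name, values in series.items():
            row[name] = float(values[t])
            for k in range(1, lags + 1):
                row[f"{name}_lag{k}"] = float(values[t - k])
        rows.append(row)
    return rows


# Hypothetical three-day example: a sentiment index and a market index.
example = lagged_rows({"sentiment": [0.1, -0.2, 0.3], "index": [10, 11, 12]}, lags=1)
```

Each resulting row can then be passed to whatever statistical or machine learning model the prediction module uses.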
By adopting the above technical solution, the present invention has the following advantages over the prior art. The design is reasonable: unstructured text data is converted into structured data in order to analyze the viewpoints, attitudes, or sentiment contained in documents, and the stock market risk level is evaluated from the results of the data analysis. The risk level can not only serve investor decisions but also provide a basis for the government to formulate related policies and for enterprises to implement corresponding strategies.
The invention is further described below with reference to the drawings and a specific embodiment.
Brief description of the drawings
Fig. 1 is a structural block diagram of the stock market risk prediction platform in an embodiment of the present invention;
Fig. 2 is a structural block diagram of the modules of the stock market risk prediction platform in an embodiment of the present invention;
Fig. 3 is a flow chart of the text mining method in an embodiment of the present invention.
Detailed description of the embodiments
Embodiment:
A stock market risk prediction platform, as shown in Fig. 1 and Fig. 2, comprises:
Data collection module: the platform's built-in crawler automatically obtains text data from the China Securities Regulatory Commission (CSRC), the China Banking Regulatory Commission (CBRC), the central bank, Xinwen Lianbo, Homeway.com, Eastmoney, the Sina finance forum, the NetEase finance forum, and the Tencent finance forum, together with stock market trading data.
Data preprocessing module: performs denoising on the collected text data, including data cleaning, data integration, data transformation, and data reduction, to meet the needs of modeling.
Text mining module: following the text mining steps described above, obtains the daily policy sentiment index, the daily financial news sentiment index, the daily forum sentiment index, and the daily composite sentiment index.
Stock market prediction module: builds a vector autoregression (VAR) model from the daily composite sentiment index and its lagged terms, the Shanghai Composite Index and its lagged terms, trading volume, and volatility, and predicts the trend of the Shanghai Composite Index;
Risk evaluation module: the system divides risk into five levels and indicates the overall risk of the stock market. Level 1 is very low risk, Level 2 is relatively low risk, Level 3 is medium risk, Level 4 is medium-high risk, and Level 5 is high risk.
Result output module: outputs the overall risk level of the stock market and issues a risk warning. Level 5 (high risk) suits aggressive investors, Level 4 (medium-high risk) suits active investors, Level 3 (medium risk) suits balanced investors, Level 2 (relatively low risk) suits steady investors, and Level 1 (very low risk) suits conservative investors. The stock market risk level can not only serve investor decisions but also provide a basis for the government to formulate related policies and for enterprises to implement corresponding strategies.
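The five risk levels and the investor types they suit can be captured in a small lookup table, sketched below in Python. The level names and investor types follow the embodiment. The numeric risk score and the thresholds in `risk_level_from_score` are assumptions made purely for illustration, since the patent does not state how a prediction result is converted into a level.

```python
from bisect import bisect_right
from typing import Sequence

# Level -> (risk description, investor type it suits), as in the embodiment.
RISK_LEVELS = {
    1: ("very low risk", "conservative"),
    2: ("relatively low risk", "steady"),
    3: ("medium risk", "balanced"),
    4: ("medium-high risk", "active"),
    5: ("high risk", "aggressive"),
}


def risk_level_from_score(score: float,
                          thresholds: Sequence[float] = (0.2, 0.4, 0.6, 0.8)) -> int:
    """Map a risk score in [0, 1] to a level from 1 to 5.

    ASSUMPTION: both the score and the evenly spaced thresholds are
    hypothetical; the patent only defines the five levels themselves.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    return bisect_right(thresholds, score) + 1


def describe(level: int) -> str:
    """Return a warning message for the given risk level."""
    risk, investor = RISK_LEVELS[level]
    return f"Level {level}: {risk}, suitable for {investor} investors"
```

For example, `describe(risk_level_from_score(0.95))` yields the Level 5 message for aggressive investors.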
The above embodiment of the present invention provides a text mining method, as shown in Fig. 3:
The data sources comprise three parts: policy information, financial news, and forum data. The sources of policy information are the CSRC, the CBRC, the central bank, and Xinwen Lianbo; the sources of financial news are Homeway.com and Eastmoney; and the sources of forum data are the Sina finance forum, the NetEase finance forum, and the Tencent finance forum. Text from these sources is analyzed and processed to mine market sentiment and investor sentiment;
1) Text segmentation: a word segmentation system is applied to split the text data into words;
2) Part-of-speech tagging: after stop words, modal particles, and the like are removed, the remaining words are tagged with their parts of speech;
3) Sentiment polarity tagging: each word is tagged with a sentiment polarity (positive, negative, or neutral), and the numbers of positive and negative words are then counted separately;
4) Sentiment index calculation: according to sentiment formula (1), the sentiment index of each news item or forum comment can be obtained, and from these the sentiment index of each day, where Sdx denotes the sentiment index, Nn the number of negative words, and Np the number of positive words. A sentiment index greater than 0 indicates pessimistic investor sentiment, and a sentiment index less than 0 indicates optimistic investor sentiment;
5) Sentiment index adjustment: step 104 shows that news from government websites is a special case. Policy information remains influential for a certain period of time and is highly sparse: the absence of policy information on a given day does not mean that the government has expressed no sentiment, and the appearance of policy information represents the attitude of the relevant regulators toward the stock market over a period of time. A time-decay factor is therefore introduced to adjust the policy index. The adjusted policy index is computed as shown in formula (2), which involves the i-th (i = 0, 1, 2) lagged terms of the original policy index and a monotonically decreasing time-decay function whose formula is shown in (3);
6) Sentiment index integration: combining the sentiment indices from steps 104 and 105 yields the daily policy sentiment index, the daily financial news sentiment index, the daily forum sentiment index, and the daily composite sentiment index.
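Steps 3) to 6) can be sketched in Python as follows. Formulas (1) to (3) are not reproduced in this text, so every formula in the sketch is an assumption: the sentiment index is taken to be the normalized difference (Nn - Np) / (Nn + Np), chosen only because it matches the stated sign convention (greater than 0 is pessimistic); the daily index is taken to be the mean over that day's documents; the time-decay function is taken to be exponential; and the composite index is taken to be an equally weighted average. The word lists are hypothetical English placeholders.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical lexicons; a real system would use a sentiment dictionary.
POSITIVE = {"rally", "gain", "support"}
NEGATIVE = {"crash", "loss", "panic"}
STOP_WORDS = {"the", "a", "on", "of"}


def count_polarity(tokens: Sequence[str]) -> Tuple[int, int]:
    """Step 3: count positive and negative words after stop-word removal."""
    kept = [t for t in tokens if t not in STOP_WORDS]
    n_pos = sum(t in POSITIVE for t in kept)
    n_neg = sum(t in NEGATIVE for t in kept)
    return n_pos, n_neg


def sentiment_index(n_pos: int, n_neg: int) -> float:
    """Step 4 for one document. ASSUMED stand-in for formula (1)."""
    total = n_pos + n_neg
    return 0.0 if total == 0 else (n_neg - n_pos) / total


def daily_index(docs_by_day: Dict[str, List[List[str]]]) -> Dict[str, float]:
    """Step 4 per day. ASSUMPTION: the daily value is the mean over documents."""
    result = {}
    for day, docs in docs_by_day.items():
        scores = [sentiment_index(*count_polarity(doc)) for doc in docs]
        result[day] = sum(scores) / len(scores) if scores else 0.0
    return result


def decay_weight(i: int, rate: float = 0.5) -> float:
    """ASSUMED stand-in for formula (3): monotonically decreasing in the lag i."""
    return math.exp(-rate * i)


def adjust_policy_index(raw: Sequence[float], max_lag: int = 2,
                        rate: float = 0.5) -> List[float]:
    """Step 5. ASSUMED stand-in for formula (2): decay-weighted sum of lags 0..max_lag."""
    return [
        sum(decay_weight(i, rate) * raw[t - i] for i in range(max_lag + 1) if t - i >= 0)
        for t in range(len(raw))
    ]


def composite_index(policy: Sequence[float], news: Sequence[float],
                    forum: Sequence[float],
                    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> List[float]:
    """Step 6. ASSUMPTION: weighted average of the three daily indices."""
    wp, wn, wf = weights
    return [wp * p + wn * n + wf * f for p, n, f in zip(policy, news, forum)]
```

With these assumptions, a policy announcement on one day still contributes, with shrinking weight, to the adjusted policy index of the next two days, which is the behavior that step 5) motivates for sparse policy information.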
The present invention is not limited to the preferred embodiment described above. Any structural variation made by anyone in light of the present invention, and any technical solution identical or similar to that of the present invention, falls within the protection scope of the present invention.

Claims (2)

1. A stock market risk prediction platform, characterized in that it comprises:
a data collection module, for automatically collecting stock market trading data and multi-source Internet text data;
a data preprocessing module, for preprocessing the data obtained by the data collection module, including data cleaning, data integration, data transformation, and data reduction, to prepare the data for building the stock market prediction model;
a text mining module, for analyzing and processing Internet text data to mine investor sentiment and construct a sentiment index, comprising six steps: text segmentation, part-of-speech tagging, sentiment polarity tagging, sentiment index calculation, sentiment index adjustment, and sentiment index integration;
a stock market prediction module, which combines text mining, machine learning, and mathematical statistics methods to predict and analyze the stock market;
a risk evaluation module, which assigns risk levels in real time to the monitored stocks and to the overall market trend according to the results of the stock market prediction module;
a result output module, for outputting the risk level of the stocks an investor follows, while also outputting the risk level of the whole market and providing real-time early warning.
2. The text mining method of the stock market risk prediction platform according to claim 1, characterized in that:
the text mining method converts unstructured text data into structured data in order to analyze the viewpoints, attitudes, or sentiment contained in documents;
the Internet text database used by the text mining method covers three kinds of data: policy information, financial news, and forum data, wherein policy information reveals the attitude and inclination of the government, financial news provides comprehensive socioeconomic information, and forum data allows investor sentiment to be extracted most directly;
the text mining module of the stock market risk prediction platform applies the text mining method to analyze and process text data from the Internet, thereby extracting investors' viewpoints, attitudes, and sentiment, and the calculated sentiment index is then used as an input variable of the stock market prediction module.
CN201610283046.8A 2016-05-03 2016-05-03 Stock market risk prediction platform and text excavation method thereof Pending CN105956770A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610283046.8A CN105956770A (en) 2016-05-03 2016-05-03 Stock market risk prediction platform and text excavation method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610283046.8A CN105956770A (en) 2016-05-03 2016-05-03 Stock market risk prediction platform and text excavation method thereof

Publications (1)

Publication Number Publication Date
CN105956770A true CN105956770A (en) 2016-09-21

Family

ID=56914866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610283046.8A Pending CN105956770A (en) 2016-05-03 2016-05-03 Stock market risk prediction platform and text excavation method thereof

Country Status (1)

Country Link
CN (1) CN105956770A (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106780036A (en) * 2016-11-16 2017-05-31 硕橙(厦门)科技有限公司 A kind of moos index construction method based on internet data collection
CN106886834A (en) * 2017-01-19 2017-06-23 沃民高新科技(北京)股份有限公司 The modeling method and model building device of data
CN106951563A (en) * 2017-04-01 2017-07-14 上海诺悦智能科技有限公司 A kind of tourist hot spot event influence degree forecasting system
CN107292743A (en) * 2017-06-07 2017-10-24 前海梧桐(深圳)数据有限公司 The intelligent decision making method and its system invested and financed for enterprise
CN107480858A (en) * 2017-07-10 2017-12-15 武汉楚鼎信息技术有限公司 A kind of Aided intelligent decision-making and method based on the analysis of stock big data
CN107798607A (en) * 2017-07-25 2018-03-13 上海壹账通金融科技有限公司 Asset Allocation strategy acquisition methods, device, computer equipment and storage medium
CN107464068A (en) * 2017-09-18 2017-12-12 前海梧桐(深圳)数据有限公司 Enterprise development trend forecasting method and its system based on neutral net
CN107885849A (en) * 2017-11-13 2018-04-06 成都蓝景信息技术有限公司 A kind of moos index analysis system based on text classification
US20210065296A1 (en) * 2018-03-26 2021-03-04 Ziggurat Technologies, Inc. Intelligent trading and risk management framework
CN108876604A (en) * 2018-05-25 2018-11-23 平安科技(深圳)有限公司 Stock market's Risk Forecast Method, device, computer equipment and storage medium
CN109300042A (en) * 2018-09-11 2019-02-01 广州财略金融信息科技有限公司 A kind of air control system based on big data
CN109360107A (en) * 2018-10-16 2019-02-19 成都四方伟业软件股份有限公司 A kind of method of stock analysis, device and its storage medium
CN110189170A (en) * 2019-05-27 2019-08-30 中译语通科技股份有限公司 Market sentiment analysis method and system
CN110163758A (en) * 2019-06-03 2019-08-23 成都慧财智科技有限公司 Artificial intelligence Stock investment analysis system
CN110489631A (en) * 2019-07-10 2019-11-22 平安科技(深圳)有限公司 Stock market development method, apparatus, computer equipment and storage medium
CN111105154A (en) * 2019-12-17 2020-05-05 中科鼎富(北京)科技发展有限公司 Stock market operation risk assessment method and device, electronic equipment and storage medium
CN112487808A (en) * 2020-12-18 2021-03-12 未鲲(上海)科技服务有限公司 Big data based news message pushing method, device, equipment and storage medium
CN114119233A (en) * 2021-12-01 2022-03-01 北京航空航天大学 Method for constructing emotion index of investor of stock fund, method, device and equipment for predicting accumulated net income rate
CN116611696A (en) * 2023-07-19 2023-08-18 北京大学 Digital asset market risk prediction system based on time sequence analysis
CN116611696B (en) * 2023-07-19 2024-01-26 北京大学 Digital asset market risk prediction system based on time sequence analysis

Similar Documents

Publication Publication Date Title
CN105956770A (en) Stock market risk prediction platform and text excavation method thereof
Li et al. DP-LSTM: Differential privacy-inspired LSTM for stock prediction using financial news
CN110796470A (en) Market subject supervision and service oriented data analysis system
Gao The use of machine learning combined with data mining technology in financial risk prevention
Li et al. Improved Bayesian network-based risk model and its application in disaster risk assessment
ADRIAN et al. BIG DATA ANALYTICS IMPLEMENTATION FOR VALUE DISCOVERY: A SYSTEMATIC LITERATURE REVIEW.
Weng et al. Stock price prediction based on LSTM and BERT
Deng et al. Stock index direction forecasting using an explainable eXtreme Gradient Boosting and investor sentiments
Wang A stock price prediction method based on BiLSTM and improved transformer
Wei et al. Energy financial risk early warning model based on Bayesian network
Liu et al. A new method to analyze the driving mechanism of flood disaster resilience and its management decision-making
Liu Research on risk management of big data and machine learning insurance based on internet finance
Wang et al. Innovative risk early warning model based on internet of things under big data technology
Ma The Research of Stock Predictive Model based on the Combination of CART and DBSCAN
Zhang et al. Analysis of the trend of global power sources based on comment emotion mining
Veluvolu The Establishment of a Financial Crisis Early Warning System for Domestic Listed Companies Based on Two Neural Network Models in the Context of COVID‐19
Shang et al. Check for updates Machine Learning in Finance: A Brief Review
Shi Intelligent Analysis and Processing System of Financial Big Data Based on Neural Network Algorithm
Liu A public opinion monitoring system based on big data technology
Shi et al. Construction and research of regional green finance statistical model based on CVM-MLP neural network
Liu Countermeasures of Internet Rumor Management Based on Artificial Intelligence Technology
CN112364182B (en) Enterprise risk conduction prediction method, equipment and storage medium based on graph characteristics
Luo Instrumental Music Dissemination of Southwest Ethnic Minorities Based on Big Data Technology
Huang et al. Digital Transformation Strategy for Financial Management of Entity Enterprises in the Information Age
Sun et al. Study on Credit Default Risk Prediction Model Based on BP-RF Neural Network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160921

RJ01 Rejection of invention patent application after publication