CN109325853A - A financial data analysis method based on data mining - Google Patents

A financial data analysis method based on data mining

Info

Publication number
CN109325853A
Authority
CN
China
Prior art keywords
data
finance
mining
finance data
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810252701.2A
Other languages
Chinese (zh)
Inventor
闫国良
李苏
胡启云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zheng Qi Mdt Infotech Ltd
Original Assignee
Shanghai Zheng Qi Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zheng Qi Mdt Infotech Ltd
Priority to CN201810252701.2A
Publication of CN109325853A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Evolutionary Computation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a financial data analysis method based on data mining, comprising the following steps: establishing a financial data acquisition and analysis system that includes a financial data warehouse module for collecting financial data and storing and monitoring it in real time; a data classification and processing module for classifying, managing, and preprocessing the data in the financial data warehouse module; a data analysis module for converting the data from the data classification and processing module into patterns and passing it to a data mining module; and the data mining module, which applies a financial data conversion algorithm, a financial data verification algorithm, and a financial data cleaning algorithm to generate financial business information and send it to front-end users. The beneficial effect of the invention is that, by combining data mining technology with a TF-IDF-based naive Bayes tracking and prediction strategy, administrators and users can configure data crawling more conveniently, flexibly, and accurately, and front-end users can obtain financial data from the massive and complex financial information resources of the Internet in a targeted, professional, and accurate way, which provides a reference for their financial investment.

Description

A financial data analysis method based on data mining
Technical field
The present invention relates to a method for collecting and analyzing financial data, and more specifically to a financial data analysis method based on data mining.
Background art
With the rapid development of the Internet, people in every field place ever higher demands on access to data. As networking advances, the financial industry demands greater timeliness and accuracy from financial data and urgently needs more efficient and convenient ways to obtain it. Identifying, crawling, and processing financial data quickly and in real time from the massive and complex financial information resources on the Internet is a major challenge in handling financial business. In the more than ten years since China's securities market was established, and with the continuous development of computer technology, informatization, and networking, financial institutions have stored and accumulated large volumes of raw financial data. How to keep improving the management and mining of this data has also become a hot topic.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a financial data analysis method based on data mining that mines financial data accurately and analyzes it comprehensively.
The financial data analysis method based on data mining of the present invention comprises the following steps:
Step 1: establish a financial data acquisition and analysis system. The system includes: a financial data warehouse module for collecting financial data and storing and monitoring it in real time; a data classification and processing module for classifying, managing, and preprocessing the data in the financial data warehouse module; a data analysis module for converting the data from the data classification and processing module into patterns and passing it to a data mining module; and the data mining module, which applies a financial data conversion algorithm, a financial data verification algorithm, and a financial data cleaning algorithm to generate financial business information and send it to front-end users;
Step 2: analyze the financial data mining object and set the direction of financial data mining rule learning. The mining object is web page data; C#.NET, ADO.NET data access, and SQL Server database technologies are used, and a layer-by-layer crawling strategy is adopted to collect financial data and time-series features;
Step 3: financial data acquisition and integration, covering financial data acquisition, financial data integration, and acquisition of news public-opinion trend data through natural language processing;
Step 4: analyze the collected data. Valid data is screened automatically and passed to the system management module; a reference mode is used that combines a news sentiment orientation prediction strategy based on a TF-IDF naive Bayes model with fundamental data;
Step 5: data display. User needs are analyzed by combining user intent with the algorithm entity library, and user groups are identified. When a piece of financial data changes, the transaction information is hardware-encrypted and sent as a transaction message, and the algorithm entity library analyzes the users and pushes the information to front-end users.
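As a rough illustration of how the modules listed in Step 1 could hand data to one another, the following Python sketch chains a warehouse, a classification step, and a mining step. The patent names C#.NET as its implementation technology and does not publish this logic, so every class, function, and field name here is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    source: str          # where the record was collected from
    category: str = ""   # filled in by the classification step
    payload: dict = field(default_factory=dict)


class Warehouse:
    """Hypothetical stand-in for the financial data warehouse module."""

    def __init__(self):
        self.records = []

    def collect(self, source, payload):
        rec = Record(source=source, payload=dict(payload))
        self.records.append(rec)
        return rec


def classify(records):
    """Classification and processing step: drop empty records and tag the rest."""
    out = []
    for rec in records:
        if not rec.payload:
            continue  # preprocessing: skip records with no content
        keys = {k.lower() for k in rec.payload}
        rec.category = "news" if "title" in keys else "quote"
        out.append(rec)
    return out


def mine(records):
    """Mining step: convert field names, verify values, clean strings."""
    results = []
    for rec in records:
        data = {k.lower(): v for k, v in rec.payload.items()}      # conversion
        if any(v is None for v in data.values()):                   # verification
            continue
        data = {k: v.strip() if isinstance(v, str) else v
                for k, v in data.items()}                           # cleaning
        results.append({"category": rec.category, **data})
    return results


warehouse = Warehouse()
warehouse.collect("site-a", {"Title": " Rates rise ", "Body": "..."})
warehouse.collect("site-b", {"Code": "600000", "Close": 10.5})
warehouse.collect("site-c", {})
info = mine(classify(warehouse.records))
```

In this toy run the empty record from `site-c` is dropped and `info` holds one news item and one quote, which is the kind of output that would then be pushed to front-end users.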
Preferably, the financial data mining object is web page data; C#.NET, ADO.NET data access, and SQL Server database technologies are used, and a layer-by-layer crawling strategy is adopted to collect financial data and time-series features.
Preferably, financial data mining rule learning includes financial data transformation rules, financial data verification rules, and financial data cleaning rules.
Preferably, financial data acquisition and integration includes financial data acquisition, financial data integration, and natural language processing of news public-opinion trends.
Preferably, data analysis uses a reference mode that combines a news sentiment orientation prediction strategy based on a TF-IDF naive Bayes model with fundamental data.
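The three rule types named above (transformation, verification, and cleaning) can be pictured with a small Python sketch. It is only an illustration under assumed field names (`StockCode`, `ClosePrice`, `Period`) and an assumed `start ~ end` period format; none of these come from the patent.

```python
# Hypothetical mapping from source field names to internal field names.
FIELD_MAP = {"StockCode": "code", "ClosePrice": "close", "Period": "period"}


def transform(raw, field_map):
    """Transformation rule: rename mapped fields and split a 'start ~ end' period."""
    rec = {field_map[k]: v for k, v in raw.items() if k in field_map}
    if isinstance(rec.get("period"), str) and "~" in rec["period"]:
        start, end = rec.pop("period").split("~", 1)
        rec["start"], rec["end"] = start.strip(), end.strip()
    return rec


def verify(rec, required=("code", "close")):
    """Verification rule: required fields are non-null and the price is positive."""
    if any(rec.get(name) in (None, "") for name in required):
        return False
    try:
        return float(rec["close"]) > 0
    except (TypeError, ValueError):
        return False


def clean(records, key=("code", "start", "end")):
    """Cleaning rule: drop records that fail verification and drop duplicates."""
    seen, kept = set(), []
    for rec in records:
        ident = tuple(rec.get(k) for k in key)
        if not verify(rec) or ident in seen:
            continue
        seen.add(ident)
        kept.append(rec)
    return kept


raw_rows = [
    {"StockCode": "600000", "ClosePrice": "10.5", "Period": "2018-03-01 ~ 2018-03-02"},
    {"StockCode": "600000", "ClosePrice": "10.5", "Period": "2018-03-01 ~ 2018-03-02"},  # duplicate
    {"StockCode": "600001", "ClosePrice": "", "Period": "2018-03-01 ~ 2018-03-02"},      # null price
]
cleaned = clean([transform(r, FIELD_MAP) for r in raw_rows])
```

Only the first row survives: the second is a duplicate and the third has an empty price.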
The beneficial effects of the present invention are: (1) data mining technology is used to collect and mine specified financial data from financial web pages and to analyze and extract useful financial business information from it; combined with news public-opinion trends, this helps users in the financial industry better understand, grasp, and apply the patterns of their financial business; (2) the method has practical significance for financial regulators, financial institutions, and investors in further analyzing and grasping how financial markets change, conducting effective financial supervision, operating financial business, and improving investment efficiency.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system's functional module structure;
Fig. 2 is a schematic diagram of the system's data flow;
Fig. 3 is a schematic diagram of the functional structure of mining rule learning;
Fig. 4 is a schematic diagram of the data mining process;
Fig. 5 is the prediction flow chart of the TF-IDF naive Bayes model;
Fig. 6 is a schematic diagram of the system's operating process.
Detailed description of embodiments
The technical solution of the present invention is further described below through specific embodiments, but the invention is not limited to these embodiments.
The financial data analysis method based on data mining of the present invention comprises the following steps:
Step 1: establish a financial data acquisition and analysis system. The system includes: a financial data warehouse module for collecting financial data and storing and monitoring it in real time; a data classification and processing module for classifying, managing, and preprocessing the data in the financial data warehouse module; a data analysis module for converting the data from the data classification and processing module into patterns and passing it to a data mining module; and the data mining module, which applies a financial data conversion algorithm, a financial data verification algorithm, and a financial data cleaning algorithm to generate financial business information and send it to front-end users;
Step 2: analyze the financial data mining object and set the direction of financial data mining rule learning. The mining object is web page data; C#.NET, ADO.NET data access, and SQL Server database technologies are used, and a layer-by-layer crawling strategy is adopted to collect financial data and time-series features;
Step 3: financial data acquisition and integration, covering financial data acquisition, financial data integration, and acquisition of news public-opinion trend data through natural language processing;
Step 4: analyze the collected data. Valid data is screened automatically and passed to the system management module; a reference mode is used that combines a news sentiment orientation prediction strategy based on a TF-IDF naive Bayes model with fundamental data;
Step 5: data display. User needs are analyzed by combining user intent with the algorithm entity library, and user groups are identified. When a piece of financial data changes, the transaction information is hardware-encrypted and sent as a transaction message, and the algorithm entity library analyzes the users and pushes the information to front-end users.
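Step 2 names a layer-by-layer crawling strategy without describing it further. One common reading is a breadth-first traversal with a depth limit, sketched below in Python. This is an assumption rather than the patent's method, and the function name `crawl_by_layer`, the `get_links` callback, and the in-memory `SITE` are all hypothetical.

```python
from collections import deque


def crawl_by_layer(start_url, get_links, max_depth=2):
    """Visit pages breadth-first, one layer at a time, up to max_depth.

    get_links(url) is supplied by the caller and returns the links on a page;
    a real collector would download and parse the page there.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    layers = {}
    while queue:
        url, depth = queue.popleft()
        layers.setdefault(depth, []).append(url)
        if depth == max_depth:
            continue  # do not expand pages on the last layer
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return layers


# A tiny in-memory "site" standing in for real pages.
SITE = {
    "index": ["list1", "list2"],
    "list1": ["article1", "article2"],
    "list2": ["article2", "article3"],
}
layers = crawl_by_layer("index", lambda url: SITE.get(url, []))
```

Here layer 0 is the entry page, layer 1 the list pages, and layer 2 the content pages, with `article2` visited only once even though two list pages link to it.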
Preferably, financial data mining rule learning includes the following. Financial data transformation rules: field mapping for financial data, automatic matching of information for each field, how financial data fields are split, and transformation operations that map multiple financial data fields. Financial data verification rules: handling of null values in each field, constraint definitions for each field, and rule definitions for checking the correctness and completeness of the financial data. Financial data cleaning rules: to deal with problems that may arise when financial data is used, such as ambiguity, duplication, incompleteness, and violation of business rules, problematic records must be filtered and cleaned during financial data acquisition. The core code is as follows.
<td class="tdbgedit320">
<asp:TextBox ID="txt_LLoString" runat="server" CssClass="txt_Edit300" TextMode="MultiLine" Height="50px"></asp:TextBox>
<br/>
(For example, if the link code in the list looks like: &lt;a href='Article/Class1/1358.html' target='_blank'&gt;
<br/>
then the link start code should be set to: <font color="red">&lt;a href='</font>,
and the link end code should be set to: <font color="red">' target='_blank'&gt;</font>
<br/>
If the link start code and link end code are left blank, every link address in the list page will be collected!)
</td>
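The help text in the form above describes extracting link addresses that sit between a configured start code and end code, and falling back to every link when neither is given. A minimal Python sketch of that behaviour, assuming simple string markers and a hypothetical function name `extract_links`, might look like this:

```python
import re


def extract_links(html, start_code="", end_code=""):
    """Return link addresses found between start_code and end_code.

    If either marker is empty, fall back to every href value in the page,
    mirroring the behaviour described in the help text.
    """
    if not start_code or not end_code:
        return re.findall(r"""href\s*=\s*['"]([^'"]+)['"]""", html, flags=re.IGNORECASE)
    pattern = re.escape(start_code) + r"(.*?)" + re.escape(end_code)
    return re.findall(pattern, html, flags=re.DOTALL)


page = (
    "<a href='Article/Class1/1358.html' target='_blank'>Item A</a>"
    "<a href='about.html'>About</a>"
)
targeted = extract_links(page, "<a href='", "' target='_blank'>")
everything = extract_links(page)
```

With the markers set, only the article link is returned; without them, the navigation link `about.html` is picked up as well, which is why the form warns about leaving the fields blank.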
Preferably, financial data acquisition and integration includes financial data acquisition, financial data integration, and natural language processing of news public-opinion trends. Financial data acquisition includes mining item settings, list page mining settings, and content page mining settings. The core code for the mining item settings is as follows:
The core code for the list page mining settings is as follows:
The core code for the content page mining settings is as follows:
The natural language processing of news public-opinion trends is implemented in the TP_NaiveBayes class. According to the learning dates set by the back-end administrator, the previously stored news text data for the target stock is read from the WebText database in MongoDB. The text in the title and content fields of each news record is then merged and segmented with the third-party Python package jieba, and a filter function screens the resulting tokens to remove function words, prepositions, modal particles, and similar words. At the same time, the getUPOrDown method calculates the delta value for the day, and the result is stored in labelMat as the basis for retrieving news records. The core code is as follows:
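The listing referred to above is not reproduced in this text. As a hedged sketch of the preprocessing it describes (merging title and content, segmenting, filtering, and labelling each day), the Python below uses a plain regular-expression tokenizer so that it runs without extra packages; for Chinese text, `jieba.lcut` could be passed as the `tokenize` argument instead. The stop-word list, the price data, and the rule that a day is labelled 1 when the close exceeds the open are assumptions, since the patent does not define how the delta value is computed.

```python
import re

# Hypothetical stop-word list; the patent does not enumerate the filtered words.
STOP_WORDS = {"the", "a", "of", "in", "to", "and", "on"}


def simple_tokenize(text):
    """Placeholder tokenizer for English text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def preprocess(record, tokenize=simple_tokenize, stop_words=STOP_WORDS):
    """Merge title and content, segment the text, and drop stop words."""
    text = f"{record['title']} {record['content']}"
    return [tok for tok in tokenize(text) if tok not in stop_words]


def get_up_or_down(open_price, close_price):
    """Return 1 if the close is above the open, otherwise 0 (assumed definition)."""
    return 1 if close_price - open_price > 0 else 0


news = [
    {"title": "Profit surges", "content": "The bank reported a surge in profit.", "date": "2018-03-01"},
    {"title": "Shares slump", "content": "Losses widen on weak demand.", "date": "2018-03-02"},
]
prices = {"2018-03-01": (10.0, 10.6), "2018-03-02": (10.6, 10.1)}  # (open, close)

docs = [preprocess(rec) for rec in news]
label_mat = [get_up_or_down(*prices[rec["date"]]) for rec in news]
```

Each entry of `label_mat` lines up with the token list at the same index in `docs`, which is the pairing a classifier needs for training.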
Preferably, data analysis uses a reference mode that combines a news sentiment orientation prediction strategy based on a TF-IDF naive Bayes model with fundamental data. Financial trends are simulated comprehensively from the sentiment orientation of evaluative words in news public opinion. The core code is as follows:
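The listing referred to above is not reproduced in this text, and the patent does not state how TF-IDF and naive Bayes are combined. One common construction, shown below as an assumption, is a multinomial naive Bayes classifier that accumulates TF-IDF weights in place of raw word counts, with Laplace smoothing. The class name `TfidfNaiveBayes` and the toy training data are hypothetical.

```python
import math
from collections import Counter, defaultdict


class TfidfNaiveBayes:
    """Multinomial naive Bayes trained on TF-IDF weights instead of raw counts."""

    def fit(self, docs, labels):
        n_docs = len(docs)
        doc_freq = Counter(tok for doc in docs for tok in set(doc))
        # Smoothed inverse document frequency for every training word.
        self.idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
        self.vocab = set(self.idf)
        self.class_weights = defaultdict(lambda: defaultdict(float))
        class_counts = Counter(labels)
        for doc, label in zip(docs, labels):
            for tok, w in self._tfidf(doc).items():
                self.class_weights[label][tok] += w
        self.log_prior = {c: math.log(n / n_docs) for c, n in class_counts.items()}
        self.class_total = {c: sum(self.class_weights[c].values()) for c in class_counts}
        return self

    def _tfidf(self, doc):
        # Words never seen in training are ignored.
        counts = Counter(tok for tok in doc if tok in self.idf)
        total = sum(counts.values())
        if not total:
            return {}
        return {t: (c / total) * self.idf[t] for t, c in counts.items()}

    def predict(self, doc):
        weights = self._tfidf(doc)
        best_label, best_score = None, -math.inf
        for label, prior in self.log_prior.items():
            denom = self.class_total[label] + len(self.vocab)  # Laplace smoothing
            score = prior + sum(
                w * math.log((self.class_weights[label].get(tok, 0.0) + 1.0) / denom)
                for tok, w in weights.items()
            )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train_docs = [
    ["profit", "surge", "growth"],
    ["record", "profit", "gain"],
    ["loss", "slump", "weak"],
    ["loss", "decline", "weak", "demand"],
]
train_labels = ["up", "up", "down", "down"]
model = TfidfNaiveBayes().fit(train_docs, train_labels)
pred_up = model.predict(["profit", "gain", "strong"])
pred_down = model.predict(["loss", "weak"])
```

The predicted label would be only one input to the reference mode described above; how it is weighed against fundamental data is not specified in the text.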
By combining data mining technology with a TF-IDF-based naive Bayes tracking and prediction strategy, the method helps administrators and users configure data crawling more conveniently, flexibly, and accurately, and lets front-end users obtain financial data from the massive and complex financial information resources of the Internet in a targeted, professional, and accurate way, which provides a reference for their financial investment.
The above is only an embodiment of the present invention and does not limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of this description, or any direct or indirect application in other related technical fields, likewise falls within the scope of patent protection of the present invention.

Claims (3)

1. A financial data analysis method based on data mining, characterized by comprising the following steps:
Step 1: establish a financial data acquisition and analysis system. The system includes: a financial data warehouse module for collecting financial data and storing and monitoring it in real time; a data classification and processing module for classifying, managing, and preprocessing the data in the financial data warehouse module; a data analysis module for converting the data from the data classification and processing module into patterns and passing it to a data mining module; and the data mining module, which applies a financial data conversion algorithm, a financial data verification algorithm, and a financial data cleaning algorithm to generate financial business information and send it to front-end users;
Step 2: analyze the financial data mining object in order to set the direction of financial data mining rule learning. The mining object is web page data; C#.NET, ADO.NET data access, and SQL Server database technologies are used, and a layer-by-layer crawling strategy is adopted to collect financial data and time-series features;
Step 3: financial data acquisition and integration, used for financial data acquisition, financial data integration, and acquisition of news public-opinion trend data through natural language processing;
Step 4: analyze the collected data, automatically screening valid data and passing it to the system management module; a reference mode is used that combines a news sentiment orientation prediction strategy based on a TF-IDF naive Bayes model with fundamental data;
Step 5: data display, used to analyze user needs by combining user intent with the algorithm entity library and to identify user groups. When a piece of financial data changes, the transaction information is hardware-encrypted and sent as a transaction message, and the algorithm entity library analyzes the users and pushes the information to front-end users.
2. The financial data analysis method based on data mining according to claim 1, characterized in that the financial data warehouse module collects data from different data sources by having a transaction terminal receive financial transaction data via data transmission.
3. The financial data analysis method based on data mining according to claim 1, characterized in that, in Step 2, analyzing the financial data mining object first establishes the target address for collection rule learning, sets the data collection category, and sets the collection rules according to the characteristics of the data; after a collection test, the data is passed to the data mining module.
CN201810252701.2A 2018-03-26 2018-03-26 A financial data analysis method based on data mining Pending CN109325853A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810252701.2A CN109325853A (en) 2018-03-26 2018-03-26 A financial data analysis method based on data mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810252701.2A CN109325853A (en) 2018-03-26 2018-03-26 A financial data analysis method based on data mining

Publications (1)

Publication Number Publication Date
CN109325853A true CN109325853A (en) 2019-02-12

Family

ID=65263501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810252701.2A Pending CN109325853A (en) 2018-03-26 2018-03-26 A financial data analysis method based on data mining

Country Status (1)

Country Link
CN (1) CN109325853A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256195A (en) * 2020-10-19 2021-01-22 安徽工业大学 Financial data storage method and system based on GPU
CN112835948A (en) * 2019-11-22 2021-05-25 湖北经济学院 Data management method based on internet finance
CN112910923A (en) * 2021-03-04 2021-06-04 麦荣章 Intelligent financial big data processing system
CN113253659A (en) * 2021-06-04 2021-08-13 厦门致上信息科技有限公司 Financial big data automatic acquisition and intelligent analysis system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
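Yu Chun (余春): "Design and Implementation of a Financial Data Analysis System Based on Data Mining Technology" (基于数据挖掘技术的金融数据分析系统设计与实现), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库信息科技辑》) *
Zhou Bizhang (周碧漳): "Research and Prototype Implementation of a Financial Data Processing Platform for Quantitative Trading" (面向量化交易的金融数据处理平台研究与原型实现), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库信息科技辑》) *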

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112835948A (en) * 2019-11-22 2021-05-25 湖北经济学院 Data management method based on internet finance
CN112256195A (en) * 2020-10-19 2021-01-22 安徽工业大学 Financial data storage method and system based on GPU
CN112256195B (en) * 2020-10-19 2022-11-01 安徽工业大学 Financial data storage method and system based on GPU
CN112910923A (en) * 2021-03-04 2021-06-04 麦荣章 Intelligent financial big data processing system
CN113253659A (en) * 2021-06-04 2021-08-13 厦门致上信息科技有限公司 Financial big data automatic acquisition and intelligent analysis system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190212