CN109325853A - A financial data analysis method based on data mining - Google Patents
A financial data analysis method based on data mining
- Publication number
- CN109325853A (application CN201810252701.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- finance
- mining
- finance data
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/06—Asset management; Financial planning or analysis
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Finance (AREA)
- Theoretical Computer Science (AREA)
- Accounting & Taxation (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Development Economics (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Technology Law (AREA)
- General Business, Economics & Management (AREA)
- Bioinformatics & Computational Biology (AREA)
- Operations Research (AREA)
- Evolutionary Computation (AREA)
- Entrepreneurship & Innovation (AREA)
- Evolutionary Biology (AREA)
- Game Theory and Decision Science (AREA)
- Human Resources & Organizations (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a financial data analysis method based on data mining, comprising the following steps: establishing a financial data acquisition and analysis system, the system comprising: a financial data warehouse module for acquiring financial data and for storing and monitoring the data in real time; a data classification and processing module for classifying, managing, and preprocessing the data in the financial data warehouse module; a data analysis module for converting the data in the data classification and processing module into a patterned form and passing it to a data mining module; and the data mining module, which applies a financial data conversion algorithm, a financial data validation algorithm, and a financial data cleaning algorithm to the data to generate financial business information and send it to front-end users. The beneficial effects of the invention are as follows: by combining data mining technology with a strategy based on a TF-IDF naive Bayes model, it helps administrators and users formulate data-retrieval rules more conveniently, flexibly, and accurately, and lets front-end users obtain financial data from massive and complex Internet financial information resources in a targeted, professional, and accurate manner, providing a reference for users' financial investment.
Description
Technical field
The present invention relates to a method for collecting and analyzing financial data, and more specifically to a financial data analysis method based on data mining.
Background art
With the rapid development of the Internet, people in every field place ever higher demands on access to data. As networking advances, the financial industry imposes stricter requirements on the timeliness and accuracy of financial data and urgently needs more efficient and convenient ways to obtain it. How to identify, crawl, and process financial data quickly and in real time from the massive and complex financial information resources on the Internet is a major challenge in handling financial business. In the more than ten years since China's securities market was established, and with the continuous development of computer technology, informatization, and networking, financial institutions have stored and accumulated large amounts of raw financial data; how to keep improving the mining and management of that data has also become a topic of wide interest.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a financial data analysis method based on data mining that mines financial data accurately and analyzes it comprehensively.
The financial data analysis method based on data mining of the present invention comprises the following steps:
Step 1: establish a financial data acquisition and analysis system. The system comprises: a financial data warehouse module for acquiring financial data and for storing and monitoring the data in real time; a data classification and processing module for classifying, managing, and preprocessing the data in the financial data warehouse module; a data analysis module for converting the data in the data classification and processing module into a patterned form and passing it to a data mining module; and the data mining module, which applies a financial data conversion algorithm, a financial data validation algorithm, and a financial data cleaning algorithm to the data to generate financial business information and send it to front-end users;
Step 2: analyze the financial data mining objects and set the direction for learning financial data mining rules. The mining objects are web data; using C#.NET, ADO.NET data access, and SQL Server database technology, a layer-by-layer crawling strategy is adopted to collect financial data and time-series features;
Step 3: financial data acquisition and integration, which covers financial data acquisition, financial data integration, and the collection of news public-opinion trend data through natural language processing;
Step 4: analyze the collected data. Valid data is screened automatically and passed to the system management module; the analysis combines a news sentiment-tendency prediction strategy based on a TF-IDF naive Bayes model with fundamental data as a reference;
Step 5: data presentation. User needs are analyzed by combining user intent with the algorithm entity library, and user groups are identified; when an individual item of financial data changes, the transaction information is hardware-encrypted and sent as a transaction message, and the algorithm entity library analyzes the users and pushes the information to front-end users.
Preferably, the financial data mining objects are web data; using C#.NET, ADO.NET data access, and SQL Server database technology, a layer-by-layer crawling strategy is adopted to collect financial data and time-series features.
Preferably, the financial data mining rule learning covers financial data conversion rules, financial data validation rules, and financial data cleaning rules.
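To make the three rule types more concrete, the following is a minimal sketch and not the patent's implementation. It assumes records arrive as dictionaries of strings; the field names in `FIELD_MAP`, the required fields, and the duplicate key `(symbol, date)` are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Hypothetical conversion rule: source column name -> target field name.
FIELD_MAP = {"Code": "symbol", "Close": "close", "Date": "date"}
REQUIRED_FIELDS = ("symbol", "close", "date")


def transform(raw: Dict[str, str]) -> Dict[str, str]:
    """Conversion rule: rename mapped fields and drop unmapped ones."""
    return {FIELD_MAP[k]: v.strip() for k, v in raw.items() if k in FIELD_MAP}


def validate(record: Dict[str, str]) -> bool:
    """Validation rule: required fields are non-empty and close is a number >= 0."""
    if any(not record.get(name) for name in REQUIRED_FIELDS):
        return False
    try:
        return float(record["close"]) >= 0
    except ValueError:
        return False


def clean(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Cleaning rule: keep valid records and drop duplicates by (symbol, date)."""
    seen = set()
    kept = []
    for rec in records:
        key = (rec.get("symbol"), rec.get("date"))
        if validate(rec) and key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


raw_rows = [
    {"Code": "600000", "Close": "10.52", "Date": "2018-03-26", "Extra": "x"},
    {"Code": "600000", "Close": "10.52", "Date": "2018-03-26"},  # duplicate
    {"Code": "600001", "Close": "", "Date": "2018-03-26"},       # incomplete
]
cleaned = clean([transform(row) for row in raw_rows])
print(cleaned)
```

Keeping the three rules as separate functions mirrors the separation named in the text, so each rule can be changed without touching the others.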
Preferably, the financial data acquisition and integration includes financial data acquisition, financial data integration, and natural language processing of news public-opinion trends.
Preferably, the data analysis combines a news sentiment-tendency prediction strategy based on a TF-IDF naive Bayes model with fundamental data as a reference.
The beneficial effects of the present invention are: (1) using data mining technology, specified financial data is collected and mined from financial website pages, and useful financial business information is analyzed and extracted from it; combined with news public-opinion trends, this lets users in the financial industry better understand, grasp, and apply the patterns of their financial business; (2) it has practical significance for financial regulators, financial institutions, and investors in further analyzing and grasping the changing patterns of financial markets, carrying out effective financial supervision, operating financial business, and improving investment efficiency.
Description of the drawings
Fig. 1 is a schematic diagram of the functional module structure of the system of the present invention;
Fig. 2 is a schematic diagram of the system data flow of the present invention;
Fig. 3 is a schematic diagram of the functional structure of mining rule learning of the present invention;
Fig. 4 is a schematic diagram of the data mining flow of the present invention;
Fig. 5 is the prediction flowchart of the TF-IDF naive Bayes model of the present invention;
Fig. 6 is a schematic diagram of the system operation flow of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is further described below through specific embodiments, but the present invention is not limited to these embodiments.
The financial data analysis method based on data mining of the present invention comprises the following steps:
Step 1: establish a financial data acquisition and analysis system. The system comprises: a financial data warehouse module for acquiring financial data and for storing and monitoring the data in real time; a data classification and processing module for classifying, managing, and preprocessing the data in the financial data warehouse module; a data analysis module for converting the data in the data classification and processing module into a patterned form and passing it to a data mining module; and the data mining module, which applies a financial data conversion algorithm, a financial data validation algorithm, and a financial data cleaning algorithm to the data to generate financial business information and send it to front-end users;
Step 2: analyze the financial data mining objects and set the direction for learning financial data mining rules. The mining objects are web data; using C#.NET, ADO.NET data access, and SQL Server database technology, a layer-by-layer crawling strategy is adopted to collect financial data and time-series features;
Step 3: financial data acquisition and integration, which covers financial data acquisition, financial data integration, and the collection of news public-opinion trend data through natural language processing;
Step 4: analyze the collected data. Valid data is screened automatically and passed to the system management module; the analysis combines a news sentiment-tendency prediction strategy based on a TF-IDF naive Bayes model with fundamental data as a reference;
Step 5: data presentation. User needs are analyzed by combining user intent with the algorithm entity library, and user groups are identified; when an individual item of financial data changes, the transaction information is hardware-encrypted and sent as a transaction message, and the algorithm entity library analyzes the users and pushes the information to front-end users.
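The layer-by-layer crawling strategy of step 2 can be read as a breadth-first traversal in which every page at one link depth is visited before any page at the next depth. The sketch below illustrates that reading only. It is written in Python rather than the C#.NET stack named in the text, and it replaces real HTTP requests with a hypothetical in-memory `SITE` dictionary so that it runs without network access; `crawl_by_layer` and `max_depth` are illustrative names, not taken from the patent.

```python
from typing import Callable, Dict, List

# Hypothetical site structure: page -> links found on that page.
SITE: Dict[str, List[str]] = {
    "/index": ["/stocks", "/news"],
    "/stocks": ["/stocks/600000", "/stocks/600001"],
    "/news": ["/news/1", "/stocks/600000"],
    "/stocks/600000": [],
    "/stocks/600001": [],
    "/news/1": [],
}


def crawl_by_layer(
    start: str,
    fetch_links: Callable[[str], List[str]],
    max_depth: int,
) -> List[List[str]]:
    """Visit pages breadth-first and return one list of pages per depth level."""
    layers: List[List[str]] = []
    seen = {start}
    frontier = [start]
    for _ in range(max_depth + 1):
        if not frontier:
            break
        layers.append(frontier)
        next_frontier = []
        for page in frontier:
            for link in fetch_links(page):
                if link not in seen:  # never visit the same page twice
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return layers


layers = crawl_by_layer("/index", lambda page: SITE.get(page, []), max_depth=2)
for depth, pages in enumerate(layers):
    print(depth, pages)
```

In a real crawler, `fetch_links` would download the page and extract its links; passing it in as a function keeps the traversal logic testable on its own.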
Preferably, the financial data mining rule learning covers the following. Financial data conversion rules: mapping of financial data fields, automatic matching of information, how financial data fields are split, and conversion rule operations in which one field maps to multiple financial data fields. Financial data validation rules: handling of null values in each financial data field, constraint definitions for each field, and definitions of validation rules for the correctness and completeness of financial data. Financial data cleaning rules: to deal with problems that may arise when financial data is used, such as ambiguity, duplication, incompleteness, and violation of business rules, problematic records must be filtered and cleaned during financial data acquisition. The core code is as follows.
<td class="tdbgedit320">
<asp:TextBox ID="txt_LLoString" runat="server" CssClass="txt_Edit300" TextMode="MultiLine" Height="50px"></asp:TextBox>
<br/>
(For example, if the link code in the list looks like: &lt;a href='Article/Class1/1358.html' target='_blank'&gt;
<br/>
then the link start code should be set to: <font color="red">&lt;/a&gt;&lt;a href='</font>,
and the link end code should be set to: <font color="red">' target='_blank'&gt;</font>
<br/>
If the link start code and link end code are left blank, all link addresses in the list page will be collected!)
</td>
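The help text in the form above describes how a link is cut out of the list page between a start code and an end code. The following sketch shows that idea in Python; it is an illustration rather than the patent's code. The function name `extract_links`, the sample page, and the simplified start code `<a href='` are assumptions, and the fallback to all `href` values when either marker is blank is one reading of the note in the help text.

```python
import re
from typing import List


def extract_links(html: str, start_code: str = "", end_code: str = "") -> List[str]:
    """Return the text found between each start_code / end_code pair.

    If either marker is blank, fall back to every href value in the page.
    """
    if not (start_code and end_code):
        return re.findall(r"href=['\"]([^'\"]+)['\"]", html)
    links = []
    pos = 0
    while True:
        begin = html.find(start_code, pos)
        if begin == -1:
            break
        begin += len(start_code)
        end = html.find(end_code, begin)
        if end == -1:
            break
        links.append(html[begin:end])
        pos = end + len(end_code)
    return links


page = (
    "<li><a href='Article/Class1/1358.html' target='_blank'>A</a></li>"
    "<li><a href='Article/Class1/1359.html' target='_blank'>B</a></li>"
    "<a href='about.html'>About</a>"
)
list_links = extract_links(page, "<a href='", "' target='_blank'>")
all_links = extract_links(page)
print(list_links)
print(all_links)
```

Because the end code includes `target='_blank'`, the navigation link to `about.html` is skipped when both markers are set, which is the purpose of narrowing the match.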
Preferably, the financial data acquisition and integration includes financial data acquisition, financial data integration, and natural language processing of news public-opinion trends. Financial data acquisition comprises mining project settings, list page mining settings, and content page mining settings. The core code for the mining project settings is as follows:
The core code for the list page mining settings is as follows:
The core code for the content page mining settings is as follows:
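No listings survive in this text for the three groups of settings. Purely as an illustration of how they might relate, they can be modelled as plain configuration objects. Every class and field name below (`MiningProject`, `url_template`, `field_markers`, and so on) is hypothetical and not taken from the patent, and the URLs are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ListPageSettings:
    url_template: str           # list page URL with a {page} placeholder
    first_page: int = 1
    last_page: int = 1
    link_start_code: str = ""   # marker before each content link
    link_end_code: str = ""     # marker after each content link

    def urls(self) -> List[str]:
        """Expand the template into one URL per list page."""
        return [
            self.url_template.format(page=n)
            for n in range(self.first_page, self.last_page + 1)
        ]


@dataclass
class ContentPageSettings:
    # field name -> (start marker, end marker) delimiting its value in the page
    field_markers: Dict[str, Tuple[str, str]] = field(default_factory=dict)


@dataclass
class MiningProject:
    name: str
    site_url: str
    list_page: ListPageSettings
    content_page: ContentPageSettings
    encoding: str = "utf-8"


project = MiningProject(
    name="demo",
    site_url="http://example.com",
    list_page=ListPageSettings(
        url_template="http://example.com/list_{page}.html",
        first_page=1,
        last_page=3,
        link_start_code="<a href='",
        link_end_code="' target='_blank'>",
    ),
    content_page=ContentPageSettings(field_markers={"title": ("<h1>", "</h1>")}),
)
list_urls = project.list_page.urls()
print(list_urls)
```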
The natural language processing of news public-opinion trends is implemented in the TP_NaiveBayes class. According to the learning date set by the back-end administrator, the news text data for the target stock that was previously stored in the WebText database is retrieved from MongoDB. The text in the title and content fields of each news record is then merged and segmented with jieba, a third-party Python word-segmentation package, and the resulting tokens are screened with a filter function to remove function words, prepositions, modal particles, and the like. At the same time, the getUPOrDown method computes the delta value for the day, and the corresponding result is stored in labelMat as the basis for retrieving news records. The core code is as follows:
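The listing itself does not appear in this text. The sketch below only illustrates the preprocessing steps just described and makes several assumptions: a simple regular-expression tokenizer stands in for jieba so that the example runs with the standard library alone, the stop-word set is a toy list, and the day's delta is taken to be the closing price minus the opening price, which the text does not specify. The names `merge_text`, `tokenize`, `filter_tokens`, `get_up_or_down`, and `label_mat` are illustrative counterparts of the identifiers mentioned above.

```python
import re
from typing import Dict, List

# Toy stop-word list; a real one would cover function words and particles.
STOP_WORDS = {"the", "a", "of", "in", "on", "and", "to"}


def merge_text(record: Dict[str, str]) -> str:
    """Join the title and content fields of one news record."""
    return f"{record.get('title', '')} {record.get('content', '')}".strip()


def tokenize(text: str) -> List[str]:
    """Stand-in for jieba segmentation: lowercase and split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def filter_tokens(tokens: List[str]) -> List[str]:
    """Drop stop words from the token list."""
    return [tok for tok in tokens if tok not in STOP_WORDS]


def get_up_or_down(open_price: float, close_price: float) -> int:
    """Label the day 1 if the price rose, else 0 (delta = close - open is assumed)."""
    return 1 if close_price - open_price > 0 else 0


record = {
    "title": "Profit of the bank rises",
    "content": "Shares jump on strong earnings in Q1",
}
tokens = filter_tokens(tokenize(merge_text(record)))
label_mat = [get_up_or_down(10.0, 10.4)]
print(tokens, label_mat)
```

For Chinese news text, the `tokenize` function is the piece that a word-segmentation library would replace; the merging, filtering, and labelling steps stay the same.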
Preferably, the data analysis combines a news sentiment-tendency prediction strategy based on a TF-IDF naive Bayes model with fundamental data as a reference. The financial tendency is simulated comprehensively from the sentiment orientation of the evaluative words in news public opinion. The core code is as follows:
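The patent's listing is not available in this text. As a hedged illustration of what a TF-IDF naive Bayes classifier can look like, the sketch below trains a multinomial naive Bayes model on TF-IDF weights instead of raw counts, with add-one smoothing. This is one common way to combine the two techniques and may differ from the patent's method; the class name `TfidfNaiveBayes`, the smoothed IDF formula, and the toy training data are all assumptions. The combination with fundamental data is not shown.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def tfidf_vectors(docs: List[List[str]]) -> Tuple[List[Dict[str, float]], Dict[str, float]]:
    """Return one {term: tf-idf weight} dict per document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}  # smoothed IDF
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        size = max(len(doc), 1)
        vectors.append({t: (c / size) * idf[t] for t, c in counts.items()})
    return vectors, idf


class TfidfNaiveBayes:
    def fit(self, docs: List[List[str]], labels: List[str]) -> "TfidfNaiveBayes":
        vectors, self.idf = tfidf_vectors(docs)
        self.vocab = set(self.idf)
        self.class_docs = Counter(labels)
        self.weights = defaultdict(Counter)  # class -> term -> summed tf-idf weight
        for vec, label in zip(vectors, labels):
            self.weights[label].update(vec)
        self.totals = {c: sum(w.values()) for c, w in self.weights.items()}
        return self

    def predict(self, doc: List[str]) -> str:
        counts = Counter(t for t in doc if t in self.vocab)  # ignore unseen terms
        n_docs = sum(self.class_docs.values())
        best, best_score = None, -math.inf
        for c in self.class_docs:
            score = math.log(self.class_docs[c] / n_docs)  # class prior
            for t, k in counts.items():
                # Add-one smoothed likelihood of term t in class c.
                p = (self.weights[c][t] + 1.0) / (self.totals[c] + len(self.vocab))
                score += k * self.idf[t] * math.log(p)
            if score > best_score:
                best, best_score = c, score
        return best


train_docs = [
    ["profit", "rises", "strong"],
    ["shares", "jump", "profit"],
    ["loss", "widens", "weak"],
    ["shares", "fall", "loss"],
]
train_labels = ["up", "up", "down", "down"]
model = TfidfNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["profit", "jump"]), model.predict(["loss", "fall"]))
```

Weighting by TF-IDF reduces the influence of words such as "shares" that appear in both classes, which is the usual motivation for pairing it with naive Bayes on news text.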
By combining data mining technology with a strategy based on a TF-IDF naive Bayes model, the method helps administrators and users formulate data-retrieval rules more conveniently, flexibly, and accurately, and lets front-end users obtain financial data from massive and complex Internet financial information resources in a targeted, professional, and accurate manner, providing a reference for users' financial investment.
The above is only an embodiment of the present invention and does not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of this description, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present invention.
Claims (3)
1. A financial data analysis method based on data mining, characterized by comprising the following steps:
Step 1: establishing a financial data acquisition and analysis system, the system comprising: a financial data warehouse module for acquiring financial data and for storing and monitoring the data in real time; a data classification and processing module for classifying, managing, and preprocessing the data in the financial data warehouse module; a data analysis module for converting the data in the data classification and processing module into a patterned form and passing it to a data mining module; and the data mining module, which applies a financial data conversion algorithm, a financial data validation algorithm, and a financial data cleaning algorithm to the data to generate financial business information and send it to front-end users;
Step 2: analyzing the financial data mining objects, for setting the direction for learning financial data mining rules; the financial data mining objects are web data, and using C#.NET, ADO.NET data access, and SQL Server database technology, a layer-by-layer crawling strategy is adopted to collect financial data and time-series features;
Step 3: financial data acquisition and integration, for financial data acquisition, financial data integration, and the collection of news public-opinion trend data through natural language processing;
Step 4: analyzing the collected data, for automatically screening valid data and passing it to the system management module; the analysis combines a news sentiment-tendency prediction strategy based on a TF-IDF naive Bayes model with fundamental data as a reference;
Step 5: data presentation, for analyzing user needs by combining user intent with the algorithm entity library and identifying user groups; when an individual item of financial data changes, the transaction information is hardware-encrypted and sent as a transaction message, and the algorithm entity library analyzes the users and pushes the information to front-end users.
2. The financial data analysis method based on data mining according to claim 1, characterized in that the financial data warehouse module collects data from different data sources, with financial transaction data received from transaction terminals via data transmission.
3. The financial data analysis method based on data mining according to claim 1, characterized in that, in step 2, analyzing the financial data mining objects comprises first establishing the target address for collection rule learning, setting the selected data collection category, and setting collection rules according to the data characteristics; after a collection test, the data passes to the data mining module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810252701.2A CN109325853A (en) | 2018-03-26 | 2018-03-26 | A financial data analysis method based on data mining |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810252701.2A CN109325853A (en) | 2018-03-26 | 2018-03-26 | A financial data analysis method based on data mining |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109325853A true CN109325853A (en) | 2019-02-12 |
Family
ID=65263501
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810252701.2A Pending CN109325853A (en) | A financial data analysis method based on data mining | 2018-03-26 | 2018-03-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109325853A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112256195A (en) * | 2020-10-19 | 2021-01-22 | 安徽工业大学 | Financial data storage method and system based on GPU |
CN112835948A (en) * | 2019-11-22 | 2021-05-25 | 湖北经济学院 | Data management method based on internet finance |
CN112910923A (en) * | 2021-03-04 | 2021-06-04 | 麦荣章 | Intelligent financial big data processing system |
CN113253659A (en) * | 2021-06-04 | 2021-08-13 | 厦门致上信息科技有限公司 | Financial big data automatic acquisition and intelligent analysis system |
-
2018
- 2018-03-26 CN CN201810252701.2A patent/CN109325853A/en active Pending
Non-Patent Citations (2)
Title |
---|
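Yu Chun (余春): "Design and Implementation of a Financial Data Analysis System Based on Data Mining Technology", China Master's Theses Full-text Database, Information Science and Technology * |
Zhou Bizhang (周碧漳): "Research and Prototype Implementation of a Financial Data Processing Platform for Quantitative Trading", China Master's Theses Full-text Database, Information Science and Technology * |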
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112835948A (en) * | 2019-11-22 | 2021-05-25 | 湖北经济学院 | Data management method based on internet finance |
CN112256195A (en) * | 2020-10-19 | 2021-01-22 | 安徽工业大学 | Financial data storage method and system based on GPU |
CN112256195B (en) * | 2020-10-19 | 2022-11-01 | 安徽工业大学 | Financial data storage method and system based on GPU |
CN112910923A (en) * | 2021-03-04 | 2021-06-04 | 麦荣章 | Intelligent financial big data processing system |
CN113253659A (en) * | 2021-06-04 | 2021-08-13 | 厦门致上信息科技有限公司 | Financial big data automatic acquisition and intelligent analysis system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hedrick et al. | Digitization and the future of natural history collections | |
Sun et al. | Embracing textual data analytics in auditing with deep learning. | |
CN111309759B (en) | Intelligent matching platform for enterprise science and technology projects | |
CN109325853A (en) | A financial data analysis method based on data mining | |
CN102708096B (en) | Network intelligence public sentiment monitoring system based on semantics and work method thereof | |
CN105975984B (en) | Network quality evaluation method based on evidence theory | |
CN107220237A (en) | A kind of method of business entity's Relation extraction based on convolutional neural networks | |
CN107885793A (en) | A kind of hot microblog topic analyzing and predicting method and system | |
CN101350011B (en) | Method for detecting search engine cheat based on small sample set | |
CN105976056A (en) | Information extraction system based on bidirectional RNN | |
CN111414520B (en) | Intelligent mining system for sensitive information in public opinion information | |
CN102270212A (en) | User interest feature extraction method based on hidden semi-Markov model | |
Abuhay et al. | Analysis of publication activity of computational science society in 2001–2017 using topic modelling and graph theory | |
CN105138665A (en) | Online internet topic mining method based on improved LDA model | |
CN107292744A (en) | Investment Trend analysis method and its system based on machine learning | |
CN101819585A (en) | Device and method for constructing forum event dissemination pattern | |
CN109472462A (en) | A kind of project risk ranking method and device based on the fusion of multi-model storehouse | |
CN108416034B (en) | Information acquisition system based on financial heterogeneous big data and control method thereof | |
CN111813874B (en) | Terahertz knowledge graph construction method and system | |
Sharafat et al. | Data mining for smart legal systems | |
CN105808722A (en) | Information discrimination method and system | |
Zhang | Application of data mining technology in digital library. | |
WO2021210992A9 (en) | Systems and methods for determining entity attribute representations | |
CN112800229A (en) | Knowledge graph embedding-based semi-supervised aspect-level emotion analysis method for case-involved field | |
Putera et al. | How indonesia uses big data “indonesian one data” for the future of policy making |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20190212 |