CN109992704A - Enterprise public opinion monitoring system and method based on a long short-term memory neural network
- Publication number
- CN109992704A (application number CN201910183686.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- shot
- neural networks
- long term
- term memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an enterprise public opinion monitoring system and method based on a long short-term memory (LSTM) neural network. The system comprises a data crawling and storage module, a data cleaning and sorting module, a sample data labeling and processing module, an LSTM neural network training model, and a statistical analysis module. The disclosed method collects and cleans multi-dimensional enterprise-related data in real time, analyzes the sentiment orientation of the data with the trained LSTM neural network training model, and finally uses a statistical analysis method to weigh the sentiment orientation across the enterprise's data dimensions, draw a conclusion, and push the conclusion to the client in real time. The invention uses distributed big data processing technology, so the whole pipeline runs faster and the results are more timely. Moreover, the more dimensions the invention considers, the more comprehensive the assessment, which yields more accurate results and reduces error.
Description
Technical field
The present invention relates to the technical field of public opinion monitoring, and in particular to an enterprise public opinion monitoring system and method based on a long short-term memory (LSTM) neural network.
Background technique
Society today is developing rapidly and has entered the data age. Data is overloaded, and it is generated far faster than people can make sense of it. Enterprise-related data is scattered across every corner of the internet, so it is hard to gather comprehensively; even when the data is obtained promptly, the many and varied sources cannot be analyzed quickly enough to reach a conclusion and make a decision. Because enterprise information is obtained late and incompletely, many potential enterprise risks are not discovered in time, which often causes the enterprise incalculable losses.
An enterprise's news data is spread across the major news websites, its judicial data is on court judgment document websites, and its industrial and commercial registration data is on industry and commerce websites. Enterprise data may change at any moment, and every change carries potential risk. To avoid risk, these data must be obtained in time, analyzed, and combined across multiple dimensions to reach a conclusion.
Most of the prior art monitors only one particular kind of enterprise data; the more dimensions it considers, the more complex it becomes and the larger the error. The prior art also tends to suffer considerable delay in data collection and in result feedback.
Summary of the invention
To solve the above technical problems, the present invention provides an enterprise public opinion monitoring system and method based on an LSTM neural network, with the aim of more accurate results, a faster processing pipeline, and better timeliness.
To achieve the above aims, the technical solution of the present invention is as follows:
An enterprise public opinion monitoring system based on an LSTM neural network comprises a data crawling and storage module, a data cleaning and sorting module, a sample data labeling and processing module, an LSTM neural network training model, and a statistical analysis module.
In the above solution, the LSTM neural network training model has four layers: the first is a word vector layer; the second is an LSTM layer, which is the core of the training model; the third is a Dropout layer, which prevents the model from overfitting; and the fourth is a Dense layer.
An enterprise public opinion monitoring method based on an LSTM neural network comprises the following steps:
(1) collecting, in real time and using distributed crawler technology, multiple classes of enterprise data covering news, judicial, industrial and commercial, and business operation aspects;
(2) training the LSTM neural network training model;
(3) using the trained LSTM neural network training model to assign classification labels to newly acquired data;
(4) using a statistical analysis method to calculate the enterprise's public opinion from the classification labels of the data, and pushing the result.
In a further technical solution, in step (1), the collected multi-class enterprise data is stored in the HDFS distributed file system of Hadoop, cleaned and sorted on a daily schedule, and the cleaned data is stored in HBase.
In a further technical solution, in step (2), the LSTM neural network training model is trained as follows: first, a certain amount of sample data is randomly selected and labeled, segmented into words with the word segmentation tool jieba, and converted into high-dimensional word vectors; then, the sample data is divided proportionally into three parts: a training set, a validation set, and a test set; finally, the model is trained on the labeled classes of the sample data in the training set, validated with the sample data in the validation set, and tested with the sample data in the test set. Once the test results meet the requirements, the trained LSTM neural network training model is obtained.
Further, the sample data is labeled in two ways: one is content classification, and the other is sentiment classification.
Through the above technical solution, the enterprise public opinion monitoring system and method based on an LSTM neural network provided by the present invention collects and cleans multi-dimensional enterprise-related data in real time, analyzes the sentiment orientation of the data with the trained LSTM neural network training model, and finally uses a statistical analysis method to weigh the sentiment orientation across the enterprise's data dimensions, draw a conclusion, and push the conclusion to the client in real time. The invention uses distributed big data processing technology, so the whole pipeline runs faster and the results are more timely. Moreover, the more dimensions the invention considers, the more comprehensive the assessment, which yields more accurate results and reduces error.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawing needed in the description of the embodiments or the prior art is briefly introduced below.
Fig. 1 is a flow diagram of an enterprise public opinion monitoring method based on an LSTM neural network disclosed in an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments.
The present invention provides an enterprise public opinion monitoring system and method based on an LSTM neural network. A specific embodiment is as follows:
Step 1: data crawling
Using the data crawling and storage module, find websites that carry enterprise news, judicial, industrial and commercial, and business operation data; select those whose data is updated relatively promptly and whose data volume is relatively large; analyze their web page structure; and collect from them on a daily schedule.
Step 2: data storage
News, judicial, industrial and commercial, and business operation data for nearly 80 million enterprises must be stored every day, and all of it is text. A traditional database cannot meet this requirement, so the data is stored in the HDFS distributed file system of Hadoop.
Step 3: data cleaning and sorting
The data collected from websites is messy: large numbers of web page tags and irrelevant data are collected along with it and must be cleaned out, and the data formats of different sources differ widely and must be unified. The data cleaning and sorting module cleans and sorts the newly stored enterprise news, judicial, industrial and commercial, and business operation data on a daily schedule, and the cleaned data is stored in HBase.
The data cleaning methods include the following:
1. replacing missing values with the mode;
2. identifying erroneous values using variance analysis;
3. detecting duplicate records and merging them;
4. unifying data of different formats from different data sources into a single format.
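The four cleaning methods listed above can be sketched as small helper functions. This is only an illustrative sketch, not the patent's implementation: the function names, the record fields, the date formats, and the reading of "variance analysis" as a standard-deviation (z-score) threshold are all assumptions introduced here.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def fill_missing_with_mode(values):
    """Method 1: replace None entries with the most common non-missing value."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)
    mode = Counter(present).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def flag_outliers(values, z_threshold=3.0):
    """Method 2 (assumed reading): flag indices whose distance from the mean
    exceeds z_threshold population standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_threshold]


def merge_duplicates(records, key):
    """Method 3: merge records sharing the same key; a later record only
    fills fields that are still missing or empty."""
    merged = {}
    for rec in records:
        target = merged.setdefault(rec[key], {})
        for field, value in rec.items():
            if target.get(field) in (None, ""):
                target[field] = value
    return list(merged.values())


def unify_date(text):
    """Method 4: normalize 'YYYY/M/D', 'YYYY.M.D' or 'YYYY-M-D' to 'YYYY-MM-DD'."""
    year, month, day = re.split(r"[-/.]", text.strip())
    return f"{int(year):04d}-{int(month):02d}-{int(day):02d}"


# Hypothetical records, for illustration only.
rows = [
    {"id": "A", "name": "Acme", "court": ""},
    {"id": "A", "name": "", "court": "District Court"},
]
print(merge_duplicates(rows, key="id"))
print(unify_date("2019/3/12"))
```

With very small samples a z-score can never exceed a modest bound, so the threshold would need tuning against real data volumes.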
Step 4: sample data labeling
A certain amount of sample data is randomly selected, labeling rules are drawn up by combining statistical analysis methods with business experience, and the sample data labeling and processing module is used to label the data. There are two kinds of labels. One is content classification: 1 listing and delisting, 2 business operations, 3 loss and profit, 4 product quality, 5 infringement and plagiarism, 6 credit and reputation, 7 debt and mortgage, 8 closure and bankruptcy, 9 suspension of business, 10 bankruptcy liquidation, 11 tax evasion, 12 other announcements, 13 business disputes, 14 cooperative operation, 15 contract disputes, 16 employee matters, 17 shareholding increase and reduction, 18 safety incidents, 19 achievements and awards, 20 investment and financing, 21 equity changes, 22 acquisition and restructuring, 23 new product upgrades, 24 litigation and violations, 25 negative news about senior executives, 26 environmental protection, 27 share price changes, 28 fraud and falsification, 29 major transactions, 30 senior executive changes, 31 policies and regulations, 32 related mentions. The other is sentiment classification: 0 neutral, 1 negative, 2 positive, 3 related.
Step 5: word segmentation
The third-party word segmentation tool jieba, in the sample data labeling and processing module, splits the text data into individual words. To improve the training of the subsequent model, a large number of dictionary entries need to be added.
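Why adding dictionary entries matters can be seen with a toy segmenter. The sketch below uses forward maximum matching, a simple dictionary-based technique chosen here only because it fits in a few lines; it is not the algorithm jieba uses, and the function name, the sample sentence, and both dictionaries are hypothetical.

```python
def segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position take the longest
    dictionary word, falling back to a single character."""
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical dictionaries: the second adds one domain term.
base_dictionary = {"公司", "破产", "清算"}
extended_dictionary = base_dictionary | {"破产清算"}

sentence = "公司破产清算"
print(segment(sentence, base_dictionary))
print(segment(sentence, extended_dictionary))
```

With the extended dictionary the domain term "破产清算" (bankruptcy liquidation) stays together as one token, which is the kind of effect a larger custom dictionary is meant to achieve before training.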
Step 6: converting words into word vectors
Only after words are converted into high-dimensional word vectors can they be used to train a deep learning model, and the quality of this conversion directly affects the model's results. Words are converted into high-dimensional word vectors with word2vec in the sample data labeling and processing module. A major advantage of word2vec is that it places semantically similar words at nearby positions in the vector space.
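The idea that semantically similar words sit close together can be made concrete with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; they are not word2vec output, and the words and helper names are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word` by cosine similarity."""
    return max(
        (w for w in vectors if w != word),
        key=lambda w: cosine_similarity(vectors[word], vectors[w]),
    )


# Hypothetical toy vectors; real word vectors would have far more dimensions.
toy_vectors = {
    "bankruptcy": [0.9, 0.1, 0.0],
    "liquidation": [0.8, 0.2, 0.1],
    "award": [0.0, 0.1, 0.9],
}
print(most_similar("bankruptcy", toy_vectors))
```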
Step 7: training the LSTM neural network model
The sample data is divided proportionally into three parts: a training set, a validation set, and a test set.
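A proportional three-way split might look like the following sketch. The patent does not state the proportions, so the 80/10/10 ratio, the fixed random seed, and the function name are assumptions made for illustration.

```python
import random


def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the samples and cut them into training, validation and test sets.
    The ratios are hypothetical; the source only says 'in proportion'."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, so no sample is dropped
    return train, validation, test


train_set, validation_set, test_set = split_dataset(range(100))
print(len(train_set), len(validation_set), len(test_set))
```

Shuffling before cutting keeps the three sets representative of the same label distribution on average; a stratified split would be a stricter alternative.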
This LSTM model has four layers:
The first layer is the word vector layer: if Step 6 converts each word into a k-dimensional word vector and there are n words in total, the input is an n*k matrix;
The second layer is the LSTM layer, the core of the training model;
The third layer is the Dropout layer, which prevents the model from overfitting;
The fourth layer is the Dense layer.
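To show how data flows through the four layers, here is a forward pass written in plain Python. It is a sketch under stated assumptions rather than the patent's model: the weights are random and untrained, biases are omitted, the hidden size, vocabulary size, and dropout rate are arbitrary, and the softmax after the Dense layer is an assumption (the source names only a Dense layer). The class and helper names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    def __init__(self, vocab_size, k, hidden, n_classes, dropout=0.5, seed=0):
        self.rng = random.Random(seed)
        self.hidden = hidden
        self.dropout = dropout
        # Layer 1: word vector table, one k-dimensional row per word id.
        self.embedding = rand_matrix(vocab_size, k, self.rng)
        # Layer 2: one weight matrix per gate (input, forget, candidate, output),
        # each applied to the concatenation [x_t ; h_{t-1}].
        self.gates = {g: rand_matrix(hidden, k + hidden, self.rng) for g in "ifgo"}
        # Layer 4: Dense projection from the hidden state to class scores.
        self.dense = rand_matrix(n_classes, hidden, self.rng)

    def forward(self, word_ids, training=False):
        inputs = [self.embedding[i] for i in word_ids]  # the n*k input matrix
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        for x in inputs:
            z = x + h  # list concatenation: [x_t ; h_{t-1}]
            i_g = [sigmoid(v) for v in matvec(self.gates["i"], z)]
            f_g = [sigmoid(v) for v in matvec(self.gates["f"], z)]
            g_g = [math.tanh(v) for v in matvec(self.gates["g"], z)]
            o_g = [sigmoid(v) for v in matvec(self.gates["o"], z)]
            c = [f * cp + i * g for f, cp, i, g in zip(f_g, c, i_g, g_g)]
            h = [o * math.tanh(cv) for o, cv in zip(o_g, c)]
        if training:
            # Layer 3: inverted dropout, active only during training.
            keep = 1.0 - self.dropout
            h = [v / keep if self.rng.random() < keep else 0.0 for v in h]
        logits = matvec(self.dense, h)
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        return inputs, [e / total for e in exps]


# Hypothetical sizes: 50-word vocabulary, k=8, 4 output classes.
model = TinyLSTMClassifier(vocab_size=50, k=8, hidden=6, n_classes=4)
matrix, probabilities = model.forward([3, 7, 7, 1, 20])
print(len(matrix), len(matrix[0]), len(probabilities))
```

A five-word input yields a 5*8 input matrix here, matching the n*k description of the first layer. In practice a deep learning framework would supply these layers and the training loop.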
First, the model is trained on the sample data in the training set using the content classification labels from Step 4, giving the content classification model; then it is trained using the sentiment classification labels from Step 4, giving the sentiment classification model. The models are validated with the sample data in the validation set and tested with the sample data in the test set. Once the test results meet the requirements, the trained LSTM neural network training models are obtained.
Step 8: model deployment and interface development
The trained models are deployed on a server and exposed through a model interface. Newly acquired data first calls the model interface to obtain a content classification label and a sentiment classification label, and the labels are then stored together with the data.
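The label-then-store flow of this step can be sketched as one function. Everything named here is hypothetical: the two models are stand-in callables rather than trained networks, the record fields are invented, and a plain list stands in for the real data store.

```python
def label_and_store(record, content_model, sentiment_model, store):
    """Call both classifiers on the record text, attach the two labels,
    and save the labelled copy alongside the original data."""
    labelled = dict(record)
    labelled["content_label"] = content_model(record["text"])
    labelled["sentiment_label"] = sentiment_model(record["text"])
    store.append(labelled)
    return labelled


# Stand-in models for illustration; real ones would be the trained classifiers.
def fake_content_model(text):
    return 10 if "liquidation" in text else 12


def fake_sentiment_model(text):
    return 1 if "liquidation" in text else 0


database = []
item = {"enterprise": "Acme", "text": "Acme enters liquidation"}
print(label_and_store(item, fake_content_model, fake_sentiment_model, database))
```

Copying the record before adding labels leaves the caller's original untouched, which keeps the raw data and the labelled data distinguishable.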
Step 9: public opinion analysis
Using a statistical analysis method, the public opinion of each enterprise is calculated from the content classification label and sentiment classification label of each data item, and the result is pushed. The statistical analysis method is frequency analysis: it counts the data items in each content class and each sentiment orientation and calculates each count's share of the total.
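The frequency analysis described in Step 9 amounts to counting labels and dividing by the total. A minimal sketch follows, assuming each data item arrives as a (content label, sentiment label) pair using the numeric codes from Step 4; the function name, the output layout, and the sample items are hypothetical.

```python
from collections import Counter

# Sentiment codes as listed in Step 4.
SENTIMENT_NAMES = {0: "neutral", 1: "negative", 2: "positive", 3: "related"}


def frequency_analysis(labelled_items):
    """Compute each content class's and each sentiment's share of all items
    for one enterprise. `labelled_items` holds (content_label, sentiment_label)."""
    items = list(labelled_items)
    total = len(items)
    if total == 0:
        return {"total": 0, "content_share": {}, "sentiment_share": {}}
    content_counts = Counter(content for content, _ in items)
    sentiment_counts = Counter(sentiment for _, sentiment in items)
    return {
        "total": total,
        "content_share": {label: n / total for label, n in content_counts.items()},
        "sentiment_share": {
            SENTIMENT_NAMES[label]: n / total for label, n in sentiment_counts.items()
        },
    }


# Hypothetical labelled items for one enterprise.
report = frequency_analysis([(10, 1), (10, 1), (19, 2), (12, 0)])
print(report["sentiment_share"])
```

A monitoring rule could then push an alert when, for example, the negative share crosses a chosen threshold; any such threshold would be a design decision not specified in the source.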
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be realized in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (6)
1. An enterprise public opinion monitoring system based on a long short-term memory (LSTM) neural network, characterized in that it comprises a data crawling and storage module, a data cleaning and sorting module, a sample data labeling and processing module, an LSTM neural network training model, and a statistical analysis module.
2. The enterprise public opinion monitoring system based on an LSTM neural network according to claim 1, characterized in that the LSTM neural network training model has four layers: the first is a word vector layer; the second is an LSTM layer, which is the core of the training model; the third is a Dropout layer, which prevents the model from overfitting; and the fourth is a Dense layer.
3. An enterprise public opinion monitoring method based on an LSTM neural network, characterized in that it comprises the following steps:
(1) collecting, in real time and using distributed crawler technology, multiple classes of enterprise data covering news, judicial, industrial and commercial, and business operation aspects;
(2) training the LSTM neural network training model;
(3) using the trained LSTM neural network training model to assign classification labels to newly acquired data;
(4) using a statistical analysis method to calculate the enterprise's public opinion from the classification labels of the data, and pushing the result.
4. The enterprise public opinion monitoring method based on an LSTM neural network according to claim 3, characterized in that, in step (1), the collected multi-class enterprise data is stored in the HDFS distributed file system of Hadoop, cleaned and sorted on a daily schedule, and the cleaned data is stored in HBase.
5. The enterprise public opinion monitoring method based on an LSTM neural network according to claim 3, characterized in that, in step (2), the LSTM neural network training model is trained as follows: first, a certain amount of sample data is randomly selected and labeled, segmented into words with the word segmentation tool jieba, and converted into high-dimensional word vectors; then, the sample data is divided proportionally into three parts: a training set, a validation set, and a test set; finally, the model is trained on the labeled classes of the sample data in the training set, validated with the sample data in the validation set, and tested with the sample data in the test set; once the test results meet the requirements, the trained LSTM neural network training model is obtained.
6. The enterprise public opinion monitoring method based on an LSTM neural network according to claim 5, characterized in that the sample data is labeled in two ways: one is content classification, and the other is sentiment classification.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910183686.5A CN109992704A (en) | 2019-03-12 | 2019-03-12 | Enterprise public opinion monitoring system and method based on a long short-term memory neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910183686.5A CN109992704A (en) | 2019-03-12 | 2019-03-12 | Enterprise public opinion monitoring system and method based on a long short-term memory neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109992704A (en) | 2019-07-09 |
Family
ID=67130515
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910183686.5A Withdrawn CN109992704A (en) | Enterprise public opinion monitoring system and method based on a long short-term memory neural network | 2019-03-12 | 2019-03-12 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109992704A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110619572A (en) * | 2019-09-20 | 2019-12-27 | 重庆誉存大数据科技有限公司 | Method for monitoring high fault tolerance growth of enterprise public data |
CN112231483A (en) * | 2020-11-06 | 2021-01-15 | 中国水利水电科学研究院 | Disaster tracking method, disaster tracking system, disaster tracking device and storage medium |
CN113222471A (en) * | 2021-06-04 | 2021-08-06 | 西安交通大学 | Asset wind control method and device based on new media data |
CN113240556A (en) * | 2021-05-31 | 2021-08-10 | 平安科技(深圳)有限公司 | Infringement processing method, device, equipment and medium based on intelligent decision |
- 2019-03-12: CN application CN201910183686.5A, published as CN109992704A, status Withdrawn
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110619572A (en) * | 2019-09-20 | 2019-12-27 | 重庆誉存大数据科技有限公司 | Method for monitoring high fault tolerance growth of enterprise public data |
CN112231483A (en) * | 2020-11-06 | 2021-01-15 | 中国水利水电科学研究院 | Disaster tracking method, disaster tracking system, disaster tracking device and storage medium |
CN113240556A (en) * | 2021-05-31 | 2021-08-10 | 平安科技(深圳)有限公司 | Infringement processing method, device, equipment and medium based on intelligent decision |
CN113240556B (en) * | 2021-05-31 | 2024-02-09 | 平安科技(深圳)有限公司 | Infringement processing method, device, equipment and medium based on intelligent decision |
CN113222471A (en) * | 2021-06-04 | 2021-08-06 | 西安交通大学 | Asset wind control method and device based on new media data |
CN113222471B (en) * | 2021-06-04 | 2023-06-06 | 西安交通大学 | Asset wind control method and device based on new media data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20190709 ||