CN109992704A - Enterprise public opinion monitoring system and method based on a long short-term memory neural network - Google Patents

Enterprise public opinion monitoring system and method based on a long short-term memory neural network

Info

Publication number
CN109992704A
Authority
CN
China
Prior art keywords
data
shot
neural networks
long term
term memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910183686.5A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Grand Credit Management Consulting Co Ltd
Original Assignee
Qingdao Grand Credit Management Consulting Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Grand Credit Management Consulting Co Ltd
Priority to CN201910183686.5A
Publication of CN109992704A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an enterprise public opinion monitoring system and method based on a long short-term memory (LSTM) neural network. The system comprises a data crawling and storage module, a data cleaning and sorting module, a sample data annotation processing module, an LSTM neural network model, and a statistical analysis module. The disclosed method collects and cleans multi-dimensional enterprise-related data in real time, analyzes the sentiment orientation of the data with the trained LSTM neural network model, and finally applies statistical analysis that weighs the sentiment orientation across the enterprise's data dimensions to reach a conclusion, which is pushed to the client in real time. The invention uses distributed big data processing so that the whole pipeline runs faster and the results are more timely. Moreover, the more dimensions the invention considers, the more comprehensive the analysis, which yields more accurate results and reduces error.

Description

Enterprise public opinion monitoring system and method based on a long short-term memory neural network
Technical Field
The present invention relates to the technical field of public opinion monitoring, and in particular to an enterprise public opinion monitoring system and method based on a long short-term memory (LSTM) neural network.
Background
Society today is developing rapidly and has entered the data age. Data is overloaded, and it is generated far faster than people can understand it. Enterprise-related data is scattered across every corner of the internet, so it is difficult to obtain this varied and complicated data comprehensively; even when it is obtained promptly, it cannot be analyzed quickly enough to draw conclusions and make decisions. Because enterprise information is obtained neither promptly nor comprehensively, many potential enterprise risks cannot be learned of in time, which often causes the enterprise inestimable losses.
An enterprise's news data is distributed across the major news websites, its judicial data across court judgment document websites, and its industrial and commercial data across business registration websites. Enterprise data may change at any moment, and every change carries potential risk. To avoid risk, the data must be obtained promptly and analyzed, and conclusions drawn by combining multiple dimensions.
Most existing techniques monitor only one particular segment of enterprise data; in those techniques, the more dimensions are considered, the more complex the processing becomes and the larger the error. Existing techniques also often suffer considerable delay in data collection and result feedback.
Summary of the Invention
To solve the above technical problems, the present invention provides an enterprise public opinion monitoring system and method based on an LSTM neural network, with the aim of achieving more accurate results, faster processing, and better timeliness.
To achieve the above aims, the technical solution of the present invention is as follows:
An enterprise public opinion monitoring system based on an LSTM neural network, comprising a data crawling and storage module, a data cleaning and sorting module, a sample data annotation processing module, an LSTM neural network model, and a statistical analysis module.
In the above solution, the LSTM neural network model has four layers: the first layer is a word vector layer; the second layer is an LSTM layer, which is the core of the model; the third layer is a Dropout layer, which prevents the model from overfitting; the fourth layer is a Dense layer.
An enterprise public opinion monitoring method based on an LSTM neural network, comprising the following steps:
(1) collecting multiple classes of enterprise data (news, judicial, industrial and commercial, and business operation data) in real time using distributed crawler technology;
(2) training the LSTM neural network model;
(3) using the trained LSTM neural network model to assign classification labels to newly obtained data;
(4) using statistical analysis to compute the enterprise's public opinion from the classification labels of the data, and pushing the result.
In a further technical solution, in step (1), the collected enterprise data is stored in the HDFS distributed file system of Hadoop, is cleaned and sorted on a daily schedule, and the cleaned data is stored in HBase.
In a further technical solution, in step (2), the LSTM neural network model is trained as follows: first, a certain amount of sample data is selected at random and annotated, segmented into words with the word segmentation tool jieba, and converted into high-dimensional word vectors; then, the sample data is divided in proportion into three parts: a training set, a validation set, and a test set; finally, the model is trained on the annotated classes of the training-set samples, validated on the validation-set samples, and tested on the test-set samples. Once the test results meet the requirements, the trained LSTM neural network model is obtained.
Further, the sample data is annotated in two ways: content classification and sentiment classification.
Through the above technical solution, the enterprise public opinion monitoring system and method based on an LSTM neural network provided by the present invention collects and cleans multi-dimensional enterprise-related data in real time, analyzes the sentiment orientation of the data with the trained LSTM neural network model, and finally applies statistical analysis that weighs the sentiment orientation across the enterprise's data dimensions to reach a conclusion, which is pushed to the client in real time. The invention uses distributed big data processing so that the whole pipeline runs faster and the results are more timely. Moreover, the more dimensions the invention considers, the more comprehensive the analysis, which yields more accurate results and reduces error.
Brief Description of the Drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below.
Fig. 1 is a flow diagram of an enterprise public opinion monitoring method based on an LSTM neural network disclosed in an embodiment of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments.
The present invention provides an enterprise public opinion monitoring system and method based on an LSTM neural network. A specific embodiment is as follows:
Step 1: Data crawling
Using the data crawling and storage module, find websites that carry enterprise news, judicial, industrial and commercial, and business operation data; select those whose data is updated relatively promptly and whose data volume is relatively large; analyze their page structure; and collect data on a daily schedule.
Step 2: Data storage
News, judicial, industrial and commercial, and business operation data for nearly 80 million enterprises must be stored every day, and all of it is text. A traditional database cannot meet this requirement, so the data is stored in the HDFS distributed file system of Hadoop.
Step 3: Data cleaning and sorting
The data collected from websites is messy: large numbers of web page tags and irrelevant data are collected along with it and must be cleaned out, and the data formats of different sources vary widely and must be unified. The data cleaning and sorting module cleans and sorts the newly stored enterprise news, judicial, industrial and commercial, and business operation data at a fixed time every day, and the cleaned data is stored in HBase.
The data cleaning methods include the following:
1, replace missing values with the mode;
2, identify erroneous values using variance analysis;
3, detect duplicate records and merge them;
4, unify data from different sources and in different formats into a single format.
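The four cleaning methods above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the record fields, the date formats, and the reading of "variance analysis" as a threshold of k standard deviations from the mean are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def fill_missing_with_mode(values):
    """Method 1: replace None entries with the most frequent non-missing value."""
    present = [v for v in values if v is not None]
    mode = Counter(present).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def flag_outliers(numbers, k=2.0):
    """Method 2 (assumed reading): flag values more than k standard deviations from the mean."""
    mu, sigma = mean(numbers), pstdev(numbers)
    if sigma == 0:
        return [False] * len(numbers)
    return [abs(x - mu) > k * sigma for x in numbers]


def merge_duplicates(records, key_fields):
    """Method 3: keep one record per key; later duplicates fill fields the first one lacks."""
    merged = {}
    for rec in records:
        key = tuple(rec.get(f) for f in key_fields)
        kept = merged.setdefault(key, dict(rec))
        for field, value in rec.items():
            if kept.get(field) is None:
                kept[field] = value
    return list(merged.values())


def unify_date(text):
    """Method 4: normalize 'YYYY/M/D', 'YYYY.MM.DD' and 'YYYY-M-D' to 'YYYY-MM-DD'."""
    year, month, day = text.replace("/", "-").replace(".", "-").split("-")
    return f"{int(year):04d}-{int(month):02d}-{int(day):02d}"
```

In a deployment like the one described, such functions would presumably run inside the daily scheduled cleaning job before the records are written to HBase; that wiring is not shown here.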
Step 4: Sample data annotation
A certain amount of sample data is selected at random, annotation rules are drawn up by combining statistical analysis with business experience, and the sample data annotation processing module is used to annotate the data. There are two kinds of annotation.
The first is content classification: 1 listing and delisting, 2 management and operations, 3 loss and profit, 4 product quality, 5 infringement and plagiarism, 6 credit and reputation, 7 debt and mortgage, 8 closure and bankruptcy, 9 suspension of business, 10 bankruptcy liquidation, 11 tax evasion, 12 other announcements, 13 service disputes, 14 cooperative operation, 15 contract disputes, 16 employee matters, 17 shareholding increase and reduction, 18 security incidents, 19 achievements and awards, 20 investment and financing, 21 equity changes, 22 acquisition and restructuring, 23 new products and upgrades, 24 litigation and violations, 25 negative news about senior executives, 26 environmental protection, 27 share price changes, 28 fraud and falsification, 29 major transactions, 30 senior executive changes, 31 policies and regulations, 32 related mentions.
The second is sentiment classification: 0 neutral, 1 negative, 2 positive, 3 related.
Step 5: Word segmentation
The third-party word segmentation tool jieba, used within the sample data annotation processing module, splits the text data into individual words. To improve the training of the subsequent model, a large custom dictionary needs to be added.
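To illustrate why extending the dictionary changes the segmentation, here is a toy segmenter. It is deliberately not jieba: it uses plain forward maximum matching as a simpler stand-in, and the sample sentence and both dictionaries are made up for the example.

```python
def segment(text, dictionary, max_len=6):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing in the dictionary matches.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionaries: the second adds one domain term by hand.
base_dictionary = {"公司", "破产", "清算", "申请"}
extended_dictionary = base_dictionary | {"破产清算"}

sentence = "公司申请破产清算"
print(segment(sentence, base_dictionary))      # ['公司', '申请', '破产', '清算']
print(segment(sentence, extended_dictionary))  # ['公司', '申请', '破产清算']
```

With the domain term added, "bankruptcy liquidation" stays a single token, which matches content category 10 from Step 4 and gives the later word vector step one vector for the whole concept instead of two.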
Step 6: Converting words into word vectors
Words must be converted into high-dimensional word vectors before a deep learning model can be trained on them, and the quality of that conversion directly affects the model's results. The word2vec tool in the sample data annotation processing module is used to convert words into high-dimensional word vectors. A major advantage of word2vec is that it places semantically similar words at nearby positions in the vector space.
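The "nearby positions" property is usually measured with cosine similarity. The sketch below does not train word2vec; the three-dimensional vectors are hand-written purely for illustration (real word vectors are learned and have far more dimensions), and the helper names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; close to 1 means similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sentence_matrix(tokens, vectors):
    """Stack the k-dimensional vector of each of the n tokens into an n x k matrix."""
    return [vectors[token] for token in tokens]


# Invented toy vectors, not the output of any trained model.
toy_vectors = {
    "bankruptcy": [0.9, 0.1, 0.0],
    "liquidation": [0.8, 0.2, 0.1],
    "award": [0.0, 0.1, 0.9],
}

near = cosine_similarity(toy_vectors["bankruptcy"], toy_vectors["liquidation"])
far = cosine_similarity(toy_vectors["bankruptcy"], toy_vectors["award"])
matrix = sentence_matrix(["bankruptcy", "liquidation"], toy_vectors)
```

Here `near` is larger than `far`, and `matrix` has one row per word, which is the n x k shape the word vector layer of the model expects.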
Step 7: Training the LSTM neural network model
The sample data is divided in proportion into three parts: a training set, a validation set, and a test set.
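A proportional three-way split can be sketched as follows. The source does not state the proportions, so the 8:1:1 ratio, the fixed seed, and the function name are assumptions.

```python
import random


def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle a copy of the samples and cut it into train/validation/test parts."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder, so no sample is dropped
    return train, val, test
```

Shuffling before cutting keeps the three parts from inheriting any ordering in the crawl, such as all samples from one website ending up in the test set.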
The LSTM model has four layers:
The first layer is the word vector layer. If each word is converted into a k-dimensional word vector in Step 6 and there are n words in total, the input is an n*k matrix;
The second layer is the LSTM layer, the core of the model;
The third layer is the Dropout layer, which prevents the model from overfitting;
The fourth layer is the Dense layer.
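The data flow through these four layers can be sketched as a forward pass in plain Python. This is an untrained toy under stated assumptions: the source gives no framework, layer sizes, dropout rate, or output activation, so the class name, the random weights, the omitted bias terms, the 0.5 dropout rate, and the softmax output are all choices made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    """Word vectors -> LSTM -> Dropout -> Dense (softmax). Forward pass only, no biases."""

    def __init__(self, k, hidden, n_classes, dropout=0.5, seed=0):
        self.rng = random.Random(seed)
        self.hidden, self.dropout = hidden, dropout
        # One weight matrix per gate, applied to the concatenation [h_prev, x_t].
        self.gates = {g: random_matrix(self.rng, hidden, hidden + k) for g in "ifog"}
        self.dense = random_matrix(self.rng, n_classes, hidden)

    def forward(self, matrix, training=False):
        h = [0.0] * self.hidden  # hidden state
        c = [0.0] * self.hidden  # cell state
        for x_t in matrix:  # layer 1 output: one k-dimensional row per word
            z = h + x_t
            i = [sigmoid(v) for v in matvec(self.gates["i"], z)]    # input gate
            f = [sigmoid(v) for v in matvec(self.gates["f"], z)]    # forget gate
            o = [sigmoid(v) for v in matvec(self.gates["o"], z)]    # output gate
            g = [math.tanh(v) for v in matvec(self.gates["g"], z)]  # candidate state
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]        # layer 2 output
        if training:  # layer 3: inverted dropout, active only during training
            keep = 1.0 - self.dropout
            h = [hj / keep if self.rng.random() < keep else 0.0 for hj in h]
        logits = matvec(self.dense, h)  # layer 4: dense layer
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]  # one probability per class
```

For the sentiment classifier `n_classes` would be 4 (labels 0 to 3 from Step 4) and for the content classifier 32. In practice a deep learning framework would supply these layers along with training; the sketch only shows how an n x k input is reduced to class probabilities.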
First, the model is trained on the training-set samples using the content classification labels from Step 4, which yields the content classification model; it is then trained using the sentiment classification labels from Step 4, which yields the sentiment classification model. Each model is validated on the validation-set samples and tested on the test-set samples. Once the test results meet the requirements, the trained LSTM neural network model is obtained.
Step 8: Model deployment and interface development
The trained models are deployed on a server and exposed through a model interface. Newly obtained data is first passed to the model interface to obtain a content classification label and a sentiment classification label, and the labels are then stored together with the data.
Step 9: Public opinion analysis
Using statistical analysis, the public opinion of each enterprise is computed from the content classification label and sentiment classification label of each data item, and the result is pushed. The statistical method is frequency analysis: it counts the items in each data category and computes the share of the total accounted for by the items of each sentiment orientation.
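The frequency analysis can be sketched with counters. The input layout (one tuple per labeled item), the function name, and the output structure are assumptions; the sentiment codes are those listed in Step 4.

```python
from collections import Counter, defaultdict

SENTIMENT_NAMES = {0: "neutral", 1: "negative", 2: "positive", 3: "related"}


def summarize(rows):
    """rows: iterable of (enterprise, content_label, sentiment_label) tuples."""
    per_enterprise = defaultdict(lambda: {"content": Counter(), "sentiment": Counter()})
    for enterprise, content_label, sentiment_label in rows:
        per_enterprise[enterprise]["content"][content_label] += 1
        per_enterprise[enterprise]["sentiment"][sentiment_label] += 1

    summary = {}
    for enterprise, counts in per_enterprise.items():
        total = sum(counts["sentiment"].values())
        summary[enterprise] = {
            # Number of items per content category.
            "content_counts": dict(counts["content"]),
            # Share of the enterprise's items carrying each sentiment.
            "sentiment_shares": {
                SENTIMENT_NAMES[code]: n / total
                for code, n in counts["sentiment"].items()
            },
        }
    return summary
```

For example, an enterprise with two negative items in category 10 and one positive item in category 19 gets a negative share of two thirds. How those shares are turned into the pushed conclusion is not specified in the source.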
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. An enterprise public opinion monitoring system based on a long short-term memory (LSTM) neural network, characterized by comprising a data crawling and storage module, a data cleaning and sorting module, a sample data annotation processing module, an LSTM neural network model, and a statistical analysis module.
2. The enterprise public opinion monitoring system based on an LSTM neural network according to claim 1, characterized in that the LSTM neural network model has four layers: the first layer is a word vector layer; the second layer is an LSTM layer, which is the core of the model; the third layer is a Dropout layer, which prevents the model from overfitting; the fourth layer is a Dense layer.
3. An enterprise public opinion monitoring method based on an LSTM neural network, characterized by comprising the following steps:
(1) collecting multiple classes of enterprise data (news, judicial, industrial and commercial, and business operation data) in real time using distributed crawler technology;
(2) training the LSTM neural network model;
(3) using the trained LSTM neural network model to assign classification labels to newly obtained data;
(4) using statistical analysis to compute the enterprise's public opinion from the classification labels of the data, and pushing the result.
4. The enterprise public opinion monitoring method based on an LSTM neural network according to claim 3, characterized in that, in step (1), the collected enterprise data is stored in the HDFS distributed file system of Hadoop, is cleaned and sorted on a daily schedule, and the cleaned data is stored in HBase.
5. The enterprise public opinion monitoring method based on an LSTM neural network according to claim 3, characterized in that, in step (2), the LSTM neural network model is trained as follows: first, a certain amount of sample data is selected at random and annotated, segmented into words with the word segmentation tool jieba, and converted into high-dimensional word vectors; then, the sample data is divided in proportion into three parts: a training set, a validation set, and a test set; finally, the model is trained on the annotated classes of the training-set samples, validated on the validation-set samples, and tested on the test-set samples; once the test results meet the requirements, the trained LSTM neural network model is obtained.
6. The enterprise public opinion monitoring method based on an LSTM neural network according to claim 5, characterized in that the sample data is annotated in two ways: content classification and sentiment classification.
CN201910183686.5A 2019-03-12 2019-03-12 Enterprise public opinion monitoring system and method based on a long short-term memory neural network Withdrawn CN109992704A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910183686.5A CN109992704A (en) 2019-03-12 2019-03-12 Enterprise public opinion monitoring system and method based on a long short-term memory neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910183686.5A CN109992704A (en) 2019-03-12 2019-03-12 Enterprise public opinion monitoring system and method based on a long short-term memory neural network

Publications (1)

Publication Number Publication Date
CN109992704A true CN109992704A (en) 2019-07-09

Family

ID=67130515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910183686.5A Withdrawn CN109992704A (en) 2019-03-12 2019-03-12 Enterprise public opinion monitoring system and method based on a long short-term memory neural network

Country Status (1)

Country Link
CN (1) CN109992704A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619572A (en) * 2019-09-20 2019-12-27 重庆誉存大数据科技有限公司 Method for monitoring high fault tolerance growth of enterprise public data
CN112231483A (en) * 2020-11-06 2021-01-15 中国水利水电科学研究院 Disaster tracking method, disaster tracking system, disaster tracking device and storage medium
CN113222471A (en) * 2021-06-04 2021-08-06 西安交通大学 Asset wind control method and device based on new media data
CN113240556A (en) * 2021-05-31 2021-08-10 平安科技(深圳)有限公司 Infringement processing method, device, equipment and medium based on intelligent decision

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619572A (en) * 2019-09-20 2019-12-27 重庆誉存大数据科技有限公司 Method for monitoring high fault tolerance growth of enterprise public data
CN112231483A (en) * 2020-11-06 2021-01-15 中国水利水电科学研究院 Disaster tracking method, disaster tracking system, disaster tracking device and storage medium
CN113240556A (en) * 2021-05-31 2021-08-10 平安科技(深圳)有限公司 Infringement processing method, device, equipment and medium based on intelligent decision
CN113240556B (en) * 2021-05-31 2024-02-09 平安科技(深圳)有限公司 Infringement processing method, device, equipment and medium based on intelligent decision
CN113222471A (en) * 2021-06-04 2021-08-06 西安交通大学 Asset wind control method and device based on new media data
CN113222471B (en) * 2021-06-04 2023-06-06 西安交通大学 Asset wind control method and device based on new media data

Similar Documents

Publication Publication Date Title
Babu et al. Exploring big data-driven innovation in the manufacturing sector: evidence from UK firms
Sun Applying deep learning to audit procedures: An illustrative framework
Song et al. Sustainable strategy for corporate governance based on the sentiment analysis of financial reports with CSR
Ahmed et al. Business boosting through sentiment analysis using Artificial Intelligence approach
CN109992704A (en) A kind of enterprise's public sentiment monitoring system and method based on shot and long term Memory Neural Networks
Qaisi et al. A twitter sentiment analysis for cloud providers: A case study of Azure vs. AWS
Ordenes et al. Machine learning for marketing on the KNIME Hub: The development of a live repository for marketing applications
CN110516077A (en) Knowledge mapping construction method and device towards enterprise's market conditions
Sangari et al. A data-driven, comparative review of the academic literature and news media on blockchain-enabled supply chain management: Trends, gaps, and research needs
Suganya et al. Sentiment analysis for scraping of product reviews from multiple web pages using machine learning algorithms
Hussein How many old and new big data v’s characteristics, processing technology, and applications (bd1)
Nanayakkara et al. A survey of finding trends in data mining techniques for social media analysis
Yeung et al. Data analytics architectures for e-commerce platforms in cloud
Zhang et al. [Retracted] Deep Learning‐Based Consumer Behavior Analysis and Application Research
CN114528416A (en) Enterprise public opinion environment monitoring method and system based on big data
Vidgen et al. Business analytics: a management approach
Sharaff et al. Lstm based sentiment analysis of financial news
Modrušan et al. Intelligent Public Procurement Monitoring System Powered by Text Mining and Balanced Indicators
CN110909050A (en) Data statistical analysis system
Motohashi Understanding AI driven innovation by linked database of scientific articles and patents
Verdhan et al. Introduction to supervised learning
Kumar et al. Feedback Investigation on Twitter Dataset Using Classification Approaches
Echeberria The Impact of AI on Business, Economics and Innovation
Shastry et al. Machine Learning for Business Analytics: Case Studies and Open Research Problems
Nalabala et al. An amalgamation of big data analytics with tweet feeds for stock market trend anticipating systems: A review

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20190709
