CN104820629B - A kind of intelligent public sentiment accident emergent treatment system and method - Google Patents
- Publication number: CN104820629B
- Application number: CN201510243751.0A
- Authority: CN (China)
- Prior art keywords: text, public sentiment, lexical item, network, feature
- Legal status: Active
- Landscapes: Machine Translation (AREA)
Abstract
The invention discloses an intelligent emergency handling system and method for public sentiment incidents. It applies natural language processing, ontology theory, and semantic association techniques to the intelligent recognition of internet public sentiment incidents and to the automatic generation of prevention-and-control contingency plans. Using computer information processing, it converts emergency prevention-and-control plans into a formatted representation and semantically matches public sentiment incident scenes against those plans, which enables accurate recognition of various internet public sentiment incidents and decision support. The invention can monitor internet public sentiment in real time, assist prevention-and-control decision-making, and shorten the response time for handling public sentiment incidents.
Description
Technical field
The invention belongs to the field of computer applications. It applies natural language processing, ontology theory, and semantic association techniques to the intelligent recognition of internet public sentiment incidents and to the automatic generation of prevention-and-control schemes. Using computer information processing, it converts emergency prevention-and-control plans into a formatted representation and semantically matches public sentiment incident scenes against the plans, which enables accurate recognition of various internet public sentiment incidents and decision support.
Background
With the continuing development of internet technology, the internet has become a widely used mass medium whose reach extends to almost every area of society, and it is increasingly an important new channel for public opinion. Network public sentiment is the influential and tendentious common opinion that the public openly expresses on the internet about a social phenomenon or social problem. Its influence on political order and social stability grows by the day. If network public sentiment incidents are not handled properly and in time, they are very likely to provoke negative emotions and harmful behavior among the public and thereby pose a serious threat to social stability. A technical means is urgently needed that can monitor network public sentiment information automatically and provide decision support for handling public sentiment incidents.
Summary of the invention
In response to this need, the invention proposes a computer application system, the public sentiment incident emergency handling system. It monitors internet public sentiment in real time and helps decision-makers form a targeted prevention-and-control scheme that matches the actual circumstances of a public sentiment incident, which speeds up the response to network public sentiment incidents.
The technical problem addressed by the invention is solved by the following technical scheme:
An intelligent emergency handling system for public sentiment incidents, characterized in that the system comprises an internet information acquisition and parsing module, an internet information analysis module, a network text classification judgment and cluster analysis module, an emergency handling scheme generation module, and an emergency handling effect evaluation module. The internet information acquisition and parsing module collects information from the internet, extracts the natural language text and the metadata of web pages, and saves them to a database. The internet information analysis module performs feature extraction on the natural language text in the collected information to form text features. The network text classification judgment and cluster analysis module judges the category of each network text and performs cluster analysis on the accumulated network texts. The emergency handling scheme generation module automatically generates a corresponding contingency plan according to the specifics of a public sentiment event, and decision-makers can formulate an implementation scheme on the basis of that plan. The emergency handling effect evaluation module evaluates the effect of the implemented scheme.
An intelligent emergency handling method for public sentiment incidents, characterized in that the method comprises the following steps:
1. Internet information acquisition and parsing: A computer connected to the internet collects network data such as forum posts, blog content, and news web pages from internet forums, blogs, and news websites. The computer then parses the network data automatically using rule-based information extraction and extracts two kinds of information: natural language text and web page metadata. The natural language text includes the news headline, body text, forum post title, and post content. The web page metadata includes the publication time, author, poster, post reply count, post view count, website name, and website URL. The parsed information is saved to a database. Acquisition and parsing run as a continuous process, which provides automatic, ongoing monitoring of internet sites.
2. Internet information analysis: First, a Chinese word segmentation method from natural language processing segments the title and the body of each network text separately and tags the part of speech of each lexical item in the segmentation result. Lexical items other than nouns, verbs, and adjectives are then discarded. Next, a text multi-precision representation method extracts the single lexical item features and lexical item association features of the network text. Finally, geographic location features and person features are identified from the part-of-speech tags in the segmentation result: geographic location features are the place names that occur in the network text, and person features are the person names that occur in it.
3. The lexical items of each network text processed in step 2 are compared against the lexical item features of the public sentiment categories stored in the computer database, and the network text is classified into one of those categories according to the matching result. Network texts that cannot be classified undergo cluster analysis, which groups texts with similar content into clusters. If the number of network texts in a cluster exceeds a set threshold, public sentiment category lexical item features are extracted from the texts in that cluster and added to the computer database. Network texts that have been classified proceed to step 4. The matching covers single lexical item features, lexical item association features, geographic location features, and person features.
4. If, within a specified time period, the number of network texts belonging to a category, or the number of websites on which texts of that category appear, exceeds a specified threshold, an emergency contingency plan is started.
This completes the emergency handling of the public sentiment incident.
The method further comprises an emergency handling effect evaluation step after step 4: indicator data are first collected according to the evaluation indicators and are then fed into the evaluation formula to obtain a quantitative evaluation result.
In step 3, classifying the network text into the public sentiment categories stored in the computer database according to the matching result is done as follows: the lexical items of the network text are compared against the lexical item features of each public sentiment category, with matching performed separately on four aspects (single lexical item features, lexical item association features, geographic location features, and person features). A similarity value between the network text and each public sentiment category is obtained from the matches, and the text is assigned to the category with the highest similarity value.
In step 3, extracting public sentiment category lexical item features from the network texts in a cluster is done as follows: suppose cluster T contains the network texts T = {t_1, t_2, …, t_n}. The single lexical item features and lexical item association features of each text t_i (1 ≤ i ≤ n) are extracted with the text multi-precision representation method. A statistical method then computes the distribution of all these features over all texts in T. Lexical items that occur in more than half of the network texts in T are selected as the lexical item features of the public sentiment category, and the average occurrence frequency of each in T is taken as its frequency as a category feature lexical item.
In step 4, the contingency plan is generated as follows: based on the internet public sentiment event scene ontology knowledge base model and the network public sentiment prevention-and-control measure plan ontology knowledge base, semantic matching is used to select automatically, from the prevention-and-control measure plan library, the contingency plan best suited to the specifics of the public sentiment event scene.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention not only monitors network public sentiment automatically but also provides prevention-and-control measures for sudden public sentiment events.
2. The public sentiment type recognition database of the invention is extensible: text cluster analysis continually adds new public sentiment type features to the database, so the system can recognize newly emerging types of public sentiment events.
Brief description of the drawings
Fig. 1: System module composition
Fig. 2: Public sentiment classification system model
Fig. 3: Concept attribute model of the public sentiment classification system
Fig. 4: Schematic of a public sentiment classification system
Fig. 5: Working principle of category feature generation
Fig. 6: Semantic matching schematic
Fig. 7: Knowledge expansion based on network text clustering
Fig. 8: Public sentiment event scene ontology knowledge base
Fig. 9: Public sentiment prevention-and-control measure plan ontology knowledge base
Fig. 10: Network public sentiment prevention-and-control knowledge semantic model
Fig. 11: Semantics-based matching process
Fig. 12: Emergency handling effect evaluation indicator system
Embodiment
The invention is further described below with reference to the drawings and specific embodiments. Embodiments of the invention are not limited to these.
This embodiment provides an intelligent emergency handling system for public sentiment incidents. As shown in Fig. 1, the system comprises an internet information acquisition and parsing module, an internet information analysis module, a network text classification judgment and cluster analysis module, an emergency handling scheme generation module, and an emergency handling effect evaluation module. The internet information acquisition and parsing module collects information from the internet, extracts the natural language text and the metadata of web pages, and saves them to a database. The internet information analysis module performs feature extraction on the natural language text in the collected information to form text features. The network text classification judgment and cluster analysis module judges the category of each network text and performs cluster analysis on the accumulated network texts. The emergency handling scheme generation module automatically generates a corresponding contingency plan according to the specifics of a public sentiment event, and decision-makers can formulate an implementation scheme on the basis of that plan. The emergency handling effect evaluation module evaluates the effect of the implemented scheme.
This embodiment also provides a working method for the intelligent emergency handling system for public sentiment incidents. The method comprises the following steps:
1. Internet information acquisition and parsing: A computer connected to the internet collects network data such as forum posts, blog content, and news web pages from internet forums, blogs, and news websites. The computer then parses the network data automatically using rule-based information extraction and extracts two kinds of information: natural language text and web page metadata. The natural language text includes the news headline, body text, forum post title, post content, author, and poster. The web page metadata includes the publication time, post reply count, post view count, website name, and website URL. The key information parsed out is saved to a database. Acquisition and parsing run as a continuous process, which provides automatic, ongoing monitoring of internet sites.
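As a minimal sketch of the rule-based extraction in step 1, the following assumes a single hypothetical forum page template. The regular expressions, CSS class names, and the `parse_page` function are illustrative assumptions and are not taken from the patent; a real deployment would need one rule set per site.

```python
import re

# Hypothetical extraction rules for one page template (assumed markup).
RULES = {
    "title": re.compile(r"<h1[^>]*>(.*?)</h1>", re.S),
    "author": re.compile(r'<span class="author">(.*?)</span>', re.S),
    "publish_time": re.compile(r'<span class="time">(.*?)</span>', re.S),
    "reply_count": re.compile(r'<span class="replies">(\d+)</span>'),
    "view_count": re.compile(r'<span class="views">(\d+)</span>'),
    "body": re.compile(r'<div class="content">(.*?)</div>', re.S),
}
TAG = re.compile(r"<[^>]+>")  # strips any markup left inside a captured field


def parse_page(html, url):
    """Apply each rule to the page and return one record of text plus metadata."""
    record = {"url": url}
    for field, pattern in RULES.items():
        match = pattern.search(html)
        record[field] = TAG.sub("", match.group(1)).strip() if match else None
    # Counts are stored as integers when the rule matched.
    for field in ("reply_count", "view_count"):
        if record[field] is not None:
            record[field] = int(record[field])
    return record
```

Each record could then be written to the database, and the crawler re-run on a schedule to obtain the continuous monitoring the step describes.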
2. Internet information analysis: First, a Chinese word segmentation method from natural language processing is applied to the title and body of each network text for segmentation and part-of-speech tagging. The part of speech of each lexical item is marked, and lexical items other than nouns, verbs, and adjectives are discarded. The single lexical item features and lexical item association features of the network text are then extracted using the method described in the granted national invention patent "A text multi-precision representation method for a text retrieval system". In addition, geographic location features and person features are identified from the part-of-speech tags in the segmentation result: geographic location features are the place names that occur in the network text, and person features are the person names that occur in it, as shown in the network text semantic feature extraction functional unit in Fig. 5. In short, the features of a network text are a set of lexical items, each with its occurrence frequency.
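The part-of-speech filtering in step 2 can be sketched as follows. The sketch assumes the text has already been segmented and tagged by some external tool, and that the tags follow a convention common in Chinese tag sets ("n" noun, "v" verb, "a" adjective, "ns" place name, "nr" person name); these tag names and the function `extract_text_features` are assumptions. The lexical item association features come from a separate patented method and are not modelled here.

```python
from collections import Counter

KEEP_PREFIXES = {"n", "v", "a"}  # nouns, verbs, adjectives (assumed tag prefixes)
PLACE_TAG = "ns"                 # assumed tag for place names
PERSON_TAG = "nr"                # assumed tag for person names


def extract_text_features(tagged_tokens):
    """tagged_tokens: iterable of (word, pos_tag) pairs from a segmenter.

    Returns frequency counters for kept lexical items, place names,
    and person names.
    """
    terms, places, persons = Counter(), Counter(), Counter()
    for word, tag in tagged_tokens:
        if tag == PLACE_TAG:
            places[word] += 1
        elif tag == PERSON_TAG:
            persons[word] += 1
        # Keep only nouns, verbs, and adjectives as single lexical item features.
        if tag[:1] in KEEP_PREFIXES:
            terms[word] += 1
    return {"terms": terms, "places": places, "persons": persons}
```

Place and person names are nouns, so in this sketch they are counted both as single lexical items and in their dedicated feature sets.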
3. Network text classification judgment and cluster analysis: The aim is to determine the category of each network text from its content using text classification. The category is one of those in a public sentiment classification system model built in advance on the basis of an ontology, as shown in Fig. 2. The model is hierarchical: the first layer consists of major classes and the second layer of subclasses. Each subclass is defined by concept attributes, as shown in Fig. 3. There are two concept attributes: category semantic features and prevention-and-control strategy. The category semantic features include:
Single lexical item feature: the single lexical item features of the network texts, extracted by the category semantic feature extraction module;
Lexical item association feature: the multi-word association features of the network texts, extracted by the category semantic feature extraction module;
Geographic location feature: the place names in the network texts, extracted by the category semantic feature extraction module;
Person feature: the person names in the network texts, extracted by the category semantic feature extraction module;
Example: an example text of this type of network public sentiment;
Category judgment criterion: rules for judging whether an accumulation of related texts of a given category really constitutes a public sentiment event. For example: IF the number of websites on which the public sentiment text appears is greater than n THEN it is a public sentiment event; IF the number of replies to the public sentiment text is greater than n THEN it is a public sentiment event.
The prevention-and-control strategy comprises prevention-and-control principles and prevention-and-control methods. The principles are the basic rules for guarding against and controlling a given class of public sentiment event. The methods are the specific measures taken for that class.
Fig. 4 is a schematic diagram of an actual public sentiment classification system.
Each category has its own category features. The method for producing them is shown in Fig. 5. First, a number of network texts of each category are collected as training samples. The Chinese word segmentation method of natural language processing is applied to all training samples for segmentation and part-of-speech tagging, and lexical items other than nouns, verbs, and adjectives are discarded. The network text semantic feature extraction functional unit then extracts the single lexical item features, lexical item association features, geographic location features, and person features of each text, and the category semantic feature extraction functional unit extracts the category semantic features from them.
Specifically, a computer applies a statistical algorithm to compute the distribution of each feature of each text within its category and over the full training set. Lexical items that occur in more than half of the sample documents of a category, and that are not common to all samples in the full training set, are selected as category feature words, and the average occurrence frequency of each within the category is computed as its frequency as a category feature word. In short, the category features are a set of lexical items representing the category, each with its average occurrence frequency.
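The selection rule for category feature words can be sketched as below. The sketch assumes each training sample is already reduced to a list of filtered lexical items, and it interprets "average occurrence frequency" as total occurrences divided by the number of samples in the category; that interpretation and the function name `build_category_features` are assumptions.

```python
from collections import Counter


def build_category_features(samples_by_category):
    """samples_by_category: {category: [list of lexical items per sample, ...]}.

    Returns {category: {lexical_item: average_frequency}}.
    """
    # Lexical items present in every sample of the full training set are too
    # general to characterise any one category, so they are excluded.
    all_samples = [set(tokens)
                   for docs in samples_by_category.values()
                   for tokens in docs]
    common_to_all = set.intersection(*all_samples) if all_samples else set()

    features = {}
    for category, docs in samples_by_category.items():
        n = len(docs)
        doc_count = Counter()  # number of samples containing each item
        total = Counter()      # total occurrences within the category
        for tokens in docs:
            counts = Counter(tokens)
            total.update(counts)
            doc_count.update(counts.keys())
        features[category] = {
            item: total[item] / n
            for item, df in doc_count.items()
            if df > n / 2 and item not in common_to_all
        }
    return features
```

The same majority rule, without the common-word exclusion, matches how the method describes extracting features for a newly discovered cluster.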
To judge the category of a network text, its feature lexical items are compared against the feature lexical items of each category, as shown in Fig. 6. Matching is performed separately on four aspects (single lexical item features, lexical item association features, geographic location features, and person features), a similarity value is calculated according to the following formula, and the text is assigned to the category with the highest similarity value.
[Similarity formula: image not reproduced in the source text]
Wherein:
d denotes the document to be classified;
C denotes a category;
coord(d, C) denotes the number of category feature lexical items of category C contained in the text d to be classified;
frequency denotes the word frequency of feature lexical item t in the category features;
weight(t) denotes the weight of feature lexical item t;
numofClasses denotes the total number of categories;
ClassFreq(t) denotes the number of categories for which lexical item t is simultaneously a feature lexical item.
The values of frequency and weight can be obtained from the category feature lexical item table created during modeling, shown in Table 1.
Table 1. Category feature lexical item table
Category | Feature word | Word frequency | Weight |
varchar | varchar | float | float |
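The similarity formula itself is not available in this text, so the following is only one plausible scoring function built from the quantities defined above. It assumes the score is coord(d, C) multiplied by the sum of frequency times weight over the matched feature lexical items, and that weight(t) is an IDF-style value derived from numofClasses and ClassFreq(t). Both assumptions are mine and may differ from the patent's actual formula; the four feature aspects are also merged into one feature set for brevity.

```python
import math


def similarity(doc_terms, category_features, class_freq, num_classes):
    """Score text d against category C (assumed formula, see lead-in).

    doc_terms: set of lexical items in the text to classify.
    category_features: {lexical_item: word frequency in the category features}.
    class_freq: {lexical_item: number of categories that list it as a feature}.
    """
    matched = [t for t in category_features if t in doc_terms]
    coord = len(matched)  # coord(d, C)
    total = 0.0
    for t in matched:
        # Assumed weight: items shared by many categories count for less.
        weight = math.log(num_classes / class_freq[t]) + 1.0
        total += category_features[t] * weight
    return coord * total


def classify(doc_terms, categories, class_freq):
    """Return (best_category_or_None, scores) for one text."""
    scores = {c: similarity(doc_terms, feats, class_freq, len(categories))
              for c, feats in categories.items()}
    best = max(scores, key=scores.get)
    # A text that matches no category feature stays unclassified.
    return (best if scores[best] > 0 else None), scores
```

Texts for which `classify` returns `None` would be the ones handed to cluster analysis.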
As shown in Fig. 7, after a network text passes through the preprocessing functional unit, its word segmentation result is obtained and stop words are removed. The semantic feature extraction module then extracts its semantic features, and the network text category judgment functional unit determines whether it belongs to one of the n known kinds of network public sentiment. If so, it is classified. Otherwise it is passed to the network text cluster analysis functional unit, which checks whether a hot topic is present. Every collected network text is judged in this way, and texts that meet the classification conditions are assigned the corresponding category label. If, within a specified time period, the number of network texts belonging to a category, or the number of websites on which texts of that category appear, exceeds a specified threshold, an alarm is sent to the system operator, and the emergency handling scheme generation module then provides an emergency handling scheme.
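The alarm condition can be sketched as a check over a sliding time window. The record layout `(category, site, publish_time)` and the function `should_alert` are illustrative assumptions; the thresholds would be configured per category.

```python
from datetime import datetime, timedelta


def should_alert(records, category, now, window, max_texts, max_sites):
    """Return True when a category exceeds either threshold inside the window.

    records: iterable of (category, site, publish_time) tuples.
    window: timedelta giving the length of the specified time period.
    """
    start = now - window
    recent = [(c, s, t) for c, s, t in records
              if c == category and start <= t <= now]
    sites = {s for _, s, _ in recent}  # distinct websites carrying the texts
    return len(recent) > max_texts or len(sites) > max_sites
```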
During network text classification judgment, some texts will not belong to any type in the existing public sentiment classification system model. Over time, these texts of unknown type accumulate. Cluster analysis is performed on the accumulated texts of unknown type so that texts with similar content are grouped into clusters. If the number of network texts in a cluster exceeds a certain threshold, the cluster is submitted for manual review as a hot topic. If it is judged to be a new public sentiment category, its category semantic features are extracted and added to the knowledge base. The detailed process is shown in Fig. 7. This process keeps the knowledge base of the system extensible, so that after the knowledge is supplemented the system can recognize new public sentiment on the internet.
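The text does not name a clustering algorithm, so the sketch below substitutes a simple single-pass clustering over sets of lexical items using Jaccard similarity. The algorithm choice, the `min_similarity` default, and the function names are assumptions made only to illustrate the flow from unclassified texts to hot-topic candidates.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of lexical items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_texts(term_sets, min_similarity=0.5):
    """Single-pass clustering: each text joins its most similar cluster, or starts one."""
    clusters = []  # each cluster: {"terms": set of items, "members": [text indices]}
    for index, terms in enumerate(term_sets):
        best = max(clusters, key=lambda c: jaccard(terms, c["terms"]), default=None)
        if best is not None and jaccard(terms, best["terms"]) >= min_similarity:
            best["members"].append(index)
            best["terms"] |= terms  # the cluster vocabulary grows with each member
        else:
            clusters.append({"terms": set(terms), "members": [index]})
    return clusters


def hot_topics(clusters, size_threshold):
    """Clusters large enough to be submitted for manual review."""
    return [c for c in clusters if len(c["members"]) > size_threshold]
```

Clusters returned by `hot_topics` correspond to the candidates that a human reviewer would confirm as a new category before its features are added to the knowledge base.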
4. Emergency handling scheme generation: On the basis of public sentiment type recognition, an emergency contingency plan is provided for the recognized type. The characteristic feature is that ontology techniques are used to build a hierarchical internet public sentiment event scene ontology knowledge base model and a network public sentiment prevention-and-control measure plan ontology knowledge base model. The former describes public sentiment events qualitatively and quantitatively, as shown in Fig. 8. The latter digitizes the emergency prevention-and-control regulations, handling specifications, and response measures that exist as natural language text, as shown in Fig. 9. The purpose is to convert unformatted information into formatted information that a computer can interpret. With these two knowledge base models, semantic matching can be used to automate, by computer, the recognition of public sentiment events, the fast inference of the corresponding preventive measures and handling schemes, and the real-time assisted generation of contingency plans. The scene ontology knowledge base contains knowledge concepts such as public sentiment, time, website, participant, audience, and potential hazard.
Information about the public sentiment events recognized in the internet information analysis and network text classification judgment steps is extracted and stored in the public sentiment event scene ontology knowledge base. The public sentiment category is provided by the network text classification judgment step, using text classification. The public sentiment content, time of occurrence, duration, website names, number of websites, and participant user names are provided by the internet information analysis step, using rule-based information extraction. Other information, such as the public sentiment grade and participant IP addresses, is filled in from prior knowledge.
The public sentiment prevention-and-control measure plan ontology knowledge base covers four aspects: basis of compilation, scope of application, resources, and prevention-and-control measures. Its content is filled in from the relevant laws and regulations.
The internet public sentiment event scene ontology knowledge base and the network public sentiment prevention-and-control measure plan ontology knowledge base together constitute the network public sentiment prevention-and-control knowledge semantic model. On the basis of this model, contingency plans are generated by semantic matching, as shown in Fig. 10. A contingency plan is a scheme and method that guides the handling of various public sentiment incidents, but the specific conditions, circumstances, and parameters of each event differ. Decision-makers therefore need to select appropriate handling measures, methods, and implementation steps from the prevention-and-control plans according to the situation, and to assign the corresponding organizations and departments to carry out the plan. To this end, the "public sentiment category", "public sentiment content", and "public sentiment grade" of the event scene are matched respectively against the "applicable event type", "applicable event content", and "applicable event grade" of the plan ontology, as shown in Fig. 11, so as to find the response plan that matches the public sentiment event, as shown in Tables 2 and 3.
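The field-by-field matching between an event scene and the plan library can be sketched as follows. Real semantic matching over an ontology would also consider related and subsumed concepts; this sketch uses plain membership tests as a stand-in, and the dictionary keys and the function `match_plans` are hypothetical names.

```python
# Pairs of (scene field, plan field) compared during matching; key names are assumed.
FIELD_PAIRS = [
    ("category", "applicable_event_type"),
    ("content", "applicable_event_content"),
    ("grade", "applicable_event_grade"),
]


def match_plans(scene, plans):
    """Return the names of plans matching the scene, best match first.

    scene: dict with the scene's category, content, and grade.
    plans: list of dicts, each with a "name" and sets of applicable values.
    """
    scored = []
    for plan in plans:
        score = sum(1 for scene_key, plan_key in FIELD_PAIRS
                    if scene.get(scene_key) in plan.get(plan_key, ()))
        if score:
            scored.append((score, plan["name"]))
    # Stable sort keeps library order among plans with equal scores.
    return [name for score, name in sorted(scored, key=lambda pair: -pair[0])]
```

The top-ranked plan would serve as the guiding contingency plan from which decision-makers derive a concrete implementation scheme.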
Table 2. Example of a plan generated by semantic matching
Table 3. Plan description
A contingency plan is a guiding scheme. A specific implementation scheme still has to be generated from it according to the particulars of the public sentiment event, such as the time, websites, participants, audience, and potential hazards.
5. Emergency handling effect evaluation: The evaluation is based on an evaluation indicator system and an evaluation formula. The indicator system contains the items to be evaluated, and the formula computes a quantitative evaluation result. The indicator system is shown in Fig. 12, and each indicator is described in detail in Table 4.
Table 4. Emergency handling effect evaluation indicator system
The public sentiment intensity indicators measure the scope and form of the public sentiment. First, public sentiment scope refers to the breadth of the public sentiment and is measured by three indicators: website coverage, regional coverage, and number of websites. Website coverage is the proportion of sample websites that contain the public sentiment text. The sample websites are a carefully chosen set of sites that can, to some extent, represent the state and level of the whole network. Because the sites differ in scale and standing, they are weighted. The more sample websites on which the public sentiment text appears, the wider the scope of the public sentiment. After prevention-and-control measures are implemented, a downward trend in the number of websites containing the public sentiment text indicates that the measures have taken effect. Regional coverage is the geographic distribution of the websites that contain the public sentiment text: the wider the distribution, the wider the reach of the public sentiment. Number of websites is the total number of sites containing the public sentiment text: the larger the number, the wider the reach. Second, public sentiment form refers to the types of media channel through which the public sentiment spreads, the length of the network texts used, and the media types of the network texts. Media channel types include BBS, microblogs, blogs, dating sites, and e-mail. The more channels used, the stronger the spread. The longer the network texts, the stronger the spread. Media types include text, audio, and video. The more media types used, the stronger the influence of the public sentiment.
The audience attention indicators reflect the influence of network public sentiment on its audience and are measured by audience situation, audience response, and audience attitude. First, audience situation refers to the size and range of the audience affected by the public sentiment: audience size is measured by the number of IP addresses of viewers of the network text, and audience range by the geographic distribution of those IP addresses. Second, audience response refers to the degree of attention viewers pay to the network text and is measured by view count, repost count, reply count, and activity level. View count is measured by the number of clicks on the network text, repost count by the number of times the network text appears on different websites across the whole internet, reply count by the number of replies to the network text, and activity level by the number of replies to the network text per unit time. Third, audience attitude refers to the degree to which viewers agree with the viewpoint expressed in the network text and is measured by the numbers of replies with a positive, neutral, and negative attitude.
The weights of the indicators at each level of the indicator system are calculated by the analytic hierarchy process (AHP), and every indicator can be computed quantitatively. The quantitative methods fall into three kinds: index calculation, frequency/density calculation, and weight coefficient determination.
(1) index calculates
The indicator system contains both quantitative and qualitative indicators. Quantitative indicators include view count, repost count, reply count, and the like; qualitative indicators include audiovisual degree. To make them comparable, the qualitative and quantitative indicators must be normalized. The index calculation method is used here, specifically the Sigmoid function $f(x) = \frac{1}{1 + e^{-x}}$, where $x$ represents the view count, repost count, reply count, and so on. Taking audience response as an example: for network text $i$, let the number of clicks on the network text be $x_{1i}$, the number of times the network text appears on different websites across the whole internet be $x_{2i}$, the number of replies to the network text be $x_{3i}$, and the number of replies to the network text per unit time be $x_{4i}$. If the weights of view count, repost count, reply count, and activity are $g_1$, $g_2$, $g_3$, $g_4$, then the influence $P_1$ of the network text on audience response is:
$P_1 = f(x_{1i}) \times g_1 + f(x_{2i}) \times g_2 + f(x_{3i}) \times g_3 + f(x_{4i}) \times g_4$
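As a minimal sketch of the audience-response formula above, the following Python normalizes four indicator values with the Sigmoid function and combines them with their weights. The function names and the sample numbers are hypothetical and chosen only for illustration; the source gives no concrete values.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function f(x) = 1 / (1 + e^(-x)); maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def audience_response_score(values, weights):
    """P1 = sum over k of f(x_k) * g_k for the audience-response indicators."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(sigmoid(x) * g for x, g in zip(values, weights))


# Hypothetical inputs x1..x4 (clicks, cross-site occurrences, replies,
# replies per unit time) and weights g1..g4; all numbers are made up.
values = [3.0, 1.0, 2.0, 0.5]
weights = [0.4, 0.3, 0.2, 0.1]
p1 = audience_response_score(values, weights)
```

Because the weights here sum to 1 and every f(x) lies in (0, 1), the score also lies in (0, 1). Raw counts in the thousands would drive f(x) to 1.0, so this sketch assumes the inputs have already been scaled; the source does not specify any scaling.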
(2) Frequency calculation
Activity is weighed by the frequency with which netizens reply to the network text, using the day, week, or month as the statistical time unit.
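The frequency calculation can be illustrated with a short Python sketch that counts replies per day, week, or month. The function name, the use of ISO week numbers, and the sample timestamps are assumptions made for this example and are not taken from the source.

```python
from collections import Counter
from datetime import datetime


def reply_frequency(reply_times, unit="day"):
    """Count replies per statistical time unit: "day", "week", or "month"."""
    def bucket(ts):
        if unit == "day":
            return ts.strftime("%Y-%m-%d")
        if unit == "week":
            year, week, _ = ts.isocalendar()  # ISO year and ISO week number
            return f"{year}-W{week:02d}"
        if unit == "month":
            return ts.strftime("%Y-%m")
        raise ValueError("unit must be 'day', 'week', or 'month'")

    return Counter(bucket(ts) for ts in reply_times)


# Hypothetical reply timestamps for one network text.
replies = [
    datetime(2015, 5, 14, 9, 0),
    datetime(2015, 5, 14, 18, 30),
    datetime(2015, 5, 15, 8, 15),
]
per_day = reply_frequency(replies, "day")
per_week = reply_frequency(replies, "week")
per_month = reply_frequency(replies, "month")
```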
(3) Weight coefficient determination
The weight coefficient of each attribute factor is determined by the analytic hierarchy process on the basis of expert experience. Its main feature is that a complex problem is decomposed into several constituent factors, which are arranged into a hierarchy according to their subordination relations. When making comparisons, the experts only need to compare the factors pairwise to determine the relative importance of the factors within the same level; the experts' judgments are then combined to determine the order of relative importance of the factors. Determining weight coefficients in this way is more scientific than fixing the weights of several factors simultaneously by experience, because people reach more accurate judgments when they compare only two items at a time. To ensure good results when using this method, however, the number of factors in each level generally should not exceed 10. Pairwise comparison uses a 9-point scale: 1 means equal, 3 slightly better, 5 clearly better, 7 much better, and 9 extremely better; a judgment falling between two of these levels is expressed as 2, 4, 6, or 8. A scoring matrix is formed from the pairwise comparison results, and the importance, or evaluation weight, of each factor relative to the target in the level above is obtained by computing the largest eigenvalue of the matrix and its eigenvector. If the importance or degree of influence of each parameter relative to the target two levels up is required, the weight of each bottom-level parameter is multiplied by the weight of its associated factor in the level above and the products are summed; this yields the ranking, or weight coefficient, of each parameter with respect to the higher level.
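The weight derivation described above can be sketched in Python as follows. Power iteration is used here as one common way to approximate the largest eigenvalue and its eigenvector; the source does not state which numerical method is used, and the function names and the 3×3 comparison matrix are hypothetical.

```python
def ahp_weights(matrix, max_iter=1000, tol=1e-12):
    """Approximate the principal eigenvector and largest eigenvalue by power iteration.

    matrix[i][j] is the pairwise score of factor i against factor j (1 to 9,
    or the reciprocal when factor j is preferred). Returns (weights, lambda_max),
    with the weights normalized to sum to 1.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    lambda_max = float(n)
    for _ in range(max_iter):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        lambda_max = sum(product)  # weights sum to 1, so this estimates the eigenvalue
        updated = [p / lambda_max for p in product]
        converged = max(abs(u - w) for u, w in zip(updated, weights)) < tol
        weights = updated
        if converged:
            break
    return weights, lambda_max


def global_weights(parent_weight, local_weights):
    """Scale the local weights of lower-level factors by their parent factor's weight."""
    return [parent_weight * w for w in local_weights]


# Hypothetical comparison of three factors on the 9-point scale.
scores = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights, lambda_max = ahp_weights(scores)
```

For a perfectly consistent matrix the largest eigenvalue equals the number of factors, so the gap between `lambda_max` and `len(scores)` gives a rough indication of how consistent the experts' pairwise judgments are.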
The quantitative evaluation result is calculated as the weighted sum $\sum_i \omega_i A_i$, where $A_i$ is the score of a first-level indicator (public sentiment intensity or audience attention) and $\omega_i$ is the corresponding weight. Each first-level indicator is determined by its subordinate second-level indicators, calculated as $A_i = \sum_j \omega_j A_{ij}$, where $A_{ij}$ is the $j$-th item of the $i$-th first-level indicator and $\omega_j$ is its weight. Similarly, each second-level indicator is determined by its subordinate third-level indicators.
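The level-by-level weighted evaluation described above can be sketched as a short recursive Python function. The dictionary layout, the indicator names, and all weights and scores below are hypothetical and serve only to illustrate the weighted-sum structure.

```python
def evaluate(node):
    """Score of an indicator: its own score if it is a leaf, otherwise the
    weighted sum of the scores of its subordinate indicators."""
    children = node.get("children")
    if not children:
        return node["score"]
    return sum(child["weight"] * evaluate(child) for child in children)


# Hypothetical indicator tree: two first-level indicators, each with two
# second-level indicators carrying already-normalized scores.
indicator_tree = {
    "name": "evaluation result",
    "children": [
        {"name": "public sentiment intensity", "weight": 0.6, "children": [
            {"name": "indicator a", "weight": 0.5, "score": 0.8},
            {"name": "indicator b", "weight": 0.5, "score": 0.4},
        ]},
        {"name": "audience attention", "weight": 0.4, "children": [
            {"name": "indicator c", "weight": 0.7, "score": 1.0},
            {"name": "indicator d", "weight": 0.3, "score": 0.0},
        ]},
    ],
}
result = evaluate(indicator_tree)
```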
Claims (5)
- 1. An intelligent public sentiment emergency handling system, characterized in that: the system comprises an internet information acquisition and parsing module, an internet information analysis module, a network text category judgment and cluster analysis module, an emergency handling plan generation module, and an emergency handling effect evaluation module; the internet information acquisition and parsing module is used to collect information from the internet, automatically parse network text using rule-based information extraction technology, extract the natural-language text in web pages and the metadata of the web pages, and save them to a database; the internet information analysis module is used to process the natural-language text in the collected information through word segmentation, part-of-speech-based lexical item filtering, and text multiple-accuracy feature extraction, forming text features comprising single lexical item features, lexical item association features, geographic location features, and person features; the network text category judgment and cluster analysis module is used to judge the category of a network text based on the single lexical item features, lexical item association features, geographic location features, and person features, and to perform cluster analysis on accumulated network texts that cannot be classified, grouping network texts with similar content into clusters; if the number of network texts in a cluster exceeds a set threshold, an alarm is triggered and public sentiment category lexical item feature extraction is performed on the network texts in the cluster to generate the lexical item features of the cluster; the emergency handling plan generation module is used to compare and match, according to the specific circumstances of the public sentiment event, the lexical items of the network text against the lexical item features of each public sentiment category, judge the category of the public sentiment event from the matching result, and then automatically generate a corresponding handling plan, on the basis of which decision-makers can formulate an implementation scheme; the emergency handling effect evaluation module is used to evaluate the effect of the implemented scheme; the specific manner of performing public sentiment category lexical item feature extraction on the network texts in a cluster is: suppose the network texts contained in cluster T are T = {t1, t2, …, tn}; the text multiple-accuracy representing method is used to extract the single lexical item features and lexical item association features of each text ti; statistical methods are then used to compute the statistical distribution of all single lexical item features and lexical item association features of all texts in T; the lexical items that occur in more than half of the network texts in T are selected as public sentiment category lexical item features, and their average occurrence frequency in T is computed as the frequency of the public sentiment category feature lexical item; where 1 ≤ i ≤ n.
- 2. An intelligent public sentiment emergency handling method, characterized by comprising the following steps: Step 1, internet information acquisition and parsing: network text is collected from the internet by a computer connected to the internet; the computer then automatically parses the network text using rule-based information extraction technology and extracts two kinds of information from it: natural-language text information and web page metadata; Step 2, internet information analysis: first, the title and body of the network text are segmented separately using the Chinese word segmentation method of natural language processing technology, and the part of speech of each lexical item in the segmentation result is tagged; lexical items other than nouns, verbs, and adjectives are then discarded; the single lexical item features and lexical item association features of the network text are then extracted using the text multiple-accuracy representing method; and the geographic location features and person features in the network text are identified according to the part-of-speech tags in the segmentation result; Step 3: the lexical items in the network text processed in step 2 are compared and matched against the lexical item features of the public sentiment categories set in the computer database, and the network text is classified, according to the matching result, into the public sentiment categories set in the computer database; network texts that cannot be classified are subjected to cluster analysis, grouping network texts with similar content into clusters; if the number of network texts in a cluster exceeds a set threshold, public sentiment category lexical item feature extraction is performed on the network texts in the cluster, and the extracted public sentiment category lexical item features are added to the computer database; network texts that have been classified proceed to step 4; the matched content comprises single lexical item features, lexical item association features, geographic location features, and person features; the specific manner of performing public sentiment category lexical item feature extraction on the network texts in a cluster is: suppose the network texts contained in cluster T are T = {t1, t2, …, tn}; the text multiple-accuracy representing method is used to extract the single lexical item features and lexical item association features of each text ti; statistical methods are then used to compute the statistical distribution of all single lexical item features and lexical item association features of all texts in T; the lexical items that occur in more than half of the network texts in T are selected as public sentiment category lexical item features, and their average occurrence frequency in T is computed as the frequency of the public sentiment category feature lexical item; where 1 ≤ i ≤ n; Step 4: if, within a specified time period, the number of network texts belonging to a certain category, or the number of websites on which network texts of that category appear, exceeds a specified threshold, the emergency plan is started; the emergency handling of the public sentiment emergency is thereby completed.
- 3. The intelligent public sentiment emergency handling method according to claim 2, characterized in that: after step 4 the method further comprises an emergency handling effect evaluation step: indicator data are first collected according to the evaluation indicators, and the indicator data are then input into the evaluation formula to obtain a quantitative evaluation result.
- 4. The intelligent public sentiment emergency handling method according to claim 2, characterized in that: in step 3, classifying the network text, according to the matching result, into the public sentiment categories set in the computer database is specifically: the category of a network text is judged by comparing and matching the lexical items of the network text with the lexical item features of each public sentiment category; matching operations are performed separately on four aspects, namely single lexical item features, lexical item association features, geographic location features, and person features; the similarity between the network text and each public sentiment category is obtained from the matching results; and the text is assigned to the public sentiment category with the highest similarity.
- 5. The intelligent public sentiment emergency handling method according to claim 2, characterized in that: the emergency plan in step 4 is generated as follows: first, ontology technology is used to construct a hierarchical internet public sentiment event scenario ontology knowledge base model and a network public sentiment prevention-and-control measure plan ontology knowledge base model; the internet public sentiment event scenario ontology knowledge base model is used for qualitative and quantitative description of public sentiment events, and the network public sentiment prevention-and-control measure plan ontology knowledge base model is used to digitize the public sentiment emergency prevention-and-control rules and regulations, handling specifications, and countermeasures that exist in the form of natural-language text; then, based on the internet public sentiment event scenario ontology knowledge base model and the network public sentiment prevention-and-control measure plan ontology knowledge base, semantic matching technology is used to automatically match the most suitable emergency handling plan from the network public sentiment prevention-and-control measure plan ontology knowledge base according to the specific circumstances of the public sentiment event scenario.
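The cluster feature extraction recited in claims 1 and 2 (select the lexical items that occur in more than half of the network texts of cluster T, then record their average occurrence frequency) can be sketched in Python as follows. This is an illustration under assumptions: each text is taken to be an already segmented list of lexical items, "average occurrence frequency" is interpreted as total occurrences divided by the number of texts in T, and the function name and sample data are hypothetical.

```python
from collections import Counter


def cluster_category_features(texts):
    """Map each lexical item occurring in more than half of the texts to its
    average number of occurrences per text in the cluster.

    texts: list of lists of lexical items, one inner list per network text.
    """
    n = len(texts)
    text_count = Counter()   # number of texts that contain each lexical item
    total_count = Counter()  # total occurrences of each lexical item in T
    for items in texts:
        counts = Counter(items)
        total_count.update(counts)
        text_count.update(counts.keys())
    return {
        item: total_count[item] / n
        for item, in_texts in text_count.items()
        if in_texts > n / 2
    }


# Hypothetical cluster of three segmented network texts.
cluster = [
    ["fire", "school", "rumor"],
    ["fire", "fire", "school"],
    ["fire", "traffic"],
]
features = cluster_category_features(cluster)
```

Here "fire" appears in all three texts and "school" in two of them, so both are kept; "rumor" and "traffic" each appear in only one text and are dropped.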
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510243751.0A CN104820629B (en) | 2015-05-14 | 2015-05-14 | A kind of intelligent public sentiment accident emergent treatment system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104820629A (en) | 2015-08-05 |
CN104820629B (en) | 2018-01-30 |
Family
ID=53730930
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510243751.0A Active CN104820629B (en) | 2015-05-14 | 2015-05-14 | A kind of intelligent public sentiment accident emergent treatment system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104820629B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101436194A (en) * | 2008-11-04 | 2009-05-20 | 中国电子科技集团公司第五十四研究所 | Text multiple-accuracy representing method based on data excavating technology |
CN101819573A (en) * | 2009-09-15 | 2010-09-01 | 电子科技大学 | Self-adaptive network public opinion identification method |
CN102509164A (en) * | 2011-11-24 | 2012-06-20 | 广州市地下铁道总公司 | Automatic generation method for digital emergency plan |
CN103544255A (en) * | 2013-10-15 | 2014-01-29 | 常州大学 | Text semantic relativity based network public opinion information analysis method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103150335A (en) * | 2013-01-25 | 2013-06-12 | 河南理工大学 | Co-clustering-based coal mine public sentiment monitoring system |
CN104573016A (en) * | 2015-01-12 | 2015-04-29 | 武汉泰迪智慧科技有限公司 | System and method for analyzing vertical public opinions based on industry |
Also Published As
Publication number | Publication date |
---|---|
CN104820629A (en) | 2015-08-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||