CN109614551A - A kind of negative public sentiment judgment method and device - Google Patents

Publication number
CN109614551A
Authority
CN
China
Prior art keywords
negative
public sentiment
keyword
identified
public
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811518710.8A
Other languages
Chinese (zh)
Inventor
耿琦
苏谦
余训培
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI YOUYANG XINMEI INFORMATION TECHNOLOGY Co Ltd
Original Assignee
SHANGHAI YOUYANG XINMEI INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI YOUYANG XINMEI INFORMATION TECHNOLOGY Co Ltd filed Critical SHANGHAI YOUYANG XINMEI INFORMATION TECHNOLOGY Co Ltd
Priority to CN201811518710.8A priority Critical patent/CN109614551A/en
Publication of CN109614551A publication Critical patent/CN109614551A/en
Pending legal-status Critical Current

Abstract

The embodiments of the present application disclose a negative public sentiment judgment method and device. A processing device first determines the negative keywords that occur in the public sentiment to be identified and the number of occurrences of each negative keyword. A negative score of the public sentiment to be identified is then determined from the negative keywords and their numbers of occurrences. If the negative score is greater than or equal to a preset threshold, the public sentiment to be identified is judged to be negative public sentiment. Negative public sentiment is thus identified automatically by negative keyword matching, which avoids the influence of human factors and improves recognition efficiency and stability.

Description

A negative public sentiment judgment method and device
Technical field
This application relates to the field of data processing, and in particular to a negative public sentiment judgment method and device.
Background
Public sentiment is short for "public opinion situation": the sum of the attitudes, opinions, and statements that the public expresses about a social event concerning an enterprise, an individual, a social organization, or the like. With the development of the internet, netizens' views on an event spread through online media and form online public sentiment. Because publishing information online costs very little, and because there are many channels, fast propagation, and wide reach, often accompanied by rumors and irrational voices, online public sentiment has become the dominant form of public sentiment. If it is not handled effectively, it easily builds into a powerful wave of opinion and becomes the fuse that triggers a public incident for a company or a government.
The financial industry essentially deals in "trust". Compared with other industries, building a good corporate image matters more to financial practitioners and institutions. However, some financial companies mislead users by spreading "positive public sentiment" and profit from it. For example, before their "thunderclaps" (collapse), many P2P institutions organized the manufacture of "positive public sentiment" through online media, such as spreading news that a platform's business performance was growing and its outlook was good and arranging for many media outlets to repost it. They attracted investors' money and then absconded with the funds, which ultimately caused mass incidents.
Public sentiment is therefore a key source of information for financial regulators. Public sentiment on the network needs to be judged and identified accurately, negative public sentiment in particular, so that the actual operating condition of a financial company can be recognized from negative public sentiment and it can be determined whether false information about the company is being spread through public sentiment channels. Once this is determined, regulators can intervene and refute rumors at the earliest moment, effectively preventing the risk from worsening.
The current mainstream means of judging public sentiment relies mainly on manual inspection: dedicated staff periodically check the relevant news by hand and judge negative public sentiment based on their understanding of the business. With the rapid development of online media, and self-media in particular, existing monitoring means can hardly meet business needs. Specifically, manual work is inefficient and lacks a comprehensive assessment of public sentiment. Manual inspection requires browsing the public sentiment on each medium one by one and judging whether it is negative. First, this requires a large investment of human resources. Second, because individuals differ in industry experience, their standards for judging the influence and spread of public sentiment differ, which affects comprehensive analysis. Third, compiling public sentiment data by hand also increases operational risk, such as missing important public sentiment.
Summary of the invention
To solve the above technical problem, this application provides a negative public sentiment judgment method and device.
The embodiments of the present application disclose the following technical solutions:
In a first aspect, an embodiment of the present application provides a negative public sentiment judgment method, the method comprising:
determining the negative keywords that occur in the public sentiment to be identified and the number of occurrences of each negative keyword;
determining a negative score of the public sentiment to be identified according to the negative keywords and the number of occurrences of each negative keyword;
if the negative score of the public sentiment to be identified is greater than or equal to a preset threshold, judging the public sentiment to be identified to be negative public sentiment.
Optionally, for any target keyword among the negative keywords, determining the negative score of the public sentiment to be identified according to the negative keywords and the numbers of occurrences comprises:
determining the sub negative score that the target keyword contributes to the public sentiment to be identified according to the negative weight score of the target keyword and the number of occurrences of the target keyword;
calculating the negative score of the public sentiment to be identified from the sub negative scores of the negative keywords.
Optionally, the negative weight score of the target keyword is determined as follows:
matching the target keyword against a first sample set, the first sample set comprising a plurality of identified negative public sentiments and a plurality of identified non-negative public sentiments;
determining the negative weight score of the target keyword according to the posterior conditional probability that a public sentiment in the first sample set containing the target keyword is negative, and the prior probability of negative public sentiment in the first sample set.
Optionally, the preset threshold is determined as follows:
obtaining a second sample set, the second sample set comprising a plurality of identified negative public sentiments and a plurality of identified non-negative public sentiments, together with the negative scores of those negative and non-negative public sentiments;
according to a scoring model, identifying negative public sentiment among the public sentiments in the second sample set using different recognition thresholds;
if the degree of conformity between the recognition result under a target recognition threshold and the actual result of the second sample set meets a preset condition, using the target recognition threshold as the preset threshold.
Optionally, before determining the negative keywords that occur in the public sentiment to be identified and the number of occurrences of each negative keyword, the method further comprises:
obtaining a public sentiment set comprising a plurality of pending public sentiments;
filtering the public sentiment set according to the titles of the pending public sentiments;
taking any pending public sentiment that remains after filtering as the public sentiment to be identified.
Optionally, for any target keyword among the negative keywords, determining the negative keywords that occur in the public sentiment to be identified and the number of occurrences of each negative keyword comprises:
determining, according to the text position at which the target keyword is matched in the public sentiment to be identified, the contextual information that contains the target keyword;
identifying the semantic tendency of the contextual information;
if the semantic tendency is positive, determining that the target keyword is not matched at the text position.
In a second aspect, an embodiment of the present application provides a negative public sentiment judgment device, the device comprising a determination unit, a calculation unit, and a judging unit:
the determination unit is configured to determine the negative keywords that occur in the public sentiment to be identified and the number of occurrences of each negative keyword;
the calculation unit is configured to determine a negative score of the public sentiment to be identified according to the negative keywords and the number of occurrences of each negative keyword;
the judging unit is configured to judge the public sentiment to be identified to be negative public sentiment if its negative score is greater than or equal to a preset threshold.
Optionally, for any target keyword among the negative keywords, the calculation unit is further configured to:
determine the sub negative score that the target keyword contributes to the public sentiment to be identified according to the negative weight score of the target keyword and the number of occurrences of the target keyword;
calculate the negative score of the public sentiment to be identified from the sub negative scores of the negative keywords.
Optionally, the calculation unit is further configured to determine the negative weight score of the target keyword as follows:
matching the target keyword against a first sample set, the first sample set comprising a plurality of identified negative public sentiments and a plurality of identified non-negative public sentiments;
determining the negative weight score of the target keyword according to the posterior conditional probability that a public sentiment in the first sample set containing the target keyword is negative, and the prior probability of negative public sentiment in the first sample set.
Optionally, the calculation unit is further configured to determine the preset threshold as follows:
obtaining a second sample set, the second sample set comprising a plurality of identified negative public sentiments and a plurality of identified non-negative public sentiments, together with the negative scores of those negative and non-negative public sentiments;
according to a scoring model, identifying negative public sentiment among the public sentiments in the second sample set using different recognition thresholds;
if the degree of conformity between the recognition result under a target recognition threshold and the actual result of the second sample set meets a preset condition, using the target recognition threshold as the preset threshold.
Optionally, the device further comprises a filter unit, the filter unit being configured to:
obtain a public sentiment set comprising a plurality of pending public sentiments;
filter the public sentiment set according to the titles of the pending public sentiments;
take any pending public sentiment that remains after filtering as the public sentiment to be identified.
Optionally, for any target keyword among the negative keywords, the determination unit is further configured to:
determine, according to the text position at which the target keyword is matched in the public sentiment to be identified, the contextual information that contains the target keyword;
identify the semantic tendency of the contextual information;
if the semantic tendency is positive, determine that the target keyword is not matched at the text position.
As can be seen from the above technical solutions, a processing device first determines the negative keywords that occur in the public sentiment to be identified and the number of occurrences of each negative keyword. A negative score of the public sentiment to be identified can then be determined from the negative keywords and their numbers of occurrences. If the negative score is greater than or equal to a preset threshold, the public sentiment to be identified is judged to be negative public sentiment. Negative public sentiment is thus identified automatically by negative keyword matching, which avoids the influence of human factors and improves recognition efficiency and stability.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a negative public sentiment judgment method provided by an embodiment of the present application;
Fig. 2 is a structural diagram of a negative public sentiment judgment device provided by an embodiment of the present application.
Detailed description of embodiments
The embodiments of the present application are described below with reference to the drawings.
Online public sentiment is currently a key source of information for financial regulators. Take the many P2P thunderclaps since 2018 as an example: before their thunderclaps, many P2P institutions organized the manufacture of "positive public sentiment" through online media, such as spreading news that a platform's business performance was growing and its outlook was good and arranging for many media outlets to repost it. They attracted investors' money and then absconded with the funds, which ultimately caused mass incidents.
If it can be recognized early that such public sentiment is false, or that a channel spreading public sentiment often spreads false information, and regulators intervene and refute the rumors at the earliest moment, further deterioration of the risk can be effectively avoided. An effective public sentiment identification scheme is therefore needed. The current mainstream means of judging public sentiment relies mainly on manual inspection, which is inefficient, depends on each person's industry experience, and is prone to omissions.
For this purpose, the embodiments of the present application provide a negative public sentiment judgment method. The method can be applied to a processing device with data processing capability, such as a computer or a server.
For a public sentiment that needs to be identified (the public sentiment to be identified), the processing device determines the negative keywords that occur in it and the number of occurrences of each negative keyword. A negative score of the public sentiment to be identified can be determined from the negative keywords and their numbers of occurrences. If the negative score is greater than or equal to a preset threshold, the public sentiment to be identified is judged to be negative public sentiment. Negative public sentiment is thus identified automatically by negative keyword matching, which avoids the influence of human factors and improves recognition efficiency and stability.
Fig. 1 is a flowchart of a negative public sentiment judgment method provided by an embodiment of the present application. The method comprises:
101: Determine the negative keywords that occur in the public sentiment to be identified and the number of occurrences of each negative keyword.
Negative keywords can be preset according to identification needs, and public sentiment in different fields can have different negative keywords. In the financial field, the preset negative keywords identify negative characteristics in the operating condition of an enterprise or individual. For example, negative keywords in the financial field may include "promise breaking steps on thunder Rat Trading and rises suddenly and sharply that cut limit down in half huge Thanks to farce, which earns wharf explosion cruelly and produces shrink roller-coaster shockingly and quit listing to delay to cash to split grade under set pattern and break even, connects village insider trading nest case of sitting crosslegged The run on a bank liquidation that runs away of drop is halted to close a position the overdue debt crisis Quantity of large redemption of closing a position of quick-fried thunder mini fund liquidity run customization benefit Benefit conveying" etc.
Public sentiment on the network can take the form of articles, news, circle-of-friends posts, blogs, emails, chat messages, and so on. The processing device matches the preset negative keywords against the public sentiment to be identified to determine how many negative keywords occur in it and how many times each one occurs. For example, if the public sentiment to be identified is a web article A, the result of matching negative keywords against A can be as shown in Table 1:
Table 1
Table 1 shows that three negative keywords occur in the content of A: "thunderclaps", "limit down", and "delisting". "Thunderclaps" occurs 2 times in A, and "limit down" and "delisting" each occur once.
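The matching in step 101 can be sketched as a plain substring count. This is only a minimal illustration, not the patent's implementation: the function name `count_negative_keywords`, the keyword list, and the article text are hypothetical, and the text was written so that the counts mirror the example (2, 1, 1).

```python
from collections import Counter


def count_negative_keywords(text, keywords):
    """Return {keyword: hit count} for each keyword found at least once."""
    lowered = text.lower()
    counts = Counter()
    for keyword in keywords:
        hits = lowered.count(keyword.lower())  # non-overlapping substring matches
        if hits:
            counts[keyword] = hits
    return dict(counts)


# Hypothetical keyword list and article text (assumptions for illustration).
KEYWORDS = ["thunderclaps", "limit down", "delisting", "default"]
ARTICLE_A = (
    "The company faces thunderclaps risk. Its stock hit limit down today. "
    "Investors fear more thunderclaps and a possible delisting."
)

hits = count_negative_keywords(ARTICLE_A, KEYWORDS)
print(hits)
```

Keywords that never occur are left out of the result, so the dictionary holds exactly the matched negative keywords and their hit counts.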
Note that the semantic tendency of the contextual information in which a negative keyword is matched may not be negative. For example, in the contextual information "The company chairman refuted the rumor: my company has never had thunderclaps risk since listing", the negative keyword "thunderclaps" occurs, but the semantic tendency of the whole contextual information is actually positive and expresses no negative characteristic of the enterprise.
Therefore, to improve the matching precision of negative keywords, any one of the negative keywords is taken as the target keyword, and the target keyword is used to explain how the above problem is solved.
In an optional implementation, step 101 may include:
Determining, according to the text position at which the target keyword is matched in the public sentiment to be identified, the contextual information that contains the target keyword.
Identifying the semantic tendency of the contextual information.
If the semantic tendency is positive, determining that the target keyword is not matched at the text position.
There are many ways to judge whether the semantic tendency is positive. The embodiments of the present application provide one optional way, namely judging by the phrase structure in which the negative keyword appears in the contextual information, such as "no [negative word] has occurred", "stop [negative word]", "there is no [negative word]", "no major violations", "no default", "no false records, misleading statements, or major omissions", "proportion overdue by more than 90 days", and so on.
Determining the semantic tendency of the contextual information improves the matching precision of negative keywords. In the example of web article A above, after the semantic tendency of the contextual information of each negative keyword is determined, the matching result of the negative keywords can be as shown in Table 2:
Negative keyword | Hit count | Hit location
Thunderclaps | 1 | Position one: "XXX the company has thunderclaps risk because of debt default and regulatory violations XXX"
Limit down | 1 | Position one: "XXX stock hit limit down for 10 consecutive trading days"
Delisting | 1 | Position one: "XXX has delisting risk"
Table 2
Table 2 shows that three negative keywords occur in the content of A: "thunderclaps", "limit down", and "delisting". "Thunderclaps", "limit down", and "delisting" each occur once in A.
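The context check described above can be sketched by looking for a negating phrase just before each keyword match. This is a rough sketch under stated assumptions: the cue list `NEGATION_CUES`, the 30-character window, and the function name `count_negative_hits` are all hypothetical, and a real system would need a richer way to identify semantic tendency.

```python
import re

# Hypothetical negating phrases; the patent lists phrase structures such as
# "there is no [negative word]" but does not fix a concrete list.
NEGATION_CUES = ("there is no", "has never", "without")


def count_negative_hits(text, keyword, window=30):
    """Count keyword matches whose preceding context holds no negation cue."""
    lowered = text.lower()
    hits = 0
    for match in re.finditer(re.escape(keyword.lower()), lowered):
        # Context = up to `window` characters immediately before the match.
        context = lowered[max(0, match.start() - window):match.start()]
        if any(cue in context for cue in NEGATION_CUES):
            continue  # positive tendency: treat the keyword as not matched here
        hits += 1
    return hits


sample = (
    "Chairman: there is no thunderclaps risk at our company. "
    "Analysts say the firm faces thunderclaps risk after a debt default."
)
print(count_negative_hits(sample, "thunderclaps"))
```

Only the second occurrence is counted, because the first one is preceded by "there is no".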
Note that a very large amount of new public sentiment appears on the network every day, and some of it is obviously not negative. To improve recognition efficiency, a preliminary screening by title can optionally be performed to reject public sentiment that is obviously not negative.
In one possible implementation, before step 101 is executed, a preliminary screening by public sentiment title can be performed as follows:
Obtain a public sentiment set comprising a plurality of pending public sentiments.
Filter the public sentiment set according to the titles of the pending public sentiments.
Take any pending public sentiment that remains after filtering as the public sentiment to be identified.
The public sentiment set can be crawled from the network in advance, or be a collection of pending public sentiments of various kinds obtained in advance. A pending public sentiment is one that has not yet been identified as negative or not. A preliminary screening can then be applied to the set to filter out public sentiment that is obviously non-negative.
For example, in the financial field, one filtering criterion is to check whether the article title contains a colon ":". If it does, the item is directly marked as non-negative and the subsequent steps are skipped, as with "long letter benefit is full of mixing: updating recruitment abstract of description (2018 the 2nd Number)". Another criterion is to check whether the article title contains "retrospect of the important news". Such articles are mostly aggregated news lists and are outside the monitoring scope for negative public sentiment, so on a hit the article is directly marked as non-negative, as with "[weekend important news time Care for] how trade war upgrading China 60,000,000,000 to cope with the U.S. 200,000,000,000?".
Beyond the examples above, filtering criteria can be customized for different application scenarios. Users can set rules based on their understanding of specific needs, such as filtering out articles whose title contains "bulletin is selected". At the implementation level, the system generates a corresponding regular expression for each rule and matches every article against the rules one by one.
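A minimal sketch of rule-based title filtering with regular expressions follows. The rule list mirrors the three examples in the text; the names `TITLE_RULES`, `is_obviously_non_negative`, and `filter_pending`, the `title` field, and the sample titles are assumptions made for illustration.

```python
import re

# One compiled pattern per filtering rule (hypothetical rule set).
TITLE_RULES = [
    re.compile(r"[:：]"),  # title contains an ASCII or full-width colon
    re.compile(r"retrospect of the important news", re.IGNORECASE),
    re.compile(r"bulletin is selected", re.IGNORECASE),
]


def is_obviously_non_negative(title):
    """True if any rule hits, meaning the item skips further processing."""
    return any(rule.search(title) for rule in TITLE_RULES)


def filter_pending(items):
    """Keep only the pending items whose title hits no rule."""
    return [item for item in items if not is_obviously_non_negative(item["title"])]


pending = [
    {"title": "Fund X: updated prospectus summary"},
    {"title": "Weekend retrospect of the important news"},
    {"title": "Platform Y halts withdrawals"},
]
remaining = filter_pending(pending)
print([item["title"] for item in remaining])
```

Keeping the rules in a list makes it straightforward to append a user-defined pattern without touching the filtering logic.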
102: Determine the negative score of the public sentiment to be identified according to the negative keywords and the number of occurrences of each negative keyword.
Both the number of distinct negative keywords that occur in a public sentiment and the number of times each one occurs can affect its negative score. In general, the more distinct negative keywords occur, or the more often each occurs, the more negative characteristics of the enterprise concerned the public sentiment conveys and the more its content leans negative, which raises its negative score.
The application does not limit how the negative score is determined from the number of distinct negative keywords and their numbers of occurrences.
However, the embodiments of the present application provide one possible implementation. In this implementation, any one of the negative keywords matched in the public sentiment to be identified is taken as the target keyword for illustration.
For the target keyword, in step 102, the sub negative score that the target keyword contributes to the public sentiment to be identified is determined according to the negative weight score of the target keyword and its number of occurrences.
That is, both the target keyword itself and its number of occurrences affect the negative score, and this effect is expressed by the sub negative score. In general, different target keywords can have different negative weight scores, and the more often a target keyword occurs, the higher its sub negative score.
After the sub negative score of each target keyword that occurs in the public sentiment to be identified has been determined, the negative score of the public sentiment to be identified can be calculated.
Take the aforementioned web article A as an example. Three negative keywords occur in the content of A: "thunderclaps", "limit down", and "delisting", each occurring once. The negative weight score of "thunderclaps" is 0.257, that of "limit down" is 0.911, and that of "delisting" is 0.42. Calculating the negative score of A amounts to summing, over the hit negative keywords, the negative weight score of each keyword multiplied by its hit count, as in formula (1):
P(A) = ∑ P(n) * C(n) (1)
In formula (1), P(A) is the negative score of A, n ranges over the distinct negative keywords that occur in A, P(n) is the negative weight score of the n-th keyword, and C(n) is its number of occurrences. In the example above, P(A) = 0.257 * 1 + 0.911 * 1 + 0.42 * 1 = 1.588.
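Formula (1) is a weighted sum, which can be checked with a few lines of code. The sketch below uses the weight scores and hit counts from the example; the function name `negative_score` and the dictionary layout are assumptions.

```python
def negative_score(weights, counts):
    """Sum of (negative weight score * number of occurrences) per keyword."""
    return sum(weights[keyword] * n for keyword, n in counts.items())


# Values taken from the article A example.
weights = {"thunderclaps": 0.257, "limit down": 0.911, "delisting": 0.42}
counts = {"thunderclaps": 1, "limit down": 1, "delisting": 1}

score_a = negative_score(weights, counts)
print(round(score_a, 3))
```

Rounding is applied only for display, since floating-point addition may leave a tiny error in the last digits.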
The negative weight scores of different negative keywords can be determined in different ways. The embodiments of the present application provide one optional calculation. Taking the target keyword as an example, its negative weight score is determined as follows:
Match the target keyword against a first sample set, the first sample set comprising a plurality of identified negative public sentiments and a plurality of identified non-negative public sentiments;
Determine the negative weight score of the target keyword according to the posterior conditional probability that a public sentiment in the first sample set containing the target keyword is negative, and the prior probability of negative public sentiment in the first sample set.
One possible implementation of the above is a conditional probability calculation using Bayes' theorem, illustrated by the following example:
1. Public sentiments that have already been identified as negative or non-negative, for example by manual labeling, are used as the first sample set. Suppose the first sample set contains 10000 public sentiments in total, of which 6000 are negative and 4000 are non-negative.
2. The target keyword, for example "thunderclaps", is matched against the 6000 negative public sentiments, and the number of public sentiments that contain the word is counted; for example, 4800 match "thunderclaps" and 1200 do not.
3. "Thunderclaps" is matched against the 4000 non-negative public sentiments, and the number that contain the word is counted; for example, 800 match and 3200 do not.
The resulting Table 3 is as follows:
"Thunderclaps" | Negative public sentiment | Non-negative public sentiment
Matched | 4800 | 800
Not matched | 1200 | 3200
Total | 6000 | 4000
Table 3
Let matching "thunderclaps" be event O and the article being negative public sentiment be event M. By Bayes' theorem, calculating the negative weight score of "thunderclaps" becomes the problem of computing the posterior conditional probability that a public sentiment is negative given that "thunderclaps" occurs, minus the prior probability of negative public sentiment:
P(M | O) - P(M) = P(O ∩ M) / P(O) - P(M)
Here P(M) is the prior probability of negative public sentiment, and P(M | O) is the posterior conditional probability that a public sentiment is negative given that "thunderclaps" occurs.
P(O ∩ M) = 4800/10000 = 48%; P(O) = (4800 + 800)/10000 = 56%; P(M) = 6000/10000 = 60%. Then P(M | O) - P(M) = 48%/56% - 60% ≈ 0.257, so the negative weight score of "thunderclaps" is 0.257.
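The same arithmetic can be reproduced directly from the four counts in Table 3. In this sketch the function name `negative_weight` and its parameters are assumptions, and returning 0.0 for a keyword that never occurs in the sample set is a choice made here, not something the source specifies.

```python
def negative_weight(neg_hit, neg_total, nonneg_hit, nonneg_total):
    """Posterior P(M | O) minus prior P(M), from labeled sample counts."""
    total = neg_total + nonneg_total
    hit_total = neg_hit + nonneg_hit
    if hit_total == 0:
        return 0.0  # assumption: an unseen keyword carries no weight
    prior = neg_total / total            # P(M)
    posterior = neg_hit / hit_total      # P(M | O) = P(O ∩ M) / P(O)
    return posterior - prior


# Counts for "thunderclaps" from Table 3.
weight = negative_weight(neg_hit=4800, neg_total=6000,
                         nonneg_hit=800, nonneg_total=4000)
print(round(weight, 3))
```

A keyword that appears mostly in non-negative samples gets a weight below zero under this definition, which is consistent with negative scores that can themselves be below zero.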
After step 102, if the negative score of the public sentiment to be identified is greater than or equal to the preset threshold, step 103 is executed.
103: Judge the public sentiment to be identified to be negative public sentiment.
For example, there are 100 public sentiments to be identified, and the negative scores determined for them are shown in Table 4:
Article number | Negative score
1st article A | 9.1
2nd article B | 3.2
... | ...
99th article Y | -0.3
100th article Z | -7.2
Table 4
The public sentiments to be identified are separated by the preset threshold calculated by the system. If the threshold is set to -0.2, a public sentiment whose negative score is greater than or equal to -0.2 is judged negative, and one whose negative score is less than -0.2 is judged non-negative.
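Applied to the four articles listed in Table 4, the threshold comparison looks like this. It is a small sketch: the names `PRESET_THRESHOLD`, `is_negative`, and `judgments` are hypothetical, and the scores are those from the table.

```python
PRESET_THRESHOLD = -0.2  # threshold value used in the example

# Negative scores of the four articles listed in Table 4.
scores = {"A": 9.1, "B": 3.2, "Y": -0.3, "Z": -7.2}


def is_negative(score, threshold=PRESET_THRESHOLD):
    """Negative public sentiment if the score reaches the threshold."""
    return score >= threshold


judgments = {name: is_negative(score) for name, score in scores.items()}
print(judgments)
```

Articles A and B are judged negative, while Y and Z fall below the threshold and are judged non-negative.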
The preset threshold can be determined in different ways, for example set according to the recognition accuracy required in different scenarios.
The embodiments of the present application provide one way of calculating the preset threshold, which is an optimization problem in operations research.
Optionally, the preset threshold is determined according to such as under type:
Obtaining the second sample set, second sample set includes identified multiple negative public sentiments and multiple non-negative public sentiments, And the negative scoring of the multiple negative public sentiment and multiple non-negative public sentiments.
According to marking model, the knowledge of negative public sentiment is carried out to the public sentiment in the second sample set using different recognition thresholds Not.
If the recognition result under target identification threshold value and the degree of conformity between the actual result of second sample set meet pre- If condition, using target identification threshold value as the preset threshold.
In one possible implementation, recognition threshold can be first transferred to biggish numerical value, then every time calculate according to Secondary reduction, therefrom to determine the recognition threshold as preset threshold according to result.
1. Calculate the negative scores of the 10000 identified public sentiment items in the second sample set and sort them from high to low. Of these 10000 items, 6000 are negative public sentiment and 4000 are non-negative public sentiment.
2. Set the recognition threshold above the highest negative score. For example, if the highest negative score is 13, set the recognition threshold to 13.1, so that all public sentiment in the second sample set is judged to be non-negative according to the recognition threshold.
3. Calculate the proportion of public sentiment whose classification under the recognition threshold agrees with its actual label in the second sample set, i.e. relative accuracy (degree of conformity) = (number labeled negative both in reality and by the machine + number labeled non-negative both in reality and by the machine) / total number of public sentiment items. With a recognition threshold of 13.1, the degree of conformity between the recognition result and the actual result of the second sample set = (0 + 4000)/10000 = 40%.
Table 5 shows the recognition result at a recognition threshold of 13.1 and the actual result of the second sample set:
Table 5
4. Lower the recognition threshold gradually in steps of 0.1, down to the lowest negative score in the second sample set. After each adjustment, recalculate the relative accuracy (degree of conformity) of the machine's classification, and finally take the recognition threshold with the highest relative accuracy as the preset threshold. In the above example, the degree of conformity is highest, at 98.2%, when the recognition threshold is -0.2, so the preset threshold can be determined to be -0.2. Table 6 shows the recognition result at a recognition threshold of -0.2 and the actual result of the second sample set:
Table 6
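Steps 1 to 4 amount to a one-dimensional search over candidate thresholds. The following Python sketch shows one way this could be done; the function name, the tie-breaking rule (the highest threshold wins) and the six-item sample are assumptions for illustration and are not data from the application.

```python
def best_threshold(scores, labels, step=0.1):
    """Sweep the recognition threshold downward in fixed steps and return
    (threshold, degree_of_conformity) for the best-performing threshold.

    scores: negative scores of the labeled sample articles
    labels: True where the article is actually negative public sentiment
    """
    # Work in integer multiples of the step to avoid accumulating float error.
    highest = round(max(scores) / step) + 1   # one step above the highest score
    lowest = round(min(scores) / step)        # stop at the lowest score
    best_t, best_acc = None, -1.0
    for k in range(highest, lowest - 1, -1):
        t = round(k * step, 10)
        # An article is judged negative when score >= threshold; count agreements.
        agree = sum((s >= t) == y for s, y in zip(scores, labels))
        acc = agree / len(scores)
        if acc > best_acc:                    # ties keep the earlier (higher) threshold
            best_t, best_acc = t, acc
    return best_t, best_acc


# Hypothetical labeled sample, not taken from the application.
sample_scores = [9.1, 3.2, 0.5, -0.1, -0.3, -7.2]
sample_labels = [True, True, True, True, False, False]
threshold, conformity = best_threshold(sample_scores, sample_labels)
print(threshold, conformity)
```

The first threshold tried lies above every score, so all articles are judged non-negative, which matches the starting point described in step 2.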
As can be seen from the above embodiment, the processing device first determines the negative keywords appearing in the public sentiment to be identified and the frequency of occurrence of each negative keyword. The negative score of the public sentiment to be identified can then be determined from the negative keywords and their frequencies of occurrence. If the negative score is greater than or equal to the preset threshold, the public sentiment to be identified is judged as negative public sentiment. Whether public sentiment is negative is thus identified automatically by matching negative keywords, which avoids the influence of human factors and improves recognition efficiency and stability.
Fig. 2 is a structural diagram of a negative public sentiment judgment device provided by an embodiment of the present application. The device includes a determination unit 201, a computing unit 202 and a judging unit 203:
The determination unit 201 is configured to determine the negative keywords appearing in the public sentiment to be identified and the frequency of occurrence of each negative keyword;
The computing unit 202 is configured to determine the negative score of the public sentiment to be identified according to the negative keywords and the frequency of occurrence of each negative keyword;
The judging unit 203 is configured to judge the public sentiment to be identified as negative public sentiment if its negative score is greater than or equal to a preset threshold.
Optionally, for any target keyword among the negative keywords, the computing unit is further configured to:
Determine the negative sub-score contributed by the target keyword to the public sentiment to be identified according to the negative weight score of the target keyword and the frequency of occurrence of the target keyword;
Calculate the negative score of the public sentiment to be identified according to the negative sub-score of each negative keyword.
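The following Python sketch illustrates one possible reading of this calculation. It assumes that each sub-score is the keyword's negative weight score multiplied by its frequency of occurrence, and that the overall negative score is the sum of the sub-scores; this section does not spell out the combining formula, so both choices are assumptions. The weight for "default" is hypothetical.

```python
def negative_score(text, weights):
    """Sum the per-keyword sub-scores of a text.

    weights: mapping from negative keyword to its negative weight score.
    Assumption: sub-score = weight * occurrence count; total = sum of sub-scores.
    """
    total = 0.0
    for keyword, weight in weights.items():
        count = text.count(keyword)   # non-overlapping occurrences of the keyword
        total += weight * count
    return total


# 0.257 is the weight derived for "thunderclaps" in the text; 0.1 is made up.
weights = {"thunderclaps": 0.257, "default": 0.1}
article = "thunderclaps hit one platform; more thunderclaps followed a default"
score = negative_score(article, weights)
print(round(score, 3))  # 0.614
```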
Optionally, the computing unit is further configured to determine the negative weight score of the target keyword as follows:
Match the target keyword against a first sample set, where the first sample set includes multiple identified items of negative public sentiment and multiple identified items of non-negative public sentiment;
Determine the negative weight score of the target keyword according to the posterior conditional probability that public sentiment in the first sample set containing the target keyword is negative public sentiment, and the prior probability of negative public sentiment in the first sample set.
Optionally, the computing unit is further configured to determine the preset threshold as follows:
Obtain a second sample set, where the second sample set includes multiple identified items of negative public sentiment and multiple identified items of non-negative public sentiment, together with the negative scores of these items;
According to the scoring model, identify negative public sentiment among the public sentiment in the second sample set using different recognition thresholds;
If the degree of conformity between the recognition result under a target recognition threshold and the actual result of the second sample set meets a preset condition, use the target recognition threshold as the preset threshold.
Optionally, the device further includes a filter unit, and the filter unit is configured to:
Obtain a public sentiment set including multiple items of pending public sentiment;
Filter the public sentiment set according to the titles of the multiple items of pending public sentiment;
Use any item of pending public sentiment that remains after filtering as the public sentiment to be identified.
Optionally, for any target keyword among the negative keywords, the determination unit is further configured to:
According to the text position at which the target keyword is matched in the public sentiment to be identified, determine the contextual information that includes the target keyword;
Identify the semantic tendency expressed by the contextual information;
If the semantic tendency is positive, determine that the target keyword is not matched at that text position.
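The context check can be sketched in Python as follows. The section does not say how the semantic tendency is recognized, so the sketch takes a caller-supplied predicate `is_positive`; the character window size and the toy cue-word predicate are likewise assumptions for illustration.

```python
def count_effective_matches(text, keyword, is_positive, window=10):
    """Count keyword occurrences whose surrounding context is not positive.

    is_positive: callable that takes the context string and returns True when
    the context expresses a positive tendency (hypothetical, supplied by caller).
    window: number of characters of context kept on each side of the match.
    """
    count = 0
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        context = text[max(0, start - window):end + window]
        if not is_positive(context):
            count += 1                      # the match stands
        start = text.find(keyword, end)     # continue after this occurrence
    return count


def toy_is_positive(context):
    # Toy predicate: treats a single cue word as a sign of positive tendency.
    return "avoided" in context


sample = "The platform avoided thunderclaps. Later, thunderclaps hit investors."
effective = count_effective_matches(sample, "thunderclaps", toy_is_positive)
print(effective)  # 1
```

Only the second occurrence is counted, because the context of the first contains the cue word and is treated as positive.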
As can be seen from the above embodiment, the processing device first determines the negative keywords appearing in the public sentiment to be identified and the frequency of occurrence of each negative keyword. The negative score of the public sentiment to be identified can then be determined from the negative keywords and their frequencies of occurrence. If the negative score is greater than or equal to the preset threshold, the public sentiment to be identified is judged as negative public sentiment. Whether public sentiment is negative is thus identified automatically by matching negative keywords, which avoids the influence of human factors and improves recognition efficiency and stability.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium may be at least one of the following media capable of storing program code: read-only memory (ROM), RAM, a magnetic disk, an optical disc, and the like.
It should be noted that the embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. The device and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The device and system embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above is only one specific embodiment of the present application, and the protection scope of the present application is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A negative public sentiment judgment method, characterized in that the method comprises:
Determining the negative keywords appearing in the public sentiment to be identified and the frequency of occurrence of each negative keyword;
Determining the negative score of the public sentiment to be identified according to the negative keywords and the frequency of occurrence of each negative keyword;
If the negative score of the public sentiment to be identified is greater than or equal to a preset threshold, judging the public sentiment to be identified as negative public sentiment.
2. The method according to claim 1, characterized in that, for any target keyword among the negative keywords, determining the negative score of the public sentiment to be identified according to the negative keywords and the frequency of occurrence comprises:
Determining the negative sub-score contributed by the target keyword to the public sentiment to be identified according to the negative weight score of the target keyword and the frequency of occurrence of the target keyword;
Calculating the negative score of the public sentiment to be identified according to the negative sub-score of each negative keyword.
3. The method according to claim 2, characterized in that the negative weight score of the target keyword is determined as follows:
Matching the target keyword against a first sample set, where the first sample set includes multiple identified items of negative public sentiment and multiple identified items of non-negative public sentiment;
Determining the negative weight score of the target keyword according to the posterior conditional probability that public sentiment in the first sample set containing the target keyword is negative public sentiment, and the prior probability of negative public sentiment in the first sample set.
4. The method according to claim 1, characterized in that the preset threshold is determined as follows:
Obtaining a second sample set, where the second sample set includes multiple identified items of negative public sentiment and multiple identified items of non-negative public sentiment, together with the negative scores of these items;
According to the scoring model, identifying negative public sentiment among the public sentiment in the second sample set using different recognition thresholds;
If the degree of conformity between the recognition result under a target recognition threshold and the actual result of the second sample set meets a preset condition, using the target recognition threshold as the preset threshold.
5. The method according to any one of claims 1-4, characterized in that, before determining the negative keywords appearing in the public sentiment to be identified and the frequency of occurrence of each negative keyword, the method further comprises:
Obtaining a public sentiment set including multiple items of pending public sentiment;
Filtering the public sentiment set according to the titles of the multiple items of pending public sentiment;
Using any item of pending public sentiment that remains after filtering as the public sentiment to be identified.
6. The method according to any one of claims 1-4, characterized in that, for any target keyword among the negative keywords, determining the negative keywords appearing in the public sentiment to be identified and the frequency of occurrence of each negative keyword comprises:
According to the text position at which the target keyword is matched in the public sentiment to be identified, determining the contextual information that includes the target keyword;
Identifying the semantic tendency expressed by the contextual information;
If the semantic tendency is positive, determining that the target keyword is not matched at that text position.
7. A negative public sentiment judgment device, characterized in that the device comprises a determination unit, a computing unit and a judging unit:
The determination unit is configured to determine the negative keywords appearing in the public sentiment to be identified and the frequency of occurrence of each negative keyword;
The computing unit is configured to determine the negative score of the public sentiment to be identified according to the negative keywords and the frequency of occurrence of each negative keyword;
The judging unit is configured to judge the public sentiment to be identified as negative public sentiment if its negative score is greater than or equal to a preset threshold.
8. The device according to claim 7, characterized in that, for any target keyword among the negative keywords, the computing unit is further configured to:
Determine the negative sub-score contributed by the target keyword to the public sentiment to be identified according to the negative weight score of the target keyword and the frequency of occurrence of the target keyword;
Calculate the negative score of the public sentiment to be identified according to the negative sub-score of each negative keyword.
9. The device according to claim 8, characterized in that the computing unit is further configured to determine the negative weight score of the target keyword as follows:
Match the target keyword against a first sample set, where the first sample set includes multiple identified items of negative public sentiment and multiple identified items of non-negative public sentiment;
Determine the negative weight score of the target keyword according to the posterior conditional probability that public sentiment in the first sample set containing the target keyword is negative public sentiment, and the prior probability of negative public sentiment in the first sample set.
10. The device according to claim 7, characterized in that the computing unit is further configured to determine the preset threshold as follows:
Obtain a second sample set, where the second sample set includes multiple identified items of negative public sentiment and multiple identified items of non-negative public sentiment, together with the negative scores of these items;
According to the scoring model, identify negative public sentiment among the public sentiment in the second sample set using different recognition thresholds;
If the degree of conformity between the recognition result under a target recognition threshold and the actual result of the second sample set meets a preset condition, use the target recognition threshold as the preset threshold.
CN201811518710.8A 2018-12-12 2018-12-12 A kind of negative public sentiment judgment method and device Pending CN109614551A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190412