CN108595564A - Media friendliness assessment method, device, and computer-readable storage medium

Publication number
CN108595564A
Authority
CN
China
Prior art keywords
media
friendliness
text
index
related text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810330401.1A
Other languages
Chinese (zh)
Other versions
CN108595564B (en)
Inventor
纪其进
马斌
陆宇杰
吕博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhongan Information Technology Service Co ltd
Original Assignee
Zhongan Information Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongan Information Technology Service Co Ltd
Priority to CN201810330401.1A
Publication of CN108595564A
Application granted
Publication of CN108595564B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a media friendliness assessment method, a device, and a computer-readable storage medium, and relates to the field of public opinion analysis. The method includes: obtaining multiple related texts, published by a target media outlet, that relate to a public opinion object; performing sentiment polarity analysis on each related text to obtain its sentiment polarity value; converting the sentiment polarity value of each related text into a corresponding media friendliness index, and forming a media friendliness index time series; smoothing the time series to obtain a smoothed media friendliness index time series; and assessing the media friendliness of the target media outlet based on the smoothed time series. Embodiments of the invention convert text-level sentiment into a long-term media attitude, and thereby allow an accurate assessment of the target media outlet's long-term, stable attitude toward the public opinion object.

Description

Media friendliness assessment method, device, and computer-readable storage medium
Technical Field
The present invention relates to the field of public opinion analysis, and in particular to a media friendliness assessment method, a device, and a computer-readable storage medium.
Background
Traditional media, opinion leaders, and the general public all publish opinionated remarks and viewpoints on hot topics and focal issues in real life. Remarks and viewpoints published about an event or phenomenon are called public opinion; the holders and publishers of those remarks and viewpoints are called public opinion subjects; the object that the subjects pay attention to and comment on is the public opinion object; and the media in which the remarks and viewpoints are published (including online media and traditional print media) are public opinion carriers. Public opinion monitoring is the practice of monitoring, analyzing, and predicting the remarks and viewpoints that public opinion subjects publish about a public opinion object.
Media are characterized by professional news gathering and editing, public distribution, and a wide audience. They are the main source from which people obtain information and have an important influence on their audience, so they are a main target of public opinion monitoring. As the internet has become widespread, the network has become a basic tool for work and life, and media are moving online. The influence of emerging online media keeps growing, and most traditional print media also publish online editions. Moving online has accelerated the pace of media publication and dissemination, and the large volume of online public opinion poses new challenges and opportunities for public opinion monitoring and analysis. From a technical perspective, online public opinion monitoring combines internet information collection with intelligent information processing: it automatically crawls massive amounts of internet information, classifies and clusters it, detects topics, and focuses on special subjects, so as to meet users' needs for public opinion monitoring and news topic tracking and to produce analysis results such as briefings, reports, and charts.
Public opinion content conveys not only the public opinion subject's view of the public opinion object but also the subject's subjective feelings and emotions. Text sentiment analysis, also known as opinion mining, is the process of analyzing, processing, summarizing, and reasoning about subjective text that carries emotional color. Early sentiment analysis mainly analyzed individual words with emotional color; for example, "fine" is a commendatory word and "ugly" is a derogatory word. Researchers later moved from the analysis of individual sentiment words to the more complex study of sentiment at the sentence level and the document level. In general, the polarity of subjective text is divided into two classes (commendatory and derogatory) or three classes (commendatory, neutral, and derogatory).
Text sentiment analysis targets a single document, and the sentiment orientation of a single document alone cannot reflect a media outlet's long-term attitude toward a public opinion object. The goal of media friendliness assessment is to observe a media outlet's long-term, stable attitude toward a public opinion object (for example, a company). Knowing media friendliness helps identify media outlets with different leanings so that each can be given an appropriate degree of attention.
However, the inventors found that the prior art does not disclose any research results on assessing and tracking media friendliness.
Summary of the Invention
To solve the problems in the prior art, embodiments of the present invention provide a media friendliness assessment method and device for assessing a media outlet's friendliness toward a public opinion object, so as to understand its long-term, stable attitude.
The specific technical solutions provided by the embodiments of the present invention are as follows:
In a first aspect, a media friendliness assessment method is provided, the method comprising the steps of:
obtaining multiple related texts, published by a target media outlet, that relate to a public opinion object;
performing sentiment polarity analysis on each related text among the multiple related texts to obtain the sentiment polarity value of each related text;
converting the sentiment polarity value of each related text into the media friendliness index corresponding to that related text, and forming a media friendliness index time series;
smoothing the media friendliness index time series to obtain a smoothed media friendliness index time series;
assessing the media friendliness of the target media outlet based on the smoothed media friendliness index time series.
With reference to the first aspect, in a first possible implementation, the method further includes:
continuously obtaining the multiple related texts;
continuously assessing and updating the media friendliness of the target media outlet at a preset time window, based on the continuously obtained related texts.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation, the step of obtaining multiple related texts, published by the target media outlet, that relate to the public opinion object further comprises:
obtaining the set of media texts published by the target media outlet;
generating a first vector space model for each media text in the set of media texts; and
generating a second vector space model for the public opinion object;
calculating the similarity between each first vector space model and the second vector space model;
if the calculated similarity is greater than a preset threshold, determining that the media text is a related text for the public opinion object.
With reference to the first aspect, in a third possible implementation, the step of performing sentiment polarity analysis on each related text among the multiple related texts further comprises:
performing sentiment polarity analysis on each related text using sentiment-dictionary-based sentiment classification or a machine learning algorithm.
With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation, the sentiment polarities include positive sentiment, neutral sentiment, and negative sentiment, and the step of converting the sentiment polarity value of each related text into the corresponding media friendliness index further comprises:
calculating the media friendliness index corresponding to each related text from a predefined media friendliness index formula and the sentiment polarity value of that related text;
wherein the predefined media friendliness index formula is either of the following:
$F_r = p + x - n$;
$F_s = p - x - n$;
where $F_r$ is the loose media friendliness index, $F_s$ is the strict media friendliness index, $p$ is the probability of positive sentiment, $x$ is the probability of neutral sentiment, $n$ is the probability of negative sentiment, $0 \le p, x, n \le 1$, $p + x + n = 1$, and $-1 \le F_r, F_s \le 1$.
With reference to the first aspect, in a fifth possible implementation, before the step of forming the media friendliness index time series, the method further includes:
averaging the media friendliness indices of the related texts by time window to obtain the mean media friendliness index for each time window;
the step of forming the media friendliness index time series further comprises:
forming the media friendliness index time series from the means corresponding to the multiple time windows.
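The window averaging described above can be sketched as follows. This is only an illustration under stated assumptions: the one-week calendar window, the function name `weekly_means`, and the `(date, index)` input format are hypothetical and are not specified by the patent.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_means(scored_texts):
    """Average per-text friendliness indices within each calendar week.

    scored_texts: iterable of (publication_date, friendliness_index) pairs.
    Returns a list of (week_start, mean_index) sorted by week_start.
    The weekly window is an assumption; any window length could be used.
    """
    buckets = defaultdict(list)
    for day, index in scored_texts:
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        buckets[week_start].append(index)
    return [(start, sum(v) / len(v)) for start, v in sorted(buckets.items())]


# Hypothetical per-text indices for three articles.
scores = [
    (date(2018, 4, 2), 0.5),
    (date(2018, 4, 4), -0.5),
    (date(2018, 4, 10), 1.0),
]
series = weekly_means(scores)
```

Each element of `series` is one point of the media friendliness index time series, so windows with many articles and windows with few contribute one value each.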
With reference to the first aspect, or the first or fifth possible implementation of the first aspect, in a sixth possible implementation, the step of smoothing the media friendliness index time series to obtain the smoothed media friendliness index time series further comprises:
smoothing the media friendliness index time series using the moving average method or the weighted moving average method.
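A minimal sketch of the two smoothing options follows. The patent names the methods but not their parameters, so the window length, the weights, and the handling of the first few points (a shorter trailing window) are assumptions made here for illustration.

```python
def moving_average(series, window):
    """Simple moving average over a trailing window.

    Early positions use as many values as are available (an assumption).
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def weighted_moving_average(series, weights):
    """Weighted moving average; weights[-1] applies to the most recent value."""
    if not weights:
        raise ValueError("weights must not be empty")
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - len(weights) + 1): i + 1]
        w = weights[-len(chunk):]  # align weights with the available values
        smoothed.append(sum(v * k for v, k in zip(chunk, w)) / sum(w))
    return smoothed
```

Giving larger weights to recent values makes the smoothed series react faster to a change in the outlet's tone, at the cost of less noise suppression.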
With reference to the first aspect or the first possible implementation of the first aspect, in a seventh possible implementation, after the step of assessing the media friendliness of the target media outlet based on the smoothed media friendliness index time series, the method further includes:
predicting the future media friendliness index of the target media outlet according to the assessment result of its media friendliness.
In a second aspect, a media friendliness assessment device is provided, the device comprising:
a text acquisition module, configured to obtain multiple related texts, published by a target media outlet, that relate to a public opinion object;
a sentiment analysis module, configured to perform sentiment polarity analysis on each related text among the multiple related texts to obtain the sentiment polarity value of each related text;
an index conversion module, configured to convert the sentiment polarity value of each related text into the media friendliness index corresponding to that related text;
a sequence construction module, configured to form a media friendliness index time series;
a smoothing module, configured to smooth the media friendliness index time series to obtain a smoothed media friendliness index time series;
a friendliness assessment module, configured to assess the media friendliness of the target media outlet based on the smoothed media friendliness index time series.
With reference to the second aspect, in a first possible implementation, the text acquisition module is specifically configured to:
continuously obtain the multiple related texts;
and the friendliness assessment module is specifically configured to:
continuously assess and update the media friendliness of the target media outlet at a preset time window, based on the continuously obtained related texts.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation, the text acquisition module is specifically configured to:
obtain the set of media texts published by the target media outlet;
generate a first vector space model for each media text in the set of media texts; and
generate a second vector space model for the public opinion object;
calculate the similarity between each first vector space model and the second vector space model;
if the calculated similarity is greater than a preset threshold, determine that the media text is a related text for the public opinion object.
With reference to the second aspect, in a third possible implementation, the sentiment analysis module is specifically configured to:
perform sentiment polarity analysis on each related text using sentiment-dictionary-based sentiment classification or a machine learning algorithm.
With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation, the sentiment polarities include positive sentiment, neutral sentiment, and negative sentiment, and the index conversion module is specifically configured to:
calculate the media friendliness index corresponding to each related text from a predefined media friendliness index formula and the sentiment polarity value of that related text;
wherein the predefined media friendliness index formula is either of the following:
$F_r = p + x - n$;
$F_s = p - x - n$;
where $F_r$ is the loose media friendliness index, $F_s$ is the strict media friendliness index, $p$ is the probability of positive sentiment, $x$ is the probability of neutral sentiment, $n$ is the probability of negative sentiment, $0 \le p, x, n \le 1$, $p + x + n = 1$, and $-1 \le F_r, F_s \le 1$.
With reference to the second aspect, in a fifth possible implementation, the device further includes:
an averaging module, configured to average the media friendliness indices of the related texts by time window to obtain the mean media friendliness index for each time window;
the sequence construction module is further configured to form the media friendliness index time series from the means corresponding to the multiple time windows.
With reference to the second aspect, or the first or fifth possible implementation of the second aspect, in a sixth possible implementation, the smoothing module is specifically configured to:
smooth the media friendliness index time series using the moving average method or the weighted moving average method.
With reference to the second aspect or the first possible implementation of the second aspect, in a seventh possible implementation, the device further includes:
an index prediction module, configured to predict the future media friendliness index of the target media outlet according to the assessment result of its media friendliness.
In a third aspect, a media friendliness assessment device is provided, the device comprising:
one or more processors;
a memory;
a program stored in the memory which, when executed by the one or more processors, causes the processors to perform the steps of the media friendliness assessment method according to any implementation of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores a program which, when executed by a processor, causes the processor to perform the steps of the media friendliness assessment method according to any implementation of the first aspect.
Embodiments of the present invention provide a media friendliness assessment method, a device, and a computer-readable storage medium. Starting from sentiment analysis of each related text published by the target media outlet, the sentiment polarity value of each related text is converted into a corresponding media friendliness index, the indices are formed into a media friendliness index time series, the time series is smoothed, and the smoothed series is used to assess the media friendliness of the target media outlet. This converts text-level sentiment into a media attitude and allows an accurate assessment of the target media outlet's long-term, stable attitude toward the public opinion object.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a media friendliness assessment method provided by an embodiment of the present invention;
Fig. 2 shows the result of smoothing a media friendliness index time series in an embodiment of the present invention;
Fig. 3 is a block diagram of a media friendliness assessment device provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a media friendliness assessment method. Starting from sentiment analysis of each related text published by the target media outlet, the method converts the sentiment polarity value of each related text into a corresponding media friendliness index, forms a media friendliness index time series, smooths the time series, and uses the smoothed series to assess the media friendliness of the target media outlet. This converts text-level sentiment into a long-term media attitude and allows an accurate assessment of the target media outlet's long-term, stable attitude toward the public opinion object. The method provided in this embodiment can be applied to various devices, such as desktop computers, personal computers, mobile terminals, and servers; the present invention does not limit this.
Referring to Fig. 1, an embodiment of the present invention provides a media friendliness assessment method comprising the following steps:
S10: obtain multiple related texts, published by a target media outlet, that relate to a public opinion object.
S20: perform sentiment polarity analysis on each related text among the multiple related texts to obtain the sentiment polarity value of each related text.
S30: convert the sentiment polarity value of each related text into the media friendliness index corresponding to that related text, and form a media friendliness index time series.
S40: smooth the media friendliness index time series to obtain a smoothed media friendliness index time series.
S50: assess the media friendliness of the target media outlet based on the smoothed media friendliness index time series.
In this method, sentiment analysis of each related text published by the target media outlet is the starting point: the sentiment polarity value of each related text is converted into a corresponding media friendliness index, the indices form a media friendliness index time series, the series is smoothed, and the smoothed series is used to assess the media friendliness of the target media outlet. This converts text-level sentiment into a media attitude and allows an accurate assessment of the target media outlet's friendliness toward the public opinion object.
In some embodiments of the invention, the method further includes:
continuously obtaining multiple related texts;
continuously assessing and updating the media friendliness of the target media outlet at a preset time window, based on the continuously obtained related texts.
The preset time window can be set according to the time scale of the observation, for example one week; the present invention does not limit this.
Specifically, for the continuously obtained related texts, the sentiment polarity value of each newly obtained related text is computed and converted into the corresponding media friendliness index, the previously formed media friendliness index time series is updated, and the updated series is smoothed. The media friendliness of the target media outlet is then continuously assessed and updated based on the smoothed series.
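The update loop described above might look like the sketch below. The class name `FriendlinessTracker`, the trailing-mean assessment, and the default window of four points are hypothetical choices for illustration; the patent only requires that the series be extended, re-smoothed, and re-assessed.

```python
class FriendlinessTracker:
    """Keeps the index time series and re-assesses friendliness on every update."""

    def __init__(self, window=4):
        self.window = window  # number of most recent points to average (assumed)
        self.series = []      # media friendliness index time series so far

    def update(self, new_indices):
        """Append newly computed indices and return the updated assessment.

        The assessment here is the mean of the last `window` points, which is
        one simple way to read off the smoothed series.
        """
        self.series.extend(new_indices)
        if not self.series:
            raise ValueError("no friendliness indices available yet")
        recent = self.series[-self.window:]
        return sum(recent) / len(recent)
```

Calling `update` once per preset time window (for example weekly) with the indices of the texts collected in that window keeps the assessment current without reprocessing older texts.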
In this embodiment, the media friendliness of the target media outlet is continuously assessed and updated based on the continuously obtained related texts that the outlet publishes about the public opinion object. This converts text-level sentiment into a long-term media attitude and allows an accurate assessment of the target media outlet's long-term, stable attitude toward the public opinion object.
In some embodiments of the invention, step S10 further comprises:
Step S101: obtain the set of media texts published by the target media outlet.
The media texts published by the target media outlet are collected in chronological order, and the set of media texts is preprocessed, for example by applying a word segmentation tool to perform Chinese word segmentation and part-of-speech tagging.
Step S102: generate a first vector space model for each media text in the set of media texts, and generate a second vector space model for the public opinion object.
The process of step S102 may include:
Step S1021: represent each media document and the public opinion object as vectors.
A media document $D_j$ can be represented as the vector $D_j = (w_{1j}, w_{2j}, \ldots, w_{nj})$, where $n$ is the number of terms in the system and $w_{ij}$ is the weight of keyword $i$ in document $D_j$.
The public opinion object query $Q$ can be represented as the vector $Q = (w_{1k}, w_{2k}, \ldots, w_{nk})$, where $w_{ik}$ is the weight of term $i$ in query $Q$.
Step S1022: calculate the weights in each vector space model using TF*IDF, that is, the weight of keyword $i$ in document $D_j$ is $w_{ij} = TF_{ij} \times IDF_i$. The calculation proceeds as follows:
TF weight: TF (term frequency) is the number of times a term occurs in a document. The TF weight is either the raw $TF_{ij}$ or a normalized TF value.
TF normalization: the TF values of all index terms in a document are normalized to the range [0, 1], usually in one of the following ways:
i) $w_{tf} = TF / \max(TF)$
ii) $w_{tf} = a + (1 - a) \cdot TF / \max(TF)$, where $a$ is an adjustment factor with the empirical value $a = 0.5$ or $a = 0.4$.
Document frequency (DF) of a term: the number of documents in the whole collection in which the term occurs. DF reflects how discriminative a term is: the higher the DF, the more common the term, so the lower its discriminative power and the lower its weight.
Inverse document frequency (IDF): the inverse of DF, generally computed from DF and $N$, where $N$ is the number of documents in the collection.
Weight calculation: the weight of keyword $i$ in document $D_j$ is $w_{ij} = TF_{ij} \times IDF_i$.
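The weighting scheme above can be sketched in a few lines. The IDF formula did not survive in the source text, so this sketch assumes the common form $IDF_i = \log(N / DF_i)$; it also uses normalization option i) for TF. The function name `tf_idf_vectors` and the token-list input format are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build TF*IDF weight vectors for tokenized documents.

    docs: list of token lists (already segmented).
    Returns (vocabulary, vectors), where vectors[j][i] is the weight of
    vocabulary[i] in document j.
    Assumptions: TF is normalized as TF / max(TF); IDF is log(N / DF).
    """
    n_docs = len(docs)
    vocab = sorted({t for doc in docs for t in doc})
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        max_tf = max(counts.values()) if counts else 1
        vectors.append([counts[t] / max_tf * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical tokenized documents.
docs = [["insurer", "praised", "insurer"], ["market", "praised"]]
vocab, vectors = tf_idf_vectors(docs)
```

With this IDF, a term that occurs in every document (here `praised`) receives weight 0, which matches the observation that very common terms have little discriminative power.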
Step S103: calculate the similarity between each first vector space model and the second vector space model.
The relevance (that is, the similarity) between a document and a query can be determined from the relative positions of their vectors in the vector space. There are many similarity functions; a commonly used one is the cosine of the angle between the two vectors. The similarity between media document $D_j$ and the public opinion object query $Q$ is therefore given by:
$$\mathrm{sim}(D_j, Q) = \frac{\sum_{i=1}^{n} w_{ij}\, w_{ik}}{\sqrt{\sum_{i=1}^{n} w_{ij}^{2}}\ \sqrt{\sum_{i=1}^{n} w_{ik}^{2}}}$$
Step S104: if the calculated similarity is greater than the preset threshold, determine that the media text is a related text for the public opinion object.
A threshold for query relevance can be set. When the calculated similarity is higher than the threshold, the document is considered related, and its sentiment orientation toward the public opinion object is then analyzed further.
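Steps S103 and S104 can be sketched as a cosine similarity function plus a threshold filter. The function names and the example threshold of 0.5 are hypothetical; the patent leaves the threshold value open. Returning 0.0 for a zero vector is also an assumption made to avoid division by zero.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention: an empty vector matches nothing
    return dot / (norm_a * norm_b)


def select_related(doc_vectors, query_vector, threshold):
    """Return indices of documents whose similarity to the query exceeds threshold."""
    return [
        j
        for j, vec in enumerate(doc_vectors)
        if cosine_similarity(vec, query_vector) > threshold
    ]


# Hypothetical two-term vocabulary: the query matches the first term only.
related = select_related([[1, 0], [0, 1], [1, 1]], [1, 0], threshold=0.5)
```

Raising the threshold trades recall for precision: fewer off-topic articles enter the sentiment analysis, but borderline mentions of the public opinion object may be dropped.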
In this embodiment, the same media outlet usually publishes a wide variety of content, while public opinion analysis is concerned only with the public opinion object of interest. Media friendliness assessment therefore needs to select, from everything the outlet publishes, the reports that relate to the public opinion object. If the public opinion object can be described by a set of keywords, selecting related texts becomes a keyword-based text retrieval problem. The present invention selects texts related to the public opinion object using the vector space model, which balances implementation difficulty against effectiveness. In practical applications, the selection can also be implemented with other retrieval models, for example the Boolean model or a probabilistic model, in addition to the vector space model; the present invention does not limit the specific process of obtaining texts related to the public opinion object.
In some embodiments of the invention, step S20 further comprises:
performing three-polarity sentiment analysis on each related text. The analysis results are $p$, $x$, and $n$, with $0 \le p, x, n \le 1$ and $p + x + n = 1$, where $p$ is the probability of positive sentiment, $x$ is the probability of neutral sentiment, and $n$ is the probability of negative sentiment.
In one embodiment, sentiment polarity analysis can be performed on each related text using sentiment-dictionary-based sentiment classification. Specifically, the process may include:
Step S201: select and refine a sentiment dictionary. The sentiment dictionary contains multiple sentiment words, each with a score indicating the strength of its sentiment polarity.
Step S202: score sentiment words and phrases, taking sentiment intensifiers and sentiment negators into account. A sentiment intensifier is a word that modifies the degree of a sentiment word and thus indicates the strength of the sentiment. A sentiment negator is a word that negates a sentiment word, such as "not", "non-", "never", "no", "cannot", "without", and "fail to".
Step S203: accumulate the sentiment scores for each polarity in the document, and calculate the probability of each polarity occurring in the document.
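Steps S201 to S203 can be illustrated with a toy lexicon. Everything concrete here is hypothetical: the word lists, the scores, and in particular the way scores are turned into probabilities (each sentence is labeled positive, neutral, or negative, and the probabilities are the fractions of sentences per label). The patent does not fix that conversion.

```python
# Hypothetical sentiment dictionary: word -> signed polarity strength.
LEXICON = {"good": 1.0, "excellent": 2.0, "bad": -1.0, "terrible": -2.0}
# Hypothetical intensifiers: word -> multiplier applied to the next sentiment word.
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
# Hypothetical negators: flip the sign of the next sentiment word.
NEGATORS = {"not", "never", "no"}


def sentence_score(tokens):
    """Signed sentiment score of one tokenized sentence."""
    score, boost, negate = 0.0, 1.0, False
    for tok in tokens:
        if tok in NEGATORS:
            negate = not negate
        elif tok in INTENSIFIERS:
            boost *= INTENSIFIERS[tok]
        elif tok in LEXICON:
            value = LEXICON[tok] * boost
            score += -value if negate else value
            boost, negate = 1.0, False  # modifiers apply to one sentiment word
    return score


def polarity_probabilities(sentences):
    """Return (p, x, n): fractions of positive, neutral, and negative sentences."""
    if not sentences:
        raise ValueError("document has no sentences")
    scores = [sentence_score(s) for s in sentences]
    pos = sum(1 for s in scores if s > 0)
    neg = sum(1 for s in scores if s < 0)
    total = len(scores)
    return pos / total, (total - pos - neg) / total, neg / total
```

By construction the three values are non-negative and sum to 1, so they satisfy the constraints on $p$, $x$, and $n$ stated above.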
In this embodiment, three-polarity sentiment analysis is performed on each related text using sentiment-dictionary-based sentiment classification to obtain the sentiment polarity value of each related text. Estimating media friendliness from the sentiment analysis results of individual documents ensures that the assessment rests on a reliable and feasible basis.
In another embodiment, sentiment polarity analysis can be performed on each related text using a machine learning algorithm. Specifically, the process may include:
Step S211, feature construction: use bag-of-words (n-gram) features, part of speech, sentiment words and sentiment phrases, sentiment shifters, and the like as features.
Step S212, machine learning: train classifiers such as naive Bayes, maximum entropy, and SVM.
Step S213: use the trained classifier to perform sentiment polarity analysis on each related text and obtain the sentiment polarity value of each related text.
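As an illustration of steps S211 to S213, the sketch below implements a small multinomial naive Bayes classifier in plain Python. It is deliberately simplified: it uses unigram bag-of-words features only (no part-of-speech or sentiment-shifter features), Laplace smoothing, and a hypothetical class name `NaiveBayesSentiment` with hypothetical labels `pos`, `neu`, and `neg`. A production system would more likely rely on an established machine learning library.

```python
import math
from collections import Counter


class NaiveBayesSentiment:
    """Multinomial naive Bayes over unigram counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n_docs = len(docs)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for tokens in docs for t in tokens}
        return self

    def predict_proba(self, tokens):
        """Return {label: probability}; words unseen in training are ignored."""
        log_scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            logp = math.log(self.class_counts[c] / self.n_docs)
            for t in tokens:
                if t in self.vocab:
                    logp += math.log(
                        (self.word_counts[c][t] + 1) / (total + len(self.vocab))
                    )
            log_scores[c] = logp
        peak = max(log_scores.values())  # subtract the max for numerical stability
        exp = {c: math.exp(s - peak) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


# Hypothetical training data: one tokenized text per polarity.
model = NaiveBayesSentiment().fit(
    [["great", "service"], ["announces", "report"], ["poor", "service"]],
    ["pos", "neu", "neg"],
)
probs = model.predict_proba(["great"])
```

The returned probabilities for `pos`, `neu`, and `neg` can serve directly as $p$, $x$, and $n$, since they are non-negative and sum to 1.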
It should be noted that although three-polarity sentiment analysis of each related text is preferred in this embodiment, the method provided by the embodiments of the present invention is not limited to it; a person skilled in the art may also perform two-polarity or multi-polarity sentiment analysis on each related text.
In this embodiment, three-polarity sentiment analysis is performed on each related text using a machine learning algorithm to obtain the sentiment polarity value of each related text. Estimating media friendliness from the sentiment analysis results of individual documents ensures that the assessment rests on a reliable and feasible basis.
In some embodiments of the present invention, the sentiment polarities include positive sentiment, neutral sentiment and negative sentiment, and step S30 further comprises:
Step S301: calculate the media friendliness index corresponding to each related text according to a predefined media friendliness index formula and the sentiment polarity value of each related text;
wherein the predefined media friendliness index formula is either of the following:
$F_r = p + x - n$;
$F_s = p - x - n$;
where $F_r$ is the loose media friendliness index, $F_s$ is the strict media friendliness index, $p$ is the probability of positive sentiment, $x$ is the probability of neutral sentiment, and $n$ is the probability of negative sentiment, with $0 \le p, x, n \le 1$, $p + x + n = 1$ and $-1 \le F_r, F_s \le 1$.
The loose and strict definitions differ in how they treat neutral sentiment: the loose friendliness index counts neutral sentiment as positive, whereas the strict friendliness index counts it as negative.
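As a worked illustration of the two formulas, the following Python sketch computes the loose index p + x - n and the strict index p - x - n from one polarity value. The function name and the input check are illustrative assumptions.

```python
def friendliness_index(p, x, n, strict=False):
    """Return the strict index p - x - n if strict, else the loose index p + x - n."""
    if min(p, x, n) < 0 or abs(p + x + n - 1.0) > 1e-9:
        raise ValueError("p, x and n must be non-negative and sum to 1")
    return p - x - n if strict else p + x - n


# Example polarity value: 50% positive, 30% neutral, 20% negative.
loose = friendliness_index(0.5, 0.3, 0.2)
strict = friendliness_index(0.5, 0.3, 0.2, strict=True)
print(loose, strict)
```

Because p + x + n = 1, the loose index equals 1 - 2n and the strict index equals 2p - 1, which is why both stay within [-1, 1].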
Step S302: construct a media friendliness index time series from the media friendliness indices corresponding to the related texts.
Each element of the media friendliness index time series represents the sentiment orientation of one text published by the medium toward the public opinion object.
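Step S302 can be sketched as ordering per-text indices by publication date. The pair layout (publication date, index) and the sample values are assumptions for illustration; the text does not say how publication times are stored.

```python
from datetime import date


def build_index_series(scored_texts):
    """Sort (publication_date, friendliness_index) pairs chronologically."""
    return sorted(scored_texts, key=lambda item: item[0])


# Invented sample: one entry per related text.
scored = [
    (date(2018, 3, 2), 0.4),
    (date(2018, 1, 15), -0.2),
    (date(2018, 2, 7), 0.1),
]
series = build_index_series(scored)
values = [index for _, index in series]
print(values)
```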
In this embodiment of the present invention, because the sentiment expressed by a single document cannot reflect the long-term attitude of a medium, the sentiment polarity value of each related text is converted into the corresponding media friendliness index, and these indices are assembled into a media friendliness index time series. The time series realizes the conversion from document sentiment to the long-term attitude of the medium, which makes it possible to assess and track the media friendliness of the target medium toward the public opinion object.
In some embodiments of the present invention, step S40 further comprises:
smoothing the media friendliness index time series using the moving average method or the weighted moving average method.
Smoothing the media friendliness index time series with the moving average method may include the following.
Let the time series be $y_1, y_2, \ldots, y_t$. The simple moving average is calculated as:
$$M_t = \frac{y_t + y_{t-1} + \cdots + y_{t-N+1}}{N}, \quad t \ge N$$
where $M_t$ is the moving average of period $t$ and $N$ is the number of terms in the moving average.
When $N$ is large, the amount of calculation can be greatly reduced by using the recurrence formula $M_t = M_{t-1} + (y_t - y_{t-N})/N$.
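The simple moving average and its recurrence can be sketched in Python as follows. The recurrence used here is the standard one, M_t = M_{t-1} + (y_t - y_{t-N})/N, which avoids re-summing N terms for every period; the function name and sample series are illustrative.

```python
def moving_average(series, window):
    """Simple moving average over `window` terms, updated by recurrence."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    current = sum(series[:window]) / window  # first average, M_N
    result = [current]
    for t in range(window, len(series)):
        # Add the newest value and drop the oldest one in the window.
        current += (series[t] - series[t - window]) / window
        result.append(current)
    return result


print(moving_average([1, 2, 3, 4, 5], 3))
```

The output has len(series) - window + 1 entries, one for each period from N onward.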
Because the simple moving average does not account for the different contributions of different documents to media friendliness, the weighted moving average method can be used to address this.
Smoothing the media friendliness index time series with the weighted moving average method may include the following.
The general expression of the weighted moving average is:
$$M_{tw} = \frac{w_1 y_t + w_2 y_{t-1} + \cdots + w_N y_{t-N+1}}{w_1 + w_2 + \cdots + w_N}, \quad t \ge N$$
where $M_{tw}$ is the weighted moving average of period $t$, and $w_i$ is the weight of $y_{t-i+1}$, reflecting the importance of the corresponding value in the weighted average.
The choice of weights is somewhat empirical. The general principle is that recent data receive large weights and older data receive small weights.
The weights can be expressed by the time decay function $f(n) = A + e^{-\mu n}$, where $A \in \mathbb{Z}$ and $A > 1$, $\mu \in (0, 1)$, and $n = 1, 2, \ldots, N$ is the position in the sentiment time series.
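A minimal Python sketch of the weighted moving average with decay weights f(n) = A + e^(-μn) follows. It assumes that n = 1 refers to the most recent value in the window, so that recent data receive the larger weights; the values A = 2 and μ = 0.5 are arbitrary examples within the stated ranges.

```python
import math


def decay_weight(n, a, mu):
    """Time decay weight f(n) = A + exp(-mu * n)."""
    return a + math.exp(-mu * n)


def weighted_moving_average(series, window, a=2, mu=0.5):
    """Weighted moving average; the newest value in each window gets weight f(1)."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    weights = [decay_weight(i, a, mu) for i in range(1, window + 1)]
    total = sum(weights)
    result = []
    for t in range(window - 1, len(series)):
        # weights[0] pairs with y_t, weights[1] with y_(t-1), and so on.
        acc = sum(w * series[t - i] for i, w in enumerate(weights))
        result.append(acc / total)
    return result


print(weighted_moving_average([0.0, 0.0, 1.0, 1.0], 3))
```

Because the weights decrease with n, a recent jump in the series moves the weighted average more than it moves the simple average.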
In the public opinion application scenario of this embodiment, news documents are divided into original articles and reprints. An original document is usually the first publication and best reflects the medium's attitude toward the public opinion object, whereas the attitude reflected by a reprint is much weaker. To reflect this difference, the parameter A of the document-weight decay function takes different values; in practice, A = 1 may be defined for original documents and A = 0.5 for non-original documents. The parameter μ may initially take a random value within its range and, once enough historical data has accumulated, may be estimated from the historical data; the fitting can be implemented in Python or MATLAB code.
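One possible way to estimate μ from historical data is sketched below in Python. The fitting criterion (a grid search that minimizes the squared one-step-ahead prediction error) is an assumption chosen for illustration, since the text does not state how the fit is performed; the sample history is invented. The per-document values A = 1 and A = 0.5 follow the example given above.

```python
import math

A_ORIGINAL = 1.0  # value given in the text for original documents
A_REPRINT = 0.5   # value given in the text for non-original documents


def weighted_average(values, is_original, mu):
    """Weighted mean of recent indices; values[-1] is the newest (n = 1)."""
    num = den = 0.0
    pairs = zip(reversed(values), reversed(is_original))
    for n, (y, original) in enumerate(pairs, start=1):
        weight = (A_ORIGINAL if original else A_REPRINT) + math.exp(-mu * n)
        num += weight * y
        den += weight
    return num / den


def fit_mu(values, is_original, window, candidates=None):
    """Pick mu in (0, 1) minimising the squared one-step-ahead prediction error."""
    if candidates is None:
        candidates = [k / 100 for k in range(1, 100)]
    best_mu, best_err = None, float("inf")
    for mu in candidates:
        err = 0.0
        for t in range(window, len(values)):
            pred = weighted_average(values[t - window:t], is_original[t - window:t], mu)
            err += (values[t] - pred) ** 2
        if err < best_err:
            best_mu, best_err = mu, err
    return best_mu


# Invented history of friendliness indices and original/reprint flags.
history = [0.1, 0.3, 0.2, 0.4, 0.5, 0.45, 0.6]
flags = [True, False, True, True, False, True, True]
print(fit_mu(history, flags, window=3))
```

A finer grid or a numerical optimizer could replace the grid search; the sketch only shows the shape of the estimation.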
Step S40 of this embodiment is further described with reference to Fig. 2. In Fig. 2, curve a is the media friendliness index time series before smoothing, curve b is the result of smoothing the time series with the moving average method, and curve c is the result of smoothing it with the weighted moving average method. As Fig. 2 shows, the media friendliness values obtained by smoothing the time series with either method are more stable.
In this embodiment of the present invention, the media friendliness index time series is smoothed, and the media friendliness of the target medium is assessed based on the smoothed time series. This makes the assessment of the target medium's media friendliness toward the public opinion object more stable, and thereby yields the target medium's long-term stable attitude toward the public opinion object.
In some embodiments of the present invention, before the media friendliness index time series is constructed in step S40, the method may further include:
averaging the media friendliness indices corresponding to the related texts by time window to obtain the mean media friendliness index within each time window;
the step of constructing the media friendliness index time series in step S40 further comprises:
constructing the media friendliness index time series from the multiple means corresponding to multiple time windows.
The smoothing result depends on the time scale. If a larger observation time scale is needed, a time scale such as a week, a month or a quarter can be selected. The media friendliness indices of the related texts falling within each time window are averaged to obtain the mean media friendliness index of that window; these means then form the media friendliness index time series, which is smoothed. In this way, media friendliness can be analyzed at larger time scales.
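The per-window averaging can be sketched in Python as follows. The window keys (ISO week, calendar month, calendar quarter) and the sample data are illustrative assumptions; the text only names week, month and quarter as possible time scales.

```python
from collections import defaultdict
from datetime import date


def window_key(day, scale):
    """Map a date to the identifier of its time window."""
    if scale == "week":
        iso = day.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if scale == "month":
        return (day.year, day.month)
    if scale == "quarter":
        return (day.year, (day.month - 1) // 3 + 1)
    raise ValueError("scale must be 'week', 'month' or 'quarter'")


def windowed_means(scored_texts, scale="month"):
    """Average the friendliness indices of all texts falling in each window."""
    buckets = defaultdict(list)
    for day, index in scored_texts:
        buckets[window_key(day, scale)].append(index)
    return [(key, sum(vals) / len(vals)) for key, vals in sorted(buckets.items())]


# Invented sample: (publication date, friendliness index) per related text.
data = [
    (date(2018, 1, 5), 0.2),
    (date(2018, 1, 20), 0.4),
    (date(2018, 2, 1), -0.2),
]
print(windowed_means(data, "month"))
```

The resulting list of window means can then be smoothed in the same way as a per-text series.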
In some embodiments of the present invention, after step S50, the method may further include:
predicting the future media friendliness index of the target medium according to the assessment result of the media friendliness of the target medium.
The prediction may use the weighted moving average method, with the prediction formula $\hat{y}_{t+1} = M_{tw}$; that is, the weighted moving average of period $t$ is used as the predicted value for period $t+1$.
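The one-step prediction can be sketched in Python by taking the weighted moving average of the latest values as the forecast for the next period. The function name and the explicit weight list are illustrative assumptions; the weights could equally come from a time decay function.

```python
def forecast_next(series, weights):
    """Predict the next index as the weighted average of the latest values.

    weights[0] applies to the newest value, weights[1] to the one before it, etc.
    """
    if not weights or len(weights) > len(series):
        raise ValueError("need between 1 and len(series) weights")
    recent = series[::-1][:len(weights)]  # newest value first
    return sum(w * y for w, y in zip(weights, recent)) / sum(weights)


print(forecast_next([0.0, 1.0], [3, 1]))
```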
The embodiments of the present invention are based on sentiment analysis of multiple related texts of the target medium. The media friendliness indices corresponding to individual texts are assembled into a media friendliness index time series, and the medium's long-term stable attitude toward the public opinion object is assessed and predicted by smoothing that time series, which ensures the accuracy of the assessment and prediction.
Referring to Fig. 3, an embodiment of the present invention further provides a media friendliness assessment device, which includes:
a text acquisition module 31, configured to obtain multiple related texts that are published by a target medium and related to a public opinion object;
a sentiment analysis module 32, configured to perform sentiment polarity analysis on each of the multiple related texts to obtain the sentiment polarity value of each related text;
an index conversion module 33, configured to convert the sentiment polarity value of each related text into the media friendliness index corresponding to that related text;
a sequence construction module 34, configured to construct a media friendliness index time series;
a smoothing module 35, configured to smooth the media friendliness index time series to obtain a smoothed media friendliness index time series;
a friendliness assessment module 36, configured to assess the media friendliness of the target medium based on the smoothed media friendliness index time series.
Further, the text acquisition module 31 is specifically configured to:
continuously obtain the multiple related texts;
and the friendliness assessment module 36 is specifically configured to:
continuously assess and update the media friendliness of the target medium at a preset time window, based on the continuously obtained related texts.
Further, the text acquisition module 31 is specifically configured to:
obtain the set of media texts published by the target medium;
generate a first vector space model for each media text in the set of media texts; and
generate a second vector space model for the public opinion object;
calculate the similarity between each first vector space model and the second vector space model;
if the calculated similarity is greater than a preset threshold, determine that the media text is a related text of the public opinion object.
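The related-text selection can be sketched in Python with term-count vectors and cosine similarity. Raw term counts, whitespace tokenization, the threshold of 0.2 and the sample texts are all assumptions made for illustration; the text above only requires a vector space model, a similarity measure and a preset threshold.

```python
import math
from collections import Counter


def term_vector(text):
    """Vector space model of a text as raw term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_related(media_texts, object_description, threshold=0.2):
    """Keep the media texts whose similarity to the object exceeds the threshold."""
    target = term_vector(object_description)
    return [t for t in media_texts if cosine_similarity(term_vector(t), target) > threshold]


# Invented sample texts and a hypothetical public opinion object.
texts = [
    "acme insurance launches new product",
    "weather forecast for tomorrow",
]
print(select_related(texts, "acme insurance"))
```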
Further, the sentiment analysis module 32 is specifically configured to:
perform sentiment polarity analysis on each related text based on sentiment-dictionary-based classification or a machine learning algorithm.
Further, the sentiment polarities include positive sentiment, neutral sentiment and negative sentiment, and the index conversion module 33 is specifically configured to:
calculate the media friendliness index corresponding to each related text according to a predefined media friendliness index formula and the sentiment polarity value of each related text;
wherein the predefined media friendliness index formula is either of the following:
$F_r = p + x - n$;
$F_s = p - x - n$;
where $F_r$ is the loose media friendliness index, $F_s$ is the strict media friendliness index, $p$ is the probability of positive sentiment, $x$ is the probability of neutral sentiment, and $n$ is the probability of negative sentiment, with $0 \le p, x, n \le 1$, $p + x + n = 1$ and $-1 \le F_r, F_s \le 1$.
Further, the device also includes:
an averaging module 37, configured to average the media friendliness indices corresponding to the related texts by time window to obtain the mean media friendliness index within each time window;
the sequence construction module 34 is further configured to construct the media friendliness index time series from the multiple means corresponding to multiple time windows.
Further, the smoothing module 35 is specifically configured to:
smooth the media friendliness index time series using the moving average method or the weighted moving average method.
Further, the device also includes:
an index prediction module 38, configured to predict the future media friendliness index of the target medium according to the assessment result of the media friendliness of the target medium.
The embodiment of the present invention provides a media friendliness assessment device. Based on sentiment analysis of each related text published by the target medium, the device converts the sentiment polarity value of each related text into the corresponding media friendliness index, constructs a media friendliness index time series, smooths that time series, and assesses the media friendliness of the target medium using the smoothed time series. This realizes the conversion from text sentiment to the long-term attitude of the medium, and thus an accurate assessment of the target medium's long-term stable attitude toward the public opinion object.
In addition, an embodiment of the present invention further provides a media friendliness assessment device, which includes:
one or more processors;
a memory;
a program stored in the memory which, when executed by the one or more processors, causes the processors to perform the steps of the media friendliness assessment method of any of the above embodiments.
Another embodiment of the present invention further provides a computer-readable storage medium storing a program which, when executed by a processor, causes the processor to perform the steps of the media friendliness assessment method of any of the above embodiments.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a device or a computer program product. Accordingly, the embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are performed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include them.

Claims (18)

1. A media friendliness assessment method, characterized in that the method comprises the steps of:
obtaining multiple related texts that are published by a target medium and related to a public opinion object;
performing sentiment polarity analysis on each of the multiple related texts to obtain the sentiment polarity value of each related text;
converting the sentiment polarity value of each related text into the media friendliness index corresponding to that related text, and constructing a media friendliness index time series;
smoothing the media friendliness index time series to obtain a smoothed media friendliness index time series;
assessing the media friendliness of the target medium based on the smoothed media friendliness index time series.
2. The method according to claim 1, characterized in that the method further comprises:
continuously assessing and updating the media friendliness of the target medium at a preset time window, based on the multiple related texts that are continuously obtained.
3. The method according to claim 1 or 2, characterized in that the step of obtaining multiple related texts that are published by a target medium and related to a public opinion object further comprises:
obtaining the set of media texts published by the target medium;
generating a first vector space model for each media text in the set of media texts; and
generating a second vector space model for the public opinion object;
calculating the similarity between each first vector space model and the second vector space model;
if the calculated similarity is greater than a preset threshold, determining that the media text is a related text of the public opinion object.
4. The method according to claim 1, characterized in that the step of performing sentiment polarity analysis on each of the multiple related texts further comprises:
performing sentiment polarity analysis on each related text based on sentiment-dictionary-based classification or a machine learning algorithm.
5. The method according to any one of claims 1 to 4, characterized in that the sentiment polarities include positive sentiment, neutral sentiment and negative sentiment, and the step of converting the sentiment polarity value of each related text into the media friendliness index corresponding to that related text further comprises:
calculating the media friendliness index corresponding to each related text according to a predefined media friendliness index formula and the sentiment polarity value of each related text;
wherein the predefined media friendliness index formula is either of the following:
$F_r = p + x - n$;
$F_s = p - x - n$;
where $F_r$ is the loose media friendliness index, $F_s$ is the strict media friendliness index, $p$ is the probability of positive sentiment, $x$ is the probability of neutral sentiment, and $n$ is the probability of negative sentiment, with $0 \le p, x, n \le 1$, $p + x + n = 1$ and $-1 \le F_r, F_s \le 1$.
6. The method according to claim 1, characterized in that before the step of constructing the media friendliness index time series, the method further comprises:
averaging the media friendliness indices corresponding to the related texts by time window to obtain the mean media friendliness index corresponding to each time window;
the step of constructing the media friendliness index time series further comprises:
constructing the media friendliness index time series from the multiple means corresponding to multiple time windows.
7. The method according to claim 1, 2 or 5, characterized in that the step of smoothing the media friendliness index time series to obtain a smoothed media friendliness index time series further comprises:
smoothing the media friendliness index time series using the moving average method or the weighted moving average method.
8. The method according to claim 1 or 2, characterized in that after the step of assessing the media friendliness of the target medium based on the smoothed media friendliness index time series, the method further comprises:
predicting the future media friendliness index of the target medium according to the assessment result of the media friendliness of the target medium.
9. A media friendliness assessment device, characterized in that the device comprises:
a text acquisition module, configured to obtain multiple related texts that are published by a target medium and related to a public opinion object;
a sentiment analysis module, configured to perform sentiment polarity analysis on each of the multiple related texts to obtain the sentiment polarity value of each related text;
an index conversion module, configured to convert the sentiment polarity value of each related text into the media friendliness index corresponding to that related text;
a sequence construction module, configured to construct a media friendliness index time series;
a smoothing module, configured to smooth the media friendliness index time series to obtain a smoothed media friendliness index time series;
a friendliness assessment module, configured to assess the media friendliness of the target medium based on the smoothed media friendliness index time series.
10. The device according to claim 9, characterized in that
the text acquisition module is specifically configured to:
continuously obtain the multiple related texts;
the friendliness assessment module is specifically configured to:
continuously assess and update the media friendliness of the target medium at a preset time window, based on the multiple related texts that are continuously obtained.
11. The device according to claim 9 or 10, characterized in that the text acquisition module is specifically configured to:
obtain the set of media texts published by the target medium;
generate a first vector space model for each media text in the set of media texts; and
generate a second vector space model for the public opinion object;
calculate the similarity between each first vector space model and the second vector space model;
if the calculated similarity is greater than a preset threshold, determine that the media text is a related text of the public opinion object.
12. The device according to claim 9, characterized in that the sentiment analysis module is specifically configured to:
perform sentiment polarity analysis on each related text based on sentiment-dictionary-based classification or a machine learning algorithm.
13. The device according to any one of claims 9 to 12, characterized in that the sentiment polarities include positive sentiment, neutral sentiment and negative sentiment, and the index conversion module is specifically configured to:
calculate the media friendliness index corresponding to each related text according to a predefined media friendliness index formula and the sentiment polarity value of each related text;
wherein the predefined media friendliness index formula is either of the following:
$F_r = p + x - n$;
$F_s = p - x - n$;
where $F_r$ is the loose media friendliness index, $F_s$ is the strict media friendliness index, $p$ is the probability of positive sentiment, $x$ is the probability of neutral sentiment, and $n$ is the probability of negative sentiment, with $0 \le p, x, n \le 1$, $p + x + n = 1$ and $-1 \le F_r, F_s \le 1$.
14. The device according to claim 9, characterized in that the device further comprises:
an averaging module, configured to average the media friendliness indices corresponding to the related texts by time window to obtain the mean media friendliness index corresponding to each time window;
the sequence construction module is further configured to construct the media friendliness index time series from the multiple means corresponding to multiple time windows.
15. The device according to claim 9, 10 or 13, characterized in that the smoothing module is specifically configured to:
smooth the media friendliness index time series using the moving average method or the weighted moving average method.
16. The device according to claim 9, characterized in that the device further comprises:
an index prediction module, configured to predict the future media friendliness index of the target medium according to the assessment result of the media friendliness of the target medium.
17. A media friendliness assessment device, characterized in that the device comprises:
one or more processors;
a memory;
a program stored in the memory which, when executed by the one or more processors, causes the processors to perform the steps of the media friendliness assessment method according to any one of claims 1 to 8.
18. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a program which, when executed by a processor, causes the processor to perform the steps of the media friendliness assessment method according to any one of claims 1 to 8.
CN201810330401.1A 2018-04-13 2018-04-13 Method and device for evaluating media friendliness and computer-readable storage medium Active CN108595564B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810330401.1A CN108595564B (en) 2018-04-13 2018-04-13 Method and device for evaluating media friendliness and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810330401.1A CN108595564B (en) 2018-04-13 2018-04-13 Method and device for evaluating media friendliness and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN108595564A true CN108595564A (en) 2018-09-28
CN108595564B CN108595564B (en) 2020-08-11

Family

ID=63622360

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810330401.1A Active CN108595564B (en) 2018-04-13 2018-04-13 Method and device for evaluating media friendliness and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN108595564B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282754A (en) * 2021-06-10 2021-08-20 北京中科闻歌科技股份有限公司 Public opinion detection method, device, equipment and storage medium for news events

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073646A (en) * 2009-11-23 2011-05-25 北京科技大学 Blog group-oriented subject propensity processing method and system
CN102789449A (en) * 2011-05-20 2012-11-21 日电(中国)有限公司 Method and device for evaluating comment text
CN104951548A (en) * 2015-06-24 2015-09-30 烟台中科网络技术研究所 Method and system for calculating negative public opinion index
CN106951409A (en) * 2017-03-17 2017-07-14 黄淮学院 A kind of network social intercourse media viewpoint tendency analysis system and method
CN107077486A (en) * 2014-09-02 2017-08-18 菲特尔销售工具有限公司 Affective Evaluation system and method
CN107169632A (en) * 2017-04-19 2017-09-15 广东数相智能科技有限公司 Global media community image analysis method, device and system

Also Published As

Publication number Publication date
CN108595564B (en) 2020-08-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240306

Address after: Room 1179, W Zone, 11th Floor, Building 1, No. 158 Shuanglian Road, Qingpu District, Shanghai, 201702

Patentee after: Shanghai Zhongan Information Technology Service Co.,Ltd.

Country or region after: China

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: ZHONGAN INFORMATION TECHNOLOGY SERVICE Co.,Ltd.

Country or region before: China

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240415

Address after: Room 1179, W Zone, 11th Floor, Building 1, No. 158 Shuanglian Road, Qingpu District, Shanghai, 201702

Patentee after: Shanghai Zhongan Information Technology Service Co.,Ltd.

Country or region after: China

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: ZHONGAN INFORMATION TECHNOLOGY SERVICE Co.,Ltd.

Country or region before: China

TR01 Transfer of patent right