CN105447036B - Social media information credibility evaluation method and device based on opinion mining - Google Patents

Publication number
CN105447036B
Authority
CN
China
Prior art keywords
information
assessed
social media
media information
viewpoint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410436605.5A
Other languages
Chinese (zh)
Other versions
CN105447036A (en)
Inventor
尚利峰
李斌阳
黄锦辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201410436605.5A
Publication of CN105447036A
Application granted
Publication of CN105447036B
Legal status: Active
Anticipated expiration

Abstract

Embodiments of the invention disclose a method and device for evaluating the credibility of social media information based on opinion mining. The method includes: obtaining information to be assessed; calculating an uncertainty score for each item of information to be assessed; calculating the credibility of the publisher of each item of information to be assessed; computing the proportion of supporting opinions among the comments on each item of information to be assessed; and inputting the uncertainty score of each item, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation, where the output of the quantitative evaluation model is a credibility ranking of the items of information to be assessed. The embodiments of the present invention can accurately evaluate the credibility of social media information.

Description

Social media information credibility evaluation method and device based on opinion mining
Technical field
The present invention relates to the field of communication technologies, and in particular to a method and device for evaluating the credibility of social media information based on opinion mining.
Background
With the development and spread of second-generation Internet (Web 2.0) technology, social media of many kinds (for example Weibo, WeChat, and Twitter) keep emerging and have profoundly changed how people publish, obtain, exchange, and express information and opinions. In particular, as mobile communications have matured and smart mobile devices have come into wide use, social media has become an indispensable platform in daily life for sharing information and expressing opinions. However, because the content on these platforms is created mainly and spontaneously by large numbers of Internet users, false and unreliable information is widespread. How to evaluate the credibility of social media information automatically directly affects the effectiveness of downstream applications such as information recommendation, market research, and automatic question answering.
Credibility analysis in the prior art mainly targets specific domains and specific types of data, such as biomedical experiment reports, newswire, and Wikipedia. Take the credibility evaluation of biomedical experiment reports as an example. Because such data has a fixed structure and pattern, different features can be extracted easily. In particular, a given topic usually has several related experiment reports, so reports of low credibility can be identified by cross-checking the reports against one another. The credibility of Wikipedia information, by contrast, is characterized mainly by the revision history of the information.
That is, early credibility analysis tools were designed mainly for particular structured data. They take into account neither the data structure of social media information nor its conventions of expression. In particular, text in social media is unstructured data whose processing relies heavily on natural language processing techniques such as semantic analysis and sentiment analysis, so these early techniques are not suitable for evaluating the credibility of social media information. A new method for evaluating the credibility of social media information is therefore needed.
Summary of the invention
In view of this, the present invention provides a method and device for evaluating the credibility of social media information based on opinion mining, which can accurately evaluate the credibility of social media information.
In a first aspect, an embodiment of the present invention provides a method for evaluating the credibility of social media information based on opinion mining, including:
obtaining information to be assessed;
calculating an uncertainty score for each item of information to be assessed;
calculating the credibility of the publisher of each item of information to be assessed;
computing the proportion of supporting opinions among the comments on each item of information to be assessed;
inputting the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation, where the output of the quantitative evaluation model is a credibility ranking of the items of information to be assessed.
With reference to the first aspect, in a first implementation of the first aspect, before the information to be assessed is obtained, the method further includes:
constructing a topic lexicon related to the current topic;
combining each topic word in the topic lexicon with each sentiment word in a sentiment lexicon to form opinion word pairs;
obtaining social media information related to the current topic;
calculating an opinion value for each item of social media information according to the similarity between each opinion word pair and that item and the similarity between each opinion word pair and the comments on that item;
filtering out the social media information whose opinion value is less than a preset threshold, and using the remaining social media information as the information to be assessed.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect, constructing the topic lexicon related to the current topic specifically includes:
searching a social network for social media information related to the current topic;
extracting the keywords from the social media information and counting the frequency with which each keyword occurs;
selecting a preset number of keywords in descending order of frequency as topic words to construct the topic lexicon.
With reference to the first implementation of the first aspect, in a third implementation of the first aspect, calculating the opinion value of each item of social media information according to the similarity between each opinion word pair and that item and the similarity between each opinion word pair and the comments on that item specifically includes:
calculating the similarity between the topic word of an opinion word pair and each keyword in an item of social media information, and taking the maximum similarity a; calculating the similarity between the topic word of the opinion word pair and each keyword in the comments on the item, and taking the maximum similarity x;
calculating the similarity between the sentiment word of the opinion word pair and each sentiment word in the item of social media information, and taking the maximum similarity b; calculating the similarity between the sentiment word of the opinion word pair and each sentiment word in the comments on the item, and taking the maximum similarity y;
the similarity between the opinion word pair and the item of social media information is s1 = λa + (1 - λ)b, where 0 < λ < 1, and the similarity between the opinion word pair and the comments on the item is s2 = μx + (1 - μ)y, where 0 < μ < 1;
adding the similarity between the opinion word pair and the item of social media information to the similarity between the opinion word pair and the comments on the item, to obtain an opinion sub-value of the item;
processing every opinion word pair in the same way to obtain all opinion sub-values of the item, and summing all the opinion sub-values to obtain the opinion value of the item; the opinion value of every item of social media information is obtained in the same manner.
With reference to the first aspect, or the first, second, or third implementation of the first aspect, in a fourth implementation of the first aspect, calculating the uncertainty score of each item of information to be assessed includes:
determining the categories of uncertain content contained in each item of information to be assessed;
calculating a category score for each category of uncertain content contained in each item of information to be assessed;
multiplying the category score of each category of uncertain content contained in each item of information to be assessed by a preset weight and summing the products to obtain the uncertainty score of the item.
With reference to the first aspect, or the first, second, or third implementation of the first aspect, in a fifth implementation of the first aspect, when the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments are input into the pre-trained quantitative evaluation model for calculation: the higher the uncertainty score of an item of information to be assessed, the lower its credibility; the lower the credibility of its publisher, the lower its credibility; and the smaller the proportion of supporting opinions among its comments, and/or the more that proportion shrinks over time, the lower its credibility.
In a second aspect, an embodiment of the present invention provides a device for evaluating the credibility of social media information based on opinion mining, including:
a first obtaining unit, configured to obtain information to be assessed;
a first calculation unit, configured to calculate an uncertainty score for each item of information to be assessed;
a second calculation unit, configured to calculate the credibility of the publisher of each item of information to be assessed;
a statistics unit, configured to compute the proportion of supporting opinions among the comments on each item of information to be assessed;
a credibility evaluation unit, configured to input the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation, where the output of the quantitative evaluation model is a credibility ranking of the items of information to be assessed.
With reference to the second aspect, in a first implementation of the second aspect, the device further includes:
a lexicon construction unit, configured to construct a topic lexicon related to the current topic;
a word pair forming unit, configured to combine each topic word in the topic lexicon with each sentiment word in a sentiment lexicon to form opinion word pairs;
a second obtaining unit, configured to obtain social media information related to the current topic;
a third calculation unit, configured to calculate an opinion value for each item of social media information according to the similarity between each opinion word pair and that item and the similarity between each opinion word pair and the comments on that item;
an information filtering unit, configured to filter out the social media information whose opinion value is less than a preset threshold and use the remaining social media information as the information to be assessed.
With reference to the first implementation of the second aspect, in a second implementation of the second aspect, the lexicon construction unit specifically includes:
a search subunit, configured to search a social network for social media information related to the current topic;
a counting subunit, configured to extract the keywords from the social media information and count the frequency with which each keyword occurs;
a lexicon construction subunit, configured to select a preset number of keywords in descending order of frequency as topic words to construct the topic lexicon.
With reference to the first implementation of the second aspect, in a third implementation of the second aspect, the third calculation unit is specifically configured to:
calculate the similarity between the topic word of an opinion word pair and each keyword in an item of social media information, and take the maximum similarity a; calculate the similarity between the topic word of the opinion word pair and each keyword in the comments on the item, and take the maximum similarity x;
calculate the similarity between the sentiment word of the opinion word pair and each sentiment word in the item of social media information, and take the maximum similarity b; calculate the similarity between the sentiment word of the opinion word pair and each sentiment word in the comments on the item, and take the maximum similarity y;
where the similarity between the opinion word pair and the item of social media information is s1 = λa + (1 - λ)b, with 0 < λ < 1, and the similarity between the opinion word pair and the comments on the item is s2 = μx + (1 - μ)y, with 0 < μ < 1;
add the similarity between the opinion word pair and the item of social media information to the similarity between the opinion word pair and the comments on the item, to obtain an opinion sub-value of the item;
process every opinion word pair in the same way to obtain all opinion sub-values of the item, and sum all the opinion sub-values to obtain the opinion value of the item; the opinion value of every item of social media information is obtained in the same manner.
With reference to the second aspect, or the first, second, or third implementation of the second aspect, in a fourth implementation of the second aspect, the first calculation unit is specifically configured to:
determine the categories of uncertain content contained in each item of information to be assessed;
calculate a category score for each category of uncertain content contained in each item of information to be assessed;
multiply the category score of each category of uncertain content contained in each item of information to be assessed by a preset weight and sum the products to obtain the uncertainty score of the item.
With reference to the second aspect, or the first, second, or third implementation of the second aspect, in a fifth implementation of the second aspect, when the credibility evaluation unit inputs the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into the pre-trained quantitative evaluation model for calculation: the higher the uncertainty score of an item of information to be assessed, the lower its credibility; the lower the credibility of its publisher, the lower its credibility; and the smaller the proportion of supporting opinions among its comments, and/or the more that proportion shrinks over time, the lower its credibility.
In the embodiments of the present invention, the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments are input into a pre-trained quantitative evaluation model for calculation. Evaluating each item of information to be assessed with these three types of data improves the accuracy of the evaluation.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Apparently, the drawings described below show only some embodiments of the present invention, and those skilled in the art may derive other drawings from them.
Fig. 1 is a schematic diagram of one embodiment of the opinion-mining-based social media information credibility evaluation method provided by the present invention;
Fig. 2 is a schematic diagram of another embodiment of the opinion-mining-based social media information credibility evaluation method provided by the present invention;
Fig. 3 is a schematic diagram of one embodiment of the opinion-mining-based social media information credibility evaluation device provided by the present invention;
Fig. 4 is a schematic diagram of another embodiment of the opinion-mining-based social media information credibility evaluation device provided by the present invention;
Fig. 5 is a schematic diagram of yet another embodiment of the opinion-mining-based social media information credibility evaluation device provided by the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention shall fall within the protection scope of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a method for evaluating the credibility of social media information based on opinion mining. The method includes:
101. Obtain information to be assessed.
The information to be assessed is extracted from social media information in a social network and is all related to the current topic. The social network may be Weibo, WeChat, Twitter, or the like.
102. Calculate the uncertainty score of each item of information to be assessed.
This step determines whether each item of information to be assessed contains uncertain content and how uncertain the item is.
103. Calculate the credibility of the publisher of each item of information to be assessed.
The credibility of the publisher is calculated mainly from features of the publisher on the social network, for example the number of microblog posts published, whether the user is verified, and the user level. Existing methods may be used for the specific evaluation, and details are not repeated here.
104. Compute the proportion of supporting opinions among the comments on each item of information to be assessed.
105. Input the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation. The output of the quantitative evaluation model is a credibility ranking of the items of information to be assessed.
It should be noted that, in a specific implementation, steps 102 to 104 need not be executed in any particular order and may be executed in parallel.
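Because steps 102 to 104 are independent of one another, they can be dispatched concurrently. The following is a minimal sketch of that idea, not the patented implementation: the three functions are hypothetical stubs that return fixed dummy values, and the use of a thread pool is an assumption made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stubs standing in for steps 102, 103 and 104.
# They return fixed dummy values; the real computations are not shown here.
def uncertainty_score(item):
    return 0.2


def publisher_credibility(item):
    return 0.8


def support_ratio(item):
    return 0.6


def extract_features(item):
    """Run the three independent feature computations concurrently."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [
            pool.submit(uncertainty_score, item),
            pool.submit(publisher_credibility, item),
            pool.submit(support_ratio, item),
        ]
        # Results are read in submission order, so the tuple layout is fixed:
        # (uncertainty, publisher credibility, support ratio).
        return tuple(f.result() for f in futures)
```

The returned tuple corresponds to the three inputs that step 105 passes to the quantitative evaluation model.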
In this embodiment, the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments are input into a pre-trained quantitative evaluation model for calculation. Evaluating each item of information to be assessed with these three types of data improves the accuracy of the evaluation.
For ease of understanding, the information credibility evaluation method of the present invention is described below with a specific embodiment. Referring to Fig. 2, the method of this embodiment includes:
201. Construct a topic lexicon related to the current topic.
In this embodiment, the topic lexicon related to the current topic may be constructed from word frequencies, as follows: search the social network for social media information related to the current topic, extract the keywords from that information and count how often each keyword occurs, and select a preset number of keywords in descending order of frequency as topic words to build the topic lexicon.
As a specific example, suppose Huawei releases the P7 mobile phone and social media information related to the P7 quickly appears on social networks. Social media information related to the current topic, the P7 mobile phone, can be searched for; keywords such as Huawei, screen, HiSilicon, and Xiaomi are extracted from the retrieved information; the frequency of each keyword is counted; and a preset number of the most frequent keywords are then selected as topic words to build the topic lexicon.
In other examples, a commonly used latent topic model may also be used to construct the topic lexicon related to the current topic.
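The word-frequency construction of step 201 can be sketched as follows. This is an illustrative sketch only: it assumes keyword extraction has already produced a token list per post, and the function name `build_topic_lexicon` and the `stopwords` parameter are hypothetical.

```python
from collections import Counter


def build_topic_lexicon(posts, top_n, stopwords=frozenset()):
    """Return the top_n most frequent keywords across pre-tokenized posts."""
    counts = Counter(
        token
        for tokens in posts
        for token in tokens
        if token not in stopwords
    )
    # most_common sorts by descending frequency.
    return [word for word, _ in counts.most_common(top_n)]


# Toy posts about the P7 phone, already reduced to keywords.
posts = [
    ["Huawei", "P7", "screen"],
    ["Huawei", "HiSilicon", "screen"],
    ["Huawei", "Xiaomi"],
]
lexicon = build_topic_lexicon(posts, top_n=2)
```

Here `top_n` plays the role of the preset number of topic words.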
202. Combine each topic word in the topic lexicon with each sentiment word in the sentiment lexicon to form opinion word pairs.
The sentiment lexicon may be an existing mainstream sentiment lexicon. Each opinion word pair consists of one topic word and one sentiment word, for example <appearance, beautiful> and <HiSilicon, proud>.
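Forming every combination of a topic word and a sentiment word is a Cartesian product. A minimal sketch, with the function name `build_opinion_pairs` chosen here for illustration:

```python
from itertools import product


def build_opinion_pairs(topic_words, sentiment_words):
    """Pair every topic word with every sentiment word."""
    return list(product(topic_words, sentiment_words))


pairs = build_opinion_pairs(["appearance", "HiSilicon"], ["beautiful", "proud"])
```

With m topic words and n sentiment words this yields m × n pairs, so the size of both lexicons directly drives the cost of the later similarity calculations.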
203. Obtain social media information related to the current topic.
In a specific implementation, the keywords of the current topic can be used as input to search and crawl social media.
204. Calculate the opinion value of each item of social media information according to the similarity between each opinion word pair and that item and the similarity between each opinion word pair and the comments on that item.
The similarity of any two words A and B is sim(A, B) = (A0 · B0) / (||A0|| · ||B0||), where A0 and B0 denote the word vectors of A and B respectively, ||A0|| denotes the norm of A0, and ||B0|| denotes the norm of B0.
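Assuming the similarity is the cosine of the angle between the two word vectors, which is consistent with the vector norms named in the text, it can be sketched as below. Returning 0.0 for a zero vector is a choice made for this sketch, not something the source specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length word vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: treat a zero vector as having no similarity to anything.
        return 0.0
    return dot / (norm_a * norm_b)
```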
First, using the similarity formula above, calculate the similarity between the topic word of an opinion word pair and each keyword in an item of social media information, and take the maximum similarity a; likewise calculate the similarity between the topic word of the opinion word pair and each keyword in the comments on the item, and take the maximum similarity x.
Next, using the similarity formula above, calculate the similarity between the sentiment word of the opinion word pair and each sentiment word in the item of social media information, and take the maximum similarity b; likewise calculate the similarity between the sentiment word of the opinion word pair and each sentiment word in the comments on the item, and take the maximum similarity y.
The similarity between the opinion word pair and the item of social media information is s1 = λa + (1 - λ)b, where 0 < λ < 1 and λ may be preset. The similarity between the opinion word pair and the comments on the item is s2 = μx + (1 - μ)y, where 0 < μ < 1 and μ may be preset.
The similarity between the opinion word pair and the item of social media information is added to the similarity between the opinion word pair and the comments on the item to obtain an opinion sub-value of the item.
Every opinion word pair is processed in the same way to obtain all opinion sub-values of the item, and all the opinion sub-values are summed to obtain the opinion value of the item. The opinion value of every item of social media information is obtained in the same manner.
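The calculation in step 204 can be sketched end to end as follows. This is a sketch under stated assumptions: word vectors are supplied in a plain dictionary, cosine similarity is used as the word similarity, an empty candidate list contributes a maximum of 0.0, and the default values of `lam` and `mu` (λ and μ) are arbitrary. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is a zero vector."""
    dot = sum(p * q for p, q in zip(u, v))
    nu = math.sqrt(sum(p * p for p in u))
    nv = math.sqrt(sum(q * q for q in v))
    return dot / (nu * nv) if nu and nv else 0.0


def max_sim(word, candidates, vectors):
    """Largest similarity between word and any candidate (0.0 if none)."""
    return max((cosine(vectors[word], vectors[c]) for c in candidates),
               default=0.0)


def opinion_value(pairs, post_keywords, post_sentiments,
                  comment_keywords, comment_sentiments, vectors,
                  lam=0.5, mu=0.5):
    """Sum of s1 + s2 over all opinion word pairs for one post."""
    total = 0.0
    for topic_word, sentiment_word in pairs:
        a = max_sim(topic_word, post_keywords, vectors)
        b = max_sim(sentiment_word, post_sentiments, vectors)
        x = max_sim(topic_word, comment_keywords, vectors)
        y = max_sim(sentiment_word, comment_sentiments, vectors)
        s1 = lam * a + (1 - lam) * b   # pair vs. the post itself
        s2 = mu * x + (1 - mu) * y     # pair vs. the post's comments
        total += s1 + s2               # opinion sub-value for this pair
    return total
```

In the toy case where the post matches both words of a pair exactly and the comments match only the topic word, the result with λ = μ = 0.5 is 1.0 + 0.5 = 1.5.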
205. Filter out the social media information whose opinion value is less than a preset threshold, and use the remaining social media information as the information to be assessed.
In this embodiment, social media information whose opinion value is less than the preset threshold is considered not to express any opinion subjectively and clearly; for example, it merely states a fact or describes a product without any emotional coloring. Such information is filtered out. Social media information whose opinion value is greater than or equal to the preset threshold is considered to express an opinion subjectively and clearly. Such information often becomes a focus of public opinion and shapes how people perceive an event or product, so this embodiment takes it as the information to be assessed and mainly evaluates its credibility.
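The filtering rule of step 205 keeps items whose opinion value is greater than or equal to the threshold. A minimal sketch, assuming opinion values are held in a dictionary keyed by a hypothetical post identifier:

```python
def select_for_assessment(opinion_values, threshold):
    """Keep the ids of posts whose opinion value is >= threshold."""
    return [post_id for post_id, value in opinion_values.items()
            if value >= threshold]


kept = select_for_assessment({"p1": 0.4, "p2": 1.5, "p3": 1.0}, threshold=1.0)
```

Note that a value exactly equal to the threshold is retained, matching the "greater than or equal to" wording above.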
206. Calculate the uncertainty score of each item of information to be assessed.
In this embodiment, an information uncertainty evaluation model may first be trained to classify the uncertain content contained in information. For example, the uncertain content contained in information may be classified as follows:
Type | Cue word or phrase | Example sentence
Question | really | Do you think the chip in the P7 is really from HiSilicon?
Hearsay | it is said | It is said that the P7 is on sale in Europe.
Wish | really want | I really want a P7 right now.
Belief | believe | I believe I will have a P7 someday.
Conditional | if | If I get a raise, I will buy a P7.
Possibility | should | I should be able to buy a P7 phone.
In a specific implementation, each item of information to be assessed can be checked online for cue words or phrases to determine the categories of uncertain content it contains. The category score of each category of uncertain content contained in the item is then calculated. Finally, each category score is multiplied by its preset weight and the products are summed to obtain the uncertainty score of the item.
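Cue-based detection of uncertainty categories can be sketched as below. The cue table is an assumption: it uses English stand-ins for the cue words, and whole-word regular expression matching is a simplification chosen for this sketch rather than the method the source prescribes.

```python
import re

# Hypothetical English stand-ins for the cue words of each category.
CUE_PHRASES = {
    "question": ["really"],
    "hearsay": ["it is said"],
    "wish": ["wish"],
    "belief": ["believe"],
    "conditional": ["if"],
    "possibility": ["should"],
}


def detect_uncertainty_categories(text):
    """Return the categories whose cue phrases occur in text as whole words."""
    lowered = text.lower()
    found = []
    for category, cues in CUE_PHRASES.items():
        if any(re.search(r"\b" + re.escape(cue) + r"\b", lowered)
               for cue in cues):
            found.append(category)
    return found
```

A single sentence may trigger several categories, which matches the statement that an item can belong to more than one category at once.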
For example, a given item of information to be assessed may belong to several categories at once, say categories A, B, and C. The model computes a score for each category, and a higher score indicates a higher likelihood that the item belongs to that category. If the scores of the item for these three categories are SA, SB, and SC, the final uncertainty score of the item is H = WA*SA + WB*SB + WC*SC, where WA, WB, and WC are weight coefficients. The three weight coefficients may differ, for example one weight coefficient may be preset for each category as needed; they may also take the same value.
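Following the statement that each category score is multiplied by its preset weight and the products are accumulated, the uncertainty score is a weighted sum, H = WA*SA + WB*SB + WC*SC. A minimal sketch with made-up scores and weights:

```python
def uncertainty_score(category_scores, weights):
    """Weighted sum of per-category uncertainty scores."""
    return sum(weights[category] * score
               for category, score in category_scores.items())


# Illustrative values only; real scores come from the trained model.
H = uncertainty_score(
    {"A": 0.5, "B": 0.25, "C": 1.0},
    {"A": 1.0, "B": 2.0, "C": 0.5},
)
```

Here each term contributes 0.5, so H is 1.5.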
207. Calculate the credibility of the publisher of each item of information to be assessed.
The credibility of the publisher is calculated mainly from features of the publisher on the social network, for example the number of microblog posts published, whether the user is verified, and the user level. Existing methods may be used for the specific evaluation, and details are not repeated here.
208. Compute the proportion of supporting opinions among the comments on each item of information to be assessed.
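Step 208 reduces to a simple ratio once each comment has a stance label. The sketch below assumes the labels ("support", "oppose", "neutral") were produced by some upstream classifier that the source does not describe, and it returns 0.0 when there are no comments, which is a choice made here for illustration.

```python
def support_ratio(comment_stances):
    """Fraction of comments labelled 'support'; 0.0 if there are none."""
    if not comment_stances:
        return 0.0
    supporting = sum(1 for stance in comment_stances if stance == "support")
    return supporting / len(comment_stances)
```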
209. Input the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation. The output of the quantitative evaluation model is a credibility ranking of the items of information to be assessed.
Specifically, in the calculation: the higher the uncertainty score of an item of information to be assessed, the lower its credibility; the lower the credibility of its publisher, the lower its credibility; and the smaller the proportion of supporting opinions among its comments, and/or the more that proportion shrinks over time, the lower its credibility.
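The source does not specify the form of the quantitative evaluation model. Purely to illustrate the monotonic relationships stated above, the sketch below substitutes a hand-written linear score for the trained model: uncertainty lowers the score, while publisher credibility and support ratio raise it. The weights are arbitrary assumptions, and the change of support ratio over time is not modelled.

```python
def credibility_score(uncertainty, publisher_cred, support,
                      weights=(1.0, 1.0, 1.0)):
    """Stand-in linear score; NOT the trained model described in the text."""
    w_u, w_p, w_s = weights
    return -w_u * uncertainty + w_p * publisher_cred + w_s * support


def rank_by_credibility(items):
    """items maps id -> (uncertainty, publisher_cred, support).

    Returns the ids ordered from most to least credible.
    """
    return sorted(items, key=lambda k: credibility_score(*items[k]),
                  reverse=True)


ranking = rank_by_credibility({
    "a": (0.9, 0.2, 0.1),
    "b": (0.1, 0.8, 0.9),
    "c": (0.5, 0.5, 0.5),
})
```

In practice the trained model would replace `credibility_score`, and only the ranking step would remain as shown.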
It should be noted that, in a specific implementation, steps 206 to 208 need not be executed in any particular order and may be executed in parallel.
In this embodiment, after social media information related to the current topic is obtained, the constructed topic lexicon and the sentiment lexicon are used to calculate the similarity between opinion word pairs and the social media information and its comments, so that social media information that subjectively and clearly expresses an opinion is extracted for evaluation. During the evaluation, the uncertainty score of each item of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments are input into a pre-trained quantitative evaluation model for calculation. Evaluating each item of information to be assessed with these three types of data improves the accuracy of the evaluation.
In practical applications, the evaluation results make it possible to accurately understand social network users' views on an event, the hot topics that users follow, or users' demands for a product, so that information can be recommended to users more precisely or the product can be improved, thereby improving user experience.
The information credibility assessment device provided by the embodiments of the present invention is described below. Referring to Fig. 3, the device 300 of this embodiment includes:
A first acquisition unit 301, configured to obtain information to be assessed;
A first computing unit 302, configured to calculate the uncertainty score of each piece of information to be assessed;
A second computing unit 303, configured to calculate the credibility of the publisher of each piece of information to be assessed;
A statistics unit 304, configured to count the proportion of supporting opinions among the comments on each piece of information to be assessed;
A credibility assessment unit 305, configured to input the uncertainty score of each piece of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation, the output of the quantitative evaluation model being the credibility ranking of each piece of information to be assessed.
In this embodiment, the credibility assessment unit uses the uncertainty score of each piece of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments as the inputs of the quantitative evaluation model. Assessing each piece of information on these three kinds of data improves the accuracy of the assessment.
For ease of understanding, the information credibility assessment device of the present invention is described below with a specific embodiment. Referring to Fig. 4, the device 400 of this embodiment includes:
A dictionary construction unit 401, configured to construct a theme dictionary related to the current topic;
A word pair forming unit 402, configured to combine each topic word in the theme dictionary with each emotion word in the emotion dictionary to form viewpoint word pairs;
A second acquisition unit 403, configured to obtain social media information related to the current topic;
A third computing unit 404, configured to calculate the viewpoint value of each piece of social media information according to the similarity between each viewpoint word pair and that piece of social media information and the similarity between each viewpoint word pair and the comments on that piece of social media information;
An information filtering unit 405, configured to filter out social media information whose viewpoint value is less than a preset threshold and to take the remaining social media information as the information to be assessed;
A first acquisition unit 406, configured to obtain the information to be assessed;
A first computing unit 407, configured to calculate the uncertainty score of each piece of information to be assessed;
A second computing unit 408, configured to calculate the credibility of the publisher of each piece of information to be assessed;
A statistics unit 409, configured to count the proportion of supporting opinions among the comments on each piece of information to be assessed;
A credibility assessment unit 410, configured to input the uncertainty score of each piece of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation, the output of the quantitative evaluation model being the credibility ranking of each piece of information to be assessed.
In addition, the dictionary construction unit 401 specifically includes a search subunit 4011, a statistics subunit 4012, and a dictionary building subunit 4013, wherein:
The search subunit 4011 is configured to search a social network for social media information related to the current topic;
The statistics subunit 4012 is configured to extract the keywords in the social media information and count the frequency with which each keyword occurs;
The dictionary building subunit 4013 is configured to select a preset number of keywords in descending order of frequency as topic words and construct the theme dictionary from them.
For further understanding, the interaction between the units of the information credibility assessment device 400 of this embodiment is described below with a practical application scenario, as follows:
First, the dictionary construction unit 401 can construct a theme dictionary related to the current topic by means of word-frequency statistics. Specifically, the search subunit 4011 searches a social network for social media information related to the current topic; the statistics subunit 4012 then extracts the keywords in the social media information found by the search subunit 4011 and counts the frequency with which each keyword occurs; and the dictionary building subunit 4013 selects a preset number of keywords in descending order of frequency as topic words to construct the theme dictionary.
In a specific example, suppose Huawei has released the p7 mobile phone and social media information related to the p7 quickly appears on social networks. The search subunit 4011 can search for social media information related to the current topic, the p7 mobile phone; the statistics subunit 4012 extracts the keywords in the information found, for example Huawei, screen, HiSilicon, and Xiaomi, and counts the frequency with which each keyword occurs; the dictionary building subunit 4013 then selects a preset number of the most frequent keywords as topic words to construct the theme dictionary.
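The word-frequency construction described above can be sketched as follows. This is a minimal illustration, assuming keyword extraction (for example word segmentation) has already been done and each post arrives as a list of keywords; the function name `build_theme_dictionary` and the optional `stopwords` parameter are hypothetical.

```python
from collections import Counter


def build_theme_dictionary(posts, top_n, stopwords=frozenset()):
    """Pick the top_n most frequent keywords across all posts.

    posts: iterable of keyword lists, one list per social media post.
    """
    counts = Counter()
    for keywords in posts:
        counts.update(k for k in keywords if k not in stopwords)
    # most_common sorts by descending frequency.
    return [word for word, _ in counts.most_common(top_n)]
```

The preset quantity corresponds to `top_n`; how large it should be is not specified in the text and would have to be tuned.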
In addition, in other examples, the dictionary construction unit 401 can also use a commonly used latent topic model to construct the theme dictionary related to the current topic.
The word pair forming unit 402 combines each topic word in the theme dictionary constructed by the dictionary construction unit 401 with each emotion word in the emotion dictionary to form viewpoint word pairs. The emotion dictionary can be an existing mainstream sentiment dictionary. Each viewpoint word pair consists of one topic word and one emotion word, for example <appearance, beautiful> or <HiSilicon, proud>.
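Forming the viewpoint word pairs amounts to taking the Cartesian product of the two word lists. A minimal sketch, with the function name `build_viewpoint_pairs` and the sample words chosen for illustration only:

```python
from itertools import product


def build_viewpoint_pairs(topic_words, emotion_words):
    """Pair every topic word with every emotion word."""
    return list(product(topic_words, emotion_words))


pairs = build_viewpoint_pairs(["appearance", "screen"], ["beautiful", "proud"])
```

The number of pairs grows as the product of the two dictionary sizes, which is one practical reason to cap the theme dictionary at a preset number of topic words.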
The second acquisition unit 403 obtains social media information related to the current topic. In a specific implementation, the second acquisition unit 403 can use the keywords of the current topic as input to search social media and crawl the social media information related to the current topic.
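The similarity of any two words A and B is $\mathrm{sim}(A,B)=\dfrac{A_0\cdot B_0}{\lVert A_0\rVert\,\lVert B_0\rVert}$, where $A_0$ and $B_0$ denote the word vectors of A and B respectively, $\lVert A_0\rVert$ denotes the norm of $A_0$, and $\lVert B_0\rVert$ denotes the norm of $B_0$.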
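The definition in terms of word vectors and their norms suggests cosine similarity. Assuming that reading, a minimal sketch is given below; how the word vectors are obtained is not stated in the text, so plain Python lists stand in for them, and returning 0.0 for a zero vector is a choice made here, not something the source specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_a * norm_b)
```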
The third computing unit 404 calculates the viewpoint value of each piece of social media information according to the similarity between each viewpoint word pair and that piece of social media information and the similarity between each viewpoint word pair and the comments on that piece of social media information.
Specifically, using the above similarity formula, the third computing unit 404 can calculate the similarity between the topic word of a viewpoint word pair and each keyword in a piece of social media information and take the maximum similarity a; at the same time, it calculates the similarity between the topic word of the viewpoint word pair and each keyword in the comments on that social media information and takes the maximum similarity x;
Next, using the above similarity formula, the third computing unit 404 calculates the similarity between the emotion word of the viewpoint word pair and each emotion word in the social media information and takes the maximum similarity b; at the same time, it calculates the similarity between the emotion word of the viewpoint word pair and each emotion word in the comments on the social media information and takes the maximum similarity y;
The similarity between the viewpoint word pair and the social media information is s1 = λa + (1-λ)b, where 0 < λ < 1 and λ can be preset; the similarity between the viewpoint word pair and the comments on the social media information is s2 = μx + (1-μ)y, where 0 < μ < 1 and μ can be preset.
The third computing unit 404 adds the similarity between the viewpoint word pair and the social media information to the similarity between the viewpoint word pair and the comments on the social media information to obtain a viewpoint sub-value of the social media information;
The third computing unit 404 applies the same processing to every viewpoint word pair to obtain all viewpoint sub-values of the social media information and sums them to obtain the viewpoint value of the social media information; proceeding in the same way, it obtains the viewpoint value of every piece of social media information.
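The viewpoint value computation above can be sketched as follows. This is an illustration under stated assumptions: the word similarity function is passed in as the parameter `sim`, the keywords and emotion words of the post and of its comments are assumed to be extracted already, an empty word list is treated as maximum similarity 0.0, and the default values of `lam` and `mu` (standing for λ and μ) are arbitrary.

```python
def viewpoint_value(pairs, post_keywords, post_emotions,
                    comment_keywords, comment_emotions,
                    sim, lam=0.5, mu=0.5):
    """Sum of the viewpoint sub-values (s1 + s2) over all viewpoint word pairs."""
    total = 0.0
    for topic, emotion in pairs:
        a = max((sim(topic, k) for k in post_keywords), default=0.0)
        b = max((sim(emotion, e) for e in post_emotions), default=0.0)
        x = max((sim(topic, k) for k in comment_keywords), default=0.0)
        y = max((sim(emotion, e) for e in comment_emotions), default=0.0)
        s1 = lam * a + (1 - lam) * b  # pair vs. the post itself
        s2 = mu * x + (1 - mu) * y    # pair vs. the comments on the post
        total += s1 + s2              # one viewpoint sub-value per pair
    return total
```

Because the sub-values are summed rather than averaged, the viewpoint value scales with the number of viewpoint word pairs, so the filtering threshold has to be chosen with the dictionary sizes in mind.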
The information filtering unit 405 filters out the social media information whose viewpoint value is less than the preset threshold and takes the remaining social media information as the information to be assessed; the first acquisition unit 406 obtains the social media information remaining after filtering by the information filtering unit 405.
In this embodiment, social media information whose viewpoint value is less than the preset threshold is considered not to express a viewpoint subjectively and clearly, for example information that merely states a fact or describes a product without any emotional coloring; such information is filtered out. Social media information whose viewpoint value is greater than or equal to the preset threshold is considered to express a viewpoint subjectively and clearly; such information often becomes a focus of public opinion and influences people's perception of an event or product. This embodiment therefore takes this part of the social media information as the information to be assessed and mainly assesses its credibility.
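The filtering rule can be sketched in a few lines. The function name and the use of a dictionary mapping item identifiers to viewpoint values are assumptions made for illustration; the only behavior taken from the text is that items below the threshold are dropped and items at or above it are kept.

```python
def select_information_to_assess(viewpoint_values, threshold):
    """Keep the items whose viewpoint value is at least the threshold.

    viewpoint_values: dict mapping an item identifier to its viewpoint value.
    """
    return [item_id for item_id, value in viewpoint_values.items()
            if value >= threshold]
```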
The first computing unit 407 calculates the uncertainty score of each piece of information to be assessed. In this embodiment, the first computing unit 407 can first train an information uncertainty assessment model that classifies the uncertain content contained in information; for example, the uncertain content contained in information can be classified as follows:
| Type | Cue word or phrase | Example sentence |
| --- | --- | --- |
| Question type | really | Do you think the p7 really uses a HiSilicon chip? |
| Hearsay type | it is said | It is said that the p7 is on sale in Europe |
| Wish type | really want | I really wish I had a p7 right now |
| Belief type | believe | I believe I will have a p7 some day |
| Conditional type | if | If I get a raise, I will buy a p7 |
| Possibility type | should | I should be able to buy a p7 mobile phone |
In a specific implementation, the first computing unit 407 can use the cue words or phrases to check each piece of information to be assessed online and determine the categories of uncertain content it contains, then calculate a category score for each category of uncertain content contained in the information, and finally multiply each category score by its preset weight and sum the results to obtain the uncertainty score of the information to be assessed.
For example, a given piece of information to be assessed may belong to several categories at once, say categories A, B, and C. The model produces a score for each category, and a higher score indicates a greater likelihood that the information belongs to that category. If the uncertainty scores of the information for these three categories are S_A, S_B, and S_C respectively, the final uncertainty score calculated by the first computing unit 407 is H = W_A*S_A + W_B*S_B + W_C*S_C, where W_A, W_B, and W_C are weight coefficients. The three weight coefficients can differ, for example by presetting a weight coefficient for each category as needed; they can of course also take the same value.
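The two steps above, cue-based category detection followed by a weighted sum of category scores, can be sketched as follows. This is a toy illustration: the English cue phrases in `CUE_PHRASES` are hypothetical stand-ins for the cue words used in practice, substring matching replaces whatever detection the trained model performs, and the category scores are supplied by the caller rather than produced by a model.

```python
# Hypothetical English cue phrases; the original cue words are not English.
CUE_PHRASES = {
    "hearsay": ("it is said",),
    "belief": ("i believe",),
    "conditional": ("if ",),
    "possibility": ("should",),
}


def detect_categories(text):
    """Return the uncertainty categories whose cue phrases occur in the text."""
    lowered = text.lower()
    return {category for category, cues in CUE_PHRASES.items()
            if any(cue in lowered for cue in cues)}


def uncertainty_score(category_scores, weights):
    """Weighted sum H = sum of W_c * S_c over the detected categories."""
    return sum(weights[c] * s for c, s in category_scores.items())
```

With scores S_A = 0.5, S_B = 0.2, S_C = 0.1 and weights 1, 2, 3, the weighted sum is 0.5 + 0.4 + 0.3 = 1.2.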
The second computing unit 408 calculates the credibility of the publisher of each piece of information to be assessed. The calculation is based mainly on features of the publisher on the social network, such as the number of microblog posts published, whether the publisher is a verified user, and the user level. The specific assessment method can follow existing methods and is not described here.
The statistics unit 409 counts the proportion of supporting opinions among the comments on each piece of information to be assessed. The credibility assessment unit 410 inputs the uncertainty score of each piece of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation; the output of the quantitative evaluation model is the credibility ranking of each piece of information to be assessed. During the calculation by the credibility assessment unit 410: the higher the uncertainty score of a piece of information to be assessed, the lower its credibility; the lower the credibility of its publisher, the lower its credibility; and the smaller the proportion of supporting opinions among its comments, and/or the more that proportion declines over time, the lower its credibility.
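The statistic and its change over time can be sketched as follows. The text does not say how a comment is judged to be supportive, so the sketch assumes each comment already carries a stance label; the label `"support"`, the grouping of comments into chronological windows, and measuring the trend as last window minus first window are all assumptions made for illustration.

```python
def support_ratio(stances):
    """Proportion of comments labelled "support"; 0.0 if there are no comments."""
    if not stances:
        return 0.0
    return sum(1 for s in stances if s == "support") / len(stances)


def support_trend(windows):
    """Change in support ratio from the first to the last time window.

    windows: list of stance lists in chronological order.
    A negative value means support is shrinking over time.
    """
    ratios = [support_ratio(w) for w in windows]
    if len(ratios) < 2:
        return 0.0
    return ratios[-1] - ratios[0]
```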
In this embodiment, after the second acquisition unit obtains social media information related to the current topic, the third computing unit uses the theme dictionary constructed by the dictionary construction unit and the emotion dictionary to calculate the similarity between each viewpoint word pair and the social media information and its comments, so that social media information that subjectively and clearly expresses a viewpoint is extracted for assessment. When the credibility assessment unit assesses each piece of information to be assessed, it uses the uncertainty score of the information, the credibility of its publisher, and the proportion of supporting opinions among its comments as the inputs of the quantitative evaluation model. Assessing each piece of information on these three kinds of data improves the accuracy of the assessment.
Referring next to Fig. 5, Fig. 5 is a schematic diagram of another embodiment of the information credibility assessment device of the present invention. The information credibility assessment device 500 of this embodiment can be used to implement the information credibility assessment method provided by the above embodiments. In practical applications, the information credibility assessment device 500 can be integrated into an electronic device, which may be a mobile phone, a tablet computer, or a similar device. Specifically:
The information credibility assessment device 500 may include components such as an RF (Radio Frequency) circuit 510, a memory 520 including one or more computer-readable storage media, an input unit 530, a display unit 540, a sensor 550, an audio circuit 560, a WiFi (wireless fidelity) module 570, a processor 580 including one or more processing cores, and a power supply 590. Those skilled in the art will understand that the structure shown in Fig. 5 does not limit the information credibility assessment device 500, which may include more or fewer components than illustrated, combine certain components, or use a different component layout. Wherein:
The RF circuit 510 can be used to receive and send signals during messaging or a call. In particular, after receiving downlink information from a base station, it passes the information to one or more processors 580 for processing; in addition, it sends uplink data to the base station. In general, the RF circuit 510 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and so on. The RF circuit 510 can also communicate with networks and other devices by wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Message Service), and so on.
The memory 520 can be used to store software programs and modules; the processor 580 executes various functional applications and data processing by running the software programs and modules stored in the memory 520. The memory 520 can mainly include a program storage area and a data storage area, where the program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area can store data created through use of the device (such as audio data or a phone book). In addition, the memory 520 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 520 may also include a memory controller to provide the processor 580 and the input unit 530 with access to the memory 520.
The input unit 530 can be used to receive input numbers or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. Specifically, the input unit 530 may include a touch-sensitive surface 531 and other input devices 532. The touch-sensitive surface 531, also called a touch display screen or trackpad, collects touch operations by the user on or near it (such as operations performed on or near the touch-sensitive surface 531 with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connected device according to a preset program. Optionally, the touch-sensitive surface 531 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 580, and can receive and execute commands sent by the processor 580. The touch-sensitive surface 531 can be implemented in several types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface 531, the input unit 530 can also include other input devices 532. Specifically, the other input devices 532 can include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys or a power key), a trackball, a mouse, a joystick, and so on.
The display unit 540 can be used to display information input by the user or provided to the user, as well as the various graphical user interfaces of the device; these graphical user interfaces can be composed of graphics, text, icons, video, and any combination thereof. The display unit 540 may include a display panel 541, which can optionally be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, the touch-sensitive surface 531 can cover the display panel 541; when the touch-sensitive surface 531 detects a touch operation on or near it, the operation is passed to the processor 580 to determine the type of touch event, and the processor 580 then provides the corresponding visual output on the display panel 541 according to the type of touch event. Although in Fig. 5 the touch-sensitive surface 531 and the display panel 541 implement the input and output functions as two independent components, in some embodiments the touch-sensitive surface 531 and the display panel 541 can be integrated to implement the input and output functions.
The information credibility assessment device 500 may also include at least one sensor 550, such as an optical sensor, a motion sensor, or other sensors. Specifically, the optical sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor can adjust the brightness of the display panel 541 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 541 and/or the backlight when the device 500 is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that recognize the posture of the device (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors that can also be configured for the device 500, such as a gyroscope, barometer, hygrometer, thermometer, or infrared sensor, are not described here.
The audio circuit 560, a loudspeaker 561, and a microphone 562 can provide an audio interface between the user and the device. The audio circuit 560 can convert received audio data into an electrical signal and transmit it to the loudspeaker 561, which converts it into a sound signal for output. In the other direction, the microphone 562 converts the collected sound signal into an electrical signal, which the audio circuit 560 receives and converts into audio data; the audio data is then output to the processor 580 for processing and, for example, sent to another device through the RF circuit 510, or output to the memory 520 for further processing. The audio circuit 560 may also include an earphone jack to provide communication between a peripheral earphone and the device.
WiFi is a short-range wireless transmission technology. Through the WiFi module 570, the information credibility assessment device 500 can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although Fig. 5 shows the WiFi module 570, it is understood that the module is not an essential part of the device and can be omitted as needed without changing the essence of the invention.
The processor 580 is the control center of the information credibility assessment device. It connects the various parts of the whole device through various interfaces and lines, and performs the various functions of the device and processes data by running or executing the software programs and/or modules stored in the memory 520 and calling the data stored in the memory 520, thereby monitoring the device as a whole. Optionally, the processor 580 may include one or more processing cores; preferably, the processor 580 can integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and so on, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 580.
The information credibility assessment device 500 also includes a power supply 590 (such as a battery) that powers the components. Preferably, the power supply can be logically connected to the processor 580 through a power management system, so that functions such as charging, discharging, and power consumption management are handled by the power management system. The power supply 590 may also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
Although not shown, the information credibility assessment device 500 may also include a camera, a Bluetooth module, and so on, which are not described here. Specifically, in this embodiment the information credibility assessment device 500 includes a memory 520 and one or more programs, where the one or more programs are stored in the memory 520 and configured to be executed by one or more processors 580, and the one or more programs include instructions for performing the following operations:
Obtaining information to be assessed;
Calculating the uncertainty score of each piece of information to be assessed;
Calculating the credibility of the publisher of each piece of information to be assessed;
Counting the proportion of supporting opinions among the comments on each piece of information to be assessed;
Inputting the uncertainty score of each piece of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation, the output of the quantitative evaluation model being the credibility ranking of each piece of information to be assessed.
It should be noted that the information credibility assessment device 500 provided by the embodiments of the present invention can also be used to implement the other functions in the above device embodiments, which are not described here.
It should further be noted that the device embodiments described above are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiments provided by the present invention, a connection between modules indicates that there is a communication connection between them, which can be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement this without creative effort.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware, and of course also by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and so on. In general, any function completed by a computer program can easily be implemented with corresponding hardware, and the specific hardware structure used to implement the same function can take many forms, such as analog circuits, digital circuits, or dedicated circuits. For the present invention, however, a software implementation is in most cases the better embodiment. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The social media information credibility evaluation method and device based on opinion mining provided by the embodiments of the present invention have been described in detail above. For those of ordinary skill in the art, the specific implementations and the scope of application may change in accordance with the ideas of the embodiments of the present invention; therefore, the contents of this specification should not be construed as limiting the present invention.

Claims (12)

1. A social media information credibility evaluation method based on opinion mining, characterized by comprising:
Obtaining social media information related to a current topic;
Calculating the viewpoint value of each piece of social media information according to the similarity between each viewpoint word pair and that piece of social media information and the similarity between each viewpoint word pair and the comments on that piece of social media information, wherein each viewpoint word pair consists of a topic word and an emotion word, and the topic word is obtained from a theme dictionary related to the current topic;
Filtering out the social media information whose viewpoint value is less than a preset threshold, and taking the remaining social media information as information to be assessed;
Obtaining the information to be assessed, the information to be assessed being social media information that influences users' perception of an event or product;
Calculating the uncertainty score of each piece of information to be assessed;
Calculating the credibility of the publisher of each piece of information to be assessed;
Counting the proportion of supporting opinions among the comments on each piece of information to be assessed;
Inputting the uncertainty score of each piece of information to be assessed, the credibility of its publisher, and the proportion of supporting opinions among its comments into a pre-trained quantitative evaluation model for calculation, the output of the quantitative evaluation model being the credibility ranking of each piece of information to be assessed.
2. The method according to claim 1, characterized in that, before obtaining the information to be assessed, the method further comprises:
Constructing the theme dictionary related to the current topic;
Combining each topic word in the theme dictionary with each emotion word in an emotion dictionary to form the viewpoint word pairs.
3. The method according to claim 2, characterized in that building the theme dictionary relevant to the current subject under discussion specifically comprises:
Search a social network for social media information relevant to the current subject under discussion;
Extract the keywords in the social media information and count the frequency with which each keyword occurs;
Select a preset number of keywords in descending order of frequency as descriptors to build the theme dictionary.
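The frequency-based selection can be sketched in Python as follows. The sketch assumes that keyword extraction has already been done, so each post is given as a list of keywords; the function name and the sample data are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def build_theme_dictionary(posts_keywords: Iterable[List[str]], size: int) -> List[str]:
    """Return the `size` most frequent keywords across all posts."""
    counts: Counter = Counter()
    for keywords in posts_keywords:
        counts.update(keywords)  # accumulate keyword frequencies
    # most_common sorts by frequency from high to low
    return [word for word, _ in counts.most_common(size)]


# Hypothetical keywords extracted from three posts on one topic.
theme_dictionary = build_theme_dictionary(
    [["vaccine", "recall", "batch"], ["vaccine", "recall"], ["vaccine", "hospital"]],
    size=2,
)
```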
4. The method according to claim 2, characterized in that calculating the viewpoint value of each piece of social media information according to the similarity between each viewpoint word pair and that piece of social media information and the similarity between each viewpoint word pair and the comments on that piece of social media information specifically comprises:
Calculate the similarity between the descriptor of a viewpoint word pair and each keyword in a piece of social media information, and extract the maximum similarity a; calculate the similarity between the descriptor of the viewpoint word pair and each keyword in the comments on the social media information, and extract the maximum similarity x;
Calculate the similarity between the emotion word of the viewpoint word pair and each emotion word in the social media information, and extract the maximum similarity b; calculate the similarity between the emotion word of the viewpoint word pair and each emotion word in the comments on the social media information, and extract the maximum similarity y;
The similarity between the viewpoint word pair and the social media information is s1 = λa + (1 - λ)b, where λ is greater than 0 and less than 1; the similarity between the viewpoint word pair and the comments on the social media information is s2 = μx + (1 - μ)y, where μ is greater than 0 and less than 1;
Add the similarity between the viewpoint word pair and the social media information to the similarity between the viewpoint word pair and the comments on the social media information to obtain a viewpoint sub-value of the social media information;
Apply the same processing to every viewpoint word pair to obtain all viewpoint sub-values of the social media information, and sum all the viewpoint sub-values to obtain the viewpoint value of the social media information; proceed likewise to obtain the viewpoint value of each piece of social media information.
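The calculation of claim 4 can be sketched in Python as below. The word-similarity measure is not fixed by the claim, so it is passed in as a function; the exact-match measure in the usage example, the function names, the default λ = μ = 0.5 and the sample words are all assumptions for illustration.

```python
from typing import Callable, Iterable, List, Tuple

Similarity = Callable[[str, str], float]


def max_similarity(word: str, candidates: Iterable[str], sim: Similarity) -> float:
    """Largest similarity between `word` and any candidate (0.0 if none)."""
    return max((sim(word, c) for c in candidates), default=0.0)


def viewpoint_sub_value(pair: Tuple[str, str], post_keywords: List[str],
                        post_emotions: List[str], comment_keywords: List[str],
                        comment_emotions: List[str], sim: Similarity,
                        lam: float = 0.5, mu: float = 0.5) -> float:
    """s1 + s2 for one viewpoint word pair, with 0 < lam < 1 and 0 < mu < 1."""
    descriptor, emotion = pair
    a = max_similarity(descriptor, post_keywords, sim)
    x = max_similarity(descriptor, comment_keywords, sim)
    b = max_similarity(emotion, post_emotions, sim)
    y = max_similarity(emotion, comment_emotions, sim)
    s1 = lam * a + (1 - lam) * b   # pair vs. the post itself
    s2 = mu * x + (1 - mu) * y     # pair vs. the comments on the post
    return s1 + s2


def viewpoint_value(pairs: List[Tuple[str, str]], post_keywords: List[str],
                    post_emotions: List[str], comment_keywords: List[str],
                    comment_emotions: List[str], sim: Similarity,
                    lam: float = 0.5, mu: float = 0.5) -> float:
    """Sum of the viewpoint sub-values over all viewpoint word pairs."""
    return sum(
        viewpoint_sub_value(p, post_keywords, post_emotions,
                            comment_keywords, comment_emotions, sim, lam, mu)
        for p in pairs
    )


def exact_match(u: str, v: str) -> float:
    """Stand-in similarity: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if u == v else 0.0


value = viewpoint_value(
    [("vaccine", "dangerous"), ("vaccine", "safe")],
    post_keywords=["vaccine", "trial"], post_emotions=["dangerous"],
    comment_keywords=["vaccine"], comment_emotions=["safe"],
    sim=exact_match,
)
```

With these sample inputs each pair contributes a sub-value of 1.5 (s1 = 1.0 and s2 = 0.5 for the first pair, s1 = 0.5 and s2 = 1.0 for the second), so the viewpoint value is 3.0.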
5. The method according to any one of claims 1 to 4, characterized in that calculating the uncertainty score of each piece of information to be assessed comprises:
Determine the categories of uncertain content contained in each piece of information to be assessed;
Calculate the category score of each category of uncertain content contained in each piece of information to be assessed;
Multiply the category score of each category of uncertain content contained in each piece of information to be assessed by a preset weight and sum the products to obtain the uncertainty score of that piece of information to be assessed.
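The weighted accumulation of claim 5 is a weighted sum over the categories found in a post. In the Python sketch below the category names, scores and weights are made-up placeholders, and the category scores are assumed to have been computed already, since the claim does not state how they are obtained.

```python
from typing import Dict


def uncertainty_score(category_scores: Dict[str, float],
                      weights: Dict[str, float]) -> float:
    """Sum of (category score x preset weight) over the categories present."""
    return sum(weights[category] * score
               for category, score in category_scores.items())


# Hypothetical preset weights and the scores found in one post.
preset_weights = {"speculation": 0.5, "hearsay": 0.25, "question": 0.25}
score = uncertainty_score({"speculation": 2.0, "hearsay": 1.0}, preset_weights)
```

Here the result is 0.5 × 2.0 + 0.25 × 1.0 = 1.25; categories absent from the post contribute nothing.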
6. The method according to any one of claims 1 to 4, characterized in that, in the process of inputting the uncertainty score of each piece of information to be assessed, the confidence level of its publisher, and the proportion of supporting opinions in its comments into the pre-trained quantitative evaluation model for calculation: the higher the uncertainty score of the information to be assessed, the lower the confidence level of the information to be assessed; the lower the confidence level of the publisher of the information to be assessed, the lower the confidence level of the information to be assessed; and the smaller the proportion of supporting opinions in the comments on the information to be assessed, and/or the smaller that proportion becomes as time passes, the lower the confidence level of the information to be assessed.
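The proportion of supporting opinions and its change over time could be computed along the following lines. This Python sketch assumes each comment has already been labelled as supporting or not (the labelling method is not given in the claims) and splits the comments into equal-sized consecutive windows, which is only one possible way to observe the trend; all names and sample data are hypothetical.

```python
from math import ceil
from typing import List, Tuple

Comment = Tuple[float, bool]  # (timestamp, True if the comment supports the post)


def support_ratio(comments: List[Comment]) -> float:
    """Proportion of supporting comments; 0.0 when there are no comments."""
    if not comments:
        return 0.0
    return sum(1 for _, supports in comments if supports) / len(comments)


def support_ratio_by_window(comments: List[Comment], window_count: int) -> List[float]:
    """Support ratio of consecutive, time-ordered groups of comments."""
    if not comments or window_count < 1:
        return []
    ordered = sorted(comments, key=lambda c: c[0])
    size = ceil(len(ordered) / window_count)
    return [support_ratio(ordered[i:i + size])
            for i in range(0, len(ordered), size)]


def is_declining(ratios: List[float]) -> bool:
    """True if the ratio never rises between windows and ends lower than it began."""
    return (len(ratios) >= 2
            and all(later <= earlier for earlier, later in zip(ratios, ratios[1:]))
            and ratios[-1] < ratios[0])


comments = [(1, True), (2, True), (3, True), (4, False), (5, False), (6, False)]
ratios = support_ratio_by_window(comments, window_count=3)
```

For the sample comments the overall proportion is 0.5 while the per-window proportions are 1.0, 0.5 and 0.0, a declining trend that, under claim 6, points to lower confidence in the post.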
7. A social media information reliability assessment device based on opinion mining, characterized by comprising:
A second acquisition unit, configured to obtain social media information relevant to the current subject under discussion;
A third computing unit, configured to calculate a viewpoint value for each piece of social media information according to the similarity between each viewpoint word pair and that piece of social media information and the similarity between each viewpoint word pair and the comments on that piece of social media information, wherein each viewpoint word pair consists of a descriptor and an emotion word, and the descriptor is obtained from a theme dictionary relevant to the current subject under discussion;
An information filtering unit, configured to filter out the social media information whose viewpoint value is less than a preset threshold and take the remaining social media information as information to be assessed;
A first acquisition unit, configured to obtain the information to be assessed, the information to be assessed being social media information that influences users' cognition of an event or product;
A first computing unit, configured to calculate the uncertainty score of each piece of information to be assessed;
A second computing unit, configured to calculate the confidence level of the publisher of each piece of information to be assessed;
A statistic unit, configured to count the proportion of supporting opinions in the comments on each piece of information to be assessed;
A reliability assessment unit, configured to input the uncertainty score of each piece of information to be assessed, the confidence level of its publisher, and the proportion of supporting opinions in its comments into a pre-trained quantitative evaluation model for calculation, the output of the quantitative evaluation model being the reliability order of each piece of information to be assessed.
8. The device according to claim 7, characterized in that the device further comprises:
A dictionary construction unit, configured to build a theme dictionary relevant to the current subject under discussion;
A word pair forming unit, configured to combine each descriptor in the theme dictionary with each emotion word in an emotion dictionary to form the viewpoint word pairs.
9. The device according to claim 8, characterized in that the dictionary construction unit specifically comprises:
A search subunit, configured to search a social network for social media information relevant to the current subject under discussion;
A counting subunit, configured to extract the keywords in the social media information and count the frequency with which each keyword occurs;
A dictionary construction subunit, configured to select a preset number of keywords in descending order of frequency as descriptors to build the theme dictionary.
10. The device according to claim 8, characterized in that the third computing unit is specifically configured to:
Calculate the similarity between the descriptor of a viewpoint word pair and each keyword in a piece of social media information, and extract the maximum similarity a; calculate the similarity between the descriptor of the viewpoint word pair and each keyword in the comments on the social media information, and extract the maximum similarity x;
Calculate the similarity between the emotion word of the viewpoint word pair and each emotion word in the social media information, and extract the maximum similarity b; calculate the similarity between the emotion word of the viewpoint word pair and each emotion word in the comments on the social media information, and extract the maximum similarity y;
The similarity between the viewpoint word pair and the social media information is s1 = λa + (1 - λ)b, where λ is greater than 0 and less than 1; the similarity between the viewpoint word pair and the comments on the social media information is s2 = μx + (1 - μ)y, where μ is greater than 0 and less than 1;
Add the similarity between the viewpoint word pair and the social media information to the similarity between the viewpoint word pair and the comments on the social media information to obtain a viewpoint sub-value of the social media information;
Apply the same processing to every viewpoint word pair to obtain all viewpoint sub-values of the social media information, and sum all the viewpoint sub-values to obtain the viewpoint value of the social media information; proceed likewise to obtain the viewpoint value of each piece of social media information.
11. The device according to any one of claims 7 to 10, characterized in that the first computing unit is specifically configured to:
Determine the categories of uncertain content contained in each piece of information to be assessed;
Calculate the category score of each category of uncertain content contained in each piece of information to be assessed;
Multiply the category score of each category of uncertain content contained in each piece of information to be assessed by a preset weight and sum the products to obtain the uncertainty score of that piece of information to be assessed.
12. The device according to any one of claims 7 to 10, characterized in that, in the process in which the reliability assessment unit inputs the uncertainty score of each piece of information to be assessed, the confidence level of its publisher, and the proportion of supporting opinions in its comments into the pre-trained quantitative evaluation model for calculation: the higher the uncertainty score of the information to be assessed, the lower the confidence level of the information to be assessed; the lower the confidence level of the publisher of the information to be assessed, the lower the confidence level of the information to be assessed; and the smaller the proportion of supporting opinions in the comments on the information to be assessed, and/or the smaller that proportion becomes as time passes, the lower the confidence level of the information to be assessed.
CN201410436605.5A 2014-08-29 2014-08-29 A kind of social media information credibility evaluation method and device based on opining mining Active CN105447036B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410436605.5A CN105447036B (en) 2014-08-29 2014-08-29 A kind of social media information credibility evaluation method and device based on opining mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410436605.5A CN105447036B (en) 2014-08-29 2014-08-29 A kind of social media information credibility evaluation method and device based on opining mining

Publications (2)

Publication Number Publication Date
CN105447036A CN105447036A (en) 2016-03-30
CN105447036B true CN105447036B (en) 2019-08-16

Family

ID=55557224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410436605.5A Active CN105447036B (en) 2014-08-29 2014-08-29 A kind of social media information credibility evaluation method and device based on opining mining

Country Status (1)

Country Link
CN (1) CN105447036B (en)

Also Published As

Publication number Publication date
CN105447036A (en) 2016-03-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant