CN104504150B - News public sentiment monitoring system - Google Patents

Publication number
CN104504150B
Authority
CN
China
Prior art keywords
news
public sentiment
submodule
data
feature phrase
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510009993.3A
Other languages
Chinese (zh)
Other versions
CN104504150A (en)
Inventor
张鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Hengqin Fandou Information Technology Co ltd
Original Assignee
BEIJING BLTSFE INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING BLTSFE INFORMATION TECHNOLOGY Co Ltd
Priority to CN201510009993.3A
Publication of CN104504150A
Application granted
Publication of CN104504150B
Legal status: Expired - Fee Related (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a news public sentiment monitoring system comprising a news information acquisition module, a news data preprocessing module, a news public opinion analysis module, and a news public sentiment result display module. The news data preprocessing module includes a preliminary filter submodule, a text extraction submodule, a word segmentation submodule, a feature phrase filter submodule, a text sentiment orientation analysis submodule, a picture analysis submodule, and a public sentiment popularity acquisition submodule. The data preprocessed by the news data preprocessing module is processed through distributed cloud computing: news public sentiment hotspots are obtained using multiple news public sentiment monitoring algorithm submodules, and the hotspots obtained are comprehensively judged, classified, and assessed, so that news public sentiment hot topics are monitored more efficiently and accurately.

Description

News public sentiment monitoring system
Technical field
The present invention relates to the field of Internet information processing technology, and in particular to a news public sentiment monitoring system.
Background art
With the rapid worldwide development of the Internet, online media has been recognized as the "fourth medium" after newspapers, radio, and television, and the network has become one of the main carriers reflecting public sentiment in society.
Network public opinion comprises the influential and opinionated emotions, attitudes, opinions, remarks, and viewpoints that the public holds and spreads over the Internet about hot topics and focal issues in real life. It is mainly realized and reinforced through posts, comments, and follow-up posts on BBS forums, through blogs, and so on. Because the Internet is virtual, concealed, diverse, pervasive, and informal, a growing number of netizens are willing to express their views and spread their ideas through this channel.
Network public opinion is a powerful force that can react on hot events and influence social development and the course of events to a certain extent. Because the network is open, network public opinion can form rapidly and have a great impact on society. In particular, when negative online news public sentiment appears and is not understood in time and guided effectively, it can easily turn into a public opinion crisis, and in serious cases it can even affect public safety. Actively defusing online news public opinion crises is of great practical significance for maintaining social stability and promoting national development, and is part of what building a harmonious society means. Collecting the viewpoints expressed in online news public sentiment is therefore quite important: netizens' viewpoints play a vital role in the evolution of a hot event and can even be regarded as the core of online news public sentiment.
Recently, with the rapid development of Internet technology, new media represented by news media have broken the control and monopoly of information. People freely express their attitudes and opinions on the network and no longer accept information unconditionally as they did in the past. Instead, the interests and demands of different social strata are voiced one after another, and different ideas and viewpoints collide head-on. For the relevant government departments, understanding online news public sentiment promptly and accurately, and strengthening timely monitoring and effective guidance of online news public opinion, has become a major difficulty in managing online news public sentiment. It is therefore necessary to build a news public sentiment monitoring system that covers news data sources. Such a system can target the new communication environment of news media, support deeper study of methods for analyzing news public sentiment hotspots and of the influence of new media, and enrich and improve research on news public sentiment.
Many organizations have proposed different solutions for monitoring online news public sentiment. However, the technical problem that those skilled in the art still need to solve is how to improve the efficiency and accuracy of judging online news public sentiment information, because so far there has been no network public opinion monitoring system that handles news media data efficiently and accurately.
Summary of the invention
In view of the shortcomings of the background art described above, the present invention proposes a public sentiment monitoring system for news media that has higher accuracy. The purpose of the present invention is achieved by the following technical measures.
The present invention proposes a news public sentiment monitoring system comprising a news information acquisition module 1, a news data preprocessing module 2, a news public opinion analysis module 3, and a news public sentiment result display module 4, wherein:
The news information acquisition module 1 collects news public sentiment information on the Internet to obtain news data;
The news data preprocessing module 2 removes useless information from the news data obtained by the news information acquisition module 1 and organizes the remaining news data as necessary;
The news public opinion analysis module 3 takes the news data organized by the news data preprocessing module 2 as its basis and uses multiple news hotspot discovery submodules to find news public sentiment hotspots;
The news public sentiment result display module 4 outputs the news public opinion analysis results as charts or reports and provides the user interaction function.
Preferably, the news information acquisition module 1 uses a search engine web crawler method based on link analysis, driven by specified keywords, source URLs, or message subjects, and automatically collects multiple types of news public sentiment information concurrently through a multithreaded download queue; the multiple types of news public sentiment information include at least the text information and/or picture information of the news; and
The news data preprocessing module 2 includes: a preliminary filter submodule 2a, a text extraction submodule 2b, a word segmentation submodule 2c, a feature phrase filter submodule 2d, a text sentiment orientation analysis submodule 2e, a picture analysis submodule 2f, and a public sentiment popularity acquisition submodule 2g.
Preferably, the preliminary filter submodule 2a performs preliminary filtering on the information in the news data to remove noise, and processes each news item as follows:
Step 2a-1: for each news item, perform semantic analysis based on its title, detect all news items on the network that are similar to it, and obtain the similar group of this news item; if no similar news item is found, the similar group of this news item consists of the item itself;
Step 2a-2: for each news item, divide the total number of news items in its similar group, counted over all locations on the network where they appear, by the total number of network addresses that publish the news items in the similar group, to obtain the spatial repetition value S1 of this news item;
Step 2a-3: for each news item, count the total number of news items in its similar group that appear on the network, to obtain the temporal repetition value S2 of this news item;
Step 2a-4: compute the comprehensive repetition value S of each news item from its spatial repetition value S1 and its temporal repetition value S2, and compare S with a threshold; if S exceeds the threshold, filter out this news item and its similar group;
Here, the comprehensive repetition value S is calculated by the following formula:
$$S = \left(\log_2(S_1 + 50)\right)^{1/2} + \left(\log_2(S_2 + 20)\right)^{1/2} + \left(\lg S_1 \cdot \lg S_2\right)^{1/4}$$
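The formula for S can be sketched in a few lines of Python. This is only an illustration: the function names and the guard requiring both inputs to be at least 1 (so that the fourth root stays real) are assumptions, and the threshold is left to the caller because the text does not give a value.

```python
import math


def comprehensive_repetition(s1: float, s2: float) -> float:
    """Combine spatial repetition S1 and temporal repetition S2 into S."""
    if s1 < 1 or s2 < 1:
        # Assumption: both values are at least 1, so lg(S1) * lg(S2) >= 0
        # and its fourth root is a real number.
        raise ValueError("s1 and s2 are assumed to be >= 1")
    term_space = math.sqrt(math.log2(s1 + 50))
    term_time = math.sqrt(math.log2(s2 + 20))
    term_cross = (math.log10(s1) * math.log10(s2)) ** 0.25
    return term_space + term_time + term_cross


def should_filter(s1: float, s2: float, threshold: float) -> bool:
    """True when the news item and its similar group would be filtered out."""
    return comprehensive_repetition(s1, s2) > threshold
```

Because all three terms grow with S1 and S2, a heavily republished item scores higher and is more likely to exceed the threshold.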
Preferably, the text extraction submodule 2b extracts, from the news data processed by the preliminary filter submodule 2a, the body text that is useful for news public opinion analysis, reconstructs the body text, and gathers together the news information that is representative of a topic;
The word segmentation submodule 2c processes the news data output by the text extraction submodule 2b: it performs word segmentation, stop-word filtering, named entity recognition, syntactic parsing, part-of-speech tagging, sentiment recognition, feature word extraction, and feature phrase extraction, and builds a forward index and an inverted index; it also determines the orientation, subject attribute, and sentiment attribute of each word from the word's grammatical, part-of-speech, and sentiment attributes.
Preferably, the feature phrase filter submodule 2d filters and screens the feature phrases in the news data processed by the word segmentation submodule 2c, through the following steps:
Step 2d-1: deduplicate the feature phrases: record the repeated feature phrases that occur in the news text and the number of times each occurs, then filter out repeated feature phrases whose frequency of occurrence is below the repetition threshold and repeated feature phrases whose length is below the repetition threshold;
Step 2d-2: group the feature phrases: compute the similarity value between each feature phrase and the other feature phrases, and place feature phrases whose similarity value is above the similarity threshold in the same group; if the similarity values between a feature phrase and all other feature phrases are 0, filter out that feature phrase. Specifically, one of the following three steps can be selected to compute the similarity value Sims(X, Y) of two feature phrases X and Y before grouping:
Step 2d-2-1:
First, the similarity value Sims(X, Y) of feature phrases X and Y is the number of identical characters shared by X and Y;
Second, if Sims(X, Y) > threshold TD1, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-2-2:
First, let sum(XY) be the number of sentences in which feature phrases X and Y both occur, sum(X) the number of sentences in which X occurs but Y does not, and sum(Y) the number of sentences in which Y occurs but X does not. The similarity value Sims(X, Y) is then calculated as follows:
$$\mathrm{Sims}(X, Y) = \frac{\log_2 \mathrm{sum}(XY)}{\log_2 \mathrm{sum}(X)} + \frac{\log_2 \mathrm{sum}(XY)}{\log_2 \mathrm{sum}(Y)}$$
Second, if Sims(X, Y) > threshold TD2, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-2-3:
First, let the numbers of characters in feature phrases X and Y be m and n respectively, let k be the smaller of m and n, and let Xi and Yi denote the subphrases formed by the first i characters of X and Y, where i = 1, 2, ..., k. Define |Xi-Yi| as the number of characters in the longest common character string of subphrases Xi and Yi. The similarity value Sims(X, Y) is then calculated as follows:
$$\mathrm{Sims}(X, Y) = \left( |X_1 - Y_1|^3 + |X_2 - Y_2|^3 + \dots + |X_k - Y_k|^3 \right)^{1/3}$$
Second, if Sims(X, Y) > threshold TD3, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-3: apply entropy filtering to the feature phrases: compute the entropy of each feature phrase, then filter out feature phrases whose entropy is below a preset lower threshold and feature phrases whose entropy is above a preset upper threshold.
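The three alternative similarity measures of step 2d-2 can be sketched as follows. This is an illustrative reading only. All function names are hypothetical. "Identical characters" is interpreted as a multiset intersection, the "longest common character string" as the longest common contiguous substring, and the co-occurrence measure returns 0 when a logarithm would be undefined or a denominator would be zero; none of these edge cases is settled by the text.

```python
import math
from collections import Counter


def sim_shared_chars(x: str, y: str) -> int:
    """Step 2d-2-1: number of characters shared by X and Y (multiset reading)."""
    return sum((Counter(x) & Counter(y)).values())


def sim_cooccurrence(sentences, x: str, y: str) -> float:
    """Step 2d-2-2: similarity from sentence-level co-occurrence counts."""
    both = sum(1 for s in sentences if x in s and y in s)
    only_x = sum(1 for s in sentences if x in s and y not in s)
    only_y = sum(1 for s in sentences if y in s and x not in s)
    # Assumption: treat undefined cases (log of 0, division by log2(1) = 0) as 0.
    if both < 1 or only_x < 2 or only_y < 2:
        return 0.0
    return (math.log2(both) / math.log2(only_x)
            + math.log2(both) / math.log2(only_y))


def longest_common_substring_len(a: str, b: str) -> int:
    """Length of the longest contiguous string common to a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def sim_prefix_lcs(x: str, y: str) -> float:
    """Step 2d-2-3: cube root of the summed cubes of |Xi-Yi| over prefixes."""
    k = min(len(x), len(y))
    total = sum(longest_common_substring_len(x[:i], y[:i]) ** 3
                for i in range(1, k + 1))
    return total ** (1 / 3)
```

Whichever measure is chosen, phrase Y joins the group of phrase X when the value exceeds the corresponding threshold (TD1, TD2, or TD3).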
Preferably, the text sentiment orientation analysis submodule 2e performs text sentiment orientation analysis on the news, through the following steps:
Step 2e-1: manually choose a number of common Chinese and English adjectives, nouns, and verbs that carry a sentiment orientation as the initial seed set; in the initial seed set, the number of adjectives can be 50, and the number of nouns and verbs can be 150;
Step 2e-2: restore every pronoun in the news text that has a reference relation to the nominal object it originally refers to, so that objects are not missed or misjudged during analysis;
Step 2e-3: taking the sentence as the unit, analyze the sentence elements of each sentence in the news using part-of-speech tagging (POS) and semantic role labeling (SRL), and extract the subjective words in each sentence;
Step 2e-4: input the subjective words of each sentence in turn and automatically label their sentiment orientation according to the seed set; for a subjective word that cannot be labeled automatically, its sentiment orientation is judged manually and the word is then added to the seed set.
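Step 2e-4 amounts to a lexicon lookup with a manual fallback that grows the seed set. A minimal sketch follows; the function name, the dictionary representation of the seed set, and the `ask_human` callback standing in for the manual judgment are all assumptions made for illustration.

```python
def label_subjective_words(words, seed_set, ask_human):
    """Label each subjective word, extending seed_set when a word is unknown.

    seed_set maps a word to its sentiment orientation, for example
    "positive" or "negative". ask_human is a hypothetical callback that
    returns the manually judged orientation of a word.
    """
    labels = []
    for word in words:
        if word not in seed_set:
            # The word cannot be labeled automatically: judge it manually
            # once, then add it to the seed set for later sentences.
            seed_set[word] = ask_human(word)
        labels.append((word, seed_set[word]))
    return labels
```

Because judged words are added to the seed set, the same word needs manual attention only the first time it appears.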
Preferably, the picture analysis submodule 2f extracts and expresses the visual features of the pictures in the news data; the visual features of a picture include its color features, Tamura texture features, and shape features;
The color features are represented by color histograms based on the HSV space, the Luv space, and the Lab space;
The Tamura texture features include the coarseness, contrast, and directionality of the picture;
The shape features include the curvature function, the centroid distance, and the complex coordinate function, obtained by applying a Fourier transform to the coordinates of all pixel points on the boundary contour of an object in the picture.
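As an illustration of the color feature, the following sketch builds a normalized color histogram in HSV space using only the Python standard library. The bin counts (8, 4, 4) and the function name are assumptions; the text does not specify a quantization, and the Luv and Lab histograms are not covered here.

```python
import colorsys


def hsv_histogram(pixels, bins=(8, 4, 4)):
    """Normalized HSV histogram of an iterable of (r, g, b) pixels in 0..255."""
    h_bins, s_bins, v_bins = bins
    hist = [0.0] * (h_bins * s_bins * v_bins)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a component equal to 1.0 falls into the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
        count += 1
    if count:
        hist = [value / count for value in hist]
    return hist
```

Normalizing by the pixel count makes histograms of pictures with different sizes comparable.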
Preferably, the public sentiment popularity acquisition submodule 2g calculates the public sentiment popularity weight ρ of the news; if ρ is greater than a preset threshold Tρ, the news is used as a data source and analysis basis for public opinion analysis. Specifically:
Let the number of views be K1, the number of comments K2, the number of replies K3, the number of "support" clicks K4, the number of "oppose" clicks K5, the number of forwards K6, and the number of favorites K7, and let ξ1 to ξ4 be preset, adjustable coefficients. Then:
$$\rho = \left(\lg (K_1)^{3/4} + 0.03\right)\xi_1 + \left(\lg\left((K_2)^{2/3} + (K_3)^{2/3}\right) + 0.02\right)\xi_2 + \left(\lg\left((K_4)^{1/2} + (K_5)^{1/2}\right) + 0.01\right)\xi_3 + \left(\lg\left((K_6)^{1/3} + (K_7)^{1/3}\right) + 0.005\right)\xi_4$$
Here, ξ1 to ξ4 can be set to: ξ1 = 0.5; ξ2 = 0.3; ξ3 = 0.2; ξ4 = 0.1.
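The popularity weight can be computed directly from the seven counts. The sketch below is one possible reading: it assumes the exponent 3/4 in the first term applies to K1 inside the logarithm, by analogy with the other three terms, and it assumes the counts are large enough for every logarithm to be defined. The function and constant names are hypothetical.

```python
import math

# Example coefficients xi1..xi4 given in the text.
DEFAULT_XI = (0.5, 0.3, 0.2, 0.1)


def popularity_weight(k1, k2, k3, k4, k5, k6, k7, xi=DEFAULT_XI):
    """Public sentiment popularity weight rho from the counts K1..K7."""
    if k1 <= 0 or k2 + k3 <= 0 or k4 + k5 <= 0 or k6 + k7 <= 0:
        # Assumption: the text does not say how zero counts are handled.
        raise ValueError("each logarithm needs a positive argument")
    xi1, xi2, xi3, xi4 = xi
    # Assumption: lg (K1)^(3/4) is read as lg(K1 ** (3/4)).
    views = math.log10(k1 ** 0.75) + 0.03
    discussion = math.log10(k2 ** (2 / 3) + k3 ** (2 / 3)) + 0.02
    votes = math.log10(k4 ** 0.5 + k5 ** 0.5) + 0.01
    spread = math.log10(k6 ** (1 / 3) + k7 ** (1 / 3)) + 0.005
    return views * xi1 + discussion * xi2 + votes * xi3 + spread * xi4


def is_analysis_source(rho: float, threshold: float) -> bool:
    """True when the news item is kept as a data source (rho > T_rho)."""
    return rho > threshold
```

The fractional exponents and logarithms compress large counts, so a single viral metric does not dominate the weight.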
Preferably, the news public opinion analysis module 3 analyzes and finds news public sentiment hotspots through the following steps:
First, multiple news hotspot discovery submodules are used, and news public sentiment hotspots are obtained through parallel distributed computing. The news hotspot discovery submodules include:
1) Single-Pass news hotspot discovery submodule 3.1, which uses a Single-Pass algorithm based on MapReduce;
2) KNN news hotspot discovery submodule 3.2, which uses a k-nearest-neighbor (KNN) classification algorithm based on MapReduce;
3) SVM news hotspot discovery submodule 3.3, which uses a support vector machine (SVM) algorithm based on MapReduce;
4) K-means news hotspot discovery submodule 3.4, which uses a K-means clustering algorithm based on MapReduce; and
5) SOM news hotspot discovery submodule 3.5, which uses a self-organizing map (SOM) neural network clustering algorithm based on MapReduce;
Second, all news public sentiment hotspots obtained by the above news hotspot discovery submodules are gathered together and classified as follows:
If a news public sentiment hotspot is obtained from three or more of the above hotspot discovery submodules, its category is labeled as a senior news public sentiment hotspot;
If a news public sentiment hotspot is obtained from two of the above hotspot discovery submodules, its category is labeled as an intermediate news public sentiment hotspot;
If a news public sentiment hotspot is obtained from only one of the above hotspot discovery submodules, its category is labeled as a primary news public sentiment hotspot;
Finally, the senior, intermediate, and primary news public sentiment hotspots are sent, in that order, to the news public sentiment result display module 4.
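The classification rule is a vote count over the discovery submodules. The sketch below assumes that hotspots reported by different submodules can be matched by a shared identifier, which the text does not specify; the function name and the dictionary input format are hypothetical.

```python
from collections import defaultdict


def classify_hotspots(results):
    """Group hotspots by how many discovery submodules reported them.

    results maps a submodule name to the hotspot identifiers it found.
    """
    sources = defaultdict(set)
    for submodule, hotspots in results.items():
        for hotspot in hotspots:
            sources[hotspot].add(submodule)
    levels = {"senior": [], "intermediate": [], "primary": []}
    for hotspot, submodules in sources.items():
        if len(submodules) >= 3:
            levels["senior"].append(hotspot)
        elif len(submodules) == 2:
            levels["intermediate"].append(hotspot)
        else:
            levels["primary"].append(hotspot)
    for hotspot_list in levels.values():
        hotspot_list.sort()
    return levels
```

Iterating over the returned dictionary in insertion order yields the senior, intermediate, and primary hotspots in the order in which they are sent to the display module.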
Preferably, the news public sentiment result display module 4 is based on the J2EE framework and can produce: a news public sentiment information popularity ranking report, a news public sentiment early-warning information distribution report, a news public sentiment geographic information distribution report, a news public sentiment sentiment analysis report, a news public sentiment statistics report, and a news public sentiment trend analysis chart.
In the prior art, the main data sources for network public opinion are usually various websites or forums, and monitoring systems dedicated to news public sentiment data are relatively few; even those dedicated to news public sentiment data often suffer from low accuracy or low efficiency for various reasons. The present invention proposes a monitoring system dedicated to public sentiment data from news network data sources.
Compared with the prior art, the present invention has the following advantages:
First, the news public sentiment monitoring system of the present invention targets news network resources. The collected news data passes through data preprocessing steps including preliminary filtering, text extraction, word segmentation, feature phrase filtering, text sentiment orientation analysis, picture analysis, and public sentiment popularity acquisition, which effectively improves the efficiency of filtering news public sentiment data from news network data sources;
Second, through distributed cloud computing, large-scale collected data can be mined and analyzed, and news public sentiment hotspots can be obtained from multiple news public sentiment monitoring algorithm modules and then comprehensively judged and classified. This enables discovery and tracking of news public sentiment hot topics, social network analysis of news, and visual presentation of the analysis results. It gives organizations such as Party and government bodies and large enterprises automated, systematic, and scientific information support for finding sensitive news information in time, grasping news public sentiment hotspots, following news public sentiment trends, and responding to news public sentiment crises. It effectively improves the accuracy of the system's judgments and provides a more truthful and accurate basis for the subsequent processing of online news public sentiment information.
Brief description of the drawings
The technical solution of the present invention is further described below with reference to the accompanying drawings. In the drawings, identical reference numerals denote identical functional modules. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention.
Fig. 1 shows the functional structure chart of news public sentiment monitoring system according to an embodiment of the invention.
Fig. 2 shows the functional structure chart of news data pretreatment module according to an embodiment of the invention.
Detailed description of the embodiments
From the detailed description of the preferred embodiments below, various other advantages and benefits will become clear to those of ordinary skill in the art. The description is only an overview of the technical solution of the present invention, provided so that the technical means of the invention can be better understood and practiced according to the specification, and so that the above and other objects, features, and advantages of the invention become more apparent.
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present invention can be understood more thoroughly and so that the scope of the disclosure can be conveyed completely to those skilled in the art.
The present invention claims a news public sentiment monitoring system comprising a news information acquisition module, a news data preprocessing module, a news public opinion analysis module, and a news public sentiment result display module. The news data preprocessing module includes a preliminary filter submodule, a text extraction submodule, a word segmentation submodule, a feature phrase filter submodule, a text sentiment orientation analysis submodule, a picture analysis submodule, and a public sentiment popularity acquisition submodule. The data preprocessed by the news data preprocessing module is processed through distributed cloud computing: news public sentiment hotspots are obtained using multiple news public sentiment monitoring algorithm submodules, and the hotspots obtained are comprehensively judged, classified, and assessed, so that news public sentiment hot topics are monitored more efficiently and accurately.
Fig. 1 is the functional structure chart of news public sentiment monitoring system according to an embodiment of the invention.
As shown in Fig. 1, the news public sentiment monitoring system includes four modules: a news information acquisition module 1, a news data preprocessing module 2, a news public opinion analysis module 3, and a news public sentiment result display module 4, wherein:
The news information acquisition module 1 collects news public sentiment information on the Internet to obtain news data;
The news data preprocessing module 2 removes useless information from the news data obtained by the news information acquisition module 1 and organizes the remaining news data as necessary;
The news public opinion analysis module 3 takes the information organized by the news data preprocessing module 2 as its basis and uses hotspot discovery submodules to find public sentiment hotspots;
The news public sentiment result display module 4 outputs the news public opinion analysis results as charts or reports and provides the user interaction function.
Specifically:
The news information acquisition module 1 uses a search engine web crawler method based on link analysis, driven by specified keywords, source URLs, or message subjects, and automatically collects multiple types of news public sentiment information concurrently through a multithreaded download queue; the multiple types of news public sentiment information include at least text information and/or picture information.
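The multithreaded download queue can be illustrated with the Python standard library. This sketch covers only the concurrent queue: the `fetch` callback, which returns a page's content and its outgoing links, is a hypothetical stand-in for the actual downloader, and no link-analysis ranking or keyword filtering is implemented.

```python
import queue
import threading


def crawl(seed_urls, fetch, num_workers=4):
    """Download pages concurrently, following links found by fetch(url).

    fetch is a caller-supplied function returning (content, links).
    Returns a dict mapping each visited URL to its content.
    """
    todo = queue.Queue()
    seen = set(seed_urls)
    results = {}
    lock = threading.Lock()
    for url in seen:
        todo.put(url)

    def worker():
        while True:
            url = todo.get()
            if url is None:  # Sentinel: no more work.
                todo.task_done()
                return
            try:
                content, links = fetch(url)
                new_links = []
                with lock:
                    results[url] = content
                    for link in links:
                        if link not in seen:
                            seen.add(link)
                            new_links.append(link)
                # Enqueue before task_done so join() cannot return early.
                for link in new_links:
                    todo.put(link)
            except Exception:
                pass  # A failed download is skipped in this sketch.
            finally:
                todo.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    todo.join()
    for _ in threads:
        todo.put(None)
    for thread in threads:
        thread.join()
    return results
```

The shared `seen` set, guarded by a lock, keeps a URL from being queued twice even when several workers discover it at the same time.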
Fig. 2 is the functional structure chart of news data pretreatment module according to an embodiment of the invention.
As shown in Fig. 2, the news data preprocessing module 2 includes: a preliminary filter submodule 2a, a text extraction submodule 2b, a word segmentation submodule 2c, a feature phrase filter submodule 2d, a text sentiment orientation analysis submodule 2e, a picture analysis submodule 2f, and a public sentiment popularity acquisition submodule 2g.
Specifically:
The preliminary filter submodule 2a performs preliminary filtering on the information in the news data to remove noise, and processes each news item as follows:
Step 2a-1: for each news item, perform semantic analysis based on its title, detect all news items on the network that are similar to it, and obtain the similar group of this news item; if no similar news item is found, the similar group of this news item consists of the item itself;
Step 2a-2: for each news item, divide the total number of news items in its similar group, counted over all locations on the network where they appear, by the total number of network addresses that publish the news items in the similar group, to obtain the spatial repetition value S1 of this news item;
Step 2a-3: for each news item, count the total number of news items in its similar group that appear on the network, to obtain the temporal repetition value S2 of this news item;
Step 2a-4: compute the comprehensive repetition value S of each news item from its spatial repetition value S1 and its temporal repetition value S2, and compare S with a threshold; if S exceeds the threshold, filter out this news item and its similar group;
Here, the comprehensive repetition value S is calculated by the following formula:
$$S = \left(\log_2(S_1 + 50)\right)^{1/2} + \left(\log_2(S_2 + 20)\right)^{1/2} + \left(\lg S_1 \cdot \lg S_2\right)^{1/4}$$
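Steps 2a-1 to 2a-3 can be sketched as follows. Two substitutions are made and should be read as assumptions: title similarity is approximated with `difflib.SequenceMatcher` in place of the semantic analysis named in the text, and the "network address" of a news item is taken to be the publishing site stored in a hypothetical `site` field. The 0.8 similarity cutoff is likewise illustrative.

```python
from difflib import SequenceMatcher


def similar_group(target, items, min_ratio=0.8):
    """Step 2a-1: items whose title is similar to the target's title.

    Each item is a dict with "title" and "site" keys (hypothetical schema).
    Falls back to the target itself when nothing similar is found.
    """
    group = [
        item for item in items
        if SequenceMatcher(None, target["title"], item["title"]).ratio() >= min_ratio
    ]
    return group or [target]


def repetition_values(group):
    """Steps 2a-2 and 2a-3: spatial repetition S1 and temporal repetition S2."""
    total = len(group)
    distinct_sites = len({item["site"] for item in group})
    s1 = total / distinct_sites  # Copies per publishing address.
    s2 = total                   # Total copies seen on the network.
    return s1, s2
```

The resulting S1 and S2 are the inputs to the comprehensive repetition value S of step 2a-4.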
Specifically:
The text extraction submodule 2b extracts, from the news data processed by the preliminary filter submodule 2a, the body text that is useful for news public opinion analysis, reconstructs the body text, and gathers together the news information that is representative of a topic;
The word segmentation submodule 2c processes the news data output by the text extraction submodule 2b: it performs word segmentation, stop-word filtering, named entity recognition, syntactic parsing, part-of-speech tagging, sentiment recognition, feature word extraction, and feature phrase extraction, and builds a forward index and an inverted index; it also determines the orientation, subject attribute, and sentiment attribute of each word from the word's grammatical, part-of-speech, and sentiment attributes.
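The forward and inverted indexes built by the word segmentation submodule can be illustrated with a small sketch. It assumes the documents are already segmented into tokens, and the function name and input format are hypothetical; segmentation itself and the other analysis steps are out of scope here.

```python
from collections import defaultdict


def build_indexes(docs, stop_words=frozenset()):
    """Build a forward index and an inverted index from tokenized documents.

    docs maps a document identifier to its list of tokens.
    """
    forward = {}                  # doc_id -> tokens kept after filtering
    inverted = defaultdict(set)   # token -> set of doc_ids containing it
    for doc_id, tokens in docs.items():
        kept = [token for token in tokens if token not in stop_words]
        forward[doc_id] = kept
        for token in kept:
            inverted[token].add(doc_id)
    return forward, dict(inverted)
```

The forward index answers "which words does this news item contain", while the inverted index answers "which news items contain this word", which is the lookup that hotspot discovery typically needs.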
Specifically:
The feature phrase filter submodule 2d filters and screens the feature phrases in the news data processed by the word segmentation submodule 2c, through the following steps:
Step 2d-1: deduplicate the feature phrases: record the repeated feature phrases that occur in the news text and the number of times each occurs, then filter out repeated feature phrases whose frequency of occurrence is below the repetition threshold and repeated feature phrases whose length is below the repetition threshold;
Step 2d-2: group the feature phrases: compute the similarity value between each feature phrase and the other feature phrases, and place feature phrases whose similarity value is above the similarity threshold in the same group; if the similarity values between a feature phrase and all other feature phrases are 0, filter out that feature phrase. Specifically, one of the following three steps can be selected to compute the similarity value Sims(X, Y) of two feature phrases X and Y before grouping:
Step 2d-2-1:
First, the similarity value Sims(X, Y) of feature phrases X and Y is the number of identical characters shared by X and Y;
Second, if Sims(X, Y) > threshold TD1, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-2-2:
First, let sum(XY) be the number of sentences in which feature phrases X and Y both occur, sum(X) the number of sentences in which X occurs but Y does not, and sum(Y) the number of sentences in which Y occurs but X does not. The similarity value Sims(X, Y) is then calculated as follows:
$$\mathrm{Sims}(X, Y) = \frac{\log_2 \mathrm{sum}(XY)}{\log_2 \mathrm{sum}(X)} + \frac{\log_2 \mathrm{sum}(XY)}{\log_2 \mathrm{sum}(Y)}$$
Second, if Sims(X, Y) > threshold TD2, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-2-3:
First, let the numbers of characters in feature phrases X and Y be m and n respectively, let k be the smaller of m and n, and let Xi and Yi denote the subphrases formed by the first i characters of X and Y, where i = 1, 2, ..., k. Define |Xi-Yi| as the number of characters in the longest common character string of subphrases Xi and Yi. The similarity value Sims(X, Y) is then calculated as follows:
$$\mathrm{Sims}(X, Y) = \left( |X_1 - Y_1|^3 + |X_2 - Y_2|^3 + \dots + |X_k - Y_k|^3 \right)^{1/3}$$
Second, if Sims(X, Y) > threshold TD3, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-3: apply entropy filtering to the feature phrases: compute the entropy of each feature phrase, then filter out feature phrases whose entropy is below a preset lower threshold and feature phrases whose entropy is above a preset upper threshold.
Specifically:
The text sentiment orientation analysis submodule 2e performs sentiment orientation analysis on the news text, comprising the following steps:
Step 2e-1, manually select a number of common Chinese and English adjectives, nouns and verbs that carry a sentiment orientation as the initial seed set; in the initial seed set, the number of adjectives may be 50 and the number of nouns and verbs may be 150;
Step 2e-2, resolve every pronoun in the news text that has a reference relation back to the original noun it refers to, so that the object under analysis is not missed or misjudged;
Step 2e-3, taking each sentence of the news as a unit, analyze the sentence constituents of each sentence using part-of-speech tagging (POS) and semantic role labeling (SRL), and extract the subjective words in each sentence;
Step 2e-4, input the subjective words of each sentence in turn and automatically label the sentiment orientation of the subjective words in the news sentences according to the seed set; a subjective word that cannot be labeled automatically is added to the seed set after its sentiment orientation has been judged manually.
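Step 2e-4 describes a seed set that grows whenever a word needs manual judgment. A minimal sketch of that loop follows; the dictionary representation of the seed set, the polarity strings and the `ask_human` callback are all assumptions made for illustration and do not come from the patent.

```python
def label_subjective_words(words, seed_set, ask_human):
    """Label each subjective word using the seed set.

    seed_set: dict mapping word -> sentiment orientation (hypothetical format).
    ask_human: callback returning the orientation of an unknown word,
               standing in for the manual judgment of Step 2e-4.
    """
    labels = {}
    for word in words:
        if word not in seed_set:
            # Manual judgment, after which the word joins the seed set.
            seed_set[word] = ask_human(word)
        labels[word] = seed_set[word]
    return labels
```

Because the seed set is updated in place, a word judged manually once is labeled automatically on every later occurrence.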
Specifically:
The picture analysis submodule 2f extracts and represents the visual features of pictures in the news data; the visual features of a picture include its color features, Tamura texture features and shape features;
The color features are represented by color histograms based on the HSV, Luv and Lab color spaces;
The most important Tamura texture features are the coarseness, contrast and directionality of the picture, which are particularly important for picture retrieval;
For the shape features, the system of the present invention uses Fourier shape descriptors, whose basic idea is to apply a Fourier transform to the coordinates of all pixels on the boundary contour of an object in the picture, yielding a curvature function, a centroid distance and a complex coordinate function.
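As an illustration of the shape feature, the sketch below computes a Fourier descriptor from the centroid distance of a closed boundary. It assumes the common formulation in which the centroid distance signature is itself Fourier transformed and the coefficient magnitudes are normalized by the zero-frequency term; the patent does not spell out this detail, and the function name and parameters are hypothetical.

```python
import cmath
import math


def centroid_distance_descriptor(boundary, n_coeffs=8):
    """Fourier descriptor of a closed contour from its centroid distance.

    boundary: list of (x, y) boundary points in traversal order.
    Returns |F_1|/|F_0| ... |F_n|/|F_0|; dividing by |F_0| removes the
    effect of scale, and taking magnitudes removes the effect of the
    starting point.
    """
    n = len(boundary)
    cx = sum(x for x, _ in boundary) / n
    cy = sum(y for _, y in boundary) / n
    # Shape signature: distance of every boundary point from the centroid.
    r = [math.hypot(x - cx, y - cy) for x, y in boundary]
    # Plain O(n^2) discrete Fourier transform of the signature.
    coeffs = [
        sum(r[t] * cmath.exp(-2j * math.pi * u * t / n) for t in range(n)) / n
        for u in range(n_coeffs + 1)
    ]
    dc = abs(coeffs[0])
    return [abs(c) / dc for c in coeffs[1:]]
```

A circle has a constant centroid distance, so all of its descriptor values are close to zero, while an elongated shape produces a clearly non-zero second coefficient.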
Specifically:
The public sentiment popularity acquisition submodule 2g computes the public sentiment popularity weight ρ of a news item; if ρ exceeds a preset threshold Tρ, the news item is used as a data source and analysis basis for public sentiment analysis, as follows:
Let K1 be the number of views, K2 the number of comments, K3 the number of replies, K4 the number of "support" clicks, K5 the number of "oppose" clicks, K6 the number of forwards and K7 the number of favorites, and let ξ1 to ξ4 be preset, adjustable coefficients; then
$$\rho = \left(\lg (K_1)^{3/4} + 0.03\right)\xi_1 + \left(\lg\left((K_2)^{2/3} + (K_3)^{2/3}\right) + 0.02\right)\xi_2 + \left(\lg\left((K_4)^{1/2} + (K_5)^{1/2}\right) + 0.01\right)\xi_3 + \left(\lg\left((K_6)^{1/3} + (K_7)^{1/3}\right) + 0.005\right)\xi_4;$$
where ξ1 to ξ4 may be set to: ξ1 = 0.5; ξ2 = 0.3; ξ3 = 0.2; ξ4 = 0.1.
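The weight ρ can be evaluated directly from the seven counts. The following sketch uses the example coefficients given above; it assumes that the first term means lg(K1^(3/4)) and that all counts are positive so every logarithm is defined. The function name and argument layout are hypothetical.

```python
import math


def popularity_weight(k, xi=(0.5, 0.3, 0.2, 0.1)):
    """Public sentiment popularity weight rho from counts K1..K7.

    k: sequence (K1, ..., K7); counts are assumed positive.
    Assumption: the first term is read as lg(K1 ** (3/4)).
    """
    k1, k2, k3, k4, k5, k6, k7 = k
    lg = math.log10
    return (
        (lg(k1 ** 0.75) + 0.03) * xi[0]
        + (lg(k2 ** (2 / 3) + k3 ** (2 / 3)) + 0.02) * xi[1]
        + (lg(k4 ** 0.5 + k5 ** 0.5) + 0.01) * xi[2]
        + (lg(k6 ** (1 / 3) + k7 ** (1 / 3)) + 0.005) * xi[3]
    )
```

A news item would then be kept for analysis when `popularity_weight(counts)` exceeds the chosen threshold Tρ. The fractional exponents and logarithms compress large counts, so a tenfold increase in views adds only 0.75 × ξ1 to ρ.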
The news public sentiment analysis module 3 analyzes the data processed by the news data preprocessing module 2 in order to discover news public sentiment hotspots. Specifically:
The present invention uses a distributed cloud computing approach and can mine and analyze news data collected on a large scale. It obtains news public sentiment hotspots from several public sentiment monitoring algorithm modules and classifies them by combined judgment, thereby realizing the discovery and tracking of hot news public sentiment topics and the social network analysis of news, and presenting the analysis results visually. This gives Party and government organs, large enterprises and other organizations automated, systematic and scientific information support for discovering sensitive news information promptly, grasping news public sentiment hotspots, following news public sentiment trends and responding to news public sentiment crises. It effectively improves the accuracy of the judgments made by the news public sentiment monitoring system and provides a more truthful and accurate basis for the subsequent handling of Internet news public sentiment information. Specifically:
The collected news data and the analysis results are stored in a distributed storage layer, which is implemented on HDFS;
In the distributed computing layer, parallel computation is implemented with the MapReduce parallel computing method;
Through optimized HDFS file storage and transfer and optimized MapReduce parallel computation, the system optimizes the monitoring of massive volumes of news public sentiment and achieves stable, efficient big-data storage, so that query processing over massive news public sentiment data is optimized and the system has good scalability, reliability and security. The system is based on a cloud platform, responds quickly, and supports analysis and mining services over massive news data.
The news public sentiment analysis module 3 analyzes the news data processed by the news data preprocessing module 2 to discover news public sentiment hotspots through the following steps:
First, multiple hot news discovery submodules are used to obtain news public sentiment hotspots by parallel distributed computing; the hot news discovery submodules include:
1) Single-Pass hot news discovery submodule 3.1, which uses a MapReduce-based Single-Pass algorithm;
2) KNN hot news discovery submodule 3.2, which uses a MapReduce-based K-nearest-neighbor (KNN) classification algorithm;
3) SVM hot news discovery submodule 3.3, which uses a MapReduce-based support vector machine (SVM) algorithm;
4) K-means hot news discovery submodule 3.4, which uses a MapReduce-based K-means clustering algorithm; and
5) SOM hot news discovery submodule 3.5, which uses a MapReduce-based self-organizing map (SOM) neural network clustering algorithm;
Second, all news public sentiment hotspots obtained by the individual hot news discovery submodules are pooled and graded as follows:
If a news public sentiment hotspot was obtained by three or more of the above hot news discovery submodules, it is labeled a senior news public sentiment hotspot;
If a news public sentiment hotspot was obtained by two of the above hot news discovery submodules, it is labeled an intermediate news public sentiment hotspot;
If a news public sentiment hotspot was obtained by only one of the above hot news discovery submodules, it is labeled a primary news public sentiment hotspot;
Finally, the senior, intermediate and primary news public sentiment hotspots are sent, in that order, to the news public sentiment result display module 4.
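The pooling and grading step can be sketched in a few lines of Python. The sketch assumes that hotspots reported by different submodules can be matched by a shared identifier, which the text does not specify; the function name, the identifier strings and the grade labels are illustrative only.

```python
from collections import defaultdict


def grade_hotspots(results):
    """results: mapping of submodule name -> iterable of hotspot identifiers.

    Returns (hotspot, grade) pairs ordered senior, intermediate, primary.
    """
    sources = defaultdict(set)
    for submodule, hotspots in results.items():
        for hotspot in hotspots:
            sources[hotspot].add(submodule)

    def grade(count):
        # Three or more submodules -> senior, two -> intermediate, one -> primary.
        if count >= 3:
            return "senior"
        return "intermediate" if count == 2 else "primary"

    order = {"senior": 0, "intermediate": 1, "primary": 2}
    graded = [(h, grade(len(s))) for h, s in sources.items()]
    return sorted(graded, key=lambda item: (order[item[1]], item[0]))
```

Sorting by grade reproduces the order in which the hotspots are passed to the display module.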
The algorithms used by hot news discovery submodules 3.1 to 3.5 are all general-purpose algorithms in the ordinary sense of the art, so the improvement made by the present invention does not lie in those algorithms themselves. Existing news public sentiment monitoring systems typically use only one news public sentiment hotspot discovery algorithm; no system has been found that uses the above hotspot discovery algorithms together and grades the combined results of those algorithms. Moreover, although the news public sentiment monitoring system of the present invention uses several public sentiment hotspot discovery algorithms, it adopts a cloud-computing-based distributed architecture and therefore does not incur prohibitive overhead; and the combination of several methods substantially improves the accuracy of the news public sentiment monitoring system, achieving a good technical effect.
Specifically:
The news public sentiment result display module 4 is based on the J2EE framework and can generate: a news public sentiment popularity ranking report, a news public sentiment early-warning information distribution report, a news public sentiment geographic information distribution report, a news public sentiment sentiment analysis report, a news public sentiment statistics report and a news public sentiment trend analysis chart.
The embodiments of the system and its constituent modules described in this specification are merely illustrative; some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiments of the present invention. A person of ordinary skill in the art can understand and implement them without creative effort.
In summary, the above are only preferred embodiments of the present invention, and the scope of protection of the present invention is not limited to them. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of the claims.

Claims (8)

1. A news public sentiment monitoring system, comprising a news information acquisition module (1), a news data preprocessing module (2), a news public sentiment analysis module (3) and a news public sentiment result display module (4), wherein
the news information acquisition module (1) is configured to collect news public sentiment information on the Internet to obtain news data;
the news data preprocessing module (2) is configured to remove useless information from the news data obtained by the news information acquisition module (1) and to organize the news data from which the useless information has been removed;
the news public sentiment analysis module (3) is configured to discover news public sentiment hotspots, using multiple hot news discovery submodules, on the basis of the news data organized by the news data preprocessing module (2);
the news public sentiment result display module (4) is configured to output the news public sentiment analysis results as charts or reports and to provide user interaction functions;
the news information acquisition module (1) is configured to collect multiple types of news public sentiment information automatically and concurrently through a multithreaded download queue, according to specified keywords, source URLs or message topics, using a search engine web crawler method based on link analysis; wherein the multiple types of news public sentiment information include at least text information and/or picture information of the news; and
the news data preprocessing module (2) comprises: a preliminary filtering submodule (2a), a body text extraction submodule (2b), a word segmentation submodule (2c), a feature phrase filtering submodule (2d), a text sentiment orientation analysis submodule (2e), a picture analysis submodule (2f) and a public sentiment popularity acquisition submodule (2g);
the preliminary filtering submodule (2a) is configured to preliminarily filter the information in the news data and remove noise from the news data by processing each news data item as follows:
Step 2a-1, for each news data item, perform semantic analysis based on its title and detect all news data on the network similar to this item, obtaining the similarity group of this item; if no similar news data is found, the similarity group of this item is the item itself;
Step 2a-2, for each news data item, divide the total number of news data items in its similarity group that appear at all locations on the network by the total number of network addresses that published the news data items in its similarity group, obtaining the spatial repetition value S1 of this item;
Step 2a-3, for each news data item, count the total number of news data items in its similarity group that appear on the network, obtaining the temporal repetition value S2 of this item;
Step 2a-4, compute the combined repetition value S of each news data item from its spatial repetition value S1 and temporal repetition value S2 and compare it with a threshold; if the combined repetition value S exceeds the threshold, filter out this news data item and its similarity group;
wherein the combined repetition value S is computed by the following formula:
$$S = \left(\log_2(S_1 + 50)\right)^{1/2} + \left(\log_2(S_2 + 20)\right)^{1/2} + \left((\lg S_1)(\lg S_2)\right)^{1/4}$$
2. The news public sentiment monitoring system according to claim 1, characterized in that:
the body text extraction submodule (2b) is configured to extract, from the news data processed by the preliminary filtering submodule (2a), the body-text information useful for news public sentiment analysis, to reconstruct the body text, and to cluster together news information that is representative of a theme;
the word segmentation submodule (2c) is configured to perform, on the news data processed by the body text extraction submodule (2b), word segmentation, stop-word filtering, named entity recognition, syntactic parsing, part-of-speech tagging, sentiment recognition, feature word extraction and feature phrase extraction, and to build a forward index and an inverted index; and to derive the orientation, topic attribute and sentiment attribute of each word from its grammatical, part-of-speech and sentiment attributes.
3. The news public sentiment monitoring system according to claim 2, characterized in that:
the feature phrase filtering submodule (2d) is configured to filter and screen the feature phrases in the news data processed by the word segmentation submodule (2c), comprising the following steps:
Step 2d-1, deduplicate the feature phrases: record the repeated feature phrases that occur in the news text and the number of times each occurs, and filter out repeated feature phrases whose frequency of occurrence is below the repetition threshold and repeated feature phrases whose length is below the repetition threshold;
Step 2d-2, group the feature phrases: compute the similarity value between each feature phrase and every other feature phrase, and place feature phrases whose similarity value exceeds a similarity threshold in the same group; if the similarity value between a feature phrase and all other feature phrases is 0, filter that feature phrase out; specifically, select one of the following three steps to compute the similarity value Sims(X, Y) of two feature phrases X and Y, and then group the feature phrases:
Step 2d-2-1:
First, the similarity value Sims(X, Y) of feature phrases X and Y is the number of identical characters shared by the two feature phrases;
Second, if Sims(X, Y) > threshold TD1, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-2-2:
First, let sum(XY) be the number of sentences in which feature phrases X and Y both occur, sum(X) the number of sentences in which X occurs but Y does not, and sum(Y) the number of sentences in which Y occurs but X does not; the similarity value Sims(X, Y) of feature phrases X and Y is then computed as follows:
$$\mathrm{Sims}(X, Y) = \frac{\log_2(\mathrm{sum}(XY))}{\log_2(\mathrm{sum}(X))} + \frac{\log_2(\mathrm{sum}(XY))}{\log_2(\mathrm{sum}(Y))};$$
Second, if Sims(X, Y) > threshold TD2, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-2-3:
First, assume that the two feature phrases X and Y contain m and n characters respectively, and let k be the smaller of m and n; let Xi and Yi denote the subphrases formed by the first i characters of X and Y respectively, where i = 1, 2, ..., k; define
$|X_i - Y_i|$ as the number of characters in the longest common character string of subphrases Xi and Yi; the similarity value Sims(X, Y) of feature phrases X and Y is then computed as follows:
$$\mathrm{Sims}(X, Y) = \left(|X_1 - Y_1|^3 + |X_2 - Y_2|^3 + \dots + |X_k - Y_k|^3\right)^{1/3}$$
Second, if Sims(X, Y) > threshold TD3, feature phrase Y is placed in the group containing feature phrase X;
Step 2d-3, perform entropy filtering on the feature phrases: compute the entropy of each feature phrase, and filter out feature phrases whose entropy is below a preset lower threshold as well as feature phrases whose entropy is above a preset upper threshold.
4. The news public sentiment monitoring system according to claim 3, characterized in that:
the text sentiment orientation analysis submodule (2e) is configured to perform sentiment orientation analysis on the news text, comprising the following steps:
Step 2e-1, manually select a number of common Chinese and English adjectives, nouns and verbs that carry a sentiment orientation as the initial seed set; in the initial seed set, the number of adjectives is 50 and the number of nouns and verbs is 150;
Step 2e-2, resolve every pronoun in the news text that has a reference relation back to the original noun it refers to, so that the object under analysis is not missed or misjudged;
Step 2e-3, taking each sentence of the news as a unit, analyze the sentence constituents of each sentence using part-of-speech tagging (POS) and semantic role labeling (SRL), and extract the subjective words in each sentence;
Step 2e-4, input the subjective words of each sentence in turn and automatically label the sentiment orientation of the subjective words in the news sentences according to the seed set; a subjective word that cannot be labeled automatically is added to the seed set after its sentiment orientation has been judged manually.
5. The news public sentiment monitoring system according to claim 4, characterized in that:
the picture analysis submodule (2f) is configured to extract and represent the visual features of pictures in the news data, the visual features of a picture including its color features, Tamura texture features and shape features;
the color features are represented by color histograms based on the HSV, Luv and Lab color spaces;
the Tamura texture features include the coarseness, contrast and directionality of the picture;
the shape features include the curvature function, centroid distance and complex coordinate function obtained by applying a Fourier transform to the coordinates of all pixels on the boundary contour of an object in the picture.
6. The news public sentiment monitoring system according to claim 5, characterized in that:
the public sentiment popularity acquisition submodule (2g) is configured to compute the public sentiment popularity weight ρ of a news item; if ρ exceeds a preset threshold Tρ, the news item is used as a data source and analysis basis for public sentiment analysis, as follows:
let K1 be the number of views, K2 the number of comments, K3 the number of replies, K4 the number of "support" clicks, K5 the number of "oppose" clicks, K6 the number of forwards and K7 the number of favorites, and let ξ1 to ξ4 be preset, adjustable coefficients; then
$$\rho = \left(\lg (K_1)^{3/4} + 0.03\right)\xi_1 + \left(\lg\left((K_2)^{2/3} + (K_3)^{2/3}\right) + 0.02\right)\xi_2 + \left(\lg\left((K_4)^{1/2} + (K_5)^{1/2}\right) + 0.01\right)\xi_3 + \left(\lg\left((K_6)^{1/3} + (K_7)^{1/3}\right) + 0.005\right)\xi_4;$$
where ξ1 to ξ4 are set to: ξ1 = 0.5; ξ2 = 0.3; ξ3 = 0.2; ξ4 = 0.1.
7. The news public sentiment monitoring system according to claim 6, characterized in that:
the news public sentiment analysis module (3) is configured to analyze and discover news public sentiment hotspots, comprising the following steps:
First, multiple hot news discovery submodules are used to obtain news public sentiment hotspots by parallel distributed computing, the hot news discovery submodules including:
1) a Single-Pass hot news discovery submodule (3.1), which uses a MapReduce-based Single-Pass algorithm;
2) a KNN hot news discovery submodule (3.2), which uses a MapReduce-based K-nearest-neighbor (KNN) classification algorithm;
3) an SVM hot news discovery submodule (3.3), which uses a MapReduce-based support vector machine (SVM) algorithm;
4) a K-means hot news discovery submodule (3.4), which uses a MapReduce-based K-means clustering algorithm; and
5) an SOM hot news discovery submodule (3.5), which uses a MapReduce-based self-organizing map (SOM) neural network clustering algorithm;
Second, all news public sentiment hotspots obtained by the individual hot news discovery submodules are pooled and graded as follows:
if a news public sentiment hotspot was obtained by three or more of the above hot news discovery submodules, it is labeled a senior news public sentiment hotspot;
if a news public sentiment hotspot was obtained by two of the above hot news discovery submodules, it is labeled an intermediate news public sentiment hotspot;
if a news public sentiment hotspot was obtained by only one of the above hot news discovery submodules, it is labeled a primary news public sentiment hotspot;
Finally, the senior, intermediate and primary news public sentiment hotspots are sent, in that order, to the news public sentiment result display module (4).
8. The news public sentiment monitoring system according to claim 7, characterized in that:
the news public sentiment result display module (4) is based on the J2EE framework and can generate: a news public sentiment popularity ranking report, a news public sentiment early-warning information distribution report, a news public sentiment geographic information distribution report, a news public sentiment sentiment analysis report, a news public sentiment situation statistics report and a news public sentiment trend analysis chart.
CN201510009993.3A 2015-01-09 2015-01-09 News public sentiment monitoring system Expired - Fee Related CN104504150B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510009993.3A CN104504150B (en) 2015-01-09 2015-01-09 News public sentiment monitoring system

Publications (2)

Publication Number Publication Date
CN104504150A CN104504150A (en) 2015-04-08
CN104504150B true CN104504150B (en) 2017-09-29

Family

ID=52945547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510009993.3A Expired - Fee Related CN104504150B (en) 2015-01-09 2015-01-09 News public sentiment monitoring system

Country Status (1)

Country Link
CN (1) CN104504150B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708096A (en) * 2012-05-29 2012-10-03 代松 Network intelligence public sentiment monitoring system based on semantics and work method thereof
CN103150335A (en) * 2013-01-25 2013-06-12 河南理工大学 Co-clustering-based coal mine public sentiment monitoring system
CN103544255A (en) * 2013-10-15 2014-01-29 常州大学 Text semantic relativity based network public opinion information analysis method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BBS准实时舆情监测技术研究与实现;栾文娟;《中国优秀硕士学位论文全文数据库信息科技辑》;20130715(第7期);第I139-181页 *
中国移动舆情监测系统的设计与实现;王安宇;《中国优秀硕士学位论文全文数据库信息科技辑》;20130215(第2期);第I139-171页 *
公共卫生网络舆情监测系统设计及实现;郭岩等;《医学信息学》;20111231;第32卷(第8期);第6-9页 *

Also Published As

Publication number Publication date
CN104504150A (en) 2015-04-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210729

Address after: 519031 room 105-68672, No. 6, Baohua Road, Hengqin new area, Zhuhai City, Guangdong Province (centralized office area)

Patentee after: Zhuhai Hengqin Fandou Information Technology Co.,Ltd.

Address before: 610000 No. 1, No. 3 Shen Xian Nan Road, Chengdu high tech Zone, Sichuan, China.

Patentee before: CHENGDU BLTSAFE INFORMATION TECHNOLOGY Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170929

Termination date: 20220109