CN106257458A - Public opinion information classification and assessment system - Google Patents
Public opinion information classification and assessment system
- Publication number
- CN106257458A
- Authority
- CN
- China
- Prior art keywords
- information
- module
- subject
- assessment report
- pageview
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a public opinion information classification and assessment system comprising a subject information retrieval module, a keyword extraction module, a first clustering module, a semantic analysis module, a pageview statistics module, and an assessment report output module. Subject-related information is first clustered by keyword into multiple information groups, and the information groups are then clustered by the semantic similarity of their labeling keywords into multiple major information classes. Processing scattered subject-related information is thereby converted into processing information groups and major information classes, which raises the degree of aggregation of the objects being processed, avoids the tedium of treating scattered items as the unit of processing, reduces the workload, and improves information processing efficiency.
Description
Technical field
The present invention relates to the technical field of public opinion analysis, and in particular to a public opinion information classification and assessment system.
Background
Public opinion monitoring combines Internet information collection technology with intelligent information processing technology. By automatically crawling massive amounts of Internet information, automatically classifying and clustering it, detecting topics, and focusing on special subjects, it meets users' information needs such as network public opinion monitoring and news topic tracking, and produces analysis results such as briefings, reports, and charts. These results give clients an analytical basis for fully grasping trends in public thinking and for guiding public opinion correctly.
"Network public opinion monitoring system" is the general term for computer systems that monitor the wishes expressed on the network, which reflect the social and political attitudes that the public forms and holds toward social administrators around the occurrence, development, and change of intermediary social events within a given social space.
"Network public opinion" is the sum of the beliefs, attitudes, opinions, emotions, and similar expressions that a large number of people voice about various phenomena and problems in society. Network public opinion forms quickly and has a large social impact. While supervision of Internet information is strengthened, organizing resources to gather, sort, and analyze information is of great value for responding promptly to sudden public incidents on the network and for fully understanding social conditions and public sentiment.
Summary of the invention
To address the technical problems described in the background, the present invention proposes a public opinion information classification and assessment system.
The public opinion information classification and assessment system proposed by the present invention comprises:
a subject information retrieval module, configured to retrieve network information according to a subject, obtain subject-related information, and tally the source website and pageview count of each piece of subject-related information;
a keyword extraction module, connected to the subject information retrieval module to obtain the subject-related information, and configured to extract a keyword from each piece of subject-related information;
a first clustering module, connected to the subject information retrieval module and the keyword extraction module, which clusters subject-related information having the same keyword to obtain multiple information groups, each information group being labeled with its keyword;
a semantic analysis module, connected to the first clustering module, which performs semantic analysis on the keyword of each information group, clusters the information groups whose keyword semantic similarity exceeds a preset similarity threshold to obtain multiple major information classes, and extracts the semantically identical part of the information group keywords in each class as the name of that major information class;
a pageview statistics module, connected to the subject information retrieval module and the semantic analysis module, which calculates the pageview total of the subject-related information contained in each information group and the pageview total of the information groups contained in each major information class, and sorts, by pageview count, the major information classes, the information groups within each major information class, and the subject-related information within each information group;
an assessment report output module, connected to the pageview statistics module and the subject information retrieval module and provided with a first threshold and a second threshold. The assessment report output module selects the major information classes ranked ahead of the first threshold and, within each selected major information class, the information groups ranked ahead of the second threshold. It then compiles the names of the selected major information classes, the labeling keyword of each selected information group, and the subject-related information with the highest pageview count in each information group into an assessment report for output, and records in the assessment report the pageview total of each major information class, the pageview total of each information group, and the pageview count and source website address of the subject-related information.
Preferably, the system further includes a website supplementary module with a preset high-credibility website database that stores multiple website addresses. The website supplementary module is connected to the subject information retrieval module, the semantic analysis module, and the assessment report output module. It takes the subject-related information whose source website is in the high-credibility website database as verification targets, determines whether the assessment report includes the information groups containing all verification targets, and supplements the assessment report according to the result.
Preferably, if the information group containing a verification target is not included in the assessment report, the omitted information group is taken as a supplementary object. If the major information class containing the supplementary object is present in the generated assessment report, the supplementary object is added under the corresponding major information class in the assessment report. If that major information class is not present in the generated assessment report, both the supplementary object and its major information class are added to the assessment report.
Preferably, content added to the assessment report on the basis of the high-credibility website database is highlighted.
Preferably, the subject information retrieval module includes an input unit and a web crawler. The input unit is used to input the subject; the web crawler is connected to the input unit and performs network retrieval according to the subject to obtain the subject-related information.
Preferably, the similarity threshold preset in the semantic analysis module can be edited manually.
In the present invention, the subject information retrieval module performs subject retrieval over large-scale network data according to the subject entered by staff, which helps ensure that retrieval is comprehensive and avoids omissions in public opinion monitoring. The source website and pageview count of each piece of subject-related information are also tallied so that the retrieval results can be called up and checked later.
In the present invention, subject-related information is clustered by keyword into multiple information groups, and the information groups are then clustered by the semantic similarity of their labeling keywords into multiple major information classes. Processing scattered subject-related information is thereby converted into processing information groups and major information classes, which raises the degree of aggregation of the objects being processed, avoids the tedium of treating scattered subject-related information as the unit of processing, reduces the workload, and improves information processing efficiency.
In the present invention, the assessment report output module is provided with a first threshold and a second threshold so that major information classes and information groups can be screened by pageview count. This trims the content recorded in the assessment report, keeping it concise and easy for staff to consult. Because the content recorded in the assessment report is the information with higher pageview counts, the report faithfully expresses the tendency of public opinion. In addition, because the second threshold is applied to the information groups within every selected major information class, the report's expression of public opinion tendencies is more complete and comprehensive.
Brief description of the drawings
Fig. 1 is a block diagram of the public opinion information classification and assessment system proposed by the present invention.
Detailed description
Referring to Fig. 1, the public opinion information classification and assessment system proposed by the present invention includes a subject information retrieval module, a keyword extraction module, a first clustering module, a semantic analysis module, a pageview statistics module, an assessment report output module, and a website supplementary module.
The subject information retrieval module retrieves network information according to a subject, obtains subject-related information, and tallies the source website and pageview count of each piece of subject-related information. Specifically, the subject information retrieval module includes an input unit and a web crawler. The input unit is used to input the subject; the web crawler is connected to the input unit and performs network retrieval according to the subject to obtain the subject-related information.
In this embodiment, staff provide the subject through the input unit, and the web crawler then performs subject retrieval over large-scale network data. This helps ensure that retrieval is comprehensive and avoids omissions in public opinion monitoring. The source website and pageview count of each piece of subject-related information are also tallied so that the retrieval results can be called up and checked later.
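The tallying step described above can be sketched as follows. This is a minimal illustration only: the record layout (`SubjectItem`), the function name `tally_by_source`, and the sample sites are assumptions made for the example and do not come from the patent, which does not specify how retrieved items are stored.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectItem:
    """One retrieved piece of subject-related information (hypothetical layout)."""
    title: str
    source_site: str  # address of the source website
    pageviews: int    # pageview count reported for the item


def tally_by_source(items):
    """Return (item count per source website, pageview total per source website)."""
    item_counts = Counter()
    pageview_totals = Counter()
    for item in items:
        item_counts[item.source_site] += 1
        pageview_totals[item.source_site] += item.pageviews
    return item_counts, pageview_totals


# Illustrative data; a real system would receive these records from the crawler.
items = [
    SubjectItem("Bridge closure announced", "news.example.com", 1200),
    SubjectItem("Residents react to bridge closure", "forum.example.org", 300),
    SubjectItem("Bridge repair schedule", "news.example.com", 800),
]
item_counts, pageview_totals = tally_by_source(items)
```

Keeping the source website and pageview count on each record is what lets the later modules sum pageviews per group and print source addresses in the report.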
The keyword extraction module is connected to the subject information retrieval module to obtain the subject-related information, and extracts a keyword from each piece of subject-related information. Extracting a keyword amounts to removing redundancy from each piece of subject-related information and distilling its main idea, so that the subject-related information is expressed more concisely and clearly.
The first clustering module is connected to the subject information retrieval module and the keyword extraction module. It clusters subject-related information having the same keyword to obtain multiple information groups, and each information group is labeled with its keyword. Clustering by keyword converts scattered subject-related information into information groups with a certain degree of aggregation, which avoids the tedium of treating scattered subject-related information as the unit of processing, reduces the workload, and improves information processing efficiency. Labeling each information group with its keyword makes the groups easy to tell apart and summarizes the subject-related information each group contains.
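Because the first clustering step groups items whose keywords are identical, it can be sketched as a simple dictionary grouping. The function name and the sample items below are hypothetical.

```python
from collections import defaultdict


def cluster_by_keyword(items):
    """Group (keyword, item) pairs into information groups labeled by keyword."""
    groups = defaultdict(list)
    for keyword, item in items:
        groups[keyword].append(item)
    return dict(groups)


# Each pair is (extracted keyword, piece of subject-related information).
items = [
    ("flood", "Flood warning issued"),
    ("flooding", "Flooding closes main road"),
    ("flood", "Flood barriers raised"),
    ("traffic", "Traffic delays downtown"),
]
groups = cluster_by_keyword(items)
```

Note that "flood" and "flooding" land in different groups at this stage; joining such near-synonyms is left to the semantic analysis module.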
The semantic analysis module is connected to the first clustering module. It performs semantic analysis on the keyword of each information group, clusters the information groups whose keyword semantic similarity exceeds a preset similarity threshold to obtain multiple major information classes, and extracts the semantically identical part of the information group keywords in each class as the name of that major information class. Consolidating information groups into major information classes further raises the degree of aggregation of the objects being processed, reduces the workload, and improves information processing efficiency.
In this embodiment, the similarity threshold preset in the semantic analysis module can be edited manually, so that staff can adjust the similarity threshold as needed. This improves the flexibility and the range of application of the semantic analysis module.
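The threshold-based second clustering step can be sketched as below. The patent does not name a similarity measure, so this example substitutes a purely lexical one (Jaccard similarity of character bigrams) for true semantic similarity, and uses the shared leading part of the keywords as the class name. Both choices, and all function names, are assumptions made so the sketch stays self-contained; a real system would plug in a genuine semantic measure.

```python
from os.path import commonprefix


def bigrams(word):
    """Set of adjacent character pairs in a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def similarity(a, b):
    """Jaccard similarity of character bigrams (stand-in for semantic similarity)."""
    ga, gb = bigrams(a), bigrams(b)
    return len(ga & gb) / len(ga | gb) if ga | gb else 0.0


def cluster_keywords(keywords, threshold):
    """Single-link clustering: join keywords whose similarity exceeds threshold."""
    classes = []
    for kw in keywords:
        matches = [c for c in classes
                   if any(similarity(kw, other) > threshold for other in c)]
        merged = [kw]
        for c in matches:
            merged = c + merged  # fold every matching class into one
            classes.remove(c)
        classes.append(merged)
    return classes


def class_name(members):
    """Shared leading part of the member keywords, or the first keyword if none."""
    return commonprefix(members) or members[0]


keywords = ["flood", "flooding", "floods", "traffic"]
major_classes = cluster_keywords(keywords, threshold=0.4)
names = [class_name(c) for c in major_classes]
```

Raising the threshold splits classes apart and lowering it merges them, which is why the text makes the threshold manually editable.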
The pageview statistics module is connected to the subject information retrieval module and the semantic analysis module. It calculates the pageview total of the subject-related information contained in each information group and the pageview total of the information groups contained in each major information class, and sorts, by pageview count, the major information classes, the information groups within each major information class, and the subject-related information within each information group. The public opinion tendency expressed by each major information class and each information group can then be read directly from the pageview counts.
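The totals and the three levels of sorting can be sketched as follows. The nested dictionary layout and the function name `rank_by_pageviews` are hypothetical; the patent only states what is summed and sorted, not how the data is held.

```python
def rank_by_pageviews(major_classes):
    """Sort classes, groups, and items by descending pageviews.

    major_classes maps class name -> {group keyword -> [(title, pageviews), ...]}.
    Returns a list of (class_name, class_total, groups), where groups is a list
    of (keyword, group_total, items).
    """
    ranked = []
    for class_name, groups in major_classes.items():
        ranked_groups = []
        for keyword, items in groups.items():
            sorted_items = sorted(items, key=lambda it: it[1], reverse=True)
            group_total = sum(pv for _, pv in items)
            ranked_groups.append((keyword, group_total, sorted_items))
        ranked_groups.sort(key=lambda g: g[1], reverse=True)
        class_total = sum(g[1] for g in ranked_groups)
        ranked.append((class_name, class_total, ranked_groups))
    ranked.sort(key=lambda c: c[1], reverse=True)
    return ranked


data = {
    "traffic": {"traffic": [("Traffic delays downtown", 900)]},
    "flood": {
        "flooding": [("Flooding closes main road", 400)],
        "flood": [("Flood barriers raised", 500), ("Flood warning issued", 1500)],
    },
}
ranked = rank_by_pageviews(data)
```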
The assessment report output module is connected to the pageview statistics module and the subject information retrieval module, and is provided with a first threshold and a second threshold. The assessment report output module selects the major information classes ranked ahead of the first threshold and, within each selected major information class, the information groups ranked ahead of the second threshold. It then compiles the names of the selected major information classes, the labeling keyword of each selected information group, and the subject-related information with the highest pageview count in each information group into an assessment report for output, and records in the assessment report the pageview total of each major information class, the pageview total of each information group, and the pageview count and source website address of the subject-related information.
In this embodiment, the first threshold and the second threshold are set so that major information classes and information groups are screened by pageview count. This trims the content recorded in the assessment report, keeping it concise and easy for staff to consult. Because the content recorded in the assessment report in this embodiment is the information with higher pageview counts, the report faithfully expresses the tendency of public opinion. In addition, because the second threshold is applied to the information groups within every selected major information class, the report's expression of public opinion tendencies is more complete and comprehensive.
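One way to read the two thresholds is as rank cutoffs: keep the first `first_threshold` major information classes and, within each, the first `second_threshold` information groups. The sketch below follows that reading, which is an interpretation rather than something the text states outright. The report layout, field names, and URLs are likewise invented for illustration.

```python
def build_report(ranked, first_threshold, second_threshold):
    """Keep the top classes and, in each, the top groups; list each group's top item.

    ranked is a list of (class_name, class_total, groups) sorted by descending
    pageviews; groups is a list of (keyword, group_total, items), and items is a
    list of (title, pageviews, source_url), each sorted the same way.
    """
    report = []
    for class_name, class_total, groups in ranked[:first_threshold]:
        entries = []
        for keyword, group_total, items in groups[:second_threshold]:
            title, pageviews, source_url = items[0]  # most-viewed item in the group
            entries.append({
                "keyword": keyword, "group_total": group_total,
                "top_item": title, "pageviews": pageviews, "source": source_url,
            })
        report.append({"class": class_name, "class_total": class_total,
                       "groups": entries})
    return report


ranked = [
    ("flood", 2400, [
        ("flood", 2000, [("Flood warning issued", 1500, "https://news.example.com/1"),
                         ("Flood barriers raised", 500, "https://news.example.com/2")]),
        ("flooding", 400, [("Flooding closes main road", 400,
                            "https://forum.example.org/7")]),
    ]),
    ("traffic", 900, [
        ("traffic", 900, [("Traffic delays downtown", 900,
                           "https://news.example.com/3")]),
    ]),
]
report = build_report(ranked, first_threshold=1, second_threshold=1)
```

With both thresholds set to 1, only the most-viewed class and its most-viewed group survive, which shows how the thresholds trade completeness for brevity.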
The website supplementary module has a preset high-credibility website database that stores multiple website addresses, specifically the addresses of websites whose published information is highly reliable and which are widely known. The website supplementary module is connected to the subject information retrieval module, the semantic analysis module, and the assessment report output module.
The website supplementary module takes the subject-related information whose source website is in the high-credibility website database as verification targets, determines whether the assessment report includes the information groups containing all verification targets, and supplements the assessment report according to the result. Specifically, if the information group containing a verification target is not included in the assessment report, the omitted information group is taken as a supplementary object. If the major information class containing the supplementary object is present in the generated assessment report, the supplementary object is added under the corresponding major information class in the assessment report. If that major information class is not present in the generated assessment report, both the supplementary object and its major information class are added to the assessment report. Content added to the assessment report on the basis of the high-credibility website database is highlighted.
In effect, the source websites in the high-credibility website database are used to check and supplement the assessment report, which makes the assessment report more credible.
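The supplement rules above can be sketched as a single pass over the retrieved items. The report layout, the `highlighted` flag, the function name, and the sample sites are assumptions for illustration; the patent describes the behavior, not a data format.

```python
def supplement_report(report, items, trusted_sites):
    """Add groups containing items from trusted sites that the report omitted.

    report maps class name -> {group keyword -> {"highlighted": bool}} and is
    modified in place. items is a list of (source_site, group_keyword, class_name).
    """
    for source_site, keyword, class_name in items:
        if source_site not in trusted_sites:
            continue  # only items from trusted sources are verification targets
        groups = report.setdefault(class_name, {})  # adds the class if missing
        if keyword not in groups:
            groups[keyword] = {"highlighted": True}  # mark supplemented content
    return report


report = {"flood": {"flood": {"highlighted": False}}}
items = [
    ("trusted.example.org", "flooding", "flood"),  # trusted, class already reported
    ("trusted.example.org", "outage", "power"),    # trusted, class not yet reported
    ("blog.example.net", "rumor", "gossip"),       # not trusted, ignored
    ("trusted.example.org", "flood", "flood"),     # trusted, group already reported
]
supplement_report(report, items, trusted_sites={"trusted.example.org"})
```

Groups that were already in the report keep their original flag, so only the supplemented content ends up highlighted.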
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.
Claims (6)
1. A public opinion information classification and assessment system, characterized by comprising:
a subject information retrieval module, configured to retrieve network information according to a subject, obtain subject-related information, and tally the source website and pageview count of each piece of subject-related information;
a keyword extraction module, connected to the subject information retrieval module to obtain the subject-related information, and configured to extract a keyword from each piece of subject-related information;
a first clustering module, connected to the subject information retrieval module and the keyword extraction module, which clusters subject-related information having the same keyword to obtain multiple information groups, each information group being labeled with its keyword;
a semantic analysis module, connected to the first clustering module, which performs semantic analysis on the keyword of each information group, clusters the information groups whose keyword semantic similarity exceeds a preset similarity threshold to obtain multiple major information classes, and extracts the semantically identical part of the information group keywords in each class as the name of that major information class;
a pageview statistics module, connected to the subject information retrieval module and the semantic analysis module, which calculates the pageview total of the subject-related information contained in each information group and the pageview total of the information groups contained in each major information class, and sorts, by pageview count, the major information classes, the information groups within each major information class, and the subject-related information within each information group;
an assessment report output module, connected to the pageview statistics module and the subject information retrieval module and provided with a first threshold and a second threshold; the assessment report output module selects the major information classes ranked ahead of the first threshold and, within each selected major information class, the information groups ranked ahead of the second threshold, then compiles the names of the selected major information classes, the labeling keyword of each selected information group, and the subject-related information with the highest pageview count in each information group into an assessment report for output, and records in the assessment report the pageview total of each major information class, the pageview total of each information group, and the pageview count and source website address of the subject-related information.
2. The public opinion information classification and assessment system as claimed in claim 1, characterized by further comprising a website supplementary module with a preset high-credibility website database that stores multiple website addresses; the website supplementary module is connected to the subject information retrieval module, the semantic analysis module, and the assessment report output module; the website supplementary module takes the subject-related information whose source website is in the high-credibility website database as verification targets, determines whether the assessment report includes the information groups containing all verification targets, and supplements the assessment report according to the result.
3. The public opinion information classification and assessment system as claimed in claim 2, characterized in that, if the information group containing a verification target is not included in the assessment report, the omitted information group is taken as a supplementary object; if the major information class containing the supplementary object is present in the generated assessment report, the supplementary object is added under the corresponding major information class in the assessment report; if that major information class is not present in the generated assessment report, both the supplementary object and its major information class are added to the assessment report.
4. The public opinion information classification and assessment system as claimed in claim 3, characterized in that content added to the assessment report on the basis of the high-credibility website database is highlighted.
5. The public opinion information classification and assessment system as claimed in claim 1, characterized in that the subject information retrieval module includes an input unit and a web crawler; the input unit is used to input the subject, and the web crawler is connected to the input unit and performs network retrieval according to the subject to obtain the subject-related information.
6. The public opinion information classification and assessment system as claimed in claim 1, characterized in that the similarity threshold preset in the semantic analysis module can be edited manually.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610562054.6A CN106257458A (en) | 2016-07-15 | 2016-07-15 | A kind of public feelings information sorts out assessment system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610562054.6A CN106257458A (en) | 2016-07-15 | 2016-07-15 | A kind of public feelings information sorts out assessment system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106257458A true CN106257458A (en) | 2016-12-28 |
Family
ID=57713752
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610562054.6A Pending CN106257458A (en) | 2016-07-15 | 2016-07-15 | A kind of public feelings information sorts out assessment system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106257458A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109451147A (en) * | 2018-10-15 | 2019-03-08 | 麒麟合盛网络技术股份有限公司 | A kind of information displaying method and device |
CN109657116A (en) * | 2018-11-12 | 2019-04-19 | 平安科技(深圳)有限公司 | A kind of public sentiment searching method, searcher, storage medium and terminal device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101408883A (en) * | 2008-11-24 | 2009-04-15 | 电子科技大学 | Method for collecting network public feelings viewpoint |
CN101751458A (en) * | 2009-12-31 | 2010-06-23 | 暨南大学 | Network public sentiment monitoring system and method |
CN102708096A (en) * | 2012-05-29 | 2012-10-03 | 代松 | Network intelligence public sentiment monitoring system based on semantics and work method thereof |
CN103116651A (en) * | 2013-03-05 | 2013-05-22 | 南京理工大学常熟研究院有限公司 | Public sentiment hot topic dynamic detection method |
CN104598450A (en) * | 2013-10-30 | 2015-05-06 | 北大方正集团有限公司 | Popularity analysis method and system of network public opinion event |
-
2016
- 2016-07-15 CN CN201610562054.6A patent/CN106257458A/en active Pending
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109451147A (en) * | 2018-10-15 | 2019-03-08 | 麒麟合盛网络技术股份有限公司 | A kind of information displaying method and device |
CN109451147B (en) * | 2018-10-15 | 2020-11-24 | 麒麟合盛网络技术股份有限公司 | Information display method and device |
CN109657116A (en) * | 2018-11-12 | 2019-04-19 | 平安科技(深圳)有限公司 | A kind of public sentiment searching method, searcher, storage medium and terminal device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104504150B (en) | News public sentiment monitoring system | |
CN104537097B (en) | Microblogging public sentiment monitoring system | |
Atkinson et al. | Near real time information mining in multilingual news | |
CN108776671A (en) | A kind of network public sentiment monitoring system and method | |
CN105718587A (en) | Network content resource evaluation method and evaluation system | |
CN104504151B (en) | WeChat public sentiment monitoring system | |
US10565253B2 (en) | Model generation method, word weighting method, device, apparatus, and computer storage medium | |
CN105975478A (en) | Word vector analysis-based online article belonging event detection method and device | |
CN103577558A (en) | Device and method for optimizing search ranking of frequently asked question and answer pairs | |
CN104657393A (en) | Public opinion analysis method and corresponding device | |
CN108416034B (en) | Information acquisition system based on financial heterogeneous big data and control method thereof | |
CN104899335A (en) | Method for performing sentiment classification on network public sentiment of information | |
CN104615627A (en) | Event public sentiment information extracting method and system based on micro-blog platform | |
CN106169050B (en) | A kind of PoC Program extraction method based on webpage Knowledge Discovery | |
CN107145568A (en) | A kind of quick media event clustering system and method | |
CN106446051A (en) | Deep search method of Eagle media assets | |
CN106257458A (en) | A kind of public feelings information sorts out assessment system | |
CN109933709B (en) | Public opinion tracking method and device for video text combined data and computer equipment | |
CN103838739A (en) | Method and system for detecting error correction words in search engine | |
CN106257457B (en) | A kind of public sentiment compiles method | |
CN112395513A (en) | Public opinion transmission power analysis method | |
CN110110188A (en) | A kind of network public-opinion monitoring system based on cloud computing technology | |
CN104572767A (en) | Method and system for language classification of sites | |
CN109241085B (en) | Big data SQL query method for SolrCloud | |
CN113012009A (en) | Intelligent policy information acquisition and analysis system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20161228 |