CN106257458A - Public opinion information classification and assessment system - Google Patents

Public opinion information classification and assessment system

Info

Publication number
CN106257458A
CN106257458A CN201610562054.6A CN201610562054A CN106257458A CN 106257458 A CN106257458 A CN 106257458A CN 201610562054 A CN201610562054 A CN 201610562054A CN 106257458 A CN106257458 A CN 106257458A
Authority
CN
China
Prior art keywords
information
module
subject
assessment report
pageview
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610562054.6A
Other languages
Chinese (zh)
Inventor
党连坤
石晔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HEFEI COMPASS ELECTRONIC TECHNOLOGY Co Ltd
Original Assignee
HEFEI COMPASS ELECTRONIC TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-07-15
Filing date
2016-07-15
Publication date
2016-12-28
Application filed by HEFEI COMPASS ELECTRONIC TECHNOLOGY Co Ltd
Priority to CN201610562054.6A
Publication of CN106257458A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a public opinion information classification and assessment system comprising a topic information retrieval module, a keyword extraction module, a first clustering module, a semantic analysis module, a view count statistics module and an assessment report output module. In the invention, topic-related information is clustered by keyword to obtain multiple information groups, and the information groups are then clustered by the semantic similarity of their labelling keywords to obtain multiple major information classes. Processing of scattered topic-related information is thereby converted into processing of information groups and major information classes, which raises the degree of aggregation of the objects being processed, avoids the tedium of handling scattered topic-related information item by item, reduces workload and improves information processing efficiency.

Description

Public opinion information classification and assessment system
Technical field
The present invention relates to the technical field of public opinion analysis, and in particular to a public opinion information classification and assessment system.
Background art
Public opinion monitoring integrates Internet information acquisition technology with intelligent information processing technology. By automatically crawling massive amounts of Internet information, automatically classifying and clustering it, detecting topics and focusing on special subjects, it meets users' information needs such as network public opinion monitoring and news topic tracking, and produces analysis results such as briefings, reports and charts. These results give clients an analytical basis for fully grasping public sentiment and guiding public opinion correctly.
A "network public opinion monitoring system" is a general term for a computer monitoring system directed at the collection of social and political attitudes, and the wishes expressed online, that the public forms and holds toward social administrators within a certain social space as intermediary social events occur, develop and change.
"Network public opinion" is the sum of the beliefs, attitudes, opinions and emotions that a large number of people express about various phenomena and problems in society. Network public opinion forms quickly and has a great influence on society. While supervision of Internet information is being strengthened, organizing resources to collect, sort and analyze that information is therefore of great significance for responding promptly to sudden online public incidents and for fully understanding social conditions and public sentiment.
Summary of the invention
In view of the technical problems described in the background art, the present invention proposes a public opinion information classification and assessment system.
The public opinion information classification and assessment system proposed by the present invention comprises:
a topic information retrieval module, configured to retrieve information from the network according to a topic, obtain topic-related information, and compile statistics on the source website and view count of each piece of topic-related information;
a keyword extraction module, connected to the topic information retrieval module to obtain the topic-related information, and configured to extract a keyword from each piece of topic-related information;
a first clustering module, connected to the topic information retrieval module and the keyword extraction module respectively, and configured to cluster topic-related information having the same keyword to obtain multiple information groups, each information group being labelled with its keyword;
a semantic analysis module, connected to the first clustering module, and configured to perform semantic analysis on the keyword of each information group, cluster information groups whose keyword semantic similarity exceeds a preset similarity threshold to obtain multiple major information classes, and extract the semantically identical part of the keywords of the information groups as the name of the major information class;
a view count statistics module, connected to the topic information retrieval module and the semantic analysis module respectively, and configured to calculate the total view count of the topic-related information contained in each information group and the total view count of the information groups contained in each major information class, and to rank, by view count, the major information classes, the information groups within each major information class, and the topic-related information within each information group;
an assessment report output module, connected to the view count statistics module and the topic information retrieval module, in which a first threshold and a second threshold are set; the assessment report output module selects the major information classes ranked ahead of the first threshold and, within each major information class, the information groups ranked ahead of the second threshold, then compiles the names of the selected major information classes, the labelling keyword of each information group and the topic-related information with the highest view count in each information group into an assessment report for output, and records in the assessment report the total view count of each major information class, the total view count of each information group, and the view count and source website address of the topic-related information.
Preferably, the system further comprises a website supplement module in which a high-credibility website database is preset, the high-credibility website database storing multiple website addresses. The website supplement module is connected to the topic information retrieval module, the semantic analysis module and the assessment report output module respectively. The website supplement module obtains, as verification targets, the topic-related information whose source websites are present in the high-credibility website database, determines whether the assessment report includes the information groups containing all verification targets, and supplements the assessment report according to the result.
Preferably, if an information group containing a verification target is not included in the assessment report, the omitted information group is taken as a supplement object. If the major information class containing the supplement object is present in the generated assessment report, the supplement object is added under the corresponding major information class in the assessment report; if the major information class containing the supplement object is not present in the generated assessment report, both the supplement object and its major information class are added to the assessment report.
Preferably, content added to the assessment report on the basis of the high-credibility website database is highlighted.
Preferably, the topic information retrieval module comprises an input unit and a web crawler; the input unit is used to input the topic, and the web crawler is connected to the input unit and performs network retrieval according to the topic to obtain the topic-related information.
Preferably, the similarity threshold preset in the semantic analysis module can be edited manually.
In the present invention, the topic information retrieval module performs topic retrieval over network big data according to the topic entered by staff, which helps ensure comprehensive retrieval and avoids omissions in public opinion monitoring. It also compiles statistics on the source website and view count of each piece of topic-related information, so that the retrieval results can be called up and checked later.
In the present invention, topic-related information is clustered by keyword to obtain multiple information groups, and the information groups are then clustered by the semantic similarity of their labelling keywords to obtain multiple major information classes. Processing of scattered topic-related information is thereby converted into processing of information groups and major information classes, which raises the degree of aggregation of the objects being processed, avoids the tedium of treating scattered topic-related information as the processing object, reduces workload and improves information processing efficiency.
In the present invention, the assessment report output module is provided with a first threshold and a second threshold so that major information classes and information groups can be screened by view count. This trims the content recorded in the assessment report, keeping the report concise and easy for staff to consult. The content recorded in the assessment report is the information with higher view counts, which ensures that the report truthfully reflects public opinion trends. In addition, the second threshold governs the information groups listed under each major information class, so that the report expresses public opinion trends more completely and comprehensively.
Brief description of the drawings
Fig. 1 is a block diagram of the public opinion information classification and assessment system proposed by the present invention.
Detailed description of the embodiments
Referring to Fig. 1, the public opinion information classification and assessment system proposed by the present invention comprises a topic information retrieval module, a keyword extraction module, a first clustering module, a semantic analysis module, a view count statistics module, an assessment report output module and a website supplement module.
The topic information retrieval module retrieves information from the network according to a topic, obtains topic-related information, and compiles statistics on the source website and view count of each piece of topic-related information. Specifically, the topic information retrieval module comprises an input unit and a web crawler; the input unit is used to input the topic, and the web crawler is connected to the input unit and performs network retrieval according to the topic to obtain the topic-related information.
In this embodiment, staff supply the topic through the input unit, and the web crawler then performs topic retrieval over network big data, which helps ensure comprehensive retrieval and avoids omissions in public opinion monitoring. Statistics are also compiled on the source website and view count of each piece of topic-related information, so that the retrieval results can be called up and checked later.
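The per-source statistics described above can be illustrated with a minimal Python sketch. The record shape `TopicItem`, the helper `tally_sources` and the example URLs are assumptions made for illustration; the patent does not specify how retrieved items or their statistics are stored.

```python
from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class TopicItem:
    """One piece of topic-related information (hypothetical record shape)."""
    title: str
    source_url: str
    views: int


def tally_sources(items):
    """Return {source host: (item count, total views)} for the retrieved items."""
    tally = defaultdict(lambda: [0, 0])
    for item in items:
        host = urlparse(item.source_url).netloc
        tally[host][0] += 1
        tally[host][1] += item.views
    return {host: tuple(pair) for host, pair in tally.items()}


# Invented sample data standing in for crawler output.
items = [
    TopicItem("Flood warning issued", "https://news.example.com/a1", 1200),
    TopicItem("Flood relief update", "https://news.example.com/a2", 300),
    TopicItem("Residents discuss flood", "https://forum.example.org/t/9", 450),
]
print(tally_sources(items))
```

Keeping the source address and view count on each record is what lets the later modules total view counts and cite source websites in the report.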
The keyword extraction module is connected to the topic information retrieval module to obtain the topic-related information, and extracts a keyword from each piece of topic-related information. Extracting the keyword amounts to removing redundancy from each piece of topic-related information and distilling its main idea, so that the topic-related information is expressed more concisely and clearly.
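The patent does not state how the keyword is extracted. As a rough illustration only, the sketch below picks the most frequent non-stopword token from English text; the function name `extract_keyword` and the tiny stopword list are assumptions, and a real system handling Chinese text would need word segmentation and a stronger extraction method.

```python
import re
from collections import Counter

# Small illustrative stopword list (assumption, not from the patent).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "for", "and", "to", "is", "are"}


def extract_keyword(text):
    """Return the most frequent non-stopword token, or "" if none remain."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    if not tokens:
        return ""
    # most_common(1) returns the highest count; ties go to the first token seen.
    return Counter(tokens).most_common(1)[0][0]


print(extract_keyword("The flood hit the city; flood relief is on the way"))
```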
The first clustering module is connected to the topic information retrieval module and the keyword extraction module respectively. It clusters topic-related information having the same keyword to obtain multiple information groups, and each information group is labelled with its keyword. Clustering by keyword converts scattered topic-related information into information groups with a certain degree of aggregation, which avoids the tedium of treating scattered topic-related information as the processing object, reduces workload and improves information processing efficiency. Labelling each information group with its keyword makes the groups easy to tell apart and gives a convenient summary of the topic-related information collected in each group.
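Because this first clustering step groups items whose keywords are identical, it can be sketched as a plain dictionary grouping. The function name `cluster_by_keyword` and the sample data are illustrative assumptions.

```python
from collections import defaultdict


def cluster_by_keyword(labelled_items):
    """Group (keyword, item) pairs into {keyword: [items]}.

    Each dictionary key plays the role of the keyword label of an information group.
    """
    groups = defaultdict(list)
    for keyword, item in labelled_items:
        groups[keyword].append(item)
    return dict(groups)


labelled = [
    ("flood relief", "Donations arrive in the county"),
    ("flood warning", "River level passes alert line"),
    ("flood relief", "Volunteers deliver supplies"),
]
print(cluster_by_keyword(labelled))
```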
The semantic analysis module is connected to the first clustering module. It performs semantic analysis on the keyword of each information group, clusters information groups whose keyword semantic similarity exceeds a preset similarity threshold to obtain multiple major information classes, and extracts the semantically identical part of the keywords of the information groups as the name of the major information class. Consolidating information groups into major information classes further raises the degree of aggregation of the objects being processed, reduces workload and improves information processing efficiency.
In this embodiment, the similarity threshold preset in the semantic analysis module can be edited manually, so that staff can adjust the similarity threshold as needed, which improves the flexibility and range of application of the semantic analysis module.
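The threshold-based grouping of keywords can be sketched as follows. The patent does not name a similarity measure, so this sketch substitutes plain string similarity from Python's `difflib` for true semantic similarity, uses a simple greedy clustering, and takes the longest shared substring as a stand-in for the "semantically identical part". The names `similarity`, `cluster_keywords` and `class_name` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for real semantic similarity."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_keywords(keywords, threshold):
    """Greedy clustering: a keyword joins the first cluster that has a member
    whose similarity to it exceeds the threshold, else it starts a new cluster."""
    clusters = []
    for keyword in keywords:
        for cluster in clusters:
            if any(similarity(keyword, member) > threshold for member in cluster):
                cluster.append(keyword)
                break
        else:
            clusters.append([keyword])
    return clusters


def class_name(cluster):
    """Longest substring shared by every keyword, with outer whitespace removed."""
    shortest = min(cluster, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            piece = shortest[start:start + length]
            if all(piece in keyword for keyword in cluster):
                return piece.strip()
    return ""


clusters = cluster_keywords(["flood relief", "flood warning", "stock market"], 0.4)
print([(class_name(c), c) for c in clusters])
```

Passing the threshold as an argument mirrors the manually editable threshold: raising it yields more, narrower major classes, and lowering it yields fewer, broader ones.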
The view count statistics module is connected to the topic information retrieval module and the semantic analysis module respectively. It calculates the total view count of the topic-related information contained in each information group and the total view count of the information groups contained in each major information class, and ranks, by view count, the major information classes, the information groups within each major information class, and the topic-related information within each information group. In this way, the public opinion trend expressed by each major information class and information group can be read directly from the view counts.
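The totals and the three levels of ranking can be sketched with nested sorting. The nested dictionary layout and the function name `rank_by_views` are assumptions chosen for illustration.

```python
def rank_by_views(classes):
    """Rank classes, groups and items by view count, highest first.

    classes: {class name: {group keyword: [(title, views), ...]}}
    Returns [(class name, class total, [(keyword, group total, sorted items)])].
    """
    ranked = []
    for name, groups in classes.items():
        ranked_groups = []
        for keyword, items in groups.items():
            sorted_items = sorted(items, key=lambda item: item[1], reverse=True)
            group_total = sum(views for _, views in items)
            ranked_groups.append((keyword, group_total, sorted_items))
        ranked_groups.sort(key=lambda group: group[1], reverse=True)
        class_total = sum(group[1] for group in ranked_groups)
        ranked.append((name, class_total, ranked_groups))
    ranked.sort(key=lambda entry: entry[1], reverse=True)
    return ranked


# Invented sample data.
classes = {
    "market": {"stock market": [("D", 50)]},
    "flood": {
        "flood relief": [("A", 100), ("B", 300)],
        "flood warning": [("C", 500)],
    },
}
print(rank_by_views(classes))
```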
The assessment report output module is connected to the view count statistics module and the topic information retrieval module. A first threshold and a second threshold are set in the assessment report output module. The assessment report output module selects the major information classes ranked ahead of the first threshold and, within each major information class, the information groups ranked ahead of the second threshold. It then compiles the names of the selected major information classes, the labelling keyword of each information group and the topic-related information with the highest view count in each information group into an assessment report for output, and records in the assessment report the total view count of each major information class, the total view count of each information group, and the view count and source website address of the topic-related information.
In this embodiment, the first threshold and the second threshold are set so that major information classes and information groups are screened by view count. This trims the content recorded in the assessment report, keeping the report concise and easy for staff to consult. The content recorded in the assessment report is the information with higher view counts, which ensures that the report truthfully reflects public opinion trends. In addition, the second threshold governs the information groups listed under each major information class, so that the report expresses public opinion trends more completely and comprehensively.
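The two-threshold screening can be sketched as below. It assumes that "ranked ahead of the threshold" means the top N positions in a list already sorted by view count, which is one plausible reading of the text. The function name `build_report`, the input layout and the report fields are hypothetical.

```python
def build_report(ranked, first_threshold, second_threshold):
    """Keep the top `first_threshold` classes and, in each, the top
    `second_threshold` groups, reporting each group's most-viewed item.

    ranked: [(class name, class total, [(keyword, group total, items)])],
    already sorted by view count, where each item is a dict with
    "title", "views" and "url" keys.
    """
    report = []
    for name, class_total, groups in ranked[:first_threshold]:
        entries = []
        for keyword, group_total, items in groups[:second_threshold]:
            top = max(items, key=lambda item: item["views"])
            entries.append({
                "keyword": keyword,
                "total_views": group_total,
                "top_title": top["title"],
                "top_views": top["views"],
                "source": top["url"],
            })
        report.append({"class": name, "total_views": class_total, "groups": entries})
    return report


# Invented sample data, already ranked by view count.
ranked = [
    ("flood", 900, [
        ("flood warning", 500, [{"title": "C", "views": 500, "url": "https://news.example.com/c"}]),
        ("flood relief", 400, [
            {"title": "B", "views": 300, "url": "https://news.example.com/b"},
            {"title": "A", "views": 100, "url": "https://forum.example.org/a"},
        ]),
    ]),
    ("market", 50, [
        ("stock market", 50, [{"title": "D", "views": 50, "url": "https://blog.example.com/d"}]),
    ]),
]
print(build_report(ranked, first_threshold=1, second_threshold=1))
```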
A high-credibility website database is preset in the website supplement module. The high-credibility website database stores multiple website addresses, specifically the addresses of websites whose published information is highly reliable and which are well known. The website supplement module is connected to the topic information retrieval module, the semantic analysis module and the assessment report output module respectively.
The website supplement module obtains, as verification targets, the topic-related information whose source websites are present in the high-credibility website database, determines whether the assessment report includes the information groups containing all verification targets, and supplements the assessment report according to the result. Specifically, if an information group containing a verification target is not included in the assessment report, the omitted information group is taken as a supplement object. If the major information class containing the supplement object is present in the generated assessment report, the supplement object is added under the corresponding major information class in the assessment report; if the major information class containing the supplement object is not present in the generated assessment report, both the supplement object and its major information class are added to the assessment report. Content added to the assessment report on the basis of the high-credibility website database is highlighted.
In this way, the assessment report is in effect checked and supplemented against the source websites in the high-credibility website database, which makes the assessment report more credible.
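The supplement logic can be sketched as a membership check followed by insertion. The report layout, the function name `supplement_report`, the `highlight` flag and the example hosts are assumptions made for illustration; the patent does not describe data structures.

```python
from urllib.parse import urlparse


def supplement_report(report, all_groups, trusted_hosts):
    """Add omitted groups that contain an item from a trusted source.

    report: {class name: {group keyword: {"highlight": bool}}}, modified in place.
    all_groups: [{"class": ..., "keyword": ..., "sources": [url, ...]}]
    trusted_hosts: set of host names standing in for the high-credibility database.
    """
    for group in all_groups:
        hosts = {urlparse(url).netloc for url in group["sources"]}
        if not hosts & trusted_hosts:
            continue  # no verification target in this group
        # setdefault covers both cases: the class already exists or must be added.
        class_entry = report.setdefault(group["class"], {})
        if group["keyword"] not in class_entry:
            class_entry[group["keyword"]] = {"highlight": True}
    return report


# Invented sample data.
report = {"flood": {"flood warning": {"highlight": False}}}
all_groups = [
    {"class": "flood", "keyword": "flood warning", "sources": ["https://gov.example.cn/1"]},
    {"class": "flood", "keyword": "flood relief", "sources": ["https://gov.example.cn/2"]},
    {"class": "market", "keyword": "stock market",
     "sources": ["https://gov.example.cn/3", "https://blog.example.com/x"]},
    {"class": "sports", "keyword": "match", "sources": ["https://blog.example.com/y"]},
]
print(supplement_report(report, all_groups, {"gov.example.cn"}))
```

Marking only newly added groups with `highlight` keeps supplemented content distinguishable from content selected by view count.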
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.

Claims (6)

1. A public opinion information classification and assessment system, characterized by comprising:
a topic information retrieval module, configured to retrieve information from the network according to a topic, obtain topic-related information, and compile statistics on the source website and view count of each piece of topic-related information;
a keyword extraction module, connected to the topic information retrieval module to obtain the topic-related information, and configured to extract a keyword from each piece of topic-related information;
a first clustering module, connected to the topic information retrieval module and the keyword extraction module respectively, and configured to cluster topic-related information having the same keyword to obtain multiple information groups, each information group being labelled with its keyword;
a semantic analysis module, connected to the first clustering module, and configured to perform semantic analysis on the keyword of each information group, cluster information groups whose keyword semantic similarity exceeds a preset similarity threshold to obtain multiple major information classes, and extract the semantically identical part of the keywords of the information groups as the name of the major information class;
a view count statistics module, connected to the topic information retrieval module and the semantic analysis module respectively, and configured to calculate the total view count of the topic-related information contained in each information group and the total view count of the information groups contained in each major information class, and to rank, by view count, the major information classes, the information groups within each major information class, and the topic-related information within each information group;
an assessment report output module, connected to the view count statistics module and the topic information retrieval module, in which a first threshold and a second threshold are set; the assessment report output module selects the major information classes ranked ahead of the first threshold and, within each major information class, the information groups ranked ahead of the second threshold, then compiles the names of the selected major information classes, the labelling keyword of each information group and the topic-related information with the highest view count in each information group into an assessment report for output, and records in the assessment report the total view count of each major information class, the total view count of each information group, and the view count and source website address of the topic-related information.
2. The public opinion information classification and assessment system as claimed in claim 1, characterized by further comprising a website supplement module in which a high-credibility website database is preset, the high-credibility website database storing multiple website addresses; the website supplement module is connected to the topic information retrieval module, the semantic analysis module and the assessment report output module respectively; the website supplement module obtains, as verification targets, the topic-related information whose source websites are present in the high-credibility website database, determines whether the assessment report includes the information groups containing all verification targets, and supplements the assessment report according to the result.
3. The public opinion information classification and assessment system as claimed in claim 2, characterized in that, if an information group containing a verification target is not included in the assessment report, the omitted information group is taken as a supplement object; if the major information class containing the supplement object is present in the generated assessment report, the supplement object is added under the corresponding major information class in the assessment report; if the major information class containing the supplement object is not present in the generated assessment report, both the supplement object and its major information class are added to the assessment report.
4. The public opinion information classification and assessment system as claimed in claim 3, characterized in that content added to the assessment report on the basis of the high-credibility website database is highlighted.
5. The public opinion information classification and assessment system as claimed in claim 1, characterized in that the topic information retrieval module comprises an input unit and a web crawler; the input unit is used to input the topic, and the web crawler is connected to the input unit and performs network retrieval according to the topic to obtain the topic-related information.
6. The public opinion information classification and assessment system as claimed in claim 1, characterized in that the similarity threshold preset in the semantic analysis module can be edited manually.
CN201610562054.6A 2016-07-15 2016-07-15 Public opinion information classification and assessment system Pending CN106257458A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201610562054.6A | 2016-07-15 | 2016-07-15 | Public opinion information classification and assessment system

Publications (1)

Publication Number | Publication Date
CN106257458A (en) | 2016-12-28

Family

ID=57713752

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201610562054.6A | Public opinion information classification and assessment system (Pending; CN106257458A (en)) | 2016-07-15 | 2016-07-15

Country Status (1)

Country Link
CN (1) CN106257458A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101408883A (en) * 2008-11-24 2009-04-15 电子科技大学 Method for collecting network public opinion viewpoints
CN101751458A (en) * 2009-12-31 2010-06-23 暨南大学 Network public sentiment monitoring system and method
CN102708096A (en) * 2012-05-29 2012-10-03 代松 Network intelligence public sentiment monitoring system based on semantics and work method thereof
CN103116651A (en) * 2013-03-05 2013-05-22 南京理工大学常熟研究院有限公司 Public sentiment hot topic dynamic detection method
CN104598450A (en) * 2013-10-30 2015-05-06 北大方正集团有限公司 Popularity analysis method and system of network public opinion event

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109451147A (en) * 2018-10-15 2019-03-08 麒麟合盛网络技术股份有限公司 Information display method and device
CN109451147B (en) * 2018-10-15 2020-11-24 麒麟合盛网络技术股份有限公司 Information display method and device
CN109657116A (en) * 2018-11-12 2019-04-19 平安科技(深圳)有限公司 Public opinion search method, search device, storage medium and terminal device

Similar Documents

Publication Publication Date Title
CN104504150B (en) News public sentiment monitoring system
CN104537097B (en) Microblogging public sentiment monitoring system
Atkinson et al. Near real time information mining in multilingual news
CN108776671A (en) A kind of network public sentiment monitoring system and method
CN105718587A (en) Network content resource evaluation method and evaluation system
CN104504151B (en) WeChat public sentiment monitoring system
US10565253B2 (en) Model generation method, word weighting method, device, apparatus, and computer storage medium
CN105975478A (en) Word vector analysis-based online article belonging event detection method and device
CN103577558A (en) Device and method for optimizing search ranking of frequently asked question and answer pairs
CN104657393A (en) Public opinion analysis method and corresponding device
CN108416034B (en) Information acquisition system based on financial heterogeneous big data and control method thereof
CN104899335A (en) Method for performing sentiment classification on network public sentiment of information
CN104615627A (en) Event public sentiment information extracting method and system based on micro-blog platform
CN106169050B (en) A kind of PoC Program extraction method based on webpage Knowledge Discovery
CN107145568A (en) A kind of quick media event clustering system and method
CN106446051A (en) Deep search method of Eagle media assets
CN106257458A (en) A kind of public feelings information sorts out assessment system
CN109933709B (en) Public opinion tracking method and device for video text combined data and computer equipment
CN103838739A (en) Method and system for detecting error correction words in search engine
CN106257457B (en) A kind of public sentiment compiles method
CN112395513A (en) Public opinion transmission power analysis method
CN110110188A (en) A kind of network public-opinion monitoring system based on cloud computing technology
CN104572767A (en) Method and system for language classification of sites
CN109241085B (en) Big data SQL query method for SolrCloud
CN113012009A (en) Intelligent policy information acquisition and analysis system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161228