CN102968452A - Network public opinion information statistical method and system - Google Patents

Info

Publication number
CN102968452A
Authority
CN
China
Prior art keywords
data
statistical
statistics
network public
public sentiment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012104144455A
Other languages
Chinese (zh)
Inventor
杨睿尘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tengyi Science & Technology Development Co Ltd
Original Assignee
Beijing Tengyi Science & Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tengyi Science & Technology Development Co Ltd
Priority to CN2012104144455A
Publication of CN102968452A

Abstract

The invention provides a network public opinion information statistical method and system. The method comprises the following steps: a theme on which statistics are to be compiled is input; data related to the theme are captured from web pages and microblogs by a web crawler and saved; statistics are compiled on the captured data to generate statistical data; and a statistical report is generated from the statistical data. According to the method of the embodiment of the invention, statistical data are obtained by capturing and compiling statistics on web page and microblog data, the efficiency and speed of data statistics are improved by carrying out statistics and presentation in parallel, and the generated statistical report provides convenience for the user.

Description

Network public opinion information statistical method and system
Technical field
The present invention relates to the field of computer technology, and in particular to a network public opinion information statistical method and system.
Background technology
With the widespread adoption of Internet applications, the massive scale of data is becoming increasingly prominent in every respect: from network traffic data to mobile subscriber behavior records, from search engine logs to banks' customer transaction records, and so on. The inherently digital and networked character of this mass of information brings opportunities to improve services, but it also poses many new technical challenges: how to conveniently find new information in such massive data, and how to obtain the data we want through statistical analysis.
The approach generally used at present is to acquire the relevant data from the network and then analyze and process it directly according to demand.
The problem to be solved is the bottleneck in the speed and efficiency of compiling statistics on massive data.
The approach currently used can derive relevant information to a certain extent, but has the following defects:
(1) When faced with massive data, it cannot identify the main threads and key points, and its statistical efficiency is low.
(2) Presentation is slow, so results cannot be shown to the user in a timely manner.
Summary of the invention
The purpose of the present invention is to solve at least one of the above technical defects.
To achieve the above purpose, an embodiment of one aspect of the present invention proposes a network public opinion information statistical method, comprising the following steps: S1: inputting a theme on which statistics are to be compiled; S2: capturing data related to the theme from web pages and microblogs by a web crawler and saving the data; S3: compiling statistics on the captured data to generate statistical data; and S4: generating a statistical report from the statistical data.
According to the method of the embodiment of the invention, statistical data are obtained by capturing and compiling statistics on web page and microblog data, the efficiency and speed of data statistics are improved by carrying out statistics and presentation in parallel, and a statistical report is generated for the user's convenience.
In one embodiment of the invention, the method further comprises: saving the statistical report and presenting it to the user.
In one embodiment of the invention, step S3 specifically comprises: S31: setting the statistical mode of the data; and S32: integrating the associated data among the captured data and compiling statistics on them according to the statistical mode.
In one embodiment of the invention, the theme is one of sentiment information, hot topics, repost-rate ranking and click-rate ranking, or a user-defined theme.
To achieve the above purpose, an embodiment of another aspect of the present invention proposes a network public opinion information statistical system, comprising: an input module for inputting a theme on which statistics are to be compiled; a capture module for capturing data related to the theme from web pages and microblogs by a web crawler and saving the data; a statistical module for compiling statistics on the captured data to generate statistical data; and a report module for generating a statistical report from the statistical data.
According to the system of the embodiment of the invention, statistical data are obtained by capturing and compiling statistics on web page and microblog data, the efficiency and speed of data statistics are improved by carrying out statistics and presentation in parallel, and a statistical report is generated for the user's convenience.
In one embodiment of the invention, the system further comprises: a saving module for saving the statistical report and presenting it to the user.
In one embodiment of the invention, the statistical module specifically comprises: a setting unit for setting the statistical mode of the data; and a statistical unit for integrating the associated data among the captured data and compiling statistics on them according to the statistical mode.
In one embodiment of the invention, the theme is one of sentiment information, hot topics, repost-rate ranking and click-rate ranking, or a user-defined theme.
Additional aspects and advantages of the present invention will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the invention.
Description of drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of a network public opinion information statistical method according to one embodiment of the invention;
Fig. 2 is a flow chart of a network public opinion information statistical method according to another embodiment of the invention;
Fig. 3 is a diagram of a sentiment statistical report according to another embodiment of the invention;
Fig. 4 is a block diagram of a network public opinion information statistical system according to one embodiment of the invention;
Fig. 5 is a block diagram of a statistical module according to one embodiment of the invention; and
Fig. 6 is a block diagram of a network public opinion information statistical system according to another embodiment of the invention.
Embodiment
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.
Fig. 1 is a flow chart of the network public opinion information statistical method of an embodiment of the invention. As shown in Fig. 1, the network public opinion information statistical method according to the embodiment of the invention comprises the following steps:
Step S101: a theme on which statistics are to be compiled is input.
Specifically, the user enters, in an input interface, the theme that needs statistics or that is of interest, where the theme is one of sentiment information, hot topics, repost-rate ranking and click-rate ranking, or a user-defined theme.
Step S102: data related to the theme are captured from web pages and microblogs by a web crawler and saved.
Specifically, after the theme on which statistics are to be compiled is obtained, information related to the theme is captured from the Internet by the web crawler. During capture, the information related to the theme is recorded together with its source, capture time and the like, and saved in a web page database. Microblog extraction covers the microblog services that currently have large user bases, namely Tencent Weibo, Sina Weibo, Sohu Weibo and NetEase Weibo; after information related to the theme is captured, it is recorded together with its source, capture time and the like, and saved in a microblog database.
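By way of illustration only, the record keeping described for step S102 might be sketched as follows. The table names, column names and the use of SQLite are assumptions made for this sketch and are not specified by the embodiment; no actual crawling is performed, and the sample source values are placeholders.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table names for the web page database and microblog database.
TABLES = ("web_records", "microblog_records")

def open_store(path=":memory:"):
    """Create both tables with an assumed schema and return the connection."""
    conn = sqlite3.connect(path)
    for table in TABLES:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} ("
            "theme TEXT, source TEXT, content TEXT, captured_at TEXT)"
        )
    return conn

def save_record(conn, table, theme, source, content, captured_at=None):
    """Store one captured item together with its source and capture time."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    captured_at = captured_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
        (theme, source, content, captured_at),
    )
    conn.commit()

conn = open_store()
save_record(conn, "web_records", "hot topics",
            "https://example.com/news/1", "sample page text")
save_record(conn, "microblog_records", "hot topics",
            "Sina Weibo", "sample post text")
```

Keeping the source and capture time in the same row as the content means the later statistics step can group records by time without consulting the crawler again.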
In one embodiment of the invention, web page data and microblog data are captured in parallel and saved in the web page database and the microblog database respectively; the data are then transferred to a master database for storage and management, after which the information in the web page database and the microblog database is deleted and the two databases are emptied.
According to the method of the embodiment of the invention, deleting database information after it has been processed improves the speed of data interaction and hence efficiency.
Step S103: statistics are compiled on the captured data to generate statistical data.
Specifically, the statistical mode of the data is set first, where the statistical mode includes monthly, daily and hourly statistics; a statistical period is also set, where the statistical period is the interval at which one set of statistical data is obtained. For example, the statistical mode may be monthly statistics with a statistical period of one month. The associated data are then extracted from the database, integrated, and counted according to the set statistical mode and statistical period to generate the statistical data. For example, if the mode is set to monthly and the period is one month, data are extracted from the database and tallied day by day over that period to generate the statistical data.
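As a hedged illustration of the statistical mode and statistical period, the sketch below counts records per month, day or hour within a given period. The record layout (an ISO-format `captured_at` field) and the function name `count_by_mode` are assumptions of this sketch, not part of the embodiment.

```python
from collections import Counter
from datetime import datetime

# One strftime pattern per statistical mode (monthly, daily, hourly).
BUCKET_FORMATS = {
    "monthly": "%Y-%m",
    "daily": "%Y-%m-%d",
    "hourly": "%Y-%m-%d %H:00",
}

def count_by_mode(records, mode, period_start, period_end):
    """Count records whose capture time falls in [period_start, period_end),
    grouped by the bucket that the statistical mode selects."""
    fmt = BUCKET_FORMATS[mode]
    counts = Counter()
    for rec in records:
        ts = datetime.fromisoformat(rec["captured_at"])
        if period_start <= ts < period_end:
            counts[ts.strftime(fmt)] += 1
    return dict(counts)

records = [
    {"captured_at": "2012-10-01T08:15:00"},
    {"captured_at": "2012-10-01T09:40:00"},
    {"captured_at": "2012-10-02T08:05:00"},
    {"captured_at": "2012-11-03T10:00:00"},  # outside the period below
]
start, end = datetime(2012, 10, 1), datetime(2012, 11, 1)
monthly = count_by_mode(records, "monthly", start, end)
daily = count_by_mode(records, "daily", start, end)
```

With a one-month period, the monthly mode yields a single bucket while the daily mode yields one bucket per day on which data were captured.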
In one embodiment of the invention, the computer compiles statistics when idle and suspends statistics when busy. It should be noted that, because the data to be processed are massive network data that take a large amount of time to process, the data produced within the set period are intermediate data rather than the data that result from processing the entire mass of network data.
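The idle-time behavior can be sketched as a loop that processes one chunk of data per step and skips steps during which the machine is busy. How busyness is detected is not stated in the embodiment, so `is_busy` is a hypothetical callback supplied by the caller, and `max_steps` stands in for the end of the statistical period.

```python
from collections import Counter

def run_when_idle(chunks, is_busy, max_steps):
    """Process at most one chunk per step, pausing on busy steps.
    Returns the (possibly intermediate) totals and the number of
    chunks still unprocessed when the period ends."""
    totals = Counter()
    pending = list(chunks)
    for step in range(max_steps):
        if not pending:
            break
        if is_busy(step):
            continue  # statistics are suspended while the machine is busy
        totals.update(pending.pop(0))
    return dict(totals), len(pending)

chunks = [["positive", "negative"], ["positive"], ["neutral", "positive"]]
busy_steps = {1, 2}  # assumed busy intervals, for illustration only
result, remaining = run_when_idle(chunks, lambda s: s in busy_steps, max_steps=4)
```

In this run only two of the three chunks are processed before the period ends, so `result` is an intermediate result in the sense described above.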
Step S104: a statistical report is generated from the statistical data.
In one embodiment of the invention, the generated statistical data are textual data about the theme; these textual data are processed according to the set statistical mode and statistical period to generate the statistical report.
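A plain-text report could be produced from the statistical data along the following lines. The layout, the percentage column and the function name `build_report` are choices made for this sketch only; the embodiment does not define a report format.

```python
def build_report(theme, mode, period, stats):
    """Render statistical data as a small plain-text report."""
    lines = [f"Theme: {theme}", f"Mode: {mode}", f"Period: {period}", ""]
    width = max((len(key) for key in stats), default=0)
    total = sum(stats.values())
    for key in sorted(stats):
        share = 100.0 * stats[key] / total if total else 0.0
        lines.append(f"{key.ljust(width)}  {stats[key]:>6}  {share:5.1f}%")
    lines.append(f"{'Total'.ljust(width)}  {total:>6}")
    return "\n".join(lines)

report = build_report(
    "sentiment information", "daily", "2012-10",
    {"2012-10-01": 2, "2012-10-02": 1},
)
```

The header lines carry the statistical mode and period so that a saved report remains interpretable on its own.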
According to the method for the embodiment of the invention, draw statistics by crawl and statistics to webpage and microblogging data, and by data statistics and parallel data statistics efficient and the speed of having improved that represents, generate simultaneously statistical report form and made things convenient for the user.
Fig. 2 is a flow chart of a network public opinion information statistical method according to another embodiment of the invention. As shown in Fig. 2, the network public opinion information statistical method according to this embodiment comprises the following steps:
Step S201: a theme on which statistics are to be compiled is input.
Specifically, the user enters, in an input interface, the theme that needs statistics or that is of interest, where the theme is one of sentiment information, hot topics, repost-rate ranking and click-rate ranking, or a user-defined theme.
Step S202: data related to the theme are captured from web pages and microblogs by a web crawler and saved.
In one embodiment of the invention, web page data and microblog data are captured in parallel and saved in the web page database and the microblog database respectively; the data are then transferred to a master database for storage and management, after which the information in the web page database and the microblog database is deleted and the two databases are emptied.
Step S203: statistics are compiled on the captured data to generate statistical data.
Specifically, the statistical mode of the data is set first, where the statistical mode includes monthly, daily and hourly statistics; a statistical period is also set, where the statistical period is the interval at which one set of statistical data is obtained. For example, the statistical mode may be monthly statistics with a statistical period of one month. The associated data are then extracted from the database, integrated, and counted according to the set statistical mode and statistical period to generate the statistical data. For example, if the mode is set to monthly and the period is one month, data are extracted from the database and tallied day by day over that period to generate the statistical data.
In one embodiment of the invention, the computer compiles statistics when idle and suspends statistics when busy. It should be noted that, because the data to be processed are massive network data that take a large amount of time to process, the data produced within the set period are intermediate data rather than the data that result from processing the entire mass of network data.
Step S204: a statistical report is generated from the statistical data.
Step S205: the statistical report is saved and presented to the user.
Specifically, the generated statistical report is first saved in a back-end database and then presented to the user through a graphical interface.
In one embodiment of the invention, statistics and presentation are executed in parallel: the massive data are divided into several parts, an intermediate result for one part is first computed at regular intervals, and the intermediate result is stored in the database and presented to the user at the same time. For example, Fig. 3 shows a sentiment statistical report.
According to the method of the embodiment of the invention, executing statistics and presentation in parallel reduces the user's waiting time and lets the user follow the progress of the data statistics, which is convenient for the user.
Fig. 4 is a structural block diagram of the network public opinion information statistical system of an embodiment of the invention. As shown in Fig. 4, the network public opinion information statistical system according to the embodiment of the invention comprises an input module 100, a capture module 200, a statistical module 300 and a report module 400.
Specifically, the input module 100 is used for inputting a theme on which statistics are to be compiled. The user enters, in an input interface, the theme that needs statistics or that is of interest, where the theme is one of sentiment information, hot topics, repost-rate ranking and click-rate ranking, or a user-defined theme.
The capture module 200 is used for capturing data related to the theme from web pages and microblogs by a web crawler and saving the data. Information related to the theme is captured from the Internet by the web crawler. During capture, the information related to the theme is recorded together with its source, capture time and the like, and saved in a web page database. Microblog extraction covers the microblog services that currently have large user bases, namely Tencent Weibo, Sina Weibo, Sohu Weibo and NetEase Weibo; after information related to the theme is captured, it is recorded together with its source, capture time and the like, and saved in a microblog database.
In one embodiment of the invention, web page data and microblog data are captured in parallel and saved in the web page database and the microblog database respectively; the data are then transferred to a master database for storage and management, after which the information in the web page database and the microblog database is deleted and the two databases are emptied.
According to the system of the embodiment of the invention, deleting database information after it has been processed improves the speed of data interaction and hence efficiency.
The statistical module 300 is used for compiling statistics on the captured data to generate statistical data.
Fig. 5 is a structural block diagram of the statistical module of an embodiment of the invention. As shown in Fig. 5, the statistical module 300 according to the embodiment of the invention specifically comprises a setting unit 310 and a statistical unit 320.
More specifically, the setting unit 310 is used for setting the statistical mode of the data, where the statistical mode includes monthly, daily and hourly statistics; a statistical period is also set, where the statistical period is the interval at which one set of statistical data is obtained.
The statistical unit 320 is used for integrating the associated data among the captured data and compiling statistics on them according to the statistical mode.
The associated data are extracted from the database, integrated, and counted according to the set statistical mode and statistical period to generate the statistical data. For example, if the mode is set to monthly and the period is one month, data are extracted from the database and tallied day by day over that period to generate the statistical data.
In one embodiment of the invention, the computer compiles statistics when idle and suspends statistics when busy. It should be noted that, because the data to be processed are massive network data that take a large amount of time to process, the data produced within the set period are intermediate data rather than the data that result from processing the entire mass of network data.
The report module 400 is used for generating a statistical report from the statistical data. The generated statistical data are textual data about the theme; these textual data are processed according to the set statistical mode and statistical period to generate the statistical report.
According to the system of the embodiment of the invention, statistical data are obtained by capturing and compiling statistics on web page and microblog data, the efficiency and speed of data statistics are improved by carrying out statistics and presentation in parallel, and a statistical report is generated for the user's convenience.
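The division into modules 100 to 400 and units 310 and 320 might be mirrored in code roughly as follows. The class and method names, the injected `fetch` function and the `sentiment` field are hypothetical; the sketch shows only how the modules could hand data to one another, not how the system is actually built.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class SettingUnit:               # corresponds to setting unit 310
    mode: str = "daily"
    period: str = "one month"

class StatisticalUnit:           # corresponds to statistical unit 320
    def count(self, records, key):
        return dict(Counter(rec[key] for rec in records))

class InputModule:               # corresponds to input module 100
    def read_theme(self, raw):
        theme = raw.strip()
        if not theme:
            raise ValueError("theme must not be empty")
        return theme

class CaptureModule:             # corresponds to capture module 200
    def __init__(self, fetch):
        self.fetch = fetch       # injected crawler function (assumption)
    def capture(self, theme):
        return list(self.fetch(theme))

class StatisticalModule:         # corresponds to statistical module 300
    def __init__(self, setting, unit):
        self.setting, self.unit = setting, unit
    def run(self, records):
        return self.unit.count(records, "sentiment")

class ReportModule:              # corresponds to report module 400
    def render(self, theme, setting, stats):
        rows = ", ".join(f"{k}={v}" for k, v in sorted(stats.items()))
        return f"{theme} [{setting.mode}, {setting.period}]: {rows}"

def fake_fetch(theme):
    # Canned records in place of a real crawler.
    return [{"theme": theme, "sentiment": s}
            for s in ("positive", "negative", "positive")]

theme = InputModule().read_theme("  hot topics ")
records = CaptureModule(fake_fetch).capture(theme)
stat_module = StatisticalModule(SettingUnit(), StatisticalUnit())
stats = stat_module.run(records)
summary = ReportModule().render(theme, stat_module.setting, stats)
```

Passing the crawler in as a function keeps the capture module testable without network access.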
Fig. 6 is a structural block diagram of the network public opinion information statistical system of another embodiment of the invention. As shown in Fig. 6, the network public opinion information statistical system according to this embodiment further comprises a saving module 500 for saving the statistical report and presenting it to the user.
In one embodiment of the invention, the generated statistical report is first saved in a back-end database and then presented to the user through a graphical interface. Because these data have low requirements for timeliness but high requirements for speed, statistics and presentation are executed in parallel: the massive data are divided into several parts, an intermediate result for one part is first computed at regular intervals, and the intermediate result is stored in the database and presented to the user at the same time.
According to the system of the embodiment of the invention, executing statistics and presentation in parallel reduces the user's waiting time and lets the user follow the progress of the data statistics, which is convenient for the user.
It should be understood that the specific operation of the modules and units in the system embodiments of the invention may be the same as described in the method embodiments, and is not described in detail here.
Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the invention without departing from its principles and purpose.

Claims (8)

1. A network public opinion information statistical method, characterized by comprising the following steps:
S1: inputting a theme on which statistics are to be compiled;
S2: capturing data related to the theme from web pages and microblogs by a web crawler and saving the data;
S3: compiling statistics on the captured data to generate statistical data; and
S4: generating a statistical report from the statistical data.
2. The network public opinion information statistical method according to claim 1, characterized by further comprising:
S5: saving the statistical report and presenting it to the user.
3. The network public opinion information statistical method according to claim 1, characterized in that step S3 specifically comprises:
S31: setting the statistical mode of the data; and
S32: integrating the associated data among the captured data and compiling statistics on them according to the statistical mode.
4. The network public opinion information statistical method according to claim 1, characterized in that the theme is one of sentiment information, hot topics, repost-rate ranking and click-rate ranking, or a user-defined theme.
5. A network public opinion information statistical system, characterized by comprising:
an input module for inputting a theme on which statistics are to be compiled;
a capture module for capturing data related to the theme from web pages and microblogs by a web crawler and saving the data;
a statistical module for compiling statistics on the captured data to generate statistical data; and
a report module for generating a statistical report from the statistical data.
6. The network public opinion information statistical system according to claim 5, characterized by further comprising:
a saving module for saving the statistical report and presenting it to the user.
7. The network public opinion information statistical system according to claim 6, characterized in that the statistical module specifically comprises:
a setting unit for setting the statistical mode of the data; and
a statistical unit for integrating the associated data among the captured data and compiling statistics on them according to the statistical mode.
8. The network public opinion information statistical system according to claim 5, characterized in that the theme is one of sentiment information, hot topics, repost-rate ranking and click-rate ranking, or a user-defined theme.
CN2012104144455A 2012-10-25 2012-10-25 Network public opinion information statistical method and system Pending CN102968452A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012104144455A CN102968452A (en) 2012-10-25 2012-10-25 Network public opinion information statistical method and system

Publications (1)

Publication Number Publication Date
CN102968452A true CN102968452A (en) 2013-03-13

Family

ID=47798590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012104144455A Pending CN102968452A (en) 2012-10-25 2012-10-25 Network public opinion information statistical method and system

Country Status (1)

Country Link
CN (1) CN102968452A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182438A (en) * 2014-02-25 2014-12-03 无锡天脉聚源传媒科技有限公司 Message counting method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853261A (en) * 2009-11-23 2010-10-06 电子科技大学 Network public-opinion behavior analysis method based on social network
CN102567393A (en) * 2010-12-21 2012-07-11 北大方正集团有限公司 Method, device and system for processing public sentiment topics
CN102609427A (en) * 2011-11-10 2012-07-25 天津大学 Public opinion vertical search analysis system and method

Similar Documents

Publication Publication Date Title
JP6549806B1 (en) Segment the content displayed on the computing device into regions based on the pixel of the screenshot image that captures the content
US20150334068A1 (en) Message processing method and apparatus
CN107515878B (en) Data index management method and device
CN104182506A (en) Log management method
CN103678647A (en) Method and system for recommending information
CN103838867A (en) Log processing method and device
US11816172B2 (en) Data processing method, server, and computer storage medium
CN104394118A (en) User identity identification method and system
CN102404240B (en) Information search system and method
CN107765938B (en) Picture interaction method and device
CN110647512A (en) Data storage and analysis method, device, equipment and readable medium
CN104077415A (en) Searching method and device
US20150074043A1 (en) Distributed and open schema interactions management system and method
CN106339891A (en) Intelligent analysis method and system based on large data acquisition
CN103631791A (en) Information fusion classification display method and system
CN111582951A (en) Advertisement putting system and method for cloud electronic commerce
CN102547554B (en) Mobile service recommendation method based on mobile user behavior
CN102957949A (en) Device and method for recommending video to user
CN106815274B (en) Hadoop-based log data mining method and system
CN102929932A (en) Displaying device and displaying method for real-time news
CN103200269A (en) Internet information statistical method and Internet information statistical system
CN114066533A (en) Product recommendation method and device, electronic equipment and storage medium
CA3200883A1 (en) Multi-cache based digital output generation
CN113051460A (en) Elasticissearch-based data retrieval method and system, electronic device and storage medium
CN112528610A (en) Data labeling method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130313