CN113946736A - System and method for calculating event heat

Publication number
CN113946736A
Authority
CN
China
Prior art keywords
data
module
information
event
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111210594.5A
Other languages
Chinese (zh)
Inventor
朱旭琪
王欢
夏茂晋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qingbo Intelligent Technology Co ltd
Original Assignee
Beijing Qingbo Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-10-18
Filing date
2021-10-18
Publication date
2022-01-18
Application filed by Beijing Qingbo Intelligent Technology Co ltd
Priority to CN202111210594.5A
Publication of CN113946736A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The invention discloses a system for calculating event heat. The system comprises a data acquisition module, an information storage module, a data classification module, a heat calculation module and a data generation module. The data acquisition module crawls large volumes of content from the Internet using crawler technology and sends the content to the information storage module. The information storage module stores and manages the acquired content information and builds a large database. The data classification module extracts data from the information storage module, performs topic aggregation on the content texts to generate event sets, and provides the heat calculation module with a data set ready for analysis. The method fully considers the content indicators of news texts, analyzes the volume of Internet content with the specific event as the unit of analysis, adopts a more comprehensive indicator system, and finally obtains the heat of the specific event.

Description

System and method for calculating event heat
Technical Field
The invention belongs to the field of content analysis and processing, and particularly relates to a system for calculating event heat.
Background
With the rapid development of the Internet, social network media grow day by day, and most users depend on them more and more. Users differ in the news they want, so user groups naturally pay different amounts of attention to different news items. In aggregate statistics, some news items are accessed frequently and attract high user attention, while others are accessed rarely and attract little. Currently, the number of times a news item is played can serve as a heat value, a single indicator that quantifies the attention a user group pays to that item; existing approaches mostly rely on event volume or on the number of reads of event information. We therefore improve on this and propose a system for calculating event heat.
Disclosure of Invention
The invention aims to overcome the problems in the prior art and provide a system for calculating event heat that fully considers the content indicators of news texts, analyzes the volume of Internet content with a specific event as the unit of analysis, adopts a more comprehensive indicator system, and finally obtains the heat of the specific event.
To achieve the above technical purpose and effect, the invention is realized by the following technical scheme:
a system for calculating event heat comprises a data acquisition module, an information storage module, a data classification module, a heat calculation module and a data generation module;
the data acquisition module crawls large volumes of content from the Internet using crawler technology and sends the content to the information storage module;
the information storage module stores and manages the acquired content information and builds a large database;
the data classification module extracts data from the information storage module, performs topic aggregation on the content texts to generate event sets, and provides the heat calculation module with a data set ready for analysis;
the heat calculation module performs a combined calculation over the three dimensions of data source, content volume and event volume over different time periods to obtain a comprehensive heat score and a comprehensive time-freshness score;
the data generation module generates ranking list data from the combination of the comprehensive heat score and the comprehensive time-freshness score.
Further, the data acquisition module collects text information, comment counts, forwarding counts, basic user information and user comment interaction information, and sends them to the information storage module.
Further, the data classification module extracts the basic user information and user comment interaction information from the database to generate the data source A of the text information, generates the content volume B from the text information, and generates the event volume C from the comment counts and forwarding counts; for the obtained data, it counts the data source A, content volume B and event volume C in each time period, with the hour as the unit.
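The hourly statistics described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record fields (`event_id`, `timestamp`, `user_id`, `comments`, `forwards`) and the function name `hourly_counts` are hypothetical, and the patent does not define exactly how A, B and C are counted. Here A is taken as the number of distinct source accounts, B as the number of content items, and C as the sum of comment and forwarding counts.

```python
from collections import defaultdict
from datetime import datetime


def hourly_counts(records):
    """Aggregate A, B and C per (event, hour) bucket.

    Each record is a dict with the hypothetical keys
    event_id, timestamp (datetime), user_id, comments, forwards.
    """
    buckets = defaultdict(lambda: {"users": set(), "B": 0, "C": 0})
    for r in records:
        # Truncate the timestamp to the start of its hour.
        hour = r["timestamp"].replace(minute=0, second=0, microsecond=0)
        bucket = buckets[(r["event_id"], hour)]
        bucket["users"].add(r["user_id"])               # A: distinct sources (assumption)
        bucket["B"] += 1                                # B: one content item per record
        bucket["C"] += r["comments"] + r["forwards"]    # C: interaction count
    return {
        key: {"A": len(b["users"]), "B": b["B"], "C": b["C"]}
        for key, b in buckets.items()
    }
```

Keying the buckets by `(event_id, hour)` keeps the per-event time series separate, so any later scoring step can sum or compare whichever hours it needs.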
A method for calculating event heat comprises the following steps:
S1, collecting hot-event news and user information: crawling large volumes of content from the Internet using crawler technology, acquiring the hot news information, content forwarding counts, content comment counts, basic user information and the text information to be calculated, and sending the content to the information storage module;
S2, the data classification module extracts the basic user information and user comment interaction information from the database to generate the data source A of the text information, generates the content volume B from the text information, and generates the event volume C from the comment counts and forwarding counts; the obtained data are counted by the hour to give the data source A, content volume B and event volume C in each time period; a heat index model for the service scenario is then formed according to the combination formula $H = \lambda_1 A + \lambda_2 B + \lambda_3 C$, where the values of each set may be further standardized; first operation statistics of the content from each source within a first preset time period and second operation statistics within a second preset time period are counted, and a comprehensive heat score and a comprehensive time-freshness score are obtained;
S3, the data generation module generates ranking list data from the combination of the comprehensive heat score and the comprehensive time-freshness score.
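The weighted combination of A, B and C in step S2 can be illustrated with a minimal Python sketch. The patent specifies neither the weights nor the standardization method, so the min-max scaling and the default weights (0.3, 0.3, 0.4) below are assumptions chosen only for illustration, as are the function names `min_max` and `heat_scores`.

```python
def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def heat_scores(events, weights=(0.3, 0.3, 0.4)):
    """Compute H = l1*A + l2*B + l3*C on standardized values.

    events maps an event id to a tuple (A, B, C) of raw counts.
    weights holds (l1, l2, l3); the defaults are illustrative only.
    """
    ids = list(events)
    # Standardize each dimension across all events before weighting.
    columns = [min_max([events[i][k] for i in ids]) for k in range(3)]
    l1, l2, l3 = weights
    return {
        event_id: l1 * columns[0][n] + l2 * columns[1][n] + l3 * columns[2][n]
        for n, event_id in enumerate(ids)
    }
```

Standardizing before weighting keeps a dimension with a large raw range, such as interaction counts, from dominating the score simply because of its scale.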
The beneficial effects of the invention are as follows: the system for calculating event heat fully considers the content indicators of news texts, analyzes the volume of Internet content with the specific event as the unit of analysis, adopts a more comprehensive indicator system, and finally obtains the heat of the specific event.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic structural view of the present invention;
FIG. 2 is a schematic diagram of the calculation process of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "opening," "upper," "lower," "thickness," "top," "middle," "length," "inner," "peripheral," and the like are used in an orientation or positional relationship that is merely for convenience in describing and simplifying the description, and do not indicate or imply that the referenced component or element must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be considered as limiting the present invention.
As shown in Fig. 1, a system for calculating event heat includes a data acquisition module, an information storage module, a data classification module, a heat calculation module and a data generation module. The data acquisition module crawls large volumes of content from the Internet using crawler technology and sends the content to the information storage module. The information storage module stores and manages the acquired content information and builds a large database. The data classification module extracts data from the information storage module, performs topic aggregation on the content texts to generate event sets, and provides the heat calculation module with a data set ready for analysis.
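The patent does not say which algorithm performs the topic aggregation. As one possible illustration, the sketch below groups texts into event sets with a greedy single-pass clustering based on Jaccard similarity of token sets. The function names, the 0.5 threshold and the whitespace tokenization are all assumptions; Chinese text in particular would need a proper word segmenter instead of `split()`.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as 0.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def aggregate_events(texts, threshold=0.5):
    """Greedy single-pass clustering of texts into event sets.

    Returns a list of clusters, each a list of indices into texts.
    A text joins the most similar existing cluster if the similarity
    reaches the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each entry: {"tokens": set of words, "members": [indices]}
    for idx, text in enumerate(texts):
        tokens = set(text.lower().split())
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = jaccard(tokens, cluster["tokens"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"tokens": set(tokens), "members": [idx]})
        else:
            best["members"].append(idx)
            best["tokens"] |= tokens  # grow the cluster vocabulary
    return [cluster["members"] for cluster in clusters]
```

A single pass suits a continuous crawl because each new text is assigned once without reclustering earlier ones, at the cost of results that depend on arrival order.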
As shown in Fig. 2, in the system for calculating event heat, the heat calculation module performs a combined calculation over the three dimensions of data source, content volume and event volume over different time periods to obtain a comprehensive heat score and a comprehensive time-freshness score. The data generation module generates ranking list data from the combination of the comprehensive heat score and the comprehensive time-freshness score. The data acquisition module collects text information, comment counts, forwarding counts, basic user information and user comment interaction information and sends them to the information storage module. The data classification module extracts the basic user information and user comment interaction information from the database to generate the data source A of the text information, generates the content volume B from the text information, and generates the event volume C from the comment counts and forwarding counts; for the obtained data, it counts the data source A, content volume B and event volume C in each time period, with the hour as the unit.
A method for calculating event heat comprises the following steps:
S1, collecting hot-event news and user information: crawling large volumes of content from the Internet using crawler technology, acquiring the hot news information, content forwarding counts, content comment counts, basic user information and the text information to be calculated, and sending the content to the information storage module;
S2, the data classification module extracts the basic user information and user comment interaction information from the database to generate the data source A of the text information, generates the content volume B from the text information, and generates the event volume C from the comment counts and forwarding counts; the obtained data are counted by the hour to give the data source A, content volume B and event volume C in each time period; a heat index model for the service scenario is then formed according to the combination formula $H = \lambda_1 A + \lambda_2 B + \lambda_3 C$, where the values of each set may be further standardized; first operation statistics of the content from each source within a first preset time period and second operation statistics within a second preset time period are counted, and a comprehensive heat score and a comprehensive time-freshness score are obtained;
S3, the data generation module generates ranking list data from the combination of the comprehensive heat score and the comprehensive time-freshness score.
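The patent does not define how the two preset time periods yield the time-freshness score, nor how the two scores are combined for the ranking in step S3. The Python sketch below shows one plausible reading under explicit assumptions: freshness is the share of an event's volume in a long window that falls inside a short recent window, and the ranking sorts by a weighted sum of heat and freshness. The window lengths, the weight `alpha` and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def freshness(hourly, now, short_hours=6, long_hours=48):
    """Share of the long-window volume that falls in the short window.

    hourly maps an hour (datetime) to that hour's volume.
    Returns a value in [0, 1]; 0.0 when the long window is empty.
    """
    short_start = now - timedelta(hours=short_hours)
    long_start = now - timedelta(hours=long_hours)
    short = sum(v for t, v in hourly.items() if short_start <= t <= now)
    long_total = sum(v for t, v in hourly.items() if long_start <= t <= now)
    return short / long_total if long_total else 0.0


def rank(events, alpha=0.7):
    """Order event ids by alpha * heat + (1 - alpha) * freshness, descending.

    events maps an event id to a tuple (heat, freshness), both in [0, 1].
    """
    def combined(event_id):
        heat, fresh = events[event_id]
        return alpha * heat + (1 - alpha) * fresh

    return sorted(events, key=combined, reverse=True)
```

With `alpha` close to 1 the list favors events with the largest overall heat, while a smaller `alpha` lets recently surging events rise in the ranking.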
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed.

Claims (4)

1. A system for calculating event heat, characterized by comprising a data acquisition module, an information storage module, a data classification module, a heat calculation module and a data generation module;
the data acquisition module crawls large volumes of content from the Internet using crawler technology and sends the content to the information storage module;
the information storage module stores and manages the acquired content information and builds a large database;
the data classification module extracts data from the information storage module, performs topic aggregation on the content texts to generate event sets, and provides the heat calculation module with a data set ready for analysis;
the heat calculation module performs a combined calculation over the three dimensions of data source, content volume and event volume over different time periods to obtain a comprehensive heat score and a comprehensive time-freshness score;
the data generation module generates ranking list data from the combination of the comprehensive heat score and the comprehensive time-freshness score.
2. The system for calculating the popularity of events according to claim 1, wherein the data acquisition module is configured to collect text information, comment times, forwarding times, user basic information, and user comment interaction information, and send the collected information to the information storage module.
3. The system for calculating event popularity according to claim 1, wherein the data classification module is configured to extract the basic user information and user comment interaction information from the database to generate a data source A of the text information, generate a content volume B from the text information, generate an event volume C from the comment counts and forwarding counts, and count the data source A, the content volume B and the event volume C in each time period, with the hour as the unit.
4. A method for calculating event heat based on the system of claim 1, wherein the method comprises the following steps:
S1, collecting hot-event news and user information: crawling large volumes of content from the Internet using crawler technology, acquiring the hot news information, content forwarding counts, content comment counts, basic user information and the text information to be calculated, and sending the content to the information storage module;
S2, the data classification module extracts the basic user information and user comment interaction information from the database to generate the data source A of the text information, generates the content volume B from the text information, and generates the event volume C from the comment counts and forwarding counts; the obtained data are counted by the hour to give the data source A, content volume B and event volume C in each time period; a heat index model for the service scenario is then formed according to the combination formula $H = \lambda_1 A + \lambda_2 B + \lambda_3 C$, where the values of each set may be further standardized; first operation statistics of the content from each source within a first preset time period and second operation statistics within a second preset time period are counted, and a comprehensive heat score and a comprehensive time-freshness score are obtained;
S3, the data generation module generates ranking list data from the combination of the comprehensive heat score and the comprehensive time-freshness score.
CN202111210594.5A 2021-10-18 2021-10-18 System and method for calculating event heat Pending CN113946736A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111210594.5A CN113946736A (en) 2021-10-18 2021-10-18 System and method for calculating event heat

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111210594.5A CN113946736A (en) 2021-10-18 2021-10-18 System and method for calculating event heat

Publications (1)

Publication Number Publication Date
CN113946736A true CN113946736A (en) 2022-01-18

Family

ID=79331127

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111210594.5A Pending CN113946736A (en) 2021-10-18 2021-10-18 System and method for calculating event heat

Country Status (1)

Country Link
CN (1) CN113946736A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103077190A (en) * 2012-12-20 2013-05-01 人民搜索网络股份公司 Hot event ranking method based on order learning technology
CN103177076A (en) * 2012-12-28 2013-06-26 中联竞成(北京)科技有限公司 Public sentiment monitoring system and method based on fixed point websites
CN103324665A (en) * 2013-05-14 2013-09-25 亿赞普(北京)科技有限公司 Hot spot information extraction method and device based on micro-blog
CN104035960A (en) * 2014-05-08 2014-09-10 东莞市巨细信息科技有限公司 Internet information hotspot predicting method
CN104216954A (en) * 2014-08-20 2014-12-17 北京邮电大学 Prediction device and prediction method for state of emergency topic
CN105589895A (en) * 2014-11-13 2016-05-18 深圳市腾讯计算机系统有限公司 Resource ranking data generation method and device
CN106980692A (en) * 2016-05-30 2017-07-25 国家计算机网络与信息安全管理中心 A kind of influence power computational methods based on microblogging particular event
CN111143655A (en) * 2019-12-30 2020-05-12 创新奇智(青岛)科技有限公司 Method for calculating news popularity
CN111310079A (en) * 2020-02-14 2020-06-19 腾讯科技(深圳)有限公司 Comment information sorting method and device, storage medium and server
CN111461553A (en) * 2020-04-02 2020-07-28 上饶市中科院云计算中心大数据研究院 System and method for monitoring and analyzing public sentiment in scenic spot

Similar Documents

Publication Publication Date Title
CN106980692B (en) Influence calculation method based on microblog specific events
Batool et al. Precise tweet classification and sentiment analysis
CN103258000B (en) Method and device for clustering high-frequency keywords in webpages
CN103745000B (en) Hot topic detection method of Chinese micro-blogs
CN102426610B (en) Microblog rank searching method and microblog searching engine
Yu et al. Ring: Real-time emerging anomaly monitoring system over text streams
CN103186663B (en) A kind of network public-opinion monitoring method based on video and system
Rehman et al. Building a data warehouse for twitter stream exploration
Li et al. Suggest what to tag: Recommending more precise hashtags based on users’ dynamic interests and streaming tweet content
US9407589B2 (en) System and method for following topics in an electronic textual conversation
CN103064880A (en) Method, device and system based on searching information for providing users with website choice
CN107193867A (en) Much-talked-about topic analysis method based on big data
CN112104642A (en) Abnormal account number determination method and related device
Lee et al. An automatic topic ranking approach for event detection on microblogging messages
CN114637903A (en) Public opinion data acquisition system for directional target data expansion
CN110019763B (en) Text filtering method, system, equipment and computer readable storage medium
CN113946736A (en) System and method for calculating event heat
Wu et al. An event timeline extraction method based on news corpus
Rzeszutek et al. Self-organizing maps for topic trend discovery
CN116595043A (en) Big data retrieval method and device
CN109902230A (en) A kind of processing method and processing device of news data
Zhao et al. A system to manage and mine microblogging data
Yao et al. Ushio: Analyzing news media and public trends in Twitter
Xia et al. Attribution of Responsibility for Pick Up Artist Issues in China: The Impacts of Journalist Gender, Geographical Location, and Publication Range
Van Britsom et al. Automatically generating multi-document summarizations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220118