CN107943905A - Hot topic analysis method and system - Google Patents

Hot topic analysis method and system

Info

Publication number
CN107943905A
Authority
CN
China
Prior art keywords
hot topic
article
article data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711146862.5A
Other languages
Chinese (zh)
Other versions
CN107943905B (en)
Inventor
白荣超
万月亮
王梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruian Technology Co Ltd
Original Assignee
Beijing Ruian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ruian Technology Co Ltd
Priority to CN201711146862.5A
Publication of CN107943905A
Application granted
Publication of CN107943905B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

An embodiment of the invention discloses a hot topic analysis method and system. The method includes: collecting article data related to a hot topic according to a collection strategy; and determining, from the article data and according to preset determination rules, the source article, propagation path, spread scope, and development trend of the hot topic. By mining in-depth information about hot topics, the technical solution provided by the embodiments of the invention can analyze the evolution of a hot topic on the network from multiple aspects. This helps network administrators understand network hot events more comprehensively, makes it easier to take relevant public opinion guidance measures in time for hot topics that need to be managed and controlled, and improves the accuracy and detection efficiency of hot topic analysis.

Description

Hot topic analysis method and system
Technical field
Embodiments of the present invention relate to the technical field of information management, and in particular to a hot topic analysis method and system.
Background
As Internet technology matures, interactions between people, between people and organizations, and between people and society keep increasing. Because anyone can publish or aggregate information and thereby become a medium of information dissemination, life today, driven by the Internet, has already entered an era of information explosion.
Information grows and spreads geometrically, and more and more of it surrounds us. People define information about events whose frequency of occurrence is significantly higher than normal as hot topics. Because hot topics reflect public sentiment and public opinion on interactive platforms, they have great research and reference value for social development. Only by recognizing the topics the public cares about can media outlets, businesses, and governments grasp the needs and leanings of the public and decide which measures to take in response to hot issues.
At present, however, the means used in the prior art to analyze hot topics are relatively limited and inefficient, and the analysis results are rather one-sided, so in-depth information about hot topics cannot be mined accurately.
Summary of the invention
The present invention provides a hot topic analysis method and system, so that hot topics can be analyzed from multiple aspects efficiently and accurately.
To achieve this purpose, the present invention adopts the following technical solutions:
In a first aspect, an embodiment of the present invention provides a hot topic analysis method, the method including:
collecting article data related to a hot topic according to a collection strategy;
determining, from the article data and according to preset determination rules, the source article, propagation path, spread scope, and development trend of the hot topic.
Further, in the above method, determining the source article of the hot topic from the article data according to the preset determination rules includes:
sorting the article data by publication time;
selecting, from the sorted article data, the article corresponding to the article data with the earliest publication time as the source article of the hot topic.
Further, in the above method, determining the propagation path of the hot topic from the article data according to the preset determination rules includes:
obtaining an input keyword;
determining, from the article data and according to the keyword, the article data related to the keyword;
arranging the article data related to the keyword by publication time into the first-level propagation path of the hot topic;
selecting, from the article data of the first-level propagation path, a phrase that meets a preset selection condition as a new keyword;
recommending the new keyword to the user, so that the user determines the second-level propagation path of the hot topic through the new keyword;
repeating the operation of determining the next-level propagation path from the new keyword recommended on the basis of the previous-level propagation path, to obtain the multi-level propagation path of the hot topic.
Further, in the above method, determining the spread scope of the hot topic from the article data according to the preset determination rules includes:
clustering the article data that has the same source, and counting the number of article data items for each source;
if the number meets a preset quantity condition, determining the radiation scope corresponding to that source as the spread scope of the hot topic.
Further, in the above method, determining the development trend of the hot topic from the article data according to the preset determination rules includes:
counting the daily increment of the article data under each media type and plotting it as a curve;
determining the development trend of the hot topic according to the slope of the curve and preset media type weight coefficients.
Further, in the above method, after determining the source article, propagation path, spread scope, and development trend of the hot topic from the article data according to the preset determination rules, the method further includes:
determining articles to be recommended from the article data according to a recommendation strategy, and recommending them to the user.
In a second aspect, an embodiment of the present invention provides a hot topic analysis system, including:
a data acquisition module, configured to collect article data related to a hot topic according to a collection strategy;
a hot topic analysis module, configured to determine, from the article data and according to preset determination rules, the source article, propagation path, spread scope, and development trend of the hot topic.
Further, in the above system, the hot topic analysis module includes a source tracing submodule, a path analysis submodule, a scope analysis submodule, and a trend analysis submodule, wherein:
The source tracing submodule includes:
a data sorting unit, configured to sort the article data by publication time;
a source determination unit, configured to select, from the sorted article data, the article corresponding to the article data with the earliest publication time as the source article of the hot topic;
The path analysis submodule includes:
a keyword acquisition unit, configured to obtain an input keyword;
a related data determination unit, configured to determine, from the article data and according to the keyword, the article data related to the keyword;
a first path determination unit, configured to arrange the article data related to the keyword by publication time into the first-level propagation path of the hot topic;
a new keyword selection unit, configured to select, from the article data of the first-level propagation path, a phrase that meets a preset selection condition as a new keyword;
a second path determination unit, configured to recommend the new keyword to the user, so that the user determines the second-level propagation path of the hot topic through the new keyword;
an operation execution unit, configured to repeat the operation of determining the next-level propagation path from the new keyword recommended on the basis of the previous-level propagation path, to obtain the multi-level propagation path of the hot topic;
The scope analysis submodule includes:
a data statistics unit, configured to cluster the article data that has the same source and count the number of article data items for each source;
a scope determination unit, configured to determine the radiation scope corresponding to a source as the spread scope of the hot topic if the number meets a preset quantity condition;
The trend analysis submodule includes:
a curve drawing unit, configured to count the daily increment of the article data under each media type and plot it as a curve;
a trend determination unit, configured to determine the development trend of the hot topic according to the slope of the curve and preset media type weight coefficients.
Further, the system also includes:
an article recommendation module, configured to determine articles to be recommended from the article data according to a recommendation strategy and recommend them to the user, after the source article, propagation path, spread scope, and development trend of the hot topic have been determined from the article data according to the preset determination rules.
By mining in-depth information about hot topics, the technical solution provided by the embodiments of the present invention can analyze the evolution of a hot topic on the network from multiple aspects. This helps network administrators understand network hot events more comprehensively, makes it easier to take relevant public opinion guidance measures in time for hot topics that need to be managed and controlled, and improves the accuracy and detection efficiency of hot topic analysis.
Brief description of the drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flow diagram of the hot topic analysis method provided by embodiment one of the present invention;
Fig. 2 is a flow diagram of the hot topic analysis method provided by embodiment two of the present invention;
Fig. 3 is a flow diagram of the hot topic analysis method provided by embodiment three of the present invention;
Fig. 4 is a flow diagram of the hot topic analysis method provided by embodiment four of the present invention;
Fig. 5 is a flow diagram of the hot topic analysis method provided by embodiment five of the present invention;
Fig. 6 is a flow diagram of the hot topic analysis method provided by embodiment six of the present invention;
Fig. 7 is a structure diagram of the hot topic analysis system provided by embodiment seven of the present invention.
Detailed description
The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1 is a flow diagram of the hot topic analysis method provided by embodiment one of the present invention. The method is preferably applicable to scenarios in which hot topics on the Internet need to be analyzed. It can be executed by a hot topic analysis system, which can be implemented in software and/or hardware. Referring to Fig. 1, the method includes:
S101, collecting article data related to a hot topic according to a collection strategy.
It should be noted that a collection strategy is the collection mode set in advance for each category of data or information. Conventional collection modes are of three kinds: web service interfaces, data exchange, and web crawlers. In actual collection, the collection time, collection interval, collection frequency, and so on can also be configured. For example, the collection time can be set to the nightly rest period, and the data reporting time can be set so that the data collected in the 24 hours before midnight is reported each day, which matches the habits of most users.
Specifically, a suitable collection mode is selected according to the type of data to be collected, the collection time, collection interval, collection frequency, and so on are set, and the corresponding collection is then carried out.
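As a minimal sketch of how such a collection strategy might be written down as configuration, the following Python fragment is illustrative only. The class name `AcquisitionStrategy`, its field names, and the concrete values are assumptions; the text names only the three collection modes and the idea of configuring time, interval, and frequency.

```python
from dataclasses import dataclass
from datetime import time, timedelta

# The three collection modes named in the text.
ALLOWED_METHODS = {"web_service", "data_exchange", "web_crawler"}


@dataclass(frozen=True)
class AcquisitionStrategy:
    """Hypothetical container for the collection settings of one data category."""
    method: str               # one of ALLOWED_METHODS
    start_time: time          # when collection starts each day
    interval: timedelta       # pause between two collection runs
    report_window: timedelta  # how much collected data each report covers

    def __post_init__(self):
        # Reject settings that cannot describe a valid strategy.
        if self.method not in ALLOWED_METHODS:
            raise ValueError(f"unknown collection method: {self.method}")
        if self.interval <= timedelta(0):
            raise ValueError("interval must be positive")


# Example values (assumed): crawl at night, report the previous 24 hours.
news_strategy = AcquisitionStrategy(
    method="web_crawler",
    start_time=time(hour=1),
    interval=timedelta(minutes=30),
    report_window=timedelta(hours=24),
)
```

Keeping one such object per data category mirrors the idea that each category has its own preset collection mode.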
S102, determining, from the article data and according to preset determination rules, the source article, propagation path, spread scope, and development trend of the hot topic.
It should be noted that the preset determination rules are a set of processing rules formulated for analyzing hot topics, with a different determination rule preset for each aspect of the analysis. When different article data is analyzed in the same aspect, the same determination rule is followed even if the article data differs considerably.
Specifically, the source article, propagation path, spread scope, and development trend of a hot topic each have their own determination rule set in advance. For example, the source article of a hot topic is determined from the publication time of the article data, whereas determining the propagation path of a hot topic requires taking the source article as the main thread.
By mining in-depth information about hot topics, the technical solution provided by the embodiments of the present invention can analyze the evolution of a hot topic on the network from multiple aspects. This helps network administrators understand network hot events more comprehensively, makes it easier to take relevant public opinion guidance measures in time for hot topics that need to be managed and controlled, and improves the accuracy and detection efficiency of hot topic analysis.
Embodiment two
As shown in Fig. 2, the hot topic analysis method provided by embodiment two of the present invention builds on the technical solution of embodiment one and further optimizes the part of step S102 "determining, from the article data and according to preset determination rules, the source article of the hot topic". Terms that are the same as or correspond to those in the above embodiments are not explained again here. That is:
S201, collecting article data related to a hot topic according to a collection strategy.
S202, sorting the article data by publication time;
It should be noted that the publication time is an important piece of information in article data. Whether on a news website, a forum, or a blog, every article we see carries its publication time. When we browse an article or a post, the publication time tells us when that article or post appeared on the Internet.
Because the volume of collected article data related to a hot topic is large, the publication time is the most important basis for comparison when the source article has to be determined from all the collected article data.
In one embodiment, article data is already sorted incrementally by publication time while collection is in progress, with the publication time accurate to the year, month, day, hour, minute, and second.
S203, selecting, from the sorted article data, the article corresponding to the article data with the earliest publication time as the source article of the hot topic.
It should be noted that once all the collected article data has been sorted, article data with an earlier publication time sits nearer the front, so the article corresponding to the frontmost article data is selected as the source article of the hot topic.
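The sort-then-pick-earliest rule can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation; the `Article` record and its field names are assumptions introduced for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Article:
    """Hypothetical article record; only `published` is used by the rule."""
    title: str
    source: str
    published: datetime  # publication time, accurate to the second


def find_source_article(articles):
    """Sort by publication time and return the earliest article."""
    if not articles:
        raise ValueError("no article data collected")
    ordered = sorted(articles, key=lambda a: a.published)
    return ordered[0]


collected = [
    Article("Follow-up report", "news-site-b", datetime(2017, 11, 2, 9, 30, 0)),
    Article("First post", "forum-a", datetime(2017, 11, 1, 8, 15, 42)),
    Article("Commentary", "blog-c", datetime(2017, 11, 3, 20, 0, 5)),
]
source_article = find_source_article(collected)
```

Sorting the whole list, rather than calling `min`, follows the text, where the sorted order is produced first and the frontmost item is then taken.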
By mining in-depth information about hot topics, the technical solution provided by the embodiments of the present invention can analyze the evolution of a hot topic on the network from multiple aspects. This helps network administrators understand network hot events more comprehensively, makes it easier to take relevant public opinion guidance measures in time for hot topics that need to be managed and controlled, and improves the accuracy and detection efficiency of hot topic analysis.
Embodiment three
As shown in Fig. 3, the hot topic analysis method provided by embodiment three of the present invention builds on the technical solution of embodiment one and further optimizes the part of step S102 "determining, from the article data and according to preset determination rules, the propagation path of the hot topic". Terms that are the same as or correspond to those in the above embodiments are not explained again here. That is:
S301, collecting article data related to a hot topic according to a collection strategy.
S302, obtaining an input keyword;
It should be noted that the keyword obtained in this step is input by the user. It appears in the earliest batch of article data and summarizes the hot topic.
S303, determining, from the article data and according to the keyword, the article data related to the keyword.
Specifically, article data whose title or summary contains the keyword is determined to be article data related to the keyword.
S304, arranging the article data related to the keyword by publication time into the first-level propagation path of the hot topic;
It should be noted that related article data is strongly similar in its descriptions, the viewpoints it expresses, and so on, so it is considered to belong to the same propagation path.
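Steps S303 and S304 can be sketched as a filter followed by a sort. The following Python is a minimal illustration under stated assumptions: articles are plain dictionaries with `title`, `summary`, and `published` keys (hypothetical names), timestamps are ISO 8601 strings in one shared format so that string order equals chronological order, and matching is case-insensitive substring matching.

```python
def mentions(article, keyword):
    """True if the keyword occurs in the article's title or summary."""
    needle = keyword.lower()
    return needle in article["title"].lower() or needle in article["summary"].lower()


def first_level_path(articles, keyword):
    """Keep the keyword-related articles and order them by publication time."""
    related = [a for a in articles if mentions(a, keyword)]
    return sorted(related, key=lambda a: a["published"])


articles = [
    {"title": "Rescue begins", "summary": "Teams arrive after the flood",
     "published": "2017-11-01T12:00:00"},
    {"title": "Flood hits the city", "summary": "Heavy rain overnight",
     "published": "2017-11-01T08:00:00"},
    {"title": "Market update", "summary": "Prices are stable",
     "published": "2017-11-01T09:00:00"},
]
path = first_level_path(articles, "flood")
```

Here `path` contains the two flood-related articles in chronological order, and the unrelated market article is left out.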
S305, selecting, from the article data of the first-level propagation path, a phrase that meets a preset selection condition as a new keyword;
It should be noted that as a hot topic develops and is discussed further, its focus inevitably shifts to different points and angles, and the article data related to the hot topic itself produces new keywords. To follow the direction of the hot topic in time, new keywords therefore need to be extracted according to the changes in weight and frequency of each phrase in the article data related to the hot topic, and recommended to the user.
Specifically, within a keyword update period, the phrase with the largest weight and the highest frequency is selected as the new keyword and recommended to the user.
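One way to make "largest weight, highest frequency" concrete is to fold both into a single score per phrase. The sketch below is an assumption-laden illustration: the text does not define how weight is computed, so a phrase in a title is simply counted more heavily than one in the body (`title_weight`, a hypothetical parameter), and each article is assumed to be already segmented into `title_phrases` and `body_phrases` (hypothetical field names).

```python
from collections import Counter


def pick_new_keyword(path_articles, used_keywords, title_weight=2.0):
    """Return the highest-scoring phrase not yet used as a keyword, or None."""
    scores = Counter()
    for article in path_articles:
        for phrase in article["title_phrases"]:
            scores[phrase] += title_weight  # assumed: title hits weigh more
        for phrase in article["body_phrases"]:
            scores[phrase] += 1.0           # each body hit counts once
    for keyword in used_keywords:
        scores.pop(keyword, None)           # never re-suggest an old keyword
    if not scores:
        return None
    # Iterate phrases alphabetically so ties are broken deterministically.
    return max(sorted(scores), key=lambda phrase: scores[phrase])


path_articles = [
    {"title_phrases": ["flood"], "body_phrases": ["rescue", "rain"]},
    {"title_phrases": ["rescue"], "body_phrases": ["rescue", "flood"]},
]
suggestion = pick_new_keyword(path_articles, used_keywords={"flood"})
```

In this sample, "rescue" scores 4.0 and "rain" scores 1.0 once the already-used keyword "flood" is excluded, so "rescue" is suggested.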
S306, recommending the new keyword to the user, so that the user determines the second-level propagation path of the hot topic through the new keyword;
Similarly to the determination of the first-level propagation path, the article data related to the new keyword is determined from the article data other than that of the first-level propagation path and is arranged by publication time to obtain the second-level propagation path of the hot topic. In other words, the next-level propagation path is determined from the new keyword recommended on the basis of the previous-level propagation path.
S307, repeating the operation of determining the next-level propagation path from the new keyword recommended on the basis of the previous-level propagation path, to obtain the multi-level propagation path of the hot topic.
It should be noted that new keywords keep emerging as a hot topic develops, and the propagation paths of the hot topic are obtained by sorting the article data related to each keyword by publication time. The times at which the corresponding keywords appear give the propagation paths a clear sequential, cause-and-effect relationship, so the paths are related as upper and lower levels.
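The repeated operation in S306 and S307 can be sketched as a loop that removes each level's articles from the pool before the next keyword is applied. This is an illustrative sketch, not the patented implementation: the dictionary keys, the `suggest` callback (standing in for the keyword recommendation plus the user's confirmation), and the `max_levels` cap are all assumptions.

```python
def mentions(article, keyword):
    """True if the keyword occurs in the article's title or summary."""
    needle = keyword.lower()
    return needle in article["title"].lower() or needle in article["summary"].lower()


def build_paths(articles, first_keyword, suggest, max_levels=5):
    """Return [(keyword, time-ordered articles), ...], one entry per level."""
    remaining = list(articles)
    paths = []
    keyword = first_keyword
    while keyword is not None and len(paths) < max_levels:
        related = sorted((a for a in remaining if mentions(a, keyword)),
                         key=lambda a: a["published"])
        if not related:
            break
        paths.append((keyword, related))
        # Later levels only use articles not already placed on a path.
        remaining = [a for a in remaining if a not in related]
        keyword = suggest(related)  # assumed hook: returns next keyword or None
    return paths


articles = [
    {"title": "Flood hits the city", "summary": "Heavy rain overnight",
     "published": "2017-11-01T08:00:00"},
    {"title": "Rescue begins", "summary": "Teams arrive after the flood",
     "published": "2017-11-01T12:00:00"},
    {"title": "Donation drive", "summary": "Rescue teams need supplies",
     "published": "2017-11-02T09:00:00"},
]
confirmed = iter(["rescue", None])  # keywords the user confirms, then stops
paths = build_paths(articles, "flood", lambda level: next(confirmed))
```

With this sample data the first level holds the two flood articles and the second level holds only "Donation drive", because "Rescue begins" was already placed on the first level.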
By mining in-depth information about hot topics, the technical solution provided by the embodiments of the present invention can analyze the evolution of a hot topic on the network from multiple aspects. This helps network administrators understand network hot events more comprehensively, makes it easier to take relevant public opinion guidance measures in time for hot topics that need to be managed and controlled, and improves the accuracy and detection efficiency of hot topic analysis.
Embodiment four
As shown in Fig. 4, the hot topic analysis method provided by embodiment four of the present invention builds on the technical solution of embodiment one and further optimizes the part of step S102 "determining, from the article data and according to preset determination rules, the spread scope of the hot topic". Terms that are the same as or correspond to those in the above embodiments are not explained again here. That is:
S401, collecting article data related to a hot topic according to a collection strategy.
S402, clustering the article data that has the same source, and counting the number of article data items for each source;
It should be noted that the source is the address at which the article data was published. These addresses usually lead to the different media types on the Internet, such as news sites and forums.
Specifically, article data with the same source, for example article data from the same news website or forum website, is clustered together, and the number of article data items in each cluster is then counted.
S403, if the number meets a preset quantity condition, determining the radiation scope corresponding to the source as the spread scope of the hot topic.
It should be noted that the spread scope analyzes a hot topic from the angle of the region over which it propagates. It mainly studies how broad a population the hot topic affects, reflects the propagation area more intuitively, and shows how widespread the hot topic is. Each media source, such as a news site or a forum, is preset with a corresponding radiation scope (individual, overseas, city-level, provincial, national), which refers to the geographic range that its news or posts can reach.
In one embodiment, the rules are as follows. If at least one news item appears on a news website whose radiation scope is national, or at least three posts appear on a forum whose radiation scope is national, the spread scope of the topic is defined as national. If at least one news item appears on a news website whose radiation scope is provincial, or at least two posts appear on a forum whose radiation scope is provincial, the spread scope of the topic is defined as provincial. If at least one news item appears on a news website whose radiation scope is city-level, or at least one post appears on a forum whose radiation scope is city-level, the spread scope of the topic is defined as city-level. The other radiation scopes are calculated similarly, and the details can be decided according to the actual situation.
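The example thresholds above lend themselves to a small lookup table. The Python below is a minimal sketch under stated assumptions: `source_info` (a hypothetical mapping from each source to its media type and preset radiation scope) is supplied by the caller, only the three scopes with stated thresholds are covered, and when several scopes qualify the widest one is reported, which the text does not state explicitly.

```python
from collections import Counter

# Minimum article count per (radiation scope, media type), from the example rules.
THRESHOLDS = {
    "national":   {"news": 1, "forum": 3},
    "provincial": {"news": 1, "forum": 2},
    "city":       {"news": 1, "forum": 1},
}
SCOPE_ORDER = ["national", "provincial", "city"]  # widest first (assumed priority)


def spread_scope(articles, source_info):
    """Group articles by source, count them, and return the widest scope reached."""
    counts = Counter(article["source"] for article in articles)
    reached = set()
    for source, count in counts.items():
        media_type, scope = source_info[source]
        needed = THRESHOLDS.get(scope, {}).get(media_type)
        if needed is not None and count >= needed:
            reached.add(scope)
    for scope in SCOPE_ORDER:
        if scope in reached:
            return scope
    return None  # no source met its threshold


source_info = {
    "national-forum": ("forum", "national"),
    "province-news": ("news", "provincial"),
}
articles = [
    {"source": "national-forum"},
    {"source": "national-forum"},
    {"source": "province-news"},
]
scope = spread_scope(articles, source_info)
```

Two posts on the national forum fall short of the three required, so this sample topic is classified as provincial on the strength of the single provincial news item.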
By mining in-depth information about hot topics, the technical solution provided by the embodiments of the present invention can analyze the evolution of a hot topic on the network from multiple aspects. This helps network administrators understand network hot events more comprehensively, makes it easier to take relevant public opinion guidance measures in time for hot topics that need to be managed and controlled, and improves the accuracy and detection efficiency of hot topic analysis.
Embodiment five
As shown in Fig. 5, the hot topic analysis method provided by embodiment five of the present invention builds on the technical solution of embodiment one and further optimizes the part of step S102 "determining, from the article data and according to preset determination rules, the development trend of the hot topic". Terms that are the same as or correspond to those in the above embodiments are not explained again here. That is:
S501, collecting article data related to a hot topic according to a collection strategy.
S502, counting the daily increment of the article data under each media type and plotting it as a curve;
In the embodiments of the present invention, the media types preferably include news, forums, and blogs.
Specifically, the daily increment of the article data under each of the three media types (news, forum, and blog) is counted and plotted as a curve.
S503, determining the development trend of the hot topic according to the slope of the curve and preset media type weight coefficients.
It should be noted that the development trend of a hot topic refers to the sum of the topic indexes of the hot topic under the different media types, where the topic index is calculated as the product of the daily increment curve and the weight coefficient.
Because the three media types (news, forum, and blog) reach audiences to different degrees, they need to be assigned different weight coefficients.
In one implementation, the media type weight coefficient is preferably preset to 0.6 for news, 0.35 for forums, and 0.05 for blogs.
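A weighted combination of per-media slopes can be sketched as follows. This is an illustration under stated assumptions rather than the patented calculation: the input is the number of new articles per day for each media type, the slope is taken as the difference between the last two days (the text does not say how the slope is measured), and the "rising", "falling", and "flat" labels are hypothetical.

```python
# Preset media type weight coefficients from the preferred implementation.
WEIGHTS = {"news": 0.6, "forum": 0.35, "blog": 0.05}


def trend_index(daily_increments, weights=WEIGHTS):
    """Sum over media types of weight * latest slope of the daily-increment curve."""
    index = 0.0
    for media_type, series in daily_increments.items():
        if len(series) < 2:
            continue  # a slope needs at least two days of data
        slope = series[-1] - series[-2]  # assumed: slope over the last day
        index += weights.get(media_type, 0.0) * slope
    return index


def describe(index):
    """Hypothetical labels for the sign of the index."""
    if index > 0:
        return "rising"
    if index < 0:
        return "falling"
    return "flat"


daily_increments = {
    "news": [10, 20],   # slope +10, contributes 0.6 * 10
    "forum": [5, 3],    # slope -2, contributes 0.35 * -2
    "blog": [1, 1],     # slope 0, contributes nothing
}
index = trend_index(daily_increments)
label = describe(index)
```

Because news carries the largest weight, growth in news coverage outweighs the slight decline on forums in this sample, and the topic is labeled as rising.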
By mining in-depth information about hot topics, the technical solution provided by the embodiments of the present invention can analyze the evolution of a hot topic on the network from multiple aspects. This helps network administrators understand network hot events more comprehensively, makes it easier to take relevant public opinion guidance measures in time for hot topics that need to be managed and controlled, and improves the accuracy and detection efficiency of hot topic analysis.
Embodiment six
Fig. 6 is the flow diagram for the much-talked-about topic analysis method that the embodiment of the present invention six provides, and the present embodiment is above-mentioned On the basis of embodiment, preset in the basis and determine rule, from the article data, determine the source of the much-talked-about topic After article, propagation path, spread scope and development trend, optimization is made to this method.Specifically, preset really in the basis Set pattern then, from the article data, determines that source article, propagation path, spread scope and the development of the much-talked-about topic become After gesture, add " according to Generalization bounds, article to be recommended being determined from the article data, and recommend user ".With it is upper State that embodiment is identical or the explanation of corresponding term details are not described herein.The method of the present embodiment can specifically include following step Suddenly:
S601, collect article data related to the hot topic according to a collection strategy.
S602, determine, from the article data and according to a preset determination rule, the source article, propagation path, propagation scope, and development trend of the hot topic.
S603, determine articles to be recommended from the article data according to a recommendation strategy, and recommend them to the user.
In one embodiment, articles are recommended to the user from the article data on three aspects: propagation scope, repost rate, and popularity.
Specifically, from the article data, the five articles with the widest propagation scope, the five articles with the most reposts, and the five articles with the most comments and replies are selected and recommended to the user.
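The selection described above amounts to three independent top-five rankings. The following is a minimal sketch of that idea; the `Article` fields (`spread`, `reposts`, `replies`) and the result keys are hypothetical names chosen for illustration, and how the propagation scope of a single article is measured is not specified here, so `spread` is simply assumed to be a number.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Article:
    title: str
    spread: int    # assumed numeric measure of propagation scope
    reposts: int   # number of reposts
    replies: int   # number of comments and replies


def recommend(articles, k=5):
    """Return the top-k titles for each of the three ranking criteria."""
    def top(key):
        return [a.title for a in heapq.nlargest(k, articles, key=key)]

    return {
        "widest_spread": top(lambda a: a.spread),
        "most_reposted": top(lambda a: a.reposts),
        "most_discussed": top(lambda a: a.replies),
    }
```

The three lists are kept separate rather than merged, since the text recommends five articles per criterion; an article may therefore appear in more than one list.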
The technical solution provided by this embodiment of the present invention mines hot topics in depth, so that the evolution of a hot topic on the network can be analyzed from multiple aspects. By recommending articles of reference value, it also helps network administrators understand network hot events more fully and, for hot topics that need to be managed and controlled, makes it easier to take appropriate public-opinion guidance measures in time, improving the accuracy and detection efficiency of hot topic analysis.
Embodiment seven
As shown in Fig. 7, Embodiment seven of the present invention provides a structural diagram of a hot topic analysis system. The system specifically includes the following modules:
A data acquisition module 71, configured to collect article data related to a hot topic according to a collection strategy.
A hot topic analysis module 72, configured to determine, from the article data and according to a preset determination rule, the source article, propagation path, propagation scope, and development trend of the hot topic.
Preferably, the hot topic analysis module 72 includes a source tracing submodule, a path analysis submodule, a scope analysis submodule, and a trend analysis submodule, wherein:
The source tracing submodule includes:
A data sorting unit, configured to sort the article data by publication time;
A source determination unit, configured to select, from the sorted article data, the article corresponding to the article data with the earliest publication time as the source article of the hot topic.
The path analysis submodule includes:
A keyword acquisition unit, configured to obtain an input keyword;
A related data determination unit, configured to determine, from the article data and according to the keyword, the article data related to the keyword;
A first path determination unit, configured to organize the article data related to the keyword, by publication time, into the first-level propagation path of the hot topic;
A new keyword selection unit, configured to select, from the article data of the first-level propagation path, phrases that meet a preset selection condition as new keywords;
A second path determination unit, configured to recommend the new keywords to the user, so that the user determines the second-level propagation path of the hot topic through the new keywords;
An operation execution unit, configured to repeat the operation of determining the next-level propagation path from the new keywords recommended on the basis of the previous-level propagation path, to obtain the multi-level propagation path of the hot topic.
The scope analysis submodule includes:
A data statistics unit, configured to group article data that share the same source and count the number of articles from each source;
A scope determination unit, configured to, if the number meets a preset quantity condition, determine the coverage (radiation range) corresponding to that source as the propagation scope of the hot topic.
The trend analysis submodule includes:
A curve drawing unit, configured to count the daily increment of the article data under each media type and plot it as a curve;
A trend determination unit, configured to determine the development trend of the hot topic according to the slope of the curve and the preset media-type weight coefficients.
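The source tracing and scope analysis units described above reduce to a sort and a grouped count. A minimal sketch follows; the dictionary keys (`id`, `source`, `published`), the function names, and the reading of the "preset quantity condition" as a minimum article count are assumptions for illustration. Mapping a qualifying source to its actual coverage area is left out, since the text does not say how that mapping is obtained.

```python
from collections import Counter
from datetime import datetime


def find_source_article(articles):
    """Sort by publication time and return the earliest article, if any."""
    ordered = sorted(articles, key=lambda a: a["published"])
    return ordered[0] if ordered else None


def propagation_scope(articles, min_count=2):
    """Group articles by source and keep sources meeting the threshold.

    The threshold 'count >= min_count' is an assumed form of the
    preset quantity condition.
    """
    counts = Counter(a["source"] for a in articles)
    return {src: n for src, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    sample = [
        {"id": 1, "source": "news-site-a", "published": datetime(2017, 11, 3)},
        {"id": 2, "source": "forum-b", "published": datetime(2017, 11, 1)},
        {"id": 3, "source": "news-site-a", "published": datetime(2017, 11, 2)},
    ]
    print(find_source_article(sample)["id"])
    print(propagation_scope(sample))
```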
Preferably, the hot topic analysis system further includes:
An article recommendation module, configured to, after the source article, propagation path, propagation scope, and development trend of the hot topic have been determined from the article data according to the preset determination rule, determine articles to be recommended from the article data according to a recommendation strategy and recommend them to the user.
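The multi-level propagation path built by the path analysis submodule can be sketched as a loop: find the articles related to a keyword, order them by publication time, suggest new keywords from that level, let the user pick one, and repeat. In the sketch below, the word-frequency rule in `suggest_keywords` stands in for the unspecified "preset selection condition", the `choose` callback stands in for the user's selection, and all names and dictionary keys are hypothetical.

```python
from collections import Counter


def words(article):
    """Lower-cased, de-duplicated whitespace tokens of an article's text."""
    return set(article["text"].lower().split())


def related(articles, keyword):
    """Articles mentioning the keyword, ordered by publication time."""
    hits = [a for a in articles if keyword.lower() in words(a)]
    return sorted(hits, key=lambda a: a["published"])


def suggest_keywords(path, used, top_n=3, min_count=2):
    """Words appearing in at least min_count articles of this level (assumed rule)."""
    counts = Counter(w for a in path for w in words(a))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, c in ranked if c >= min_count and w not in used][:top_n]


def build_paths(articles, keyword, choose, max_levels=3):
    """Return [(keyword, [article ids in time order]), ...], one entry per level.

    choose(suggestions) models the user picking the next keyword;
    it returns None to stop.
    """
    paths, used = [], set()
    while keyword and len(paths) < max_levels:
        used.add(keyword.lower())
        path = related(articles, keyword)
        if not path:
            break
        paths.append((keyword, [a["id"] for a in path]))
        keyword = choose(suggest_keywords(path, used))
    return paths
```

A caller might pass `choose=lambda s: s[0] if s else None` to always follow the first suggestion; in an interactive system the callback would instead present the suggestions to the user.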
The hot topic analysis system provided by this embodiment of the present invention collects article data related to a hot topic according to a collection strategy and determines, from the article data and according to a preset determination rule, the source article, propagation path, propagation scope, and development trend of the hot topic. On this basis, the system mines hot topics in depth, so that the evolution of a hot topic on the network can be analyzed from multiple aspects. This helps network administrators understand network hot events more fully and, for hot topics that need to be managed and controlled, makes it easier to take appropriate public-opinion guidance measures in time, improving the accuracy and detection efficiency of hot topic analysis.
The above system can perform the method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them; without departing from the inventive concept, it may include other equivalent embodiments, and its scope is determined by the scope of the appended claims.

Claims (9)

  1. A hot topic analysis method, characterized by comprising:
    collecting article data related to a hot topic according to a collection strategy;
    determining, from the article data and according to a preset determination rule, a source article, a propagation path, a propagation scope, and a development trend of the hot topic.
  2. The method according to claim 1, characterized in that determining the source article of the hot topic from the article data according to the preset determination rule comprises:
    sorting the article data by publication time;
    selecting, from the sorted article data, the article corresponding to the article data with the earliest publication time as the source article of the hot topic.
  3. The method according to claim 1, characterized in that determining the propagation path of the hot topic from the article data according to the preset determination rule comprises:
    obtaining an input keyword;
    determining, from the article data and according to the keyword, the article data related to the keyword;
    organizing the article data related to the keyword, by publication time, into a first-level propagation path of the hot topic;
    selecting, from the article data of the first-level propagation path, phrases that meet a preset selection condition as new keywords;
    recommending the new keywords to a user, so that the user determines a second-level propagation path of the hot topic through the new keywords;
    repeating the operation of determining the next-level propagation path from the new keywords recommended on the basis of the previous-level propagation path, to obtain a multi-level propagation path of the hot topic.
  4. The method according to claim 1, characterized in that determining the propagation scope of the hot topic from the article data according to the preset determination rule comprises:
    grouping article data that share the same source, and counting the number of articles from each source;
    if the number meets a preset quantity condition, determining the coverage (radiation range) corresponding to that source as the propagation scope of the hot topic.
  5. The method according to claim 1, characterized in that determining the development trend of the hot topic from the article data according to the preset determination rule comprises:
    counting the daily increment of the article data under each media type, and plotting it as a curve;
    determining the development trend of the hot topic according to the slope of the curve and preset media-type weight coefficients.
  6. The method according to claim 1, characterized in that, after determining, from the article data and according to the preset determination rule, the source article, propagation path, propagation scope, and development trend of the hot topic, the method further comprises:
    determining articles to be recommended from the article data according to a recommendation strategy, and recommending them to a user.
  7. A hot topic analysis system, characterized by comprising:
    a data acquisition module, configured to collect article data related to a hot topic according to a collection strategy;
    a hot topic analysis module, configured to determine, from the article data and according to a preset determination rule, a source article, a propagation path, a propagation scope, and a development trend of the hot topic.
  8. The system according to claim 7, characterized in that the hot topic analysis module comprises a source tracing submodule, a path analysis submodule, a scope analysis submodule, and a trend analysis submodule, wherein:
    the source tracing submodule comprises:
    a data sorting unit, configured to sort the article data by publication time;
    a source determination unit, configured to select, from the sorted article data, the article corresponding to the article data with the earliest publication time as the source article of the hot topic;
    the path analysis submodule comprises:
    a keyword acquisition unit, configured to obtain an input keyword;
    a related data determination unit, configured to determine, from the article data and according to the keyword, the article data related to the keyword;
    a first path determination unit, configured to organize the article data related to the keyword, by publication time, into a first-level propagation path of the hot topic;
    a new keyword selection unit, configured to select, from the article data of the first-level propagation path, phrases that meet a preset selection condition as new keywords;
    a second path determination unit, configured to recommend the new keywords to a user, so that the user determines a second-level propagation path of the hot topic through the new keywords;
    an operation execution unit, configured to repeat the operation of determining the next-level propagation path from the new keywords recommended on the basis of the previous-level propagation path, to obtain a multi-level propagation path of the hot topic;
    the scope analysis submodule comprises:
    a data statistics unit, configured to group article data that share the same source and count the number of articles from each source;
    a scope determination unit, configured to, if the number meets a preset quantity condition, determine the coverage (radiation range) corresponding to that source as the propagation scope of the hot topic;
    the trend analysis submodule comprises:
    a curve drawing unit, configured to count the daily increment of the article data under each media type and plot it as a curve;
    a trend determination unit, configured to determine the development trend of the hot topic according to the slope of the curve and preset media-type weight coefficients.
  9. The system according to claim 7, characterized by further comprising:
    an article recommendation module, configured to, after the source article, propagation path, propagation scope, and development trend of the hot topic have been determined from the article data according to the preset determination rule, determine articles to be recommended from the article data according to a recommendation strategy and recommend them to a user.
CN201711146862.5A 2017-11-17 2017-11-17 Hot topic analysis method and system Active CN107943905B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711146862.5A CN107943905B (en) 2017-11-17 2017-11-17 Hot topic analysis method and system

Publications (2)

Publication Number Publication Date
CN107943905A true CN107943905A (en) 2018-04-20
CN107943905B CN107943905B (en) 2020-09-08

Family

ID=61931739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711146862.5A Active CN107943905B (en) 2017-11-17 2017-11-17 Hot topic analysis method and system

Country Status (1)

Country Link
CN (1) CN107943905B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080052147A1 (en) * 2006-07-18 2008-02-28 Eran Reshef System and method for influencing public opinion
US20130124556A1 (en) * 2005-10-21 2013-05-16 Abdur R. Chowdhury Real Time Query Trends with Multi-Document Summarization
CN103744877A (en) * 2013-12-20 2014-04-23 潘大庆 Public opinion monitoring application system deployed in internet and application method
CN103955505A (en) * 2014-04-24 2014-07-30 中国科学院信息工程研究所 Micro-blog-based real-time event monitoring method and system
CN105389389A (en) * 2015-12-10 2016-03-09 安徽博约信息科技有限责任公司 Network public opinion transmission situation media linked analysis method
CN105989176A (en) * 2015-03-05 2016-10-05 北大方正集团有限公司 Data processing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
殷风景: "面向网络舆情监控的热点话题发现技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522464A (en) * 2018-10-22 2019-03-26 西南石油大学 Information source detection method and system
CN111104627B (en) * 2018-10-29 2023-04-07 北京国双科技有限公司 Hot event prediction method and device
CN111104627A (en) * 2018-10-29 2020-05-05 北京国双科技有限公司 Hot event prediction method and device
CN111368070A (en) * 2018-12-06 2020-07-03 北京国双科技有限公司 Method and device for determining hot event
CN109948024A (en) * 2019-03-12 2019-06-28 安徽新华学院 A kind of public sentiment monitoring method and system based on microblogging
CN112000866A (en) * 2020-08-05 2020-11-27 杭州安恒信息技术股份有限公司 Internet data analysis method, device, electronic device and medium
CN112000866B (en) * 2020-08-05 2024-03-26 杭州安恒信息技术股份有限公司 Internet data analysis method, device, electronic device and medium
CN112579920B (en) * 2020-12-09 2023-06-20 成都中科大旗软件股份有限公司 Method for realizing cross-space-time propagation analysis based on event
CN112579920A (en) * 2020-12-09 2021-03-30 成都中科大旗软件股份有限公司 Cross-space-time propagation analysis method based on events
CN112632364A (en) * 2021-03-09 2021-04-09 中译语通科技股份有限公司 News propagation speed evaluation method and system
CN114036221A (en) * 2021-09-24 2022-02-11 国务院国有资产监督管理委员会研究中心 Thematic event analysis method
CN117093762A (en) * 2023-07-18 2023-11-21 南京特尔顿信息科技有限公司 Public opinion data evaluation analysis system and method
CN117093762B (en) * 2023-07-18 2024-02-13 南京特尔顿信息科技有限公司 Public opinion data evaluation analysis system and method

Also Published As

Publication number Publication date
CN107943905B (en) 2020-09-08

Similar Documents

Publication Publication Date Title
CN107943905A (en) A kind of much-talked-about topic analysis method and system
CN104281607A (en) Microblog hot topic analyzing method
US9342802B2 (en) System and method of tracking rate of change of social network activity associated with a digital object
US7889679B2 (en) Arrangements for networks
US20140297403A1 (en) Social Analytics System and Method for Analyzing Conversations in Social Media
Rehman et al. Building a data warehouse for twitter stream exploration
CN104657425A (en) Topic management type network public opinion evaluation management system and method
KR101566616B1 (en) Advertisement decision supporting system using big data-processing and method thereof
Cheng et al. Jobminer: A real-time system for mining job-related patterns from social media
CN103617169A (en) Microblog hot topic extracting method based on Hadoop
Cano et al. Social influence analysis in microblogging platforms–a topic-sensitive based approach
JP2015532495A (en) System and method for presenting and navigating network data sets
Chan et al. Discovering correlated spatio-temporal changes in evolving graphs
CN101477552A (en) Website user rank division method
CN104408083A (en) Socialized media analyzing system
WO2018237098A1 (en) Methods and systems for identifying markers of coordinated activity in social media movements
CN108023768A (en) Network event chain establishment method and network event chain establish system
Chen et al. D-map+ interactive visual analysis and exploration of ego-centric and event-centric information diffusion patterns in social media
CN108255933A (en) A kind of social media dynamic event develops visual analysis method and system
Sun et al. Visualization for knowledge graph based on education data
Obradović et al. A social network analysis and mining methodology for the monitoring of specific domains in the blogosphere
Zhao et al. A consensus model for large-scale multi-attribute group decision making with collaboration-reference network under uncertain linguistic environment
Bakaev et al. Data extraction for decision-support systems: Application in labour market monitoring and analysis
KR20150076275A (en) System and method for contents recommendation using semantic clusters
CN109299368B (en) Method and system for intelligent and personalized recommendation of environmental information resources AI

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A hot topic analysis method and system

Effective date of registration: 20220105

Granted publication date: 20200908

Pledgee: China Construction Bank Corp Beijing Zhongguancun branch

Pledgor: RUN TECHNOLOGIES Co.,Ltd. BEIJING

Registration number: Y2022990000005

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20220712

Granted publication date: 20200908

Pledgee: China Construction Bank Corp Beijing Zhongguancun branch

Pledgor: RUN TECHNOLOGIES Co.,Ltd. BEIJING

Registration number: Y2022990000005

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A hot topic analysis method and system

Effective date of registration: 20220907

Granted publication date: 20200908

Pledgee: China Construction Bank Corp Beijing Zhongguancun branch

Pledgor: RUN TECHNOLOGIES Co.,Ltd. BEIJING

Registration number: Y2022110000206

PC01 Cancellation of the registration of the contract for pledge of patent right

Granted publication date: 20200908

Pledgee: China Construction Bank Corp Beijing Zhongguancun branch

Pledgor: RUN TECHNOLOGIES Co.,Ltd. BEIJING

Registration number: Y2022110000206