CN103500163B - Method and apparatus for identifying key developments of an event - Google Patents

Method and apparatus for identifying key developments of an event

Info

Publication number
CN103500163B
Authority
CN
China
Prior art keywords
event
news
key development
point
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310314465.XA
Other languages
Chinese (zh)
Other versions
CN103500163A (en)
Inventor
沈剑平
彭学政
李凯
罗嵘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201310314465.XA
Publication of CN103500163A
Application granted
Publication of CN103500163B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method and apparatus for identifying key developments of an event are provided. The method includes: obtaining an event query cluster based on an event core term; performing key-development identification on the event query cluster based on media attention to obtain a first set of event key-development points; performing key-development identification on the event core term based on hot news search terms to obtain a second set of event key-development points; merging and deduplicating the first set and the second set to obtain a third set of event key-development points; and optimizing the third set to obtain a final event key-development timeline. The method and apparatus can provide a clearer event timeline, satisfy users' need to follow an event, improve user experience, and, because no manual editing is required, greatly reduce the cost of producing special-topic pages.

Description

Method and apparatus for identifying key developments of an event
Technical field
The present invention relates to news topic tracking technology and, more particularly, to a method and apparatus for identifying key developments of an event by determining the time points at which important sub-events occur, without requiring manual annotation.
Background art
With the rapid development of network technology, browsing news online, whether on a mobile terminal (for example, a mobile phone) or on a fixed terminal (for example, a desktop computer), has become one of the most common leisure activities. According to a survey by Tencent Technology, 61.67% of surveyed users who go online by mobile phone do so mainly to browse news. On news portal websites, an event is usually called a special topic, and a hot event (or topic) generally consists of several sub-events. Each event goes through a process of occurrence, development, climax, and conclusion. Linking together the important sub-events of the whole process forms an event timeline that expresses how the event progressed, so an event timeline is important for fully understanding the development of a news event.
The prior art relies mainly on editorial annotation and identifies the latest developments of an event manually. For example, the various portal websites all use manual annotation by editors, as does Google's experimental project Living Stories: each news document is labeled (for example, as a background document or a progress document), and a machine then collects and displays documents from those the editors have labeled.
In addition, Tencent's search news tracking system lets users track and discover the latest developments of a topic. However, it mainly tracks the latest developments of an event rather than the key developments of the event (including its history), and the event progress chart it generates is not a clear event timeline.
The prior-art approach of manual annotation by editors covers only a narrow range of events (topics), incurs high labor cost, and cannot meet the demand for mining the timelines of massive numbers of news events.
As for mining event timelines automatically by machine, one current approach is event tracking: hot events occurring in the current stage are associated with hot events that occurred in the previous stage, and if a historical event can be associated with a current event, the current event is treated as a development of that historical event. However, because of the influence of news peripheral to an event, event association often suffers from topic drift. Another approach is document clustering: all clusters obtained in the previous stage are associated with the topic clusters obtained in the current stage. Because document clustering is unsupervised learning, the clustering cost is high. Moreover, event tracking has difficulty handling the cold-start problem and must match all current topics against all historical topics, so the subsequent development cost is high and the development cycle is long.
Accordingly, there is a need for a method and apparatus that efficiently identify key developments of an event without manual annotation and without relying on association or clustering.
Summary of the invention
An object of the present invention is to solve at least the problems described above and to provide at least the advantages described below. According to one aspect of the present invention, a method and apparatus for identifying key developments of an event are provided, which perform key-development identification based on media attention and hot news search terms to obtain a final event key-development timeline.
According to an aspect of the present invention, a method for identifying key developments of an event is provided. The method includes: obtaining an event query cluster based on an event core term; performing key-development identification on the event query cluster based on media attention to obtain a first set of event key-development points; performing key-development identification on the event core term based on hot news search terms to obtain a second set of event key-development points; merging and deduplicating the first set and the second set to obtain a third set of event key-development points; and optimizing the third set to obtain a final event key-development timeline.
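The merge-and-deduplicate step can be illustrated with a minimal Python sketch. It assumes each key-development point is represented simply by its calendar date and that two points are duplicates when their dates are equal; the patent does not specify the representation, and the function name `merge_key_points` is hypothetical.

```python
from datetime import date


def merge_key_points(first_set, second_set):
    """Merge two collections of key-development dates and drop duplicates.

    Assumption: a key-development point is identified by its date alone.
    """
    # The set union removes dates reported by both recognizers;
    # sorting restores chronological order for the timeline.
    return sorted(set(first_set) | set(second_set))


# Points found from media attention and from hot news search terms (made-up dates).
media_points = [date(2013, 5, 1), date(2013, 5, 3)]
hot_term_points = [date(2013, 5, 3), date(2013, 5, 7)]
merged = merge_key_points(media_points, hot_term_points)
```

In this sketch, May 3 appears in both inputs and is kept only once in `merged`.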
The step of performing key-development identification on the event query cluster based on media attention may include: (1) retrieving a news inverted index with the event query cluster and calculating, for a predetermined time period, the number of news items hit by the event query cluster on each day of the time axis, to obtain a news report trend chart of the event query cluster; (2) adjusting the holiday news counts in the news report trend chart based on the holiday effect, to obtain a media attention trend chart; (3) performing time-series analysis on the media attention trend chart to identify news burst points, to obtain a candidate set of event key-development points at the granularity of major events, where a major event is a set of news burst points on consecutive days spanning at least a first predetermined number of days; and (4) performing secondary segmentation based on a uniformity assumption on each major event in the candidate set whose duration is greater than or equal to a second predetermined number of days and which has an evident development timeline, to obtain the first set of event key-development points, where a major event with an evident development timeline is one in which news burst points can still be identified when time-series analysis is performed on its duration alone.
Alternatively, the step of performing key-development identification on the event query cluster based on media attention may include: (1) retrieving a news inverted index with the event query cluster and calculating, for a predetermined time period, the number of news items hit by each query in the event query cluster on each day of the time axis, to obtain a news report trend chart for each query; (2) adjusting the holiday news counts in the news report trend chart of each query based on the holiday effect, to obtain a media attention trend chart for each query; (3) performing time-series analysis on the media attention trend chart of each query to identify news burst points, to obtain candidate event key-development points for each query at the granularity of major events, where a major event is a set of news burst points on consecutive days spanning at least a first predetermined number of days; (4) merging the candidate event key-development points of all queries in the event query cluster to obtain a candidate set of event key-development points; and (5) performing secondary segmentation based on a uniformity assumption on each major event in the candidate set whose duration is greater than or equal to a second predetermined number of days and which has an evident development timeline, to obtain the first set of event key-development points, where a major event with an evident development timeline is one in which news burst points can still be identified when time-series analysis is performed on its duration alone.
The step of obtaining the event query cluster may include: searching a user search log for event queries corresponding to the event core term, to obtain the event query cluster.
The step of adjusting holiday news counts may include: obtaining, from statistics on the whole-network news index, the total number of news items on the whole network for the day before the holiday, the holiday itself, and the day after the holiday; calculating the ratio of the difference between the holiday total and the previous-day total to the difference between the holiday total and the next-day total; and adjusting the news count of the event query cluster on the holiday according to the calculated ratio.
The first predetermined number of days may be 3 days, and the second predetermined number of days may be 5 days.
The step of time-series analysis may include: using a sliding time window, with a first predetermined time period as one calculation time window and a second predetermined time period as the step by which the window slides forward, identifying the news burst points in each calculation time window; setting a time point as a candidate key-development time point whenever it is identified as a news burst point in any one calculation time window; and merging all candidate key-development time points to obtain the candidate set of event key-development points at the granularity of major events.
The first predetermined time period may be 30 days, and the second predetermined time period may be 2 days.
The step of identifying news burst points may include: calculating the mean and variance of the daily news counts of the event query cluster over all days in the calculation time window; calculating a threshold by the following formula: threshold = mean + 0.8 × variance; and identifying a time point in the calculation time window as a news burst point if its news count is greater than the calculated threshold.
The step of secondary segmentation based on the uniformity assumption may include: within a major event whose duration is greater than or equal to the second predetermined number of days and which has an evident development timeline, comparing the news count of the event query cluster on each day with the average news count of the event query cluster over the preceding several days; retaining the day as an event key-development point if its news count is greater than the average; and removing the day's event key-development point if its news count is less than or equal to the average.
The step of performing key-development identification on the event core term based on hot news search terms may include: searching a hot news search term dictionary with the event core term; and identifying the time points corresponding to the hot news search terms found as event key-development points.
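The hot-search lookup can be sketched as follows. This is only an illustration under stated assumptions: the dictionary is modeled as a mapping from a date string to the hot search terms recorded for that date, and a term is considered a match when it contains the core term. Neither the dictionary layout nor the matching rule is given in the patent, and `key_points_from_hot_terms` is a hypothetical name.

```python
def key_points_from_hot_terms(core_term, hot_term_dictionary):
    """Return the dates whose hot search terms mention the core term.

    Assumptions: hot_term_dictionary maps an ISO date string to a list of
    hot search terms; matching is plain substring containment.
    """
    return sorted(
        day
        for day, terms in hot_term_dictionary.items()
        if any(core_term in term for term in terms)
    )


# Made-up dictionary contents for illustration.
hot_terms = {
    "2013-05-01": ["XXX event breaks out", "stock market"],
    "2013-05-02": ["weather"],
    "2013-05-04": ["XXX event investigation result"],
}
points = key_points_from_hot_terms("XXX event", hot_terms)
```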
The step of optimizing the third set of event key-development points may include: for one major event, comparing with a predetermined threshold the ratio of the news count of the event key-development point having the most news to the news count of each event key-development point after it; and, if every such ratio is greater than the predetermined threshold, removing all event key-development points after that event key-development point.
The step of optimizing the third set of event key-development points may further include: retaining the first event key-development point of each development in the key-development process of the event.
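One possible reading of the ratio-based pruning is sketched below. It assumes the points of one major event are given in chronological order as (day, news count) pairs; the threshold value 5.0 in the usage lines is purely hypothetical because the patent gives no number, and treating a zero-count day as an infinite ratio is also an assumption.

```python
def prune_after_peak(points, threshold):
    """Drop every point after the peak when the peak dominates all of them.

    points: list of (day, news_count) pairs for one major event, in
    chronological order (an assumed representation).
    """
    if not points:
        return []
    # Index of the point with the most news (first one on ties).
    peak_index = max(range(len(points)), key=lambda i: points[i][1])
    peak_count = points[peak_index][1]
    later = points[peak_index + 1:]
    # Ratio of the peak count to each later count; a zero count is
    # treated as an infinitely large ratio (assumption).
    ratios = [peak_count / c if c > 0 else float("inf") for _, c in later]
    if later and all(r > threshold for r in ratios):
        return points[:peak_index + 1]
    return list(points)


pruned = prune_after_peak([(1, 20), (2, 100), (3, 10), (4, 5)], threshold=5.0)
kept = prune_after_peak([(1, 20), (2, 100), (3, 50)], threshold=5.0)
```

In the first call both later ratios (10 and 20) exceed the threshold, so days 3 and 4 are removed; in the second call the ratio is only 2, so nothing is removed.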
According to another aspect of the present invention, an apparatus for identifying key developments of an event is provided. The apparatus includes: an event query cluster obtaining unit that obtains an event query cluster based on an event core term; a first recognition unit that performs key-development identification on the event query cluster based on media attention to obtain a first set of event key-development points; a second recognition unit that performs key-development identification on the event core term based on hot news search terms to obtain a second set of event key-development points; a merging unit that merges and deduplicates the first set and the second set to obtain a third set of event key-development points; and an optimization unit that optimizes the third set to obtain a final event key-development timeline.
The first recognition unit may include: a news count calculation unit that retrieves a news inverted index with the event query cluster and calculates, for a predetermined time period, the number of news items hit by the event query cluster on each day of the time axis, to obtain a news report trend chart of the event query cluster; a holiday adjustment unit that adjusts the holiday news counts in the news report trend chart based on the holiday effect, to obtain a media attention trend chart; a time-series analysis unit that performs time-series analysis on the media attention trend chart to identify news burst points, to obtain a candidate set of event key-development points at the granularity of major events, where a major event is a set of news burst points on consecutive days spanning at least a first predetermined number of days; and a segmentation unit that performs secondary segmentation based on a uniformity assumption on each major event in the candidate set whose duration is greater than or equal to a second predetermined number of days and which has an evident development timeline, to obtain the first set of event key-development points, where a major event with an evident development timeline is one in which news burst points can still be identified when time-series analysis is performed on its duration alone.
Alternatively, the first recognition unit may include: a news count calculation unit that retrieves a news inverted index with the event query cluster and calculates, for a predetermined time period, the number of news items hit by each query in the event query cluster on each day of the time axis, to obtain a news report trend chart for each query; a holiday adjustment unit that adjusts the holiday news counts in the news report trend chart of each query based on the holiday effect, to obtain a media attention trend chart for each query; a time-series analysis unit that performs time-series analysis on the media attention trend chart of each query to identify news burst points, to obtain a candidate set of event key-development points for each query at the granularity of major events, where a major event is a set of news burst points on consecutive days spanning at least a first predetermined number of days; a fusion unit that merges the candidate sets of all queries in the event query cluster to obtain a final candidate set of event key-development points; and a segmentation unit that performs secondary segmentation based on a uniformity assumption on each major event in the final candidate set whose duration is greater than or equal to a second predetermined number of days and which has an evident development timeline, to obtain the first set of event key-development points, where a major event with an evident development timeline is one in which news burst points can still be identified when time-series analysis is performed on its duration alone.
The event query cluster obtaining unit may obtain the event query cluster by searching a user search log for event queries corresponding to the event core term.
The holiday adjustment unit may obtain, from statistics on the whole-network news index, the total number of news items on the whole network for the day before the holiday, the holiday itself, and the day after the holiday; calculate the ratio of the difference between the holiday total and the previous-day total to the difference between the holiday total and the next-day total; and adjust the news count of the event query cluster on the holiday according to the calculated ratio.
The first predetermined number of days may be 3 days, and the second predetermined number of days may be 5 days.
The time-series analysis unit may use a sliding time window, with a first predetermined time period as one calculation time window and a second predetermined time period as the step by which the window slides forward, to identify the news burst points in each calculation time window; set a time point as a candidate key-development time point whenever it is identified as a news burst point in any one calculation time window; and merge all candidate key-development time points to obtain the candidate set of event key-development points at the granularity of major events.
The first predetermined time period may be 30 days, and the second predetermined time period may be 2 days.
The time-series analysis unit may identify news burst points as follows: calculate the mean and variance of the daily news counts of the event query cluster over all days in the calculation time window; calculate a threshold by the following formula: threshold = mean + 0.8 × variance; and identify a time point in the calculation time window as a news burst point if its news count is greater than the calculated threshold.
Within a major event whose duration is greater than or equal to the second predetermined number of days and which has an evident development timeline, the segmentation unit may compare the news count of the event query cluster on each day with the average news count of the event query cluster over the preceding several days; retain the day as an event key-development point if its news count is greater than the average; and remove the day's event key-development point if its news count is less than or equal to the average.
The second recognition unit may search a hot news search term dictionary with the event core term and identify the time points corresponding to the hot news search terms found as event key-development points.
For one major event, the optimization unit may compare with a predetermined threshold the ratio of the news count of the event key-development point having the most news to the news count of each event key-development point after it. If every such ratio is greater than the predetermined threshold, the optimization unit may remove all event key-development points after that event key-development point.
The optimization unit may retain the first event key-development point of each development in the key-development process of the event.
The present invention can provide a clearer event timeline, satisfy users' need to follow an event, and improve user experience. In addition, the present invention requires no manual editing, which greatly reduces the cost of producing special-topic pages. The present invention can also track the progress of topics and events quickly, with high timeliness. Furthermore, the method and apparatus provided by the present invention are a general technical solution independent of any specific project and therefore have strong generality and portability.
Brief description of the drawings
The above and/or other aspects and advantages of the present invention will become apparent and more readily appreciated from the following description of embodiments, taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a block diagram of an apparatus for identifying key developments of an event according to an exemplary embodiment of the present invention;
Fig. 2 is a block diagram of the first recognition unit 120 according to an exemplary embodiment of the present invention;
Fig. 3 is a block diagram of the first recognition unit 120 according to another exemplary embodiment of the present invention;
Fig. 4 is a flowchart of a method for identifying key developments of an event according to an exemplary embodiment of the present invention;
Fig. 5 is a diagram illustrating an example of an event key-development timeline according to an exemplary embodiment of the present invention.
Detailed description
The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of embodiments of the present invention as defined by the claims and their equivalents. It includes various specific details to assist in that understanding, but these are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness.
Fig. 1 is a block diagram of an apparatus for identifying key developments of an event according to an exemplary embodiment of the present invention.
Referring to Fig. 1, an apparatus 100 for identifying key developments of an event according to an exemplary embodiment of the present invention includes an event query (search term) cluster obtaining unit 110, a first recognition unit 120, a second recognition unit 130, a merging unit 140, and an optimization unit 150.
The event query cluster obtaining unit 110 may obtain the event query cluster based on an event core term.
Specifically, the event query cluster obtaining unit 110 may obtain the event query cluster by searching a user search log (not shown) for event queries corresponding to the event core term.
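How "corresponding" queries are selected from the search log is not spelled out in the text. The sketch below is one simple possibility, assuming that a logged query belongs to the cluster when it contains the core term (case-insensitively) and that repeated queries are collapsed; `build_query_cluster` is a hypothetical name.

```python
def build_query_cluster(core_term, search_log):
    """Collect the distinct logged queries that contain the core term.

    Assumption: correspondence is substring containment after trimming
    whitespace and lowercasing; the patent does not fix the matching rule.
    """
    cluster = []
    seen = set()
    needle = core_term.lower()
    for query in search_log:
        normalized = query.strip().lower()
        if needle in normalized and normalized not in seen:
            seen.add(normalized)
            cluster.append(normalized)
    return cluster


# Made-up log entries for illustration.
log = ["XXX event latest", "weather today", "xxx event latest ", "XXX event timeline"]
cluster = build_query_cluster("XXX event", log)
```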
The first recognition unit 120 may perform key-development identification on the event query cluster based on media attention to obtain the first set of event key-development points. The operation of the first recognition unit 120 is described in detail below with reference to Fig. 2 and Fig. 3.
Fig. 2 is a block diagram of the first recognition unit 120 according to an exemplary embodiment of the present invention.
Referring to Fig. 2, the first recognition unit 120 may include a news count calculation unit 121a, a holiday adjustment unit 122a, a time-series analysis unit 123a, and a segmentation unit 124a.
The news count calculation unit 121a may retrieve a news inverted index (not shown) with the event query cluster and calculate, for a predetermined time period, the number of news items hit by the event query cluster on each day of the time axis, to obtain the news report trend chart of the event query cluster.
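The daily hit counts can be sketched as below. The inverted index is modeled, purely as an assumption, as a dictionary from a query to a list of (document id, publication date) pairs, and a document hit by several queries of the cluster is counted once per day; the real index structure is not described in the patent.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_hit_counts(query_cluster, inverted_index, start, end):
    """Count distinct news documents hit by the cluster on each day in [start, end].

    Assumption: inverted_index maps query -> list of (doc_id, published_date).
    """
    docs_by_day = defaultdict(set)
    for query in query_cluster:
        for doc_id, published in inverted_index.get(query, []):
            if start <= published <= end:
                docs_by_day[published].add(doc_id)
    # Emit one entry per day so that days without hits appear as zero.
    trend = {}
    day = start
    while day <= end:
        trend[day] = len(docs_by_day[day])
        day += timedelta(days=1)
    return trend


# Made-up index contents for illustration.
index = {
    "q1": [("a", date(2013, 5, 1)), ("b", date(2013, 5, 2))],
    "q2": [("a", date(2013, 5, 1)), ("c", date(2013, 5, 1))],
}
trend = daily_hit_counts(["q1", "q2"], index, date(2013, 5, 1), date(2013, 5, 3))
```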
The holiday adjustment unit 122a may adjust the holiday news counts in the news report trend chart based on the holiday effect, to obtain the media attention trend chart.
Specifically, news reporting exhibits a holiday effect: on holidays and festivals, the total number of news items published on the network is much lower than usual. The holiday news counts in the news report trend chart therefore need to be adjusted according to the holiday effect to obtain the final media attention trend chart.
The holiday adjustment unit 122a may obtain, from statistics on the whole-network news index, the total number of news items on the whole network for the day before the holiday, the holiday itself, and the day after the holiday. The holiday adjustment unit 122a may then calculate the ratio of the difference between the holiday total and the previous-day total to the difference between the holiday total and the next-day total. Finally, the news count of the event query cluster on the holiday is adjusted according to the calculated ratio.
For example, suppose May 1 is a holiday and the number of news reports about "event XXX" on May 1 needs to be adjusted. First, the holiday adjustment unit 122a finds that the whole network has 1,000,000 news reports on that day, 800,000 on April 30, and 500,000 on May 2, and calculates the difference ratio as (100-80)/(100-50)=0.4, with the totals expressed in units of 10,000. For "event XXX", there are 70 news reports on May 1, 80 on April 30, and 50 on May 2. The holiday adjustment unit 122a therefore adjusts the May 1 count according to the difference ratio 0.4 by solving the equation ((70+x)-80)/((70+x)-50)=0.4, where x is the adjustment to the news report count. Here x=30; that is, the number of news reports about event XXX on May 1 should be adjusted to 100.
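The worked example above can be reproduced with a short Python sketch. It solves (adjusted - previous)/(adjusted - next) = ratio for the adjusted count, which rearranges to adjusted = (previous - ratio × next)/(1 - ratio). Exact fractions are used to avoid rounding noise; the function names are hypothetical, and the sketch assumes the ratio is not equal to 1.

```python
from fractions import Fraction


def holiday_ratio(total_prev, total_holiday, total_next):
    """Difference ratio of whole-network totals around a holiday."""
    return Fraction(total_holiday - total_prev, total_holiday - total_next)


def adjusted_holiday_count(event_prev, event_next, ratio):
    """Solve (adjusted - event_prev) / (adjusted - event_next) = ratio.

    Assumes ratio != 1; otherwise the equation has no unique solution.
    """
    return (event_prev - ratio * event_next) / (1 - ratio)


# Whole-network totals: April 30, May 1 (holiday), May 2.
ratio = holiday_ratio(800_000, 1_000_000, 500_000)
# Event XXX counts: 80 on April 30, 70 on May 1, 50 on May 2.
adjusted = adjusted_holiday_count(80, 50, ratio)
x = adjusted - 70  # the adjustment applied to the raw holiday count
```

With the numbers from the text, `ratio` is 2/5, `adjusted` is 100, and `x` is 30.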
According to an exemplary embodiment of the present invention, the holiday adjustment unit 122a may also record the above difference ratios for all holidays in a year to generate a holiday model dictionary. When key-development identification is performed for an event, this holiday model dictionary can be used to adjust the news report counts on holidays that fall within the event.
The time-series analysis unit 123a may perform time-series analysis on the media attention trend chart to identify news burst points, to obtain a candidate set of event key-development points at the granularity of major events, where a major event is a set of news burst points on consecutive days spanning at least the first predetermined number of days. A news burst point is a time point at which the news count exceeds a predetermined criterion (for example, a predetermined value). Here, the first predetermined number of days may be 3 days; that is, news burst points on three or more consecutive days can form one major event.
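Grouping burst points into major events amounts to finding runs of consecutive days. A minimal sketch follows, assuming days are represented as integer offsets on the time axis (an illustrative choice) and using the 3-day minimum mentioned in the text as the default; `group_major_events` is a hypothetical name.

```python
def group_major_events(burst_days, min_run=3):
    """Group burst days into runs of consecutive days of length >= min_run.

    burst_days: iterable of integer day offsets flagged as news burst points.
    """
    events, run = [], []
    for day in sorted(set(burst_days)):
        if run and day != run[-1] + 1:
            # The current run is broken; keep it only if it is long enough.
            if len(run) >= min_run:
                events.append(run)
            run = []
        run.append(day)
    if len(run) >= min_run:
        events.append(run)
    return events


major_events = group_major_events([1, 2, 3, 5, 6, 10, 11, 12, 13])
```

Here days 5 and 6 form a run of only two days and therefore do not constitute a major event.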
Specifically, according to an exemplary embodiment of the present invention, the time-series analysis unit 123a may use a sliding time window, with a first predetermined time period as one calculation time window and a second predetermined time period as the step by which the window slides forward, to identify the news burst points in each calculation time window. Whenever a time point is identified as a news burst point in any one calculation time window, that time point is set as a candidate key-development time point. All candidate key-development time points are then merged to obtain the candidate set of event key-development points at the granularity of major events. Here, the first predetermined time period may be 30 days, and the second predetermined time period may be 2 days.
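The sliding-window procedure can be sketched as follows, with the 30-day window and 2-day step from the text as defaults. The burst detector is passed in as a function so that any detection rule can be plugged in; `toy_detector` (flagging counts above twice the window average) is a stand-in invented for this illustration, and adding one final window aligned to the end of the series is an assumption made so that trailing days are not skipped.

```python
def sliding_window_candidates(counts, detect, window=30, step=2):
    """Union of burst days found in every sliding calculation window.

    counts: daily news counts, index = day offset.
    detect: function mapping a window's counts to local burst indices.
    """
    if not counts:
        return []
    candidates = set()
    last_start = max(len(counts) - window, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # assumption: cover the tail of the series
    for start in starts:
        window_counts = counts[start:start + window]
        for local_index in detect(window_counts):
            # A day flagged in any one window becomes a candidate.
            candidates.add(start + local_index)
    return sorted(candidates)


def toy_detector(window_counts):
    """Hypothetical detector: counts above twice the window average."""
    average = sum(window_counts) / len(window_counts)
    return [i for i, c in enumerate(window_counts) if c > 2 * average]


counts = [1] * 40
counts[10] = 50
counts[35] = 50
candidates = sliding_window_candidates(counts, toy_detector)
```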
According to the exemplary embodiment of the present invention, Time-Series analysis unit 123a can pass through method below identification News burst point: calculate the news quantity of the event searching word bunch in all skies in this calculating time window Average and variance;Threshold value is calculated: threshold value=average+0.8 × variance by following disclosure;As Really some time point in this calculating time window is more than the threshold value calculated, then by this time point identification For news burst point.
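The sliding-window procedure and the threshold formula can be sketched as below. Several details are assumptions made for illustration: the function name, the use of the population variance (the text does not say which variance is meant), and the extra final window that covers the tail of the series when its length is not aligned with the step.

```python
from statistics import mean, pvariance


def burst_points(daily_counts, window=30, step=2, factor=0.8):
    """Return indices of days flagged as burst points in at least one window."""
    if not daily_counts:
        return []
    last_start = max(len(daily_counts) - window, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # assumption: make sure the tail is covered
    candidates = set()
    for start in starts:
        chunk = daily_counts[start:start + window]
        # threshold = mean + 0.8 x variance, as stated in the text
        threshold = mean(chunk) + factor * pvariance(chunk)
        candidates.update(start + i for i, c in enumerate(chunk) if c > threshold)
    return sorted(candidates)


# Hypothetical series: 29 quiet days followed by a small spike on the last day.
flagged = burst_points([1] * 29 + [5])
```

A day flagged in any single window is kept, which matches the "any one calculation time window" rule; merging happens implicitly through the set.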
As those skilled in the art will readily appreciate, the present invention is not limited to the above method of identifying news burst points; other methods may also be used to identify them.
Segmentation unit 124a may perform secondary segmentation based on a uniformity assumption on each major event in the candidate key event development point set whose duration is greater than or equal to a second predetermined number of days and which has a distinct development pattern, thereby obtaining a first key event development point set. A major event with a distinct development pattern is one in which news burst points can still be identified when time-series analysis is performed on its duration alone. Here, the second predetermined number of days may be 5 days; that is, segmentation unit 124a may perform the secondary segmentation on a major event that lasts 5 days or more and in which news burst points can still be identified when time-series analysis is performed separately on the news within that duration.
Specifically, according to an exemplary embodiment of the present invention, for a major event whose duration is greater than or equal to the second predetermined number of days and which has a distinct development pattern, segmentation unit 124a compares the news amount of the event search word cluster on each day with the average news amount of the event search word cluster over the preceding few days. If the news amount on that day is greater than the average, the day is retained as a key event development point; if it is less than or equal to the average, the key event development point for that day is removed.
For example, consider a major event with a distinct development pattern that lasts from May 1 to May 5. If the news report amount for this event on May 1 is 100, and the amounts on the two preceding days (April 29 and April 30) are 90 and 80 respectively, then the average over those two days is 85. Segmentation unit 124a therefore compares the May 1 amount of 100 with the average of 85. Because 100 is greater than 85, segmentation unit 124a considers that the event has new developments and is still building, and retains May 1 as a key event development point. If the news report amount on May 2 is 60, and the amounts on the two preceding days (April 30 and May 1) are 80 and 100 respectively, then the average over those two days is 90, and segmentation unit 124a compares the May 2 amount of 60 with the average of 90. Because 60 is less than 90, segmentation unit 124a considers that the event has no new development and may have started to cool down, and removes the key event development point for May 2.
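The comparison in this example can be sketched as follows. The function name, the dictionary of daily counts, and the year 2013 are hypothetical; the two-day lookback is taken from the example, while the text itself only says "the preceding few days".

```python
from datetime import date, timedelta


def secondary_segmentation(counts, event_days, lookback=2):
    """Keep a day only if its count exceeds the mean of the preceding lookback days."""
    kept = []
    for day in event_days:
        previous = [counts[day - timedelta(days=k)] for k in range(1, lookback + 1)]
        if counts[day] > sum(previous) / len(previous):
            kept.append(day)  # event is still developing
        # otherwise the day is dropped: no new development, possibly cooling down
    return kept


counts = {
    date(2013, 4, 29): 90,
    date(2013, 4, 30): 80,
    date(2013, 5, 1): 100,
    date(2013, 5, 2): 60,
}
kept = secondary_segmentation(counts, [date(2013, 5, 1), date(2013, 5, 2)])
```

With these figures May 1 is kept (100 > 85) and May 2 is removed (60 < 90).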
As those skilled in the art will readily appreciate, the present invention is not limited to the above method of secondary segmentation based on a uniformity assumption; other methods may also be used to perform the segmentation.
Fig. 3 is a block diagram of the first recognition unit 120 according to another exemplary embodiment of the present invention.
Referring to Fig. 3, the first recognition unit 120 may include news amount calculation unit 121b, holiday adjustment unit 122b, time-series analysis unit 123b, merging unit 124b, and segmentation unit 125b.
News amount calculation unit 121b may retrieve a news inverted index using the event search word cluster and calculate, within a predetermined time period, the number of news items hit each day on the time axis by each search word in the event search word cluster, thereby obtaining a news report trend graph for each search word.
According to an exemplary embodiment of the present invention, news amount calculation unit 121b may include n subunits 121b1, 121b2, ..., 121bn, each of which calculates the number of news items hit each day on the time axis by one search word to obtain the news report trend graph of that search word. For example, if the event search word cluster contains 4 search words, 4 subunits of news amount calculation unit 121b are used to obtain the news report trend graph of each of the 4 search words.
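A per-word trend calculation of this kind might look like the sketch below. The layout of the inverted index (search word mapped to a list of `(doc_id, day)` pairs), the ISO date strings, and the function name are all assumptions made for illustration; the patent does not specify the index format.

```python
from collections import defaultdict


def per_word_trends(inverted_index, search_words):
    """For each search word, count the distinct news items it hits on each day."""
    trends = {}
    for word in search_words:
        docs_by_day = defaultdict(set)
        for doc_id, day in inverted_index.get(word, []):
            docs_by_day[day].add(doc_id)  # a set avoids double-counting a news item
        trends[word] = {day: len(docs) for day, docs in sorted(docs_by_day.items())}
    return trends


# Hypothetical index: word -> [(doc_id, day), ...]
index = {
    "xxx event": [(1, "2013-05-01"), (2, "2013-05-01"), (3, "2013-05-02")],
    "xxx accident": [(2, "2013-05-01")],
}
trends = per_word_trends(index, ["xxx event", "xxx accident", "missing"])
```

Each entry of `trends` plays the role of one subunit's output: a day-by-day count for a single search word.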
Holiday adjustment unit 122b may perform holiday news quantity adjustment on the news report trend graph of each search word based on the holiday effect, thereby obtaining a media attention trend graph for each search word.
According to an exemplary embodiment of the present invention, holiday adjustment unit 122b may include n subunits 122b1, 122b2, ..., 122bn, each of which performs holiday news quantity adjustment on the news report trend graph of one search word to obtain the media attention trend graph of that search word. For example, if the event search word cluster contains 4 search words, 4 subunits of holiday adjustment unit 122b are used to obtain the media attention trend graph of each of the 4 search words.
According to an exemplary embodiment of the present invention, the method by which holiday adjustment unit 122b performs holiday news quantity adjustment is essentially the same as that of holiday adjustment unit 122a in Fig. 2. The only difference is that holiday adjustment unit 122b adjusts the holiday news quantity of a single search word in the event search word cluster, whereas holiday adjustment unit 122a in Fig. 2 adjusts the holiday news quantity of the event search word cluster as a whole. A detailed description is therefore omitted here.
Time-series analysis unit 123b may perform time-series analysis on the media attention trend graph of each search word to identify news burst points, thereby obtaining, for each search word, a candidate key event development point set at the granularity of major events. A major event is a set of consecutive news burst points spanning at least the first predetermined number of days. A news burst point is a time point at which the news quantity exceeds a predetermined criterion (for example, exceeds a certain predetermined value). Here, the first predetermined number of days may be 3 days; that is, news burst points on three or more consecutive days can form one major event.
According to an exemplary embodiment of the present invention, time-series analysis unit 123b may include n subunits 123b1, 123b2, ..., 123bn, each of which performs time-series analysis on the media attention trend graph of one search word to identify news burst points and obtain that search word's candidate key event development point set at the granularity of major events. For example, if the event search word cluster contains 4 search words, 4 subunits of time-series analysis unit 123b are used to obtain the candidate key event development point set of each of the 4 search words.
According to an exemplary embodiment of the present invention, the method by which time-series analysis unit 123b performs time-series analysis is essentially the same as that of time-series analysis unit 123a in Fig. 2. The only difference is that time-series analysis unit 123b analyzes the media attention trend graph of a single search word in the event search word cluster, whereas time-series analysis unit 123a in Fig. 2 analyzes the media attention trend graph of the event search word cluster as a whole. A detailed description is therefore omitted here.
Merging unit 124b may merge the candidate key event development point sets of all search words in the event search word cluster to obtain a final candidate key event development point set.
Segmentation unit 125b may perform secondary segmentation based on a uniformity assumption on each major event in the final candidate key event development point set whose duration is greater than or equal to the second predetermined number of days and which has a distinct development pattern, thereby obtaining the first key event development point set. A major event with a distinct development pattern is one in which news burst points can still be identified when time-series analysis is performed on its duration alone. Here, the second predetermined number of days may be 5 days; that is, segmentation unit 125b may perform the secondary segmentation on a major event that lasts 5 days or more and in which news burst points can still be identified when time-series analysis is performed separately on the news within that duration.
According to an exemplary embodiment of the present invention, the method by which segmentation unit 125b performs secondary segmentation based on a uniformity assumption is similar to that of segmentation unit 124a in Fig. 2, so a detailed description is omitted here.
Referring back to Fig. 1, the second recognition unit 130 may perform key event development identification on the event core word based on hot news search words, thereby obtaining a second key event development point set.
Specifically, according to an exemplary embodiment of the present invention, the second recognition unit 130 uses the event core word to search a hot news search word dictionary (not shown) and identifies the time points corresponding to the hot news search words found as key event development points.
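One way to picture this lookup is the sketch below. The dictionary layout (hot search word mapped to the days on which it was trending), the substring match against the core word, and the function name are assumptions for illustration; the patent does not describe the dictionary format or the matching rule.

```python
def hot_search_key_points(hot_search_dict, core_word):
    """Return the sorted days of all hot search words that contain the core word."""
    points = set()
    for hot_word, days in hot_search_dict.items():
        if core_word in hot_word:  # assumed matching rule: simple substring match
            points.update(days)
    return sorted(points)


# Hypothetical dictionary: hot search word -> days on which it was trending.
hot_dict = {
    "xxx event verdict": ["2013-05-15"],
    "xxx event latest": ["2013-05-01", "2013-05-02"],
    "weather forecast": ["2013-05-01"],
}
second_set = hot_search_key_points(hot_dict, "xxx event")
```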
Summarizing unit 140 may merge and deduplicate the first key event development point set and the second key event development point set to obtain a third key event development point set.
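Merging with deduplication amounts to a set union. A minimal sketch, assuming the key points are represented as ISO date strings (a representation chosen here for illustration):

```python
def merge_key_points(first_set, second_set):
    """Union of the two key point sets, deduplicated and in chronological order."""
    return sorted(set(first_set) | set(second_set))


third_set = merge_key_points(
    ["2013-05-01", "2013-05-02"],  # hypothetical first set (media attention)
    ["2013-05-02", "2013-05-15"],  # hypothetical second set (hot search words)
)
```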
Optimizing unit 150 may optimize the third key event development point set to obtain the final key event development timeline.
According to an exemplary embodiment of the present invention, optimizing unit 150 may compare, with a predetermined threshold, the ratio of the news amount of the key event development point having the most news to the news amount of each key event development point after it. If every such ratio is greater than the predetermined threshold, optimizing unit 150 considers that the news on the days after that key event development point probably consists of reposts, and therefore removes all key event development points after it. Here, the predetermined threshold may be 0.8.
For example, for a major event lasting from May 1 to May 3, suppose the news quantity is 100 on May 1, 60 on May 2, and 50 on May 3. The news quantity on May 1 is the largest, and 100/60=1.67 and 100/50=2 are both greater than the predetermined threshold of 0.8. Optimizing unit 150 therefore considers that the news on May 2 and May 3 probably consists of reposts, and removes the key event development points for May 2 and May 3.
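The ratio test in this example can be sketched as follows. The function name and the dictionary of counts are hypothetical, and the sketch assumes every key point has a positive news count so that the division is defined.

```python
def prune_after_peak(event_days, counts, threshold=0.8):
    """Drop key points after the peak day if every peak/later ratio exceeds threshold."""
    days = sorted(event_days)
    peak = max(days, key=lambda d: counts[d])  # earliest day wins on ties
    later = [d for d in days if d > peak]
    if later and all(counts[peak] / counts[d] > threshold for d in later):
        return [d for d in days if d <= peak]  # later days treated as reposts
    return days


counts = {"2013-05-01": 100, "2013-05-02": 60, "2013-05-03": 50}
remaining = prune_after_peak(list(counts), counts)
```

Because the peak count is the numerator, each ratio is at least 1 when the later counts are positive, so with a threshold of 0.8 the condition holds in this sketch whenever any key points follow the peak.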
According to another exemplary embodiment of the present invention, optimizing unit 150 may retain the first key event development point of each progression in the key event development process.
For example, suppose a key event development process runs from May 1 to July 1, with key event development points on May 1 to May 3, May 15, May 23 to May 28, June 2, and June 20 to June 22. The first key event development point of the first progression is then May 1, that of the second progression is May 15, that of the third is May 23, that of the fourth is June 2, and that of the fifth is June 20, and all of these must be retained. During the segmentation or optimization described above, however, the key event development points for May 15 and June 2 may have been removed. Because May 15 and June 2 are the first key event development points of the second and fourth progressions respectively, the key event development points for May 15 and June 2 are restored and retained.
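The retention rule in this example can be sketched by treating each run of consecutive days as one progression. The function names and the year 2013 are assumptions for illustration.

```python
from datetime import date, timedelta


def first_points_of_runs(event_days):
    """Return the first day of every run of consecutive key point days."""
    days = sorted(set(event_days))
    return [d for i, d in enumerate(days)
            if i == 0 or d - days[i - 1] > timedelta(days=1)]


def restore_first_points(current_points, original_points):
    """Add back any first-of-progression points removed by earlier steps."""
    return sorted(set(current_points) | set(first_points_of_runs(original_points)))


original = (
    [date(2013, 5, d) for d in (1, 2, 3, 15, 23, 24, 25, 26, 27, 28)]
    + [date(2013, 6, d) for d in (2, 20, 21, 22)]
)
firsts = first_points_of_runs(original)
# Suppose May 15 and June 2 were removed during segmentation or optimization.
pruned = [d for d in original if d not in (date(2013, 5, 15), date(2013, 6, 2))]
restored = restore_first_points(pruned, original)
```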
As those skilled in the art will readily appreciate, the event progress optimization of the present invention is not limited to the above methods; other conventional optimization processes may also be used to optimize event progress and thereby form a more complete key event development timeline.
Fig. 4 is a flowchart of a method of identifying key event developments according to an exemplary embodiment of the present invention.
Referring to Fig. 4, in step 401, event search word cluster obtaining unit 110 may obtain an event search word cluster based on an event core word.
Specifically, event search word cluster obtaining unit 110 may obtain the event search word cluster by searching a user search log (not shown) for event search words corresponding to the event core word.
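Step 401 might be sketched as below. The log format (a list of raw query strings), the substring match against the core word, the `min_count` filter, and the function name are all assumptions made for illustration; the patent does not say how "corresponding" search words are determined.

```python
from collections import Counter


def build_search_word_cluster(search_log, core_word, min_count=1):
    """Collect distinct logged queries containing the core word, most frequent first."""
    counts = Counter(q.strip() for q in search_log if core_word in q)
    return [query for query, n in counts.most_common() if n >= min_count]


# Hypothetical user search log.
log = ["xxx event", "xxx event latest", "weather", "xxx event", "xxx event cause"]
cluster = build_search_word_cluster(log, "xxx event")
```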
In step 402, the first recognition unit 120 may perform key event development identification on the event search word cluster based on media attention, thereby obtaining a first key event development point set.
Because the method by which the first recognition unit 120 performs key event development identification on the event search word cluster based on media attention has been described in detail with reference to Fig. 2 and Fig. 3, a detailed description is omitted here.
In step 403, the second recognition unit 130 may perform key event development identification on the event core word based on hot news search words, thereby obtaining a second key event development point set.
Specifically, according to an exemplary embodiment of the present invention, the second recognition unit 130 uses the event core word to search a hot news search word dictionary (not shown) and identifies the time points corresponding to the hot news search words found as key event development points.
In step 404, summarizing unit 140 may merge and deduplicate the first key event development point set and the second key event development point set to obtain a third key event development point set.
In step 405, optimizing unit 150 may optimize the third key event development point set to obtain the final key event development timeline.
According to an exemplary embodiment of the present invention, optimizing unit 150 may compare, with a predetermined threshold, the ratio of the news amount of the key event development point having the most news to the news amount of each key event development point after it. If every such ratio is greater than the predetermined threshold, optimizing unit 150 considers that the news on the days after that key event development point probably consists of reposts, and therefore removes all key event development points after it. Here, the predetermined threshold may be 0.8.
According to another exemplary embodiment of the present invention, optimizing unit 150 may retain the first key event development point of each progression in the key event development process.
Fig. 5 is a diagram illustrating an example of a key event development timeline according to an exemplary embodiment of the present invention. Referring to Fig. 5, the key developments of the XXX event are clearly visible.
The present invention provides a method and apparatus for identifying key event developments. The method and apparatus perform key event development identification based on media attention and hot news search words to obtain a final key event development timeline. The present invention can provide a clearer event timeline, meet users' need to follow an event, and improve the user experience. In addition, the present invention requires no human editing, which greatly reduces the cost of producing special-topic pages. The present invention can also track the progress of a topic quickly and with high timeliness. Furthermore, the method and apparatus provided by the present invention constitute a general technical solution that is independent of any specific project, and therefore have strong generality and portability.
The above method according to the present invention can be performed according to computer program instructions. Because these program instructions can be included in a computer, a special-purpose processor, or programmable or dedicated hardware, the instructions executed therein facilitate performance of the functions described above. As those skilled in the art will understand, a computer, processor, or programmable hardware includes a memory device that can store or receive software or computer code, and the software or computer code implements the method described in the present invention when accessed and executed by the computer, processor, or hardware.
Although the present invention has been shown and described with reference to its exemplary embodiments, those skilled in the art should understand that various changes in form and detail may be made without departing from the spirit and scope of the present invention as defined by the claims and their equivalents.

Claims (24)

1. A method of identifying key event developments, the method comprising:
obtaining an event search word cluster based on an event core word;
performing key event development identification on the event search word cluster based on media attention to obtain a first key event development point set;
performing key event development identification on the event core word based on hot news search words to obtain a second key event development point set;
merging and deduplicating the first key event development point set and the second key event development point set to obtain a third key event development point set;
optimizing the third key event development point set to obtain a final key event development timeline,
wherein the step of performing key event development identification on the event core word based on hot news search words comprises:
searching a hot news search word dictionary using the event core word;
identifying the time points corresponding to the hot news search words found as key event development points.
2. The method of claim 1, wherein the step of performing key event development identification on the event search word cluster based on media attention comprises:
(1) retrieving a news inverted index using the event search word cluster and calculating, within a predetermined time period, the number of news items hit by the event search word cluster each day on the time axis, to obtain a news report trend graph of the event search word cluster;
(2) performing holiday news quantity adjustment on the news report trend graph based on the holiday effect, to obtain a media attention trend graph;
(3) performing time-series analysis on the media attention trend graph to identify news burst points, to obtain a candidate key event development point set at the granularity of major events, wherein a major event is a set of consecutive news burst points spanning at least a first predetermined number of days;
(4) performing secondary segmentation based on a uniformity assumption on each major event in the candidate key event development point set whose duration is greater than or equal to a second predetermined number of days and which has a distinct development pattern, to obtain the first key event development point set, wherein a major event with a distinct development pattern is a major event in which news burst points can still be identified when time-series analysis is performed separately within said duration.
3. The method of claim 1, wherein the step of performing key event development identification on the event search word cluster based on media attention comprises:
(1) retrieving a news inverted index using the event search word cluster and calculating, within a predetermined time period, the number of news items hit each day on the time axis by each search word in the event search word cluster, to obtain a news report trend graph of each search word;
(2) performing holiday news quantity adjustment on the news report trend graph of each search word based on the holiday effect, to obtain a media attention trend graph of each search word;
(3) performing time-series analysis on the media attention trend graph of each search word to identify news burst points, to obtain candidate key event development points of each search word at the granularity of major events, wherein a major event is a set of consecutive news burst points spanning at least a first predetermined number of days;
(4) merging the candidate key event development points of all search words in the event search word cluster, to obtain a candidate key event development point set;
(5) performing secondary segmentation based on a uniformity assumption on each major event in the candidate key event development point set whose duration is greater than or equal to a second predetermined number of days and which has a distinct development pattern, to obtain the first key event development point set, wherein a major event with a distinct development pattern is a major event in which news burst points can still be identified when time-series analysis is performed separately within said duration.
4. The method of claim 1, wherein the step of obtaining the event search word cluster comprises:
obtaining the event search word cluster by searching a user search log for event search words corresponding to the event core word.
5. The method of claim 2 or claim 3, wherein the step of holiday news quantity adjustment comprises:
obtaining the network-wide total news amounts for the day before the holiday, the holiday itself, and the day after the holiday by counting the network-wide news index for each of those days;
calculating the ratio of the difference between the network-wide total news amount on the holiday and that on the day before the holiday to the difference between the network-wide total news amount on the holiday and that on the day after the holiday;
adjusting the news quantity of the event search word cluster on the holiday according to the calculated ratio.
6. The method of claim 2 or claim 3, wherein the first predetermined number of days is 3 days and the second predetermined number of days is 5 days.
7. The method of claim 2 or claim 3, wherein the step of time-series analysis comprises:
using a sliding time window, with a first predetermined time period as one calculation time window and a second predetermined time period as the step by which the window slides forward, and identifying the news burst points in each calculation time window separately;
if a time point is identified as a news burst point in any one calculation time window, setting that time point as a candidate key development time point;
merging all candidate key development time points to obtain the candidate key event development point set at the granularity of major events.
8. The method of claim 7, wherein the first predetermined time period is 30 days and the second predetermined time period is 2 days.
9. The method of claim 7, wherein the step of identifying news burst points comprises:
calculating the mean and variance of the daily news quantity of the event search word cluster over all days in the calculation time window;
calculating a threshold by the following formula: threshold = mean + 0.8 × variance;
if the news quantity at a time point in the calculation time window is greater than the calculated threshold, identifying that time point as a news burst point.
10. The method of claim 2 or claim 3, wherein the step of secondary segmentation based on a uniformity assumption comprises:
for a major event whose duration is greater than or equal to the second predetermined number of days and which has a distinct development pattern, comparing the news amount of the event search word cluster on each day with the average news amount of the event search word cluster over the preceding few days;
if the news amount of the event search word cluster on that day is greater than said average, retaining that day as a key event development point;
if the news amount of the event search word cluster on that day is less than or equal to said average, removing the key event development point for that day.
11. The method of claim 2 or claim 3, wherein the step of optimizing the third key event development point set comprises:
for a major event, comparing with a predetermined threshold the ratio of the news amount of the key event development point having the most news to the news amount of each key event development point after that key event development point;
if the ratio of the news amount of the key event development point having the most news to the news amount of each key event development point after it is in every case greater than the predetermined threshold, removing all key event development points after that key event development point.
12. The method of claim 2 or claim 3, wherein the step of optimizing the third key event development point set further comprises:
retaining the first key event development point of each progression in the key event development process.
13. An apparatus for identifying key event developments, the apparatus comprising:
an event search word cluster obtaining unit, which obtains an event search word cluster based on an event core word;
a first recognition unit, which performs key event development identification on the event search word cluster based on media attention to obtain a first key event development point set;
a second recognition unit, which performs key event development identification on the event core word based on hot news search words to obtain a second key event development point set;
a summarizing unit, which merges and deduplicates the first key event development point set and the second key event development point set to obtain a third key event development point set;
an optimizing unit, which optimizes the third key event development point set to obtain a final key event development timeline,
wherein the second recognition unit searches a hot news search word dictionary using the event core word, and identifies the time points corresponding to the hot news search words found as key event development points.
14. The apparatus of claim 13, wherein the first recognition unit comprises:
a news amount calculation unit, which retrieves a news inverted index using the event search word cluster and calculates, within a predetermined time period, the number of news items hit by the event search word cluster each day on the time axis, to obtain a news report trend graph of the event search word cluster;
a holiday adjustment unit, which performs holiday news quantity adjustment on the news report trend graph based on the holiday effect, to obtain a media attention trend graph;
a time-series analysis unit, which performs time-series analysis on the media attention trend graph to identify news burst points, to obtain a candidate key event development point set at the granularity of major events, wherein a major event is a set of consecutive news burst points spanning at least a first predetermined number of days;
a segmentation unit, which performs secondary segmentation based on a uniformity assumption on each major event in the candidate key event development point set whose duration is greater than or equal to a second predetermined number of days and which has a distinct development pattern, to obtain the first key event development point set, wherein a major event with a distinct development pattern is a major event in which news burst points can still be identified when time-series analysis is performed separately within said duration.
15. The apparatus of claim 13, wherein the first recognition unit comprises:
a news amount calculation unit, which retrieves a news inverted index using the event search word cluster and calculates, within a predetermined time period, the number of news items hit each day on the time axis by each search word in the event search word cluster, to obtain a news report trend graph of each search word;
a holiday adjustment unit, which performs holiday news quantity adjustment on the news report trend graph of each search word based on the holiday effect, to obtain a media attention trend graph of each search word;
a time-series analysis unit, which performs time-series analysis on the media attention trend graph of each search word to identify news burst points, to obtain a candidate key event development point set of each search word at the granularity of major events, wherein a major event is a set of consecutive news burst points spanning at least a first predetermined number of days;
a merging unit, which merges the candidate key event development point sets of all search words in the event search word cluster, to obtain a final candidate key event development point set;
a segmentation unit, which performs secondary segmentation based on a uniformity assumption on each major event in the final candidate key event development point set whose duration is greater than or equal to a second predetermined number of days and which has a distinct development pattern, to obtain the first key event development point set, wherein a major event with a distinct development pattern is a major event in which news burst points can still be identified when time-series analysis is performed separately within said duration.
16. The apparatus as claimed in claim 13, wherein the event search word cluster obtaining unit searches user search logs for event search words corresponding to the event core word, thereby obtaining the event search word cluster.
17. The apparatus as described in claim 14 or 15, wherein the holiday adjustment unit obtains the whole-network news totals for the day before a holiday, the day of the holiday, and the day after the holiday by collecting the whole-network news index for each of those days; calculates the ratio of the difference between the whole-network news total on the day of the holiday and that on the day before the holiday to the difference between the whole-network news total on the day of the holiday and that on the day before the holiday; and adjusts the news quantity of the event search word cluster on the day of the holiday according to the calculated ratio.
18. The apparatus as described in claim 14 or 15, wherein the first predetermined number of days is 3 days and the second predetermined number of days is 5 days.
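As an illustration of how a first predetermined number of days of 3 could be applied, the sketch below groups consecutive news burst days into major events. It assumes days are represented as integer indices on the timeline; the function name `group_major_events` is hypothetical and not taken from the patent.

```python
def group_major_events(burst_days, min_days=3):
    """Group consecutive burst days into major events.

    burst_days: iterable of integer day indices flagged as news burst points.
    Returns a list of (first_day, last_day) tuples, one per run of
    consecutive days whose length is at least min_days.
    """
    events = []
    run = []
    for day in sorted(set(burst_days)):
        if run and day == run[-1] + 1:
            run.append(day)  # the current run continues
        else:
            if len(run) >= min_days:
                events.append((run[0], run[-1]))
            run = [day]  # start a new run
    if len(run) >= min_days:
        events.append((run[0], run[-1]))
    return events
```

A run found this way would additionally need to last at least the second predetermined number of days (5) and show an obvious development granularity before it is passed to secondary cutting.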
19. The apparatus as described in claim 14 or 15, wherein the time-series analysis unit uses a sliding time window: it takes a first predetermined time period as one calculation time window, slides the window forward in steps of a second predetermined time period, and identifies the news burst points in each calculation time window separately; if a time point is identified as a news burst point in any one calculation time window, that time point is set as a candidate key development time point; and all candidate key development time points are merged to obtain the candidate event key development point set with major events as its granularity.
20. The apparatus as claimed in claim 19, wherein the first predetermined time period is 30 days and the second predetermined time period is 2 days.
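The sliding-window procedure can be sketched as below. This is a minimal illustration, not the patented implementation: the burst detector is passed in as a hypothetical callback `detect`, days are list indices, and trailing days that are not covered by a full window are left unexamined, which is an assumption the claims do not address.

```python
def sliding_window_candidates(counts, detect, window=30, step=2):
    """Collect days flagged as bursts in at least one calculation window.

    counts: list of daily news quantities, index = day on the timeline.
    detect: hypothetical callback taking the counts of one window and
            returning the offsets (within that window) of burst points.
    window: length of one calculation time window in days.
    step:   number of days the window slides forward each time.
    """
    candidates = set()
    last_start = max(len(counts) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = counts[start:start + window]
        for offset in detect(chunk):
            # Being flagged in any single window is enough to become
            # a candidate key development time point.
            candidates.add(start + offset)
    return sorted(candidates)
```

Because a day is kept as soon as one window flags it, a modest peak that is hidden by a larger peak in early windows can still be picked up once the larger peak has slid out of the window.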
21. The apparatus as claimed in claim 19, wherein the time-series analysis unit identifies news burst points as follows: it calculates the mean and the variance of the daily news quantities of the event search word cluster over all days in the calculation time window; calculates a threshold by the following formula: threshold = mean + 0.8 × variance; and, if the news quantity at a time point in the calculation time window exceeds the calculated threshold, identifies that time point as a news burst point.
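The threshold rule can be written down directly. The sketch below assumes the population variance (`statistics.pvariance`); the claim only says "variance", so the choice between population and sample variance is an assumption, as is the function name `burst_points`.

```python
from statistics import mean, pvariance


def burst_points(window_counts, factor=0.8):
    """Return the offsets in one calculation window that exceed the threshold.

    threshold = mean + factor * variance, with factor = 0.8 as in the claim.
    The population variance is used here as an assumption.
    window_counts must contain at least one value.
    """
    avg = mean(window_counts)
    var = pvariance(window_counts)
    threshold = avg + factor * var
    return [i for i, count in enumerate(window_counts) if count > threshold]
```

For the window `[1, 1, 1, 1, 6]` the mean is 2 and the population variance is 4, so the threshold is 5.2 and only the last day is reported as a burst point.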
22. The apparatus as described in claim 14 or 15, wherein, within a major event whose duration is greater than or equal to the second predetermined number of days and which has an obvious development granularity, the cutting unit compares the event search word cluster news quantity of each day with the average event search word cluster news quantity of the several days preceding that day; if the news quantity of that day is greater than the average, the day is retained as an event key development point; if the news quantity of that day is less than or equal to the average, the event key development point of that day is removed.
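A minimal sketch of this comparison follows. The claim does not say how many preceding days enter the average, so the `lookback` parameter and its default of 3 are hypothetical; keeping a day that has no preceding data is likewise an assumption.

```python
def secondary_cut(counts, event_days, lookback=3):
    """Keep only the days of a major event that rise above their recent average.

    counts:     list of daily news quantities, index = day on the timeline.
    event_days: day indices belonging to one major event.
    lookback:   hypothetical number of preceding days used for the average.
    """
    kept = []
    for day in event_days:
        previous = counts[max(day - lookback, 0):day]
        if not previous:
            kept.append(day)  # assumption: keep a day with no history
            continue
        average = sum(previous) / len(previous)
        if counts[day] > average:
            kept.append(day)  # strictly above the average: retain
        # equal to or below the average: the day is dropped
    return kept
```

With `counts = [2, 2, 2, 10, 12, 6, 15, 8]` and the event covering days 3 to 7, days 3, 4 and 6 are retained, while days 5 and 7 fall below the average of their three preceding days and are removed.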
23. The apparatus as described in claim 14 or 15, wherein, for one major event, the optimization unit compares with a predetermined threshold the ratio of the news quantity of the event key development point having the most news to the news quantity of each event key development point after it; if the ratio is greater than the predetermined threshold for every event key development point after it, the optimization unit removes all event key development points after that event key development point.
24. The apparatus as described in claim 14 or 15, wherein the optimization unit retains the first event key development point of each development in the event key development process.
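The pruning rule for a single major event might look like the sketch below. The claims do not state the value of the predetermined threshold, so `ratio_threshold` must be supplied by the caller; treating a later point with zero news as exceeding any threshold is an assumption, and the function name is hypothetical.

```python
def prune_after_peak(points, ratio_threshold):
    """Drop every key development point after the peak if all are dwarfed by it.

    points: list of (day, news_count) tuples in chronological order for
            one major event.
    ratio_threshold: the predetermined threshold (value not given in the
            claims, so it is a caller-supplied assumption).
    """
    if not points:
        return []
    # Index of the point with the most news (first one on ties).
    peak_index = max(range(len(points)), key=lambda i: points[i][1])
    peak_count = points[peak_index][1]
    later = points[peak_index + 1:]
    dwarfed = all(
        count == 0 or peak_count / count > ratio_threshold
        for _day, count in later
    )
    if later and dwarfed:
        return points[:peak_index + 1]
    return list(points)
```

The effect is that a long tail of minor follow-up reports after a dominant peak is discarded, while an event with a second substantial peak keeps all of its points.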
CN201310314465.XA 2013-07-24 2013-07-24 The method and apparatus of identification event key development Active CN103500163B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310314465.XA CN103500163B (en) 2013-07-24 2013-07-24 The method and apparatus of identification event key development

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310314465.XA CN103500163B (en) 2013-07-24 2013-07-24 The method and apparatus of identification event key development

Publications (2)

Publication Number Publication Date
CN103500163A CN103500163A (en) 2014-01-08
CN103500163B CN103500163B (en) 2016-12-28

Family

ID=49865375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310314465.XA Active CN103500163B (en) 2013-07-24 2013-07-24 The method and apparatus of identification event key development

Country Status (1)

Country Link
CN (1) CN103500163B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331493B (en) * 2014-11-17 2017-07-07 百度在线网络技术(北京)有限公司 By the computer implemented method and device that data are explained for generating trend
CN104636461B (en) * 2015-02-06 2018-10-23 北京中搜云商网络技术有限公司 A method of the dynamic event cluster based on KNN and extraction
CN106844466A (en) * 2016-12-21 2017-06-13 百度在线网络技术(北京)有限公司 Event train of thought generation method and device
CN109325524A (en) * 2018-08-31 2019-02-12 中国科学院自动化研究所 Track of issues and changes phase division methods, system and relevant device
CN110415151A (en) * 2019-07-08 2019-11-05 上海易点时空网络有限公司 Restricted driving policy monitoring method and device, storage medium
CN113836448B (en) * 2021-09-22 2023-10-20 抖音视界有限公司 Information display method, device, computer equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102194015A (en) * 2011-06-30 2011-09-21 重庆新媒农信科技有限公司 Retrieval information heat statistical method
CN102646114A (en) * 2012-02-17 2012-08-22 清华大学 News topic timeline abstract generating method based on breakthrough point
CN103164427A (en) * 2011-12-13 2013-06-19 中国移动通信集团公司 Method and device of news aggregation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020152245A1 (en) * 2001-04-05 2002-10-17 Mccaskey Jeffrey Web publication of newspaper content

Also Published As

Publication number Publication date
CN103500163A (en) 2014-01-08

Similar Documents

Publication Publication Date Title
CN103500163B (en) The method and apparatus of identification event key development
US8010545B2 (en) System and method for providing a topic-directed search
CN103324718B (en) Method and system based on humongous search Web log mining topic venation
CN100504866C (en) Integrative searching result sequencing system and method
CN102880712B (en) Method and system for sequencing searched network videos
CN103870461B (en) Subject recommending method, device and server
CN111178586B (en) Method for tracking, predicting and dredging network patriotic public opinion events
CN102222103A (en) Method and device for processing matching relationship of video content
CN102119385A (en) Method and subsystem for searching media content within a content-search-service system
CN103577501A (en) Hot topic searching system and hot topic searching method
CN103870454A (en) Method and method for recommending data
CN107423202A (en) Event resolver, event resolution system, event analytic method and event analysis program
JPWO2007091587A1 (en) Representative image or representative image group display system, method and program thereof, and representative image or representative image group selection system, method and program thereof
CN104636407B (en) Parameter value training and searching request treating method and apparatus
CN106296286A (en) The predictor method of ad click rate and estimating device
CN103838754A (en) Information searching device and method
CN106445894A (en) New media intelligent online editing method and apparatus, and network information release platform
CN111026965A (en) Hot topic tracing method and device based on knowledge graph
US20090024591A1 (en) Device, method and program for producing related words dictionary, and content search device
CN106469176A (en) A kind of method and apparatus for extracting text snippet
CN113468868B (en) NLP-based real-time network hot content analysis method
JP2000331020A (en) Method and device for information reference and storage medium with information reference program stored
CN116501974A (en) Push system for scientific research achievements
WO2009066392A1 (en) Map-searching device, map-searching method, map-searching program, and recording medium
US7716209B1 (en) Automated advertisement publisher identification and selection

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant