CN106570140B - Method and device for determining information hotspots - Google Patents
Method and device for determining information hotspots
- Publication number
- CN106570140B (application number CN201610964928.0A)
- Authority
- CN
- China
- Prior art keywords
- information
- focus
- information focus
- state
- list
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and device for determining information hotspots, belonging to the field of information technology. The method includes: calculating the similarity between a class and each information hotspot in a first list; when the similarity between the class and any information hotspot in the first list is greater than a first threshold, adding the class to the information list corresponding to that information hotspot; and when the similarity between the class and any to-be-confirmed information hotspot in a second list is greater than a second threshold and that to-be-confirmed information hotspot satisfies a preset condition, moving the to-be-confirmed information hotspot to the first list. The invention does not determine information hotspots from the clustering result alone. A clustering result is treated as an information hotspot only when its similarity to an information hotspot in the first list is greater than the first threshold, or when its similarity to a to-be-confirmed hotspot in the second list is greater than the second threshold and the preset condition is satisfied, which improves the accuracy of the determined information hotspots.
Description
This application claims priority to Chinese patent application No. 201610354737.2, filed with the Chinese Patent Office on May 26, 2016 and entitled "An unstructured information hotspot management system and method", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of information technology, and in particular to a method and device for determining information hotspots.
Background art
In modern society, the internet has increasingly become the main channel for publishing information. Through the internet, users can post comments about a trending news item, a popular figure, or a hot topic on social websites such as microblogs, forums, and blogs. In the field of information technology, such trending news items, popular figures, and hot topics are commonly referred to as information hotspots. Because the comments on information hotspots reflect the current state of public opinion and are significant to social stability and national development, information hotspots need to be identified promptly from massive amounts of information so that effective measures can be taken to guide public opinion.
When determining information hotspots, the prior art mainly uses keyword retrieval. The specific process is: obtain pending information from the internet; extract the keywords of each piece of pending information; calculate the similarity between the keywords of any two pieces of pending information and, if that similarity is greater than a preset threshold, group the two pieces of pending information into one class and use the keywords as the class label of that class; if the amount of information contained in any class is greater than a preset quantity, take the class label of that class as an information hotspot.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problem:
Because the prior art relies on a single clustering result and simply takes any class whose amount of information meets a certain condition as an information hotspot, an information hotspot determined from such a class may in fact be a false one. The information hotspots determined by the prior art are therefore inaccurate.
Summary of the invention
To solve the problem in the prior art, embodiments of the present invention provide a method and device for determining information hotspots. The technical solution is as follows:
In one aspect, a method for determining information hotspots is provided, the method including:
clustering pending information to obtain multiple classes;
for any one class, calculating the similarity between the class and each information hotspot in a first list, the first list being used to store information hotspots;
if the similarity between the class and any information hotspot in the first list is greater than a first threshold, adding the class to the information list corresponding to that information hotspot;
if the similarities between the class and every information hotspot in the first list are all less than the first threshold, calculating the similarity between the class and each to-be-confirmed information hotspot in a second list, the second list being used to store to-be-confirmed information hotspots;
if the similarity between the class and any to-be-confirmed information hotspot in the second list is greater than a second threshold, adding the class to the information list corresponding to that to-be-confirmed information hotspot and, when that to-be-confirmed information hotspot satisfies a preset condition, moving it to the first list.
In another embodiment of the present invention, each class has a class label, and after calculating the similarity between the class and each to-be-confirmed information hotspot in the second list, the method further includes:
if the similarities between the class and every to-be-confirmed information hotspot in the second list are all less than the second threshold, determining the class label of the class as a target to-be-confirmed information hotspot;
adding the target to-be-confirmed information hotspot to the second list and, when the target to-be-confirmed information hotspot satisfies the preset condition, moving it to the first list.
In another embodiment of the present invention, after adding the class to the information list corresponding to the information hotspot, the method further includes:
drawing a change curve of the information hotspot, with the amount of information of the information hotspot on the vertical axis and time on the horizontal axis;
obtaining a first amount of information of the information hotspot at the current time;
obtaining a second amount of information of the information hotspot at a specified time, the specified time being separated from the current time by a preset duration;
calculating the angle between the change curve and the horizontal axis according to the first amount of information, the second amount of information, and the preset duration;
determining the current life cycle state of the information hotspot according to the angle between the change curve and the horizontal axis;
updating the life cycle state of the information hotspot according to its current life cycle state.
In another embodiment of the present invention, the life cycle states include an occurrence state, a development state, an outbreak state, a decline state, and an extinction state;
determining the current life cycle state of the information hotspot according to the angle between the change curve and the horizontal axis includes:
if the angle between the change curve and the horizontal axis is less than a first preset value, determining that the current life cycle state of the information hotspot is the occurrence state;
if the angle is greater than the first preset value and less than a second preset value, determining that the current life cycle state of the information hotspot is the development state;
if the angle is greater than the second preset value and less than a third preset value, determining that the current life cycle state of the information hotspot is the outbreak state;
if the angle is greater than the third preset value and less than a fourth preset value, determining that the current life cycle state of the information hotspot is the decline state;
if the angle is greater than the fourth preset value, determining that the current life cycle state of the information hotspot is the extinction state;
where the first preset value is less than the second preset value, the second preset value is less than the third preset value, and the third preset value is less than the fourth preset value.
In another embodiment of the present invention, after updating the life cycle state of the information hotspot according to its current life cycle state, the method further includes:
if the updated life cycle state of the information hotspot is the extinction state, moving the information hotspot to a third list, the third list being used to store information hotspots in the extinction state.
In another aspect, a device for determining information hotspots is provided, the device including:
a clustering module, configured to cluster pending information to obtain multiple classes;
a calculation module, configured to calculate, for any one class, the similarity between the class and each information hotspot in a first list, the first list being used to store information hotspots;
an adding module, configured to add the class to the information list corresponding to an information hotspot when the similarity between the class and that information hotspot in the first list is greater than a first threshold;
the calculation module being further configured to calculate the similarity between the class and each to-be-confirmed information hotspot in a second list when the similarities between the class and every information hotspot in the first list are all less than the first threshold, the second list being used to store to-be-confirmed information hotspots;
the adding module being further configured to add the class to the information list corresponding to a to-be-confirmed information hotspot when the similarity between the class and that to-be-confirmed information hotspot in the second list is greater than a second threshold;
a moving module, configured to move the to-be-confirmed information hotspot to the first list when it satisfies a preset condition.
In another embodiment of the present invention, each class has a class label, and the device further includes:
a first determining module, configured to determine the class label of the class as a target to-be-confirmed information hotspot when the similarities between the class and every to-be-confirmed information hotspot in the second list are all less than the second threshold;
the adding module being further configured to add the target to-be-confirmed information hotspot to the second list;
the moving module being further configured to move the target to-be-confirmed information hotspot to the first list when it satisfies the preset condition.
In another embodiment of the present invention, the device further includes:
a drawing module, configured to draw a change curve of the information hotspot, with the amount of information of the information hotspot on the vertical axis and time on the horizontal axis;
an obtaining module, configured to obtain a first amount of information of the information hotspot at the current time;
the obtaining module being further configured to obtain a second amount of information of the information hotspot at a specified time, the specified time being separated from the current time by a preset duration;
the calculation module being further configured to calculate the angle between the change curve and the horizontal axis according to the first amount of information, the second amount of information, and the preset duration;
a second determining module, configured to determine the current life cycle state of the information hotspot according to the angle between the change curve and the horizontal axis;
an updating module, configured to update the life cycle state of the information hotspot according to its current life cycle state.
In another embodiment of the present invention, the life cycle states include an occurrence state, a development state, an outbreak state, a decline state, and an extinction state;
the second determining module is configured to: determine that the current life cycle state of the information hotspot is the occurrence state when the angle between the change curve and the horizontal axis is less than a first preset value; determine that it is the development state when the angle is greater than the first preset value and less than a second preset value; determine that it is the outbreak state when the angle is greater than the second preset value and less than a third preset value; determine that it is the decline state when the angle is greater than the third preset value and less than a fourth preset value; and determine that it is the extinction state when the angle is greater than the fourth preset value; where the first preset value is less than the second preset value, the second preset value is less than the third preset value, and the third preset value is less than the fourth preset value.
In another embodiment of the present invention, the moving module is further configured to move the information hotspot to a third list when the updated life cycle state of the information hotspot is the extinction state, the third list being used to store information hotspots in the extinction state.
The beneficial effects of the technical solution provided by the embodiments of the present invention are as follows:
Information hotspots are not determined from the clustering result alone. A clustering result is treated as an information hotspot only when its similarity to an information hotspot in the first list is greater than the first threshold, or when its similarity to a to-be-confirmed hotspot in the second list is greater than the second threshold and the preset condition is satisfied, which improves the accuracy of the determined information hotspots.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the implementation environment involved in a method for determining information hotspots provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a method for determining information hotspots provided by another embodiment of the present invention;
Fig. 3 is a flowchart of a method for determining information hotspots provided by another embodiment of the present invention;
Fig. 4 is a schematic diagram of an information hotspot discovery process provided by another embodiment of the present invention;
Fig. 5 is a schematic diagram of an information hotspot life cycle management process provided by another embodiment of the present invention;
Fig. 6 is a schematic diagram of an information hotspot tracking process provided by another embodiment of the present invention;
Fig. 7 is a schematic diagram of an information hotspot life cycle provided by another embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a device for determining information hotspots provided by another embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Referring to Fig. 1, which shows the implementation environment involved in the method for determining information hotspots provided by an embodiment of the present invention, the implementation environment includes: an information input module 101, an information hotspot life cycle discovery, tracking, and management module 102, management lists 103, and an information hotspot output module 104.
The management lists 103 include an active hotspot list 121, a historical hotspot list 122, and a to-be-confirmed hotspot list 123. The active hotspot list 121 stores newly discovered information hotspots and information hotspots in the development, outbreak, or decline state; the information hotspots in the active hotspot list need to be tracked continuously. The historical hotspot list 122 stores information hotspots in the extinction state; the information hotspots in the historical hotspot list no longer need to be tracked. The to-be-confirmed hotspot list 123 stores to-be-confirmed information hotspots, that is, candidates for which it still has to be confirmed whether they are information hotspots.
The information input module 101 is used to input massive amounts of information. The information it inputs can come from an information database on the internet, which stores all kinds of comment information from the internet, including comments that users post on social websites such as microblogs, WeChat, and blogs; it can also come from the server's local database. To make the massive input easier to analyze, the information input module 101 also processes the input information, converting it into plain text. In this process, if the input is HTML (HyperText Markup Language) information, the HTML tags need to be removed, the text in the HTML information retained, and the retained text converted to txt form; if the input is an office document, the document needs to be parsed with a document parsing application, which extracts the text from the office document so that it can be converted to txt form. The information input module 101 also organizes the processed information and automatically stores the organized information, in units of days, in the server's local database for use in discovering and tracking information hotspots.
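The HTML branch of this preprocessing can be illustrated with a minimal sketch that uses only the Python standard library. The class name `TextExtractor` and the function name `html_to_text` are hypothetical, and skipping the contents of `<script>` and `<style>` elements is an assumption of this sketch; the description only states that tags are removed and the text is retained.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes; the tag markup itself is never emitted."""

    # Assumption: script and style bodies are not reader-visible text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the retained text of an HTML string as plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

The returned string could then be written to a .txt file; parsing of office documents would need a separate document parsing tool and is not covered here.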
The information hotspot life cycle discovery, tracking, and management module 102 is used to automatically discover, track, and manage hotspots in the information input by the information input module 101. It includes a hotspot discovery submodule 111, a hotspot tracking submodule 112, and a hotspot life cycle management submodule 113. The input of module 102 is the to-be-confirmed hotspot list 123, and its output is the updated active hotspot list 121 and historical hotspot list 122.
The information hotspot output module 104 is used to output the active hotspot list 121, the historical hotspot list 122, and the to-be-confirmed hotspot list 123. The three lists output by the information hotspot output module 104 can be displayed in HTML form or embedded in other systems.
Based on the implementation environment shown in Fig. 1, an embodiment of the present invention provides a method for determining information hotspots. Referring to Fig. 2, the method flow provided by this embodiment includes:
201. Cluster the pending information to obtain multiple classes.
202. For any one class, calculate the similarity between the class and each information hotspot in a first list, the first list being used to store information hotspots.
203. If the similarity between the class and any information hotspot in the first list is greater than a first threshold, add the class to the information list corresponding to that information hotspot.
204. If the similarities between the class and every information hotspot in the first list are all less than the first threshold, calculate the similarity between the class and each to-be-confirmed information hotspot in a second list, the second list being used to store to-be-confirmed information hotspots.
205. If the similarity between the class and any to-be-confirmed information hotspot in the second list is greater than a second threshold, add the class to the information list corresponding to that to-be-confirmed information hotspot and, when that to-be-confirmed information hotspot satisfies a preset condition, move it to the first list.
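Steps 202 to 205 can be sketched as a single routing function. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Hotspot`, `Cluster`, and `route_cluster` are hypothetical, the similarity measure is passed in as a function, and the preset condition is passed in as the hypothetical predicate `is_confirmed` because this part of the description does not define it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Vector = Sequence[float]


@dataclass
class Hotspot:
    label: Vector                                        # label compared with classes
    info_ids: List[int] = field(default_factory=list)    # its information list


@dataclass
class Cluster:
    label: Vector
    info_ids: List[int]


def route_cluster(
    cluster: Cluster,
    first_list: List[Hotspot],
    second_list: List[Hotspot],
    similarity: Callable[[Vector, Vector], float],
    first_threshold: float,
    second_threshold: float,
    is_confirmed: Callable[[Hotspot], bool],  # hypothetical preset condition
) -> str:
    # Steps 202-203: match the class against hotspots in the first list.
    for hotspot in first_list:
        if similarity(cluster.label, hotspot.label) > first_threshold:
            hotspot.info_ids.extend(cluster.info_ids)
            return "active"
    # Steps 204-205: otherwise match against to-be-confirmed hotspots.
    for index, candidate in enumerate(second_list):
        if similarity(cluster.label, candidate.label) > second_threshold:
            candidate.info_ids.extend(cluster.info_ids)
            if is_confirmed(candidate):
                # Move the confirmed candidate to the first list.
                first_list.append(second_list.pop(index))
                return "promoted"
            return "to-be-confirmed"
    # No match in either list; the caller decides what to do with the class.
    return "unmatched"
```

Returning a status string keeps the sketch easy to test; a class that matches nothing is left to the caller.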
The method provided by this embodiment does not determine information hotspots from the clustering result alone. A clustering result is treated as an information hotspot only when its similarity to an information hotspot in the first list is greater than the first threshold, or when its similarity to a to-be-confirmed hotspot in the second list is greater than the second threshold and the preset condition is satisfied, which improves the accuracy of the determined information hotspots.
In another embodiment of the present invention, each class has a class label, and after calculating the similarity between the class and each to-be-confirmed information hotspot in the second list, the method further includes:
if the similarities between the class and every to-be-confirmed information hotspot in the second list are all less than the second threshold, determining the class label of the class as a target to-be-confirmed information hotspot;
adding the target to-be-confirmed information hotspot to the second list and, when the target to-be-confirmed information hotspot satisfies the preset condition, moving it to the first list.
In another embodiment of the present invention, after adding the class to the information list corresponding to the information hotspot, the method further includes:
drawing a change curve of the information hotspot, with the amount of information of the information hotspot on the vertical axis and time on the horizontal axis;
obtaining a first amount of information of the information hotspot at the current time;
obtaining a second amount of information of the information hotspot at a specified time, the specified time being separated from the current time by a preset duration;
calculating the angle between the change curve and the horizontal axis according to the first amount of information, the second amount of information, and the preset duration;
determining the current life cycle state of the information hotspot according to the angle between the change curve and the horizontal axis;
updating the life cycle state of the information hotspot according to its current life cycle state.
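The angle calculation can be sketched as follows. This passage does not give the formula, so the sketch assumes that the angle is the inclination of the straight line through the two sampled points, with the specified time taken to be earlier than the current time; the function name `curve_angle_degrees` is hypothetical.

```python
import math


def curve_angle_degrees(first_amount: float,
                        second_amount: float,
                        preset_duration: float) -> float:
    """Inclination, in degrees, of the line through the two samples.

    first_amount:    amount of information at the current time
    second_amount:   amount of information at the specified (earlier) time
    preset_duration: time separating the two samples, in the axis unit
    """
    if preset_duration <= 0:
        raise ValueError("preset_duration must be positive")
    slope = (first_amount - second_amount) / preset_duration
    return math.degrees(math.atan(slope))
```

The result depends on the units chosen for both axes, so any preset values compared against this angle would have to be calibrated for the same units.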
In another embodiment of the present invention, the life cycle states include an occurrence state, a development state, an outbreak state, a decline state, and an extinction state;
determining the current life cycle state of the information hotspot according to the angle between the change curve and the horizontal axis includes:
if the angle between the change curve and the horizontal axis is less than a first preset value, determining that the current life cycle state of the information hotspot is the occurrence state;
if the angle is greater than the first preset value and less than a second preset value, determining that the current life cycle state of the information hotspot is the development state;
if the angle is greater than the second preset value and less than a third preset value, determining that the current life cycle state of the information hotspot is the outbreak state;
if the angle is greater than the third preset value and less than a fourth preset value, determining that the current life cycle state of the information hotspot is the decline state;
if the angle is greater than the fourth preset value, determining that the current life cycle state of the information hotspot is the extinction state;
where the first preset value is less than the second preset value, the second preset value is less than the third preset value, and the third preset value is less than the fourth preset value.
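The mapping from angle to life cycle state amounts to a lookup over four increasing preset values. In the sketch below the English state names and the function name `classify_state` are illustrative. The text leaves open what happens when the angle equals a preset value exactly; this sketch assigns such an angle to the higher state, which is an assumption.

```python
from bisect import bisect_right
from typing import Sequence

# State names in the order given by the description.
STATES = ("occurrence", "development", "outbreak", "decline", "extinction")


def classify_state(angle: float, presets: Sequence[float]) -> str:
    """Map an angle to a life cycle state using four increasing preset values."""
    if len(presets) != 4 or list(presets) != sorted(set(presets)):
        raise ValueError("presets must be four strictly increasing values")
    # bisect_right counts how many preset values are <= angle, which is
    # exactly the index of the matching state.
    return STATES[bisect_right(presets, angle)]
```

For example, with the purely illustrative preset values `[10.0, 30.0, 60.0, 80.0]`, an angle of 45 falls between the second and third values and is classified as the outbreak state.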
In another embodiment of the present invention, after updating the life cycle state of the information hotspot according to its current life cycle state, the method further includes:
if the updated life cycle state of the information hotspot is the extinction state, moving the information hotspot to a third list, the third list being used to store information hotspots in the extinction state.
All of the optional technical solutions above can be combined in any manner to form optional embodiments of the present invention, which are not described one by one here.
Based on the implementation environment shown in Fig. 1, an embodiment of the present invention provides a method for determining information hotspots. Referring to Fig. 3, the method flow provided by this embodiment includes:
301. The server obtains pending information.
The number of pieces of pending information may be 1000, 2000, 3000, and so on; this embodiment does not specifically limit the quantity. The pending information is mainly unstructured information whose form is not fixed and includes files in various formats, such as electronic documents, emails, video files, and multimedia.
The ways in which the server obtains pending information include, but are not limited to, the following: the server may obtain from local storage multiple pieces of information that occurred within the current time period and use them as the pending information; the server may also obtain multiple pieces of information that occurred within the current time period from an information database on the internet and use them as the pending information. The current time period may be measured in hours or in days; in this embodiment the unit is the day, that is, the current time period is the current day.
Because the obtained information comes in various forms that are very inconvenient to analyze, after obtaining the pending information the method provided by this embodiment also processes it, converting it into plain text.
302. The server clusters the pending information to obtain multiple classes.
In the field of information technology, comments on the same event, the same person, or the same report often share the same keywords. Grouping multiple pieces of information into one class based on these shared keywords, and managing information in units of classes, can improve information management efficiency. Therefore, in this embodiment, the server may cluster the pending information using a specified clustering algorithm to obtain multiple classes. The specified clustering algorithm may be a suffix tree clustering algorithm, a fuzzy clustering algorithm, or the like. Taking the suffix tree clustering algorithm as the specified clustering algorithm, the specific clustering process is:
First step: the server performs word segmentation on each piece of pending information to obtain multiple tokens, filters out the stop words among them, and takes the nouns, the verbs, and the tokens longer than two characters among the remaining tokens as the feature values of that piece of pending information. For any piece of pending information, the server calculates the TF-IDF (Term Frequency-Inverse Document Frequency) of each feature value within the information being clustered as the weight of that feature value, and forms the space vector of the pending information from its feature values.
Second step: for any two pieces of pending information, the server calculates the similarity of the space vectors corresponding to the two pieces; if that similarity is greater than a specified threshold, the two pieces of pending information can be determined to be similar.
The methods for calculating the similarity of the space vectors corresponding to two pieces of pending information include, but are not limited to, calculating the cosine of the two space vectors: if the cosine of the space vectors corresponding to the two pieces of pending information is greater than the specified threshold, the two pieces can be determined to be similar. For example, for any two pieces of pending information A and B, where the space vector corresponding to A is $(a_1, a_2, a_3, \ldots, a_n)$ and the space vector corresponding to B is $(b_1, b_2, b_3, \ldots, b_n)$, the server calculates the cosine of the two space vectors:
$$\cos(A, B) = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$
The specified threshold may be 0.3, 0.5, 0.6, and so on. This embodiment takes a specified threshold of 0.3 as an example: if the cosine value is greater than 0.3, pending information A and pending information B can be determined to be similar.
Third step: the server calculates the similarity between each other piece of pending information and either of these two pieces; if the similarity between the other piece and either of the two is greater than the specified threshold, the other piece is grouped into one class with the two pieces of pending information.
Clustering the pending information with this algorithm yields multiple classes. Each class corresponds to a class label, which is the space vector corresponding to that class.
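The three steps above can be sketched in a few functions. This is a simplified illustration that follows the vector-based steps as described and is not a suffix tree implementation. It assumes that the input has already been segmented and filtered into feature tokens, and it uses one common TF-IDF variant (relative term frequency times the logarithm of N divided by document frequency), which the description does not fix. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """First step: weight each feature token by TF-IDF (sparse vectors)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Second step: cosine of two sparse space vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_cluster(vectors: List[Dict[str, float]],
                   threshold: float = 0.3) -> List[List[int]]:
    """Third step: an item joins the first class that has a similar member."""
    clusters: List[List[int]] = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if any(cosine(vec, vectors[j]) > threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])  # no similar class found: start a new one
    return clusters
```

Each returned class is a list of indices into the input; a class label could then be derived from the vectors of its members.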
As an optional embodiment, to improve clustering speed, the number of classes is capped at 100 when the pending information is clustered. When any piece of pending information is dissimilar to all 100 classes obtained so far, that piece of pending information can be removed and not processed further.
As an optional embodiment, to improve calculation speed, the method provided by this embodiment can sort the obtained classes by the amount of information each class contains and, according to the sorting result, select a specified number of classes from them, so that only the selected classes are processed in subsequent steps. The specified number is determined by the processing capability of the server and can be 30, 40, 50, and so on; this embodiment takes a specified number of 30 as an example.
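The selection of a specified number of classes can be sketched as follows. The function name `select_largest_classes` is hypothetical, and sorting from largest to smallest is an assumption; the text only states that the classes are sorted by the amount of information they contain.

```python
def select_largest_classes(classes, specified_number=30):
    """Sort classes by how many pieces of information they contain
    (largest first, an assumption) and keep at most `specified_number`."""
    ranked = sorted(classes, key=len, reverse=True)
    return ranked[:specified_number]

classes = [["d1"], ["d2", "d3", "d4"], ["d5", "d6"]]
print(select_largest_classes(classes, specified_number=2))
# [['d2', 'd3', 'd4'], ['d5', 'd6']]
```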
303. For any class, the server calculates the similarity between the class and each information focus in the first list.
In this embodiment, the server maintains three management lists: a first list, a second list, and a third list. The first list is the active hotspot list, which stores the determined information focuses and records the attribute information of each information focus, including the information focus identifier, the information focus name, the amount of information the information focus contains, and so on. The second list is the to-be-confirmed hotspot list, which stores information focuses to be confirmed. The third list is the historical hotspot list, which stores information focuses in the extinction state.
In this embodiment, the information focuses in the first list can be stored in XML (Extensible Markup Language) format. The specific format is as follows:
<hotlist>
<hot>
<id>1</id>
<title>Title</title>
<status>Occur</status>
<infonum>
<date>2012-01-01</date><num>21</num>
<date>2012-01-02</date><num>51</num>
……
</infonum>
<docid>3,5,6……</docid>
</hot>
</hotlist>
Here, id denotes the number of the information focus and can be of numeric type; title denotes the title of the information focus and can be of string type; status denotes the current state of the information focus; infonum denotes the daily information amount of the information focus, where date denotes the date and num denotes the amount of information on that date; docid denotes the document numbers of the information included in the information focus, or the numbers of the corresponding database records, and can be of string type. When the information focuses in the first list are stored in XML, the ids in docid are separated by commas (",").
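Reading a record in this format can be sketched with the Python standard library as follows. The sample string reuses the values shown above with the elided parts left out, and the function name `parse_hotlist` and the dictionary keys are hypothetical; pairing each date with the num that follows it is an assumption about how the flat date/num sequence is meant to be read.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<hotlist>
  <hot>
    <id>1</id>
    <title>Title</title>
    <status>Occur</status>
    <infonum>
      <date>2012-01-01</date><num>21</num>
      <date>2012-01-02</date><num>51</num>
    </infonum>
    <docid>3,5,6</docid>
  </hot>
</hotlist>
"""

def parse_hotlist(xml_text):
    """Return one dict per <hot> element in the list."""
    focuses = []
    for hot in ET.fromstring(xml_text.strip()).findall("hot"):
        infonum = hot.find("infonum")
        dates = [d.text for d in infonum.findall("date")]
        nums = [int(n.text) for n in infonum.findall("num")]
        focuses.append({
            "id": int(hot.findtext("id")),
            "title": hot.findtext("title"),
            "status": hot.findtext("status"),
            # assumption: the i-th <date> belongs to the i-th <num>
            "daily_counts": dict(zip(dates, nums)),
            "doc_ids": hot.findtext("docid").split(","),
        })
    return focuses

focuses = parse_hotlist(SAMPLE)
print(focuses[0]["daily_counts"])
```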
For any class obtained by clustering in step 302 above, to determine whether the class is an information focus, the server can calculate the similarity between the class and each information focus in the first list. This similarity can be determined by calculating the cosine value of the class label of the class and the space vector corresponding to the information focus.
304. If the similarity between the class and any information focus in the first list is greater than a first threshold, the server adds the class to the information list corresponding to that information focus, and step 308 is performed.
The first threshold can be 0.3, 0.4, 0.6, and so on; this embodiment takes a first threshold of 0.3 as an example. In this embodiment, each information focus corresponds to an information list, which records each piece of information that the information focus includes.
When the similarity between the class and any information focus in the first list is greater than the first threshold, the class is similar to that information focus. The server can then determine that the class is an information focus and add the class to the information list corresponding to that information focus.
305. If the similarity between the class and every information focus in the first list is less than the first threshold, the server calculates the similarity between the class and each information focus to be confirmed in the second list.
When the similarity between the class and every information focus in the first list is less than the first threshold, the class is dissimilar to every information focus. To further determine whether the class is an information focus, the server also calculates the similarity between the class and each information focus to be confirmed in the second list. Specifically, this similarity can be determined by calculating the cosine value of the class label of the class and the space vector corresponding to each information focus to be confirmed.
306. If the similarity between the class and any information focus to be confirmed in the second list is greater than a second threshold, the server adds the class to the information list corresponding to that information focus to be confirmed and, when the information focus to be confirmed meets a preset condition, moves the information focus to be confirmed into the first list.
In this embodiment, each information focus to be confirmed corresponds to an information list, which records each piece of information that the information focus to be confirmed includes. When the similarity between the class and any information focus to be confirmed in the second list is greater than the second threshold, the server determines that the class belongs to that information focus to be confirmed and adds the class to its corresponding information list. When the information focus to be confirmed is determined to meet the preset condition, it is determined to be an information focus and is moved into the first list.
The preset condition includes, for example, the duration exceeding a preset duration; that is, within the preset duration, the angle between the change curve of the information focus to be confirmed and the horizontal axis exceeds a preset value, which can be 30 degrees, 40 degrees, and so on. The preset duration can be 2 days, 3 days, 4 days, and so on; this embodiment does not specifically limit the preset duration.
307. If the similarity between the class and every information focus to be confirmed in the second list is less than the second threshold, the server determines the class label of the class as a target information focus to be confirmed and adds the target information focus to be confirmed to the second list; when the target information focus to be confirmed meets the preset condition, the server moves it into the first list.
When the similarity between the class and every information focus to be confirmed in the second list is less than the second threshold, the class is dissimilar to every information focus to be confirmed in the second list. The class can then be determined as a target information focus to be confirmed, that is, a new information focus to be confirmed. When the target information focus to be confirmed is determined to meet the preset condition, the server can determine that it is an information focus and move it from the second list into the first list.
Steps 301 to 307 above are the discovery process of information focuses. To make this process easier to understand, it is illustrated below using Fig. 4 as an example.
Referring to Fig. 4, the server obtains the pending information and clusters it with a suffix tree clustering algorithm to obtain multiple classes. For the first n classes among them, the server calculates the similarity between any class and each information focus in the active hotspot list. If the similarity between the class and any information focus in the active hotspot list is greater than a threshold, the class is added to the information list corresponding to that information focus in the active hotspot list. If the similarity between the class and every information focus in the active hotspot list is less than the threshold, the server calculates the similarity between the class and each focus to be confirmed in the to-be-confirmed hotspot list. If the similarity between the class and every focus to be confirmed is less than a threshold, a new focus to be confirmed is generated and added to the to-be-confirmed hotspot list. If the similarity between the class and any focus to be confirmed is greater than the threshold, the class is added to the information list corresponding to that focus to be confirmed, and the server judges whether the duration of the information focus to be confirmed exceeds 2 days; if it does, the information focus to be confirmed is moved into the active hotspot list.
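The routing of one class through the active and to-be-confirmed hotspot lists can be sketched as follows. This is a hedged illustration, not the implementation of the embodiment: the `Focus` class, its `days_alive` counter, the function `assign_class`, the returned strings, and the value used for the second threshold are all hypothetical; the first threshold of 0.3 and the 2-day duration are the example values from the text.

```python
import math
from dataclasses import dataclass, field

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

@dataclass
class Focus:
    vector: tuple                              # class label (space vector)
    info: list = field(default_factory=list)   # information list
    days_alive: int = 0                        # hypothetical duration counter

def assign_class(label, docs, active, pending,
                 first_threshold=0.3, second_threshold=0.3, min_days=2):
    """Route one clustered class to the active list, the to-be-confirmed
    list, or a new to-be-confirmed focus; return a short description."""
    for focus in active:
        if cosine(label, focus.vector) > first_threshold:
            focus.info.extend(docs)
            return "added to active focus"
    for focus in pending:
        if cosine(label, focus.vector) > second_threshold:
            focus.info.extend(docs)
            if focus.days_alive > min_days:    # preset condition
                pending.remove(focus)
                active.append(focus)
                return "promoted to active list"
            return "added to pending focus"
    pending.append(Focus(vector=label, info=list(docs)))
    return "new pending focus"

active = [Focus(vector=(1, 0, 0))]
pending = [Focus(vector=(0, 1, 0), days_alive=3)]
print(assign_class((1, 1, 0), ["d1"], active, pending))    # added to active focus
print(assign_class((0, 1, 0.1), ["d2"], active, pending))  # promoted to active list
print(assign_class((0, 0, 1), ["d3"], active, pending))    # new pending focus
```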
308. The server manages the life cycle of the information focuses in the first list.
The focus life cycle describes the development trend and current state of an information focus. The life cycle states include the generating state, the development state, the outburst state, the weak state, and the extinction state. When an information focus is in the generating state, it has just appeared and its information amount grows very slowly; in the development state, the information focus keeps appearing and its information amount grows relatively slowly; in the outburst state, its information amount grows very quickly; in the weak state, its information amount shows negative growth; in the extinction state, its information amount shows sustained negative growth.
In this embodiment, to better grasp current trends in public opinion, the server also manages the life cycle of the information focuses in the first list. The specific management process includes the following steps 3081 to 3086:
3081. The server draws the change curve of the information focus with the information amount of the information focus as the vertical axis and time as the horizontal axis.
Referring to Fig. 5, which shows the change curve of an information focus, the horizontal axis represents time and the vertical axis represents the information amount of the information focus.
3082. The server obtains a first information amount of the information focus within the current time.
The server calculates the similarity between any class obtained by clustering in step 302 and each information focus in the first list, adds the class to the information list corresponding to the information focus that is similar to it, and counts the information newly added to the information focus within the current time to obtain the first information amount of the information focus within the current time. It then updates the attribute information of the information focus in the first list according to the first information amount.
Referring to Fig. 6, the server obtains multiple classes using the suffix tree clustering algorithm. For any class, the server calculates the similarity between the class and each information focus in the active hotspot list. If the similarity between the class and any information focus is greater than the threshold, the class is added to the information list corresponding to that information focus, the amount of information newly added on the current day is counted, and the active hotspot list is updated according to that amount.
3083. The server obtains a second information amount of the information focus within a specified time.
The specified time is separated from the current time by a preset duration, which can be 2 days, 3 days, 4 days, and so on; this embodiment takes a preset duration of 3 days as an example. After obtaining the first information amount within the current time, the server starts from the current time, selects the time 3 days before the current time as the specified time, and counts the second information amount of the information focus within the specified time.
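Looking up the two information amounts from per-day counts can be sketched as follows. The function name `information_amounts`, the dictionary layout, and the sample counts are hypothetical, and treating a day with no record as zero is an assumption that the text does not address.

```python
from datetime import date, timedelta

def information_amounts(daily_counts, current_day, preset_days=3):
    """Return (first_amount, second_amount): the count on the current
    day and the count on the day `preset_days` earlier. Days with no
    record count as 0 (an assumption)."""
    specified_day = current_day - timedelta(days=preset_days)
    first_amount = daily_counts.get(current_day, 0)
    second_amount = daily_counts.get(specified_day, 0)
    return first_amount, second_amount

counts = {date(2012, 1, 1): 21, date(2012, 1, 4): 51}
print(information_amounts(counts, date(2012, 1, 4)))  # (51, 21)
```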
3084. The server calculates the growth acceleration of the information focus within the preset duration according to the first information amount, the second information amount, and the preset duration.
Based on the first information amount, the second information amount, and the preset duration, the server can calculate the slope of the change curve of the information focus and then determine the angle between the change curve and the horizontal axis. Let the first information amount be $y_2$, the second information amount be $y_1$, the current time be $x_2$, and the specified time be $x_1$. The slope of the change curve of the information focus is then

$$k = \frac{y_2 - y_1}{x_2 - x_1}$$

From this slope, the angle $\theta$ between the change curve of the information focus and the horizontal axis, which satisfies $\tan\theta = k$, can be obtained.
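The slope and angle calculation can be sketched as follows. The function name `curve_angle_degrees` is hypothetical. Reporting the angle in the range [0, 180) degrees is an assumption, chosen because the preset values used later exceed 90 degrees, which only a falling curve can produce. The result also depends on the units chosen for the two axes, which the text does not fix.

```python
import math

def curve_angle_degrees(y1, y2, preset_duration):
    """Angle in degrees, in [0, 180), between the horizontal axis and the
    line through (x1, y1) and (x2, y2), where x2 - x1 = preset_duration."""
    if preset_duration <= 0:
        raise ValueError("preset_duration must be positive")
    # atan2 yields an angle in (-90, 90) here; the modulo maps a falling
    # line (negative slope) into (90, 180).
    return math.degrees(math.atan2(y2 - y1, preset_duration)) % 180

print(round(curve_angle_degrees(21, 24, 3), 1))  # 45.0
print(round(curve_angle_degrees(24, 21, 3), 1))  # 135.0
```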
3085. The server determines the current life cycle state of the information focus according to the angle between the change curve and the horizontal axis.
The determination of the current life cycle state of the information focus according to the angle between the change curve and the horizontal axis includes the following cases:
Case 1: if the angle between the change curve and the horizontal axis is less than a first preset value, the server determines that the current life cycle state of the information focus is the generating state.
The first preset value is less than 90 degrees and can be 30 degrees, 40 degrees, and so on. When the angle between the change curve and the horizontal axis, determined according to the first information amount, the second information amount, and the preset duration, is less than the first preset value, the server can determine that the current life cycle state of the information focus is the generating state.
Case 2: if the angle between the change curve and the horizontal axis is greater than the first preset value and less than a second preset value, the server determines that the current life cycle state of the information focus is the development state.
The second preset value is greater than the first preset value and less than 90 degrees, and can be 50 degrees, 60 degrees, and so on. When the angle between the change curve and the horizontal axis, determined according to the first information amount, the second information amount, and the preset duration, is greater than the first preset value and less than the second preset value, the server can determine that the current life cycle state of the information focus is the development state.
Case 3: if the angle between the change curve and the horizontal axis is greater than the second preset value and less than a third preset value, the server determines that the current life cycle state of the information focus is the outburst state.
The third preset value is greater than 90 degrees and can be 120 degrees, 130 degrees, and so on. When the angle between the change curve and the horizontal axis, determined according to the first information amount, the second information amount, and the preset duration, is greater than the second preset value and less than the third preset value, the server can determine that the current life cycle state of the information focus is the outburst state.
Case 4: if the angle between the change curve and the horizontal axis is greater than the third preset value and less than a fourth preset value, the server determines that the current life cycle state of the information focus is the weak state.
The fourth preset value is greater than the third preset value and greater than 90 degrees, and can be 140 degrees, 150 degrees, and so on. When the angle between the change curve and the horizontal axis, determined according to the first information amount, the second information amount, and the preset duration, is greater than the third preset value and less than the fourth preset value, the server can determine that the current life cycle state of the information focus is the weak state.
Case 5: if the angle between the change curve and the horizontal axis is greater than the fourth preset value, the server determines that the current life cycle state of the information focus is the extinction state.
When the angle between the change curve and the horizontal axis, determined according to the first information amount, the second information amount, and the preset duration, is greater than the fourth preset value, the server can determine that the current life cycle state of the information focus is the extinction state.
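The five cases can be condensed into one function, sketched below. The function name `life_cycle_state` and the state strings are hypothetical. The default preset values of 30, 50, 120, and 140 degrees are taken from the example values in the text. How an angle exactly equal to a preset value is classified is not specified, so assigning it to the later state is an assumption.

```python
def life_cycle_state(angle, presets=(30, 50, 120, 140)):
    """Map the angle (degrees) between the change curve and the
    horizontal axis to a life cycle state."""
    first, second, third, fourth = presets
    if not first < second < third < fourth:
        raise ValueError("preset values must be strictly increasing")
    if angle < first:
        return "generating"
    if angle < second:
        return "development"
    if angle < third:
        return "outburst"
    if angle < fourth:
        return "weak"
    return "extinction"

for angle in (10, 45, 100, 130, 170):
    print(angle, life_cycle_state(angle))
# 10 generating
# 45 development
# 100 outburst
# 130 weak
# 170 extinction
```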
3086. The server updates the life cycle state of the information focus according to the current life cycle state of the information focus.
Because the first list in this embodiment records the life cycle state of each information focus, after the server determines the current life cycle state of the information focus, it can update the life cycle state of the information focus in the first list according to the determined current life cycle state.
When the life cycle state of an information focus in the first list is updated, if the updated life cycle state of the information focus is the extinction state, the server needs to move the information focus to the third list, which stores information focuses in the extinction state.
It should be noted that the above description takes the management of the life cycle of one information focus in the first list as an example. The life cycles of the other information focuses in the first list can be managed in the same way, which is not repeated here.
The management process of the life cycle of information focuses described above is shown in Fig. 7.
Referring to Fig. 7, for each information focus in the active hotspot list, the server counts the information amount of the information focus in units of days. Starting from the current time, it obtains the information amount within the specified time, which is 3 days before the current time, and calculates the slope of the change curve of the information focus from the information amount within the current time, the information amount within the specified time, and the 3-day interval. From the slope it determines the angle between the change curve of the information focus and the horizontal axis; this angle is also called the growth acceleration of the information focus. The server then determines the current life cycle state of the information focus according to the growth acceleration. If the current life cycle state is one of the generating state, the development state, the outburst state, and the weak state, the life cycle state of the information focus in the active hotspot list is updated; if the current life cycle state is the extinction state, the information focus is moved to the historical hotspot list.
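The final update step of this process can be sketched as follows. The function name `update_life_cycle`, the use of dictionaries keyed by focus id, and the state strings are hypothetical; the sketch only illustrates that non-extinct focuses are updated in place while extinct ones leave the active hotspot list for the historical hotspot list.

```python
def update_life_cycle(active, history, new_states):
    """`active` and `history` map focus id -> life cycle state;
    `new_states` maps focus id -> newly determined state."""
    for focus_id, state in new_states.items():
        if focus_id not in active:
            continue  # ignore ids that are not in the active list
        if state == "extinction":
            del active[focus_id]
            history[focus_id] = state
        else:
            active[focus_id] = state

active = {1: "generating", 2: "weak"}
history = {}
update_life_cycle(active, history, {1: "development", 2: "extinction"})
print(active, history)  # {1: 'development'} {2: 'extinction'}
```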
The method provided by the embodiment of the present invention does not determine information focuses from the clustering result alone. Instead, a clustering result is determined to be an information focus when its similarity to an information focus in the first list is greater than the first threshold, or when its similarity to a focus to be confirmed in the second list is greater than the second threshold and the preset condition is met, which improves the accuracy of the determined information focuses.
Referring to Fig. 8, an embodiment of the present invention provides a device for determining an information focus. The device includes:
A cluster module 801, configured to cluster pending information to obtain multiple classes;
A computing module 802, configured to calculate, for any class, the similarity between the class and each information focus in a first list, the first list being used to store information focuses;
An add module 803, configured to add the class to the information list corresponding to an information focus when the similarity between the class and any information focus in the first list is greater than a first threshold;
The computing module 802, further configured to calculate the similarity between the class and each information focus to be confirmed in a second list when the similarity between the class and every information focus in the first list is less than the first threshold, the second list being used to store information focuses to be confirmed;
The add module 803, configured to add the class to the information list corresponding to an information focus to be confirmed when the similarity between the class and any information focus to be confirmed in the second list is greater than a second threshold;
A mobile module 804, configured to move the information focus to be confirmed into the first list when the information focus to be confirmed meets a preset condition.
In another embodiment of the present invention, each class has a class label, and the device further includes:
A first determining module, configured to determine the class label of the class as a target information focus to be confirmed when the similarity between the class and every information focus to be confirmed in the second list is less than the second threshold;
The add module 803, configured to add the target information focus to be confirmed to the second list;
The mobile module 804, configured to move the target information focus to be confirmed into the first list when the target information focus to be confirmed meets the preset condition.
In another embodiment of the present invention, the device further includes:
A drafting module, configured to draw the change curve of the information focus with the information amount of the information focus as the vertical axis and time as the horizontal axis;
An acquisition module, configured to obtain a first information amount of the information focus within the current time;
The acquisition module, further configured to obtain a second information amount of the information focus within a specified time, the specified time being separated from the current time by a preset duration;
A computing module, configured to calculate the angle between the change curve and the horizontal axis according to the first information amount, the second information amount, and the preset duration;
A second determining module, configured to determine the current life cycle state of the information focus according to the angle between the change curve and the horizontal axis;
An update module, configured to update the life cycle state of the information focus according to the current life cycle state of the information focus.
In another embodiment of the present invention, the life cycle states include the generating state, the development state, the outburst state, the weak state, and the extinction state;
The second determining module is configured to: determine that the current life cycle state of the information focus is the generating state when the angle between the change curve and the horizontal axis is less than a first preset value; determine that the current life cycle state is the development state when the angle is greater than the first preset value and less than a second preset value; determine that the current life cycle state is the outburst state when the angle is greater than the second preset value and less than a third preset value; determine that the current life cycle state is the weak state when the angle is greater than the third preset value and less than a fourth preset value; and determine that the current life cycle state is the extinction state when the angle is greater than the fourth preset value. Here, the first preset value is less than the second preset value, the second preset value is less than the third preset value, and the third preset value is less than the fourth preset value.
In another embodiment of the present invention, the mobile module 804 is further configured to move the information focus to a third list when the updated life cycle state of the information focus is the extinction state, the third list being used to store information focuses in the extinction state.
The device provided by the embodiment of the present invention does not determine information focuses from the clustering result alone. Instead, a clustering result is determined to be an information focus when its similarity to an information focus in the first list is greater than the first threshold, or when its similarity to a focus to be confirmed in the second list is greater than the second threshold and the preset condition is met, which improves the accuracy of the determined information focuses.
It should be noted that when the device for determining an information focus provided by the above embodiment determines an information focus, the division into the functional modules described above is used only as an example. In practical applications, the above functions can be assigned to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the device for determining an information focus provided by the above embodiment and the embodiment of the method for determining an information focus belong to the same concept; for the specific implementation process of the device, refer to the method embodiment, which is not repeated here.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments can be implemented by hardware, or by a program instructing the related hardware. The program can be stored in a computer-readable storage medium, and the storage medium can be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (10)
- 1. A method for determining an information focus, characterized in that the method comprises: clustering pending information to obtain multiple classes; for any class, calculating the similarity between the class and each information focus in a first list, the first list being used to store information focuses; if the similarity between the class and any information focus in the first list is greater than a first threshold, adding the class to the information list corresponding to the information focus; if the similarity between the class and every information focus in the first list is less than the first threshold, calculating the similarity between the class and each information focus to be confirmed in a second list, the second list being used to store information focuses to be confirmed; and if the similarity between the class and any information focus to be confirmed in the second list is greater than a second threshold, adding the class to the information list corresponding to the information focus to be confirmed and, when the information focus to be confirmed meets a preset condition, moving the information focus to be confirmed into the first list.
- 2. The method according to claim 1, characterized in that each class has a class label, and after the calculating of the similarity between the class and each information focus to be confirmed in the second list, the method further comprises: if the similarity between the class and every information focus to be confirmed in the second list is less than the second threshold, determining the class label of the class as a target information focus to be confirmed; and adding the target information focus to be confirmed to the second list and, when the target information focus to be confirmed meets the preset condition, moving the target information focus to be confirmed into the first list.
- 3. The method according to claim 1, characterized in that after the adding of the class to the information list corresponding to the information focus, the method further comprises: drawing the change curve of the information focus with the information amount of the information focus as the vertical axis and time as the horizontal axis; obtaining a first information amount of the information focus within the current time; obtaining a second information amount of the information focus within a specified time, the specified time being separated from the current time by a preset duration; calculating the angle between the change curve and the horizontal axis according to the first information amount, the second information amount, and the preset duration; determining the current life cycle state of the information focus according to the angle between the change curve and the horizontal axis; and updating the life cycle state of the information focus according to the current life cycle state of the information focus.
- 4. The method according to claim 3, characterized in that the life cycle states include a generating state, a development state, an outburst state, a weak state, and an extinction state; and the determining of the current life cycle state of the information focus according to the angle between the change curve and the horizontal axis comprises: if the angle between the change curve and the horizontal axis is less than a first preset value, determining that the current life cycle state of the information focus is the generating state; if the angle is greater than the first preset value and less than a second preset value, determining that the current life cycle state of the information focus is the development state; if the angle is greater than the second preset value and less than a third preset value, determining that the current life cycle state of the information focus is the outburst state; if the angle is greater than the third preset value and less than a fourth preset value, determining that the current life cycle state of the information focus is the weak state; and if the angle is greater than the fourth preset value, determining that the current life cycle state of the information focus is the extinction state; wherein the first preset value is less than the second preset value, the second preset value is less than the third preset value, and the third preset value is less than the fourth preset value.
- 5. The method according to claim 3, characterized in that after the updating of the life cycle state of the information focus according to the current life cycle state of the information focus, the method further comprises: if the updated life cycle state of the information focus is the extinction state, moving the information focus to a third list, the third list being used to store information focuses in the extinction state.
- 6. A device for determining an information focus, characterized in that the device comprises: a cluster module, configured to cluster pending information to obtain multiple classes; a computing module, configured to calculate, for any class, the similarity between the class and each information focus in a first list, the first list being used to store information focuses; an add module, configured to add the class to the information list corresponding to the information focus when the similarity between the class and any information focus in the first list is greater than a first threshold; the computing module, further configured to calculate the similarity between the class and each information focus to be confirmed in a second list when the similarity between the class and every information focus in the first list is less than the first threshold, the second list being used to store information focuses to be confirmed; the add module, configured to add the class to the information list corresponding to the information focus to be confirmed when the similarity between the class and any information focus to be confirmed in the second list is greater than a second threshold; and a mobile module, configured to move the information focus to be confirmed into the first list when the information focus to be confirmed meets a preset condition.
- 7. The device according to claim 6, characterized in that each class has a class label, and the device further includes: a first determining module, for determining the class label of the class as a target information focus to be confirmed when the similarity between the class and each information focus to be confirmed in the second list is less than the second threshold; the add module being used to add the target information focus to be confirmed to the second list; and the mobile module being used to move the target information focus to be confirmed into the first list when the target information focus to be confirmed meets the preparatory condition.
- 8. The device according to claim 6, characterized in that the device further includes: a drafting module, for drawing the change curve of the information focus with the information amount of the information focus as the longitudinal axis and time as the transverse axis; an acquisition module, for obtaining a first information amount of the information focus at the current time; the acquisition module being further used to obtain a second information amount of the information focus at a specified time, the specified time being separated from the current time by a preset duration; the computing module being used to calculate the angle between the change curve and the transverse axis according to the first information amount, the second information amount, and the preset duration; a second determining module, for determining the current life cycle state of the information focus according to the angle between the change curve and the transverse axis; and an update module, for updating the life cycle state of the information focus according to the current life cycle state of the information focus.
- 9. The device according to claim 8, characterized in that the life cycle state includes a generating state, a development state, an outburst state, a weak state, and an extinction state; the second determining module is used to: determine that the current life cycle state of the information focus is the generating state when the angle between the change curve and the transverse axis is less than a first default value; determine that the current life cycle state of the information focus is the development state when the angle is greater than the first default value and less than a second default value; determine that the current life cycle state of the information focus is the outburst state when the angle is greater than the second default value and less than a third default value; determine that the current life cycle state of the information focus is the weak state when the angle is greater than the third default value and less than a fourth default value; and determine that the current life cycle state of the information focus is the extinction state when the angle is greater than the fourth default value; wherein the first default value is less than the second default value, the second default value is less than the third default value, and the third default value is less than the fourth default value.
- 10. The device according to claim 8, characterized in that the mobile module is further used to move the information focus to a third list when the updated life cycle state of the information focus is the extinction state, the third list being used to store information focuses in the extinction state.
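The life cycle logic recited in claims 4, 5, and 8 to 10 can be illustrated with a minimal Python sketch. It is not the patentee's implementation. The claims do not give the angle formula, so `curve_angle` assumes the angle is the arctangent of the slope between the two sampled information amounts. The function names, the dictionary representation of an information focus, the four numeric default values, and the handling of an angle exactly equal to a default value are all hypothetical choices made for illustration.

```python
import math
from bisect import bisect_right

# State names follow the claims; their order matches the four ascending default values.
STATES = ("generating", "development", "outburst", "weak", "extinction")


def curve_angle(first_amount, second_amount, preset_duration):
    """Angle, in degrees, between the change curve and the transverse (time) axis.

    Assumption: the angle is the arctangent of the slope between the two samples.
    The claims do not state the formula.
    """
    if preset_duration <= 0:
        raise ValueError("preset_duration must be positive")
    return math.degrees(math.atan2(first_amount - second_amount, preset_duration))


def classify_state(angle, default_values):
    """Map an angle to a life cycle state using four ascending default values."""
    values = list(default_values)
    if len(values) != 4 or any(a >= b for a, b in zip(values, values[1:])):
        raise ValueError("expected four strictly ascending default values")
    # Assumption: an angle exactly equal to a default value falls in the higher band.
    return STATES[bisect_right(values, angle)]


def update_focus(focus, state, first_list, third_list):
    """Record the new state and move an extinct focus to the third list."""
    focus["state"] = state
    if state == "extinction" and focus in first_list:
        first_list.remove(focus)
        third_list.append(focus)


if __name__ == "__main__":
    hypothetical_defaults = (20.0, 40.0, 60.0, 80.0)  # illustrative values only
    focus = {"name": "example focus", "state": None}
    first_list, third_list = [focus], []
    angle = curve_angle(first_amount=120, second_amount=20, preset_duration=100)
    update_focus(focus, classify_state(angle, hypothetical_defaults), first_list, third_list)
    print(round(angle, 1), focus["state"])
```

Using `bisect_right` over the sorted default values keeps the five-way comparison in one lookup; a chain of `if` statements mirroring the claim wording would behave the same way.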
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2016103547372 | 2016-05-26 | ||
CN201610354737 | 2016-05-26 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106570140A (en) | 2017-04-19 |
CN106570140B (en) | 2018-03-02 |
Family
ID=58536136
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610964928.0A Active CN106570140B (en) | 2016-05-26 | 2016-11-04 | Determine the method and device of information focus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106570140B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110147482B (en) * | 2017-09-11 | 2021-06-22 | 上海优扬新媒信息技术有限公司 | Method and device for acquiring burst hotspot theme |
CN108647222B (en) * | 2018-03-22 | 2021-01-08 | 中国互联网络信息中心 | Line three-dimensional roaming hotspot icon positioning method and system |
CN109257700B (en) * | 2018-11-19 | 2020-11-06 | 广东小天才科技有限公司 | Positioning method, server and system based on positioning deviation rectification |
CN112966505B (en) * | 2021-01-21 | 2021-10-15 | 哈尔滨工业大学 | Method, device and storage medium for extracting persistent hot phrases from text corpus |
CN114153915A (en) * | 2021-09-10 | 2022-03-08 | 北京天德科技有限公司 | Method and system for tracing and tracing information in block chain |
CN113836307B (en) * | 2021-10-15 | 2024-02-20 | 国网北京市电力公司 | Power supply service work order hot spot discovery method, system, device and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101477556A (en) * | 2009-01-22 | 2009-07-08 | 苏州智讯科技有限公司 | Method for discovering hot sport in internet mass information |
CN101661513A (en) * | 2009-10-21 | 2010-03-03 | 上海交通大学 | Detection method of network focus and public sentiment |
CN102982110A (en) * | 2012-11-08 | 2013-03-20 | 中国科学院自动化研究所 | Method for extracting hot spot event information of cyberspace in physical space |
Also Published As
Publication number | Publication date |
---|---|
CN106570140A (en) | 2017-04-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106570140B (en) | Determine the method and device of information focus | |
CN103365924B (en) | A kind of method of internet information search, device and terminal | |
JP5802745B2 (en) | Intelligent navigation method, apparatus and system | |
CN105488092B (en) | A kind of time-sensitive and adaptive sub-topic online test method and system | |
CN104008106B (en) | A kind of method and device obtaining much-talked-about topic | |
US20080147642A1 (en) | System for discovering data artifacts in an on-line data object | |
US20150154306A1 (en) | Method for searching related entities through entity co-occurrence | |
CN101957816A (en) | Webpage metadata automatic extraction method and system based on multi-page comparison | |
CN103279543B (en) | Path mode inquiring system for massive image data | |
CN107256263A (en) | Internet hot spots information automatic monitoring method | |
US10417334B2 (en) | Systems and methods for providing a microdocument framework for storage, retrieval, and aggregation | |
TW201415254A (en) | Method and system for recommending semantic annotations | |
CN108875065A (en) | A kind of Indonesia's news web page recommended method based on content | |
Nakashole et al. | Real-time population of knowledge bases: opportunities and challenges | |
Moro et al. | Early Profile Pruning on XML-aware Publish/Subscribe Systems. | |
Zhang et al. | Continuous top-k monitoring on document streams | |
Khodaei et al. | Temporal-textual retrieval: Time and keyword search in web documents | |
US20220374482A1 (en) | Dynamic taxonomy builder and smart feed compiler | |
CN106250456A (en) | Bid winning announcement extraction method and device | |
Zhang | Start small, build complete: Effective and efficient semantic table interpretation using tableminer | |
Sharma et al. | Shallow neural network and ontology-based novel semantic document indexing for information retrieval | |
Jin et al. | Tise: A temporal search engine for web contents | |
Kumar et al. | Term-frequency inverse-document frequency definition semantic (TIDS) based focused web crawler | |
US20130124509A1 (en) | Publish-subscribe based methods and apparatuses for associating data files | |
US20150154195A1 (en) | Method for entity-driven alerts based on disambiguated features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20180705. Address after: 230088 room 609-6, R & D center of China (Hefei) International Intelligent Speech Industrial Park, 3333, hi tech Road, Hefei, Anhui. Patentee after: Anhui Tai Yue Xiang Sheng Software Co., Ltd. Address before: 100089 Beijing Haidian District Haidian District 25 East Road three units 6 units. Patentee before: China Science and Technology (Beijing) Co., Ltd. |