CN106844466A - Event train of thought generation method and device - Google Patents

Info

Publication number
CN106844466A
Authority
CN
China
Prior art keywords
resource
time window
assessment models
subelement
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611193377.9A
Other languages
Chinese (zh)
Inventor
莫洋
沈剑平
黄强
郑景耀
骆金昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201611193377.9A
Publication of CN106844466A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an event train of thought generation method and device. The method includes: for a pending event, obtaining the resources in each time window respectively; for each time window, determining the prominence score of each resource in the time window respectively, selecting from the resources in the time window the resources whose prominence scores meet a predetermined requirement, and taking the selected resources as the representative resources of the time window; and combining the representative resources of the time windows in chronological order to obtain the event train of thought. The scheme of the invention can improve the user's information acquisition efficiency.

Description

Event train of thought generation method and device
【Technical field】
The present invention relates to network technology, and in particular to an event train of thought generation method and device.
【Background】
Currently, when a user searches with a search engine or a similar tool, for example for a certain event, the resources related to the event, such as news resources, can only be processed in a predetermined way (for example, ranked) and then shown to the user.
If the user wants to understand the main course of the event's development, the user has to find and read the corresponding resources one by one, which is very difficult in practice and reduces the user's information acquisition efficiency.
【Summary of the invention】
In view of this, the invention provides an event train of thought generation method and device that can improve the user's information acquisition efficiency.
The specific technical solution is as follows:
An event train of thought generation method, including:
for a pending event, obtaining the resources in each time window respectively;
for each time window, determining the prominence score of each resource in the time window respectively, selecting from the resources in the time window the resources whose prominence scores meet a predetermined requirement, and taking the selected resources as the representative resources of the time window;
combining the representative resources of the time windows in chronological order to obtain the event train of thought.
An event train of thought generating device, including: a processing unit;
the processing unit is configured to: obtain, for a pending event, the resources in each time window respectively; for each time window, determine the prominence score of each resource in the time window respectively, select from the resources in the time window the resources whose prominence scores meet a predetermined requirement, and take the selected resources as the representative resources of the time window; and combine the representative resources of the time windows in chronological order to obtain the event train of thought.
As can be seen from the above, with the scheme of the invention, the resources in each time window can be obtained for a pending event, the representative resources that best reflect the progress of the event can be selected from each time window, and the representative resources selected for the time windows are combined to obtain the event train of thought. When a user searches with, for example, a search engine, the event train of thought can then be shown to the user directly, which overcomes the problems of the prior art and improves the user's information acquisition efficiency.
【Brief description of the drawings】
Fig. 1 is a flow chart of an embodiment of the event train of thought generation method of the present invention.
Fig. 2 is a schematic diagram of the resources obtained in a time window according to the present invention.
Fig. 3 is a schematic diagram of generating an event train of thought according to the present invention.
Fig. 4 is a schematic diagram of the event train of thought corresponding to the "star A divorces" event according to the present invention.
Fig. 5 is a schematic structural diagram of an embodiment of the event train of thought generating device of the present invention.
【Detailed description】
To address the problems in the prior art, the present invention proposes an event train of thought generation scheme that can effectively filter out, from a large number of resources, the representative resources that best reflect the progress of the event, and automatically generate an event train of thought to show to the user.
To make the technical solution of the invention clearer, the scheme of the invention is described in further detail below with reference to the drawings and embodiments.
Embodiment one
Fig. 1 is a flow chart of an embodiment of the event train of thought generation method of the present invention. As shown in Fig. 1, it includes the following specific implementation:
In 11, for a pending event, the resources in each time window are obtained respectively;
In 12, for each time window, the prominence score of each resource in the time window is determined respectively, the resources whose prominence scores meet a predetermined requirement are selected from the resources in the time window, and the selected resources are taken as the representative resources of the time window;
In 13, the representative resources of the time windows are combined in chronological order to obtain the event train of thought.
The resources may be news resources and the like.
To implement the above scheme, training samples need to be obtained in advance and assessment models trained from them. Then, for a pending event, taking the time window as the unit, the prominence score of each resource obtained in each time window is determined according to the assessment models, the resources whose prominence scores meet the predetermined requirement are selected from the resources in each time window and taken as the representative resources of that time window, and the representative resources of the time windows are combined in chronological order to obtain the event train of thought.
Each of the above parts is described in detail below.
One) Training samples
To obtain the assessment models used later, training samples must first be obtained.
In the scheme of the invention, a pairwise method can be used: from a number of resources that have a time order, the several resources that best reflect the development of the event are selected, which yields the better-or-worse relationship between the selected resources and the unselected ones, and training samples are then generated from that relationship.
For example, the resources in any time window of any event can be displayed, the high-quality resources selected from the displayed resources are obtained, each high-quality resource is paired with each displayed non-high-quality resource to form a resource pair, and a training sample is generated for each resource pair.
Taking the "star A divorces" event as an example, the whole event keeps developing over time, and the resources in each time window can be obtained respectively. A time window is one of the consecutive time periods obtained by cutting the time axis of the whole event's development into multiple periods (for example, periods of equal duration).
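The equal-duration cutting mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `(resource id, publication time)` tuple shape, the function name `split_into_windows`, and the choice to start window 0 at the earliest publication time are all hypothetical and are not specified in the patent text.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Hypothetical resource shape: (resource id, publication time).
Resource = Tuple[str, datetime]


def split_into_windows(resources: List[Resource],
                       width: timedelta) -> Dict[int, List[Resource]]:
    """Group resources into consecutive time windows of equal duration.

    Window 0 starts at the earliest publication time (an assumption);
    window k covers [start + k * width, start + (k + 1) * width).
    """
    if width <= timedelta(0):
        raise ValueError("width must be positive")
    if not resources:
        return {}
    start = min(published for _, published in resources)
    windows: Dict[int, List[Resource]] = defaultdict(list)
    for resource in resources:
        # Dividing one timedelta by another with // yields an integer index.
        index = (resource[1] - start) // width
        windows[index].append(resource)
    return dict(windows)
```

Windows that contain no resources simply do not appear in the returned dictionary; whether empty windows should be kept is a design choice the text leaves open.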
Fig. 2 is a schematic diagram of the resources obtained in a time window according to the present invention. As shown in Fig. 2, these resources can be shown to sample collection personnel, who pick the 2 resources they consider best reflect the ins and outs of the "star A divorces" event; the picked resources are taken as the high-quality resources.
Each high-quality resource can then be paired with each displayed non-high-quality resource to form a resource pair.
For example, if the selected high-quality resources are resource 1 and resource 2 shown in Fig. 2, the following resource pairs are obtained: (resource 1, resource 3), (resource 1, resource 4), (resource 1, resource 5), (resource 2, resource 3), (resource 2, resource 4), (resource 2, resource 5), and so on.
Next, a training sample can be generated for each resource pair. Each training sample may include: the features extracted from each of the two resources of a resource pair, and the result of determination as to which of the two resources is better.
That is, for each resource pair, features are extracted from each resource of the pair and combined with the result of determination for the two resources to generate one training sample.
The result of determination can be represented by 1 or 0: for example, if the former resource of a resource pair is better than the latter, the result of determination can be 1; conversely, if the latter resource is better than the former, the result of determination can be 0.
Thus, taking the two resource pairs (resource 1, resource 3) and (resource 2, resource 4) as an example, the corresponding training samples are (features of resource 1, features of resource 3, 1) and (features of resource 2, features of resource 4, 1) respectively.
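The pairing and labelling just described might be sketched as below. The function name `build_training_samples`, the `extract` callback and the `mirror` flag are hypothetical. In particular, the text only shows samples labelled 1 (high-quality resource first); emitting the swapped pair with label 0 is an assumption added here so that a classifier would see both labels.

```python
from typing import Callable, List, Sequence, Set, Tuple

Features = Tuple[float, ...]
Sample = Tuple[Features, Features, int]


def build_training_samples(shown: Sequence[str],
                           selected: Set[str],
                           extract: Callable[[str], Features],
                           mirror: bool = False) -> List[Sample]:
    """Pair every selected (high-quality) resource with every unselected one.

    Each sample is (features of first, features of second, label), where
    label 1 means the first resource of the pair is the better one.
    """
    good = [r for r in shown if r in selected]
    rest = [r for r in shown if r not in selected]
    samples: List[Sample] = []
    for g in good:
        for o in rest:
            samples.append((extract(g), extract(o), 1))
            if mirror:
                # Assumption, not in the source: also emit the swapped pair.
                samples.append((extract(o), extract(g), 0))
    return samples
```

With 5 displayed resources of which 2 are selected, this yields the 6 pairs listed in the example above, which is why a small amount of selection work produces comparatively many samples.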
With this approach, only the resources of one time window are shown to the sample collection personnel at a time, and they pick the several best resources from them. When making the selection they can therefore take full account of the temporal background of the event train of thought, that is, they consider not only the relevance of a resource but also its importance to the train of thought. In addition, this approach lets the sample collection personnel obtain a large number of training samples with little work, which improves sample collection efficiency.
Two) Feature extraction
The features extracted from each resource include, but are not limited to, one or any combination of the following; preferably, all of the following features are extracted:
plain text feature, resource popularity feature, search popularity feature, similar resource number feature.
1) Plain text feature
How to obtain the plain text feature of a resource is prior art. For example, the plain text feature can be extracted with a bag-of-words (Bag of words) method, using term frequency-inverse document frequency (TF-IDF, Term Frequency-Inverse Document Frequency) weighting.
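As a rough illustration of bag-of-words TF-IDF weighting, the sketch below uses one common variant: term frequency as count divided by document length, and inverse document frequency as the natural logarithm of (number of documents / number of documents containing the term). The patent does not say which variant is used, so this choice, the function name `tfidf_vectors`, and the assumption that each resource has already been tokenized are all illustrative.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq: Counter = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    vectors: List[Dict[str, float]] = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors
```

Under this variant a term that occurs in every document gets weight 0, so words shared by all resources about the same event contribute nothing to the feature vector.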
2) Resource popularity feature
This feature mainly reflects the number of times a resource has been clicked and read; how to obtain it is likewise prior art.
3) Search popularity feature
At the key nodes of an event train of thought, people tend to search for the event. By analyzing, for example, Baidu search logs, the point in time at which the search volume for a certain keyword reaches its peak can be found, and the resources corresponding to that point in time often have greater significance in the evolution of the event.
For two different resources that both correspond to the keyword "star A divorces", the publication times differ, so the search popularity of the keyword at the time each resource was published also differs. The search popularity can therefore be used as an important feature of a resource.
4) Similar resource number feature
On the internet, important resources are often reposted in different forms with generally similar content. Therefore, by mining massive internet data, the number of resources similar to each resource can be extracted and used as a feature of that resource, reflecting its importance from another angle.
Beyond what is described above, how to obtain the search popularity feature and the similar resource number feature of a resource is prior art.
Three) Model training
Once enough training samples have been obtained, the required assessment models can be trained from them; how to perform the training is prior art.
The number of assessment models can be one, or, to improve the accuracy of the assessment results, more than one; the specific number can be decided according to actual needs.
Each assessment model can be trained from the obtained training samples.
Each assessment model is a pairwise binary classification model, that is, it can judge the better-or-worse relationship between one resource and another.
The assessment models may include, but are not limited to, one or any combination of the following: a support vector machine (SVM, Support Vector Machine) model, a logistic regression (Logistic Regression) model, a random forest (Random Forest) model, etc.
Four) Event train of thought generation
For a pending event, the resources in each time window can be obtained respectively.
For each time window, the prominence score of each resource in the time window can be determined according to the assessment models.
Taking any time window as an example, the following processing can be carried out for each resource in the time window:
A) take the resource as the resource to be assessed, and pair it with each of the other resources in the time window to form resource pairs;
B) determine, according to the assessment models, the result of determination as to which of the two resources in each resource pair is better;
C) count the resource pairs whose result of determination satisfies the following condition: the resource to be assessed is better than the other resource of the pair;
D) take the count as the prominence score of the resource to be assessed.
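Steps A) to D) amount to counting, for each resource, how many pairwise comparisons it wins. A minimal sketch follows, assuming a hypothetical callback `is_better(a, b)` that stands in for the trained assessment model and returns 1 when `a` is judged better than `b`, otherwise 0; the function name `prominence_scores` is likewise illustrative.

```python
from typing import Callable, Dict, List


def prominence_scores(resources: List[str],
                      is_better: Callable[[str, str], int]) -> Dict[str, int]:
    """Score each resource by the number of pairwise comparisons it wins."""
    scores: Dict[str, int] = {}
    for i, candidate in enumerate(resources):
        wins = 0
        for j, other in enumerate(resources):
            if i == j:
                continue  # a resource is never paired with itself
            if is_better(candidate, other) == 1:
                wins += 1
        scores[candidate] = wins
    return scores
```

For a window with n resources this calls the model n * (n - 1) times, because each ordered pair is judged separately. A model is not guaranteed to give opposite answers for (a, b) and (b, a), which is why both orders are evaluated rather than one being inferred from the other.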
In the process described in B), for each resource pair, the features of each resource of the pair can first be extracted in the feature extraction manner described in Two); the extracted features are then used as the input of the assessment model, and the result of determination output by the assessment model indicates which of the two resources of the pair is better.
In addition, when the number of assessment models is greater than one, one result of determination is obtained from each assessment model for each resource pair; these results can be aggregated, and the final result of determination is decided from the aggregate.
For example, suppose there are 3 assessment models and, for some resource pair x, the results of determination output by the 3 models are 1, 1 and 0. Since 2 models output 1 and 1 model outputs 0, the minority yields to the majority and 1 is taken as the result of determination for resource pair x.
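The majority rule in this example can be written as a small helper. The function name `majority_vote` is hypothetical, and the handling of ties (possible with an even number of models) is an assumption, since the text only covers the clear-majority case.

```python
from typing import Sequence


def majority_vote(verdicts: Sequence[int]) -> int:
    """Combine the 0/1 results of several assessment models by majority."""
    if not verdicts:
        raise ValueError("at least one result of determination is required")
    ones = sum(1 for v in verdicts if v == 1)
    zeros = len(verdicts) - ones
    # Assumption: a tie is resolved to 0; the source does not specify this.
    return 1 if ones > zeros else 0
```

For the three outputs in the example, `majority_vote([1, 1, 0])` evaluates to 1. Using an odd number of models avoids ties altogether.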
Suppose a time window contains 4 resources, resource 1 to resource 4. After processing in the manner described above, the pairwise binary classification matrix shown in Table 1 is obtained:
Table 1: Pairwise binary classification matrix
In Table 1, the comparison result between each resource and itself is represented by 0, so that it does not affect the subsequent statistics.
The values in rows 2 to 5 of Table 1 can be summed row by row to obtain the prominence scores of resource 1 to resource 4: the prominence score of resource 1 is 1, that of resource 2 is 3, that of resource 3 is 2, and that of resource 4 is 1.
For each time window, once the prominence score of each resource in the time window has been obtained, the resources whose prominence scores meet the predetermined requirement can be selected from the resources in the time window and taken as the representative resources of the time window.
The resources whose prominence scores meet the predetermined requirement can be selected in the following ways:
Mode one
Select the N resources with the highest prominence scores as the representative resources of the time window, where N is a positive integer whose value can be decided according to actual needs, for example 1. Taking the time window corresponding to Table 1 as an example, resource 2 has the highest prominence score and can therefore be taken as the representative resource of the time window.
Mode two
Select the resources whose prominence scores are greater than a predetermined threshold as the representative resources of the time window; the value of the threshold can likewise be decided according to actual needs.
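The two selection modes can be sketched as follows, using the prominence scores stated for the Table 1 example. The function names `select_top_n` and `select_above_threshold` are hypothetical, and breaking ties by original order is an assumption; the text does not say how equal scores are handled.

```python
from typing import Dict, List


def select_top_n(scores: Dict[str, int], n: int = 1) -> List[str]:
    """Mode one: the n resources with the highest prominence scores."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # sorted() is stable, so resources with equal scores keep their
    # original order (an assumed tie-breaking rule).
    ranked = sorted(scores, key=lambda r: scores[r], reverse=True)
    return ranked[:n]


def select_above_threshold(scores: Dict[str, int], threshold: float) -> List[str]:
    """Mode two: every resource whose score is strictly above the threshold."""
    return [r for r, s in scores.items() if s > threshold]


# Prominence scores as stated for the Table 1 example.
table_1_scores = {"resource 1": 1, "resource 2": 3, "resource 3": 2, "resource 4": 1}
top = select_top_n(table_1_scores, n=1)  # ["resource 2"]
```

Mode one always returns a fixed number of resources per window, while mode two can return none or several, so the two modes produce timelines of different density.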
After the representative resources of each time window have been obtained, they are combined in chronological order to obtain the event train of thought.
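The final combination step is essentially a concatenation ordered by time window. A minimal sketch, assuming the representative resources are keyed by an integer window index where a smaller index means an earlier window (the name `build_event_thread` and this data shape are hypothetical):

```python
from typing import Dict, List


def build_event_thread(representatives: Dict[int, List[str]]) -> List[str]:
    """Concatenate the representative resources of each window in time order."""
    thread: List[str] = []
    for window_index in sorted(representatives):
        thread.extend(representatives[window_index])
    return thread
```

Within a window the resources keep the order in which they were selected; ordering them by publication time instead would be an equally plausible choice that the text does not settle.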
Based on the above, Fig. 3 is a schematic diagram of generating an event train of thought according to the present invention. As shown in Fig. 3, the resources on the left represent all the resources obtained in each time window, and the resources on the right represent the representative resources determined for each time window.
Fig. 4 is a schematic diagram of the event train of thought corresponding to the "star A divorces" event according to the present invention.
The above describes the method embodiment; the scheme of the invention is further explained below through a device embodiment.
Embodiment two
Fig. 5 is a schematic structural diagram of an embodiment of the event train of thought generating device of the present invention. As shown in Fig. 5, the device includes: processing unit 51.
Processing unit 51 is configured to: obtain, for a pending event, the resources in each time window respectively; for each time window, determine the prominence score of each resource in the time window respectively, select from the resources in the time window the resources whose prominence scores meet a predetermined requirement, and take the selected resources as the representative resources of the time window; and combine the representative resources of the time windows in chronological order to obtain the event train of thought.
As shown in Fig. 5, the device may further include: model training unit 52.
Model training unit 52 is configured to obtain training samples, train assessment models from the training samples, and send the assessment models to processing unit 51; correspondingly, processing unit 51 determines the prominence score of each resource in each time window according to the assessment models.
Model training unit 52 may specifically include: sample collection subelement 521 and model training subelement 522.
Sample collection subelement 521 is configured to display the resources in any time window of any event, obtain the high-quality resources selected from the displayed resources, pair each high-quality resource with each displayed non-high-quality resource to form a resource pair, generate a training sample for each resource pair, and send the training samples to model training subelement 522.
Model training subelement 522 is configured to train assessment models from the training samples and send the assessment models to processing unit 51.
Each generated training sample may include: the features extracted from each of the two resources of a resource pair, and the result of determination as to which of the two resources is better.
That is, for each resource pair, features are extracted from each resource of the pair and combined with the result of determination for the two resources to generate one training sample.
The result of determination can be represented by 1 or 0: for example, if the former resource of a resource pair is better than the latter, the result of determination can be 1; conversely, if the latter resource is better than the former, the result of determination can be 0.
The features extracted from each resource may include, but are not limited to, one or any combination of the following: plain text feature, resource popularity feature, search popularity feature, similar resource number feature.
In addition, the number of assessment models can be one, or, to improve the accuracy of the assessment results, more than one; model training subelement 522 can train each assessment model from the obtained training samples.
The assessment models may include, but are not limited to, one or any combination of the following: a support vector machine model, a logistic regression model, a random forest model.
As shown in Fig. 5, processing unit 51 may specifically include: obtaining subelement 511, selection subelement 512 and combination subelement 513.
Obtaining subelement 511 is configured to obtain, for a pending event, the resources in each time window respectively, and send them to selection subelement 512.
Selection subelement 512 is configured to carry out the following processing for each time window:
for each resource in the time window, take the resource as the resource to be assessed, and pair it with each of the other resources in the time window to form resource pairs; obtain, according to the assessment models, the result of determination as to which of the two resources in each resource pair is better; count the resource pairs whose result of determination satisfies the following condition: the resource to be assessed is better than the other resource of the pair; take the count as the prominence score of the resource to be assessed;
select from the resources in the time window the resources whose prominence scores meet the predetermined requirement, take the selected resources as the representative resources of the time window, and send them to combination subelement 513.
Combination subelement 513 is configured to combine the representative resources of the time windows in chronological order to obtain the event train of thought.
For each resource pair, selection subelement 512 can first extract the features of each resource of the pair; the extracted features are then used as the input of the assessment model, and the result of determination output by the assessment model indicates which of the two resources of the pair is better.
When the number of assessment models is greater than one, for each resource pair, selection subelement 512 can obtain one result of determination from each assessment model, aggregate these results, and decide the final result of determination from the aggregate.
For each time window, once selection subelement 512 has obtained the prominence score of each resource in the time window, it can select from the resources in the time window the resources whose prominence scores meet the predetermined requirement and take the selected resources as the representative resources of the time window.
For example, for each time window, selection subelement 512 can select from the resources in the time window the N resources with the highest prominence scores, where N is a positive integer, and take the selected resources as the representative resources of the time window.
Alternatively, for each time window, selection subelement 512 can select from the resources in the time window the resources whose prominence scores are greater than a predetermined threshold and take the selected resources as the representative resources of the time window.
After the representative resources of each time window have been obtained, combination subelement 513 can combine them in chronological order to obtain the event train of thought.
For the specific workflow of the device embodiment shown in Fig. 5, refer to the corresponding description in the foregoing method embodiment, which is not repeated here.
In short, with the scheme of the invention, the resources in each time window can be obtained for a pending event, the representative resources that best reflect the progress of the event can be selected from each time window, and the representative resources selected for the time windows are combined to obtain the event train of thought. When a user searches with, for example, a search engine, the event train of thought can then be shown to the user directly, which overcomes the problems of the prior art and improves the user's information acquisition efficiency.
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method can be implemented in other ways. For example, the device embodiment described above is only illustrative; the division into units is only a division by logical function, and other divisions are possible in actual implementation.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the invention may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit can be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform some of the steps of the methods described in the embodiments of the invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (18)

1. a kind of event train of thought generation method, it is characterised in that including:
For pending event, the resource in each time window is obtained respectively;
For each time window, determine the prominence score of each resource in the time window respectively, and from it is described when Between select the resource that prominence score meets pre-provisioning request in each resource in window, the resource that will be selected is used as the time window Intraoral representative resource;
Representative resource in each time window is combined sequentially in time, event train of thought is obtained.
2. method according to claim 1, it is characterised in that
The method is further included:Training sample is obtained, assessment models are obtained according to training sample training;
The prominence score of each resource determined respectively in the time window includes:
According to the assessment models, the prominence score of each resource in the time window is determined respectively.
3. method according to claim 2, it is characterised in that
It is described according to the assessment models, the prominence score of each resource in the time window is determined respectively to be included:
For each resource in the time window, following treatment is carried out respectively:
Using the resource as resource to be assessed, the resource to be assessed is divided with other each resources in the time window Zu Cheng not a resource pair;
Get two resource result of determination which is better and which is worse of each resource centering respectively according to the assessment models;
Statistical decision result meets the resource logarithm of following condition:Another money of the resource to be assessed better than place resource centering Source;
Using statistics as the resource to be assessed prominence score.
4. The method according to claim 3, characterized in that
Each training sample comprises:
The features extracted from each of the two resources in a resource pair, and the determination result indicating which of the two resources is better;
Obtaining, according to the assessment model, the determination result for each resource pair comprises:
Extracting the features of the two resources in each resource pair;
Obtaining, according to the extracted features and the assessment model, the determination result indicating which of the two resources in each resource pair is better.
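One way to picture the feature-based judgment of claim 4 is the sketch below. The field names (`text`, `views`, `searches`, `similar_count`), the use of a feature difference, and the fixed linear weights are all assumptions made for illustration; the claim only requires that features be extracted from both resources and fed to the assessment model.

```python
def extract_features(resource):
    """Turn one resource into a numeric feature vector (hypothetical fields)."""
    return [
        float(len(resource["text"])),      # crude plain-text feature
        float(resource["views"]),          # resource popularity
        float(resource["searches"]),       # search popularity
        float(resource["similar_count"]),  # number of similar resources
    ]


def pair_features(a, b):
    """Represent a resource pair by the difference of its two feature vectors."""
    fa, fb = extract_features(a), extract_features(b)
    return [x - y for x, y in zip(fa, fb)]


def judge_pair(a, b, weights):
    """Return True when a is judged better than b under a linear model.

    weights is a stand-in for a trained assessment model.
    """
    score = sum(w * d for w, d in zip(weights, pair_features(a, b)))
    return score > 0
```

Using the difference vector means that swapping the two resources flips the sign of the score, so the judgment for (a, b) and (b, a) cannot both be True.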
5. The method according to claim 4, characterized in that
Obtaining the training samples comprises:
Displaying the resources in any time window corresponding to any event;
Obtaining the high-quality resources selected from the displayed resources;
Pairing each high-quality resource with each displayed resource that is not high-quality, each pairing forming one resource pair;
Generating a training sample for each resource pair.
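The sample construction in claim 5 amounts to a cross product between the selected and the unselected resources. In the sketch below, the `id` field, the set `selected_ids`, and the label convention (1 means the first resource of the pair is better) are assumptions introduced for illustration.

```python
def make_training_pairs(displayed, selected_ids):
    """Pair every selected (high-quality) resource with every unselected one.

    displayed    : list of dicts with an 'id' entry (assumed layout)
    selected_ids : ids that an annotator marked as high-quality
    Returns a list of (better, worse, label) tuples with label fixed to 1.
    """
    good = [r for r in displayed if r["id"] in selected_ids]
    rest = [r for r in displayed if r["id"] not in selected_ids]
    samples = []
    for g in good:
        for n in rest:
            samples.append((g, n, 1))  # label 1: first resource is better
    return samples
```

If k of the m displayed resources are selected, this yields k * (m - k) samples. Pairs of two high-quality resources or two non-high-quality resources are not produced, because the annotation gives no preference between them.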
6. The method according to claim 3, characterized in that
The number of assessment models is one or more than one;
Training the assessment model on the training samples comprises:
Training each assessment model on the training samples;
Obtaining, according to the assessment model, the determination result for each resource pair comprises:
When the number of assessment models is greater than one, for each resource pair, obtaining one determination result from each assessment model, aggregating the determination results, and determining the final determination result from the aggregate.
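Claim 6 leaves the aggregation rule open. A simple majority vote is one possibility, sketched below as an assumption; each entry of `judges` stands in for one trained assessment model.

```python
def ensemble_judgment(a, b, judges):
    """Combine several pairwise judgments by strict majority vote.

    judges is a list of callables; each returns True when it considers
    resource a better than resource b. A tie counts as "not better".
    """
    votes = sum(1 for judge in judges if judge(a, b))
    return votes * 2 > len(judges)
```

Using an odd number of models avoids ties. With an even number, the tie-breaking rule here (treating a tie as "not better") is a design choice rather than something the claim prescribes.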
7. The method according to claim 6, characterized in that
The assessment models include one of the following or any combination thereof:
A support vector machine model, a logistic regression model, a random forest model.
8. The method according to claim 4, characterized in that
The features extracted from each resource include one of the following or any combination thereof:
A plain-text feature, a resource popularity feature, a search popularity feature, a similar-resource count feature.
9. The method according to claim 1, characterized in that
Selecting from the resources in the time window those whose prominence score meets a predetermined requirement, and taking the selected resources as the representative resources of the time window, comprises:
Selecting the N resources with the highest prominence scores from the resources in the time window, N being a positive integer, and taking the selected resources as the representative resources of the time window;
Or, selecting the resources whose prominence score is greater than a predetermined threshold from the resources in the time window, and taking the selected resources as the representative resources of the time window.
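The two selection rules of claim 9 can be sketched as follows. The function names and the parallel `resources` and `scores` lists are assumptions of this sketch.

```python
def select_top_n(resources, scores, n):
    """Keep the n resources with the highest scores (ties keep input order)."""
    order = sorted(range(len(resources)), key=lambda i: -scores[i])
    return [resources[i] for i in order[:n]]


def select_above_threshold(resources, scores, threshold):
    """Keep every resource whose score is strictly greater than the threshold."""
    return [r for r, s in zip(resources, scores) if s > threshold]
```

The top-N rule always returns up to N resources per window, whereas the threshold rule may return none for a quiet window or many for a busy one.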
10. An event train of thought generating device, characterized by comprising: a processing unit;
The processing unit is configured to: for an event to be processed, obtain the resources in each time window respectively; for each time window, determine a prominence score for each resource in the time window, select from the resources in the time window those whose prominence score meets a predetermined requirement, and take the selected resources as the representative resources of the time window; and combine the representative resources of the time windows in chronological order to obtain the event train of thought.
11. The device according to claim 10, characterized in that
The device further comprises: a model training unit;
The model training unit is configured to obtain training samples, train an assessment model on the training samples, and send the assessment model to the processing unit;
The processing unit determines the prominence score of each resource in the time window according to the assessment model.
12. The device according to claim 11, characterized in that
The processing unit comprises: an acquisition subelement, a selection subelement, and a combination subelement;
The acquisition subelement is configured to, for an event to be processed, obtain the resources in each time window respectively and send them to the selection subelement;
The selection subelement is configured to perform the following processing for each time window:
For each resource in the time window, taking the resource as the resource to be assessed, and pairing the resource to be assessed with each of the other resources in the time window, so that each pairing forms one resource pair; obtaining, according to the assessment model, a determination result for each resource pair indicating which of its two resources is better; counting the resource pairs whose determination result meets the following condition: the resource to be assessed is better than the other resource in the pair; and taking the count as the prominence score of the resource to be assessed;
Selecting from the resources in the time window those whose prominence score meets a predetermined requirement, taking the selected resources as the representative resources of the time window, and sending them to the combination subelement;
The combination subelement is configured to combine the representative resources of the time windows in chronological order to obtain the event train of thought.
13. The device according to claim 12, characterized in that
Each training sample comprises:
The features extracted from each of the two resources in a resource pair, and the determination result indicating which of the two resources is better;
The selection subelement extracts the features of the two resources in each resource pair and, according to the extracted features and the assessment model, obtains the determination result indicating which of the two resources in each resource pair is better.
14. The device according to claim 13, characterized in that
The model training unit comprises: a sample collection subelement and a model training subelement;
The sample collection subelement is configured to display the resources in any time window corresponding to any event, obtain the high-quality resources selected from the displayed resources, pair each high-quality resource with each displayed resource that is not high-quality so that each pairing forms one resource pair, generate a training sample for each resource pair, and send the training samples to the model training subelement;
The model training subelement is configured to train the assessment model on the training samples and send the assessment model to the processing unit.
15. The device according to claim 14, characterized in that
The number of assessment models is one or more than one;
The model training subelement trains each assessment model on the training samples;
The selection subelement is further configured to:
When the number of assessment models is greater than one, for each resource pair, obtain one determination result from each assessment model, aggregate the determination results, and determine the final determination result from the aggregate.
16. The device according to claim 15, characterized in that
The assessment models include one of the following or any combination thereof:
A support vector machine model, a logistic regression model, a random forest model.
17. The device according to claim 13, characterized in that
The features extracted from each resource include one of the following or any combination thereof:
A plain-text feature, a resource popularity feature, a search popularity feature, a similar-resource count feature.
18. The device according to claim 12, characterized in that
For each time window, the selection subelement selects the N resources with the highest prominence scores from the resources in the time window, N being a positive integer, and takes the selected resources as the representative resources of the time window;
Or, for each time window, the selection subelement selects the resources whose prominence score is greater than a predetermined threshold from the resources in the time window, and takes the selected resources as the representative resources of the time window.
CN201611193377.9A 2016-12-21 2016-12-21 Event train of thought generation method and device Pending CN106844466A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611193377.9A CN106844466A (en) 2016-12-21 2016-12-21 Event train of thought generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611193377.9A CN106844466A (en) 2016-12-21 2016-12-21 Event train of thought generation method and device

Publications (1)

Publication Number Publication Date
CN106844466A true CN106844466A (en) 2017-06-13

Family

ID=59135953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611193377.9A Pending CN106844466A (en) 2016-12-21 2016-12-21 Event train of thought generation method and device

Country Status (1)

Country Link
CN (1) CN106844466A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679985A (en) * 2017-09-12 2018-02-09 阿里巴巴集团控股有限公司 Feature of risk screening, description message forming method, device and electronic equipment
CN110232077A (en) * 2019-06-19 2019-09-13 北京百度网讯科技有限公司 Event train of thought generation method and device
CN110555108A (en) * 2018-05-31 2019-12-10 北京百度网讯科技有限公司 Event context generation method, device, equipment and storage medium
WO2022095375A1 (en) * 2020-11-06 2022-05-12 平安科技(深圳)有限公司 Event context generation method and apparatus, and terminal device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102012917A (en) * 2010-11-26 2011-04-13 百度在线网络技术(北京)有限公司 Information processing device and method
CN103500163A (en) * 2013-07-24 2014-01-08 百度在线网络技术(北京)有限公司 Method and device for recognizing event key progress
CN104933129A (en) * 2015-06-12 2015-09-23 百度在线网络技术(北京)有限公司 Event context acquisition method and system based on micro-blogs

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUHUSHANGWEI: ""Learning to rank的讲解,单文档方法,文档对方法,文档列方法"", 《CSDN博客-HTTPS://BLOG.CSDN.NET/YUHUSHANGWEI/ARTICLE/DETAILS/48547151》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679985A (en) * 2017-09-12 2018-02-09 阿里巴巴集团控股有限公司 Feature of risk screening, description message forming method, device and electronic equipment
CN107679985B (en) * 2017-09-12 2021-01-05 创新先进技术有限公司 Risk feature screening and description message generating method and device and electronic equipment
TWI745589B (en) * 2017-09-12 2021-11-11 開曼群島商創新先進技術有限公司 Risk feature screening, description message generation method, device and electronic equipment
CN110555108A (en) * 2018-05-31 2019-12-10 北京百度网讯科技有限公司 Event context generation method, device, equipment and storage medium
CN110232077A (en) * 2019-06-19 2019-09-13 北京百度网讯科技有限公司 Event train of thought generation method and device
WO2022095375A1 (en) * 2020-11-06 2022-05-12 平安科技(深圳)有限公司 Event context generation method and apparatus, and terminal device and storage medium

Similar Documents

Publication Publication Date Title
CN106980692B (en) Influence calculation method based on microblog specific events
KR101536520B1 (en) Method and server for extracting topic and evaluating compatibility of the extracted topic
EP3522029A1 (en) Natural language search results for intent queries
CN106940732A (en) A kind of doubtful waterborne troops towards microblogging finds method
TW200900973A (en) Personalized shopping recommendation based on search units
CN104102658B (en) Content of text method for digging and device
EP2224361A1 (en) Generating a domain corpus and a dictionary for an automated ontology
CN106844466A (en) Event train of thought generation method and device
CN101261629A (en) Specific information searching method based on automatic classification technology
WO2021019831A1 (en) Management system and management method
CN104462323B (en) Semantic similarity calculation method, method for processing search results and device
CN104572720B (en) A kind of method, apparatus and computer readable storage medium of webpage information re-scheduling
CN102789449A (en) Method and device for evaluating comment text
CN102662987B (en) A kind of sorting technique of the network text semanteme based on Baidupedia
CN105095175A (en) Method and device for obtaining truncated web title
Lalji et al. Twitter sentiment analysis using hybrid approach
CN103226601B (en) A kind of method and apparatus of picture searching
CN108111310A (en) A kind of generation method and device of candidate password dictionary
CN104899310B (en) Information sorting method, the method and device for generating information sorting model
CN102063497A (en) Open type knowledge sharing platform and entry processing method thereof
JP5512737B2 (en) Topic extraction apparatus and topic extraction method
JP5315726B2 (en) Information providing method, information providing apparatus, and information providing program
Silva et al. Method for collecting relevant topics from Twitter supported by big data
Qureshi et al. Exploiting wikipedia for entity name disambiguation in tweets
CN110837553A (en) Method for searching mail and related product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170613