CN110008334A - A kind of information processing method, device and storage medium - Google Patents

Info

Publication number
CN110008334A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710660877.7A
Other languages
Chinese (zh)
Other versions
CN110008334B (en)
Inventor
王树伟
温旭
花贵春
范欣
姜国华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Beijing Co Ltd
Original Assignee
Tencent Technology Beijing Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The invention discloses an information processing method, device and storage medium. The method includes: extracting text features of first text information to be assessed, and obtaining a description time parameter based on the text features and a preset mapping relationship, where the description time parameter characterizes the time described by the content of the first text information; clustering all text information to be assessed, including the first text information, according to a preset clustering mode, to identify the category of the first text information; determining an effective time parameter of the first text information based on that category, where the effective time parameter characterizes the effective time of the category; and obtaining a first freshness parameter of the first text information based on the description time parameter, the effective time parameter and current time information.

Description

Information processing method, device and storage medium
Technical field
The present invention relates to internet information processing technology, and in particular to an information processing method, device and storage medium.
Background
Tens of thousands of articles are published on the internet every day, and the "freshness" of an article is something users care about a great deal. Freshness characterizes the timeliness of an article, for example whether the article is recent news or stale old news. At present, the freshness of an article is usually assessed by recognizing the time information carried in the article. With this approach, however, an article that contains no explicit time expression cannot be recognized at all, which leads to a low recall rate. Moreover, if an article describes a recent event but part of it mentions a historical event that happened long ago, the recognition will be wrong.
Summary of the invention
The embodiments of the present invention provide an information processing method, device and storage medium that can assess the freshness of an article accurately and with a high recall rate.
The technical solution of the embodiments of the present invention is realized as follows:
An embodiment of the present invention provides an information processing method, the method comprising:
Extracting text features of first text information to be assessed, and obtaining a description time parameter based on the text features and a preset mapping relationship, where the description time parameter characterizes the time described by the content of the first text information;
Clustering all text information to be assessed, including the first text information, according to a preset clustering mode, to identify the category of the first text information;
Determining an effective time parameter of the first text information based on the category of the first text information, where the effective time parameter characterizes the effective time of that category;
Obtaining a first freshness parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
In the above scheme, before the text features of the first text information to be assessed are extracted, the method further includes:
Labeling the text features of multiple pieces of acquired second text information respectively, to obtain multiple pieces of sample information;
Training a learning model on the multiple pieces of sample information with the text features as training features, so as to form, based on the learning model, the mapping relationship between the text features and the description time parameter.
In the above scheme, the method further includes:
Obtaining the first freshness parameter of each piece of text information, other than the first text information, among all the text information;
Ranking all the text information according to the first freshness parameter of each piece of text information, to obtain a first ranking result;
Taking at least one piece of text information among all the text information as candidate recommended text information according to the first ranking result.
In the above scheme, the method further includes:
Determining, among all the text information, third text information whose similarity to the first text information meets a preset requirement, where the description time parameter of the third text information is empty;
Setting the description time parameter of the third text information equal to the description time parameter of the first text information.
In the above scheme, the method further includes:
Determining the number of pieces of text information in the same category as the first text information, and the change in that number over at least two preset time periods;
Obtaining a popularity level of the first text information, which characterizes how popular the text is, according to the number, the change in the number and a preset popularity judgment condition;
Obtaining a second freshness parameter of the first text information based on the first freshness parameter and the popularity level of the first text information.
In the above scheme, obtaining the second freshness parameter of the first text information based on the first freshness parameter and the popularity level of the first text information includes:
Obtaining a first assessment score for the first text information according to its first freshness parameter and a preset first scoring strategy;
Obtaining a second assessment score for the first text information according to its popularity level and a preset second scoring strategy;
Weighting the first assessment score and the second assessment score to obtain the second freshness parameter of the first text information.
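The weighting step above can be sketched as follows. This is only an illustration: the text does not specify the two scoring strategies or the weights, so the function name, the equal default weights and the example scores are all assumptions.

```python
def second_freshness(first_score, second_score, w_first=0.5, w_second=0.5):
    """Weighted combination of the two assessment scores.

    first_score:  score derived from the first freshness parameter
    second_score: score derived from the popularity level
    The weights are hypothetical; the source only says the scores are weighted.
    """
    return w_first * first_score + w_second * second_score


# Example with made-up scores on a 0-100 scale.
print(second_freshness(80, 60))  # 70.0
```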
In the above scheme, the method further includes:
Obtaining the second freshness parameter of each piece of text information, other than the first text information, among all the text information;
Ranking all the text information according to the second freshness parameter of each piece of text information, to obtain a second ranking result;
Taking at least one piece of text information among all the text information as candidate recommended text information according to the second ranking result.
An embodiment of the present invention also provides an information processing device, the device comprising:
An obtaining module, configured to extract text features of first text information to be assessed, and to obtain a description time parameter based on the text features and a trained learning model, where the description time parameter characterizes the time described by the content of the first text information;
A clustering module, configured to cluster all text information to be assessed, including the first text information, according to a preset clustering mode, to identify the category of the first text information;
A determining module, configured to determine an effective time parameter of the first text information based on the category of the first text information, where the effective time parameter characterizes the effective time of that category;
A processing module, configured to obtain a first freshness parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
In the above scheme, the device further includes:
A model training module, configured to label the text features of multiple pieces of acquired second text information respectively, to obtain multiple pieces of sample information;
and to train a learning model on the multiple pieces of sample information with the text features as training features, so as to form, based on the learning model, the mapping relationship between the text features and the description time parameter.
In the above scheme, the processing module is further configured to obtain the first freshness parameter of each piece of text information, other than the first text information, among all the text information;
Rank all the text information according to the first freshness parameter of each piece of text information, to obtain a first ranking result;
Take at least one piece of text information among all the text information as candidate recommended text information according to the first ranking result.
In the above scheme, the processing module is further configured to determine, among all the text information, third text information whose similarity to the first text information meets a preset requirement, where the description time parameter of the third text information is empty;
Set the description time parameter of the third text information equal to the description time parameter of the first text information.
In the above scheme, the processing module is further configured to determine the number of pieces of text information in the same category as the first text information, and the change in that number over at least two preset time periods;
Obtain a popularity level of the first text information, which characterizes how popular the text is, according to the number, the change in the number and a preset popularity judgment condition;
Obtain a second freshness parameter of the first text information based on the first freshness parameter and the popularity level of the first text information.
In the above scheme, the processing module is further configured to obtain the second freshness parameter of each piece of text information, other than the first text information, among all the text information;
Rank all the text information according to the second freshness parameter of each piece of text information, to obtain a second ranking result;
Take at least one piece of text information among all the text information as candidate recommended text information according to the second ranking result.
An embodiment of the present invention also provides an information processing device, the device comprising:
A memory, configured to store an executable program;
A processor, configured to implement the above information processing method by executing the executable program.
An embodiment of the present invention also provides a readable storage medium storing an executable program which, when executed by a processor, implements the above information processing method.
With the information processing method, device and storage medium of the embodiments of the present invention, the freshness of an article can be assessed accurately and articles can then be recommended to users; the implementation is simple and the recall rate is high.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware entities that exchange information in an embodiment of the present invention;
Fig. 2 is a first flow diagram of the information processing method in an embodiment of the present invention;
Fig. 3 is a second flow diagram of the information processing method in an embodiment of the present invention;
Fig. 4 is a third flow diagram of the information processing method in an embodiment of the present invention;
Fig. 5 is a display schematic diagram of article recommendation in an embodiment of the present invention;
Fig. 6 is a schematic diagram of the composition of the information processing device in an embodiment of the present invention;
Fig. 7 is an example diagram of the information processing device as a hardware entity in an embodiment of the present invention.
Specific embodiments
In the course of research, the inventors found that the freshness of an article can be characterized by the article's time information, and that this time information can be recognized with a dictionary combined with a collocation library. First, the article is pre-processed by word segmentation (that is, its content is split into words); for example, "On February 14, the securities regulatory commission launched a new round of investigation" is split into "February / 14th / , / securities regulatory commission / launched / a / new / round / of / investigation". Then a dictionary and a collocation library of time expressions are built. The dictionary includes, for example, date expressions (December 13, this Sunday), time-of-day expressions (morning, afternoon) and time-range expressions (recently, lately, a few days ago). The collocation library includes, for example, combinations of date and time-of-day expressions (this Sunday morning, the early morning of the 12th, mid-August 2016). Finally, the time expressions in the article are recognized with the time dictionary and the collocation library, and the time that best represents the event described by the article is selected as the event time: for example, the time that occurs most often in the text, the earliest time in the text, or the time at an important position such as the title, the first paragraph or the first sentence.
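A minimal sketch of the dictionary-style approach just described is given below. It is not taken from the patent: the single regular expression stands in for a real time dictionary and collocation library, it handles only English "Month day" dates, and the function names are hypothetical. It selects the most frequent time expression, one of the selection rules mentioned above.

```python
import re
from collections import Counter

# Hypothetical stand-in for a time dictionary: English "Month day" dates only.
DATE_PATTERN = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}\b"
)


def find_time_expressions(text):
    """Return every date expression the pattern recognizes, in order."""
    return DATE_PATTERN.findall(text)


def most_frequent_time(text):
    """Pick the most frequent date expression, or None if none is found."""
    hits = find_time_expressions(text)
    if not hits:
        return None
    return Counter(hits).most_common(1)[0][0]


article = ("On February 14, the regulator opened an inquiry. "
           "The February 14 inquiry follows one opened on June 3.")
print(most_frequent_time(article))               # February 14
print(most_frequent_time("No dates mentioned"))  # None
```

The second call returns nothing at all, which is the behaviour that limits recall when an article carries no explicit time expression.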
However, the inventors found that although the dictionary or collocation-library method recognizes time information accurately, in many cases its recall rate is too low, that is, too few items are recognized. Its drawbacks include at least the following:
1. If an article contains no explicit time expression, the dictionary or collocation-library method cannot identify the freshness of the article.
2. If an article describes something that happened recently but its opening mentions a historical event from long ago, the recognition will be wrong.
3. A dictionary or rule base always has gaps in coverage, and in those cases the freshness of the article cannot be identified well.
Before the present invention is described in detail, the nouns and terms used in the embodiments of the present invention are explained; the following explanations apply to them.
1) Freshness (also rendered as "stylish degree"): a reference standard that measures the timeliness of text information and reflects how timely the text information is; for example, important news from the current day has higher freshness than an article from a week ago.
2) Freshness parameter: a parameter, obtained from the time parameters related to the text information, that is used to assess the freshness of the text information.
The present invention is described in further detail below with reference to the drawings and specific embodiments.
It should be noted that the terms "first", "second" and "third" in the embodiments of the present invention merely distinguish similar objects and do not imply a particular order. Where permitted, the objects so distinguished may be interchanged in order or precedence, so that the embodiments described herein can be implemented in an order other than the one illustrated or described.
Fig. 1 is a schematic diagram of the hardware entities that exchange information in an embodiment of the present invention. Fig. 1 includes servers 11 ... 1n and terminal devices 21-24; the terminal devices (including mobile phones, desktop computers, PCs, all-in-one machines and the like) exchange information with the servers over a wired or wireless network. In one example, a terminal device obtains multiple pieces of text information (such as multiple articles) published on the network, processes them to obtain a freshness parameter for each piece, ranks them by freshness parameter, and then, based on the ranking result, recommends at least one of them to other terminal users, that is, sends recommendation information containing related information about the recommended article (such as its title and link) to other terminals.
Embodiment one
An embodiment of the present invention provides an information processing method. Fig. 2 shows a flow diagram of the method; as shown in Fig. 2, the information processing method in this embodiment includes:
Step 101: Extract text features of first text information to be assessed, and obtain a description time parameter based on the text features and a preset mapping relationship, where the description time parameter characterizes the time described by the content of the first text information.
In practical applications, the text information to be assessed may be obtained in advance or crawled from the internet by a web crawler. The text information here may be any text-related information, such as an article, a text paragraph, a news item or an article abstract. Correspondingly, the first text information is one of multiple articles to be assessed, and the text features may be key word-level features of the article, including at least one of: a word feature (the content expressed by the meaning of the word), a part-of-speech feature (noun, adjective and so on) and a word-length feature (such as the number of characters in the word).
In actual implementation, before the text features of the first text information to be assessed are extracted, the method further includes obtaining the mapping relationship, which specifically includes:
Labeling the text features of multiple pieces of acquired second text information respectively, to obtain multiple pieces of sample information;
Training a learning model on the multiple pieces of sample information with the text features as training features, so as to form, based on the learning model, the mapping relationship between the text features and the description time parameter.
One example: perform word segmentation and word-feature labeling on M acquired articles to obtain M pieces of sample information, where M is a positive integer greater than 2;
Train a machine learning model on the M pieces of sample information with the word features as training features, so as to form, based on the machine learning model, the mapping relationship between the word features of an article and the description time parameter.
Correspondingly, obtaining the description time parameter based on the text features and the trained learning model includes:
Feeding the text features (such as the word features of an article) into the trained learning model, to obtain the description time parameter that has a mapping relationship with the text features.
Based on the above embodiment, in practical applications the learning model may be a conditional random field model, which is used for lexical analysis tasks such as Chinese word segmentation and part-of-speech tagging. A conditional random field is a conditional probability distribution model of one set of output random variables given a set of input random variables; that is, given random variable X, random variable Y forms a Markov random field. Of course, the present invention is not limited to this kind of machine learning model.
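The word, part-of-speech and word-length features described above can be assembled per token as in the sketch below. This is an assumption-laden illustration rather than the patent's implementation: it presumes the article has already been segmented and tagged, the tag names are made up, and the one-dictionary-per-token layout is simply a shape that sequence-labeling toolkits commonly accept. No model is trained here.

```python
def token_features(tokens):
    """Build one feature dictionary per token.

    tokens: list of (word, part_of_speech) pairs from an upstream
    segmenter/tagger (hypothetical; not part of this sketch).
    """
    features = []
    for word, pos in tokens:
        features.append({
            "word": word,         # word feature
            "pos": pos,           # part-of-speech feature
            "length": len(word),  # word-length feature (character count)
        })
    return features


# Hypothetical segmented and tagged input.
sample = [("February", "NOUN"), ("14", "NUM"), ("investigation", "NOUN")]
for feat in token_features(sample):
    print(feat)
```

In a training setting, each token would additionally carry a label marking whether it belongs to the description time, and the labeled sequences would form the sample information.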
Step 102: Cluster all text information to be assessed, including the first text information, according to a preset clustering mode, to identify the category of the first text information.
Here, in actual implementation, all the text information to be assessed may be multiple articles to be assessed, and the preset clustering mode may be agglomerative hierarchical clustering. At the start of clustering, each piece of text information (each article) is its own cluster (a cluster being a set of data objects). Following the nearest-distance principle, each step merges the two closest clusters: the currently nearest pair of points is taken in turn, and if the two points are not already in the same cluster, the two clusters they belong to are merged. Clustering terminates when the clustering threshold is met. In this embodiment, two pieces of text information (two articles) are considered most similar when the sum of the reciprocal 1/d of the distance between them and the similarity s of their keyword sets is greater than a preset threshold.
The distance between two articles may be characterized with various distances. For example, with edit distance, the two articles are treated as two strings of different lengths, and the minimum number of edit operations needed to turn one string into the other is determined; the larger the edit distance, the less similar the two articles. Alternatively, with Jaccard distance, the two articles are treated as two sets of characters; the larger the Jaccard distance, the less similar the two articles.
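The two distances mentioned above can be computed as in the following sketch. Both are standard definitions (Levenshtein edit distance and Jaccard distance over character sets); the function names are chosen here for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over the character sets of a and b."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


print(edit_distance("kitten", "sitting"))  # 3
print(jaccard_distance("abc", "bcd"))      # 0.5
```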
Based on the above embodiment, once clustering of all the text information to be assessed has finished, the text information to be assessed has been divided into different categories (clusters).
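The merge procedure described for step 102 (take the nearest pair, merge their clusters if they differ, stop at a threshold) can be sketched as single-linkage agglomerative clustering. The sketch simplifies the stopping rule: instead of the combined 1/d + s criterion in the text, it stops at a plain distance threshold, and it uses Jaccard distance over whitespace-separated words. These choices, the threshold value and all names are assumptions.

```python
from itertools import combinations


def word_jaccard_distance(a, b):
    """Jaccard distance over the word sets of two texts (an assumption;
    the text describes character sets and other distances as options)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def cluster(texts, max_distance=0.5):
    """Return one cluster label per text.

    Every text starts as its own cluster; pairs are visited from nearest
    to farthest and their clusters merged until the next pair is farther
    apart than max_distance.
    """
    parent = list(range(len(texts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = sorted(
        (word_jaccard_distance(texts[i], texts[j]), i, j)
        for i, j in combinations(range(len(texts)), 2)
    )
    for dist, i, j in pairs:
        if dist > max_distance:
            break  # clustering threshold reached
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i
    return [find(i) for i in range(len(texts))]


docs = ["rate cut announced today",
        "rate cut announced yesterday",
        "team wins the final"]
labels = cluster(docs)
print(labels[0] == labels[1], labels[0] == labels[2])  # True False
```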
Step 103: Determine the effective time parameter of the first text information based on the category of the first text information, where the effective time parameter characterizes the effective time of that category.
Here, in actual implementation, a mapping relationship between text information categories and effective time parameters is preset. Once the category of the first text information has been determined, the corresponding effective time parameter can be determined from the preset mapping relationship; for example, the effective time of the cluster to which the first text information belongs is determined to be 5 days.
Step 104: Obtain the first freshness parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
Here, in actual implementation, this step may specifically include:
Obtaining the remaining effective time parameter of the first text information with the following formula:
Remaining effective time = current time - (description time + effective time);
Using the remaining effective time as the first freshness parameter of the first text information.
One example: the current time is June 28, the effective time of the first text information is 5 days, and its description time is June 22; then the remaining effective time of the first text information = June 28 - (June 22 + 5 days) = 1 day.
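The worked example above can be reproduced with date arithmetic as in this sketch. The category table and its names are hypothetical (the text only says such a mapping is preset and gives 5 days as one value), and the year is an assumption because the example states only month and day.

```python
from datetime import date, timedelta

# Hypothetical category-to-effective-time table, in days.
EFFECTIVE_DAYS = {"breaking_news": 5, "feature": 30}


def remaining_effective_time(current, described, effective_days):
    """Apply the formula as written in the text:
    remaining = current - (described + effective)."""
    return current - (described + timedelta(days=effective_days))


remaining = remaining_effective_time(
    current=date(2017, 6, 28),    # year is an assumption
    described=date(2017, 6, 22),
    effective_days=EFFECTIVE_DAYS["breaking_news"],
)
print(remaining.days)  # 1
```

With the formula as written, the value increases by one for each day that passes, so smaller values correspond to articles whose effective period ends later.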
Based on the above embodiment, in practical applications the method further includes:
Obtaining the first freshness parameter of each piece of text information, other than the first text information, among all the text information;
Ranking all the text information according to the first freshness parameter of each piece of text information, to obtain a first ranking result;
Taking at least one piece of text information among all the text information as candidate recommended text information according to the first ranking result.
That is, in practical applications there are multiple pieces of text information (articles) to be assessed. The first freshness parameter of every article to be assessed is obtained in the manner of steps 101 to 104, the articles are then ranked by first freshness parameter, some of the ranked articles are chosen according to a preset rule (such as taking the top 10) as candidate recommended articles, and these are then recommended to the user.
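The rank-then-select step can be sketched as follows. The text does not state the sort direction; this sketch assumes ascending order of the first freshness parameter, on the reasoning that under the formula of step 104 a smaller value means the effective period ends later. The identifiers and values are made up.

```python
def recommend(freshness_by_article, top_n=10):
    """Rank article ids by first freshness parameter and keep the top_n.

    freshness_by_article: dict mapping article id -> remaining effective
    time in days. Ascending order is an assumption, not stated in the text.
    """
    ranked = sorted(freshness_by_article.items(), key=lambda item: item[1])
    return [article_id for article_id, _ in ranked[:top_n]]


scores = {"a": 1, "b": -3, "c": 0}
print(recommend(scores, top_n=2))  # ['b', 'c']
```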
Embodiment two
An embodiment of the present invention provides an information processing method. Fig. 3 shows a flow diagram of the method; as shown in Fig. 3, the information processing method in this embodiment includes:
Step 201: Extract text features of first text information to be assessed, and obtain a description time parameter based on the text features and a preset mapping relationship.
Here, the description time parameter characterizes the time described by the content of the first text information.
In practical applications, the text information to be assessed may be obtained in advance or crawled from the internet by a web crawler. The text information here may be an article; correspondingly, the first text information is one of multiple articles to be assessed, and the text features may be key word-level features of the article, including at least one of: a word feature (the content expressed by the meaning of the word), a part-of-speech feature (noun, adjective and so on) and a word-length feature (such as the number of characters in the word).
Based on the above embodiment, in practical applications the mapping relationship may be obtained by training a learning model, which specifically includes:
Labeling the text features of multiple pieces of acquired second text information respectively, to obtain multiple pieces of sample information;
Training a learning model on the multiple pieces of sample information with the text features as training features, so as to form, based on the learning model, the mapping relationship between the text features and the description time parameter;
The learning model here may be a conditional random field model, which is used for lexical analysis tasks such as Chinese word segmentation and part-of-speech tagging. A conditional random field is a conditional probability distribution model of one set of output random variables given a set of input random variables; that is, given random variable X, random variable Y forms a Markov random field. Of course, the present invention is not limited to this kind of machine learning model.
Step 202: Cluster all text information to be assessed, including the first text information, according to a preset clustering mode, to identify the category of the first text information.
Here, in actual implementation, all the text information to be assessed may be multiple articles to be assessed, and the preset clustering mode may be agglomerative hierarchical clustering. At the start of clustering, each piece of text information (each article) is its own cluster. Each step merges the two closest clusters: the currently nearest pair of points is taken in turn, and if the two points are not already in the same cluster, the two clusters they belong to are merged. Clustering terminates when the clustering threshold is met. In this embodiment, two pieces of text information (two articles) are considered most similar when the sum of the reciprocal 1/d of the distance between them and the similarity s of their keyword sets is greater than a preset threshold.
The distance between two articles may be characterized with various distances. For example, with edit distance, the two articles are treated as two strings of different lengths, and the minimum number of edit operations needed to turn one string into the other is determined; the larger the edit distance, the less similar the two articles. Alternatively, with Jaccard distance, the two articles are treated as two sets of characters; the larger the Jaccard distance, the less similar the two articles.
Based on the above embodiment, once clustering of all the text information to be assessed has finished, the text information to be assessed has been divided into different categories (clusters).
In practical applications, the learning model may fail to recognize the description time parameter of some text information. In that case, third text information whose similarity to the first text information meets a preset requirement is determined among all the text information (the description time parameter of the third text information being empty), and the description time parameter of the third text information is set equal to the description time parameter of the first text information. In other words, text information whose description time parameter is empty is given the description time parameter of text information whose similarity to it meets the preset requirement.
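A minimal sketch of this fallback is shown below. The similarity measure (Jaccard similarity over words), the 0.5 requirement and the choice to copy from the single most similar article are all assumptions; the text only requires that the similarity "meets a preset requirement".

```python
from datetime import date


def word_similarity(a, b):
    """Jaccard similarity over word sets (a hypothetical choice)."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def fill_missing_times(articles, min_similarity=0.5):
    """Return {article id: description time}, copying the time of the most
    similar article with a known time when an article's own time is None."""
    filled = {}
    for aid, art in articles.items():
        described = art["described"]
        if described is None:
            best_sim = min_similarity
            for oid, other in articles.items():
                if oid == aid or other["described"] is None:
                    continue
                sim = word_similarity(art["text"], other["text"])
                if sim >= best_sim:
                    best_sim, described = sim, other["described"]
        filled[aid] = described
    return filled


articles = {
    "a": {"text": "rate cut announced today", "described": date(2017, 6, 22)},
    "b": {"text": "rate cut announced", "described": None},
    "c": {"text": "team wins final", "described": None},
}
print(fill_missing_times(articles))
```

Article "b" shares three of four words with "a" and inherits its description time, while "c" has no sufficiently similar neighbour and stays empty.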
Step 203: Determine the effective time parameter of the first text information based on the category corresponding to the first text information, and obtain the first stylish degree parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
Here, the effective time parameter characterizes the effective time corresponding to the category.
In actual implementation, the mapping between text information categories and effective time parameters is preset. Once the category corresponding to the first text information has been determined, the corresponding effective time parameter can be determined from the preset mapping; for example, the effective time corresponding to the cluster to which the first text information belongs is determined to be 5 days.
The remaining effective time parameter of the first text information is obtained from the following formula:
Remaining effective time = current time - (description time + effective time);
The remaining effective time is used as the first stylish degree parameter of the first text information.
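The formula can be evaluated directly with date arithmetic. The sketch below follows the formula exactly as stated; the function name `remaining_effective_time` is hypothetical. With this sign convention the result is negative while the text is still inside its effective period, whereas the scoring example treats a positive value as time left, so an implementation may need to negate the result. That reading is an interpretation, not something the text states.

```python
from datetime import datetime, timedelta


def remaining_effective_time(current_time: datetime,
                             description_time: datetime,
                             effective_time: timedelta) -> timedelta:
    """current time - (description time + effective time), as written."""
    return current_time - (description_time + effective_time)


# Content describes June 1, the category is valid for 5 days, today is June 4.
value = remaining_effective_time(datetime(2017, 6, 4),
                                 datetime(2017, 6, 1),
                                 timedelta(days=5))
print(value.days)  # -2
```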
Step 204: Determine the heat level of the first text information according to the quantity of text information in the same category as the first text information within each of at least two preset time periods, and the variation of that quantity across the at least two time periods.
Here, in actual implementation, before this step, the method further includes:
Determining the quantity of text information in the same category as the first text information, and the variation of the quantity within at least two preset time periods;
Obtaining, according to the quantity, the variation of the quantity and a preset heat judgment condition, the heat level of the first text information, which characterizes the heat of the text.
The preset heat judgment condition can be set according to actual needs. For example, when the quantity of text information in the same category is greater than a preset value (such as 100): if the quantity declines continuously, the heat is continuously decreasing and the heat level is level 1; if the quantity first rises and then declines, the heat was once high but has started to decrease, and the heat level is level 2; if the quantity rises continuously, the heat is getting higher and higher, and the heat level is level 3. When the quantity of text information is less than the preset value, the heat level is considered to be level 1 regardless of how the quantity varies.
An example: the category corresponding to the first text information contains 120 pieces of text information, of which 80 fall between June 1 and June 3, 30 fall between June 4 and June 7, and 10 fall between June 8 and June 11. Across these time periods the quantity of text information declines continuously, so the heat level of this category of text information is judged to be level 1.
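The example judgment condition can be sketched as a small function. The name `heat_level` is hypothetical. Two points are assumptions because the text does not cover them: a total exactly equal to the preset value is treated like a total below it, and trends other than the three listed (for example a fall followed by a rise, or a flat stretch) fall back to level 1.

```python
from typing import Sequence


def heat_level(total_count: int,
               period_counts: Sequence[int],
               preset_count: int = 100) -> int:
    """Map the size of a category and its per-period counts to level 1, 2 or 3."""
    if len(period_counts) < 2:
        raise ValueError("at least two time periods are required")
    if total_count <= preset_count:
        return 1  # small categories are level 1 whatever the trend

    diffs = [b - a for a, b in zip(period_counts, period_counts[1:])]
    if all(d < 0 for d in diffs):
        return 1  # continuous decline
    if all(d > 0 for d in diffs):
        return 3  # continuous rise

    # Rise then fall: strictly up to an interior peak, strictly down after it.
    peak = max(range(len(period_counts)), key=lambda i: period_counts[i])
    rises = all(d > 0 for d in diffs[:peak])
    falls = all(d < 0 for d in diffs[peak:])
    if 0 < peak < len(period_counts) - 1 and rises and falls:
        return 2
    return 1  # assumption: patterns the text does not mention


print(heat_level(120, [80, 30, 10]))  # 1, the example from the text
```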
Step 205: Obtain the second stylish degree parameter of the first text information based on the first stylish degree parameter of the first text information and the heat level of the first text information.
Here, in actual implementation, this step specifically includes:
Obtaining the first assessment score corresponding to the first text information according to the first stylish degree parameter of the first text information and preset first scoring tactics;
Obtaining the second assessment score corresponding to the first text information according to the heat level of the first text information and preset second scoring tactics;
Weighting the first assessment score and the second assessment score to obtain the second stylish degree parameter of the first text information.
An example of the first scoring tactics: if the remaining effective time of the text information is greater than 2 days, the first assessment score is 5 points; if it is greater than 0 and less than 2 days, the first assessment score is 4 points; if the text information expired within the last two days (i.e. -2 < remaining effective time < 0), the first assessment score is 3 points; if it expired more than two days ago (remaining effective time < -2), the first assessment score is 2 points.
An example of the second scoring tactics: the heat level is divided into three levels; level 1 corresponds to a second assessment score of 2 points, level 2 to 3 points, and level 3 to 5 points.
After the first assessment score and the second assessment score of a piece of text information have been obtained, they are weighted to obtain the second stylish degree parameter of that text information, namely:
Second stylish degree parameter = x * first assessment score + y * second assessment score, where x and y are positive numbers and x + y = 1.
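The two example scoring tactics and the weighting can be sketched as follows. The names `first_score`, `SECOND_SCORES` and `second_stylish_degree` are hypothetical, the remaining effective time is taken in days with positive values meaning time left (as the scoring example implies), and assigning the boundary values 2, 0 and -2 to the lower band is an assumption because the text leaves them open.

```python
def first_score(remaining_days: float) -> int:
    """First assessment score from the remaining effective time in days."""
    if remaining_days > 2:
        return 5
    if remaining_days > 0:
        return 4
    if remaining_days > -2:
        return 3
    return 2


# Second assessment score for heat levels 1, 2 and 3.
SECOND_SCORES = {1: 2, 2: 3, 3: 5}


def second_stylish_degree(remaining_days: float, level: int, x: float = 0.5) -> float:
    """x * first score + y * second score, with x, y > 0 and x + y = 1."""
    if not 0 < x < 1:
        raise ValueError("x must lie strictly between 0 and 1")
    y = 1 - x
    return x * first_score(remaining_days) + y * SECOND_SCORES[level]


# Three days left, heat level 1, equal weights: 0.5 * 5 + 0.5 * 2
print(second_stylish_degree(3, 1))  # 3.5
```

The default weight of 0.5 is only a placeholder; the text requires nothing beyond x and y being positive and summing to 1.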
Step 206: Obtain the second stylish degree parameter of each piece of text information among all the text information, sort all the text information based on the obtained second stylish degree parameters, and determine candidate recommended text information based on the sorting result.
That is, in practical applications there are multiple pieces of text information (articles) to be assessed. The second stylish degree parameters of all the text information to be assessed are obtained in the manner of steps 201 to 205, the pieces of text information are then sorted according to their second stylish degree parameters, part of the text information in the sorting result is selected according to a preset rule (such as selecting the top 10), and the selected text information is taken as candidate recommended text information and recommended to the user.
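Sorting and selection reduce to a sort followed by a slice. In this sketch the name `candidate_recommendations` is hypothetical, and ranking larger second stylish degree parameters first is an assumption consistent with the scoring examples, where fresher and hotter texts receive higher scores.

```python
from typing import Dict, List


def candidate_recommendations(parameters: Dict[str, float], top_k: int = 10) -> List[str]:
    """Return the identifiers of the top_k texts, highest parameter first."""
    ranked = sorted(parameters, key=lambda text_id: parameters[text_id], reverse=True)
    return ranked[:top_k]


print(candidate_recommendations({"a": 3.5, "b": 4.4, "c": 2.0}, top_k=2))  # ['b', 'a']
```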
Embodiment three
An embodiment of the present invention provides an information processing method. Fig. 4 is a schematic flowchart of the information processing method in this embodiment. Taking articles as an example of text information, as shown in Fig. 4, the information processing method in this embodiment includes:
Step 301: Extract the word key features of N articles to be assessed.
Here, the word key features include at least one of the following: a word feature, a part-of-speech feature and a word-length feature.
N is a positive integer greater than 2. The word feature characterizes the content of a word; the part-of-speech feature is, for example, noun or adjective; and the word-length feature may refer to the number of characters a word contains.
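The three word key features can be pictured as one small record per word. This is only an illustrative sketch: `WordFeature` and `extract_word_features` are hypothetical names, whitespace splitting stands in for real word segmentation, and the part of speech comes from a caller-supplied lookup table rather than from an actual tagger.

```python
from typing import Dict, List, NamedTuple


class WordFeature(NamedTuple):
    word: str            # word feature: the word itself
    part_of_speech: str  # part-of-speech feature, e.g. "noun"
    length: int          # word-length feature: number of characters


def extract_word_features(text: str, pos_lookup: Dict[str, str]) -> List[WordFeature]:
    """Split on whitespace and attach part of speech and length to each word."""
    return [WordFeature(word, pos_lookup.get(word, "unknown"), len(word))
            for word in text.lower().split()]


features = extract_word_features("Rain fell yesterday", {"rain": "noun", "fell": "verb"})
print(features[0])  # WordFeature(word='rain', part_of_speech='noun', length=4)
```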
Step 302: Import the word key features of the N articles into the machine learning model obtained by training, to obtain the description time parameters of the N articles.
Here, the description time parameter characterizes the time information described by the article content.
In actual implementation, before this step, the method further includes:
Performing word segmentation and word feature labeling on M acquired articles to obtain M pieces of sample information, where the M articles are different from the N articles and M is a positive integer;
Training a machine learning model on the M pieces of sample information with the word features as training features, so as to form, based on the machine learning model, the mapping between the word features of an article and the description time parameter.
Step 303: Cluster the N articles to be assessed according to a preset clustering mode so as to merge articles of the same category and obtain at least one cluster, and determine the effective time of each article based on the cluster to which it belongs.
Here, in actual implementation, the preset clustering mode may be agglomerative hierarchical clustering. At the start of clustering, each article forms its own cluster, and each step merges the two closest clusters. To merge, the current nearest pair of points is taken in turn; if the two points are not currently in the same cluster, the two clusters containing them are merged. Clustering ends when the clustering threshold is met. In the embodiments of the present invention, two articles are regarded as most similar when the sum of the reciprocal 1/d of the distance d between them and the similarity s of their keyword sets is greater than a preset threshold.
The distance between two articles may be characterized by any of several distances. For example, the edit distance may be used: the two articles are treated as two character strings of different lengths, and the distance is the minimum number of edit operations needed to change one string into the other; the larger the edit distance, the less similar the two articles. Alternatively, the Jaccard distance may be used: the two articles are treated as two character sets, and the larger the Jaccard distance, the less similar the two articles.
Step 304: Obtain, based on the description time parameter, effective time and current time information of each of the N articles, the first stylish degree parameter used to assess the stylish degree of the article.
Here, the effective time parameter characterizes the effective time corresponding to the category.
In actual implementation, the mapping between article categories and effective time parameters is preset. Once the category corresponding to an article has been determined, the corresponding effective time can be determined from the preset mapping.
The remaining effective time parameter of an article is obtained from the following formula:
Remaining effective time = current time - (description time + effective time);
The remaining effective time is used as the first stylish degree parameter of the article.
Step 305: Determine the quantity of articles in each category obtained by clustering, and obtain, based on a preset judgment rule, the heat level of each article, which characterizes the heat of the article, according to the quantity and the variation of the quantity within at least two preset time periods.
Here, in actual implementation, the preset heat judgment rule can be set according to actual needs. For example, when the quantity of text information in the same category is greater than a preset value (such as 100): if the quantity declines continuously, the heat is continuously decreasing and the heat level is level 1; if the quantity first rises and then declines, the heat was once high but has started to decrease, and the heat level is level 2; if the quantity rises continuously, the heat is getting higher and higher, and the heat level is level 3. When the quantity of text information is less than the preset value, the heat level is considered to be level 1 regardless of how the quantity varies.
An example: the category corresponding to the first text information contains 120 pieces of text information, of which 80 fall between June 1 and June 3, 30 fall between June 4 and June 7, and 10 fall between June 8 and June 11. Across these time periods the quantity of text information declines continuously, so the heat level of this category of text information is judged to be level 1.
Step 306: Obtain the second stylish degree parameter of each article based on the first stylish degree parameter of the article and the heat level of the article, sort the N articles to be assessed based on the second stylish degree parameter, and take at least one of the N articles as a candidate recommended article according to the sorting result.
Here, in actual implementation, this step specifically includes:
Obtaining the first assessment score corresponding to each article according to the first stylish degree parameters of the N articles and preset first scoring tactics;
Obtaining the second assessment score corresponding to each article according to the heat levels of the N articles and preset second scoring tactics;
Weighting the first assessment score and the second assessment score to obtain the second stylish degree parameter used to assess the stylish degree of the article.
In practical applications, the first scoring tactics and the second scoring tactics are set according to actual needs.
An example of the first scoring tactics: if the remaining effective time of the article is greater than 2 days, the first assessment score is 5 points; if it is greater than 0 and less than 2 days, the first assessment score is 4 points; if the article expired within the last two days (i.e. -2 < remaining effective time < 0), the first assessment score is 3 points; if it expired more than two days ago (remaining effective time < -2), the first assessment score is 2 points.
An example of the second scoring tactics: the heat level is divided into three levels; level 1 corresponds to a second assessment score of 2 points, level 2 to 3 points, and level 3 to 5 points.
After the first assessment score and the second assessment score of an article have been obtained, they are weighted to obtain the second stylish degree parameter of that article, namely:
Second stylish degree parameter = x * first assessment score + y * second assessment score, where x and y are positive numbers and x + y = 1.
After the N articles to be assessed have been sorted based on the second stylish degree parameter, at least one of the N articles is selected as a candidate recommended article according to a recommended-article selection strategy; for example, the top 5 articles in the sorting result are selected as recommended articles.
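Steps 304 to 306 can be strung together for a handful of articles as in the sketch below. It is an illustration under explicit assumptions, not the method itself: the `Article` record and the `recommend` function are hypothetical, the description date, effective days and heat level are assumed to have been produced by the earlier steps, and the remaining effective time is computed as expiry minus today so that positive values mean time left, which matches the scoring example but is the opposite sign of the formula as written.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Article:
    title: str
    description_date: date  # from the description-time model (step 302)
    effective_days: int     # looked up from the article's cluster (step 303)
    heat_level: int         # 1, 2 or 3 (step 305)


def first_score(remaining_days: int) -> int:
    """Example first scoring tactics; boundary handling is an assumption."""
    if remaining_days > 2:
        return 5
    if remaining_days > 0:
        return 4
    if remaining_days > -2:
        return 3
    return 2


SECOND_SCORES = {1: 2, 2: 3, 3: 5}  # example second scoring tactics


def recommend(articles: List[Article], today: date,
              x: float = 0.5, top_k: int = 5) -> List[str]:
    """Score every article, sort in descending order, return top_k titles."""
    def parameter(article: Article) -> float:
        expiry = article.description_date + timedelta(days=article.effective_days)
        remaining = (expiry - today).days  # assumed sign: positive = time left
        return x * first_score(remaining) + (1 - x) * SECOND_SCORES[article.heat_level]

    ranked = sorted(articles, key=parameter, reverse=True)
    return [article.title for article in ranked[:top_k]]
```

With three articles scored 5.0, 2.0 and 3.5 and `top_k=2`, the first and third titles are returned in that order.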
Step 307: Send the relevant information of the candidate recommended article to the terminal of the recommended user.
Here, in actual implementation, after the candidate recommended article has been determined, its relevant information is obtained, such as the article title and the article link. Fig. 5 is a schematic diagram of the terminal of the recommended user displaying the relevant information of the candidate recommended article; it may be displayed in various ways, for example in the form of an advertisement (table plague advertisement, primary advertisement).
Embodiment four
An embodiment of the present invention provides an information processing device. Fig. 6 is a schematic diagram of the composition of the information processing device in this embodiment. As shown in Fig. 6, the information processing device in this embodiment includes:
An obtaining module 11, configured to extract the character features of the first text information to be assessed and obtain a description time parameter based on the character features and the learning model obtained by training, where the description time parameter characterizes the time information described by the content of the first text information;
A clustering module 12, configured to cluster all the text information to be assessed, including the first text information, according to a preset clustering mode, so as to identify the category corresponding to the first text information;
A determining module 13, configured to determine the effective time parameter of the first text information based on the category corresponding to the first text information, where the effective time parameter characterizes the effective time corresponding to the category;
A processing module 14, configured to obtain the first stylish degree parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
In one embodiment, the device further includes:
A model training module 15, configured to label the character features of multiple acquired pieces of second text information respectively, to obtain multiple pieces of sample information;
And to train a learning model on the multiple pieces of sample information with the character features as training features, so as to form, based on the learning model, the mapping between the character features and the description time parameter.
In one embodiment, the processing module 14 is further configured to obtain the first stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
Sort all the text information according to the first stylish degree parameter of each piece of text information among all the text information, to obtain a first sorting result;
And take at least one piece of text information among all the text information as candidate recommended text information according to the first sorting result.
In one embodiment, the processing module 14 is further configured to determine, among all the text information, third text information whose similarity to the first text information meets a preset requirement, where the description time parameter corresponding to the third text information is empty;
And set the description time parameter of the third text information equal to the description time parameter of the first text information.
In one embodiment, the processing module 14 is further configured to determine the quantity of text information in the same category as the first text information and the variation of the quantity within at least two preset time periods;
Obtain, according to the quantity, the variation of the quantity and a preset heat judgment condition, the heat level of the first text information, which characterizes the heat of the text;
And obtain the second stylish degree parameter of the first text information based on the first stylish degree parameter of the first text information and the heat level of the first text information.
In one embodiment, the processing module 14 is further configured to obtain the second stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
Sort all the text information according to the second stylish degree parameter of each piece of text information among all the text information, to obtain a second sorting result;
And take at least one piece of text information among all the text information as candidate recommended text information according to the second sorting result.
An embodiment of the present invention further provides an information processing device, including: a memory, configured to store an executable program;
A processor, configured to implement the following by executing the executable program:
Extracting the character features of the first text information to be assessed, and obtaining a description time parameter based on the character features and the learning model obtained by training, where the description time parameter characterizes the time information described by the content of the first text information;
Clustering all the text information to be assessed, including the first text information, according to a preset clustering mode, so as to identify the category corresponding to the first text information;
Determining the effective time parameter of the first text information based on the category corresponding to the first text information, where the effective time parameter characterizes the effective time corresponding to the category;
Obtaining the first stylish degree parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
The processor is further configured to implement the following by executing the executable program:
Labeling the character features of multiple acquired pieces of second text information respectively, to obtain multiple pieces of sample information;
Training a learning model on the multiple pieces of sample information with the character features as training features, so as to form, based on the learning model, the mapping between the character features and the description time parameter.
The processor is further configured to implement the following by executing the executable program:
Obtaining the first stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
Sorting all the text information according to the first stylish degree parameter of each piece of text information among all the text information, to obtain a first sorting result;
Taking at least one piece of text information among all the text information as candidate recommended text information according to the first sorting result.
The processor is further configured to implement the following by executing the executable program:
Determining, among all the text information, third text information whose similarity to the first text information meets a preset requirement, where the description time parameter corresponding to the third text information is empty;
Setting the description time parameter of the third text information equal to the description time parameter of the first text information.
The processor is further configured to implement the following by executing the executable program:
Determining the quantity of text information in the same category as the first text information and the variation of the quantity within at least two preset time periods;
Obtaining, according to the quantity, the variation of the quantity and a preset heat judgment condition, the heat level of the first text information, which characterizes the heat of the text;
Obtaining the second stylish degree parameter of the first text information based on the first stylish degree parameter of the first text information and the heat level of the first text information.
The processor is further configured to implement the following by executing the executable program:
Obtaining the first assessment score corresponding to the first text information according to the first stylish degree parameter of the first text information and preset first scoring tactics;
Obtaining the second assessment score corresponding to the first text information according to the heat level of the first text information and preset second scoring tactics;
Weighting the first assessment score and the second assessment score to obtain the second stylish degree parameter of the first text information.
The processor is further configured to implement the following by executing the executable program:
Obtaining the second stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
Sorting all the text information according to the second stylish degree parameter of each piece of text information among all the text information, to obtain a second sorting result;
Taking at least one piece of text information among all the text information as candidate recommended text information according to the second sorting result.
An embodiment of the present invention further provides a readable storage medium storing an executable program which, when executed by a processor, implements:
Extracting the character features of the first text information to be assessed, and obtaining a description time parameter based on the character features and the learning model obtained by training, where the description time parameter characterizes the time information described by the content of the first text information;
Clustering all the text information to be assessed, including the first text information, according to a preset clustering mode, so as to identify the category corresponding to the first text information;
Determining the effective time parameter of the first text information based on the category corresponding to the first text information, where the effective time parameter characterizes the effective time corresponding to the category;
Obtaining the first stylish degree parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
When executed by the processor, the executable program further implements:
Labeling the character features of multiple acquired pieces of second text information respectively, to obtain multiple pieces of sample information;
Training a learning model on the multiple pieces of sample information with the character features as training features, so as to form, based on the learning model, the mapping between the character features and the description time parameter.
When executed by the processor, the executable program further implements:
Obtaining the first stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
Sorting all the text information according to the first stylish degree parameter of each piece of text information among all the text information, to obtain a first sorting result;
Taking at least one piece of text information among all the text information as candidate recommended text information according to the first sorting result.
When executed by the processor, the executable program further implements:
Determining, among all the text information, third text information whose similarity to the first text information meets a preset requirement, where the description time parameter corresponding to the third text information is empty;
Setting the description time parameter of the third text information equal to the description time parameter of the first text information.
When executed by the processor, the executable program further implements:
Determining the quantity of text information in the same category as the first text information and the variation of the quantity within at least two preset time periods;
Obtaining, according to the quantity, the variation of the quantity and a preset heat judgment condition, the heat level of the first text information, which characterizes the heat of the text;
Obtaining the second stylish degree parameter of the first text information based on the first stylish degree parameter of the first text information and the heat level of the first text information.
When executed by the processor, the executable program further implements:
Obtaining the first assessment score corresponding to the first text information according to the first stylish degree parameter of the first text information and preset first scoring tactics;
Obtaining the second assessment score corresponding to the first text information according to the heat level of the first text information and preset second scoring tactics;
Weighting the first assessment score and the second assessment score to obtain the second stylish degree parameter of the first text information.
When executed by the processor, the executable program further implements:
Obtaining the second stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
Sorting all the text information according to the second stylish degree parameter of each piece of text information among all the text information, to obtain a second sorting result;
Taking at least one piece of text information among all the text information as candidate recommended text information according to the second sorting result.
It should be noted that the above description of the information processing device is similar to the description of the method, and the beneficial effects, being the same as those of the method, are not repeated here. For technical details not disclosed in the device embodiments of the present invention, refer to the description of the method embodiments of the present invention.
In this embodiment, an example of the information processing device as a hardware entity is shown in Fig. 7. The information processing device includes a processor 61, a storage medium 62 and at least one external communication interface 63; the processor 61, the storage medium 62 and the external communication interface 63 are connected by a bus 64.
Those skilled in the art will understand that all or part of the steps of the above method embodiments may be completed by hardware under the instruction of a program. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a random access memory (RAM), a read-only memory (ROM), a magnetic disk or an optical disc.
Alternatively, if the above integrated unit of the present invention is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence or in the part that contributes to the related art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a terminal, a network device or the like) to execute all or part of the methods of the embodiments of the present invention. The storage medium includes various media that can store program code, such as a removable storage device, a RAM, a ROM, a magnetic disk or an optical disc.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (15)

1. An information processing method, characterized in that the method comprises:
Extracting the character features of first text information to be assessed, and obtaining a description time parameter based on the character features and a preset mapping relationship, where the description time parameter characterizes the time information described by the content of the first text information;
Clustering all the text information to be assessed, including the first text information, according to a preset clustering mode, so as to identify the category corresponding to the first text information;
Determining the effective time parameter of the first text information based on the category corresponding to the first text information, where the effective time parameter characterizes the effective time corresponding to the category;
Obtaining the first stylish degree parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
2. The method according to claim 1, characterized in that the method further comprises:
Labeling the character features of multiple acquired pieces of second text information respectively, to obtain multiple pieces of sample information;
Training a learning model on the multiple pieces of sample information with the character features as training features, so as to form, based on the learning model, the mapping relationship between the character features and the description time parameter.
3. The method according to claim 1 or 2, characterized in that clustering all text information to be assessed, including the first text information, according to the preset clustering mode, so as to identify the category corresponding to the first text information, comprises:
obtaining, according to the preset clustering mode, the distance between every two pieces of text information among all the text information to be assessed;
merging pieces of text information on a nearest-distance principle to form at least one cluster;
determining the cluster to which the first text information belongs, so as to identify the category corresponding to the first text information.
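The three steps read like agglomerative clustering. The sketch below assumes single-linkage merging, a Jaccard distance over token sets, and a stopping threshold `max_distance`; none of these choices is stated in the source, and the function names are illustrative.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| over token sets (an assumed distance measure)."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def cluster(texts, max_distance):
    """Repeatedly merge the two nearest clusters (single linkage)."""
    clusters = [[i] for i in range(len(texts))]

    def linkage(c1, c2):
        return min(jaccard_distance(texts[i], texts[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        (x, y), nearest = min(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if nearest > max_distance:
            break  # remaining clusters are too far apart to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe: combinations guarantees x < y
    return clusters


def cluster_of(clusters, index):
    """Return the label (position) of the cluster containing text `index`."""
    for label, members in enumerate(clusters):
        if index in members:
            return label
    raise ValueError("index not found in any cluster")


texts = [
    ["rain", "storm", "city"],
    ["rain", "storm", "flood"],
    ["match", "goal", "final"],
]
clusters = cluster(texts, max_distance=0.6)
category_of_first = cluster_of(clusters, 0)
```

Here the two weather texts are at distance 0.5 and merge, while the sports text stays in its own cluster.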
4. The method according to claim 1 or 2, characterized in that the method further comprises:
obtaining respectively the first stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
sorting all the text information according to the first stylish degree parameter of each piece of text information, to obtain a first ranking result;
taking at least one piece of text information among all the text information as candidate recommended text information according to the first ranking result.
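The ranking step can be sketched as a descending sort followed by a cut-off. The cut-off `top_k` and the function name `candidate_recommendations` are assumptions; the claim only requires that at least one text be selected according to the ranking.

```python
def candidate_recommendations(stylish_degree_by_id, top_k):
    """Sort text ids by first stylish degree parameter, highest first,
    and keep the top_k ids as candidate recommended text information."""
    ranked = sorted(stylish_degree_by_id.items(),
                    key=lambda item: item[1],
                    reverse=True)
    return [text_id for text_id, _ in ranked[:top_k]]


candidates = candidate_recommendations({"a": 0.2, "b": 0.9, "c": 0.5}, top_k=2)
```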
5. The method according to claim 1 or 2, characterized in that the method further comprises:
determining, among all the text information, third text information whose similarity to the first text information meets a preset requirement, the description time parameter corresponding to the third text information being empty;
setting the description time parameter of the third text information equal to the description time parameter of the first text information.
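A minimal sketch of this back-filling step follows. It assumes Jaccard similarity over tokens and a threshold of 0.5 as the "preset requirement"; both are illustrative choices, as are the dictionary keys `tokens` and `description_time`.

```python
def jaccard_similarity(a, b):
    """|A & B| / |A | B| over token sets (an assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fill_empty_description_time(first, others, threshold=0.5):
    """Copy the first text's description time onto similar texts that lack one.

    Texts that already carry a description time are left unchanged.
    """
    filled = []
    for text in others:
        similar = jaccard_similarity(first["tokens"], text["tokens"]) >= threshold
        if text["description_time"] is None and similar:
            text = {**text, "description_time": first["description_time"]}
        filled.append(text)
    return filled


first = {"tokens": ["typhoon", "landfall", "friday"],
         "description_time": "2017-08-04"}
others = [
    {"tokens": ["typhoon", "landfall", "coast"], "description_time": None},
    {"tokens": ["stock", "market"], "description_time": None},
    {"tokens": ["typhoon", "landfall", "friday"], "description_time": "2017-08-01"},
]
result = fill_empty_description_time(first, others)
```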
6. The method according to claim 1 or 2, characterized in that the method further comprises:
determining the quantity of text information of the same category as the first text information, and the variation of that quantity over at least two preset time periods;
obtaining, according to the quantity, the variation of the quantity and a preset temperature judgment condition, a heat range of the first text information, the heat range characterizing the temperature, that is, the popularity, of the text;
obtaining a second stylish degree parameter of the first text information based on the first stylish degree parameter of the first text information and the heat range of the first text information.
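The temperature judgment condition is not spelled out in the claim. The sketch below assumes three heat ranges (0 cold, 1 warm, 2 hot), a minimum count, and a growth ratio between two consecutive periods; every threshold and the function name `heat_range` are hypothetical.

```python
def heat_range(count_previous, count_current, min_count=10, growth_ratio=1.5):
    """Map same-category text counts in two periods to a heat range.

    0 -- cold: too few texts in the current period
    1 -- warm: enough texts, but no marked growth
    2 -- hot:  enough texts and the count grew by at least growth_ratio
    """
    if count_current < min_count:
        return 0
    if count_previous == 0 or count_current / count_previous >= growth_ratio:
        return 2
    return 1


levels = [heat_range(4, 5), heat_range(20, 22), heat_range(20, 40)]
```

More than two periods could be handled by applying the same comparison to each consecutive pair and taking, for example, the most recent result.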
7. The method according to claim 6, characterized in that obtaining the second stylish degree parameter of the first text information based on the first stylish degree parameter of the first text information and the heat range of the first text information comprises:
obtaining a first assessment score corresponding to the first text information according to the first stylish degree parameter of the first text information and a preset first scoring strategy;
obtaining a second assessment score corresponding to the first text information according to the heat range of the first text information and a preset second scoring strategy;
weighting the first assessment score and the second assessment score to obtain the second stylish degree parameter of the first text information.
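As an illustration of the weighting step, the sketch below assumes both scoring strategies map their input linearly onto a 0 to 100 scale and that the weights sum to one. The scale, the weights, and the function name `second_stylish_degree` are assumptions rather than values given in the source.

```python
def second_stylish_degree(first_stylish_degree, heat_level,
                          weight_stylish=0.7, weight_heat=0.3,
                          max_heat_level=2):
    """Weighted combination of two assessment scores (hypothetical strategies).

    first_stylish_degree -- value in [0.0, 1.0]
    heat_level           -- integer heat range in [0, max_heat_level]
    """
    # Assumed first scoring strategy: linear map onto 0..100.
    first_score = 100.0 * first_stylish_degree
    # Assumed second scoring strategy: linear map of the heat range onto 0..100.
    second_score = 100.0 * heat_level / max_heat_level
    return weight_stylish * first_score + weight_heat * second_score


combined = second_stylish_degree(0.5, 2, weight_stylish=0.5, weight_heat=0.5)
```

With equal weights, a first score of 50 and a second score of 100 combine to 75.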
8. The method according to claim 6, characterized in that the method further comprises:
obtaining respectively the second stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
sorting all the text information according to the second stylish degree parameter of each piece of text information, to obtain a second ranking result;
taking at least one piece of text information among all the text information as candidate recommended text information according to the second ranking result.
9. An information processing device, characterized in that the device comprises:
an obtaining module, configured to extract character features of first text information to be assessed, and to obtain a description time parameter based on the character features and a preset mapping relationship, the description time parameter characterizing the time information described by the content of the first text information;
a clustering module, configured to cluster all text information to be assessed, including the first text information, according to a preset clustering mode, so as to identify the category corresponding to the first text information;
a determining module, configured to determine an effective time parameter of the first text information based on the category corresponding to the first text information, the effective time parameter characterizing the effective time of the corresponding category;
a processing module, configured to obtain a first stylish degree parameter of the first text information based on the description time parameter, the effective time parameter and current time information.
10. The device according to claim 9, characterized in that the device further comprises:
a model training module, configured to label the character features of a plurality of pieces of acquired second text information respectively, to obtain a plurality of pieces of sample information;
and to train a learning model on the plurality of pieces of sample information with the character features as training features, so as to form, based on the learning model, the mapping relationship between the character features and the description time parameter.
11. The device according to claim 9 or 10, characterized in that
the processing module is further configured to obtain respectively the first stylish degree parameter of each piece of text information, other than the first text information, among all the text information;
to sort all the text information according to the first stylish degree parameter of each piece of text information, obtaining a first ranking result;
and to take at least one piece of text information among all the text information as candidate recommended text information according to the first ranking result.
12. The device according to claim 9 or 10, characterized in that
the processing module is further configured to determine, among all the text information, third text information whose similarity to the first text information meets a preset requirement, the description time parameter corresponding to the third text information being empty;
and to set the description time parameter of the third text information equal to the description time parameter of the first text information.
13. The device according to claim 9 or 10, characterized in that
the processing module is further configured to determine the quantity of text information of the same category as the first text information, and the variation of that quantity over at least two preset time periods;
to obtain, according to the quantity, the variation of the quantity and a preset temperature judgment condition, a heat range of the first text information, the heat range characterizing the temperature, that is, the popularity, of the text;
and to obtain a second stylish degree parameter of the first text information based on the first stylish degree parameter of the first text information and the heat range of the first text information.
14. An information processing device, characterized in that the device comprises:
a memory, configured to store an executable program;
a processor, configured to implement the information processing method according to any one of claims 1 to 8 by executing the executable program.
15. A readable storage medium, characterized in that an executable program is stored thereon, and the executable program, when executed by a processor, implements the information processing method according to any one of claims 1 to 8.
CN201710660877.7A 2017-08-04 2017-08-04 Information processing method, device and storage medium Active CN110008334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710660877.7A CN110008334B (en) 2017-08-04 2017-08-04 Information processing method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710660877.7A CN110008334B (en) 2017-08-04 2017-08-04 Information processing method, device and storage medium

Publications (2)

Publication Number Publication Date
CN110008334A true CN110008334A (en) 2019-07-12
CN110008334B CN110008334B (en) 2023-03-14

Family

ID=67164013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710660877.7A Active CN110008334B (en) 2017-08-04 2017-08-04 Information processing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN110008334B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060285172A1 (en) * 2004-10-01 2006-12-21 Hull Jonathan J Method And System For Document Fingerprint Matching In A Mixed Media Environment
US20070067177A1 (en) * 2005-08-31 2007-03-22 Temptime Corporation Quality assurance system and methods of use
US20080201131A1 (en) * 2007-02-15 2008-08-21 Gautam Kar Method and apparatus for automatically discovering features in free form heterogeneous data
US20090234861A1 (en) * 2005-09-14 2009-09-17 Jorey Ramer Using mobile application data within a monetization platform
CN101853261A (en) * 2009-11-23 2010-10-06 电子科技大学 Network public-opinion behavior analysis method based on social network
CN102270212A (en) * 2011-04-07 2011-12-07 浙江工商大学 User interest feature extraction method based on hidden semi-Markov model
CN103927365A (en) * 2014-04-21 2014-07-16 武汉大学 Web page time sensibility measurement method based on energy function
CN104731811A (en) * 2013-12-20 2015-06-24 北京师范大学珠海分校 Cluster information evolution analysis method for large-scale dynamic short texts
CN104808964A (en) * 2015-05-04 2015-07-29 腾讯科技(北京)有限公司 Information processing method and client side
CN105528432A (en) * 2015-12-15 2016-04-27 北大方正集团有限公司 Digital resource hotspot generating method and device
CN106294356A (en) * 2015-05-14 2017-01-04 北京大学 Microblogging timeline based on dynamic clustering generates method and device
CN106383877A (en) * 2016-09-12 2017-02-08 电子科技大学 On-line short text clustering and topic detection method of social media
CN106815297A (en) * 2016-12-09 2017-06-09 宁波大学 A kind of academic resources recommendation service system and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Qing Qiang Wu et al.: "Emerging Topic Detection Model Based on LDA and Its Application in Stem Cell Field" *
万红新: "动态时间分布LDA的网络舆情热点词链提取研究" *

Also Published As

Publication number Publication date
CN110008334B (en) 2023-03-14

Similar Documents

Publication Publication Date Title
Anastasopoulos et al. Machine learning for public administration research, with application to organizational reputation
US11645319B1 (en) Systems and methods for identifying issues in electronic documents
CN102576358B (en) Word pair acquisition device, word pair acquisition method, and program
Kestemont et al. Cross-genre authorship verification using unmasking
CN107862070B (en) Online classroom discussion short text instant grouping method and system based on text clustering
CN108363821A (en) A kind of information-pushing method, device, terminal device and storage medium
CN110532451A (en) Search method and device for policy text, storage medium, electronic device
Benchimol et al. Text mining methodologies with R: An application to central bank texts
CN108416375B (en) Work order classification method and device
CN108885623A (en) The lexical analysis system and method for knowledge based map
CN110110225B (en) Online education recommendation model based on user behavior data analysis and construction method
CN107807958B (en) Personalized article list recommendation method, electronic equipment and storage medium
CN110888990A (en) Text recommendation method, device, equipment and medium
CN110134845A (en) Project public sentiment monitoring method, device, computer equipment and storage medium
CN113254777B (en) Information recommendation method and device, electronic equipment and storage medium
EP3961426A2 (en) Method and apparatus for recommending document, electronic device and medium
CN112328857A (en) Product knowledge aggregation method and device, computer equipment and storage medium
CN115186654A (en) Method for generating document abstract
CN116882414B (en) Automatic comment generation method and related device based on large-scale language model
CN111259223B (en) News recommendation and text classification method based on emotion analysis model
Cho et al. Topic category analysis on twitter via cross-media strategy
Fritsche et al. Deciphering professional forecasters' stories: Analyzing a corpus of textual predictions for the German economy
CN115587828A (en) Interpretable method of telecommunication fraud scene based on Shap value
CN110717008B (en) Search result ordering method and related device based on semantic recognition
US11403654B2 (en) Identifying competitors of companies

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant