CN110008334B

CN110008334B - Information processing method, device and storage medium

Info

Publication number: CN110008334B
Application number: CN201710660877.7A
Authority: CN
Inventors: 王树伟; 温旭; 花贵春; 范欣; 姜国华
Original assignee: Tencent Technology Beijing Co Ltd
Current assignee: Tencent Technology Beijing Co Ltd
Priority date: 2017-08-04
Filing date: 2017-08-04
Publication date: 2023-03-14
Anticipated expiration: 2037-08-04
Also published as: CN110008334A

Abstract

The invention discloses an information processing method, an information processing device and a storage medium. The method comprises: extracting text features of first text information to be evaluated, and obtaining a description time parameter based on the text features and a preset mapping relation, wherein the description time parameter represents the time information described by the content of the first text information; clustering all text information to be evaluated, including the first text information, according to a preset clustering mode so as to identify the category corresponding to the first text information; determining an effective time parameter of the first text information based on that category, wherein the effective time parameter represents the effective duration corresponding to the category; and obtaining a first time-freshness parameter of the first text information based on the description time parameter, the effective time parameter and the current time information.

Description

Information processing method, device and storage medium

Technical Field

The present invention relates to internet information processing technologies, and in particular, to an information processing method, an information processing apparatus, and a storage medium.

Background

Tens of thousands of articles are published on the internet every day, and users care a great deal about how "fresh" an article is. Freshness characterizes the timeliness of an article, for example whether it is the latest news or old news that has expired; the measure of this freshness is called the time-freshness. At present, the time-freshness of an article is usually evaluated by identifying the time information carried in the article. However, when the article does not state its time explicitly, this approach cannot identify it, so the recall rate is low. Moreover, if the article describes something that happened recently but part of it refers to a historical event that occurred long ago, a recognition error results.

Disclosure of Invention

Embodiments of the present invention provide an information processing method, an information processing apparatus, and a storage medium, which can accurately evaluate the timeliness and the newness of an article and have a high recall rate.

The technical scheme of the embodiment of the invention is realized as follows:

the embodiment of the invention provides an information processing method, which comprises the following steps:

extracting text features of first text information to be evaluated, and obtaining a description time parameter based on the text features and a preset mapping relation, wherein the description time parameter represents the time information described by the content of the first text information;

clustering all text information to be evaluated, including the first text information, according to a preset clustering mode so as to identify the category corresponding to the first text information;

determining an effective time parameter of the first text information based on the category corresponding to the first text information, wherein the effective time parameter represents the effective duration corresponding to the category;

and obtaining a first time-freshness parameter of the first text information based on the description time parameter, the effective time parameter and the current time information.

In the above scheme, before extracting the text features of the first text information to be evaluated, the method further includes:

respectively marking character characteristics of the collected second character information to obtain a plurality of sample information;

and training a learning model for the plurality of sample information by taking the character features as training features so as to form a mapping relation between the character features and the description time parameters based on the learning model.

In the above scheme, the method further comprises:

respectively acquiring a first time-freshness parameter of each piece of text information other than the first text information among all the text information;

sorting all the text information according to the first time-freshness parameter of each piece of text information to obtain a first sorting result;

and taking at least one piece of text information among all the text information as candidate recommended text information according to the first sorting result.

In the above scheme, the method further comprises:

determining third text information of which the similarity with the first text information meets a preset requirement in all the text information; the description time parameter corresponding to the third text message is null;

and setting the description time parameter of the third text information to be equal to the description time parameter of the first text information.

In the above scheme, the method further comprises:

determining the number of the text messages with the same category as the first text message and the change of the number in at least two preset time periods;

according to the quantity, the change of the quantity and a preset heat judgment condition, obtaining a heat grade of the first character information for representing the character heat;

and obtaining a second time-freshness parameter of the first text information based on the first time-freshness parameter of the first text information and the heat level of the first text information.

In the above scheme, obtaining a second temporal freshness parameter of the first text information based on the first temporal freshness parameter of the first text information and the heat level of the first text information includes:

obtaining a first evaluation score corresponding to the first text information according to a first time-freshness parameter of the first text information and a preset first scoring strategy;

obtaining a second evaluation score corresponding to the first character information according to the heat degree grade of the first character information and a preset second scoring strategy;

and weighting the first evaluation score and the second evaluation score to obtain the second time-freshness parameter of the first text information.

In the above scheme, the method further comprises:

respectively acquiring a second time-freshness parameter of each piece of text information other than the first text information among all the text information;

sorting all the text information according to the second time-freshness parameter of each piece of text information to obtain a second sorting result;

and taking at least one piece of text information among all the text information as candidate recommended text information according to the second sorting result.

An embodiment of the present invention further provides an information processing apparatus, where the apparatus includes:

the acquisition module is used for extracting character features of first character information to be evaluated, and acquiring description time parameters based on the character features and a learning model obtained by training, wherein the description time parameters represent time information described by the content of the first character information;

the clustering module is used for clustering all character information to be evaluated including the first character information according to a preset clustering mode so as to identify the category corresponding to the first character information;

the determining module is used for determining an effective time parameter of the first text information based on the category corresponding to the first text information; the effective time parameter represents the effective time corresponding to the category;

and the processing module is used for obtaining a first time-freshness parameter of the first text information based on the description time parameter, the effective time parameter and the current time information.

In the above solution, the apparatus further includes:

the model training module is used for respectively marking the character characteristics of the collected second character information to obtain a plurality of sample information;

and training a learning model for the plurality of sample information by using character features as training features so as to form a mapping relation between the character features and the description time parameters based on the learning model.

In the above scheme, the processing module is further configured to obtain a first time-freshness parameter of each piece of text information other than the first text information among all the text information;

sorting all the text information according to the first time-freshness parameter of each piece of text information to obtain a first sorting result;

In the above scheme, the processing module is further configured to determine a third text message, of the all text messages, whose similarity to the first text message meets a preset requirement; the description time parameter corresponding to the third text message is null;

In the above scheme, the processing module is further configured to determine the number of text messages of the same category as the first text message, and a change of the number in at least two preset time periods;

and obtaining a second time-of-freshness parameter of the first text information based on the first time-of-freshness parameter of the first text information and the heat level of the first text information.

In the above scheme, the processing module is further configured to obtain a second time-freshness parameter of each piece of text information other than the first text information among all the text information;

sorting all the text information according to the second time-freshness parameter of each piece of text information to obtain a second sorting result;

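An embodiment of the present invention further provides an information processing apparatus, where the apparatus includes:

a memory for storing an executable program;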

and a processor for implementing the above-described information processing method by executing the executable program.

The embodiment of the invention also provides a readable storage medium, which stores an executable program, and the executable program realizes the information processing method when being executed by a processor.

By applying the information processing method, the information processing device and the storage medium provided by the embodiment of the invention, the time-freshness of the article can be accurately evaluated, the article can be recommended to the user, and the method, the device and the storage medium are simple to implement and high in recall rate.

Drawings

FIG. 1 is a diagram of hardware entities for performing information interaction in an embodiment of the present invention;

FIG. 2 is a first flowchart illustrating an information processing method according to an embodiment of the present invention;

FIG. 3 is a second flowchart illustrating an information processing method according to an embodiment of the present invention;

FIG. 4 is a third schematic flowchart illustrating an information processing method according to an embodiment of the present invention;

FIG. 5 is a diagram illustrating a display of article recommendations in an embodiment of the present invention;

FIG. 6 is a schematic diagram of a component structure of an information processing apparatus according to an embodiment of the present invention;

FIG. 7 is a diagram illustrating an information processing apparatus as an example of a hardware entity according to an embodiment of the present invention.

Detailed Description

The inventor found in research that the time information of an article can be used to represent its time-freshness, and that this time information can be identified with a word bank and a collocation bank. First, the article is preprocessed by word segmentation (i.e., the article content is split into words); for example, the sentence 'On February 14, the certificate Authority carries out a new round of investigation' is split into '2/month/14/day, certificate Authority/carries out/new/round/investigation' (in the word order of the original Chinese). Then, a word bank and a collocation bank of time expressions are established. The word bank includes date expressions (December 13, this week, etc.), time-of-day expressions (morning, afternoon, etc.), and time-range expressions (recently, a few days, a few days ago, etc.). The collocation bank includes combinations of date expressions and time-of-day expressions (the morning of the 12th, the evening of the 8th in 2016, and so on). Finally, the time expressions in the article are identified with the word bank and the collocation bank, and the time that best represents the event described by the article is selected as the event time, for example: the time that occurs most often in the text, the earliest time in the text, or the time at an important position in the text, such as the title, the first paragraph, or the first sentence.
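The three selection strategies named above can be sketched as follows. This is only an illustration, assuming the time expressions have already been recognized and converted into `(position, date)` pairs; the function names and the sample data are hypothetical and are not part of the original disclosure.

```python
from collections import Counter
from datetime import date


def most_frequent_time(mentions):
    """Return the date mentioned most often in the article."""
    counts = Counter(d for _, d in mentions)
    return counts.most_common(1)[0][0]


def earliest_time(mentions):
    """Return the earliest date mentioned in the article."""
    return min(d for _, d in mentions)


def first_position_time(mentions):
    """Return the date that appears first (e.g., in the title or first sentence)."""
    return min(mentions, key=lambda m: m[0])[1]


# Hypothetical (character offset, recognized date) pairs for one article.
mentions = [
    (0, date(2016, 12, 13)),   # appears in the title
    (40, date(2016, 12, 8)),
    (95, date(2016, 12, 13)),
]
```

Each strategy can return a different event time for the same article, which is one reason this approach is sensitive to historical events cited in the text.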

However, the inventor found that although a word bank or collocation bank can recognize time information fairly accurately, in many cases the recall rate is too low, that is, too few times are recognized. The defects of this approach include at least the following:

1. When an article contains no explicit time expression, its time-freshness cannot be identified with a word bank or collocation bank.

2. When an article describes something that happened recently but cites a historical event from long ago at the beginning, a recognition error occurs.

3. A word bank or rule bank never has complete coverage, so the time-freshness of some articles cannot be recognized well.

Before the present invention is explained in detail, terms and expressions provided by the embodiments of the present invention are explained, and the terms and expressions provided by the embodiments of the present invention are applied to the following explanations.

1) Time-freshness: the reference standard for measuring the timeliness of text information, reflecting how current the text information is; for example, the time-freshness of today's headline is higher than that of an article from one week ago.

2) Time-freshness parameter: a parameter for evaluating the time-freshness of text information, obtained based on time parameters related to the text information.

The invention is described in further detail below with reference to the drawings and specific embodiments.

It should be noted that the terms "first/second/third" in the embodiments of the present invention are only used to distinguish similar objects and do not represent a specific ordering of those objects. Where permitted, the objects distinguished by "first/second/third" may be interchanged in order or sequence, so that the embodiments of the invention described herein can be implemented in sequences other than those illustrated or described herein.

FIG. 1 is a schematic diagram of hardware entities performing information interaction in an embodiment of the present invention. FIG. 1 includes servers 11 … 1n and terminal devices 21-24, wherein the terminal devices (including mobile phones, desktop computers, PCs, all-in-one machines and the like) exchange information with the servers through a wired or wireless network. In one example, a terminal device obtains a plurality of pieces of text information (e.g., a plurality of articles) published on the network, processes them to obtain a time-freshness parameter of each piece of text information, sorts the pieces of text information based on the time-freshness parameters, and recommends at least one of them to other terminal users based on the sorting result, that is, sends recommendation information including related information (e.g., title, link, etc.) of the recommended article to other terminals.

Example one

An embodiment of the present invention provides an information processing method, and fig. 2 is a schematic flowchart of the information processing method in the embodiment of the present invention, and as shown in fig. 2, the information processing method in the embodiment of the present invention includes:

step 101: extracting character features of first character information to be evaluated, and obtaining description time parameters based on the character features and a preset mapping relation, wherein the description time parameters represent time information described by the content of the first character information.

In practical application, the text information to be evaluated may be obtained in advance, or may be captured from the internet by a web crawler. The text information may be any text-related information, such as an article, a text paragraph, a text news item, or an article abstract. Correspondingly, the first text information is one of a plurality of articles to be evaluated, and the text features may be key word features of the article, including at least one of the following: word features (i.e., the content characteristics represented by the meaning of the word), part-of-speech features (noun, adjective, etc.), and word-length features (e.g., the number of characters included).
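As a rough illustration of the three feature types listed above, the following sketch builds one feature record per word. It assumes the article has already been segmented and part-of-speech tagged by some external tool; the function name `token_features` and the sample tokens and tags are hypothetical.

```python
def token_features(tokens, pos_tags):
    """Build one feature dict per token: word, part of speech, and word length."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    return [
        {"word": tok, "pos": pos, "length": len(tok)}
        for tok, pos in zip(tokens, pos_tags)
    ]


# Hypothetical segmented words and their part-of-speech tags.
tokens = ["regulator", "launched", "new", "investigation"]
pos_tags = ["noun", "verb", "adjective", "noun"]
features = token_features(tokens, pos_tags)
```

Records of this shape are a common input format for sequence-labeling models, although the disclosure does not prescribe any particular representation.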

In practical implementation, before the text features of the first text information to be evaluated are extracted, the method further includes acquiring the mapping relationship, which specifically includes the following.

One example is: performing word segmentation and word feature labeling on M collected articles to obtain M pieces of sample information, where M is a positive integer greater than 2;

and training a machine learning model on the M pieces of sample information, using the word features as training features, so as to form a mapping relationship between the word features of an article and the description time parameter based on the machine learning model.

Correspondingly, obtaining the description time parameter based on the text features and the trained learning model includes:

importing the text features (such as the word features of an article) into the trained learning model to obtain the description time parameter that has a mapping relationship with the text features.

Based on the above embodiment of the present invention, in practical applications, the learning model may be a conditional random field model, which is used for lexical analysis tasks such as Chinese word segmentation and part-of-speech tagging. A conditional random field is a conditional probability distribution model of one set of output random variables given another set of input random variables, that is, a Markov random field of the random variable Y given the random variable X. Of course, the present invention is not limited to this machine learning model.

Step 102: and clustering all the character information to be evaluated including the first character information according to a preset clustering mode so as to identify the category corresponding to the first character information.

Here, in actual implementation, all the text information to be evaluated may be a plurality of articles to be evaluated, and the preset clustering mode may be agglomerative hierarchical clustering. At the beginning of clustering, each piece of text information (each article) is treated as its own cluster (i.e., a set of data objects). Following the nearest-distance principle, each step merges the two closest clusters: the currently closest point pair is taken in turn, and if the two points are not already in the same cluster, their two clusters are merged. Merging ends when the clustering threshold is met. In the embodiment of the present invention, two pieces of text information (two articles) are considered most similar when the sum of the reciprocal 1/d of the distance d between them and the similarity s of their keyword sets is greater than a preset threshold.

The distance between two articles can be represented by various distance measures. For example, with the edit distance, the two articles are regarded as two character strings of possibly different lengths, and the distance is the minimum number of edit operations required to convert one string into the other; the greater the edit distance, the less similar the two articles. Alternatively, with the Jaccard distance, the two articles are regarded as two character sets; the greater the Jaccard distance, the less similar the two articles.
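The two distance measures can be sketched as follows. The edit distance shown is the Levenshtein variant (insertion, deletion and substitution each cost 1), which is an assumption, since the text does not say which edit operations are allowed; the function names are illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character edits."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| over the character sets of the two texts."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0  # two empty texts are treated as identical
    return 1.0 - len(sa & sb) / len(sa | sb)
```

The edit distance is sensitive to character order and text length, whereas the Jaccard distance only compares which characters occur.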

Based on the above embodiment of the present invention, after all the text information to be evaluated is clustered, the text information to be evaluated is divided into different classes (clusters).
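A minimal sketch of the merging procedure described for Step 102 is given below, under several assumptions that are not stated in the text: the distance d is taken to be a character-set Jaccard distance, the keyword similarity s is taken to be a Jaccard similarity, identical texts (d = 0) are treated as infinitely close, and cluster membership is tracked with a union-find structure. All names and the sample articles are hypothetical.

```python
from itertools import combinations


def jaccard_similarity(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def closeness(doc_a, doc_b) -> float:
    """1/d + s for two articles, as in the merging criterion of Step 102."""
    d = 1.0 - jaccard_similarity(set(doc_a["text"]), set(doc_b["text"]))
    s = jaccard_similarity(doc_a["keywords"], doc_b["keywords"])
    return float("inf") if d == 0 else 1.0 / d + s


def cluster(docs, threshold):
    """Merge the closest pairs first; stop once 1/d + s no longer exceeds threshold."""
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = sorted(
        ((closeness(docs[i], docs[j]), i, j)
         for i, j in combinations(range(len(docs)), 2)),
        reverse=True,
    )
    for score, i, j in pairs:
        if score <= threshold:
            break
        ri, rj = find(i), find(j)
        if ri != rj:          # the pair is not yet in the same cluster
            parent[rj] = ri
    return [find(i) for i in range(len(docs))]


# Hypothetical articles; the threshold value 2.0 is also an assumption.
docs = [
    {"text": "rate cut announced", "keywords": {"rate", "cut"}},
    {"text": "rate cut announced today", "keywords": {"rate", "cut", "today"}},
    {"text": "zoo opens", "keywords": {"zoo"}},
]
labels = cluster(docs, threshold=2.0)
```

Articles that end up with the same label belong to the same category (cluster); here the first two articles are merged and the third remains on its own.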

Step 103: determining an effective time parameter of the first text information based on the category corresponding to the first text information; the valid time parameter represents the valid duration corresponding to the category.

In practical implementation, a mapping relationship between the category of the text information and the validity time parameter is preset, and after the category corresponding to the first text information is determined, the corresponding validity time parameter can be determined based on the preset mapping relationship, for example, the validity duration corresponding to the cluster to which the first text information belongs is determined to be 5 days.

Step 104: obtaining a first time-freshness parameter of the first text information based on the description time parameter, the effective time parameter and the current time information.

Here, in actual implementation, the step may specifically include:

obtaining the remaining validity time of the first text information based on the following formula:

remaining validity time = current time - (description time + validity time);

and taking the remaining validity time as the first time-freshness parameter of the first text information.

One example is: when the current time is June 28, the validity time of the first text information is 5 days, and the description time of the first text information is June 22, the remaining validity time of the first text information = June 28 - (June 22 + 5 days) = 1 day.
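The formula and the worked example can be reproduced with a short sketch. The category name, the mapping `VALIDITY_DAYS_BY_CATEGORY`, the function name and the year 2017 are assumptions introduced only for illustration. The computation follows the formula exactly as written, so the result becomes positive once the current time is later than the description time plus the validity time.

```python
from datetime import date, timedelta

# Hypothetical preset mapping from category (cluster) to validity duration in days.
VALIDITY_DAYS_BY_CATEGORY = {"cluster_a": 5}


def remaining_validity_days(current: date, description: date, validity_days: int) -> int:
    """current time - (description time + validity time), in whole days."""
    return (current - (description + timedelta(days=validity_days))).days


validity = VALIDITY_DAYS_BY_CATEGORY["cluster_a"]
result = remaining_validity_days(date(2017, 6, 28), date(2017, 6, 22), validity)
print(result)  # 1, matching the example in the text
```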

Based on the above embodiment of the present invention, in practical application there are a plurality of pieces of text information (articles) to be evaluated. The first time-freshness parameter of every article to be evaluated is obtained in the manner of steps 101 to 104, the articles are then ranked according to their first time-freshness parameters, and some of the articles in the ranking result are selected according to a preset rule (for example, the 10 articles ranked first) as candidate recommended articles to be recommended to the user.
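The ranking and selection step might look like the sketch below. The text does not state the sort direction; this sketch assumes that smaller values of the Step 104 formula rank first, since under that formula a smaller value means the article is further from, or less far past, the end of its validity period. The function name and sample values are hypothetical.

```python
def candidate_recommendations(freshness_by_article, top_k=10):
    """Sort article ids by first time-freshness parameter and keep the first top_k."""
    # Assumption: ascending order, i.e., smaller parameter values rank first.
    ranked = sorted(freshness_by_article, key=freshness_by_article.get)
    return ranked[:top_k]


# Hypothetical first time-freshness parameters (in days) for three articles.
params = {"a": 1, "b": -3, "c": 0}
top_two = candidate_recommendations(params, top_k=2)
```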

Example two

An embodiment of the present invention provides an information processing method, and fig. 3 is a schematic flowchart of the information processing method in the embodiment of the present invention, and as shown in fig. 3, the information processing method in the embodiment of the present invention includes:

step 201: and extracting the character features of the first character information to be evaluated, and obtaining the description time parameter based on the character features and a preset mapping relation.

Here, the description time parameter represents time information described by the first text information content.

In practical application, the text information to be evaluated may be obtained in advance, or may be obtained by capturing from the internet through a web crawler. The text information may be an article, and accordingly, the first text information is one of a plurality of articles to be evaluated, and the text feature may be a word key feature of the article, including at least one of: word characteristics (i.e., the content characteristics represented by the meaning of the word), part-of-speech characteristics (nouns, adjectives, etc.), and word length characteristics (e.g., the number of characters included).

Based on the above embodiment of the present invention, in practical applications, the mapping relationship may be obtained by training a learning model, and specifically includes:

training a learning model for the plurality of sample information by using character features as training features to form the mapping relation between the character features and the description time parameters based on the learning model;

the learning model may be a conditional random field model, which is used for lexical analysis tasks such as Chinese word segmentation and part-of-speech tagging. A conditional random field is a conditional probability distribution model of one set of output random variables given another set of input random variables, that is, a Markov random field of the random variable Y given the random variable X. Of course, the present invention is not limited to this machine learning model.

Step 202: and clustering all the character information to be evaluated including the first character information according to a preset clustering mode so as to identify the category corresponding to the first character information.

In practical implementation, all the text information to be evaluated may be a plurality of articles to be evaluated, and the preset clustering mode may be agglomerative hierarchical clustering. At the beginning of clustering, each piece of text information (each article) is treated as its own cluster, and each step merges the two closest clusters: the currently closest point pair is taken in turn, and if the two points are not already in the same cluster, their two clusters are merged. Merging ends when the clustering threshold is met. In the embodiment of the present invention, two pieces of text information (two articles) are considered most similar when the sum of the reciprocal 1/d of the distance d between them and the similarity s of their keyword sets is greater than a preset threshold.

The distance between two articles can be represented by various distance measures. For example, with the edit distance, the two articles are regarded as two character strings of possibly different lengths, and the distance is the minimum number of edit operations required to convert one string into the other; the greater the edit distance, the less similar the two articles. Alternatively, with the Jaccard distance, the two articles are regarded as two character sets; the greater the Jaccard distance, the less similar the two articles.

In practical application, the learning model may fail to identify the description time parameter of some text information. In this case, third text information is determined among all the text information, where the similarity between the third text information and the first text information meets a preset requirement and the description time parameter of the third text information is null; the description time parameter of the third text information is then set equal to that of the first text information. That is, text information whose description time parameter is null takes the description time parameter of the text information whose similarity to it meets the preset requirement.
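One way to sketch this fallback is shown below. It assumes that "meets a preset requirement" means the similarity is at least a threshold, and that when several articles qualify, the most similar one is used; neither detail is specified in the text. The function names, the similarity table and the dates are hypothetical.

```python
from datetime import date


def fill_missing_description_times(description_times, similarity, min_similarity):
    """Copy a known description time to similar articles whose own value is None."""
    filled = dict(description_times)
    for target, value in description_times.items():
        if value is not None:
            continue
        # Score every article that already has a description time.
        candidates = [
            (similarity(target, other), other)
            for other, known in description_times.items()
            if known is not None
        ]
        eligible = [c for c in candidates if c[0] >= min_similarity]
        if eligible:
            _, best_source = max(eligible)  # most similar qualifying article
            filled[target] = description_times[best_source]
    return filled


# Hypothetical pairwise similarities between article ids.
PAIR_SIMILARITY = {frozenset({"a", "b"}): 0.9, frozenset({"a", "c"}): 0.2}


def similarity(x, y):
    return PAIR_SIMILARITY.get(frozenset({x, y}), 0.0)


times = {"a": date(2017, 6, 22), "b": None, "c": None}
filled = fill_missing_description_times(times, similarity, min_similarity=0.8)
```

In this example article "b" inherits the description time of article "a", while article "c" stays empty because its similarity is below the threshold.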

Step 203: determining an effective time parameter of the first text information based on the category corresponding to the first text information, and obtaining a first time-freshness parameter of the first text information based on the description time parameter, the effective time parameter and the current time information.

Here, the validity time parameter characterizes a validity time duration corresponding to the category.

In actual implementation, a mapping relation between the category of the text information and the valid time parameter is preset, and after the category corresponding to the first text information is determined, the corresponding valid time parameter can be determined based on the preset mapping relation, for example, the valid duration corresponding to the cluster to which the first text information belongs is determined to be 5 days.

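The remaining validity time of the first text information is obtained by the following formula and taken as the first time-freshness parameter:

remaining validity time = current time - (description time + validity time);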

Step 204: and determining the heat degree grade of the first text message according to the number of the text messages with the same type as the first text message in at least two preset time periods and the change of the number in the at least two time periods.

Here, in actual implementation, before this step, the method further includes:

determining the quantity of the text messages with the same category as the first text message and the change of the quantity in at least two preset time periods;

and obtaining the heat degree grade of the first character information for representing the character heat degree according to the quantity, the change of the quantity and a preset heat degree judgment condition.

The preset heat judgment condition can be set according to actual needs. For example, when the number of pieces of text information in the same category is greater than a preset value (such as 100): if the number decreases continuously, the heat is continuously falling and the heat level is level one; if the number first rises and then falls, the heat was once high but has begun to fall, and the heat level is level two; if the number rises continuously, the heat is getting higher and higher, and the heat level is level three. When the number of pieces of text information is smaller than the preset value, the heat level is considered to be level one no matter how the number changes.

One example is: the category corresponding to the first text information contains 120 pieces of text information, of which 80 fall between June 1 and June 3, 30 fall between June 4 and June 7, and 10 fall between June 8 and June 11. With a preset value of 100, as in the example condition above, the number exceeds the preset value and decreases continuously, so the heat level is level one.
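The example heat judgment condition can be sketched as a small function. The names `heat_level`, `total_count`, `counts_per_period` and `preset_value` are hypothetical. Two details are assumptions because the text leaves them open: a total exactly equal to the preset value is treated as level one, and trends not covered by the stated conditions (for example, flat or falling-then-rising counts) return `None`.

```python
def heat_level(total_count, counts_per_period, preset_value=100):
    """Return heat level 1, 2 or 3, or None if the trend is not covered."""
    if len(counts_per_period) < 2:
        raise ValueError("at least two time periods are required")
    if total_count <= preset_value:
        return 1  # too few items: level one regardless of the trend
    diffs = [b - a for a, b in zip(counts_per_period, counts_per_period[1:])]
    if all(d < 0 for d in diffs):
        return 1  # continuously decreasing
    if all(d > 0 for d in diffs):
        return 3  # continuously rising
    for k in range(1, len(diffs)):
        if all(d > 0 for d in diffs[:k]) and all(d < 0 for d in diffs[k:]):
            return 2  # rises first, then falls
    return None


# The example from the text: 120 items, 80 / 30 / 10 in three periods.
level = heat_level(120, [80, 30, 10])
```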

Step 205: and obtaining a second time-freshness parameter of the first text information based on the first time-freshness parameter of the first text information and the heat level of the first text information.

Here, in actual implementation, this step specifically includes:

obtaining a first evaluation score corresponding to the first text information according to the first time-freshness parameter of the first text information and a preset first scoring strategy;

obtaining a second evaluation score corresponding to the first text information according to the heat level of the first text information and a preset second scoring strategy;

and weighting the first evaluation score and the second evaluation score to obtain the second time-freshness parameter of the first text information.

One example of the first scoring strategy is: if the remaining effective time of the text information is more than 2 days, the first evaluation score is 5; if the remaining effective time is more than 0 and less than 2 days, the first evaluation score is 4; if the text information is overdue by less than two days (that is, -2 < remaining effective time < 0), the first evaluation score is 3; if the text information is overdue by more than two days (remaining effective time < -2), the first evaluation score is 2.

One example of the second scoring strategy is: the heat level is divided into three levels, where the second evaluation score is 2 for level one, 3 for level two, and 5 for level three.

After the first evaluation score and the second evaluation score of the text information are obtained, weighting is performed to obtain the second time-freshness parameter of the text information, namely:

second time-freshness parameter = x × first evaluation score + y × second evaluation score; where x and y are both positive numbers and x + y = 1.
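The two example scoring strategies and the weighted combination can be sketched in Python as follows. The function names, the default weights x = y = 0.5, and the treatment of the boundary values (exactly 2, 0, or -2 days, which the example leaves unspecified and which are assigned to the lower band here) are assumptions for illustration.

```python
def first_score(remaining_days: float) -> int:
    """Example first scoring strategy, based on remaining effective time in days."""
    if remaining_days > 2:
        return 5
    if remaining_days > 0:
        return 4
    if remaining_days > -2:
        return 3
    return 2


# Example second scoring strategy: heat level -> second evaluation score.
SECOND_SCORE = {1: 2, 2: 3, 3: 5}


def second_freshness(remaining_days: float, heat_level: int,
                     x: float = 0.5, y: float = 0.5) -> float:
    """Weighted combination x * first score + y * second score, with x + y = 1."""
    if x <= 0 or y <= 0 or abs(x + y - 1.0) > 1e-9:
        raise ValueError("x and y must both be positive and sum to 1")
    return x * first_score(remaining_days) + y * SECOND_SCORE[heat_level]
```

For instance, text information with one day of effective time remaining and heat level one scores 0.5 × 4 + 0.5 × 2 = 3 under these assumed weights.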

Step 206: Obtain the second time-freshness parameter of each piece of text information among all the text information, sort all the text information based on the obtained second time-freshness parameters, and determine candidate recommended text information based on the sorting result.

That is to say, in practical applications there are multiple pieces of text information (articles) to be evaluated. The second time-freshness parameter of every piece of text information to be evaluated is obtained in the manner of steps 201 to 205, the text information is sorted by that parameter, and part of the sorted text information is then selected according to a preset rule (for example, the 10 top-ranked pieces) as candidate recommended text information and recommended to the user.
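The sorting and selection can be sketched as below, assuming the second time-freshness parameters are already available in a dictionary keyed by a hypothetical article identifier. Breaking ties by identifier is an assumption; the text does not say how ties are resolved.

```python
from typing import Dict, List


def select_candidates(scores: Dict[str, float], top_k: int = 10) -> List[str]:
    """Return the identifiers of the top_k items, highest score first."""
    # Sort by descending score; ties are broken by identifier (assumption).
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [article_id for article_id, _ in ranked[:top_k]]
```

Calling `select_candidates({"a": 3.0, "b": 5.0, "c": 4.0}, top_k=2)` keeps the two highest-scoring items, `"b"` and `"c"`.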

Example three

An embodiment of the present invention provides an information processing method. Fig. 4 is a schematic flow diagram of the information processing method in this embodiment, taking articles as an example of the text information. As shown in fig. 4, the information processing method in this embodiment includes:

Step 301: Extract the key word features of the N articles to be evaluated.

Here, the key word features include at least one of: word features, part-of-speech features, and word length features.

N is a positive integer greater than 2; the word features represent the content of a word, the part-of-speech features indicate whether a word is a noun, an adjective, and so on, and the word length features may refer to the number of characters a word contains.
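As a minimal sketch of the three feature types, the Python function below assumes the article has already been segmented and part-of-speech tagged by some external tool, and simply packages each word with its tag and length. The function name, the dictionary keys, and the tag strings are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Feature = Dict[str, Union[str, int]]


def key_word_features(tagged_words: List[Tuple[str, str]]) -> List[Feature]:
    """Build word, part-of-speech and word-length features for each word.

    tagged_words: (word, part_of_speech_tag) pairs from an upstream tagger.
    """
    return [
        {"word": word, "pos": pos, "length": len(word)}
        for word, pos in tagged_words
    ]
```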

Step 302: Import the key word features of each of the N articles into the trained machine learning model to obtain the description time parameters of the N articles.

Here, the description time parameter represents the time information described by the content of the article.

In practical implementation, before this step, the method further includes:

performing word segmentation and word feature marking on M collected articles to obtain M pieces of sample information; the M articles are different from the N articles; M is a positive integer.

Step 303: Cluster the N articles to be evaluated according to a preset clustering mode, merging articles of the same category to obtain at least one cluster, and determine the effective time of each article based on the cluster to which it belongs.

Here, in actual implementation, the preset clustering mode may be agglomerative hierarchical clustering. At the start of clustering each article forms its own cluster, and at each step the two closest clusters are merged. During merging, the closest remaining point pair is taken in turn; if the two points of the pair are not currently in the same cluster, the two clusters containing them are merged. Merging ends when the clustering threshold is met. In this example, two articles are regarded as most similar when the sum of the reciprocal 1/d of the distance d between the two articles and the similarity s of the keyword sets of the two articles is greater than a preset threshold.
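The merging procedure can be sketched in Python with a union-find structure. The sketch makes several assumptions the text does not state: articles are represented as numeric vectors, d is the Euclidean distance, s is the Jaccard similarity of the keyword sets, pairs are visited in descending order of 1/d + s, and merging stops at the first pair whose value does not exceed the threshold.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Keyword-set similarity s (Jaccard index; an assumed choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_score(va: Sequence[float], vb: Sequence[float],
               ka: Set[str], kb: Set[str]) -> float:
    """Return 1/d + s for two articles; identical vectors score infinity."""
    d = math.dist(va, vb)
    return (math.inf if d == 0 else 1.0 / d) + jaccard(ka, kb)


def cluster_articles(vectors: Sequence[Sequence[float]],
                     keywords: Sequence[Set[str]],
                     threshold: float) -> List[List[int]]:
    """Merge the most similar pairs until 1/d + s no longer exceeds threshold."""
    n = len(vectors)
    parent = list(range(n))  # each article starts as its own cluster

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = sorted(
        ((pair_score(vectors[i], vectors[j], keywords[i], keywords[j]), i, j)
         for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    for score, i, j in pairs:
        if score <= threshold:
            break  # remaining pairs are all less similar
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the pair is not yet in one cluster
            parent[root_j] = root_i

    clusters: Dict[int, List[int]] = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Because a pair is merged whenever its own score exceeds the threshold, this behaves like single-linkage clustering; other linkage rules would need a different merge test.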

Step 304: Obtain, for each of the N articles, a first time-freshness parameter for evaluating the time freshness of the article based on the description time parameter, the effective time, and the current time information.

In actual implementation, a mapping relation between article category and effective time parameter is preset; once the category of an article is determined, the corresponding effective time can be determined from the preset mapping relation.

The remaining effective time of the article is obtained based on the following formula:

remaining effective time = current time - (description time + effective time);

and the remaining effective time is taken as the first time-freshness parameter of the article.
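The category lookup and the remaining effective time calculation can be sketched as follows. The category names and durations in the mapping are invented for illustration. The sign convention is also an assumption: the sketch returns the expiry time minus the current time, so that negative values mean the article is overdue, which is the reading used by the scoring example; negating the result gives the value of the formula as written.

```python
from datetime import datetime, timedelta

# Hypothetical preset mapping from article category to effective time.
EFFECTIVE_TIME_BY_CATEGORY = {
    "sports": timedelta(days=3),
    "finance": timedelta(days=1),
}


def remaining_effective_days(description_time: datetime,
                             category: str,
                             current_time: datetime) -> float:
    """Days left before the article expires; negative once it is overdue."""
    expiry = description_time + EFFECTIVE_TIME_BY_CATEGORY[category]
    return (expiry - current_time) / timedelta(days=1)
```

For an article in the assumed "sports" category describing an event on June 1, the function returns 2.0 on June 2 and -1.0 on June 5.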

Step 305: Determine the number of articles in each category obtained by clustering, and obtain, based on a preset judgment rule, the heat level representing the heat of each article according to that number and its change over at least two preset time periods.

Here, in actual implementation, the preset heat judgment rule may be set according to actual needs. For example, when the number of articles in the same category is greater than a preset value (such as 100): if the number decreases continuously, the heat is falling continuously and the heat level is level one; if the number first rises and then falls, the heat was once high but has begun to fall, and the heat level is level two; if the number rises continuously, the heat is growing and the heat level is level three. When the number is smaller than the preset value, the heat level is taken to be level one regardless of how the number changes.

Step 306: Obtain a second time-freshness parameter of each article based on the first time-freshness parameter and the heat level of that article, rank the N articles to be evaluated based on the second time-freshness parameter, and take at least one of the N articles as a candidate recommended article according to the ranking result.

Here, in actual implementation, this step specifically includes:

obtaining a first evaluation score corresponding to each article according to the first time-freshness parameters of the N articles and a preset first scoring strategy;

obtaining a second evaluation score corresponding to each article according to the heat levels of the N articles and a preset second scoring strategy;

and weighting the first evaluation score and the second evaluation score to obtain a second time-freshness parameter for evaluating the time freshness of the article.

In practical applications, the first scoring strategy and the second scoring strategy can both be set according to actual needs.

One example of the first scoring strategy is: if the remaining effective time of the article is more than 2 days, the first evaluation score is 5; if the remaining effective time is more than 0 and less than 2 days, the first evaluation score is 4; if the article is overdue by less than two days (that is, -2 < remaining effective time < 0), the first evaluation score is 3; if the article is overdue by more than two days (remaining effective time < -2), the first evaluation score is 2.

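After the first evaluation score and the second evaluation score of an article are obtained, weighting is performed to obtain the second time-freshness parameter of the article, namely:

second time-freshness parameter = x × first evaluation score + y × second evaluation score; where x and y are both positive numbers and x + y = 1.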

After the N articles to be evaluated are ranked based on the second time-freshness parameter, at least one of the N articles is selected as a candidate recommended article according to a recommended-article selection strategy; for example, the first 5 articles in the ranking result are selected as recommended articles.

Step 307: Send the relevant information of the candidate recommended articles to the terminal of the user receiving the recommendation.

Here, in actual implementation, after the candidate recommended articles are determined, relevant information about them is acquired, such as article titles and article links. As shown in fig. 5, the terminal of the user receiving the recommendation displays the relevant information of the candidate recommended articles; the display may take various forms, such as an advertisement (an interstitial advertisement, a native advertisement, etc.).

Example four

An embodiment of the present invention provides an information processing apparatus. Fig. 6 is a schematic diagram of the composition structure of the information processing apparatus in this embodiment. As shown in fig. 6, the information processing apparatus includes:

the acquisition module 11 is configured to extract a text feature of first text information to be evaluated, and obtain a description time parameter based on the text feature and a learning model obtained through training, where the description time parameter represents time information described in content of the first text information;

the clustering module 12 is configured to cluster all text information to be evaluated, including the first text information, according to a preset clustering manner, so as to identify a category corresponding to the first text information;

a determining module 13, configured to determine an effective time parameter of the first text information based on a category corresponding to the first text information; the effective time parameter represents the effective time corresponding to the category;

the processing module 14 is configured to obtain a first time update parameter of the first text information based on the description time parameter, the valid time parameter, and the current time information.

In one embodiment, the apparatus further comprises:

the model training module 15 is configured to mark character features of the collected multiple pieces of second character information respectively to obtain multiple pieces of sample information;

In an embodiment, the processing module 14 is further configured to obtain a first time-update parameter of each text message in all the text messages except the first text message;

In an embodiment, the processing module 14 is further configured to determine a third text message, of the all text messages, whose similarity to the first text message meets a preset requirement; the description time parameter corresponding to the third text message is null;

In an embodiment, the processing module 14 is further configured to determine the number of text messages of the same category as the first text message, and a change of the number in at least two preset time periods;

In an embodiment, the processing module 14 is further configured to obtain a second time-update parameter of each text message except the first text message in all the text messages;

An embodiment of the present invention further provides an information processing apparatus, including: a memory for storing an executable program;

a processor for implementing, by executing the executable program:

extracting character features of first character information to be evaluated, and obtaining description time parameters based on the character features and a learning model obtained by training, wherein the description time parameters represent time information described by the content of the first character information;

clustering all character information to be evaluated including the first character information according to a preset clustering mode so as to identify a category corresponding to the first character information;

The processor is used for realizing the following steps when the executable program is executed:

The processor is configured to implement, by executing the executable program:

respectively acquiring a second time-update parameter of each text message except the first text message in all the text messages;

An embodiment of the present invention also provides a readable storage medium storing an executable program which, when executed by a processor, implements:

and obtaining a first time-update parameter of the first text information based on the description time parameter, the effective time parameter and the current time information.

The executable program when executed by the processor further implements:

respectively marking character characteristics of the plurality of collected second character information to obtain a plurality of sample information;

The executable program when executed by the processor further implements:

respectively acquiring a first time-update parameter of each text message except the first text message in all the text messages;

The executable program when executed by the processor further implements:

Here, it should be noted that the above description of the information processing apparatus is similar to the description of the method; the apparatus has the same advantageous effects as the method, which are not repeated for brevity. For technical details not disclosed in the apparatus embodiments of the present invention, refer to the description of the method embodiments of the present invention.

In this embodiment, an example of a hardware entity of an information processing apparatus is shown in fig. 7. The information processing apparatus includes a processor 61, a storage medium 62, and at least one external communication interface 63; the processor 61, the storage medium 62 and the external communication interface 63 are all connected by a bus 64.

Those skilled in the art will understand that all or part of the steps of the method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes various media that can store program code, such as a removable memory device, a Random Access Memory (RAM), a Read-Only Memory (ROM), a magnetic disk, or an optical disk.

Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention or portions thereof contributing to the related art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a RAM, a ROM, a magnetic or optical disk, or various other media that can store program code.

The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. An information processing method, characterized in that the method comprises:

respectively obtaining the distance between every two character information in all character information to be evaluated, including the first character information, according to a preset clustering mode;

combining the text information based on the principle of closest distance to form at least one cluster;

judging the cluster to which the first character information belongs to identify the category corresponding to the first character information;

determining an effective time parameter of the first text information based on the category corresponding to the first text information; the effective time parameter represents the effective time corresponding to the category;

2. The method of claim 1, further comprising:

training a learning model for the plurality of sample information by using character features as training features to form the mapping relation between the character features and the description time parameters based on the learning model.

3. The method according to claim 1 or 2, characterized in that the method further comprises:

4. The method according to claim 1 or 2, characterized in that the method further comprises:

5. The method according to claim 1 or 2, characterized in that the method further comprises:

6. The method of claim 5, wherein obtaining a second temporal freshness parameter of the first textual information based on the first temporal freshness parameter of the first textual information and the heat level of the first textual information comprises:

7. The method of claim 5, further comprising:

8. An information processing apparatus characterized in that the apparatus comprises:

the acquisition module is used for extracting character features of first character information to be evaluated and acquiring description time parameters based on the character features and a preset mapping relation, wherein the description time parameters represent time information described by the content of the first character information;

the clustering module is used for respectively obtaining the distance between every two character information in all character information to be evaluated, including the first character information, according to a preset clustering mode; combining the character information based on the principle of nearest distance to form at least one cluster; judging the cluster to which the first character information belongs to identify the category corresponding to the first character information;

the determining module is used for determining an effective time parameter of the first text information based on the category corresponding to the first text information; the effective time parameter represents the effective duration corresponding to the category;

9. The apparatus of claim 8, further comprising:

and training a learning model for the plurality of sample information by using the character features as training features so as to form the mapping relation between the character features and the description time parameters based on the learning model.

10. The apparatus according to claim 8 or 9,

the processing module is further configured to obtain a first time-update parameter of each text message in all the text messages except the first text message;

11. The apparatus of claim 8 or 9,

the processing module is further configured to determine third text information, of the all text information, of which the similarity to the first text information meets a preset requirement; the description time parameter corresponding to the third text message is null;

12. The apparatus of claim 8 or 9,

the processing module is further configured to determine the number of the text messages of the same category as the first text message, and the change of the number in at least two preset time periods;

13. An information processing apparatus characterized in that the apparatus comprises:

a memory for storing an executable program;

a processor for implementing the information processing method of any one of claims 1 to 7 by executing the executable program.

14. A computer-readable storage medium characterized by storing an executable program that when executed by a processor implements the information processing method of any one of claims 1 to 7.