CN108491529B

CN108491529B - Information recommendation method and device

Info

Publication number: CN108491529B
Application number: CN201810266958.3A
Authority: CN
Inventors: 牛化康
Original assignee: Baidu Online Network Technology Beijing Co Ltd
Current assignee: Baidu Online Network Technology Beijing Co Ltd
Priority date: 2018-03-28
Filing date: 2018-03-28
Publication date: 2021-11-16
Anticipated expiration: 2038-03-28
Also published as: CN108491529A

Abstract

The invention provides an information recommendation method and device. The method comprises: acquiring historical behavior data of a user to be recommended, the historical behavior data comprising the information streams clicked by the user to be recommended within a preset historical time period; acquiring keywords in the information streams to generate a keyword text; inputting the keyword text into a preset document topic generation model (LDA) to obtain an LDA vector corresponding to the user to be recommended, the LDA vector comprising the probability that the keyword text belongs to each topic; determining similar users corresponding to the user to be recommended according to the LDA vector corresponding to the user to be recommended and the LDA vectors corresponding to the candidate users; and recommending information streams to the user to be recommended according to the historical behavior data of the similar users. The LDA vector of the user to be recommended can thus be obtained promptly from the historical behavior data and the LDA model without relying on a user model, and because similar users are computed from the users' LDA vectors, the amount of calculation is small, the calculation is fast, and real-time requirements can be met.

Description

Information recommendation method and device

Technical Field

The invention relates to the technical field of internet, in particular to an information recommendation method and device.

Background

At present, a collaborative recommendation algorithm based on users mainly utilizes behaviors of similar users of a given user to recommend the given user, for example, information streams are recommended for the given user according to information streams clicked and browsed by the similar users.

At present, there are two methods for determining the similar users of a given user. The first obtains a user-item matrix, analyzes the matrix to map the given user to a low-dimensional vector, and determines the similar users of the given user according to the similarity between vectors. The second computes the similarity between the given user and other users based on known user models, where a user model comprises attributes such as interest words and interest categories. However, in the second method the user model is generated from the user's historical click data, so it is generated slowly and lags behind the user's behavior; moreover, when similarity is calculated based on the user model, the user model must be traversed to obtain the user's attributes, so the calculation is slow and real-time requirements are difficult to meet.

Disclosure of Invention

The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.

Therefore, a first object of the present invention is to provide an information recommendation method, which is used for solving the problems in the prior art that the computation speed of similar users is slow, the computation amount is large, and the real-time requirements are difficult to meet.

A second object of the present invention is to provide an information recommendation apparatus.

A third object of the present invention is to provide another information recommendation apparatus.

A fourth object of the invention is to propose a non-transitory computer-readable storage medium.

A fifth object of the invention is to propose a computer program product.

In order to achieve the above object, an embodiment of a first aspect of the present invention provides an information recommendation method, including:

acquiring historical behavior data of a user to be recommended; the historical behavior data comprises: the information flow clicked by the user to be recommended in a preset historical time period;

acquiring keywords in the information stream to generate a keyword text;

inputting the keyword text into a preset document theme generation model LDA to obtain an LDA vector corresponding to the user to be recommended; the LDA vector comprises: the probability that the keyword text belongs to each topic;

determining similar users corresponding to the user to be recommended according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user;

and recommending information flow to the user to be recommended according to the historical behavior data of the similar users.

Further, the determining similar users corresponding to the user to be recommended according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user includes:

determining at least one theme group to which the user to be recommended belongs according to the LDA vector corresponding to the user to be recommended;

aiming at each topic group, calculating the similarity between the user to be recommended and each candidate user in the topic group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the topic group;

and determining the candidate users with the corresponding similarity meeting a preset similarity threshold as the similar users corresponding to the users to be recommended.

Further, each topic group is provided with a corresponding similarity threshold.

Further, before calculating, for each topic group, a similarity between the user to be recommended and each candidate user in the topic group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the topic group, the method further includes:

adding the LDA vector corresponding to the user to be recommended into at least one subject group;

aiming at each topic group to which the user to be recommended belongs, dividing the topic group to obtain at least two sub-groups;

correspondingly, the calculating, for each topic group, the similarity between the user to be recommended and each candidate user in the topic group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the topic group includes:

aiming at each topic group, acquiring a first sub-group comprising LDA vectors corresponding to the users to be recommended;

and calculating the similarity between the user to be recommended and each candidate user in the first sub-group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the first sub-group.

Further, the determining at least one topic group to which the user to be recommended belongs according to the LDA vector corresponding to the user to be recommended includes:

obtaining the topics whose corresponding probability in the LDA vector corresponding to the user to be recommended is greater than a preset probability threshold;

and determining the topic groups matching those topics as the topic groups to which the user to be recommended belongs.

Further, the keywords include any one or more of the following: a label corresponding to the information stream, a search word corresponding to the information stream, and a keyword in the content of the information stream.

Further, the method further comprises the following steps:

obtaining a training sample; the training samples include: a plurality of keyword texts and corresponding LDA vectors;

and training an initial LDA model according to the training samples to obtain the preset LDA model.

Further, the recommending information flow to the user to be recommended according to the historical behavior data of the similar users includes:

comparing the historical behavior data of the similar users with the historical behavior data of the users to be recommended, and determining information streams to be recommended which are not clicked by the users to be recommended in the historical behavior data of the similar users;

and recommending the information flow to be recommended to the user to be recommended.

According to the information recommendation method, historical behavior data of a user to be recommended are obtained; the historical behavior data includes: the information flow clicked by the user to be recommended in a preset historical time period; acquiring keywords in an information stream to generate a keyword text; inputting the keyword text into a preset document theme generation model LDA to obtain an LDA vector corresponding to a user to be recommended; the LDA vector includes: the probability that the keyword text belongs to each topic; determining similar users corresponding to the users to be recommended according to the LDA vectors corresponding to the users to be recommended and the LDA vectors corresponding to the candidate users; according to the historical behavior data of the similar users, information flow is recommended to the users to be recommended, so that the historical behavior data and the LDA model can be combined in time to obtain the LDA vectors corresponding to the users to be recommended, the use of the user model is avoided, the calculation of the similar users is carried out according to the LDA vectors corresponding to the users, the calculation amount is small, the calculation speed is high, and the real-time requirement can be met.

To achieve the above object, an embodiment of a second aspect of the present invention provides an information recommendation apparatus, including:

the acquisition module is used for acquiring historical behavior data of a user to be recommended; the historical behavior data comprises: the information flow clicked by the user to be recommended in a preset historical time period;

the generating module is used for acquiring keywords in the information stream and generating a keyword text;

the input module is used for inputting the keyword text into a preset document theme generation model LDA to obtain an LDA vector corresponding to the user to be recommended; the LDA vector comprises: the probability that the keyword text belongs to each topic;

the determining module is used for determining similar users corresponding to the user to be recommended according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user;

and the recommending module is used for recommending information flow to the user to be recommended according to the historical behavior data of the similar users.

Further, the determining module comprises:

the determining unit is used for determining at least one theme group to which the user to be recommended belongs according to the LDA vector corresponding to the user to be recommended;

the calculation unit is used for calculating the similarity between the user to be recommended and each candidate user in the theme group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the theme group aiming at each theme group;

the determining unit is further configured to determine the candidate user whose corresponding similarity meets a preset similarity threshold as a similar user corresponding to the user to be recommended.

Further, the determining module further includes: an adding unit and a dividing unit;

the adding unit is used for adding the LDA vector corresponding to the user to be recommended into at least one subject group;

the dividing unit is used for dividing, for each topic group to which the user to be recommended belongs, the topic group into at least two sub-groups;

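correspondingly, the computing unit is specifically configured to, for each topic group, acquire a first sub-group comprising the LDA vector corresponding to the user to be recommended, and calculate the similarity between the user to be recommended and each candidate user in the first sub-group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the first sub-group.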

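Further, the determining unit is specifically configured to obtain the topics whose corresponding probability in the LDA vector corresponding to the user to be recommended is greater than a preset probability threshold, and determine the topic groups matching those topics as the topic groups to which the user to be recommended belongs.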

Further, the device further comprises: a training module;

the acquisition module is also used for acquiring training samples; the training samples include: a plurality of keyword texts and corresponding LDA vectors;

and the training module is used for training an initial LDA model according to the training samples to obtain the preset LDA model.

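Further, the recommendation module is specifically configured to compare the historical behavior data of the similar users with the historical behavior data of the user to be recommended, determine the information streams to be recommended that appear in the historical behavior data of the similar users but have not been clicked by the user to be recommended, and recommend those information streams to the user to be recommended.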

The information recommendation device of the embodiment of the invention acquires the historical behavior data of the user to be recommended; the historical behavior data includes: the information flow clicked by the user to be recommended in a preset historical time period; acquiring keywords in an information stream to generate a keyword text; inputting the keyword text into a preset document theme generation model LDA to obtain an LDA vector corresponding to a user to be recommended; the LDA vector includes: the probability that the keyword text belongs to each topic; determining similar users corresponding to the users to be recommended according to the LDA vectors corresponding to the users to be recommended and the LDA vectors corresponding to the candidate users; according to the historical behavior data of the similar users, information flow is recommended to the users to be recommended, so that the historical behavior data and the LDA model can be combined in time to obtain the LDA vectors corresponding to the users to be recommended, the use of the user model is avoided, the calculation of the similar users is carried out according to the LDA vectors corresponding to the users, the calculation amount is small, the calculation speed is high, and the real-time requirement can be met.

To achieve the above object, an embodiment of a third aspect of the present invention provides another information recommendation apparatus, including: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the information recommendation method as described above when executing the program.

In order to achieve the above object, a fourth aspect of the present invention provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the information recommendation method as described above.

In order to achieve the above object, an embodiment of a fifth aspect of the present invention provides a computer program product, wherein when instructions in the computer program product are executed by a processor, an information recommendation method is performed, and the method includes:

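acquiring historical behavior data of a user to be recommended; the historical behavior data comprises: the information flow clicked by the user to be recommended in a preset historical time period;

acquiring keywords in the information stream to generate a keyword text;

inputting the keyword text into a preset document theme generation model LDA to obtain an LDA vector corresponding to the user to be recommended; the LDA vector comprises: the probability that the keyword text belongs to each topic;

determining similar users corresponding to the user to be recommended according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user;

and recommending information flow to the user to be recommended according to the historical behavior data of the similar users.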

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

fig. 1 is a schematic flowchart of an information recommendation method according to an embodiment of the present invention;

fig. 2 is a schematic flowchart of another information recommendation method according to an embodiment of the present invention;

fig. 3 is a schematic structural diagram of an information recommendation apparatus according to an embodiment of the present invention;

fig. 4 is a schematic structural diagram of another information recommendation apparatus according to an embodiment of the present invention;

fig. 5 is a schematic structural diagram of another information recommendation apparatus according to an embodiment of the present invention;

fig. 6 is a schematic structural diagram of another information recommendation apparatus according to an embodiment of the present invention;

fig. 7 is a schematic structural diagram of another information recommendation apparatus according to an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.

An information recommendation method and apparatus according to an embodiment of the present invention are described below with reference to the drawings.

Fig. 1 is a schematic flowchart of an information recommendation method according to an embodiment of the present invention. As shown in fig. 1, the information recommendation method includes the following steps:

S101, acquiring historical behavior data of a user to be recommended; the historical behavior data includes: the information streams clicked by the user to be recommended within a preset historical time period.

The information recommendation method provided by the invention is executed by an information recommendation device, which may be a hardware device, such as a terminal device, a background server, or a server cluster, or software installed on such a hardware device. In this embodiment, the user to be recommended is a user who clicks on and views various information streams, for example Baidu feed streams or news. The historical behavior data records the click-and-view behavior of the user to be recommended on various information. For example, the historical behavior data may include: the information streams clicked by the user to be recommended within a preset historical time period, the click time of each information stream, the viewing duration, the specific content of the information stream, and the like. In this embodiment, the historical behavior data of the user to be recommended may be collected by the terminal device that displays the information stream and reported to the information recommendation device. The historical behavior data may also include the user's identifier, so that the information recommendation device can distinguish different users by their identifiers.

In this embodiment, when the user to be recommended views information streams on multiple terminal devices, the historical behavior data may be the combination of the historical behavior data about that user reported by those terminal devices. The user identifier may be, for example, the user's account on the Baidu feed stream, or any other identifier that uniquely identifies the user.

The preset historical time period may be, for example, the hour, day, or month preceding the current time. It should be noted that, in this embodiment, the information recommendation device may periodically acquire the historical behavior data of the user to be recommended, determine similar users according to that data, and then recommend information streams to the user to be recommended. Alternatively, the information recommendation device may recommend information streams to the user when the historical behavior data of the user to be recommended meets a preset condition. The preset condition may be, for example, that the number of information streams viewed by the user to be recommended within a certain time period reaches a certain threshold, or that the time the user to be recommended spends viewing information streams reaches a certain threshold. The preset condition can be set according to actual needs.

S102, obtaining keywords in the information flow and generating a keyword text.

In this embodiment, the keywords in the information stream may include any one or more of the following: a label corresponding to the information stream, a search word corresponding to the information stream, and a keyword in the content of the information stream. The label corresponding to the information stream may be, for example, its classification type or domain, such as sports, economy, biology, mathematics, education, or entertainment. The keywords in the content of the information stream are the words obtained by performing operations such as word segmentation and filtering on that content.

In this embodiment, the keyword text may include: each keyword in the information streams clicked by the user to be recommended, or each such keyword together with its word frequency.
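As a rough illustration of step S102, the following Python sketch merges the three keyword sources into one keyword text with word frequencies. The dictionary layout of a clicked stream (`labels`, `search_words`, `content_keywords`) and the function name `build_keyword_text` are assumptions made for this example; the specification does not prescribe a data format, and word segmentation and filtering are assumed to have been done already.

```python
from collections import Counter

def build_keyword_text(clicked_streams, with_frequency=True):
    """Merge the keywords of all clicked streams into one keyword text."""
    counts = Counter()
    for stream in clicked_streams:
        # Each stream may contribute labels, search words and content keywords.
        counts.update(stream.get("labels", []))
        counts.update(stream.get("search_words", []))
        counts.update(stream.get("content_keywords", []))
    if with_frequency:
        # Keyword text as (keyword, word frequency) pairs.
        return counts.most_common()
    # Keyword text as the plain list of distinct keywords.
    return list(counts)

# Hypothetical click history of one user.
clicked = [
    {"labels": ["sports"], "search_words": ["football"],
     "content_keywords": ["league", "football"]},
    {"labels": ["sports"], "content_keywords": ["marathon"]},
]
keyword_text = build_keyword_text(clicked)
print(keyword_text)
```

Passing `with_frequency=False` yields the variant of the keyword text that lists keywords only.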

S103, inputting the keyword text into a preset document theme generation model LDA to obtain an LDA vector corresponding to a user to be recommended; the LDA vector includes: the probability that the keyword text belongs to each topic.

In this embodiment, the document topic generation model (Latent Dirichlet Allocation, LDA) is a three-layer Bayesian probability model comprising three layers: words, topics, and documents. As a generative model, it treats each word of a document as the result of selecting a topic with a certain probability and then selecting a word from that topic with a certain probability. The LDA model can therefore be used to identify latent topic information in large-scale document collections or corpora.
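The two-stage selection described above can be sketched in Python as follows. This is only an illustration of the generative view, not the inference procedure used to obtain LDA vectors; the topic names, word lists, and probabilities are invented for the example, and the Dirichlet priors of full LDA are omitted (the topic probabilities are simply given).

```python
import random

def generate_document(topic_probs, topic_word_probs, length, seed=0):
    """Generate words by picking a topic, then a word from that topic."""
    rng = random.Random(seed)
    topics = list(topic_probs)
    topic_weights = [topic_probs[t] for t in topics]
    words = []
    for _ in range(length):
        # Select a topic with the document's topic probabilities.
        topic = rng.choices(topics, weights=topic_weights)[0]
        # Select a word with that topic's word probabilities.
        vocab = list(topic_word_probs[topic])
        word_weights = [topic_word_probs[topic][w] for w in vocab]
        words.append(rng.choices(vocab, weights=word_weights)[0])
    return words

# Hypothetical document-topic and topic-word probabilities.
topic_probs = {"sports": 0.7, "economy": 0.3}
topic_word_probs = {
    "sports": {"football": 0.6, "marathon": 0.4},
    "economy": {"stock": 0.5, "inflation": 0.5},
}
doc = generate_document(topic_probs, topic_word_probs, length=8)
print(doc)
```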

In this embodiment, each keyword in the keyword text may be represented by a vector, for example a one-hot vector. The keyword text, composed of the vectors corresponding to its keywords, is input into the LDA model to obtain the LDA vector output by the model.

In this embodiment, the number of dimensions of the LDA vector equals the total number of topics; each dimension of the LDA vector corresponds to one topic, and the value of each dimension represents the probability that the keyword text belongs to the corresponding topic. Taking five topics (topic 1, topic 2, topic 3, topic 4, and topic 5) as an example, the LDA vector has 5 dimensions: the value of the first dimension represents the probability that the keyword text belongs to topic 1; the value of the second dimension represents the probability that it belongs to topic 2; the value of the third dimension represents the probability that it belongs to topic 3; the value of the fourth dimension represents the probability that it belongs to topic 4; and the value of the fifth dimension represents the probability that it belongs to topic 5.
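For the five-topic example, an LDA vector can be read as a mapping from topics to probabilities. The short Python sketch below uses invented probability values and a hypothetical helper `interpret_lda_vector`; it assumes that the entries form a probability distribution over topics (they sum to 1), which holds for the document-topic distribution of a standard LDA model.

```python
import math

def interpret_lda_vector(lda_vector, topics):
    """Pair each dimension with its topic and report the most probable topic."""
    if len(lda_vector) != len(topics):
        raise ValueError("the LDA vector needs exactly one dimension per topic")
    if not math.isclose(sum(lda_vector), 1.0, abs_tol=1e-9):
        raise ValueError("topic probabilities are assumed to sum to 1")
    by_topic = dict(zip(topics, lda_vector))
    dominant = max(by_topic, key=by_topic.get)
    return by_topic, dominant

topics = ["topic 1", "topic 2", "topic 3", "topic 4", "topic 5"]
lda_vector = [0.5, 0.25, 0.125, 0.0625, 0.0625]  # illustrative values only
by_topic, dominant = interpret_lda_vector(lda_vector, topics)
print(by_topic, dominant)
```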

Further, before step 103, the method may further include: obtaining a training sample; the training samples include: a plurality of keyword texts and corresponding LDA vectors; and training the initial LDA model according to the training samples to obtain a preset LDA model. The LDA vector corresponding to the keyword text in the training sample can be determined according to the interests and hobbies of the corresponding user on the information flow of each topic and the like.

And S104, determining similar users corresponding to the user to be recommended according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user.

In this embodiment, the candidate users are the users, other than the user to be recommended, among all users who click on and view information streams. Before step 104, the method may further include: periodically acquiring the historical behavior data of each candidate user, and obtaining and storing the LDA vector corresponding to each candidate user with reference to steps 102 and 103, for use in calculating similar users.

Specifically, the information recommendation device may calculate, according to the LDA vector corresponding to the user to be recommended and the LDA vectors corresponding to the respective candidate users, a distance between the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to the candidate users, calculate, according to the distance, a similarity between the user to be recommended and the candidate users, and determine the candidate user whose corresponding similarity satisfies a preset similarity threshold as the similar user.
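A minimal Python sketch of this distance-then-similarity calculation follows. The specification does not name a distance measure or a conversion formula, so the Euclidean distance and the mapping `1 / (1 + distance)` are assumptions chosen for illustration, as are the function names and the sample vectors.

```python
import math

def similarity(vec_a, vec_b):
    """Turn the Euclidean distance between two LDA vectors into a similarity in (0, 1]."""
    distance = math.dist(vec_a, vec_b)  # requires Python 3.8+
    return 1.0 / (1.0 + distance)

def find_similar_users(target_vector, candidate_vectors, threshold):
    """Return the candidates whose similarity to the target exceeds the threshold."""
    return [
        user_id
        for user_id, vector in candidate_vectors.items()
        if similarity(target_vector, vector) > threshold
    ]

# Hypothetical stored LDA vectors of candidate users.
candidates = {
    "user_a": [0.5, 0.25, 0.25],
    "user_b": [0.0, 0.0, 1.0],
}
similar = find_similar_users([0.5, 0.25, 0.25], candidates, threshold=0.8)
print(similar)
```

Because each comparison is a fixed-length vector operation, its cost depends only on the number of topics, not on the size of a user's click history.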

And S105, recommending information flow to the user to be recommended according to the historical behavior data of the similar users.

Specifically, the information recommendation device may execute step 105 as follows: comparing the historical behavior data of the similar users with the historical behavior data of the user to be recommended; determining the information streams to be recommended, namely those in the historical behavior data of the similar users that the user to be recommended has not clicked; and recommending those information streams to the user to be recommended.
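The comparison in step S105 amounts to a set difference. In the Python sketch below, information streams are represented by hypothetical string identifiers and the function name `streams_to_recommend` is an assumption; ranking of the resulting streams is not addressed by the specification and is not attempted here.

```python
def streams_to_recommend(target_clicked, similar_users_clicked):
    """Collect streams clicked by similar users but not by the target user."""
    seen = set(target_clicked)
    recommended = []
    for clicked in similar_users_clicked:
        for stream_id in clicked:
            if stream_id not in seen:
                # Mark as seen so each stream is recommended only once.
                seen.add(stream_id)
                recommended.append(stream_id)
    return recommended

result = streams_to_recommend(
    target_clicked=["s1", "s2"],
    similar_users_clicked=[["s2", "s3"], ["s3", "s4", "s1"]],
)
print(result)
```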

Fig. 2 is a schematic flow chart of another information recommendation method according to an embodiment of the present invention, and as shown in fig. 2, on the basis of the embodiment shown in fig. 1, step 104 may specifically include the following steps:

S1041, determining at least one topic group to which the user to be recommended belongs according to the LDA vector corresponding to the user to be recommended.

Specifically, the information recommendation device may execute step 1041 by obtaining the topics whose corresponding probability in the LDA vector of the user to be recommended is greater than a preset probability threshold, and determining the topic groups matching those topics as the topic groups to which the user to be recommended belongs.

In this embodiment, each topic corresponds to one topic group, and each topic group includes the LDA vectors whose value in the corresponding dimension is greater than the preset probability threshold. For example, if the first dimension of the LDA vector corresponds to the first topic and the value of the first dimension of a candidate user's LDA vector is greater than the preset probability threshold, that LDA vector is added to the topic group corresponding to the first topic.

In addition, it should be noted that, because different topics differ in popularity, different probability thresholds may be set for different topics in this embodiment. For example, if the first dimension of the LDA vector corresponds to the first topic and the value of the first dimension of a candidate user's LDA vector is greater than the probability threshold set for the first dimension, that LDA vector is added to the topic group corresponding to the first topic.
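The grouping rule with per-topic probability thresholds can be sketched in Python as follows. Topics are identified by their dimension index, and the function names, threshold values, and sample vectors are assumptions made for this example.

```python
def topics_for(lda_vector, prob_thresholds):
    """Return the indices of topics whose probability exceeds that topic's threshold."""
    return [
        index
        for index, (prob, threshold) in enumerate(zip(lda_vector, prob_thresholds))
        if prob > threshold
    ]

def build_topic_groups(candidate_vectors, prob_thresholds):
    """Place each candidate's LDA vector into every topic group it qualifies for."""
    groups = {index: {} for index in range(len(prob_thresholds))}
    for user_id, vector in candidate_vectors.items():
        for index in topics_for(vector, prob_thresholds):
            groups[index][user_id] = vector
    return groups

# Hypothetical per-topic thresholds: the third topic uses a lower threshold.
prob_thresholds = [0.3, 0.3, 0.2]
candidates = {
    "user_a": [0.5, 0.25, 0.25],
    "user_b": [0.1, 0.8, 0.1],
}
groups = build_topic_groups(candidates, prob_thresholds)
print(groups)
```

A user may belong to several topic groups at once, as `user_a` does here, or to none if no probability exceeds its threshold.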

S1042, aiming at each topic group, calculating the similarity between the user to be recommended and each candidate user in the topic group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the topic group.

And S1043, determining the candidate users with the corresponding similarity meeting the preset similarity threshold as the similar users corresponding to the users to be recommended.

In this embodiment, for each topic group, after the similarity between the user to be recommended and each candidate user in the topic group is obtained, the similarity between the user to be recommended and each candidate user in the topic group may be compared with a preset similarity threshold, and a candidate user whose corresponding similarity is greater than the preset similarity threshold is determined as a similar user corresponding to the user to be recommended in the topic group. And combining the similar users in each topic group to obtain the similar users corresponding to the users to be recommended.

In addition, since different topics have different degrees of popularity, in this embodiment, a corresponding similarity threshold may be set for each topic group. The similarity threshold corresponding to each topic group may be the same or different.

In the embodiment, the at least one subject group to which the user to be recommended belongs is determined first, the similarity between the user to be recommended and each candidate user in the at least one subject group to which the user to be recommended belongs is calculated, and the similarity between the candidate user in other subject groups and the user to be recommended is not calculated, so that the calculation amount in the process of determining the similar user is reduced, the determination speed of the similar user is increased, the information flow can be recommended to the user to be recommended in time, and the recommendation speed and recommendation efficiency of the information flow are increased.
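Steps S1041 to S1043 can be combined into one Python sketch that only visits the topic groups the user to be recommended belongs to and applies a per-group similarity threshold. The Euclidean-distance-based similarity, the function name, and all numeric values are assumptions for illustration; topic groups are keyed by dimension index.

```python
import math

def similar_users_by_group(target_vec, topic_groups, prob_thresholds, sim_thresholds):
    """Search only the target's topic groups and merge the similar users found."""
    similar = set()
    for topic, members in topic_groups.items():
        if target_vec[topic] <= prob_thresholds[topic]:
            continue  # the target is not in this group, so it is never scanned
        for user_id, vector in members.items():
            sim = 1.0 / (1.0 + math.dist(target_vec, vector))
            if sim > sim_thresholds[topic]:
                similar.add(user_id)
    return similar

# Hypothetical topic groups built with a probability threshold of 0.25 per topic.
topic_groups = {
    0: {"user_a": [0.6, 0.3, 0.1], "user_b": [0.7, 0.2, 0.1]},
    1: {"user_a": [0.6, 0.3, 0.1], "user_c": [0.1, 0.8, 0.1]},
    2: {"user_d": [0.1, 0.1, 0.8]},
}
found = similar_users_by_group(
    target_vec=[0.6, 0.3, 0.1],
    topic_groups=topic_groups,
    prob_thresholds=[0.25, 0.25, 0.25],
    sim_thresholds=[0.9, 0.5, 0.5],
)
print(sorted(found))
```

In this example the group of the third topic is skipped entirely, which is where the reduction in calculation comes from.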

Further, in order to reduce the amount of calculation in determining similar users even more and to speed up that determination, on the basis of the embodiment shown in fig. 2 and before step 1042, the method may further include: adding the LDA vector corresponding to the user to be recommended to the at least one topic group to which the user belongs; and, for each topic group to which the user to be recommended belongs, dividing the topic group into at least two sub-groups. A topic group may be divided randomly, or according to the value of the dimension corresponding to the topic in each LDA vector.

Correspondingly, step 1042 may specifically be that, for each topic group, a first sub-group including an LDA vector corresponding to the user to be recommended is obtained; and calculating the similarity between the user to be recommended and each candidate user in the first sub-group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the first sub-group.

In this embodiment, under the condition that the number of LDA vectors in a subject group is too large, at least one subject group to which a user to be recommended belongs may be determined first, the subject groups are divided, a subgroup including LDA vectors corresponding to the user to be recommended is determined, a similarity between the user to be recommended and each candidate user in the subgroup is calculated, and a similarity between the candidate user in other subgroups and the user to be recommended is not calculated, so that the calculation amount in the similar user determination process is further reduced, the determination speed of the similar user is increased, an information stream can be recommended to the user to be recommended in time, and the recommendation speed and recommendation efficiency of the information stream are increased.
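One way to divide a topic group by the value of the topic's dimension is equal-width bucketing, sketched below in Python. The bucketing scheme, the function names, and the sample data are assumptions; the specification only states that the division may be random or based on the dimension value.

```python
def split_topic_group(members, topic, num_subgroups=2):
    """Divide a topic group into equal-width buckets over the topic's probability."""
    subgroups = [dict() for _ in range(num_subgroups)]
    for user_id, vector in members.items():
        # A probability of exactly 1.0 falls into the last bucket.
        index = min(int(vector[topic] * num_subgroups), num_subgroups - 1)
        subgroups[index][user_id] = vector
    return subgroups

def first_subgroup(members, topic, target_id, num_subgroups=2):
    """Return the candidates sharing a sub-group with the target user."""
    for subgroup in split_topic_group(members, topic, num_subgroups):
        if target_id in subgroup:
            return {uid: vec for uid, vec in subgroup.items() if uid != target_id}
    return {}

# Hypothetical topic group for topic 0 that already contains the target's vector.
members = {
    "target": [0.9, 0.1],
    "user_a": [0.8, 0.2],
    "user_b": [0.4, 0.6],
}
peers = first_subgroup(members, topic=0, target_id="target")
print(peers)
```

Similarity is then calculated only against the returned candidates, so `user_b` is never compared with the target in this example.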

Fig. 3 is a schematic structural diagram of an information recommendation apparatus according to an embodiment of the present invention. As shown in fig. 3, the apparatus includes: an acquisition module 31, a generation module 32, an input module 33, a determination module 34 and a recommendation module 35.

The obtaining module 31 is configured to obtain historical behavior data of a user to be recommended; the historical behavior data comprises: the information flow clicked by the user to be recommended in a preset historical time period;

a generating module 32, configured to obtain a keyword in the information stream, and generate a keyword text;

the input module 33 is configured to input the keyword text into a preset document theme generation model LDA to obtain an LDA vector corresponding to the user to be recommended; the LDA vector comprises: the probability that the keyword text belongs to each topic;

a determining module 34, configured to determine, according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user, a similar user corresponding to the user to be recommended;

and the recommending module 35 is configured to recommend an information stream to the user to be recommended according to the historical behavior data of the similar users.

The information recommendation apparatus provided by the invention may be a hardware device, such as a terminal device, a background server, or a server cluster, or may be software installed on such a hardware device. In this embodiment, the user to be recommended is a user who clicks to view various information streams, for example Baidu feed streams or news. The historical behavior data describes the click-and-view behavior of the user to be recommended on various information. For example, the historical behavior data may include: the information streams clicked by the user to be recommended within a preset historical time period, the click time of each information stream, the viewing duration, the specific content of each information stream, and the like. In this embodiment, the historical behavior data of the user to be recommended may be collected by the terminal device that displays the information stream and reported to the information recommendation apparatus. The historical behavior data may also include the identifier of the user, so that the information recommendation apparatus can distinguish different users by their identifiers.
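As an illustration of the data described above, the sketch below models one click record and builds a keyword text from the records that fall inside a historical time window. The class name, field names, and the assumption that keywords have already been extracted and attached to each record are hypothetical; the description only lists what the historical behavior data may contain.

```python
from dataclasses import dataclass, field


@dataclass
class ClickRecord:
    """One clicked information stream (hypothetical record layout)."""
    user_id: str          # identifier used to tell users apart
    stream_id: str        # which information stream was clicked
    click_time: float     # click time, in seconds since the epoch
    view_seconds: float   # how long the stream was viewed
    keywords: list = field(default_factory=list)  # assumed pre-extracted


def keyword_text(records, user_id, start, end):
    """Join the keywords of every stream the user clicked in [start, end)."""
    words = []
    for record in records:
        if record.user_id == user_id and start <= record.click_time < end:
            words.extend(record.keywords)
    return " ".join(words)


records = [
    ClickRecord("u1", "s1", 100.0, 30.0, ["nba", "finals"]),
    ClickRecord("u1", "s2", 250.0, 12.0, ["playoffs"]),
    ClickRecord("u1", "s3", 900.0, 45.0, ["stocks"]),   # outside the window
    ClickRecord("u2", "s1", 120.0, 20.0, ["nba"]),      # different user
]
text = keyword_text(records, "u1", start=0.0, end=500.0)
```

The resulting string is the kind of keyword text that would then be passed to the LDA model.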

In this embodiment, the candidate users are all users who click to view information streams, other than the user to be recommended. In this embodiment, the information recommendation apparatus may periodically obtain the historical behavior data of each candidate user and execute the functions of the generating module 32 and the input module 33 to obtain and store the LDA vector corresponding to each candidate user, so that the stored vectors can be retrieved when similar users are calculated.

Further, the recommending module 35 may be specifically configured to compare the historical behavior data of the similar users with the historical behavior data of the user to be recommended; determine the information streams to be recommended, namely those information streams in the historical behavior data of the similar users that the user to be recommended has not clicked; and recommend the information streams to be recommended to the user to be recommended.
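The comparison performed by the recommending module can be sketched as a set difference. This is a minimal example under assumptions: the function name and the representation of historical behavior data as lists of stream identifiers are hypothetical, and no ranking of the resulting streams is attempted because the description does not specify one.

```python
def streams_to_recommend(target_clicked, similar_users_clicked):
    """Return streams clicked by similar users but not by the target user.

    target_clicked: stream ids clicked by the user to be recommended.
    similar_users_clicked: one list of stream ids per similar user.
    The result keeps first-seen order and contains no duplicates.
    """
    seen = set(target_clicked)
    result = []
    for clicked in similar_users_clicked:
        for stream_id in clicked:
            if stream_id not in seen:
                seen.add(stream_id)
                result.append(stream_id)
    return result


to_recommend = streams_to_recommend(
    ["s1", "s2"],
    [["s2", "s3"], ["s3", "s4", "s1"]],
)
```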

Further, with reference to fig. 4, on the basis of the embodiment shown in fig. 3, the apparatus may further include: a training module 36;

the obtaining module 31 is further configured to obtain a training sample; the training samples include: a plurality of keyword texts and corresponding LDA vectors;

the training module 36 is configured to train an initial LDA model according to the training samples to obtain the preset LDA model.

Fig. 5 is a schematic structural diagram of another information recommendation apparatus according to an embodiment of the present invention. As shown in fig. 5, on the basis of the embodiment shown in fig. 3, the determining module 34 may specifically include: a determination unit 341 and a calculation unit 342.

The determining unit 341 is configured to determine, according to the LDA vector corresponding to the user to be recommended, at least one topic group to which the user to be recommended belongs;

a calculating unit 342, configured to calculate, for each topic group, a similarity between the user to be recommended and each candidate user in the topic group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the topic group;

the determining unit 341 is further configured to determine, as a similar user corresponding to the user to be recommended, a candidate user whose corresponding similarity meets a preset similarity threshold.

The determining unit 341 may be specifically configured to obtain each topic whose corresponding probability in the LDA vector corresponding to the user to be recommended is greater than a preset probability threshold, and to determine the topic group matching that topic as a topic group to which the user to be recommended belongs.

In addition, it should be noted that, because different topics differ in popularity, different probability thresholds may be set for different topics in this embodiment. For example, assuming that the first dimension of the LDA vector corresponds to the first topic, if the value of the first dimension of the LDA vector corresponding to a candidate user is greater than the probability threshold of the first dimension, the LDA vector corresponding to that candidate user is added to the topic group corresponding to the first topic. In this embodiment, for each topic group, after the similarity between the user to be recommended and each candidate user in the topic group is obtained, each similarity may be compared with a preset similarity threshold, and a candidate user whose similarity is greater than the preset similarity threshold is determined as a similar user corresponding to the user to be recommended in that topic group. The similar users from all topic groups are then combined to obtain the similar users corresponding to the user to be recommended.
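The grouping and thresholding just described can be sketched as follows. The function names, the threshold values, and the choice of cosine similarity are assumptions for illustration only; a single similarity threshold is used here for brevity, although a separate threshold per topic group would fit the same structure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topics_above_threshold(vector, topic_thresholds):
    """Indices of topics whose probability exceeds that topic's own threshold."""
    return [t for t, (p, th) in enumerate(zip(vector, topic_thresholds)) if p > th]


def similar_users(target_vec, candidates, topic_thresholds, sim_threshold):
    """Find similar users by comparing only within shared topic groups."""
    # Put each candidate's LDA vector into every topic group it qualifies for.
    groups = {}
    for uid, vec in candidates.items():
        for t in topics_above_threshold(vec, topic_thresholds):
            groups.setdefault(t, {})[uid] = vec

    # Compare the target only with candidates in its own topic groups,
    # then merge the per-group results into one set.
    result = set()
    for t in topics_above_threshold(target_vec, topic_thresholds):
        for uid, vec in groups.get(t, {}).items():
            if cosine(target_vec, vec) > sim_threshold:
                result.add(uid)
    return result


# Hypothetical three-topic LDA vectors and per-topic probability thresholds.
thresholds = [0.5, 0.3, 0.3]
candidates = {
    "a": [0.6, 0.3, 0.1],
    "b": [0.1, 0.8, 0.1],
    "c": [0.55, 0.05, 0.4],
}
found = similar_users([0.7, 0.2, 0.1], candidates, thresholds, sim_threshold=0.9)
```

In this toy data the target belongs only to the group for topic 0, so candidate "b" is never compared, and candidate "c" is compared but falls below the similarity threshold.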

Further, referring to fig. 6 in combination, on the basis of the embodiment shown in fig. 5, the determining module 34 may further include: an adding unit 343 and a dividing unit 344.

The adding unit 343 is configured to add the LDA vector corresponding to the user to be recommended to the at least one topic group to which the user to be recommended belongs;

the dividing unit 344 is configured to divide the topic group to obtain at least two sub-groups for each topic group to which the user to be recommended belongs;

correspondingly, the calculating unit 342 is specifically configured to,

Fig. 7 is a schematic structural diagram of another information recommendation apparatus according to an embodiment of the present invention. The information recommendation device includes:

memory 1001, processor 1002, and computer programs stored on memory 1001 and executable on processor 1002.

The processor 1002, when executing the program, implements the information recommendation method provided in the above-described embodiments.

Further, the information recommendation apparatus further includes:

a communication interface 1003 for communicating between the memory 1001 and the processor 1002.

A memory 1001 for storing computer programs that may be run on the processor 1002.

Memory 1001 may include high-speed RAM memory and may also include non-volatile memory (e.g., at least one disk memory).

The processor 1002 is configured to implement the information recommendation method according to the foregoing embodiment when executing the program.

If the memory 1001, the processor 1002, and the communication interface 1003 are implemented independently, the communication interface 1003, the memory 1001, and the processor 1002 may be connected to each other through a bus and perform communication with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 7, but this is not intended to represent only one bus or type of bus.

Optionally, in a specific implementation, if the memory 1001, the processor 1002, and the communication interface 1003 are integrated on one chip, the memory 1001, the processor 1002, and the communication interface 1003 may complete communication with each other through an internal interface.

The processor 1002 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement embodiments of the present invention.

The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the information recommendation method as described above.

The present invention also provides a computer program product; when instructions in the computer program product are executed by a processor, an information recommendation method is performed, the method comprising:

acquiring keywords in the information stream to generate a keyword text;

In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

Any process or method description in a flow chart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes alternate implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art to which the embodiments of the present invention belong.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims

1. An information recommendation method, comprising:

acquiring keywords in the information stream to generate a keyword text;

recommending information flow to the user to be recommended according to the historical behavior data of the similar users;

the determining similar users corresponding to the user to be recommended according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user includes:

determining at least one topic group to which the user to be recommended belongs according to the LDA vector corresponding to the user to be recommended, wherein a topic whose corresponding probability in the LDA vector corresponding to the user to be recommended is greater than a preset probability threshold is obtained, and the topic group matching the topic is determined as the topic group to which the user to be recommended belongs;

2. The method of claim 1, wherein each topic group is provided with a corresponding similarity threshold.

3. The method according to claim 1, wherein before calculating, for each topic group, a similarity between the user to be recommended and each candidate user in the topic group according to the LDA vector corresponding to the user to be recommended and the LDA vector corresponding to each candidate user in the topic group, the method further comprises:

4. The method according to claim 1, wherein the determining at least one topic group to which the user to be recommended belongs according to the LDA vector corresponding to the user to be recommended comprises:

5. The method of claim 1, wherein the keywords comprise any one or more of the following: a label corresponding to the information stream, a search word corresponding to the information stream, and a keyword in the content of the information stream.

6. The method of claim 1, further comprising:

7. The method according to claim 1, wherein recommending information flow to the user to be recommended according to the historical behavior data of the similar users comprises:

8. An information recommendation apparatus, comprising:

the recommending module is used for recommending information flow to the user to be recommended according to the historical behavior data of the similar users;

the determining module comprises:

the determining unit is configured to determine at least one topic group to which the user to be recommended belongs according to the LDA vector corresponding to the user to be recommended, wherein a topic whose corresponding probability in the LDA vector corresponding to the user to be recommended is greater than a preset probability threshold is obtained, and the topic group matching the topic is determined as the topic group to which the user to be recommended belongs;

9. The apparatus of claim 8, wherein each topic group is provided with a corresponding similarity threshold.

10. The apparatus of claim 8, wherein the determining module further comprises: an adding unit and a dividing unit;

correspondingly, the computing unit is specifically configured to,

11. The apparatus according to claim 8, characterized in that the determination unit is specifically configured to,

12. The apparatus of claim 8, wherein the keywords comprise any one or more of the following: a label corresponding to the information stream, a search word corresponding to the information stream, and a keyword in the content of the information stream.

13. The apparatus of claim 8, further comprising: a training module;

14. The apparatus of claim 8, wherein the recommendation module is specifically configured to,

15. An information recommendation apparatus, comprising:

memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the information recommendation method according to any one of claims 1-7 when executing the program.

16. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the information recommendation method according to any one of claims 1-7.