CN111767459A - Item recommendation method and device - Google Patents


Publication number
CN111767459A
Authority
CN
China
Prior art keywords
recommended
item
user
article
specific
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910983938.2A
Other languages
Chinese (zh)
Inventor
闫瑞阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201910983938.2A
Publication of CN111767459A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Abstract

The invention discloses a method and a device for recommending articles, and relates to the technical field of computers. One embodiment of the method comprises: according to the attribute information of the to-be-recommended articles, constructing at least one to-be-recommended article pool, and setting a specific theme of the at least one to-be-recommended article pool; determining the correlation degree between the user and the item to be recommended according to the historical behavior data of the user and the specific theme based on the document theme generation model; and training an article recommendation model by using the correlation degree and the historical behavior data of the user, and recommending the target article to the user according to the article recommendation model. According to the embodiment, the document theme generating model in machine learning is adopted for theme word extraction, all data can be used to the maximum extent, and therefore articles can be recommended to a user in a personalized mode more accurately and effectively.

Description

Item recommendation method and device
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for recommending articles.
Background
In the information age, online shopping has become the mainstream way to shop, and items of all kinds, such as clothes, furniture, online courses, and life services, can be purchased on the internet. To meet users' diverse demands, network platforms present a wide variety of items, and with so many items available it is of great significance to recommend the items a user is interested in as early as possible.
Because item information comes from complex, irregular sources and contains a large number of missing values, prior-art item recommendation methods collect and filter the item information, process the missing values, and then combine the result with the user's history to obtain a user preference model, which is used to recommend items to the user.
However, current item recommendation scenarios handle missing values by zero setting or by traditional filling methods such as mean filling, mode filling, or nearest-neighbor filling. These methods cope poorly with missing data, which makes the user preference model inaccurate and can even degrade the user experience by recommending items the user dislikes.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for recommending an article, which can extract a topic word by using a document topic generation model in machine learning, and can use all data to the greatest extent, so as to recommend an article more accurately and effectively for a user in a personalized manner.
To achieve the above object, according to a first aspect of embodiments of the present invention, there is provided a method of item recommendation.
The method for recommending the article comprises the following steps: according to the attribute information of the to-be-recommended articles, constructing at least one to-be-recommended article pool, and setting a specific theme of the at least one to-be-recommended article pool; determining the degree of correlation between the user and the item to be recommended according to historical behavior data of the user and the specific theme based on a document theme generation model; and training an article recommendation model by using the correlation degree and the historical behavior data of the user, and recommending a target article to the user according to the article recommendation model.
Optionally, determining, based on the document theme generation model, the degree of correlation between the user and the item to be recommended according to the historical behavior data of the user and the specific theme includes: generating, based on the document theme generation model, a specific theme vector of each item to be recommended in the at least one item pool to be recommended according to the specific theme of the at least one item pool to be recommended; calculating a user preference theme vector according to the historical behavior data of the user and the specific theme vector of each item to be recommended; and calculating the similarity between the user preference theme vector and the specific theme vector of each item to be recommended based on a modified cosine similarity algorithm, so as to obtain the degree of correlation between the user and each item to be recommended.
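As an illustration of the similarity step, the following is a minimal sketch only. The source names a "modified cosine similarity algorithm" without defining it; the sketch assumes one common reading, in which each vector is centered by subtracting its own mean component before the ordinary cosine is taken. The function name `modified_cosine` and the sample vectors are hypothetical.

```python
import math

def modified_cosine(u, v):
    """Cosine similarity after subtracting each vector's own mean.

    Assumed variant: the source does not define its "modified" cosine.
    """
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cu = [x - mean_u for x in u]
    cv = [x - mean_v for x in v]
    dot = sum(a * b for a, b in zip(cu, cv))
    norm = math.sqrt(sum(a * a for a in cu)) * math.sqrt(sum(b * b for b in cv))
    # A vector with no variation carries no preference signal; return 0.
    return dot / norm if norm else 0.0

# Hypothetical user preference theme vector and item theme vectors.
preference = [0.7, 0.2, 0.1]
item_vectors = {"sku1": [0.6, 0.3, 0.1], "sku2": [0.1, 0.2, 0.7]}
relevance = {item: modified_cosine(preference, vec)
             for item, vec in item_vectors.items()}
```

Under this assumption an item whose theme distribution leans the same way as the user's preference scores close to 1, and one that leans the opposite way scores below 0.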
Optionally, the calculating a user preference topic vector according to the historical behavior data of the user and the specific topic vector of each item to be recommended includes: extracting at least one historical operation article of the user from the articles to be recommended according to the historical behavior data of the user, and acquiring a specific theme vector of the at least one historical operation article; determining at least one specific to-be-recommended item pool to which the at least one historical operation item belongs; based on a preset operation weighting algorithm, according to the specific theme vector and the operation information of each historical operation article in each specific to-be-recommended article pool, calculating the user preference theme vector corresponding to each specific to-be-recommended article pool.
Optionally, the calculating the similarity between the user preference topic vector and the specific topic vector of each item to be recommended includes: and calculating the similarity between the user preference theme vector corresponding to each specific to-be-recommended item pool and the specific theme vector of each to-be-recommended item in each specific to-be-recommended item pool.
Optionally, the calculating, based on a preset operation weighting algorithm, a user preference theme vector corresponding to each specific to-be-recommended item pool according to the specific theme vector and the operation information of each historically-operated item in each specific to-be-recommended item pool includes: acquiring the operation time and operation behavior of each historical operation article in each specific article pool to be recommended; based on a preset operation weighting algorithm, determining a weighting weight corresponding to a specific theme vector of each historical operation article in each specific article pool to be recommended according to the operation time and operation behavior of each historical operation article in each specific article pool to be recommended; and calculating a user preference theme vector corresponding to each specific item pool to be recommended according to the weighted weight.
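The weighting described above can be sketched as follows. The source does not disclose the preset operation weighting algorithm, so everything specific here is an assumption: the per-behavior weights in `BEHAVIOR_WEIGHT`, the exponential time decay with a seven-day half-life, and the function names are hypothetical and serve only to show how operation time and operation behavior could jointly weight the theme vectors.

```python
# Hypothetical behavior weights; the source only says that the weight
# depends on the operation behavior and the operation time.
BEHAVIOR_WEIGHT = {"browse": 1.0, "follow": 2.0, "collect": 3.0}

def operation_weight(behavior, age_days, half_life_days=7.0):
    """Behavior weight multiplied by an exponential time decay (assumed form)."""
    decay = 0.5 ** (age_days / half_life_days)
    return BEHAVIOR_WEIGHT[behavior] * decay

def user_preference_vector(operations, topic_vectors):
    """Weighted average of the theme vectors of the items the user operated on.

    operations: list of (item_id, behavior, age_days) within ONE item pool.
    """
    dim = len(next(iter(topic_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for item_id, behavior, age_days in operations:
        w = operation_weight(behavior, age_days)
        weight_sum += w
        for k, value in enumerate(topic_vectors[item_id]):
            total[k] += w * value
    return [t / weight_sum for t in total]

# Hypothetical theme vectors and history for one pool.
topic_vectors = {"sku1": [0.8, 0.2], "sku2": [0.1, 0.9]}
operations = [("sku1", "collect", 0.0), ("sku2", "browse", 7.0)]
preference = user_preference_vector(operations, topic_vectors)
```

In this example the item collected today counts six times as much as the item browsed a week ago, so the resulting preference vector stays close to the theme vector of `sku1`.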
Optionally, the constructing at least one pool of the items to be recommended according to the attribute information of the items to be recommended includes: according to the category information in the attribute information of the item to be recommended, dividing the item to be recommended into at least one category; determining the characteristic attribute corresponding to each category based on the attribute screening principle corresponding to each category; screening out characteristic information from the attribute information of the to-be-recommended articles corresponding to each category according to the characteristic attribute corresponding to each category; and constructing a pool of the objects to be recommended corresponding to each category by using the objects to be recommended corresponding to each category and the screened characteristic information.
Optionally, the training an item recommendation model by using the correlation degree and the historical behavior data of the user, and recommending a target item to the user according to the item recommendation model includes: generating first characteristic data and second characteristic data according to the correlation degree between the user and the item to be recommended and the historical behavior data of the user; training an article recommendation model corresponding to the user by using the first feature data, wherein the input of the article recommendation model corresponding to the user is feature data, and the output is a target article; and determining a target item corresponding to the second characteristic data by using the item recommendation model corresponding to the user, and pushing the target item corresponding to the second characteristic data to the user.
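The train-then-predict split described above can be illustrated with a small sketch. The source does not state which model type is used, so a plain logistic regression trained by stochastic gradient descent stands in here purely as an assumption. The two features (degree of correlation and a browsed flag), the labels, and all names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=500):
    """Stochastic gradient descent on the logistic loss (stand-in model)."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            grad = p - y
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias

def score(model, x):
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# First feature data (hypothetical): [degree of correlation, browsed flag],
# labeled with whether the user went on to select the item.
first_features = [[0.9, 1.0], [0.8, 0.0], [0.1, 1.0], [0.2, 0.0]]
labels = [1, 1, 0, 0]
model = train_logistic(first_features, labels)

# Second feature data (hypothetical): candidate items to score for the user.
second_features = {"sku5": [0.85, 0.0], "sku6": [0.15, 1.0]}
target_item = max(second_features,
                  key=lambda item: score(model, second_features[item]))
```

The candidate with the highest predicted score is taken as the target item to push to the user.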
To achieve the above object, according to a second aspect of the embodiments of the present invention, there is provided an article recommendation apparatus.
The article recommending device of the embodiment of the invention comprises: a building module, configured to build at least one to-be-recommended item pool according to the attribute information of the items to be recommended and to set a specific theme of the at least one to-be-recommended item pool; a determining module, configured to determine, based on a document theme generation model, the degree of correlation between the user and the item to be recommended according to the historical behavior data of the user and the specific theme; and a recommending module, configured to train an item recommendation model by using the degree of correlation and the historical behavior data of the user, and to recommend a target item to the user according to the item recommendation model.
Optionally, the determining module is further configured to: generating a specific theme vector of each item to be recommended in the at least one item pool to be recommended according to the specific theme of the at least one item pool to be recommended based on a document theme generation model; calculating a user preference theme vector according to the historical behavior data of the user and the specific theme vector of each item to be recommended; and calculating the similarity between the user preference theme vector and the specific theme vector of each item to be recommended based on a modified cosine similarity algorithm so as to obtain the correlation degree between the user and each item to be recommended.
Optionally, the determining module is further configured to: extracting at least one historical operation article of the user from the articles to be recommended according to the historical behavior data of the user, and acquiring a specific theme vector of the at least one historical operation article; determining at least one specific to-be-recommended item pool to which the at least one historical operation item belongs; based on a preset operation weighting algorithm, according to the specific theme vector and the operation information of each historical operation article in each specific to-be-recommended article pool, calculating the user preference theme vector corresponding to each specific to-be-recommended article pool.
Optionally, the determining module is further configured to: and calculating the similarity between the user preference theme vector corresponding to each specific to-be-recommended item pool and the specific theme vector of each to-be-recommended item in each specific to-be-recommended item pool.
Optionally, the determining module is further configured to: acquiring the operation time and operation behavior of each historical operation article in each specific article pool to be recommended; based on a preset operation weighting algorithm, determining a weighting weight corresponding to a specific theme vector of each historical operation article in each specific article pool to be recommended according to the operation time and operation behavior of each historical operation article in each specific article pool to be recommended; and calculating a user preference theme vector corresponding to each specific item pool to be recommended according to the weighted weight.
Optionally, the building module is further configured to: according to the category information in the attribute information of the item to be recommended, dividing the item to be recommended into at least one category; determining the characteristic attribute corresponding to each category based on the attribute screening principle corresponding to each category; screening out characteristic information from the attribute information of the to-be-recommended articles corresponding to each category according to the characteristic attribute corresponding to each category; and constructing a pool of the objects to be recommended corresponding to each category by using the objects to be recommended corresponding to each category and the screened characteristic information.
Optionally, the recommendation module is further configured to: generating first characteristic data and second characteristic data according to the correlation degree between the user and the item to be recommended and the historical behavior data of the user; training an article recommendation model corresponding to the user by using the first feature data, wherein the input of the article recommendation model corresponding to the user is feature data, and the output is a target article; and determining a target item corresponding to the second characteristic data by using the item recommendation model corresponding to the user, and pushing the target item corresponding to the second characteristic data to the user.
To achieve the above object, according to a third aspect of embodiments of the present invention, there is provided an electronic apparatus.
An electronic device of an embodiment of the present invention includes: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by one or more processors, the one or more processors implement the method for recommending the article according to the embodiment of the invention.
To achieve the above object, according to a fourth aspect of embodiments of the present invention, there is provided a computer-readable medium.
A computer-readable medium of an embodiment of the present invention has stored thereon a computer program that, when executed by a processor, implements a method of item recommendation of an embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: the method comprises the steps of constructing an article pool to be recommended, setting a specific theme of the article pool to be recommended, obtaining the degree of correlation between a user and an article to be recommended by adopting a document theme generation model in machine learning, combining the degree of correlation with historical behavior data of the user, training to obtain an article recommendation model, and recommending the article for the user. In addition, in consideration of the fact that different users have different requirements for different types of articles in actual application, and the operation behaviors of the users are completely different for different types of articles, in the embodiment of the invention, the articles to be recommended are firstly classified according to different category information, all the articles to be recommended are divided into a plurality of categories, and the specific category scale can be set according to the actual situation, so that the accuracy of article recommendation in the embodiment of the invention can be improved, the user experience is improved, and the practicability of the scheme is improved.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of a method of item recommendation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the training process of LDA;
FIG. 3 is a schematic diagram of a main flow of a method of item recommendation according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main modules of an apparatus for item recommendation according to an embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
On a network platform, item suppliers describe item attributes in different fields to convey the characteristics of an item, which both helps users compare and choose and helps the data department analyze and tally items. Because there are many suppliers and offline scenarios are complex, different suppliers may use different fields to describe the same type of item; for example, the style of a dress may be recorded under fields such as style or fashion style. Meanwhile, suppliers commonly leave common fields blank when filling in item information, which causes missing data.
Prior-art item recommendation methods process missing values in the traditional way: for example, the related attributes of similar items are used for filling, and if no similar item exists the missing value is set directly to zero; the processed features are then used to train a recommendation model, and the training result is used for the final prediction. However, traditional missing-value processing does not suit the item recommendation scenario. For example, if the type information of a pair of shoes is missing and the nearest-neighbor method is used for filling, the characteristics of running shoes are likely to be filled in, which may cause recommendation errors. As another example, some non-numeric data cannot be supplemented by numeric methods: the style field of clothing takes values such as popular, traditional, or campus, and cannot be filled by the average, the mode, or the nearest neighbor, yet a large number of items would be lost if every item with an unfilled field were discarded. Even where nearest-value filling or zero setting can be applied, it introduces errors. Moreover, the existing attribute fields cannot be replaced by a small number of codes, and adding the attributes directly to a model as features also raises a scalability problem.
Therefore, the invention provides a method for recommending items that uses a document theme generation model in machine learning (Latent Dirichlet Allocation, LDA for short, an unsupervised machine learning technique that can identify latent theme information in a large-scale document set or corpus) to extract theme words. The method can use all data to the greatest extent and effectively avoids the problem of missing data, thereby recommending items to users in a personalized manner more accurately and effectively. FIG. 1 is a schematic diagram of the main steps of a method of item recommendation according to an embodiment of the present invention. As a reference embodiment of the present invention, as shown in FIG. 1, the main steps of the method for recommending an item according to an embodiment of the present invention may include:
step S101: according to the attribute information of the to-be-recommended articles, constructing at least one to-be-recommended article pool, and setting a specific theme of the at least one to-be-recommended article pool;
step S102: determining the correlation degree between the user and the item to be recommended according to the historical behavior data of the user and the specific theme based on the document theme generation model;
step S103: and training an article recommendation model by using the correlation degree and the historical behavior data of the user, and recommending the target article to the user according to the article recommendation model.
In the article recommendation method of the embodiment of the invention, the attribute information of the article to be recommended is firstly acquired. The items to be recommended refer to all items recommended for the user, namely, items registered on the network platform; the attribute information refers to basic information of the article, such as information of a category, color, brand, size, model, and the like of the article. Considering that hundreds of millions of articles are registered on a network platform, if all attribute information of the articles are used for training a model, a lot of time and resources are consumed, in the embodiment of the present invention, articles that have interacted with a user may be screened out as articles to be recommended, where the articles that have interacted with the user may be articles browsed, collected or concerned by the user on the network platform, and then the attribute information of the articles to be recommended is obtained to construct an article pool to be recommended. In consideration of diversity of article types, in the embodiment of the invention, different article pools to be recommended can be constructed according to different article types.
The item pool is constructed as follows. The required basic data is first extracted from the database: log data of users' browsing, clicking, following, collecting, or selecting on the various user-device channels is obtained from the underlying database, and the data of the different channels is cleaned day by day. At the same time, the related attribute information of all items is extracted and merged by item type, which completes the basic data extraction.
Then, considering that the LDA model is used for topic word extraction in the embodiment of the present invention, it is necessary to separately set a specific theme of each item pool to be recommended, for example, if the item pool to be recommended is a clothing category, the set specific theme may be a sweet theme, a leisure theme, a mature theme, or the like, if the item pool to be recommended is a book category, the set specific theme may be a novel theme, a prose theme, a poem theme, or the like, and if the item pool to be recommended is an appliance category, the set specific theme may be a home appliance theme, an office appliance theme, or the like.
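To make the role of the LDA model concrete, the following is a minimal sketch of collapsed Gibbs sampling, one standard way to fit LDA; the source does not say which training procedure is used. The sketch assumes that each item is treated as a "document" whose "words" are its screened attribute values, and that the number of topics equals the number of themes set for the pool. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the sample items are hypothetical.

```python
import random

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one topic vector per document."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]                 # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                               # n(k)
    assign = []
    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                z, wid = assign[d][n], word_id[w]
                # Remove the current assignment, then resample the topic.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    # Smoothed document-topic distribution; each vector sums to 1.
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]

# Hypothetical clothing pool: each item is a bag of attribute values.
items = {
    "sku1": ["sweet", "lace", "pink"],
    "sku2": ["sweet", "pink", "bow"],
    "sku3": ["casual", "denim", "loose"],
    "sku4": ["casual", "loose", "cotton"],
}
topic_vectors = dict(zip(items, lda_gibbs(list(items.values()), num_topics=2)))
```

Each returned vector is a probability distribution over topics. The sampler does not decide which topic index corresponds to which named theme, so any such mapping would have to be established separately.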
After the specific theme of each to-be-recommended item pool is set, the degree of correlation between the user and the to-be-recommended item may be determined based on the LDA model by combining the historical behavior data of the user with the specific theme of the to-be-recommended item pool set in step S101. It should be noted that, in the embodiment of the present invention, at least one to-be-recommended item pool is constructed in step S101, and a corresponding specific theme is set for each pool. The degree of correlation between the user and the items to be recommended is then calculated separately for each pool. For example, suppose there are three to-be-recommended item pools, for clothing, books, and home appliances, and the theme vectors S1, S2, and S3 of the three pools are set respectively. The acquired user behavior data is then processed by item category, that is, divided into A1, A2, and A3, where A1 is the behavior data of the user for clothing items, A2 is the behavior data of the user for book items, and A3 is the behavior data of the user for home appliance items. The degree of correlation between the user and the clothing items is calculated from S1 and A1, that between the user and the book items from S2 and A2, and that between the user and the home appliance items from S3 and A3. This yields the degree of correlation between the user and all the items to be recommended, namely the user's degree of liking for each item to be recommended.
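The per-pool calculation in this example can be sketched as below. It is a simplified illustration, not the method of the embodiment: the user preference vector is an unweighted mean of the theme vectors of the items the user interacted with, and plain cosine similarity is used. The names `split_by_pool` and `relevance_per_pool` and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def split_by_pool(behaviors, item_pool):
    """Partition the behavior records by item pool (A1, A2, A3 in the text)."""
    parts = defaultdict(list)
    for item_id in behaviors:
        parts[item_pool[item_id]].append(item_id)
    return dict(parts)

def relevance_per_pool(behaviors, item_pool, topic_vectors):
    """Score every item against the preference vector of its own pool only."""
    result = {}
    for pool, history in split_by_pool(behaviors, item_pool).items():
        dim = len(topic_vectors[history[0]])
        # Simplification: unweighted mean of the operated items' theme vectors.
        pref = [sum(topic_vectors[i][k] for i in history) / len(history)
                for k in range(dim)]
        for item_id, item_pool_name in item_pool.items():
            if item_pool_name == pool:
                result[item_id] = cosine(pref, topic_vectors[item_id])
    return result

# Hypothetical pools; theme vectors may have a different length per pool.
item_pool = {"dress": "clothing", "coat": "clothing",
             "novel": "books", "poems": "books"}
topic_vectors = {"dress": [0.9, 0.1], "coat": [0.2, 0.8],
                 "novel": [0.7, 0.2, 0.1], "poems": [0.1, 0.1, 0.8]}
behaviors = ["dress", "novel"]
relevance = relevance_per_pool(behaviors, item_pool, topic_vectors)
```

Keeping the calculation inside each pool means that vectors from different theme spaces are never compared with one another.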
After the preference degree of the user to the recommended articles is obtained, the preference degree can be combined with the historical behavior data of the user to train an article recommendation model. Finally, the trained item recommendation model can be utilized to recommend items for the user. In the embodiment of the invention, the preference degree of the user to the item to be recommended is calculated by using the LDA, so that the data waste caused by the traditional missing value processing mode can be avoided, the data utilization rate is improved, the problem of missing value processing of the traditional method is solved, the historical behavior data of the user and the preference degree of the user to the item obtained by the LDA are combined, an item recommendation model is trained, and the accuracy of item recommendation can be improved.
As can be seen from the foregoing description, determining the user's degree of preference for the items to be recommended is an important part of the method of the embodiment of the present invention. Before the degree of preference is determined, at least one to-be-recommended item pool needs to be constructed so that the items in different pools can be analyzed separately. A specific method for constructing the to-be-recommended item pool is described next. As a reference embodiment of the present invention, constructing at least one to-be-recommended item pool according to attribute information of an item to be recommended may include:
step S1011: according to the category information in the attribute information of the to-be-recommended articles, dividing the to-be-recommended articles into at least one category;
step S1012: determining the characteristic attribute corresponding to each category based on the attribute screening principle corresponding to each category;
step S1013: screening out characteristic information from the attribute information of the to-be-recommended articles corresponding to each category according to the characteristic attribute corresponding to each category;
step S1014: and constructing a pool of the objects to be recommended corresponding to each category by using the objects to be recommended corresponding to each category and the screened characteristic information.
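Steps S1011 to S1014 can be sketched as follows. This is a minimal illustration under stated assumptions: the item records, the `FEATURE_ATTRIBUTES` table (standing in for the outcome of the per-category attribute screening), and the function name `build_item_pools` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical result of the per-category attribute screening (S1012).
FEATURE_ATTRIBUTES = {
    "clothing": ["style", "material"],
    "books": ["genre"],
}

def build_item_pools(items, feature_attributes):
    """Group items by category (S1011) and keep only the screened
    feature information of each item (S1013, S1014)."""
    pools = defaultdict(dict)
    for item in items:
        category = item["category"]
        wanted = feature_attributes.get(category, [])
        # Missing or empty attributes are skipped rather than imputed.
        features = {
            name: item["attributes"][name]
            for name in wanted
            if item["attributes"].get(name)
        }
        pools[category][item["id"]] = features
    return dict(pools)

items = [
    {"id": "sku1", "category": "clothing",
     "attributes": {"style": "casual", "material": "cotton", "origin": "CN"}},
    {"id": "sku2", "category": "clothing",
     "attributes": {"style": "", "material": "wool"}},
    {"id": "sku3", "category": "books",
     "attributes": {"genre": "novel"}},
]
pools = build_item_pools(items, FEATURE_ATTRIBUTES)
```

An item with an empty field, such as `sku2`, stays in its pool with whatever feature information it does have, which matches the stated goal of using all data rather than discarding or imputing.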
In consideration of the fact that different users have different requirements for different types of articles in actual application, and the operation behaviors of the users are completely different for different types of articles, in the embodiment of the invention, the articles to be recommended are firstly classified according to different category information, all the articles to be recommended are classified into a plurality of categories, and the specific category scale can be set according to actual conditions, for example, the garment can be used as a large category, and can also be set into two categories, namely, women's garment and men's garment. In addition, in the embodiment of the invention, the items to be recommended are classified, and the requirement degree of the user is also considered, for example, for some articles necessary for life, the user may simply browse and select the items, and for some large valuables, the user can browse for many times and pay attention to the items repeatedly and finally select the items, so that completely different user behavior characteristics can generate errors if only one model is used for training.
For example, the range of item categories can be defined according to the business scenario, taking business requirements and engineering capacity into account, which in turn determines the range of items in the item pool to be recommended. On a given network platform, clothing, shoes and digital products may be the most important categories, in which case the items in the item pools to be recommended come from these three categories. In addition, certain offline information must be stored for prediction in production, and both the storage space this information occupies and the time prediction takes place heavy demands on engineering. The item pool therefore cannot be too large; otherwise storage itself becomes a problem.
After the items to be recommended are classified, the characteristic attributes of the items in each category need to be screened. A characteristic attribute is an attribute that describes a feature of an item, such as size, style or material. The attribute information that item suppliers fill in on the network platform is not uniform, and hundreds or more attributes may be used to describe items of the same category. For example, for the style field most commonly used for clothing, there are several similar fields such as style, fashion style, preference and applicable style, so the attributes need to be screened. Characteristic attributes can be screened according to a range of statistical indicators, such as coverage, discrimination, mean, mode, proportion of valid fields and value distribution, combined with specific business requirements, and the attributes selected serve as the basic features for subsequent model construction. Specifically, attributes differ in how well they cover the items. Some attributes, such as style, are present for most clothing, while others are frequently missing. For example, the fields for style and fashion style have similar meanings, but most item suppliers do not provide the fashion style attribute, so deleting the fashion style attribute may be considered.
At the same time, attributes with strong discrimination should be selected, that is, attributes that distinguish an item from other items. Take the material attribute as an example: among clothing items, about 65% are made of cotton and about 90% or more have a material related to cotton, so nearly all clothing is associated with cotton. The material attribute therefore has almost no discriminating power for the clothing category and is not selected as a characteristic attribute.
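The screening by coverage and discrimination described above can be illustrated with a short Python sketch. The function name, the two thresholds and the sample data are assumptions made for illustration only; the embodiment does not fix concrete thresholds, and the other indicators it lists (mean, mode, value distribution and so on) are omitted here.

```python
from collections import Counter

def screen_attributes(items, min_coverage=0.6, max_dominant_share=0.7):
    """Keep attributes that are filled in often enough (coverage) and
    are not dominated by a single value (a crude discrimination check).
    Both thresholds are hypothetical."""
    attribute_names = {name for item in items for name in item}
    selected = []
    for name in sorted(attribute_names):
        values = [item[name] for item in items if item.get(name)]
        coverage = len(values) / len(items)
        if not values or coverage < min_coverage:
            continue  # too many suppliers leave this attribute empty
        dominant_share = Counter(values).most_common(1)[0][1] / len(values)
        if dominant_share > max_dominant_share:
            continue  # one value dominates, so the attribute barely discriminates
        selected.append(name)
    return selected

# Hypothetical clothing items; absent keys stand for missing attribute values.
items = [
    {"style": "fresh", "material": "cotton", "fashion_style": "casual"},
    {"style": "ordinary", "material": "cotton"},
    {"style": "mature", "material": "cotton"},
    {"style": "sweet", "material": "cotton"},
    {"style": "fresh", "material": "cotton blend"},
]
selected = screen_attributes(items)
```

With this sample data, fashion_style is dropped for low coverage and material is dropped because one value dominates, which leaves style as the only selected attribute.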
In summary, the characteristic attributes selected for the items to be recommended in each category in the embodiment of the present invention need to have strong discrimination, high coverage of the items, large variance, a high proportion of valid fields, and so on. After the characteristic attributes corresponding to each category are determined, the specific feature information corresponding to those attributes needs to be screened out from the attribute information of the items to be recommended in each category. For example, suppose the category is clothing, the items to be recommended are various garments, the selected characteristic attributes are style, color and fit, and there are three garments C1, C2 and C3. The extracted feature information is then: C1 has a fresh style, a light blue color and a loose fit; C2 has an ordinary style, a white color and a slim fit; C3 has a mature style, a color-block color and a slim fit. Note that if an item to be recommended has no feature information for some characteristic attribute, a missing value exists; the missing value is simply left empty and does not need to be filled with other information. This is because the method according to the embodiment of the present invention uses LDA to generate the topic vector of an item: the feature information describing an item is combined into a document, and the topic vector of that document is then determined, so the absence of some feature information has little effect on the topic vector of the item, as described in detail later.
Step S1014 constructs the item pool to be recommended corresponding to each category by using the items to be recommended corresponding to each category and the screened feature information. That is, an item pool to be recommended constructed in the embodiment of the present invention may include the items to be recommended and the feature information corresponding to each of them. In other words, for each item to be recommended, the corresponding feature information is extracted and stored together with the item, so that the item and its feature information are bound to each other. Because the feature information has no inherent order, it can be stored as an array.
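As a minimal sketch of steps S1011 to S1014, the following Python code groups items by category and binds each item to an array of its screened feature values, leaving missing values out rather than filling them in. The record layout, the attribute names (including "fit") and the function name are assumptions for illustration.

```python
from collections import defaultdict

def build_pools(items, feature_attributes):
    """Return {category: {item_id: [feature values]}}.

    feature_attributes maps each category to its screened characteristic
    attributes. Missing values are skipped, not imputed."""
    pools = defaultdict(dict)
    for item in items:
        category = item["category"]
        wanted = feature_attributes.get(category, [])
        features = [item["attributes"][name] for name in wanted
                    if item["attributes"].get(name)]
        pools[category][item["id"]] = features
    return dict(pools)

# Hypothetical data; C3 deliberately lacks a "fit" value.
items = [
    {"id": "C1", "category": "clothing",
     "attributes": {"style": "fresh", "color": "light blue", "fit": "loose"}},
    {"id": "C2", "category": "clothing",
     "attributes": {"style": "ordinary", "color": "white", "fit": "slim"}},
    {"id": "C3", "category": "clothing",
     "attributes": {"style": "mature", "color": "color-block"}},
    {"id": "B1", "category": "books",
     "attributes": {"genre": "detective"}},
]
feature_attributes = {"clothing": ["style", "color", "fit"], "books": ["genre"]}
pools = build_pools(items, feature_attributes)
```

Each array of feature values can later be treated as the document that describes the item.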
With the method for constructing the item pool to be recommended described, another important part of the embodiment of the present invention is explained next, namely determining the degree of correlation between the user and the items to be recommended. In some reference embodiments of the present invention, determining the degree of correlation between the user and the items to be recommended using LDA may include:
step S1021: generating a specific topic vector of each item to be recommended in at least one item pool to be recommended according to the specific topics of the at least one item pool to be recommended, based on a document topic generation model;
step S1022: calculating a user preference topic vector according to the historical behavior data of the user and the specific topic vector of each item to be recommended;
step S1023: and calculating the similarity between the user preference topic vector and the specific topic vector of each item to be recommended based on a modified cosine similarity algorithm, so as to obtain the degree of correlation between the user and each item to be recommended.
Through steps S1021 to S1023, the embodiment of the present invention determines the degree of correlation between the user and the items to be recommended by first obtaining the specific topic vector of each item to be recommended, then obtaining the user preference topic vector, and finally calculating the similarity between the two. This similarity represents the degree of correlation between the user and the item to be recommended, that is, the user's degree of preference for the item.
LDA is the core of the item recommendation method according to the embodiment of the present invention, and FIG. 2 is a schematic diagram of the LDA training process. LDA is a document topic generation model: a three-layer Bayesian probability model that extracts a number of topics from a document to describe it, where a topic represents the core meaning of the document. In practice, n topics can be specified for m documents, and the k-th dimension of the vector generated for the i-th document represents the probability that the document belongs to the k-th topic. n is a hyperparameter that must be chosen in use, in light of actual results. In the item recommendation scenario of the embodiment of the present invention, n specific topics are specified for a given item pool to be recommended, and the feature information describing each item in the pool is combined and treated as a document. LDA then extracts topics to describe each item to be recommended, so that a specific topic vector of the item can be generated from its feature information, with each dimension of the vector representing the probability that the item belongs to one specific topic.
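The embodiment does not state how the LDA model is fitted. As a minimal, pure-Python sketch, the code below assumes collapsed Gibbs sampling and treats each item's feature values as the words of a document. The function name and the hyperparameters alpha, beta and iters are illustrative assumptions; a production system would more likely rely on an established LDA library.

```python
import random

def lda_topic_vectors(docs, n_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling and return, for each document,
    a vector of n_topics probabilities that sums to 1."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]        # topic counts per document
    topic_word = [[0] * V for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                      # total words per topic
    assignments = []
    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)
    # Resample each assignment from its conditional distribution.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta) / (topic_total[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    # Smoothed document-topic proportions.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

# Hypothetical item "documents"; the last item has fewer feature values.
docs = [
    ["warm", "sweet", "student", "chiffon", "skirt"],
    ["sweet", "chiffon", "skirt"],
    ["dark", "mature", "wool", "coat"],
    ["mature", "wool"],
]
vectors = lda_topic_vectors(docs, n_topics=2)
```

An item with fewer feature values still receives a full topic vector, which is the property the embodiment relies on when it leaves missing values empty.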
For example, suppose a given item pool to be recommended contains three items to be recommended, a, b and c, and has two specific topics. For item a, the resulting specific topic vector is [0.3, 0.7], that is, the probabilities that item a belongs to the two specific topics are 0.3 and 0.7 respectively; for item b, the specific topic vector is [0.2, 0.8]; and for item c, the specific topic vector is [0.9, 0.3]. Besides describing each item, the specific topic vectors can be used to calculate the similarity between items directly. For example, the cosine similarity of a and b is larger than the cosine similarity of a and c, so a and b can be considered more similar.
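The comparison in this example can be checked with a few lines of Python. The helper name is an assumption; the vectors are the ones given in the text.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

a, b, c = [0.3, 0.7], [0.2, 0.8], [0.9, 0.3]
sim_ab = cosine_similarity(a, b)
sim_ac = cosine_similarity(a, c)
```

Here sim_ab comes out at about 0.99 and sim_ac at about 0.66, which agrees with the statement that a and b are more similar than a and c.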
In terms of a concrete scenario, the words that describe an item to be recommended together form a target scene matched to the user, and the extracted specific topics are a description of that scene. For example, in a target scene built from attributes such as warm colors, sweet, student, chiffon and skirt, the target user is a young girl who likes a sweet and innocent style, so in the corresponding specific topic vector the score of the specific topic for that style is much higher than the scores of the other topics. Generating the specific topic vector with LDA has the following advantages. Because specific topics are constructed, the final topic classification is not affected if a single item to be recommended lacks one or more characteristic attributes, so the method is insensitive to missing values and makes maximum use of all available data. Adding a new characteristic attribute is simple, since only the new attribute needs to be added, which reduces engineering changes and makes the model more extensible. Vectorizing the items to be recommended also simplifies subsequent calculation, and the generated specific topic vector replaces statistical feature information as the description of an item's features.
The process of generating the specific topic vector of each item to be recommended has been explained in detail above. Next, the user preference topic vector is calculated according to the historical behavior data of the user and the specific topic vector of each item to be recommended. User behavior data refers to the behavior data of a user on a network platform, for example the user's favorites data, follow data and browsing data on that platform, and the time window for each type of data may be set according to the specific behavior and its degree of influence. For example, the user's favorites and follow data within 90 days and browsing data within 15 days can be obtained. The browsing window is 15 days because a user normally browses far more items than are involved in other behaviors, and because browsing is somewhat random and its importance decreases rapidly over time.
As still another reference embodiment of the present invention, calculating a user preference topic vector according to the historical behavior data of the user and the specific topic vector of each item to be recommended may include:
step S10221: extracting at least one historical operation item of the user from the items to be recommended according to the historical behavior data of the user, and acquiring the specific topic vector of the at least one historical operation item;
step S10222: determining at least one specific item pool to be recommended to which the at least one historical operation item belongs;
step S10223: and calculating a user preference topic vector corresponding to each specific item pool to be recommended according to the specific topic vector and the operation information of each historical operation item in each specific item pool to be recommended, based on a preset operation weighting algorithm.
In step S10221, the at least one historical operation item of the user can be extracted directly from the items to be recommended, because the items to be recommended are all items that can be recommended to the user, that is, the items registered on the network platform, so every item involved in the user behavior data belongs to the items to be recommended. Moreover, the specific topic vector of each item to be recommended has already been obtained in step S1021, so the specific topic vector of each historical operation item can be obtained directly.
After the historical operation items are acquired in step S10221, the item pool to be recommended to which each historical operation item belongs, that is, the specific item pool to be recommended in step S10222, can be determined from the category information of the historical operation item, so that the acquired historical operation items are divided into different categories. The user's preference topic vector for each category can then be calculated. For example, suppose there are three categories, clothing, books and household appliances, and the items the user has historically operated on include a coat, trousers, a sweater, a detective novel, a collection of prose poetry, a refrigerator and a color TV. The specific topic vectors of these items can be obtained respectively. The coat, trousers and sweater belong to clothing, the detective novel and the prose poetry collection belong to books, and the refrigerator and the color TV belong to household appliances. The user's preference topic vector for clothing can therefore be obtained from the specific topic vectors of the coat, trousers and sweater, the preference topic vector for books from those of the detective novel and the prose poetry collection, and the preference topic vector for household appliances from those of the refrigerator and the color TV.
It should also be noted that there are various ways to obtain the user's preference topic vector for a given category, one of which is direct addition. For example, for the clothing category, if the items the user has historically operated on are a coat, trousers and a sweater with specific topic vectors m1, m2 and m3, the user's preference topic vector for clothing may be the vector sum of m1, m2 and m3. The rationale is that if a user is particularly interested in a specific topic, the summed score for that topic will also be high, so direct addition is consistent with the business logic. However, direct addition ignores the fact that the user's degree of preference differs from one specific topic to another. In practice, a user pays more attention to items they like and relatively little to items they feel indifferent about, and these preferences are time-sensitive, so the specific topics the user is interested in should be given different weights.
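The direct addition method, applied per category, can be sketched as follows. The function name and the two-dimensional topic vectors are made up for illustration and are not values from the embodiment.

```python
def direct_sum_preferences(history):
    """history is a list of (category, specific_topic_vector) pairs.
    Returns {category: element-wise sum of that category's vectors}."""
    sums = {}
    for category, vector in history:
        if category not in sums:
            sums[category] = [0.0] * len(vector)
        sums[category] = [s + x for s, x in zip(sums[category], vector)]
    return sums

# Hypothetical topic vectors for historically operated items.
history = [
    ("clothing", [0.8, 0.2]),   # coat (m1)
    ("clothing", [0.3, 0.7]),   # trousers (m2)
    ("clothing", [0.6, 0.4]),   # sweater (m3)
    ("books", [0.1, 0.9]),      # detective novel
]
preferences = direct_sum_preferences(history)
```

Every operated item contributes equally in this scheme, regardless of when or how the user interacted with it.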
To account for the different weights of the specific topics the user is interested in, step S10223, which calculates the user preference topic vector corresponding to each specific item pool to be recommended according to the specific topic vector and the operation information of each historical operation item in each specific item pool to be recommended based on a preset operation weighting algorithm, may include: acquiring the operation time and operation behavior for each historical operation item in each specific item pool to be recommended; determining, based on the preset operation weighting algorithm, a weight for the specific topic vector of each historical operation item in each specific item pool to be recommended according to that operation time and operation behavior; and calculating the user preference topic vector corresponding to each specific item pool to be recommended according to the weights.
In the embodiment of the present invention, the topics the user is interested in are weighted by time-sensitive hierarchical weighted statistics. This scheme is designed around the characteristics of user behavior: weights are set according to the time at which the user interacted with a historical operation item and the specific mode of interaction, and the specific topic vector of each historical operation item is multiplied by its weight and the results are added to obtain the final user preference topic vector. Specifically, the time-sensitive weight decays from the peak to the tail of a Gaussian distribution. The Gaussian distribution was chosen because, in a comparison of time decay under the Gaussian distribution, the t distribution, the τ distribution, the γ distribution and others, it gave the best results. Interactions closer to the current time receive the highest weights, and the weight falls as the interaction lies further in the past. Hierarchical weighting means that the specific topic vectors of historical operation items are added with the user's behaviors distinguished. The user's interactions with historical operation items can be divided into strong and weak interactions: a strong interaction may be a selection, and weak interactions may be following and browsing. The weight for each user behavior is determined by combining the time weight with the weight of the behavior type, and the final user preference topic vector is obtained by multiplying each specific topic vector by its weight and adding the results.
For example, for the clothing category, suppose the historical behavior data of the user includes a jacket selected a week ago, trousers browsed today and a sweater followed yesterday, with specific topic vectors m1, m2 and m3 and weights z1, z2 and z3 respectively. The user's preference topic vector for the clothing category is then m1·z1 + m2·z2 + m3·z3.
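As a minimal sketch of this weighting, the code below combines a Gaussian time decay with a per-behavior weight. The embodiment only says the two are combined; multiplying them is an assumption, as are the behavior weights, the value of sigma, the topic vectors and all names used here.

```python
import math

# Hypothetical behavior weights: the strong interaction outranks the weak ones.
BEHAVIOR_WEIGHTS = {"select": 1.0, "follow": 0.6, "browse": 0.3}

def time_weight(days_ago, sigma=7.0):
    """Gaussian decay: 1.0 for today, falling toward the tail with age."""
    return math.exp(-(days_ago ** 2) / (2 * sigma ** 2))

def preference_vector(history, sigma=7.0):
    """history is a non-empty list of (topic_vector, behavior, days_ago).
    Returns the weighted sum z1*m1 + z2*m2 + ... as a list."""
    dims = len(history[0][0])
    preference = [0.0] * dims
    for vector, behavior, days_ago in history:
        z = BEHAVIOR_WEIGHTS[behavior] * time_weight(days_ago, sigma)
        for i, x in enumerate(vector):
            preference[i] += z * x
    return preference

# Made-up topic vectors for the jacket, trousers and sweater of the example.
history = [
    ([0.8, 0.2], "select", 7),   # m1: jacket selected a week ago
    ([0.3, 0.7], "browse", 0),   # m2: trousers browsed today
    ([0.6, 0.4], "follow", 1),   # m3: sweater followed yesterday
]
pref = preference_vector(history)
```

With these assumed settings, the week-old selection and yesterday's follow end up with similar weights, because the age of the selection offsets its stronger behavior weight.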
In the item recommendation method of the embodiment of the present invention, once the user preference topic vector and the specific topic vector of each item to be recommended have been calculated, the similarity between them can be calculated using a modified cosine similarity algorithm, which gives the degree of correlation between the user and each item to be recommended, that is, the user's degree of preference for each item.
In general, there are many ways to calculate vector similarity, including Euclidean distance, Manhattan distance and the cosine of the included angle, and the choice depends on the actual business objective. In the embodiment of the present invention, a modified cosine similarity algorithm may be used. Specifically, for an item to be recommended, its specific topic vector m4 can be obtained, as can the user's preference topic vector m5 for the category to which the item belongs; the similarity between m4 and m5 is then calculated to give the user's preference score for the item, that is, how much the user likes it. Because plain cosine similarity reflects only differences in direction and is insensitive to distance, modified cosine similarity may be used in the embodiment of the present invention: the mean is subtracted from each element of the vectors, the products of corresponding elements are summed, and the sum is divided by the product of the moduli of the two vectors, which increases the sensitivity to distance.
In addition, in the embodiment of the present invention, calculating the similarity between the user preference topic vector and the specific topic vector of each item to be recommended may include: calculating the similarity between the user preference topic vector corresponding to each specific item pool to be recommended and the specific topic vector of each item to be recommended in that pool. In the embodiment of the present invention, at least one item pool to be recommended is constructed, the specific topic vector of each item to be recommended is obtained, and the user preference topic vectors for different categories of items are also obtained. The per-category user preference topic vector introduced with step S10222 is the user's preference topic vector for the corresponding specific item pool to be recommended, so the similarity between the user preference topic vector for each specific pool and the specific topic vector of each item in that pool can be calculated.
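The text does not say which mean is subtracted or which moduli are used. The sketch below assumes that each vector is centered on its own mean and that the moduli are those of the centered vectors; under that reading the measure coincides with the Pearson correlation of the two vectors. The vectors m4 and m5 are made-up values.

```python
import math

def modified_cosine(u, v):
    """Cosine similarity of the mean-centered vectors u and v.
    Returns 0.0 when either centered vector has zero length."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cu = [x - mean_u for x in u]
    cv = [y - mean_v for y in v]
    norm = math.sqrt(sum(x * x for x in cu)) * math.sqrt(sum(y * y for y in cv))
    if norm == 0:
        return 0.0
    return sum(x * y for x, y in zip(cu, cv)) / norm

m4 = [0.1, 0.6, 0.3]   # hypothetical specific topic vector of an item
m5 = [0.2, 0.5, 0.3]   # hypothetical user preference topic vector
score = modified_cosine(m4, m5)
```

Note that with only two topics the centered vectors always point in the same or opposite directions, so this variant can only return 1, -1 or 0; it is informative when there are three or more specific topics.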
After determining the preference degree of the user to the item to be recommended, in the item recommendation method provided by the embodiment of the invention, the determined preference degree is combined with the historical behavior data of the user, an item recommendation model is trained, and then the item is recommended for the user. Therefore, as a further reference example of the embodiment of the present invention, training an item recommendation model using the correlation degree and the historical behavior data of the user, and recommending a target item to the user according to the item recommendation model may include:
step S1031: generating first feature data and second feature data according to the degree of correlation between the user and the items to be recommended and the historical behavior data of the user;
step S1032: training an item recommendation model corresponding to the user by using the first feature data, wherein the input of the item recommendation model corresponding to the user is feature data and the output is a target item;
step S1033: and determining a target item corresponding to the second feature data by using the item recommendation model corresponding to the user, and pushing the target item corresponding to the second feature data to the user.
The first feature data and the second feature data in step S1031 may be distinguished by time. For example, the first feature data includes the degree of correlation between the user and the items to be recommended together with the historical behavior data of the user, and the second feature data includes the degree of correlation and the historical behavior data of the user for the last three days; other ways of dividing the data may of course be used. The item recommendation model corresponding to the user is then trained with the first feature data, where the input of the model is feature data and the output is a target item. Finally, the trained item recommendation model corresponding to the user is used to determine the target item corresponding to the second feature data, and that target item is pushed to the user.
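One way to read the time-based division is sketched below: records from the last three days form the second feature data and all older records form the first. Treating the first feature data as strictly older records is an assumption, as are the record fields, the dates and the function name.

```python
from datetime import date, timedelta

def split_by_recency(records, today, recent_days=3):
    """Split records into (first, second): second holds the records dated
    within the last recent_days days (today included), first holds the rest."""
    cutoff = today - timedelta(days=recent_days)
    first = [r for r in records if r["date"] <= cutoff]
    second = [r for r in records if r["date"] > cutoff]
    return first, second

# Hypothetical feature records: relevance is the user-item correlation degree.
today = date(2019, 10, 16)
records = [
    {"item": "coat", "date": date(2019, 10, 1), "relevance": 0.82},
    {"item": "sweater", "date": date(2019, 10, 14), "relevance": 0.64},
    {"item": "trousers", "date": date(2019, 10, 16), "relevance": 0.31},
]
first, second = split_by_recency(records, today)
```

The first list would be used for training and the second for producing the recommendations that are pushed to the user.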
In practice, an XGBoost model (a gradient-boosted decision tree model) or an LR (Logistic Regression) model is generally used as the final item recommendation model. In the embodiment of the present invention, the LR model and the XGBoost model can be fused for item recommendation, the fusion being a weighted combination of the results recommended by the two models, which is output as the final recommendation result. The weights of the two models are adjusted manually according to online model performance. Because the two models work on different principles, fusing them can be regarded as a process of integrating and strengthening weak classifiers. When items are recommended, the items in different item pools to be recommended can be scored, and the items finally displayed to the user are sorted so that items with high user preference scores come first and items with low scores come last. The features used by the fusion model in production include not only the degree of correlation between the user and the items to be recommended but also the user's scores for other features of an item, such as brand and category, which together determine the user's final preference score for a given item. In addition, when the degree of correlation between the user and the items to be recommended is obtained using LDA, the scores are normalized, and all dimensions and other features are normalized to between 0 and 1 to facilitate the final training and recommendation.
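The weighted fusion, normalization and ranking described above can be sketched without either model library by starting from per-item scores that the two models are assumed to have already produced. The scores, the weight and the function names are illustrative assumptions; min-max scaling is one possible way to normalize to the range 0 to 1, which the text does not specify.

```python
def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def fuse_and_rank(lr_scores, xgb_scores, lr_weight=0.4):
    """Weighted combination of two models' per-item scores, sorted so that
    the items with the highest fused preference score come first."""
    fused = {
        item: lr_weight * lr_scores[item] + (1 - lr_weight) * xgb_scores[item]
        for item in lr_scores
    }
    return sorted(fused.items(), key=lambda pair: pair[1], reverse=True)

# Hypothetical model outputs for three items.
lr_scores = {"coat": 0.70, "sweater": 0.40, "trousers": 0.55}
xgb_scores = {"coat": 0.60, "sweater": 0.90, "trousers": 0.30}
ranking = fuse_and_rank(lr_scores, xgb_scores)
```

In a deployment along the lines described, lr_weight would be the manually tuned value chosen from online performance.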
Fig. 3 is a schematic diagram of a main flow of a method of item recommendation according to an embodiment of the present invention. As shown in fig. 3, the main flow of the method for recommending an item according to the embodiment of the present invention may include:
step S301: dividing the items to be recommended into at least one category according to the category information in the attribute information of the items to be recommended, determining the characteristic attributes corresponding to each category based on the attribute screening principle corresponding to each category, and screening out feature information from the attribute information of the items to be recommended corresponding to each category according to the characteristic attributes corresponding to each category;
step S302: constructing an item pool to be recommended corresponding to each category by using the items to be recommended corresponding to each category and the screened feature information, and setting the specific topics of at least one item pool to be recommended;
step S303: generating a specific topic vector of each item to be recommended in the at least one item pool to be recommended according to the specific topics of the at least one item pool to be recommended, based on a document topic generation model;
step S304: extracting at least one historical operation item of the user from the items to be recommended according to the historical behavior data of the user;
step S305: acquiring the specific topic vector of the at least one historical operation item, and determining at least one specific item pool to be recommended to which the at least one historical operation item belongs;
step S306: acquiring the operation time and operation behavior for each historical operation item in each specific item pool to be recommended;
step S307: determining, based on a preset operation weighting algorithm, a weight for the specific topic vector of each historical operation item in each specific item pool to be recommended according to the operation time and operation behavior for each historical operation item in each specific item pool to be recommended;
step S308: calculating a user preference topic vector corresponding to each specific item pool to be recommended according to the weights;
step S309: calculating, based on a modified cosine similarity algorithm, the similarity between the user preference topic vector corresponding to each specific item pool to be recommended and the specific topic vector of each item to be recommended in that pool, so as to obtain the degree of correlation between the user and each item to be recommended;
step S310: generating first feature data and second feature data according to the degree of correlation between the user and the items to be recommended and the historical behavior data of the user;
step S311: training an item recommendation model corresponding to the user by using the first feature data, wherein the input of the item recommendation model corresponding to the user is feature data and the output is a target item;
step S312: and determining a target item corresponding to the second feature data by using the item recommendation model corresponding to the user, and pushing the target item corresponding to the second feature data to the user.
According to the technical scheme of item recommendation of the embodiment of the invention, an item pool to be recommended can be constructed, a specific theme of the item pool to be recommended is set, then the correlation degree between a user and an item to be recommended is obtained by adopting a document theme generation model in machine learning, then the correlation degree is combined with historical behavior data of the user, an item recommendation model is obtained by training, and the item is recommended to the user. In addition, in consideration of the fact that different users have different requirements for different types of articles in actual application, and the operation behaviors of the users are completely different for different types of articles, in the embodiment of the invention, the articles to be recommended are firstly classified according to different category information, all the articles to be recommended are divided into a plurality of categories, and the specific category scale can be set according to the actual situation, so that the accuracy of article recommendation in the embodiment of the invention can be improved, the user experience is improved, and the practicability of the scheme is improved.
Fig. 4 is a schematic diagram of main blocks of an article recommendation apparatus according to an embodiment of the present invention. As shown in fig. 4, the apparatus 400 for recommending items according to the embodiment of the present invention mainly includes the following modules: a building module 401, a determining module 402 and a recommending module 403.
The building module 401 may be configured to build at least one item pool to be recommended according to attribute information of an item to be recommended, and set a specific theme of the at least one item pool to be recommended; the determining module 402 may be configured to determine, based on a document theme generation model, a degree of correlation between the user and the item to be recommended according to the historical behavior data of the user and the specific theme; the recommending module 403 may be configured to train an item recommendation model using the correlation degree and the historical behavior data of the user, and recommend the target item to the user according to the item recommendation model.
In this embodiment of the present invention, the determining module 402 may further be configured to: generating a specific theme vector of each item to be recommended in at least one item pool to be recommended according to a specific theme of the at least one item pool to be recommended based on a document theme generation model; calculating a user preference theme vector according to historical behavior data of a user and the specific theme vector of each item to be recommended; and calculating the similarity between the user preference theme vector and the specific theme vector of each item to be recommended based on the modified cosine similarity algorithm so as to obtain the correlation degree between the user and each item to be recommended.
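The similarity step described above can be sketched as follows. The exact form of the modified cosine similarity is not given in this passage, so the sketch assumes one common reading: each vector is centered on its own mean before the ordinary cosine is taken. The function names and the sample theme vectors are hypothetical.

```python
import math

def modified_cosine(u, v):
    """Cosine similarity of two vectors after mean-centering each (assumed form)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cu = [a - mean_u for a in u]
    cv = [b - mean_v for b in v]
    dot = sum(a * b for a, b in zip(cu, cv))
    norm = math.sqrt(sum(a * a for a in cu)) * math.sqrt(sum(b * b for b in cv))
    # A constant vector has zero norm after centering; treat it as uncorrelated.
    return 0.0 if norm == 0.0 else dot / norm

def correlation_degrees(user_vector, item_vectors):
    """Map each item id in one pool to its similarity with the user preference vector."""
    return {
        item_id: modified_cosine(user_vector, vec)
        for item_id, vec in item_vectors.items()
    }

# Hypothetical theme vectors over three themes for one item pool.
user_preference = [0.7, 0.2, 0.1]
pool_items = {"x": [0.6, 0.3, 0.1], "y": [0.1, 0.2, 0.7]}

degrees = correlation_degrees(user_preference, pool_items)
best_item = max(degrees, key=degrees.get)
```

Because the user preference theme vector is computed per pool, `correlation_degrees` would be called once for each specific item pool, with that pool's user vector and item vectors.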
In this embodiment of the present invention, the determining module 402 may further be configured to: extracting at least one historical operation article of the user from the articles to be recommended according to the historical behavior data of the user, and acquiring a specific theme vector of the at least one historical operation article; determining at least one specific to-be-recommended item pool to which at least one historical operation item belongs; and calculating a user preference theme vector corresponding to each specific to-be-recommended item pool according to the specific theme vector and the operation information of each historical operation item in each specific to-be-recommended item pool based on a preset operation weighting algorithm.
In this embodiment of the present invention, the determining module 402 may further be configured to: and calculating the similarity between the user preference theme vector corresponding to each specific to-be-recommended item pool and the specific theme vector of each to-be-recommended item in each specific to-be-recommended item pool.
In this embodiment of the present invention, the determining module 402 may further be configured to: acquiring the operation time and operation behavior of each historical operation article in each specific article pool to be recommended; based on a preset operation weighting algorithm, determining a weighting weight corresponding to a specific theme vector of each historical operation article in each specific article pool to be recommended according to the operation time and operation behavior of each historical operation article in each specific article pool to be recommended; and calculating a user preference theme vector corresponding to each specific to-be-recommended item pool according to the weighted weight.
In this embodiment of the present invention, the building module 401 may further be configured to: according to the category information in the attribute information of the to-be-recommended articles, dividing the to-be-recommended articles into at least one category; determining the characteristic attribute corresponding to each category based on the attribute screening principle corresponding to each category; screening out characteristic information from the attribute information of the to-be-recommended articles corresponding to each category according to the characteristic attribute corresponding to each category; and constructing a pool of the objects to be recommended corresponding to each category by using the objects to be recommended corresponding to each category and the screened characteristic information.
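The pool construction can be sketched as follows. The attribute screening principle for each category is not detailed in this passage, so the sketch assumes it reduces to a per-category list of attribute names to keep. The category names, attribute names, and item records are hypothetical.

```python
from collections import defaultdict

# Hypothetical screening rules: which attributes characterize each category.
FEATURE_ATTRIBUTES = {
    "phone": ["brand", "screen_size"],
    "book": ["author", "genre"],
}

def build_item_pools(items, feature_attributes=FEATURE_ATTRIBUTES):
    """Group items by category and keep only each category's characteristic attributes."""
    pools = defaultdict(dict)
    for item in items:
        category = item["category"]
        keep = feature_attributes.get(category, [])
        pools[category][item["id"]] = {
            name: item["attributes"][name]
            for name in keep
            if name in item["attributes"]
        }
    return dict(pools)

# Hypothetical items to be recommended.
items = [
    {"id": "p1", "category": "phone",
     "attributes": {"brand": "X", "screen_size": "6.1", "weight": "180g"}},
    {"id": "b1", "category": "book",
     "attributes": {"author": "Y", "genre": "history", "pages": "320"}},
]

pools = build_item_pools(items)
```

Each resulting pool holds only the screened characteristic information, which is the text a document theme generation model would then consume for that category.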
In this embodiment of the present invention, the recommending module 403 may further be configured to: generating first characteristic data and second characteristic data according to the correlation degree of the user and the to-be-recommended articles and the historical behavior data of the user; training an article recommendation model corresponding to the user by using the first feature data, wherein the input of the article recommendation model corresponding to the user is the feature data, and the output is the target article; and determining a target item corresponding to the second characteristic data by using the item recommendation model corresponding to the user, and pushing the target item corresponding to the second characteristic data to the user.
From the above description, it can be seen that the device for recommending articles in the embodiment of the present invention can construct an article pool to be recommended, set a specific theme of the article pool to be recommended, then obtain a degree of correlation between a user and an article to be recommended by using a document theme generation model in machine learning, then combine the degree of correlation with historical behavior data of the user, train to obtain an article recommendation model, and implement article recommendation for the user. In addition, in consideration of the fact that different users have different requirements for different types of articles in actual application, and the operation behaviors of the users are completely different for different types of articles, in the embodiment of the invention, the articles to be recommended are firstly classified according to different category information, all the articles to be recommended are divided into a plurality of categories, and the specific category scale can be set according to the actual situation, so that the accuracy of article recommendation in the embodiment of the invention can be improved, the user experience is improved, and the practicability of the scheme is improved.
Fig. 5 illustrates an exemplary system architecture 500 of an item recommendation method or apparatus to which embodiments of the invention may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 serves as a medium for providing communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. The terminal devices 501, 502, 503 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 505 may be a server providing various services, such as a backend management server (by way of example only) providing support for shopping websites browsed by users using the terminal devices 501, 502, 503. The backend management server may analyze and otherwise process received data, such as a product information query request, and feed back a processing result (for example, target push information or product information, by way of example only) to the terminal device.
It should be noted that the method for recommending items provided by the embodiment of the present invention is generally executed by the server 505, and accordingly, the apparatus for recommending items is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use with a terminal device implementing an embodiment of the invention is shown. The terminal device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed in the storage section 608 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor including a building module, a determining module, and a recommending module. The names of the modules do not, under certain conditions, limit the modules themselves; for example, the building module may also be described as a module for building at least one item pool to be recommended according to attribute information of the items to be recommended and setting a specific theme of the at least one item pool to be recommended.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform the following: according to the attribute information of the to-be-recommended articles, constructing at least one to-be-recommended article pool, and setting a specific theme of the at least one to-be-recommended article pool; determining the correlation degree between the user and the item to be recommended according to the historical behavior data of the user and the specific theme based on the document theme generation model; and training an article recommendation model by using the correlation degree and the historical behavior data of the user, and recommending the target article to the user according to the article recommendation model.
According to the technical scheme of the embodiment of the invention, the object pool to be recommended can be constructed, the specific theme of the object pool to be recommended is set, the correlation degree between the user and the object to be recommended is obtained by adopting the document theme generation model in machine learning, then the correlation degree is combined with the historical behavior data of the user, the object recommendation model is obtained by training, the object is recommended to the user, all data can be used to the maximum extent by adopting the document theme generation model in machine learning to extract the theme words, the problem of data loss caused by adopting a traditional missing value processing mode in the prior art is solved, and the object is recommended to the user in a personalized mode more accurately and effectively. In addition, in consideration of the fact that different users have different requirements for different types of articles in actual application, and the operation behaviors of the users are completely different for different types of articles, in the embodiment of the invention, the articles to be recommended are firstly classified according to different category information, all the articles to be recommended are divided into a plurality of categories, and the specific category scale can be set according to the actual situation, so that the accuracy of article recommendation in the embodiment of the invention can be improved, the user experience is improved, and the practicability of the scheme is improved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method of item recommendation, comprising:
according to the attribute information of the to-be-recommended articles, constructing at least one to-be-recommended article pool, and setting a specific theme of the at least one to-be-recommended article pool;
determining the degree of correlation between the user and the item to be recommended according to historical behavior data of the user and the specific theme based on a document theme generation model;
and training an article recommendation model by using the correlation degree and the historical behavior data of the user, and recommending a target article to the user according to the article recommendation model.
2. The method according to claim 1, wherein the determining the degree of correlation between the user and the item to be recommended according to the historical behavior data of the user and the specific theme based on the document theme generation model comprises:
generating a specific theme vector of each item to be recommended in the at least one item pool to be recommended according to the specific theme of the at least one item pool to be recommended based on a document theme generation model;
calculating a user preference theme vector according to the historical behavior data of the user and the specific theme vector of each item to be recommended;
and calculating the similarity between the user preference theme vector and the specific theme vector of each item to be recommended based on a modified cosine similarity algorithm so as to obtain the correlation degree between the user and each item to be recommended.
3. The method according to claim 2, wherein the calculating a user preference theme vector according to the historical behavior data of the user and the specific theme vector of each item to be recommended comprises:
extracting at least one historical operation article of the user from the articles to be recommended according to the historical behavior data of the user, and acquiring a specific theme vector of the at least one historical operation article;
determining at least one specific to-be-recommended item pool to which the at least one historical operation item belongs;
based on a preset operation weighting algorithm, according to the specific theme vector and the operation information of each historical operation article in each specific to-be-recommended article pool, calculating the user preference theme vector corresponding to each specific to-be-recommended article pool.
4. The method according to claim 3, wherein the calculating the similarity between the user preference theme vector and the specific theme vector of each item to be recommended comprises:
and calculating the similarity between the user preference theme vector corresponding to each specific to-be-recommended item pool and the specific theme vector of each to-be-recommended item in each specific to-be-recommended item pool.
5. The method according to claim 3, wherein the calculating a user preference theme vector corresponding to each specific item pool to be recommended according to the specific theme vector and the operation information of each historical operation item in each specific item pool to be recommended based on a preset operation weighting algorithm comprises:
acquiring the operation time and operation behavior of each historical operation article in each specific article pool to be recommended;
based on a preset operation weighting algorithm, determining a weighting weight corresponding to a specific theme vector of each historical operation article in each specific article pool to be recommended according to the operation time and operation behavior of each historical operation article in each specific article pool to be recommended;
and calculating a user preference theme vector corresponding to each specific item pool to be recommended according to the weighted weight.
6. The method according to claim 1, wherein the constructing at least one item pool to be recommended according to attribute information of the item to be recommended comprises:
according to the category information in the attribute information of the item to be recommended, dividing the item to be recommended into at least one category;
determining the characteristic attribute corresponding to each category based on the attribute screening principle corresponding to each category;
screening out characteristic information from the attribute information of the to-be-recommended articles corresponding to each category according to the characteristic attribute corresponding to each category;
and constructing a pool of the objects to be recommended corresponding to each category by using the objects to be recommended corresponding to each category and the screened characteristic information.
7. The method of claim 1, wherein training an item recommendation model using the degree of correlation and the historical behavior data of the user, and recommending a target item to the user according to the item recommendation model comprises:
generating first characteristic data and second characteristic data according to the correlation degree between the user and the item to be recommended and the historical behavior data of the user;
training an article recommendation model corresponding to the user by using the first feature data, wherein the input of the article recommendation model corresponding to the user is feature data, and the output is a target article;
and determining a target item corresponding to the second characteristic data by using the item recommendation model corresponding to the user, and pushing the target item corresponding to the second characteristic data to the user.
8. An article recommendation apparatus, comprising:
the building module is used for building at least one object pool to be recommended according to the attribute information of the objects to be recommended and setting a specific theme of the at least one object pool to be recommended;
the determining module is used for determining, based on a document theme generation model, the correlation degree between the user and the to-be-recommended item according to the historical behavior data of the user and the specific theme;
and the recommending module is used for training an article recommending model by utilizing the correlation degree and the historical behavior data of the user, and recommending a target article to the user according to the article recommending model.
9. An electronic device, comprising:
one or more processors; and a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN201910983938.2A 2019-10-16 2019-10-16 Item recommendation method and device Pending CN111767459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910983938.2A CN111767459A (en) 2019-10-16 2019-10-16 Item recommendation method and device

Publications (1)

Publication Number Publication Date
CN111767459A true CN111767459A (en) 2020-10-13

Family

ID=72718385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910983938.2A Pending CN111767459A (en) 2019-10-16 2019-10-16 Item recommendation method and device

Country Status (1)

Country Link
CN (1) CN111767459A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113379482A (en) * 2021-05-28 2021-09-10 车智互联(北京)科技有限公司 Item recommendation method, computing device and storage medium
CN113538110A (en) * 2021-08-13 2021-10-22 苏州工业职业技术学院 Similar article recommendation method based on browsing sequence
WO2022151649A1 (en) * 2021-01-15 2022-07-21 稿定(厦门)科技有限公司 Deep interest network-based topic recommendation method and apparatus

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022151649A1 (en) * 2021-01-15 2022-07-21 稿定(厦门)科技有限公司 Deep interest network-based topic recommendation method and apparatus
CN113379482A (en) * 2021-05-28 2021-09-10 车智互联(北京)科技有限公司 Item recommendation method, computing device and storage medium
CN113379482B (en) * 2021-05-28 2023-12-01 车智互联(北京)科技有限公司 Article recommendation method, computing device and storage medium
CN113538110A (en) * 2021-08-13 2021-10-22 苏州工业职业技术学院 Similar article recommendation method based on browsing sequence
CN113538110B (en) * 2021-08-13 2023-08-11 苏州工业职业技术学院 Similar article recommending method based on browsing sequence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination