CN114647787A - User personalized recommendation method based on multi-modal data - Google Patents
- Publication number: CN114647787A
- Application number: CN202210322829.8A
- Authority: CN (China)
- Prior art keywords: user, recommendation, mapping, modal, agent
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a user personalized recommendation method based on multi-modal data and relates to the technical field of network recommendation. After obtaining the historical behavior records and mapping logs that the user permits to be collected, the method extracts the features of all objects involved in the historical behavior records and of each object to be recommended, and maps them into the same multi-dimensional space so that the multi-modal data are integrated. A reinforcement learning model is then used as the recommendation system agent: the agent is trained on the collected user records, and the trained agent is used for recommendation, thereby achieving user-customized recommendation of multi-modal data. The user historical behavior records, mapping logs, and sets of objects to be recommended handled by the method may all contain objects from multiple fields (such as texts, pictures, and videos). Integrating the objects after feature extraction blurs the differences between modalities, which addresses the limitation that a traditional recommendation system can only recommend within a single field.
Description
Technical Field
The invention relates to the technical field of network recommendation, in particular to a user personalized recommendation method based on multi-mode data.
Background
In recent years, with the rapid development of the internet, modern society has become an information-oriented, digital society. Data floods the world, and information explosion has become the norm. Faced with such a large amount of data, however, users actually make less effective use of the information, which is the information overload problem. The recommendation system is one of the key technologies for effectively solving this problem. With the rapid development of the internet, the internet of things, and cloud computing, personalized recommendation has become standard in internet products, and the information that users encounter in fields such as e-commerce, video, news, and music is closely tied to recommendation systems.
A recommendation system acquires the user's historical behavior data, such as web browsing data, purchase history, or item ratings, and uses it to make personalized recommendations. Established solutions to the recommendation problem include recommendation systems based on content filtering, recommendation systems based on collaborative filtering, and hybrid recommendation systems. With the rise of deep learning, recommendation algorithms using neural networks, such as recurrent neural networks, convolutional neural networks, and generative adversarial networks, have also been widely applied. In addition, there are recommendation techniques based on reinforcement learning, on heterogeneous networks, and so on.
Most existing recommendation systems solve the user personalized recommendation problem on the basis of a single modality. That is, the user's historical data is learned only within a single field (such as news, video, or music), so only the user's interests and preferences in that field can be captured and used for recommendation, which greatly limits the diversity of the recommendation results.
Disclosure of Invention
The technical problem to be solved by the present invention is to address the shortcomings of the prior art by providing a user personalized recommendation method based on multi-modal data. The method learns user preferences and forms a user portrait from the historical data that the user has generated in various fields (such as news, video, and music), so as to recommend multi-modal data and increase the diversity of the recommendation results.
To solve the above technical problem, the invention adopts the following technical solution:
A user personalized recommendation method based on multi-modal data comprises the following steps:
Step 1: Acquire the total object set I and the object set to be recommended L0, and determine the types of multi-modal information to be handled.
Step 2: For each modality handled, apply a corresponding feature extraction algorithm to extract the features of each object in the total object set, and map the features into the same mathematical multidimensional space S.
Step 3: Acquire and accumulate user historical behavior records and mapping logs. The historical behavior record is the user's click history before the mapping log, that is, a list of clicked objects. The mapping log (an impression log) is the list of objects displayed to the user together with the user's click behavior on each of them, where 1 represents a click and 0 represents no click.
Step 4: Initialize the recommendation system, input into it the multidimensional space S obtained in step 2 and the user historical behavior records and mapping logs obtained in step 3, and set the recommendation system agent and the reinforcement learning environment parameters.
Step 5: Train the recommendation system agent.
Step 6: Use the trained recommendation system agent to process the object set to be recommended L0, simulating the user's use of the recommendation system on L0, and generate the corresponding mapping log D.
Step 7: Extract the set of objects that the user interacted with in the mapping log D, that is, the personalized set of interacted objects that the recommendation system agent predicts for the user, as the total recommendation list L.
Step 8: Process the total recommendation list L according to different requirements to generate a multi-modal recommendation list.
Step 9: Recommend the generated multi-modal recommendation list to the user and obtain the actual user interaction results, thereby generating a new mapping log. Each time a given number n of log entries has accumulated, the agent is further trained and updated.
The beneficial effects of the above technical solution are as follows. In the user personalized recommendation method based on multi-modal data, after the historical behavior records and mapping logs that the user permits to be collected are obtained, the features of all objects involved in the historical behavior records and of each object to be recommended are extracted and mapped into the same multi-dimensional space, so that the multi-modal data are integrated. A reinforcement learning model is then used as the recommendation system agent, the agent is trained on the collected user records, and the trained agent is used for recommendation, thereby achieving user personalized recommendation of multi-modal data. Compared with the prior art, the user historical behavior records, mapping logs, and sets of objects to be recommended handled by the method may span multiple fields (such as texts, pictures, and videos), and integrating them after feature extraction blurs the differences between modalities, which addresses the limitation that a traditional recommendation system can only recommend within a single field.
Drawings
FIG. 1 is a flowchart of the recommendation method provided by an embodiment of the present invention;
FIG. 2 is a flowchart of the training method of the agent (Rainbow model) provided by an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
As shown in fig. 1, the user personalized recommendation method based on multimodal data of the present embodiment is as follows.
Step 1: Acquire the total object set I and the object set to be recommended L0, and determine the types of multi-modal information to be handled.
Step 2: For each modality handled, apply a corresponding feature extraction algorithm to extract the features of each object in the total object set, and map the features into the same mathematical multidimensional space S. For example, image features may be extracted with a convolutional neural network (CNN) or a recurrent neural network (RNN), text features with the TF-IDF algorithm or a CNN, video features with a P3D (Pseudo-3D Residual Networks) residual network, and audio features with the Discrete Wavelet Transform (DWT) or the Perceptual Linear Prediction (PLP) algorithm.
Step 3: Acquire and accumulate user historical behavior records and mapping logs. The historical behavior record is the user's click history before the mapping log, that is, a list of clicked objects. The mapping log (an impression log) is the list of objects displayed to the user together with the user's click behavior on each of them, where 1 represents a click and 0 represents no click.
Step 4: Initialize the recommendation system, input into it the multidimensional space S obtained in step 2 and the user historical behavior records and mapping logs obtained in step 3, and set the recommendation system agent and the reinforcement learning environment parameters.
Step 5: Train the recommendation system agent, as shown in FIG. 2, as follows:
Step 5.1: The agent requests the records related to one user from the user historical behavior records and mapping logs obtained in step 3.
Step 5.2: The memory returns the records, and each object in them is replaced with its feature representation using the result of step 2.
Step 5.3: The agent forms a user portrait from the user's historical behavior record and processes the mapping log entries in sequence according to the user portrait. If the agent's action is consistent with the real log, the agent obtains a reward; if it is inconsistent, the agent obtains no reward or is penalized.
Step 5.4: After all entries in the user's mapping log have been processed, one reinforcement learning frame is considered finished. If the training termination condition among the reinforcement learning environment parameters set in step 4 is met, the training of the agent ends and the method proceeds to step 6; otherwise it returns to step 5.1.
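As an illustration only, one training frame of steps 5.1 to 5.4 can be sketched in Python as follows. The names `UserRecord`, `build_portrait`, `run_episode`, and `dot_policy` are hypothetical, the user portrait is assumed to be a simple mean of clicked-object features, and the policy is a fixed threshold rule rather than a learned reinforcement learning model; the sketch only shows how rewards are assigned by comparing the agent's actions with the real log.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Vector = Tuple[float, ...]


@dataclass
class UserRecord:
    history: List[str]                  # ids of objects clicked before the mapping log
    mapping_log: List[Tuple[str, int]]  # (object id, 1 = clicked, 0 = not clicked)


def build_portrait(history: List[str], features: Dict[str, Vector]) -> Vector:
    """Assumed user portrait: the mean feature vector of the clicked objects."""
    vectors = [features[obj_id] for obj_id in history]
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


def run_episode(
    record: UserRecord,
    features: Dict[str, Vector],
    policy: Callable[[Vector, Vector], int],
    reward: float = 1.0,
    penalty: float = 0.0,
) -> float:
    """Replay one user's mapping log and return the total reward of the frame."""
    portrait = build_portrait(record.history, features)  # step 5.3: user portrait
    total = 0.0
    for obj_id, clicked in record.mapping_log:           # entries processed in order
        action = policy(portrait, features[obj_id])      # 1 = predict click, 0 = no click
        total += reward if action == clicked else -penalty
    return total                                         # step 5.4: frame finished


def dot_policy(portrait: Vector, item: Vector) -> int:
    """Placeholder policy: predict a click when the dot product exceeds 0.5."""
    return 1 if sum(p * q for p, q in zip(portrait, item)) > 0.5 else 0


features = {"n1": (1.0, 0.0), "n2": (0.9, 0.1), "n3": (0.0, 1.0)}
record = UserRecord(history=["n1"], mapping_log=[("n2", 1), ("n3", 0)])
episode_reward = run_episode(record, features, dot_policy)
```

In a real agent the policy would be updated from these rewards between frames until the termination condition set in step 4 is met.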
Step 6: Use the trained recommendation system agent to process the object set to be recommended L0, simulating the user's use of the recommendation system on L0, and generate the corresponding mapping log D.
Step 7: Extract the set of objects that the user interacted with in the mapping log D, that is, the personalized set of interacted objects that the recommendation system agent predicts for the user, as the total recommendation list L.
Step 8: Process the total recommendation list L according to different requirements to generate a multi-modal recommendation list:
When the user's broad interests across multiple fields need to be mined, or the modalities of the recommendation results need to be diversified, the total recommendation list L can simply be sorted according to the requirements of the specific scenario and then output, generating the multi-modal recommendation list L1.
When recommendations in other fields need to be made to the user on the basis of a given object (such as matching a picture to a news item or recommending text hyperlinks for a movie), the objects of other modalities closest to the current object can be selected from the multidimensional space S according to the nearest-neighbor principle, generating the multi-modal recommendation list L2.
When more accurate and effective recommendations need to be made by combining knowledge representations at different levels, a multi-modal fusion algorithm is used to fuse the features of different modalities, and the agent is trained on the fused features, generating a recommendation list L3 in which the same object has multi-modal representations, such as a news title matched with a news cover.
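The nearest-neighbor selection used for list L2 can be sketched as below. This is a minimal illustration, assuming that every object already has coordinates in the shared space S and a modality label, and that Euclidean distance is the similarity measure (the description does not fix a metric). The function name `nearest_other_modality` and the sample ids are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Each object id maps to (modality, coordinates in the shared space S).
Space = Dict[str, Tuple[str, Tuple[float, ...]]]


def nearest_other_modality(query_id: str, space: Space, k: int = 1) -> List[str]:
    """Return the k objects closest to the query whose modality differs from it."""
    query_modality, query_vec = space[query_id]
    candidates = [
        (math.dist(query_vec, vec), obj_id)
        for obj_id, (modality, vec) in space.items()
        if modality != query_modality
    ]
    return [obj_id for _, obj_id in sorted(candidates)[:k]]


space: Space = {
    "news_text_1": ("text", (0.0, 0.0)),
    "news_text_2": ("text", (0.05, 0.0)),
    "picture_a": ("image", (0.1, 0.0)),
    "picture_b": ("image", (2.0, 2.0)),
}
l2 = nearest_other_modality("news_text_1", space, k=2)
```

Here `news_text_2` is the closest object overall but is skipped because it shares the query's modality, which is the behavior the cross-field case calls for.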
Step 9: Recommend the generated multi-modal recommendation list to the user and obtain the actual user interaction results, thereby generating a new mapping log. Each time a given number n of log entries has accumulated, the agent is further trained and updated.
This example carries out the steps of the method in sequence on the Microsoft News Dataset (MIND for short):
Step 1: Because the MIND dataset contains only news text data and therefore does not meet the requirements of a multi-modal task, the pictures contained in each news item are crawled with the spiderFlow crawler tool (https://github.com/chenyuansgit/spiderFlow) using the news links provided in the MIND dataset, forming a picture dataset corresponding to the news texts. The total object set I is thus all news texts and news pictures; the object set to be recommended L0 is the set of news texts in the test-set mapping logs and their corresponding news pictures; and the types of multi-modal information to be handled are text and image.
Step 2: For the two modalities, text and image, the corresponding feature extraction algorithms are applied. For news text, basic text features are extracted by TF-IDF term weighting, and feature selection with the chi-square test then reduces the dimensionality of the basic text features, which improves the generalization ability of the model, reduces overfitting, and makes the relationship between features and feature values easier to understand. News picture features are extracted with the pre-trained VGG16 neural network in the Keras library. Finally, the extracted features of every object in the total object set are mapped into the same mathematical multidimensional space S by the UMAP algorithm (https://arxiv.org/abs/1802.03426).
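To make the TF-IDF term weighting in this step concrete, the following pure-Python sketch computes one common variant (term frequency normalized by document length, multiplied by the logarithm of the inverse document frequency). The exact variant used in the embodiment is not stated, so this weighting, the function name `tf_idf`, and the toy documents are assumptions; chi-square selection, VGG16 extraction, and UMAP projection are not shown.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} dictionary per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    ["stock", "market", "falls"],
    ["stock", "rally"],
    ["football", "match"],
]
weights = tf_idf(docs)
```

Terms that occur in fewer documents receive larger weights: in the first document, "falls" outweighs "stock" because "stock" also appears in the second document.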
Step 3: Read the user historical behavior records and mapping logs accumulated in the dataset.
Step 4: Initialize the recommendation system, input into it the multidimensional space S obtained in step 2 and the user historical behavior records and mapping logs obtained in step 3, and set the agent and the reinforcement learning environment parameters (taking the Rainbow model as an example): the action space is the integers 0 to 10, and the action divided by 10 is the predicted probability that the user clicks the object; the state space is a multidimensional matrix reflecting the user portrait and the features of the object currently being judged; the reward is determined by the AUC index computed from the predictions output for the mapping log.
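The action encoding and the AUC-based reward described in this step can be illustrated with a short sketch. The helper names `action_to_probability` and `auc` are hypothetical, and the AUC is computed by the standard pairwise definition (the fraction of clicked/non-clicked pairs ranked correctly, with ties counted as half); how the embodiment scales the AUC into a reward is not specified and is not assumed here.

```python
from typing import List


def action_to_probability(action: int) -> float:
    """Map a discrete action in 0..10 to a predicted click probability."""
    if not 0 <= action <= 10:
        raise ValueError("action must be an integer between 0 and 10")
    return action / 10


def auc(labels: List[int], scores: List[float]) -> float:
    """Pairwise AUC: share of (clicked, not clicked) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one click and one non-click")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


actions = [9, 2, 5, 5]   # agent outputs for four displayed objects
clicks = [1, 0, 1, 0]    # real mapping log: 1 = clicked, 0 = not clicked
probabilities = [action_to_probability(a) for a in actions]
frame_auc = auc(clicks, probabilities)
```

With only eleven possible actions, ties between predicted probabilities are common, which is why the sketch counts a tied pair as half correct.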
Step 5: Train the recommendation system agent, as shown in FIG. 2, as follows:
Step 5.1: The agent requests the records related to one user from the set obtained in step 3;
Step 5.2: The memory returns the records, and each object in them is replaced with its feature representation using the result of step 2;
Step 5.3: The agent clusters the user's history records with the Mean-Shift algorithm to separate the user's preference features and form the user portrait, predicts item by item the user's click rate on the objects in the mapping log according to the user portrait, and is rewarded according to the AUC index computed from the predictions and the real records;
Step 5.4: After all entries in the user's mapping log have been processed, one reinforcement learning frame is considered finished. If the training termination condition among the reinforcement learning environment parameters set in step 4 is met, the training of the agent ends; otherwise the method returns to step 5.1.
Step 6: Use the trained recommendation system agent to process the mapping logs in the test set, predict the user's click probability for each object according to the user's historical behavior record in the test set, compute the AUC index, and record the prediction result D.
Step 7: Take the objects with a high click probability in the prediction result D as the total recommendation list L.
Step 8: The total recommendation list L can be processed according to different requirements to generate a multi-modal recommendation list. The following requirements are handled as follows:
When the user's broad interests across multiple fields need to be mined, or the modalities of the recommendation results need to be diversified, the total recommendation list L can simply be sorted according to the requirements of the specific scenario and then output, generating the multi-modal recommendation list L1.
When recommendations in other fields need to be made to the user on the basis of a given object (such as matching a picture to a news item or recommending news hyperlinks for a picture), the objects of other modalities closest to the current object can be selected from the multidimensional space S according to the nearest-neighbor principle, generating the multi-modal recommendation list L2.
Step 9: Recommend the generated multi-modal recommendation list to the user and obtain the actual user interaction results, thereby generating a new mapping log. Each time a given number n of log entries has accumulated, the agent may be further trained and updated.
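The update rule in step 9, retraining once a given number n of new log entries has accumulated, can be sketched with a small buffer. The class name `LogBuffer` and the `retrain` callback are hypothetical; what the callback does with each batch (for example, running further training frames of the agent) is left open.

```python
from typing import Callable, List, Tuple

LogEntry = Tuple[str, str, int]  # (user id, object id, 1 = clicked / 0 = not clicked)


class LogBuffer:
    """Collects new mapping log entries and triggers retraining every n entries."""

    def __init__(self, n: int, retrain: Callable[[List[LogEntry]], None]) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.retrain = retrain
        self._pending: List[LogEntry] = []

    def add(self, entry: LogEntry) -> bool:
        """Store one entry; return True if this entry triggered a retraining call."""
        self._pending.append(entry)
        if len(self._pending) >= self.n:
            batch, self._pending = self._pending, []
            self.retrain(batch)
            return True
        return False


batches: List[List[LogEntry]] = []
buffer = LogBuffer(n=3, retrain=batches.append)  # stand-in for real agent training
triggered = [buffer.add(("u1", f"n{i}", i % 2)) for i in range(7)]
```

After seven entries with n = 3, two batches have been handed to the callback and one entry is still waiting for the next batch.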
Through the above steps, the experimental result indexes finally obtained are shown in Table 1 (different recommendation models were used in step 5).
Table 1. Results of the various recommendation models
The method, which uses a reinforcement learning model as the recommendation system agent, was tested on the Microsoft News Dataset (MIND for short). The results show that the performance indexes of the reinforcement learning agent are similar to those of traditional recommendation models, that is, the reinforcement learning agent is competent for the recommendation task. The image-modality data of the MIND dataset were then expanded with the crawler so that it became a mixed text-and-image dataset, on which further experiments were carried out.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions and scope of the present invention as defined in the appended claims.
Claims (7)
1. A user personalized recommendation method based on multi-modal data, characterized in that the method comprises the following steps:
Step 1: acquiring a total object set I and an object set to be recommended L0, and determining the types of multi-modal information to be handled;
Step 2: for each modality handled, applying a corresponding feature extraction algorithm to extract the features of each object in the total object set and mapping the features into the same mathematical multidimensional space S;
Step 3: acquiring and accumulating user historical behavior records and mapping logs;
Step 4: initializing a recommendation system, inputting into it the multidimensional space S obtained in step 2 and the user historical behavior records and mapping logs obtained in step 3, and setting the recommendation system agent and the reinforcement learning environment parameters;
Step 5: training the recommendation system agent;
Step 6: using the trained recommendation system agent to process the object set to be recommended L0, simulating the user's use of the recommendation system on L0, and generating the corresponding mapping log D;
Step 7: extracting the set of objects that the user interacted with in the mapping log D, that is, the personalized set of interacted objects that the recommendation system agent predicts for the user, as the total recommendation list L;
Step 8: processing the total recommendation list L according to different requirements to generate a multi-modal recommendation list;
Step 9: recommending the generated multi-modal recommendation list to the user and obtaining the actual user interaction results, thereby generating a new mapping log; each time a given number n of log entries has accumulated, the agent is further trained and updated.
2. The user personalized recommendation method based on multi-modal data according to claim 1, characterized in that: the historical behavior record in step 3 is the user's click history before the mapping log, that is, a list of clicked objects; the mapping log is the list of objects displayed to the user together with the user's click behavior on each of them, where 1 represents a click and 0 represents no click.
3. The user personalized recommendation method based on multi-modal data according to claim 1, characterized in that the specific process of step 5 is as follows:
Step 5.1: the agent requests the records related to one user from the user historical behavior records and mapping logs obtained in step 3;
Step 5.2: the memory returns the records, and each object in them is replaced with its feature representation using the result of step 2;
Step 5.3: the agent forms a user portrait from the user's historical behavior record and processes the mapping log entries in sequence according to the user portrait; if the agent's action is consistent with the real log, the agent obtains a reward, and if it is inconsistent, the agent obtains no reward or is penalized;
Step 5.4: after all entries in the user's mapping log have been processed, one reinforcement learning frame is considered finished; if the training termination condition among the reinforcement learning environment parameters set in step 4 is met, the training of the agent ends and the method proceeds to step 6; otherwise it returns to step 5.1.
4. The user personalized recommendation method based on multi-modal data according to any one of claims 1-3, characterized in that: in step 2, image features are extracted with a convolutional neural network (CNN); text features are extracted with the TF-IDF algorithm or a CNN; video features are extracted with a P3D residual network; and audio features are extracted with the discrete wavelet transform or the perceptual linear prediction algorithm.
5. The user personalized recommendation method based on multi-modal data according to any one of claims 1-3, characterized in that: in step 8, when the user's broad interests across multiple fields need to be mined or the modalities of the recommendation results need to be diversified, the total recommendation list L is simply sorted according to the requirements of the specific scenario and then output, generating the multi-modal recommendation list L1.
6. The user personalized recommendation method based on multi-modal data according to any one of claims 1-3, characterized in that: in step 8, when recommendations in other fields need to be made to the user on the basis of a given object, the objects of other modalities closest to the current object are selected from the multidimensional space S according to the nearest-neighbor principle, generating the multi-modal recommendation list L2.
7. The user personalized recommendation method based on multi-modal data according to any one of claims 1-3, characterized in that: in step 8, when more accurate and effective recommendations need to be made by combining knowledge representations at different levels, a multi-modal fusion algorithm is used to fuse the features of different modalities, and the agent is trained on the fused features, generating a recommendation list L3 in which the same object has multi-modal representations.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210322829.8A CN114647787A (en) | 2022-03-30 | 2022-03-30 | User personalized recommendation method based on multi-modal data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210322829.8A CN114647787A (en) | 2022-03-30 | 2022-03-30 | User personalized recommendation method based on multi-modal data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114647787A (en) | 2022-06-21 |
Family
ID=81994734
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210322829.8A Pending CN114647787A (en) | 2022-03-30 | 2022-03-30 | User personalized recommendation method based on multi-modal data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114647787A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116702094A (en) * | 2023-08-01 | 2023-09-05 | 国家计算机网络与信息安全管理中心 | Group application preference feature representation method |
CN116702094B (en) * | 2023-08-01 | 2023-12-22 | 国家计算机网络与信息安全管理中心 | Group application preference feature representation method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111581510B (en) | Shared content processing method, device, computer equipment and storage medium | |
CN111382309B (en) | Short video recommendation method based on graph model, intelligent terminal and storage medium | |
CN111444428B (en) | Information recommendation method and device based on artificial intelligence, electronic equipment and storage medium | |
CN112119388A (en) | Training image embedding model and text embedding model | |
CN112100504B (en) | Content recommendation method and device, electronic equipment and storage medium | |
CN113590965B (en) | Video recommendation method integrating knowledge graph and emotion analysis | |
CN117836765A (en) | Click prediction based on multimodal hypergraph | |
CN113051468B (en) | Movie recommendation method and system based on knowledge graph and reinforcement learning | |
CN116601626A (en) | Personal knowledge graph construction method and device and related equipment | |
CN113590970A (en) | Personalized digital book recommendation system and method based on reader preference, computer and storage medium | |
CN115964560B (en) | Information recommendation method and equipment based on multi-mode pre-training model | |
CN116977701A (en) | Video classification model training method, video classification method and device | |
CN113094587A (en) | Implicit recommendation method based on knowledge graph path | |
CN115238191A (en) | Object recommendation method and device | |
CN114201516A (en) | User portrait construction method, information recommendation method and related device | |
CN115221352A (en) | Big data short video recommendation system based on collaborative filtering algorithm | |
CN111026910B (en) | Video recommendation method, device, electronic equipment and computer readable storage medium | |
CN115640449A (en) | Media object recommendation method and device, computer equipment and storage medium | |
CN118069927A (en) | News recommendation method and system based on knowledge perception and user multi-interest feature representation | |
CN114647787A (en) | User personalized recommendation method based on multi-modal data | |
CN116956183A (en) | Multimedia resource recommendation method, model training method, device and storage medium | |
CN114417875B (en) | Data processing method, apparatus, device, readable storage medium, and program product | |
CN117786234B (en) | Multimode resource recommendation method based on two-stage comparison learning | |
CN118230224B (en) | Label scoring method, label scoring model training method and device | |
CN116881575B (en) | Content pushing method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||