CN114647787A - User personalized recommendation method based on multi-modal data - Google Patents


Publication number
CN114647787A
Authority
CN
China
Prior art keywords
user
recommendation
mapping
modal
agent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210322829.8A
Other languages
Chinese (zh)
Inventor
郭楠
傅章鹏
高天寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN202210322829.8A
Publication of CN114647787A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a user personalized recommendation method based on multi-modal data and relates to the technical field of network recommendation. After obtaining the historical behavior records and impression logs that the user permits to be collected, the method extracts the features of every object involved in those records and of every object to be recommended, and maps them into the same multi-dimensional space, thereby integrating the multi-modal data. A reinforcement learning model then serves as the recommendation system agent: the agent is trained on the collected user records, and the trained agent performs the recommendation, achieving personalized recommendation of multi-modal data. The user historical behavior records, impression logs, and set of objects to be recommended that the method handles can all contain objects from multiple fields (such as text, pictures, and videos). Integrating the modalities after feature extraction blurs the differences between them, which addresses the limitation that a traditional recommendation system can only recommend within a single field.

Description

User personalized recommendation method based on multi-modal data
Technical Field
The invention relates to the technical field of network recommendation, and in particular to a user personalized recommendation method based on multi-modal data.
Background
In recent years, with the rapid development of the internet, modern society has become an information-based, digital society; data floods the world and information explosion has become the norm. Faced with this volume of data, however, users make less effective use of information, which is the problem of information overload. The recommendation system is one of the key technologies for addressing information overload. With the rapid development of the internet, the internet of things, and cloud computing, the personalized recommendation system has become a standard component of internet products, and the information that users encounter in e-commerce, video, news, music, and similar fields is closely tied to recommendation systems.
A recommendation system acquires the user's historical behavior data, such as web browsing data, purchase history, or item ratings, and uses it to make personalized recommendations. Established approaches include content-based filtering, collaborative filtering, and hybrid recommendation systems. With the rise of deep learning, recommendation algorithms built on neural networks, such as recurrent neural networks, convolutional neural networks, and generative adversarial networks, are also widely used. In addition, there are recommendation techniques based on reinforcement learning, on heterogeneous networks, and so on.
Most existing recommendation systems address personalized recommendation within a single modality: the user's historical data is learned in only one field (such as news, video, or music), so only the user's interests and preferences in that field are captured and used for recommendation, which greatly limits the diversity of the results.
Disclosure of Invention
The technical problem to be solved by the present invention is to address the shortcomings of the prior art by providing a user personalized recommendation method based on multi-modal data. The method learns user preferences and forms a user profile from the historical data that the user generates in multiple fields (such as news, video, and music), so as to recommend multi-modal data and broaden the diversity of the recommendation results.
To solve the above technical problem, the invention adopts the following technical solution:
A user personalized recommendation method based on multi-modal data comprises the following steps:
Step 1: Acquire the total object set I and the set of objects to be recommended L0, and specify the types of multi-modal information to be handled.
Step 2: For each modality to be handled, apply a corresponding feature extraction algorithm to extract the features of each object in the total object set, and map the features into the same mathematical multi-dimensional space S.
Step 3: Acquire and accumulate the user historical behavior records and impression logs. The historical behavior record is the user's click history before the impression, that is, a list of clicked objects. The impression log is a list of the objects displayed to the user together with the user's click behavior on each object, where 1 denotes a click and 0 denotes no click.
Step 4: Initialize the recommendation system, input the multi-dimensional space S obtained in step 2 and the user historical behavior records and impression logs obtained in step 3 into it, and set the recommendation system agent and the reinforcement learning environment parameters.
Step 5: Train the recommendation system agent.
Step 6: Use the trained recommendation system agent to process the set of objects to be recommended L0, simulating the user's interaction with L0 and generating a corresponding impression log D.
Step 7: Extract the set of objects that the user interacted with in the impression log D, that is, the personalized set of interacted objects that the agent predicts for the user, as the total recommendation list L.
Step 8: Process the total recommendation list L according to different requirements to generate a multi-modal recommendation list.
Step 9: Present the generated multi-modal recommendation list to the user, obtain the actual user interaction results, and generate impression logs from them. Each time a given number n of logs has accumulated, the agent is further trained and updated.
The beneficial effects of the above technical solution are as follows. In the user personalized recommendation method based on multi-modal data provided by the invention, after the historical behavior records and impression logs that the user permits to be collected are obtained, the features of every object involved in those records and of every object to be recommended are extracted and mapped into the same multi-dimensional space, integrating the multi-modal data. A reinforcement learning model then serves as the recommendation system agent; the agent is trained on the collected user records, and the trained agent performs the recommendation, achieving personalized recommendation of multi-modal data. Compared with the prior art, the user historical behavior records, impression logs, and set of objects to be recommended handled by the method can all contain objects from multiple fields (such as text, pictures, and videos), and integrating the modalities after feature extraction blurs the differences between them, which addresses the limitation that a traditional recommendation system can only recommend within a single field.
Drawings
FIG. 1 is a flowchart of the recommendation method provided by an embodiment of the present invention;
FIG. 2 is a flowchart of the training method of the agent (Rainbow model) provided by an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
As shown in fig. 1, the user personalized recommendation method based on multimodal data of the present embodiment is as follows.
Step 1: Acquire the total object set I and the set of objects to be recommended L0, and specify the types of multi-modal information to be handled.
Step 2: For each modality to be handled, apply a corresponding feature extraction algorithm to extract the features of each object in the total object set, and map the features into the same mathematical multi-dimensional space S. For example, image features are extracted with a convolutional neural network (CNN) or a recurrent neural network (RNN); text features with the TF-IDF algorithm or a CNN; video features with a P3D (Pseudo-3D Residual Networks) residual network; and audio features with the Discrete Wavelet Transform (DWT) or the Perceptual Linear Prediction (PLP) algorithm.
Step 3: Acquire and accumulate the user historical behavior records and impression logs. The historical behavior record is the user's click history before the impression, that is, a list of clicked objects. The impression log is a list of the objects displayed to the user together with the user's click behavior on each object, where 1 denotes a click and 0 denotes no click.
Step 4: Initialize the recommendation system, input the multi-dimensional space S obtained in step 2 and the user historical behavior records and impression logs obtained in step 3 into it, and set the recommendation system agent and the reinforcement learning environment parameters.
Step 5: Train the recommendation system agent, as shown in FIG. 2, as follows:
Step 5.1: The agent requests the records of one user from the user historical behavior records and impression logs obtained in step 3.
Step 5.2: The memory returns the records, and each object in them is replaced with its feature representation from step 2.
Step 5.3: The agent forms a user profile from the user's historical behavior record and works through the impressions in order according to that profile. If the agent's action matches the real log, the agent receives a reward; otherwise it receives no reward or a penalty.
Step 5.4: After all entries in the user's impression log have been processed, one reinforcement learning frame is considered complete, and the termination check is made: if the training termination condition in the reinforcement learning environment parameters set in step 4 is met, training of the agent ends and the method proceeds to step 6; otherwise it returns to step 5.1.
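The loop of steps 5.1 to 5.4 can be sketched as follows. This is only an illustration of the frame structure and the match/mismatch reward. The `PerceptronAgent` class is a deliberately simple stand-in and is not the Rainbow model of FIG. 2; the state construction in `make_state`, the reward values of +1 and -1, and the termination condition (every entry of a frame matched, or a frame budget exhausted) are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Record = Tuple[Vector, List[Tuple[Vector, int]]]  # (user profile, logged entries)


class PerceptronAgent:
    """Stand-in learner (NOT Rainbow): a linear click predictor updated from rewards."""

    def __init__(self, dim: int) -> None:
        self.weights = [0.0] * dim

    def act(self, state: Vector) -> int:
        score = sum(w * x for w, x in zip(self.weights, state))
        return 1 if score > 0 else 0  # 1 = predict click, 0 = predict no click

    def learn(self, state: Vector, action: int, reward: float) -> None:
        if reward < 0:  # a penalty means the opposite action was the logged one
            direction = 1.0 if action == 0 else -1.0
            self.weights = [w + direction * x for w, x in zip(self.weights, state)]


def make_state(profile: Vector, obj: Vector) -> List[float]:
    """Assumed state: element-wise product of user profile and object features."""
    return [p * o for p, o in zip(profile, obj)]


def run_frame(agent: PerceptronAgent, profile: Vector,
              log: List[Tuple[Vector, int]]) -> float:
    """Step 5.3: replay one user's log entry by entry; +1 for a match, -1 otherwise."""
    total = 0.0
    for obj, clicked in log:
        state = make_state(profile, obj)
        action = agent.act(state)
        reward = 1.0 if action == clicked else -1.0
        agent.learn(state, action, reward)
        total += reward
    return total


def train(agent: PerceptronAgent, records: List[Record], max_frames: int) -> int:
    """Steps 5.1 to 5.4: run frames until one is fully matched or the budget ends."""
    for frame in range(1, max_frames + 1):
        profile, log = records[(frame - 1) % len(records)]  # 5.1/5.2: fetch a record
        if run_frame(agent, profile, log) == len(log):      # 5.4: termination check
            return frame
    return max_frames
```

In a real system the agent, the state representation, and the termination condition would come from the reinforcement learning environment parameters set in step 4.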
Step 6: Use the trained recommendation system agent to process the set of objects to be recommended L0, simulating the user's interaction with L0 and generating a corresponding impression log D.
Step 7: Extract the set of objects that the user interacted with in the impression log D, that is, the personalized set of interacted objects that the agent predicts for the user, as the total recommendation list L.
Step 8: Process the total recommendation list L according to different requirements to generate a multi-modal recommendation list:
When the user's broad interests across multiple fields need to be mined, or the modalities of the recommendation results need to be diversified, the total recommendation list L is simply sorted according to the requirements of the specific scenario and output as the multi-modal recommendation list L1.
When the user needs recommendations in other fields based on a given object (for example, matching a picture to a news item or recommending a text hyperlink for a movie), the objects of other modalities closest to the current object are selected from the multi-dimensional space S by the nearest-neighbor principle, generating the multi-modal recommendation list L2.
When more accurate and effective recommendations are needed by combining knowledge representations at different levels, a multi-modal fusion algorithm fuses the features of the different modalities and the agent is trained on the fused features, generating a recommendation list L3 in which the same object has multi-modal representations, such as a news headline paired with its news cover image.
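The description does not name a particular multi-modal fusion algorithm. As one simple possibility, the sketch below shows early fusion by concatenation: each modality's feature vector is L2-normalized so that neither modality dominates by scale, and the results are joined into a single vector on which an agent could be trained. The function names `l2_normalize` and `fuse` are hypothetical.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else [0.0] * len(vec)


def fuse(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Early fusion (an assumed choice): normalize each modality, then concatenate."""
    return l2_normalize(text_vec) + l2_normalize(image_vec)


# Hypothetical features of one news item: headline text and cover image.
fused = fuse([3.0, 4.0], [0.0, 2.0])
print(fused)
```

Other fusion strategies, such as weighted sums or learned joint embeddings, would fit the same step.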
Step 9: Present the generated multi-modal recommendation list to the user, obtain the actual user interaction results, and generate impression logs from them. Each time a given number n of logs has accumulated, the agent is further trained and updated.
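The periodic update in step 9 can be illustrated with a small buffer that collects new logs and triggers retraining once n of them have accumulated. This is a minimal sketch: the class name `LogBuffer`, the dictionary log format, and the `retrain` callback are assumptions, and the callback stands in for whatever routine further trains the agent.

```python
from typing import Callable, List


class LogBuffer:
    """Collect new logs and trigger retraining once n have accumulated."""

    def __init__(self, n: int, retrain: Callable[[List[dict]], None]) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.retrain = retrain
        self.pending: List[dict] = []

    def add(self, log: dict) -> bool:
        """Store one log; return True when this call triggered a retraining run."""
        self.pending.append(log)
        if len(self.pending) < self.n:
            return False
        # Hand the full batch to the training routine and start a fresh buffer.
        batch, self.pending = self.pending, []
        self.retrain(batch)
        return True
```

For example, `LogBuffer(3, update_agent)` would call the hypothetical `update_agent` with a batch of three logs on every third call to `add`.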
This example carries out the method of the present invention step by step on the Microsoft News Dataset (MIND):
Step 1: Because the MIND dataset contains only news text data and does not meet the requirements of a multi-modal task, the pictures contained in each news item are crawled with the spiderFlow crawler tool (https://github.com/chenyuansgit/spiderFlow) using the news links provided in MIND, forming a picture dataset corresponding to the news texts. The total object set I is therefore all news texts and news pictures; the set of objects to be recommended L0 is the news texts in the impression logs of the test set together with their corresponding news pictures; and the types of multi-modal information to be handled are text and images.
Step 2: For the two modalities, text and image, corresponding feature extraction algorithms are applied. For news text, basic text features are extracted by TF-IDF term weighting, and feature selection with the chi-square test then reduces the dimensionality of these basic features, which improves the model's generalization, reduces overfitting, and makes the relationship between features and feature values easier to understand. News picture features are extracted with the pre-trained VGG16 neural network from the Keras library. Finally, the extracted features of every object in the total object set are mapped into the same mathematical multi-dimensional space S with the UMAP algorithm (https://arxiv.org/abs/1802.03426).
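To make the text branch of step 2 concrete, the following sketch computes TF-IDF term weights in plain Python. It assumes one common variant of the weighting, term frequency times log(N / document frequency), which may differ from the variant used in the experiment. The chi-square feature selection, the VGG16 image features, and the UMAP projection are omitted here, and the function name `tfidf` is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} map per document, weight = tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Two tokenized headlines, made up for illustration.
docs = [["stock", "market", "news"], ["sports", "news"]]
weights = tfidf(docs)
print(weights[0])
```

A term that occurs in every document, such as "news" here, receives weight 0, which is why TF-IDF emphasizes the terms that distinguish one news text from another.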
Step 3: Read the user historical behavior records and impression logs accumulated in the dataset.
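As an illustration of what reading these records can look like, the sketch below parses one line in the tab-separated layout of MIND's behaviors.tsv file (impression ID, user ID, time, click history, and displayed objects with a click label appended as "-1" or "-0"). The `UserRecord` type and the `parse_behavior_line` function are hypothetical names, and the sample line is made up rather than taken from the dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UserRecord:
    user_id: str
    history: List[str]                  # objects clicked before the displayed list
    impressions: List[Tuple[str, int]]  # (object id, 1 = clicked / 0 = not clicked)


def parse_behavior_line(line: str) -> UserRecord:
    """Parse one line: impression id, user id, time, history, impressions."""
    _imp_id, user_id, _time, history, impressions = line.rstrip("\n").split("\t")
    pairs = []
    for token in impressions.split():
        obj_id, label = token.rsplit("-", 1)  # "N10-1" -> ("N10", "1")
        pairs.append((obj_id, int(label)))
    return UserRecord(user_id, history.split(), pairs)


# Hypothetical sample line, made up for illustration.
sample = "1\tU100\t11/11/2019 9:05:58 AM\tN1 N2 N3\tN10-1 N11-0 N12-0"
record = parse_behavior_line(sample)
print(record.history)
print(record.impressions)
```

The `history` field corresponds to the historical behavior record and the `impressions` field to the log of displayed objects with their click labels.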
Step 4: Initialize the recommendation system, input the multi-dimensional space S obtained in step 2 and the user historical behavior records and impression logs obtained in step 3 into it, and set the agent and the reinforcement learning environment parameters (taking the Rainbow model as an example): the action space is the integers 0 to 10, and the action divided by 10 is the predicted probability that the object is clicked; the state space is a multi-dimensional matrix that reflects the user profile and the features of the object currently being judged; the reward is determined by the AUC computed from the predictions output for the impression log.
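The action encoding and the AUC-based reward can be sketched as follows. The example assumes the pairwise definition of AUC (the fraction of clicked/unclicked pairs that are ranked correctly, with ties counted as one half) and uses the AUC itself as the reward; the exact reward shaping in the experiment is not specified. The function names are hypothetical.

```python
from typing import List


def action_to_probability(action: int) -> float:
    """Map a discrete action in 0..10 to a predicted click probability."""
    if not 0 <= action <= 10:
        raise ValueError("action must be an integer in 0..10")
    return action / 10


def auc(labels: List[int], scores: List[float]) -> float:
    """Fraction of (clicked, unclicked) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one clicked and one unclicked object")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Actions chosen for four displayed objects, and the logged clicks (made-up data).
actions = [9, 2, 5, 5]
labels = [1, 0, 1, 0]
scores = [action_to_probability(a) for a in actions]
reward = auc(labels, scores)
print(reward)  # 0.875
```

Here three of the four clicked/unclicked pairs are ranked correctly and one is tied, giving 3.5 / 4 = 0.875.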
Step 5: Train the recommendation system agent, as shown in FIG. 2, as follows:
Step 5.1: The agent requests the records of one user from the set obtained in step 3;
Step 5.2: The memory returns the records, and each object in them is replaced with its feature representation from step 2;
Step 5.3: From the user's historical record, the agent uses Mean-Shift clustering to separate out the user's preference features and form a user profile. Based on this profile it predicts, entry by entry, the user's click probability for each object in the impression log, and it is rewarded according to the AUC computed from the predictions and the real records;
Step 5.4: After all entries in the user's impression log have been processed, one reinforcement learning frame is considered complete, and the termination check is made: if the training termination condition in the reinforcement learning environment parameters set in step 4 is met, training of the agent ends; otherwise it returns to step 5.1.
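The Mean-Shift clustering in step 5.3 can be illustrated with a minimal flat-kernel implementation. Each feature vector of a clicked object is shifted repeatedly to the mean of the points within a bandwidth until it stops moving, and the distinct convergence points are taken as cluster centres, which can be read as the user's separate preference features. The bandwidth value, the merging rule, and how the centres are assembled into a user profile are assumptions; the source does not specify them.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def _mean(points: List[Point]) -> List[float]:
    return [sum(axis) / len(points) for axis in zip(*points)]


def mean_shift(points: List[Point], bandwidth: float,
               max_iter: int = 100) -> List[List[float]]:
    """Flat-kernel mean shift; returns the distinct modes (cluster centres)."""
    modes: List[List[float]] = []
    for start in points:
        centre = list(start)
        for _ in range(max_iter):
            neighbours = [p for p in points if math.dist(p, centre) <= bandwidth]
            if not neighbours:  # guard against floating-point edge cases
                break
            shifted = _mean(neighbours)
            if math.dist(shifted, centre) < 1e-9:
                break
            centre = shifted
        # Keep the mode only if no existing mode lies within one bandwidth of it.
        if not any(math.dist(centre, m) < bandwidth for m in modes):
            modes.append(centre)
    return modes


# Made-up 2-D features of clicked objects forming two interest groups.
clicked = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [5.0, 5.0], [5.2, 5.0]]
centres = mean_shift(clicked, bandwidth=1.0)
print(len(centres))  # 2
```

Unlike k-means, Mean-Shift does not need the number of clusters in advance, which suits users whose number of distinct interests is unknown.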
Step 6: Use the trained recommendation system agent to process the impression logs in the test set: predict the user's click probability for each object from the user's historical behavior record in the test set, compute the AUC, and record the prediction result D.
Step 7: Take the objects with a high click probability in the prediction result D as the total recommendation list L.
Step 8: The total recommendation list L can be processed according to different requirements to generate a multi-modal recommendation list; the requirements below are handled as follows:
When the user's broad interests across multiple fields need to be mined, or the modalities of the recommendation results need to be diversified, the total recommendation list L is simply sorted according to the requirements of the specific scenario and output as the multi-modal recommendation list L1.
When the user needs recommendations in other fields based on a given object (for example, matching a picture to a news item or recommending a news hyperlink for a picture), the objects of other modalities closest to the current object are selected from the multi-dimensional space S by the nearest-neighbor principle, generating the multi-modal recommendation list L2.
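The nearest-neighbor selection across modalities can be sketched as follows. The example assumes that every object already has coordinates in the shared space S and that closeness is measured by Euclidean distance; the dictionary layout and the function name `nearest_other_modality` are hypothetical, and the coordinates are made up.

```python
import math
from typing import Dict, List, Sequence, Tuple

# object id -> (modality, coordinates in the shared space S)
Space = Dict[str, Tuple[str, Sequence[float]]]


def nearest_other_modality(space: Space, query_id: str, k: int = 1) -> List[str]:
    """Return the k objects closest to query_id whose modality differs from it."""
    query_modality, query_vec = space[query_id]
    candidates = [
        (math.dist(query_vec, vec), obj_id)
        for obj_id, (modality, vec) in space.items()
        if modality != query_modality
    ]
    return [obj_id for _, obj_id in sorted(candidates)[:k]]


space = {
    "news_1": ("text", [0.1, 0.9]),
    "news_2": ("text", [0.8, 0.2]),
    "pic_1": ("image", [0.2, 0.8]),
    "pic_2": ("image", [0.9, 0.1]),
}
print(nearest_other_modality(space, "news_1"))  # ['pic_1']
```

For a large object set, the linear scan would normally be replaced by an approximate nearest-neighbor index, but the selection rule stays the same.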
Step 9: Present the generated multi-modal recommendation list to the user, obtain the actual user interaction results, and generate impression logs from them. Each time a given number n of logs has accumulated, the agent may be further trained and updated.
Through the above steps, the experimental results obtained are shown in Table 1 (step 5 was run with several different recommendation models).
Table 1. Results of the various recommendation models
(Table image: BDA0003572401540000051)
The method uses a reinforcement learning model as the recommendation system agent and was tested on the Microsoft News Dataset (MIND). The results show that the performance of the reinforcement learning agent is close to that of traditional recommendation models, that is, the reinforcement learning agent is capable of the recommendation task. The crawler extends MIND with image-modality data, making it a mixed text-and-image dataset on which further experiments are carried out.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions and scope of the present invention as defined in the appended claims.

Claims (7)

1. A user personalized recommendation method based on multi-modal data, characterized by comprising the following steps:
Step 1: acquiring the total object set I and the set of objects to be recommended L0, and specifying the types of multi-modal information to be handled;
Step 2: for each modality to be handled, applying a corresponding feature extraction algorithm to extract the features of each object in the total object set, and mapping the features into the same mathematical multi-dimensional space S;
Step 3: acquiring and accumulating user historical behavior records and impression logs;
Step 4: initializing a recommendation system, inputting the multi-dimensional space S obtained in step 2 and the user historical behavior records and impression logs obtained in step 3 into the recommendation system, and setting the recommendation system agent and the reinforcement learning environment parameters;
Step 5: training the recommendation system agent;
Step 6: using the trained recommendation system agent to process the set of objects to be recommended L0, simulating the user's interaction with L0 and generating a corresponding impression log D;
Step 7: extracting the set of objects that the user interacted with in the impression log D, that is, the personalized set of interacted objects that the agent predicts for the user, as the total recommendation list L;
Step 8: processing the total recommendation list L according to different requirements to generate a multi-modal recommendation list;
Step 9: presenting the generated multi-modal recommendation list to the user, obtaining the actual user interaction results, and generating impression logs from them; each time a given number n of logs has accumulated, the agent is further trained and updated.
2. The user personalized recommendation method based on multi-modal data according to claim 1, characterized in that: the historical behavior record in step 3 is the user's click history before the impression, that is, a list of clicked objects; the impression log is a list of the objects displayed to the user together with the user's click behavior on each object, where 1 denotes a click and 0 denotes no click.
3. The user personalized recommendation method based on multi-modal data according to claim 1, characterized in that the specific process of step 5 is as follows:
Step 5.1: the agent requests the records of one user from the user historical behavior records and impression logs obtained in step 3;
Step 5.2: the memory returns the records, and each object in them is replaced with its feature representation from step 2;
Step 5.3: the agent forms a user profile from the user's historical behavior record and works through the impressions in order according to that profile; if the agent's action matches the real log, the agent receives a reward, and otherwise it receives no reward or a penalty;
Step 5.4: after all entries in the user's impression log have been processed, one reinforcement learning frame is considered complete, and the termination check is made: if the training termination condition in the reinforcement learning environment parameters set in step 4 is met, training of the agent ends and the method proceeds to step 6; otherwise it returns to step 5.1.
4. The user personalized recommendation method based on multi-modal data according to any one of claims 1 to 3, characterized in that: in step 2, image features are extracted with a convolutional neural network (CNN); text features are extracted with the TF-IDF algorithm or a CNN; video features are extracted with a P3D residual network; and audio features are extracted with the discrete wavelet transform or the perceptual linear prediction algorithm.
5. The user personalized recommendation method based on multi-modal data according to any one of claims 1 to 3, characterized in that: in step 8, when the user's broad interests across multiple fields need to be mined, or the modalities of the recommendation results need to be diversified, the total recommendation list L is simply sorted according to the requirements of the specific scenario and output as the multi-modal recommendation list L1.
6. The user personalized recommendation method based on multi-modal data according to any one of claims 1 to 3, characterized in that: in step 8, when the user needs recommendations in other fields based on a given object, the objects of other modalities closest to the current object are selected from the multi-dimensional space S by the nearest-neighbor principle, generating the multi-modal recommendation list L2.
7. The user personalized recommendation method based on multi-modal data according to any one of claims 1 to 3, characterized in that: in step 8, when more accurate and effective recommendations are needed by combining knowledge representations at different levels, a multi-modal fusion algorithm fuses the features of the different modalities and the agent is trained on the fused features, generating a recommendation list L3 in which the same object has multi-modal representations.
CN202210322829.8A 2022-03-30 2022-03-30 User personalized recommendation method based on multi-modal data Pending CN114647787A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210322829.8A CN114647787A (en) 2022-03-30 2022-03-30 User personalized recommendation method based on multi-modal data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210322829.8A CN114647787A (en) 2022-03-30 2022-03-30 User personalized recommendation method based on multi-modal data

Publications (1)

Publication Number Publication Date
CN114647787A true CN114647787A (en) 2022-06-21

Family

ID=81994734

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210322829.8A Pending CN114647787A (en) 2022-03-30 2022-03-30 User personalized recommendation method based on multi-modal data

Country Status (1)

Country Link
CN (1) CN114647787A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116702094A (en) * 2023-08-01 2023-09-05 国家计算机网络与信息安全管理中心 Group application preference feature representation method
CN116702094B (en) * 2023-08-01 2023-12-22 国家计算机网络与信息安全管理中心 Group application preference feature representation method

Similar Documents

Publication Publication Date Title
CN111581510B (en) Shared content processing method, device, computer equipment and storage medium
CN111382309B (en) Short video recommendation method based on graph model, intelligent terminal and storage medium
CN111444428B (en) Information recommendation method and device based on artificial intelligence, electronic equipment and storage medium
CN112119388A (en) Training image embedding model and text embedding model
CN112100504B (en) Content recommendation method and device, electronic equipment and storage medium
CN113590965B (en) Video recommendation method integrating knowledge graph and emotion analysis
CN117836765A (en) Click prediction based on multimodal hypergraph
CN113051468B (en) Movie recommendation method and system based on knowledge graph and reinforcement learning
CN116601626A (en) Personal knowledge graph construction method and device and related equipment
CN113590970A (en) Personalized digital book recommendation system and method based on reader preference, computer and storage medium
CN115964560B (en) Information recommendation method and equipment based on multi-mode pre-training model
CN116977701A (en) Video classification model training method, video classification method and device
CN113094587A (en) Implicit recommendation method based on knowledge graph path
CN115238191A (en) Object recommendation method and device
CN114201516A (en) User portrait construction method, information recommendation method and related device
CN115221352A (en) Big data short video recommendation system based on collaborative filtering algorithm
CN111026910B (en) Video recommendation method, device, electronic equipment and computer readable storage medium
CN115640449A (en) Media object recommendation method and device, computer equipment and storage medium
CN118069927A (en) News recommendation method and system based on knowledge perception and user multi-interest feature representation
CN114647787A (en) User personalized recommendation method based on multi-modal data
CN116956183A (en) Multimedia resource recommendation method, model training method, device and storage medium
CN114417875B (en) Data processing method, apparatus, device, readable storage medium, and program product
CN117786234B (en) Multimode resource recommendation method based on two-stage comparison learning
CN118230224B (en) Label scoring method, label scoring model training method and device
CN116881575B (en) Content pushing method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination