CN114077662A - Meta learning method, device, equipment and storage medium of cold start recommendation system - Google Patents

Meta learning method, device, equipment and storage medium of cold start recommendation system Download PDF

Info

Publication number
CN114077662A
CN114077662A CN202111424274.XA CN202111424274A CN114077662A CN 114077662 A CN114077662 A CN 114077662A CN 202111424274 A CN202111424274 A CN 202111424274A CN 114077662 A CN114077662 A CN 114077662A
Authority
CN
China
Prior art keywords
semantic
embedding
task
item
meta
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111424274.XA
Other languages
Chinese (zh)
Inventor
苏欣
李天源
刘绪崇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Police Academy
Original Assignee
Hunan Police Academy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Police Academy filed Critical Hunan Police Academy
Priority to CN202111424274.XA priority Critical patent/CN114077662A/en
Publication of CN114077662A publication Critical patent/CN114077662A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses a meta-learning method, a meta-learning device, a meta-learning equipment and a storage medium of a cold-start recommendation system, wherein the meta-learning method at a model level and a heterogeneous information network at a data level are used for solving the cold-start problem; in the data level, the method enriches the relevant semantic information on the data by constructing the meta path; and on the model level, a memory which can store the semantic specification is adopted for guiding the model with the semantic parameter initialization, and a meta-optimization method is adopted for optimizing the method so as to achieve rapid adaptation.

Description

Meta learning method, device, equipment and storage medium of cold start recommendation system
Technical Field
The present application relates to the field of recommendation systems, and in particular, to a meta-learning method, apparatus, device, and storage medium for a cold-start recommendation system.
Background
Since the rapid development of mobile applications, the recommendation system plays an increasingly important role in the industry, and the core is to solve the problem of user information overload, but at the same time, many challenges are brought. Although recommendation systems can be divided into several broad categories, and conventional matrix factorization based recommendation methods or deep learning based techniques have been successful, one problem that most recommendation systems inevitably have is the cold start problem. Due to lack of user interaction with the item, it is often difficult for the recommendation system to accurately recommend new users. The cold start problem can be divided into user cold start and item cold start, corresponding to a situation where a recommendation system lacking user interaction with an item cannot handle a new user or a new item exists. An effective solution to alleviate the cold start problem is to enrich new users and new items with assistance data, for example where the recommendation system makes certain progress based on the user or item content. In addition, heterogeneous information networks are used to enrich users with complementary heterogeneous information in interacting with items. On the other hand, meta-learning has a very good effect in the field of small sample learning, and the cold start problem is revealed to a certain extent. Because meta-learning has a good performance in the learning configuration initialization of a new task, the recommendation of a user is generally regarded as a learning task, and the core idea is to learn a global parameter to initialize the parameter of a personalized recommendation model. The personalized parameters are updated locally to know the preference of a specific user, the global parameters are updated by minimizing the loss of training tasks among users, and finally, the learned global parameters are used for guiding the model setting of a new user, so that how to stably perform meta-learning becomes a problem to be solved urgently.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a meta-learning method, a meta-learning device, meta-learning equipment and a storage medium of a cold-start recommendation system, and aims to solve the technical problem that meta-learning cannot be stably performed in the prior art.
To achieve the above object, the present invention provides a meta-learning method for a cold start recommendation system, the method comprising:
extracting a semantic enhancement task constructor of a meta path from the configuration file to obtain semantic enhancement task data;
generating a context aggregation model according to the semantic enhancement task data to derive semantic embeddings with context semantics;
obtaining initial project embedding, and constructing two multilayer full-connection layer neural network models to learn the semantic embedding and the project embedding so as to obtain the semantic embedding and the project embedding which are suitable for a new task;
estimating the rating of the user u on the item i through a preference prediction model by the learned semantic embedding and item embedding;
obtaining global context prior phi according to the semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adapter on a specific meta path p;
adapting a global task prior omega to the task through a plurality of gradient descent steps according to the global context prior phi;
and initializing parameters x of the fully-connected neural network model according to the two multilayer fully-connected neural network models and combining the semantic specific memory so as to optimize a learning result.
Optionally, the step of generating a context aggregation model according to the semantic enhancement task data to derive semantic embedding with context semantics includes:
giving initial user embedding e according to the semantic enhancement taskuAnd an
Figure RE-GDA0003451038920000021
Embedded list e of rating items ofiContext aggregation model gφThrough CuDeriving semantic embedding e with contextual semanticssThe formula is as follows:
Figure RE-GDA0003451038920000031
wherein C isuRepresenting a phase with user u caused by direct interaction or meta-pathThe related item set is a context aggregation model with parameters of
Figure RE-GDA0003451038920000032
Figure RE-GDA0003451038920000033
Is a d-dimensional dense vector which is mainly used for embedding a plurality of features of a user.
Figure RE-GDA0003451038920000034
Is a feature embedding matrix, σ (-) is an activation function, and the formula is as follows:
Figure RE-GDA0003451038920000035
wherein a isiIs a fixed parameter within the interval (1, + ∞).
Optionally, the obtaining of the initial item embedding, and the constructing of two multilayer fully-connected layer neural network models to learn the semantic embedding and the item embedding to obtain the semantic embedding and the item embedding adapted to the new task includes:
acquiring initial project embedding;
constructing two multilayer full-connection layer neural network models, wherein the network models are
Figure RE-GDA0003451038920000036
Wherein z (·) represents a fully connected layer; chi shapes,χiRespectively representing full connection layer parameters of learning semantic embedding and project embedding;
and learning the semantic embedding and the project embedding according to the two multilayer fully-connected layer neural network models to obtain the semantic embedding and the project embedding which are suitable for new tasks.
Optionally, the step of estimating the rating of the user u on the item i by the preference prediction model through the learned semantic embedding and item embedding comprises:
estimating the rating of the user u on the item i by the learned semantic embedding and item embedding through a preference prediction model, wherein the preference prediction model has a specific formula
Figure RE-GDA0003451038920000037
Wherein the MLP is a multi-layer perceptron,
Figure RE-GDA0003451038920000041
representing a cascade. Where h iswA rating prediction model parameterized by w, containing weights and biases in MLP.
Optionally, the semantic adaptor is a task
Figure RE-GDA0003451038920000042
Semantic enhanced support set of
Figure RE-GDA0003451038920000043
The step of obtaining global context prior phi according to the semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adaptor on a specific meta-path p comprises the following steps:
specifying task T for user uuSupport set
Figure RE-GDA0003451038920000044
Is semantically contextual
Figure RE-GDA0003451038920000045
Enhancing;
given P ∈ P, semantics embed esIn the semantic space of p is
Figure RE-GDA0003451038920000046
Computing task TuMedium rated term
Figure RE-GDA0003451038920000047
Loss of
Figure RE-GDA0003451038920000048
Wherein
Figure RE-GDA0003451038920000049
Representing the meta-path p induces a prediction rating for u on item i in the semantic space;
by computing task T after gradient descentuLoss in each semantic space
Figure RE-GDA00034510389200000410
To obtain semantic priors for various aspects
Figure RE-GDA00034510389200000411
Where gamma represents the semantic learning rate.
Optionally, the step of adapting a global task a priori ω to the task by a plurality of gradient descent steps according to the global context a priori Φ comprises:
a priori φ on support set based on the global context
Figure RE-GDA00034510389200000412
Is updated to
Figure RE-GDA00034510389200000413
Converting global priors omega to the same space
Figure RE-GDA00034510389200000414
Wherein &' is the product of elements, κ (·) can be viewed asThe conversion function realized by the full-connection layer network finally uses gradient descent to make omegapAdapting to tasks
Figure RE-GDA00034510389200000415
Figure RE-GDA00034510389200000416
Optionally, the semantic specific memory is a built-in semantic memory MSAnd configuration file memory MFI.e. embedding memory M by means of semanticsSAnd configuration file memory MFThereby initializing personality parameters χs
The initializing parameter χ of the fully-connected neural network model according to the two multilayer fully-connected neural network models and combining the semantic specific memory to optimize the learning result comprises the following steps:
definition configuration file fs
Figure RE-GDA0003451038920000051
Wherein
Figure RE-GDA0003451038920000052
A common dimension of the first dimension is different semantic dimensions;
computing semantic attention values
Figure RE-GDA0003451038920000053
Extending semantic profile vectors to and MFThe same dimensional space, i.e.
Figure RE-GDA0003451038920000054
The similarity between the two is calculated by cosine similarity, i.e.
Figure RE-GDA0003451038920000055
The semantic attention value a can be obtained by carrying out normalization through the sofamax functions
Semantic embedded memory
Figure RE-GDA0003451038920000056
Store all fast gradients, d, of the same shape as the parameters of the semantic embedding modelθsRepresenting the dimension of the parameter in the semantic embedded memory to obtain the personalized bias item bsBy passing
Figure RE-GDA0003451038920000057
In the initialization stage, the two memories are initialized randomly and updated in the training process;
MFwill be updated in the following way
Figure RE-GDA0003451038920000058
Where α is a hyper-parameter that controls how much new profile information is added; mSWill also be updated by:
Figure RE-GDA0003451038920000059
wherein
Figure RE-GDA00034510389200000510
Representing the training loss, α, β are hyper-parameters that control how much new information is retained.
In addition, to achieve the above object, the present invention further provides a learning apparatus of a cold start recommendation system, the apparatus including:
the extraction module is used for extracting the semantic enhancement task constructor of the meta-path from the configuration file so as to obtain semantic enhancement task data;
the building module is used for generating a context aggregation model according to the semantic enhancement task data so as to derive semantic embedding with context semantics;
the learning module is used for acquiring initial project embedding, and constructing two multilayer full-connection layer neural network models to learn the semantic embedding and the project embedding so as to acquire the semantic embedding and the project embedding which adapt to a new task;
the prediction module is used for estimating the rating of the user u on the item i through a preference prediction model according to the learned semantic embedding and item embedding;
the evaluation module is used for obtaining global context prior phi according to the semantic context evaluation loss caused by the combination of the aggregation model and the preference prediction model and a semantic adaptor on a specific meta-path p;
an adaptation module for adapting a global task prior omega to the task by a plurality of gradient descent steps according to the global context prior phi;
and the implementation module is used for initializing parameters of the fully-connected neural network model according to the two multilayer fully-connected neural network models and combining the semantic specific memory so as to optimize a learning result.
Furthermore, to achieve the above object, the present invention also provides a computer device, including a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the method according to any one of the above items when executing the computer program.
Furthermore, to achieve the above object, the present invention also proposes a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method as defined in any one of the above.
The semantic enhancement task constructor of the meta-path is extracted from the configuration file to obtain semantic enhancement task data; generating a context aggregation model according to the semantic enhancement task data to derive semantic embeddings with context semantics; obtaining initial project embedding, and constructing two multilayer full-connection layer neural network models to learn the semantic embedding and the project embedding so as to obtain the semantic embedding and the project embedding which are suitable for a new task; estimating the rating of the user u on the item i through a preference prediction model by the learned semantic embedding and item embedding; obtaining global context prior phi according to the semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adapter on a specific meta path p; adapting a global task prior omega to the task through a plurality of gradient descent steps according to the global context prior phi; and initializing parameters of the fully-connected neural network model by combining the two multilayer fully-connected neural network models with a semantic specific memory to optimize a learning result, so that the technical effect of quick adaptation is realized.
Drawings
FIG. 1 is a flowchart illustrating a meta-learning method of a cold-boot recommendation system according to a first embodiment of the present invention;
FIG. 2 is a diagram of a recommendation scenario in a first embodiment of a meta-learning method for a cold-start recommendation system of the present invention;
fig. 3 is a block diagram of a meta learning apparatus of a cold start recommendation system according to a first embodiment of the present invention.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
An embodiment of the present invention provides a meta-learning method for a cold-start recommendation system, and referring to fig. 1, fig. 1 is a flowchart illustrating a first embodiment of the meta-learning method for a cold-start recommendation system according to the present invention.
In this embodiment, the meta-learning method of the cold-start recommendation system includes the following steps:
step S10: and extracting the semantic enhancement task constructor of the meta-path from the configuration file to obtain semantic enhancement task data.
It will be appreciated that most conventional cold start recommendation systems initialize the base model f by learning global meta-parameters through a meta-learnerθThus, the global meta-parameter θ is also referred to as a global prior, which is optimized over a number of tasks and over one or several gradients, given a few instancesAfter the step stage, the new target task is quickly adapted, specifically, the meta-learning divides the task into a meta-training task and a meta-testing task, and for the meta-training task and the meta-testing task, a support set and a query set are included, which are mutually exclusive. The key point is to replace semantic contexts of different levels of meta-paths with tasks, which are defined as follows:
Figure RE-GDA0003451038920000081
wherein SuRepresenting a semantic enhanced support set, QuRepresenting a set of semantically enhanced queries, each
Figure RE-GDA0003451038920000082
Are mutually exclusive, they include the sets of support and queries from user u that are mutually exclusive, they include randomly selected items from the set of items scored by user u, and likewise, the sets of support and queries for semantic enhancement are defined as
Figure RE-GDA0003451038920000083
Wherein
Figure RE-GDA0003451038920000084
And
Figure RE-GDA0003451038920000085
of items rated by user uIn the collection of the images, the image data is collected,
Figure RE-GDA0003451038920000086
and
Figure RE-GDA0003451038920000087
representing the semantic context according to the set of meta-paths P.
It should be noted that the main task of the semantic enhancement task constructor is to construct various meta-paths for the data set, and these meta-paths are generalized and have various semantics, as shown in fig. 2. Specifically, in the recommendation scenario, it is assumed that the user u has not rated the item i, but the item that the user u rated has some relation to the unrated item i or the user that has some relation to the user u has some relation to the item i, and here, the relations may be defined as various semantics.
Step S20: and generating a context aggregation model according to the semantic enhancement task data to derive semantic embedding with context semantics.
Further, the initial user embedding e is given according to the semantic enhancement taskuAnd an
Figure RE-GDA0003451038920000091
Embedded list e of rating items ofiContext aggregation model gφThrough CuDeriving semantic embedding e with contextual semanticssThe formula is as follows:
Figure RE-GDA0003451038920000092
wherein C isuRepresenting a collection of items related to user u, caused by direct interaction or meta-path, is a context aggregation model with parameters of
Figure RE-GDA0003451038920000093
Figure RE-GDA0003451038920000094
Is a d-dimensional dense vector which is mainly used for embedding a plurality of features of a user.
Figure RE-GDA0003451038920000095
Is a feature embedding matrix, σ (-) is an activation function, and the formula is as follows:
Figure RE-GDA0003451038920000096
wherein a isiIs a fixed parameter within the interval (1, + ∞).
Step S30: obtaining initial project embedding, and constructing two multilayer full-connection layer neural network models to learn the semantic embedding and the project embedding so as to obtain the semantic embedding and the project embedding which are suitable for a new task.
Further, the step of obtaining initial project embedding, constructing two multilayer fully-connected layer neural network models, and learning the semantic embedding and the project embedding to obtain semantic embedding and project embedding adapted to a new task includes: acquiring initial project embedding; constructing two multilayer full-connection layer neural network models, wherein the network models are
Figure RE-GDA0003451038920000101
Wherein z (·) represents a fully connected layer; chi shapes,χiRespectively representing full connection layer parameters of learning semantic embedding and project embedding; and learning the semantic embedding and the project embedding according to the two multilayer fully-connected layer neural network models to obtain the semantic embedding and the project embedding which are suitable for new tasks.
Step S40: and estimating the rating of the user u on the item i by a preference prediction model through the learned semantic embedding and item embedding.
Further, the step of estimating the rating of the user u on the item i by the preference prediction model through the learned semantic embedding and item embedding comprises: estimating the rating of the user u on the item i by the learned semantic embedding and item embedding through a preference prediction model, wherein the preference prediction model has a specific formula
Figure RE-GDA0003451038920000102
Wherein the MLP is a multi-layer perceptron,
Figure RE-GDA0003451038920000103
denotes a cascade, here hwA rating prediction model parameterized by w, containing weights and biases in MLP.
Step S50: and obtaining global context prior phi according to the aggregation model and the preference prediction model and the semantic context evaluation loss caused by the semantic adapter on a specific meta path p.
Further, the semantic adaptor is a task
Figure RE-GDA0003451038920000104
Semantic enhanced support set of
Figure RE-GDA0003451038920000105
The step of obtaining global context prior phi according to the semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adaptor on a specific meta-path p comprises the following steps: specifying task T for user uuSupport set
Figure RE-GDA0003451038920000106
Is semantically contextual
Figure RE-GDA0003451038920000107
Enhancing; given P ∈ P, semantics embed esIn the semantic space of p is
Figure RE-GDA0003451038920000108
Computing task TuMedium rated term
Figure RE-GDA0003451038920000111
Loss of
Figure RE-GDA0003451038920000112
Wherein
Figure RE-GDA0003451038920000113
Representing the meta-path p induces a prediction rating for u on item i in the semantic space; by computing task T after gradient descentuLoss in each semantic space
Figure RE-GDA0003451038920000114
To obtain semantic priors for various aspects
Figure RE-GDA0003451038920000115
Wherein gamma denotes the semantic learning rate, wherein
Figure RE-GDA0003451038920000116
Is an adaptive model parameter given a user u and a semantic context p, and gamma represents the semantic learning rate. The adaptation is realized by a single step gradient descent, which is based on the gradient of the supervision loss calculated on the support set, and represents the semantically enhanced user
Figure RE-GDA0003451038920000117
While freezing the gradient to phi.
Step S60: adapting a global task prior omega to the task by a plurality of gradient descent steps according to the global context prior phi.
Further, said step of adapting a global task a priori ω to the task by a plurality of gradient descent steps according to said global context a priori Φ comprises: according to the overall situationContext a priori φ will support on set
Figure RE-GDA0003451038920000118
Is updated to
Figure RE-GDA0003451038920000119
Converting global priors omega to the same space
Figure RE-GDA00034510389200001110
Where &' is the product of elements,. kappa (-), can be viewed as a transfer function for several full-connection layer network implementations, finally using a gradient descent such that ω ispAdapting to tasks
Figure RE-GDA00034510389200001111
Figure RE-GDA00034510389200001112
Step S70: and initializing parameters of the fully-connected neural network model according to the two multilayer fully-connected neural network models and combining the semantic specific memory so as to optimize a learning result.
In the specific implementation, the semantic specific memory is specifically that a semantic embedded memory M is firstly constructedSAnd configuration file memory MFI.e. embedding memory M by means of semanticsSAnd configuration file memory MFThereby effectively initializing personality parameters xs. Configuration file memory MFInformation associated with the configuration file F for providing the search attention value as。asFor slave MSExtracting key information, wherein MSEach row of (a) holds a respective bias term. These two memory matrices will help in initializing χsGenerating personalized bias item bsI.e. chis←xs-τbs. τ is oneA hyper-parameter for controlling the initialization thetasDegree of personalization of the time; in detail, a semantic profile f is givens. The semantic meaning is a two-dimensional vector representation, the first dimensions of semantic vectors of different users are not necessarily consistent, and therefore, the memory of the configuration file is considered to be represented by a three-dimensional vector, namely
Figure RE-GDA0003451038920000121
Wherein
Figure RE-GDA0003451038920000122
A common dimension of the first dimension is a different semantic vector. Then calculating semantic attention value
Figure RE-GDA0003451038920000123
First, the semantic profile vector is extended to MFThe same dimensional space, i.e.
Figure RE-GDA0003451038920000124
The similarity between the two is calculated by cosine similarity, i.e.
Figure RE-GDA0003451038920000125
Finally, the semantic attention value a can be obtained by carrying out normalization through the sofamax functions
Semantic embedded memory
Figure RE-GDA0003451038920000126
All fast gradients are stored that are the same shape as the semantic embedding model parameters,
Figure RE-GDA0003451038920000127
representing the dimension of the parameter in semantic embedded memory. Then obtaining personalized bias item bsBy passing
Figure RE-GDA0003451038920000128
In the initialization phase, the two memories are initialized randomly and updated in the training process. MFWill be updated in the following way
Figure RE-GDA0003451038920000131
Where alpha is a hyper-parameter that controls how much new profile information is added. Similarly, MSWill also be updated by:
Figure RE-GDA0003451038920000132
wherein
Figure RE-GDA0003451038920000133
Representing the training loss, α, β are hyper-parameters that control how much new information is retained.
Further, the semantic specific memory is a structural semantic embedded memory MSAnd configuration file memory MFI.e. embedding memory M by means of semanticsSAnd configuration file memory MFThereby initializing personality parameters χs(ii) a The step of initializing parameters of the fully-connected neural network model according to the two multilayer fully-connected neural network models and combining the semantic specific memory to optimize a learning result χ comprises the following steps: definition configuration file fs
Figure RE-GDA0003451038920000134
Wherein
Figure RE-GDA0003451038920000135
A common dimension of the first dimension is different semantic dimensions; computing semantic attention values
Figure RE-GDA0003451038920000136
Vector semantic profilesExtend to and MFThe same dimensional space, i.e.
Figure RE-GDA0003451038920000137
The similarity between the two is calculated by cosine similarity, i.e.
Figure RE-GDA0003451038920000138
The semantic attention value a can be obtained by carrying out normalization through the sofamax functions(ii) a Semantic embedded memory
Figure RE-GDA0003451038920000139
Store all fast gradients, d, of the same shape as the parameters of the semantic embedding modelθsRepresenting the dimension of the parameter in the semantic embedded memory to obtain the personalized bias item bsBy passing
Figure RE-GDA00034510389200001310
In the initialization stage, the two memories are initialized randomly and updated in the training process;
MFwill be updated in the following way
Figure RE-GDA0003451038920000141
Where α is a hyper-parameter that controls how much new profile information is added; mSWill also be updated by:
Figure RE-GDA0003451038920000142
wherein
Figure RE-GDA0003451038920000143
Representing the training loss, α, β are hyper-parameters that control how much new information is retained.
It should be noted that the purpose of the co-adaptive meta learner is to optimize global prior θ ═ phi, ω, χ for different semantic tasks, which will pass through the meta training task
Figure RE-GDA0003451038920000144
The backward propagation of query loss of (1) optimizes φ, i.e.:
Figure RE-GDA0003451038920000145
it does not directly update the global prior with task data. In addition, the semantic embedding priors and the item embedding priors are also updated using gradient descent, by
Figure RE-GDA0003451038920000146
λ is a hyper-parameter and f represents the recommended model.
The embodiment obtains semantic enhancement task data by extracting a semantic enhancement task constructor of a meta path from a configuration file; generating a context aggregation model according to the semantic enhancement task data to derive semantic embeddings with context semantics; obtaining initial project embedding, and constructing two multilayer full-connection layer neural network models to learn the semantic embedding and the project embedding so as to obtain the semantic embedding and the project embedding which are suitable for a new task; estimating the rating of the user u on the item i through a preference prediction model by the learned semantic embedding and item embedding; obtaining global context prior phi according to the semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adapter on a specific meta path p; adapting a global task prior omega to the task through a plurality of gradient descent steps according to the global context prior phi; and initializing parameters of the fully-connected neural network model by combining the two multilayer fully-connected neural network models with a semantic specific memory to optimize a learning result, so that the technical effect of quick adaptation is realized.
Referring to fig. 3, fig. 3 is a block diagram illustrating a meta learning apparatus of a cold start recommendation system according to a first embodiment of the present invention.
As shown in fig. 3, the meta learning apparatus of the cold start recommendation system according to the embodiment of the present invention includes:
an extracting module 301, configured to extract a semantic enhanced task constructor of a meta-path from a configuration file to obtain semantic enhanced task data;
a building module 302, configured to generate a context aggregation model according to the semantic enhancement task data to derive semantic embeddings with context semantics;
the learning module 303 is configured to obtain initial project embedding, construct two multilayer fully-connected layer neural network models, and learn the semantic embedding and the project embedding to obtain semantic embedding and project embedding adapted to a new task;
a prediction module 304, configured to estimate a rating of the user u on the item i through a preference prediction model according to the learned semantic embedding and item embedding;
an evaluation module 305, configured to obtain a global context prior phi according to a semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adaptor on a specific meta-path p;
an adaptation module 306 for adapting a global task prior ω to a task by a plurality of gradient descent steps according to the global context prior Φ;
an implementation module 307, configured to initialize parameters of the fully-connected neural network model according to the two multi-layer fully-connected neural network models in combination with the semantic specific memory to optimize a learning result.
The embodiment obtains semantic enhancement task data by extracting a semantic enhancement task constructor of a meta path from a configuration file; generating a context aggregation model according to the semantic enhancement task data to derive semantic embeddings with context semantics; obtaining initial project embedding, and constructing two multilayer full-connection layer neural network models to learn the semantic embedding and the project embedding so as to obtain the semantic embedding and the project embedding which are suitable for a new task; estimating the rating of the user u on the item i through a preference prediction model by the learned semantic embedding and item embedding; obtaining global context prior phi according to the semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adapter on a specific meta path p; adapting a global task prior omega to the task through a plurality of gradient descent steps according to the global context prior phi; and initializing parameters of the fully-connected neural network model by combining the two multilayer fully-connected neural network models with a semantic specific memory to optimize a learning result, so that the technical effect of quick adaptation is realized.
In an embodiment, the building module 302 is further configured to give an initial user-embedded e according to the semantic enhancement taskuAnd an
Figure RE-GDA0003451038920000161
Embedded list e of rating items ofiContext aggregation model gφThrough CuDeriving semantic embedding e with contextual semanticssThe formula is as follows:
Figure RE-GDA0003451038920000162
wherein C isuRepresenting a collection of items related to user u, caused by direct interaction or meta-path, is a context aggregation model with parameters of
Figure RE-GDA0003451038920000163
Figure RE-GDA0003451038920000164
Is a d-dimensional dense vector which is mainly used for embedding a plurality of features of a user.
Figure RE-GDA0003451038920000165
Is a feature embedding matrix, σ (-) is an activation function, and the formula is as follows:
Figure RE-GDA0003451038920000171
wherein a isiIs a fixed parameter within the interval (1, + ∞).
In an embodiment, the learning module 303 is further configured to obtain an initial item embedding; constructing two multilayer full-connection layer neural network models, wherein the network models are
Figure RE-GDA0003451038920000172
Wherein z (·) represents a fully connected layer; chi shapes,χiRespectively representing full connection layer parameters of learning semantic embedding and project embedding; and learning the semantic embedding and the project embedding according to the two multilayer fully-connected layer neural network models to obtain the semantic embedding and the project embedding which are suitable for new tasks.
In an embodiment, the prediction module 304 is further configured to estimate the rating of the user u on the item i by using a preference prediction model, wherein the preference prediction model is specifically formulated as
Figure RE-GDA0003451038920000173
Wherein the MLP is a multi-layer perceptron,
Figure RE-GDA0003451038920000174
representing a cascade. Where h iswA rating prediction model parameterized by w, containing weights and biases in MLP.
In an embodiment, the evaluation module 305 is further configured to specify a task T of the user uuSupport set
Figure RE-GDA0003451038920000175
Is semantically contextual
Figure RE-GDA0003451038920000176
Enhancing; given P ∈ P, semantics embed esIn the semantic space of p is
Figure RE-GDA0003451038920000177
Computing task TuMedium rated term
Figure RE-GDA0003451038920000178
Loss of
Figure RE-GDA0003451038920000179
Wherein
Figure RE-GDA0003451038920000181
Representing the meta-path p induces a prediction rating for u on item i in the semantic space;
by computing task T after gradient descentuLoss in each semantic space
Figure RE-GDA0003451038920000182
y to obtain semantic priors for various aspects
Figure RE-GDA0003451038920000183
Where gamma represents the semantic learning rate.
In an embodiment, the adapting module 306 is further configured to adapt the value of phi on the support set according to the global context a priori
Figure RE-GDA0003451038920000184
Is updated to
Figure RE-GDA0003451038920000185
Converting global priors omega to the same space
Figure RE-GDA0003451038920000186
Where &' is the product of elements,. kappa (-), can be viewed as a transfer function for several full-connection layer network implementations, finally using a gradient descent such that ω ispAdapting to tasks
Figure RE-GDA00034510389200001814
Figure RE-GDA0003451038920000187
In an embodiment, the implementation module 307 is further configured to determine the semantic profile fs
Figure RE-GDA0003451038920000189
Wherein
Figure RE-GDA00034510389200001810
A common dimension of the first dimension is different semantic dimensions; computing semantic attention values
Figure RE-GDA00034510389200001811
Extending semantic profile vectors to and MFThe same dimensional space, i.e.
Figure RE-GDA00034510389200001812
The similarity between the two is calculated by cosine similarity, i.e.
Figure RE-GDA00034510389200001813
The semantic attention value a can be obtained by carrying out normalization through the sofamax functions
Semantic embedded memory
Figure RE-GDA0003451038920000191
Store all fast gradients, d, of the same shape as the parameters of the semantic embedding modelθsRepresenting the dimension of the parameter in the semantic embedded memory to obtain the personalized bias item bsBy passing
Figure RE-GDA0003451038920000192
In the initialization stage, the two memories are initialized randomly and updated in the training process; mFWill be updated in the following way
Figure RE-GDA0003451038920000193
Where α is a hyper-parameter that controls how much new profile information is added; mSWill also be updated by:
Figure RE-GDA0003451038920000194
wherein
Figure RE-GDA0003451038920000195
Representing the training loss, α, β are hyper-parameters that control how much new information is retained.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., a rom/ram, a magnetic disk, an optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A meta-learning method for a cold start recommendation system, the method comprising:
extracting a semantic enhancement task constructor of a meta path from the configuration file to obtain semantic enhancement task data;
generating a context aggregation model according to the semantic enhancement task data to derive semantic embeddings with context semantics;
obtaining initial project embedding, and constructing two multilayer full-connection layer neural network models to learn the semantic embedding and the project embedding so as to obtain the semantic embedding and the project embedding which are suitable for a new task;
estimating the rating of the user u on the item i through a preference prediction model by the learned semantic embedding and item embedding;
obtaining global context prior phi according to the semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adapter on a specific meta path p;
adapting a global task prior omega to the task through a plurality of gradient descent steps according to the global context prior phi;
and initializing parameters of the fully-connected neural network model according to the two multilayer fully-connected neural network models and combining the semantic specific memory so as to optimize a learning result.
2. The method of claim 1, wherein the step of generating a context aggregation model from the semantic enhanced task data to derive semantic embeddings with contextual semantics comprises:
giving initial user embedding e according to the semantic enhancement taskuAnd an
Figure FDA0003377652610000011
Embedded list e of rating items ofiContext aggregation model gφThrough CuDeriving semantic embedding e with contextual semanticssThe formula is as follows:
Figure FDA0003377652610000012
wherein C isuRepresenting a collection of items related to user u, caused by direct interaction or meta-path, is a context aggregation model with parameters of
Figure FDA0003377652610000013
Figure FDA0003377652610000014
Is a d-dimensional dense vector which is mainly used for embedding a plurality of features of a user.
Figure FDA0003377652610000015
Is a feature embedding matrix, σ (-) is an activation function, and the formula is as follows:
Figure FDA0003377652610000021
wherein a isiIs a fixed parameter within the interval (1, + ∞).
3. The method of claim 1, wherein the obtaining of initial item embedding, building two multi-layered fully-connected layer neural network models learning the semantic embedding and the item embedding to obtain semantic embedding and item embedding adapted to new tasks, comprises:
acquiring initial project embedding;
constructing two multilayer full-connection layer neural network models, wherein the network models are
Figure FDA0003377652610000022
Wherein z (·) represents a fully connected layer; chi shapes,χiRespectively representing full connection layer parameters of learning semantic embedding and project embedding;
and learning the semantic embedding and the item embedding according to the two multilayer fully-connected layer neural network models to obtain the learned semantic embedding and item embedding.
4. The method of claim 1, wherein the step of estimating a rating of user u on item i by a preference prediction model from the learned semantics embedding and item embedding comprises:
estimating the rating of the user u on the item i by the learned semantic embedding and item embedding through a preference prediction model, wherein the preference prediction model has a specific formula
Figure FDA0003377652610000023
Wherein the MLP is a multi-layer perceptron,
Figure FDA0003377652610000024
denotes a cascade, here hwA rating prediction model parameterized by w, containing weights and biases in MLP.
5. The method of claim 1, wherein the semantic adaptor is a task
Figure FDA0003377652610000025
Semantic enhanced support set of
Figure FDA0003377652610000026
The step of obtaining global context prior phi according to the semantic context evaluation loss caused by the aggregation model and the preference prediction model in combination with a semantic adaptor on a specific meta-path p comprises the following steps:
specifying task T for user uuSupport set
Figure FDA0003377652610000031
Is semantically contextual
Figure FDA0003377652610000032
Enhancing;
given P ∈ P, semantics embed esIn the semantic space of p is
Figure FDA0003377652610000033
Computing task TuMedium rated term
Figure FDA0003377652610000034
Loss of
Figure FDA0003377652610000035
Wherein
Figure FDA0003377652610000036
Representing the meta-path p induces a prediction rating for u on item i in the semantic space;
by computing task T after gradient descentuLoss in each semantic space
Figure FDA0003377652610000037
To obtain semantic priors for various aspects
Figure FDA0003377652610000038
Where gamma represents the semantic learning rate.
6. The method of claim 1, wherein said step of adapting a global task a priori ω to a task through a plurality of gradient descent steps according to said global context a priori Φ comprises:
a priori φ on support set based on the global context
Figure FDA0003377652610000039
Is updated to
Figure FDA00033776526100000310
Converting global priors omega to the same space
Figure FDA00033776526100000311
WhereinAs a product of elements, κ (·) can be viewed as a transfer function for several full-connection layer network implementations, finally using a gradient descent such that ω ispAdapting to tasks
Figure FDA00033776526100000312
Figure FDA00033776526100000313
7. The method of claim 1, wherein the semantic specific memory is a build semantic embedded memory MSAnd configuration file memory MFI.e. embedding memory M by means of semanticsSAnd configuration file memory MFThereby initializing personality parameters χs
The step of initializing parameters of the fully-connected neural network model according to the two multilayer fully-connected neural network models and combining the semantic specific memory to optimize a learning result comprises the following steps:
definition configuration file fs
Figure FDA0003377652610000041
Wherein
Figure FDA0003377652610000042
A common dimension of the first dimension is different semantic dimensions;
computing semantic attention values
Figure FDA0003377652610000043
Extending semantic profile vectors to and MFThe same dimensional space, i.e.
Figure FDA0003377652610000044
The similarity between the two is calculated by cosine similarity, i.e.
Figure FDA0003377652610000045
The semantic attention value a can be obtained by carrying out normalization through the sofamax functions
Semantic embedded memory
Figure FDA0003377652610000046
All fast gradients are stored with the same shape as the semantic embedded model parameters,
Figure FDA0003377652610000047
representing the dimension of the parameter in the semantic embedded memory to obtain the personalized bias item bsBy passing
Figure FDA0003377652610000048
In the initialization stage, the two memories are initialized randomly and updated in the training process;
MFwill be updated in the following way
Figure FDA0003377652610000049
Where α is a hyper-parameter that controls how much new profile information is added; mSWill also be updated by:
Figure FDA00033776526100000410
wherein
Figure FDA00033776526100000411
Representing the loss of training, alpha, beta being a super-value controlling how much new information is retainedAnd (4) parameters.
8. A meta-learning apparatus for a cold start recommendation system, the apparatus comprising:
the extraction module is used for extracting the semantic enhancement task constructor of the meta-path from the configuration file so as to obtain semantic enhancement task data;
the building module is used for generating a context aggregation model according to the semantic enhancement task data so as to derive semantic embedding with context semantics;
the learning module is used for acquiring initial project embedding, and constructing two multilayer full-connection layer neural network models to learn the semantic embedding and the project embedding so as to acquire the semantic embedding and the project embedding which adapt to a new task;
the prediction module is used for estimating the rating of the user u on the item i through a preference prediction model according to the learned semantic embedding and item embedding;
the evaluation module is used for obtaining global context prior phi according to the semantic context evaluation loss caused by the combination of the aggregation model and the preference prediction model and a semantic adaptor on a specific meta-path p;
an adaptation module for adapting a global task prior omega to the task by a plurality of gradient descent steps according to the global context prior phi;
and the implementation module is used for initializing parameters of the fully-connected neural network model according to the two multilayer fully-connected neural network models and combining the semantic specific memory so as to optimize a learning result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202111424274.XA 2021-11-26 2021-11-26 Meta learning method, device, equipment and storage medium of cold start recommendation system Pending CN114077662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111424274.XA CN114077662A (en) 2021-11-26 2021-11-26 Meta learning method, device, equipment and storage medium of cold start recommendation system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111424274.XA CN114077662A (en) 2021-11-26 2021-11-26 Meta learning method, device, equipment and storage medium of cold start recommendation system

Publications (1)

Publication Number Publication Date
CN114077662A true CN114077662A (en) 2022-02-22

Family

ID=80284347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111424274.XA Pending CN114077662A (en) 2021-11-26 2021-11-26 Meta learning method, device, equipment and storage medium of cold start recommendation system

Country Status (1)

Country Link
CN (1) CN114077662A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023185320A1 (en) * 2022-03-30 2023-10-05 腾讯科技(深圳)有限公司 Cold start object recommendation method and apparatus, computer device and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023185320A1 (en) * 2022-03-30 2023-10-05 腾讯科技(深圳)有限公司 Cold start object recommendation method and apparatus, computer device and storage medium

Similar Documents

Publication Publication Date Title
US11443162B2 (en) Resource constrained neural network architecture search
CN108959396B (en) Machine reading model training method and device and question and answer method and device
EP3711000B1 (en) Regularized neural network architecture search
CN107066464B (en) Semantic natural language vector space
AU2016256753B2 (en) Image captioning using weak supervision and semantic natural language vector space
CN111581510A (en) Shared content processing method and device, computer equipment and storage medium
JP5734460B2 (en) Method and system for comparing images
CN110795527B (en) Candidate entity ordering method, training method and related device
CN112131890A (en) Method, device and equipment for constructing intelligent recognition model of conversation intention
JP2020061173A (en) Answer learning device, answer learning method, answer generating device, answer generating method, and program
CN111651576B (en) Multi-round reading understanding method based on transfer learning
CN110598123B (en) Information retrieval recommendation method, device and storage medium based on image similarity
CN114077662A (en) Meta learning method, device, equipment and storage medium of cold start recommendation system
CN109858031B (en) Neural network model training and context prediction method and device
CN111242162A (en) Training method and device of image classification model, medium and electronic equipment
JP2019028484A (en) Attribute identification apparatus, attribute identification model learning apparatus, method and program
CN112287140A (en) Image retrieval method and system based on big data
CN111191065A (en) Homologous image determining method and device
CN114443916B (en) Supply and demand matching method and system for test data
Zerrouk et al. Evolutionary algorithm for optimized CNN architecture search applied to real-time boat detection in aerial images
CN112766288B (en) Image processing model construction method, device, electronic equipment and readable storage medium
CN110659962A (en) Commodity information output method and related device
Fan et al. Online data clustering using variational learning of a hierarchical dirichlet process mixture of dirichlet distributions
JP7133687B1 (en) SEARCH DEVICE, SEARCH METHOD, AND PROGRAM
CN117252665B (en) Service recommendation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination