CN111179031A

CN111179031A - Training method, device and system for commodity recommendation model

Info

Publication number: CN111179031A
Application number: CN201911337899.5A
Authority: CN
Inventors: 刘正夫
Original assignee: 4Paradigm Beijing Technology Co Ltd
Current assignee: 4Paradigm Beijing Technology Co Ltd
Priority date: 2019-12-23
Filing date: 2019-12-23
Publication date: 2020-05-19
Anticipated expiration: 2039-12-23
Also published as: CN111179031B

Abstract

The invention discloses a training method, a device and a system of a commodity recommendation model, wherein the method comprises the following steps: acquiring interactive data of a target user for executing various interactive operations on a target commodity and an initial sample set for training a commodity recommendation model; training an initial commodity recommendation model by adopting a preset machine learning algorithm based on the initial sample set; combining the interactive data of various interactive operations according to the initial commodity recommendation model and the initial sample set to obtain combined interactive data; based on a preset graph neural network, obtaining the user characteristics of each target user and the commodity characteristics of each target commodity according to the combined interactive data; constructing a new sample set according to the user characteristics, the commodity characteristics and the initial sample set; and training a final commodity recommendation model by adopting a machine learning algorithm based on the new sample set.

Description

Training method, device and system for commodity recommendation model

Technical Field

The invention relates to the technical field of commodity recommendation, in particular to a training method of a commodity recommendation model, a training device of the commodity recommendation model, a system comprising at least one computing device and at least one storage device and a readable storage medium.

Background

In the internet era, a great deal of interaction data between users and commodities is generated every day. This interaction data contains rich information, which can be used to construct commodity recommendation models for precision marketing and thus serve users better. In current industry practice, a recommendation model is generally built by extracting features, constructing training samples from those features, and then training the model on the samples.

In the prior art, when extracting features for training a recommendation model, only specified interactive operations performed by a user for a commodity are usually considered, and other interactive operations performed by the user for the commodity are rarely considered, so that the accuracy of commodity recommendation performed on the user through the recommendation model is low.

Disclosure of Invention

The invention aims to provide a new technical scheme for training a commodity recommendation model.

According to a first aspect of the present invention, there is provided a training method for a commodity recommendation model, including:

acquiring interaction data of target users performing multiple interaction operations on target commodities, and an initial sample set for training a commodity recommendation model, wherein each original sample in the initial sample set comprises a plurality of selected features and a label, and the label indicates whether the target user performed a specified interaction operation on the target commodity when the corresponding original sample was generated;

training an initial commodity recommendation model by adopting a preset machine learning algorithm based on the initial sample set;

merging the interactive data of various interactive operations according to the initial commodity recommendation model and the initial sample set to obtain merged interactive data;

based on a preset graph neural network, obtaining the user characteristics of each target user and the commodity characteristics of each target commodity according to the combined interaction data;

constructing a new sample set according to the user characteristics, the commodity characteristics and the initial sample set;

and training a final commodity recommendation model by adopting the machine learning algorithm based on the new sample set.

Optionally, the merging, according to the initial commodity recommendation model and the initial sample set, interaction data of multiple interaction operations to obtain merged interaction data includes:

based on the initial commodity recommendation model, obtaining a prediction matching degree between each target user and each target commodity according to the selected features;

constructing a prediction matching degree matrix according to the prediction matching degree between each target user and each target commodity;

respectively constructing an interaction matrix corresponding to the interaction operation according to the interaction data of each interaction operation;

and training a preset machine learning model according to the prediction matching degree matrix and each interaction matrix to obtain the combined interaction data.

Optionally, the training a preset machine learning model according to the prediction matching degree matrix and each interaction matrix to obtain the merged interaction data includes:

establishing an expression of a combined interaction matrix according to the interaction matrix by taking undetermined parameters of the machine learning model as variables;

constructing a first loss function according to the prediction matching degree matrix and the expression of the merged interaction matrix;

solving the first loss function, and determining the value of undetermined parameters of the machine learning model to obtain the combined interaction matrix;

and obtaining the merged interactive data according to the merged interactive matrix.

Optionally, the obtaining the merged interactive data according to the merged interactive matrix includes:

determining a total number of elements and a non-null rate in an interaction matrix of the specified interaction operation; wherein the non-null rate is a ratio of a number of non-zero elements to the total number of elements;

determining a score threshold according to the total number of elements and the non-null value rate;

and adjusting the element value of which the numerical value is less than or equal to the score threshold value in the combined interactive matrix to a first set value, and adjusting the element value of which the numerical value is greater than the score threshold value to a second set value to obtain the combined interactive data.

Optionally, the expression of the merged interaction matrix is represented as:

$$P = \mathrm{sigmoid}\left(b + \sum_{i} w_i X_i\right)$$

wherein $X_i$ is the interaction matrix of the i-th interaction operation; $w_i$ and $b$ are the undetermined parameters of the machine learning model; and $P$ is the merged interaction matrix.

Optionally, the first loss function is represented as:

wherein $K_{i,j}$ is the element value in the i-th row and j-th column of the prediction matching degree matrix; $P_{i,j}$ is the element value in the i-th row and j-th column of the merged interaction matrix; m is the number of rows of the matrices; and n is the number of columns of the matrices.

Optionally, the obtaining, based on the preset graph neural network, the user characteristic of each target user and the commodity characteristic of each target commodity according to the merged interaction data includes:

respectively constructing a graph neural network corresponding to each target user and each target commodity according to the merged interactive data; wherein, the number of layers of each graph neural network is the same;

training the graph neural network of each target user and each target commodity according to the combined interactive data, and obtaining the value of each target user in the hidden layer of the corresponding graph neural network as the user characteristic of the corresponding target user; and obtaining the value of each target commodity in the hidden layer of the corresponding graph neural network as the commodity characteristic of the corresponding target commodity.

Optionally, the training of the graph neural network of each target user and each target commodity according to the merged interaction data includes:

for each target commodity and each target user, constructing an expression of a hidden layer with the undetermined parameters of the corresponding graph neural network as variables; wherein the undetermined parameters of the graph neural networks at the same layer are the same;

constructing an expression of the distance between each target commodity and each target user according to the expression of the hidden layer of each target commodity and the expression of the hidden layer of each target user;

constructing a second loss function according to the expression of the distance between each target commodity and each target user and the combined interactive data;

and solving the second loss function, and determining the value of the parameter to be determined of each graph neural network.

Optionally, taking each target commodity and each target user in turn as target nodes;

the expression of the hidden layer of the target node is expressed as:

$$h^{0} = x$$

wherein $h^{k}$ is the value of the hidden layer of the target node at the k-th layer; $x$ is the initial value of the target node; $N(v)$ denotes the nodes connected to the target node in the graph neural network; $\sigma$ is an activation function; and $W_k$ and $B_k$ are the undetermined coefficients of the k-th layer of the graph neural network.
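The layer-to-layer recurrence itself is not reproduced in this text, so the following Python sketch is only an illustration under a stated assumption: it uses a mean-aggregation update, $h_v^{k} = \sigma(W_k \cdot \mathrm{mean}_{u \in N(v)} h_u^{k-1} + B_k h_v^{k-1})$, chosen because it uses exactly the symbols listed above. The function names, the graph, and all numeric values are hypothetical.

```python
import math

def matvec(matrix, vector):
    """Multiply a small dense matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def hidden_layer(h_prev, neighbors, W, B):
    """Assumed update: sigma(W * mean of neighbor values + B * own value).

    Every node is expected to have at least one neighbor.
    """
    h_next = {}
    for node, own in h_prev.items():
        aggregated = mean_vectors([h_prev[u] for u in neighbors[node]])
        z = [a + b for a, b in zip(matvec(W, aggregated), matvec(B, own))]
        h_next[node] = [1.0 / (1.0 + math.exp(-value)) for value in z]
    return h_next

# Hypothetical bipartite graph from merged interaction data: U1-P1, U1-P2, U2-P2.
neighbors = {"U1": ["P1", "P2"], "U2": ["P2"], "P1": ["U1"], "P2": ["U1", "U2"]}
h0 = {"U1": [1.0, 0.0], "U2": [0.0, 1.0], "P1": [1.0, 1.0], "P2": [0.5, 0.5]}
W = [[0.5, 0.0], [0.0, 0.5]]  # shared by all nodes at this layer
B = [[1.0, 0.0], [0.0, 1.0]]  # shared by all nodes at this layer
h1 = hidden_layer(h0, neighbors, W, B)
```

Sharing `W` and `B` across all nodes mirrors the statement that the undetermined parameters of the graph neural networks at the same layer are the same.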

Optionally, the expression of the second loss function is represented as:

$$L_2 = \sum_{i,j} (y_{ij} - P_{ij})^2$$

wherein $P_{ij}$ is the element value corresponding to the i-th target user and the j-th target commodity in the merged interaction matrix, and $y_{ij}$ is the distance between the i-th target user and the j-th target commodity.
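As a small illustration, the second loss function $L_2 = \sum_{i,j}(y_{ij} - P_{ij})^2$ can be evaluated as below. The sketch takes the distance matrix as given, because how the distance is computed from the hidden-layer values is not specified in this passage. The function name and the sample values are hypothetical.

```python
def second_loss(distances, merged):
    """Sum of squared differences between distances y_ij and merged values P_ij."""
    return sum(
        (y - p) ** 2
        for y_row, p_row in zip(distances, merged)
        for y, p in zip(y_row, p_row)
    )

# Hypothetical 2 x 2 example: rows are target users, columns are target commodities.
y = [[0.9, 0.2], [0.1, 0.7]]
P = [[1, 0], [0, 1]]
loss = second_loss(y, P)
```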

Optionally, the machine learning algorithm is a GBDT algorithm.

Optionally, the constructing a new sample set according to the user characteristics, the commodity characteristics, and the initial sample set includes:

and adding the user characteristics of the corresponding target user and the commodity characteristics of the corresponding target commodity in each initial sample to obtain a new sample.
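A minimal Python sketch of this sample construction follows. The two initial samples are the first two rows of Table 4; the user and commodity feature vectors, the tuple layout, and the function name are hypothetical and serve only to show the concatenation.

```python
def build_new_samples(initial_samples, user_features, commodity_features):
    """Append the user and commodity features to the selected features of each sample."""
    new_samples = []
    for user, commodity, features, label in initial_samples:
        extended = (
            list(features)
            + list(user_features[user])
            + list(commodity_features[commodity])
        )
        new_samples.append((user, commodity, extended, label))
    return new_samples

# (user number, commodity number, [f1..f4], label), taken from Table 4.
initial_samples = [
    ("U1", "P1", [2, 1, 3, 11], 1),
    ("U1", "P2", [4, 2, 5, 12], 0),
]
# Hypothetical hidden-layer values standing in for the learned features.
user_features = {"U1": [0.3, 0.7]}
commodity_features = {"P1": [0.9, 0.1], "P2": [0.2, 0.4]}
new_samples = build_new_samples(initial_samples, user_features, commodity_features)
```

The labels are left unchanged, so the same machine learning algorithm can be trained on the extended feature vectors.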

Optionally, the method further includes:

acquiring a characteristic value of a selected characteristic of at least one preset candidate commodity corresponding to a user to be recommended and new interaction data of the user to be recommended for executing the multiple interaction operations on each candidate commodity;

obtaining the user characteristics of the user to be recommended and the commodity characteristics of each candidate commodity according to the new interaction data;

based on the final commodity recommendation model, obtaining a recommendation score of each candidate commodity corresponding to the user to be recommended according to the feature value of the selected feature of each candidate commodity corresponding to the user to be recommended, the user feature of the user to be recommended and the commodity feature of the candidate commodity;

and selecting the candidate commodities of which the recommendation scores accord with preset recommendation conditions as recommended commodities to be recommended to the user to be recommended.

Optionally, the step of selecting the candidate product of which the recommendation score meets the preset recommendation condition as the recommended product to be recommended to the user to be recommended includes:

sorting the candidate commodities in a descending order according to the recommendation score, and acquiring a sorting order of each candidate commodity;

and selecting candidate commodities with the sorting order meeting the preset sorting range, and recommending the candidate commodities to the user to be recommended as recommended commodities.
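The sorting and selection can be sketched in Python as below. This assumes that the preset sorting range means "the first `top_k` positions"; that reading, the function name, and the scores are hypothetical.

```python
def recommend(scores, top_k=2):
    """Rank candidates by descending recommendation score and keep the first top_k."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    order = {commodity: rank for rank, (commodity, _) in enumerate(ranked, start=1)}
    selected = [commodity for commodity, _ in ranked[:top_k]]
    return order, selected

# Hypothetical recommendation scores for one user to be recommended.
scores = {"P1": 0.31, "P2": 0.87, "P3": 0.55, "P4": 0.12}
order, selected = recommend(scores, top_k=2)
```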

Optionally, the method further includes:

and displaying the candidate commodities and the sorting order of each candidate commodity.

According to a second aspect of the present invention, there is provided a training apparatus for a commodity recommendation model, comprising:

a data acquisition module, configured to acquire interaction data of target users performing multiple interaction operations on target commodities, and an initial sample set for training a commodity recommendation model, wherein each original sample in the initial sample set comprises a plurality of selected features and a label, and the label indicates whether the target user performed a specified interaction operation on the target commodity when the corresponding original sample was generated;

an initial training module, configured to train an initial commodity recommendation model by adopting a preset machine learning algorithm based on the initial sample set;

a data merging module, configured to merge the interaction data of the multiple interaction operations according to the initial commodity recommendation model and the initial sample set to obtain merged interaction data;

a feature generation module, configured to obtain the user characteristics of each target user and the commodity characteristics of each target commodity according to the merged interaction data based on a preset graph neural network;

a sample construction module, configured to construct a new sample set according to the user characteristics, the commodity characteristics and the initial sample set; and

a final training module, configured to train a final commodity recommendation model by adopting the machine learning algorithm based on the new sample set.

Optionally, the data merging module is further configured to:

Optionally, the expression of the merged interaction matrix is represented as:

Optionally, the first loss function is represented as:

wherein $K_{i,j}$ is the element value in the i-th row and j-th column of the prediction matching degree matrix; $P_{i,j}$ is the element value in the i-th row and j-th column of the merged interaction matrix; m is the number of rows of the matrices; and n is the number of columns of the matrices.

Optionally, the feature generation module is further configured to:

the expression of the hidden layer of the target node is expressed as:

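$$h^{0} = x$$

wherein $h^{k}$ is the value of the hidden layer of the target node at the k-th layer; $x$ is the initial value of the target node; $N(v)$ denotes the nodes connected to the target node in the graph neural network; $\sigma$ is an activation function; and $W_k$ and $B_k$ are the undetermined coefficients of the k-th layer of the graph neural network.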

Optionally, the expression of the second loss function is represented as:

$$L_2 = \sum_{i,j} (y_{ij} - P_{ij})^2$$

Optionally, the machine learning algorithm is a GBDT algorithm.

Optionally, the sample construction module is further configured to:

Optionally, the apparatus further includes:

a module for acquiring a feature value of a selected feature of at least one preset candidate commodity corresponding to a user to be recommended, and new interaction data of the user to be recommended performing the multiple interaction operations on each candidate commodity;

a module for obtaining the user characteristics of the user to be recommended and the commodity characteristics of each candidate commodity according to the new interaction data;

a module for obtaining, based on the final commodity recommendation model, a recommendation score of each candidate commodity corresponding to the user to be recommended according to the feature value of the selected feature of each candidate commodity corresponding to the user to be recommended, the user characteristics of the user to be recommended and the commodity characteristics of the candidate commodity;

and a module for selecting the candidate commodities whose recommendation scores meet preset recommendation conditions as recommended commodities to be recommended to the user to be recommended.

Optionally, the selecting the candidate goods of which the recommendation scores meet the preset recommendation conditions as the recommended goods to be recommended to the user to be recommended includes:

Optionally, the apparatus further includes:

a module for presenting the candidate items, and the rank order of each candidate item.

According to a third aspect of the present invention there is provided a system comprising at least one computing device and at least one storage device, wherein the at least one storage device is arranged to store instructions for controlling the at least one computing device to perform the method according to the first aspect of the present invention.

According to a fourth aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method according to the first aspect of the present invention.

In the embodiment of the invention, the user characteristics of each target user and the commodity characteristics of each target commodity are obtained on the basis of the graph neural network through the interactive data of the target user for executing various interactive operations on the target commodity, and the final commodity recommendation model is obtained through training according to the user characteristics and the commodity characteristics, so that the final commodity recommendation model can learn useful information more easily, and the recommendation effect of the final commodity recommendation model can be further improved.

Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.

FIG. 1 is a block diagram of an example hardware configuration of an electronic device that can be used to implement an embodiment of the present invention;

FIG. 2 is a flowchart of a training method for a commodity recommendation model according to an embodiment of the present invention;

FIG. 3 is a flowchart of the steps of obtaining merged interaction data according to an embodiment of the present invention;

FIG. 4 is a diagram illustrating results of a neural network according to an embodiment of the present invention;

FIG. 5 is a block diagram of a training apparatus for a commodity recommendation model according to an embodiment of the present invention;

FIG. 6 is a schematic block diagram of a system according to an embodiment of the present invention.

Detailed Description

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.

The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.

Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.

In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.

Various embodiments and examples according to embodiments of the present invention are described below with reference to the accompanying drawings.

< hardware configuration >

FIG. 1 is a block diagram showing a hardware configuration of an electronic device 1000 that can implement an embodiment of the present invention.

The electronic device 1000 may be a laptop computer, a desktop computer, a mobile phone, a tablet computer, or the like. As shown in FIG. 1, the electronic device 1000 may include a processor 1100, a memory 1200, an interface device 1300, a communication device 1400, a display device 1500, an input device 1600, a speaker 1700, a microphone 1800, and the like. The processor 1100 may be a central processing unit (CPU), a microcontroller unit (MCU), or the like. The memory 1200 includes, for example, a ROM (read-only memory), a RAM (random access memory), and a nonvolatile memory such as a hard disk. The interface device 1300 includes, for example, a USB interface and a headphone interface. The communication device 1400 is capable of wired or wireless communication, for example, and may specifically support Wi-Fi, Bluetooth, and 2G/3G/4G/5G communication. The display device 1500 is, for example, a liquid crystal display panel or a touch panel. The input device 1600 may include, for example, a touch screen, a keyboard, and a somatosensory input device. A user can output and input voice information through the speaker 1700 and the microphone 1800, respectively.

The electronic device shown in FIG. 1 is merely illustrative and is in no way meant to limit the invention, its application, or uses. In an embodiment of the present invention, the memory 1200 of the electronic device 1000 is configured to store instructions that control the processor 1100 to execute any one of the training methods for a commodity recommendation model provided by the embodiments of the present invention. Those skilled in the art will appreciate that although a plurality of devices are shown for the electronic device 1000 in FIG. 1, the present invention may involve only some of them; for example, the electronic device 1000 may involve only the processor 1100 and the memory 1200. The skilled person can design the instructions according to the disclosed solution. How the instructions control the operation of the processor is well known in the art and is not described in detail here.

< method examples >

In this embodiment, a training method of a commodity recommendation model is provided. The training method of the commodity recommendation model can be implemented by electronic equipment. The electronic device may be the electronic device 1000 as shown in fig. 1.

As shown in fig. 2, the training method of the product recommendation model in this embodiment may include the following steps S2100 to S2600:

Step S2100: acquiring interaction data of target users performing multiple interaction operations on target commodities, and an initial sample set used for training a commodity recommendation model.

Each original sample in the initial sample set comprises a plurality of selected features and labels, and the labels represent whether specified interactive operations are executed by a target user aiming at a target commodity when the corresponding original sample is generated.

The interaction data may indicate whether the target user performs a corresponding interaction operation for the target good, which may include, for example, a purchase, a click, or a search.

In one embodiment of the present invention, the interaction data corresponding to each interaction operation may be a data table as shown in tables 1-3 below. In the data table, when a target user executes corresponding interactive operation aiming at a target commodity, the value of a corresponding position is 1; and when the target user does not execute corresponding interactive operation aiming at the target commodity, the value of the corresponding position is 0.

For example, Table 1 may be an interaction data table in which the interaction operation is a purchase; Table 2 may be an interaction data table in which the interaction operation is a click; and Table 3 may be an interaction data table in which the interaction operation is a search. Here U denotes the user number of a target user and P denotes the commodity number of a target commodity.

TABLE 1

TABLE 2

TABLE 3

In an embodiment of the present invention, the specified interactive operation may be one of the aforementioned interactive operations, and specifically may be specified in advance according to an application scenario or a specific requirement. For example, the specified interaction may be a purchase.

In one embodiment of the present invention, the initial sample set may be a data table as shown in Table 4 below. The selected features of each original sample may include features f1 to f4, and each original sample corresponds to a combination of a target user and a target commodity, specifically a combination of the user number of the target user and the commodity number of the target commodity.

When the target user has performed the specified interaction operation on the target commodity, the value in the corresponding label position in Table 4 is 1; when the target user has not performed the specified interaction operation on the target commodity, the value in the corresponding label position in Table 4 is 0.

TABLE 4

User number	Commodity number	f1	f2	f3	f4	Label
U1	P1	2	1	3	11	1
U1	P2	4	2	5	12	0
U2	P3	20	10	15	10	1
U3	P2	25	13	12	23	1
U4	P3	34	32	13	22	0
U5	P2	52	17	15	27	1
U6	P1	29	83	32	23	0
U2	P4	96	27	36	27	1
U1	P4	25	32	35	22	0

Step S2200: training an initial commodity recommendation model by adopting a preset machine learning algorithm based on the initial sample set.

The machine learning algorithm of the present embodiment may be a recommendation algorithm, and may be any one of a GBDT algorithm, a DNN algorithm, an LR algorithm, a collaborative filtering algorithm, and a GraphX algorithm, for example.

Step S2300: merging the interaction data of the multiple interaction operations according to the initial commodity recommendation model and the initial sample set to obtain merged interaction data.

In an embodiment of the present invention, merging the interaction data of the multiple interaction operations according to the initial commodity recommendation model and the initial sample set to obtain the merged interaction data may include steps S2310 to S2340 shown in FIG. 3:

Step S2310: based on the initial commodity recommendation model, obtaining a prediction matching degree between each target user and each target commodity according to the selected features.

Specifically, the selected features of each original sample may be input into an initial product recommendation model to obtain a corresponding prediction matching degree, that is, the prediction matching degree between the target user and the target product corresponding to the original sample.

Step S2320, a prediction matching degree matrix is constructed according to the prediction matching degree between each target user and each target commodity.

The prediction matching degree between each target user and each target commodity may be as shown in Table 5, where the value at each position is the prediction matching degree between the corresponding target user and target commodity. Here U denotes the user number of a target user and P denotes the commodity number of a target commodity.

TABLE 5

The predicted match matrix obtained from table 5 above can be expressed as:

Step S2330: constructing, for each interaction operation, a corresponding interaction matrix according to the interaction data of that interaction operation.

Specifically, each interaction matrix is constructed in the same manner as the prediction matching degree matrix. Across the interaction matrices of the different interaction operations, elements at the same position correspond to the same target user and the same target commodity.
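As an illustration only, one way to build such aligned 0/1 interaction matrices from per-operation records is sketched below in Python. The function name, the record format (a list of user and commodity pairs), and the sample data are hypothetical and are not taken from Tables 1 to 3.

```python
def build_interaction_matrix(users, commodities, events):
    """Return a 0/1 matrix with one row per user and one column per commodity."""
    row = {user: i for i, user in enumerate(users)}
    col = {commodity: j for j, commodity in enumerate(commodities)}
    matrix = [[0] * len(commodities) for _ in users]
    for user, commodity in events:
        matrix[row[user]][col[commodity]] = 1
    return matrix

# The same user and commodity ordering is reused for every operation, so
# elements at the same position refer to the same user/commodity pair.
users = ["U1", "U2", "U3"]
commodities = ["P1", "P2"]
purchases = [("U1", "P1"), ("U3", "P2")]             # hypothetical purchase records
clicks = [("U1", "P1"), ("U2", "P2"), ("U3", "P2")]  # hypothetical click records
purchase_matrix = build_interaction_matrix(users, commodities, purchases)
click_matrix = build_interaction_matrix(users, commodities, clicks)
```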

According to the data table of the purchase operation in Table 1, the interaction matrix constructed for the purchase operation may be:

According to the data table of the click operation in Table 2, the interaction matrix constructed for the click operation may be:

According to the data table of the search operation in Table 3, the interaction matrix constructed for the search operation may be:

Step S2340: training a preset machine learning model according to the prediction matching degree matrix and each interaction matrix to obtain the merged interaction data.

In an embodiment of the present invention, training the preset machine learning model according to the prediction matching degree matrix and each interaction matrix to obtain the merged interaction data may include steps S2341 to S2344 as follows:

Step S2341: constructing an expression of the merged interaction matrix according to the interaction matrices, with the undetermined parameters of the machine learning model as variables.

In one embodiment of the present invention, the expression of the merged interaction matrix may be expressed as:

$$P = \mathrm{sigmoid}\left(b + \sum_{i} w_i X_i\right)$$

wherein $X_i$ is the interaction matrix of the i-th interaction operation; $w_i$ and $b$ are the undetermined parameters of the machine learning model; and $P$ is the merged interaction matrix.
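The element-wise evaluation of $P = \mathrm{sigmoid}(b + \sum_i w_i X_i)$ can be sketched in Python as follows. The function names and the two small interaction matrices are hypothetical; the weights and bias are placeholders rather than trained values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def merge_interactions(matrices, weights, bias):
    """Element-wise P = sigmoid(bias + sum_i weights[i] * matrices[i])."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [
            sigmoid(bias + sum(w * x[r][c] for w, x in zip(weights, matrices)))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# Hypothetical 2 x 2 interaction matrices for two interaction operations.
X_purchase = [[1, 0], [0, 1]]
X_click = [[1, 1], [0, 1]]
P = merge_interactions([X_purchase, X_click], weights=[0.4, 0.1], bias=0.1)
```

Because of the sigmoid, every element of `P` lies strictly between 0 and 1, which makes it comparable with a prediction matching degree.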

Step S2342: constructing a first loss function according to the prediction matching degree matrix and the expression of the merged interaction matrix.

In one embodiment of the invention, the first loss function may be expressed as:

wherein $K_{i,j}$ is the element value in the i-th row and j-th column of the prediction matching degree matrix; $P_{i,j}$ is the element value in the i-th row and j-th column of the merged interaction matrix; m is the number of rows of the matrices; and n is the number of columns of the matrices.

Step S2343, solving the first loss function, determining the value of the undetermined parameter of the machine learning model, and obtaining a combined interaction matrix.

In an embodiment of the present invention, the first loss function may be optimized by using a stochastic gradient descent algorithm so that the value of the first loss function L1 is minimized, yielding the trained $w_i$ and $b$, from which the merged interaction matrix P is obtained.
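The Python sketch below illustrates this fitting step under two assumptions that are not confirmed by the text: the first loss function is taken to be the mean squared error between K and P over all m × n elements (the formula itself is not reproduced here), and plain full-batch gradient descent is used instead of stochastic gradient descent so that the result is deterministic. All names, hyperparameters, and data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(matrices, weights, bias):
    """Element-wise P = sigmoid(bias + sum_i weights[i] * matrices[i])."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sigmoid(bias + sum(w * x[r][c] for w, x in zip(weights, matrices)))
             for c in range(cols)] for r in range(rows)]

def mse(K, P):
    """Assumed first loss: mean squared error over all m * n elements."""
    m, n = len(K), len(K[0])
    return sum((K[r][c] - P[r][c]) ** 2 for r in range(m) for c in range(n)) / (m * n)

def fit(matrices, K, lr=0.5, steps=2000):
    """Full-batch gradient descent on the assumed loss; returns (weights, bias)."""
    weights, bias = [0.0] * len(matrices), 0.0
    m, n = len(K), len(K[0])
    for _ in range(steps):
        P = forward(matrices, weights, bias)
        grad_w, grad_b = [0.0] * len(matrices), 0.0
        for r in range(m):
            for c in range(n):
                # Chain rule through the squared error and the sigmoid.
                delta = 2.0 * (P[r][c] - K[r][c]) * P[r][c] * (1.0 - P[r][c]) / (m * n)
                grad_b += delta
                for i, x in enumerate(matrices):
                    grad_w[i] += delta * x[r][c]
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

X = [[[1, 0], [0, 1]], [[1, 1], [0, 1]]]  # hypothetical purchase and click matrices
K = [[0.9, 0.4], [0.2, 0.8]]              # hypothetical prediction matching degrees
loss_before = mse(K, forward(X, [0.0, 0.0], 0.0))
weights, bias = fit(X, K)
loss_after = mse(K, forward(X, weights, bias))
```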

In one example of the present invention, the interaction matrix of the 1st interaction operation is that of the purchase operation, the interaction matrix of the 2nd interaction operation is that of the click operation, and the interaction matrix of the 3rd interaction operation is that of the search operation, and solving the first loss function yields $w_1 = 0.4$, $w_2 = 0.1$, $w_3 = 0.1$ and $b = 0.1$. Then, by $P = \mathrm{sigmoid}\left(b + \sum_{i=1}^{3} w_i X_i\right)$, the merged interaction matrix P can be obtained as:

Step S2344: obtaining the merged interaction data according to the merged interaction matrix.

In an embodiment of the present invention, obtaining the merged interaction data according to the merged interaction matrix may include steps S2344-1 to S2344-3 as follows:

Step S2344-1: determining the total number of elements and the non-null rate in the interaction matrix of the specified interaction operation.

Wherein the non-null rate is a ratio of the number of non-zero elements to the total number of elements.

Specifically, in the case that the specified interactive operation is a purchase, the total number of elements in the interactive matrix of the purchase operation and the non-null value rate may be determined.

In the interaction matrix of the purchase operation, the total number of elements is 24 and the number of non-zero elements is 9, so the non-null rate is 9/24 = 0.375.
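The arithmetic can be checked with a short Python sketch. The 4 × 6 matrix below is hypothetical: it was chosen only so that it has 24 elements of which 9 are non-zero, and it is not the purchase matrix of Table 1.

```python
def non_null_rate(matrix):
    """Return (total number of elements, share of elements that are non-zero)."""
    total = sum(len(row) for row in matrix)
    non_zero = sum(1 for row in matrix for value in row if value != 0)
    return total, non_zero / total

# Hypothetical purchase matrix with 24 elements, 9 of them non-zero.
purchase_matrix = [
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
]
total, rate = non_null_rate(purchase_matrix)
```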

Step S2344-2, determining a score threshold according to the total number of elements and the non-null rate.

In one embodiment of the present invention, the values of the elements in the combined interaction matrix may be sorted in a descending order in advance, and the sorting value of each element may be determined.

The sorting threshold is determined according to the total number of elements N and the non-null rate p, and may be, for example, N × λ × p, where λ is a coefficient of the null filling rate and may be, for example, 2. The score threshold may then be determined from the sorting threshold: it may be the value of the element whose sorting value is N × λ × p.

For example, if the element whose sorting value is N × λ × p has the value 0.52, the score threshold is 0.52.
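The threshold selection can be sketched as below. The function names are hypothetical, and rounding the rank N × λ × p to an integer and clamping it to the range 1 to N are assumptions, since the text does not say how a fractional or out-of-range rank is handled.

```python
def non_null_rate(matrix):
    """Ratio of non-zero elements to the total number of elements."""
    values = [v for row in matrix for v in row]
    return sum(1 for v in values if v != 0) / len(values)

def score_threshold(merged, reference, lam=2.0):
    """Return the merged-matrix value whose descending rank is N * lam * p.

    `merged` is the merged interaction matrix and `reference` is the
    interaction matrix of the specified interactive operation (e.g. purchase).
    """
    values = sorted((v for row in merged for v in row), reverse=True)
    n_total = len(values)
    p = non_null_rate(reference)
    # Assumption: round the rank and clamp it to [1, N].
    rank = min(max(int(round(n_total * lam * p)), 1), n_total)
    return values[rank - 1]
```

With N = 24, p = 0.375 and λ = 2, the rank is 24 × 2 × 0.375 = 18, so the score threshold is the 18th-largest element of the merged interaction matrix.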

Step S2344-3, adjusting the element values of the merged interaction matrix that are greater than the score threshold to a first setting value, and adjusting the element values that are less than or equal to the score threshold to a second setting value, so as to obtain the merged interaction data.

In an embodiment of the present invention, the first setting value and the second setting value may be set in advance according to an application scenario or a specific requirement, for example, the first setting value may be 1, and the second setting value may be 0. Then, the value of the element in the merged interaction matrix P that is greater than the score threshold may be adjusted to 1, and the value of the element that is less than or equal to the score threshold may be adjusted to 0.
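The adjustment in the example above (1 for elements greater than the score threshold, 0 otherwise) can be sketched as follows. The function name `binarize` and its parameter names are hypothetical.

```python
def binarize(merged, threshold, high_value=1, low_value=0):
    """Set elements greater than the threshold to high_value, all others to low_value."""
    return [[high_value if v > threshold else low_value for v in row]
            for row in merged]
```

An element exactly equal to the threshold is mapped to the lower value, matching the "less than or equal to" wording of the example.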

With a score threshold of 0.52, after the element values of the merged interaction matrix P are adjusted, the resulting merged interaction data may be as shown in Table 6 below:

TABLE 6

Wherein, U represents the user number of the target user, and P represents the commodity number of the target commodity.

Step S2400, obtaining, based on a preset graph neural network, the user characteristics of each target user and the commodity characteristics of each target commodity according to the merged interaction data.

In an embodiment of the present invention, based on the preset graph neural network, obtaining the user characteristic of each target user and the commodity characteristic of each target commodity according to the merged interaction data may include steps S2410 to S2420 as follows:

step S2410, respectively constructing a graph neural network corresponding to each target user and each target commodity according to the merged interactive data.

Wherein, the number of layers of each graph neural network is the same.

Specifically, each target user and each target commodity may be used as a root node, and a graph neural network corresponding to each target user and a graph neural network corresponding to each target commodity may be established. In the case where there are N1 target users and N2 target commodities, N1 + N2 graph neural networks may be constructed, each with a different root node. The k-th layer of a graph neural network may consist of the target commodities (or target users) whose entries in the merged interaction data with the target users (or target commodities) of the (k-1)-th layer equal the first setting value.

For example, in the case that the number of layers of the graph neural network is five, for the graph neural network with the target user U1 as a root node, the first layer of the graph neural network may be the target user U1, the second layer may be all target commodities corresponding to the first setting value with the target user U1 in the merged interactive data, the third layer may be all target users corresponding to the first setting value with the target commodity of the second layer in the merged interactive data, and the fourth layer may be all target commodities corresponding to the first setting value with the target user of the third layer in the merged interactive data, and the schematic structural diagram of the graph neural network may be as shown in fig. 4.
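The layer-by-layer construction can be sketched as below. This is an illustrative assumption about the data layout: `merged[u][p] == 1` means that user index u and commodity index p are linked by the first setting value, indices are 0-based (so U1 is index 0), and the function name `build_layers` is hypothetical.

```python
def build_layers(merged, root, num_layers, root_is_user=True):
    """Return the set of node indices in each layer of the network rooted at `root`.

    Layers alternate between users and commodities, starting with the root's type.
    """
    layers = [{root}]
    is_user = root_is_user
    for _ in range(num_layers - 1):
        prev = layers[-1]
        if is_user:
            # Previous layer holds users: collect the commodities linked to them.
            nxt = {p for u in prev for p, v in enumerate(merged[u]) if v == 1}
        else:
            # Previous layer holds commodities: collect the users linked to them.
            nxt = {u for u, row in enumerate(merged) for p in prev if row[p] == 1}
        layers.append(nxt)
        is_user = not is_user
    return layers
```

Calling `build_layers(merged, 0, 4)` yields the user U1, then the commodities linked to U1, then the users linked to those commodities, then the commodities linked to those users.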

Step S2420, training the graph neural network of each target user and each target commodity according to the merged interactive data, and obtaining the value of each target user in the hidden layer of the corresponding graph neural network as the user characteristic of the corresponding target user; and obtaining the value of each target commodity in the hidden layer of the corresponding graph neural network as the commodity characteristic of the corresponding target commodity.

In one embodiment of the present invention, training the neural network of each target user and each target commodity based on the merged interaction data includes steps S2421-S2424 as follows:

Step S2421, for each target commodity and each target user, constructing an expression of the hidden layer with the undetermined parameters of the corresponding graph neural network as variables.

The undetermined parameters of the same layer are the same across all of the graph neural networks.

In an embodiment of the present invention, each target commodity and each target user may be taken as a target node in turn, and an expression of a hidden layer of the target node may be represented as:

h⁰＝x

where h^k is the value of the hidden layer of the target node at the k-th layer, x is the initial value of the target node, N(v) represents the nodes connected to the target node in the graph neural network, σ is the activation function (for example, but not limited to, ReLU), and W_k and B_k are the undetermined coefficients of the k-th layer of the graph neural network.
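The layer-to-layer expression itself is not reproduced in the text above. One commonly used form that is consistent with the symbols defined here is mean aggregation over the connected nodes; the following is an assumption for illustration and is not necessarily the expression used in this embodiment:

```latex
h_v^{0} = x_v, \qquad
h_v^{k} = \sigma\!\left( W_k \sum_{u \in N(v)} \frac{h_u^{k-1}}{\lvert N(v) \rvert} + B_k\, h_v^{k-1} \right)
```

In this form W_k weights the averaged hidden values of the connected nodes and B_k weights the node's own hidden value from the previous layer.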

Specifically, the expression of the hidden layer of the ith target user may be represented as:

where h_i^k is the value of the hidden layer of the k-th layer in the graph neural network of the ith target user, x_i is the initial value of the ith target user, N(i) represents the nodes connected to the ith target user in the graph neural network, σ is the activation function, and W_k and B_k are the undetermined coefficients of the k-th layer of the graph neural network.

The expression of the hidden layer of the jth target commodity can be expressed as:

where h_j^k is the value of the hidden layer of the k-th layer in the graph neural network of the jth target commodity, x_j is the initial value of the jth target commodity, N(j) represents the nodes connected to the jth target commodity in the graph neural network, σ is the activation function, and W_k and B_k are the undetermined coefficients of the k-th layer of the graph neural network.
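A forward pass with layer-shared coefficients can be sketched as follows. The update rule (mean of connected nodes weighted by W_k, plus the node's own previous value weighted by B_k, followed by ReLU) is an assumed form, and computing all users and commodities at once over the whole bipartite graph is an implementation shortcut chosen here instead of building one network per root node. The function name `forward` is hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(A, x_user, x_item, Ws, Bs):
    """Propagate hidden values over the user-commodity graph.

    A is the (num_users, num_items) binary merged interaction matrix.
    Ws[k] and Bs[k] are shared by every node at the same layer.
    Assumed update: h_v^k = relu(W_k @ mean_{u in N(v)} h_u^{k-1} + B_k @ h_v^{k-1}).
    """
    h_u, h_i = x_user, x_item
    deg_u = np.maximum(A.sum(axis=1, keepdims=True), 1.0)    # avoid division by zero
    deg_i = np.maximum(A.sum(axis=0, keepdims=True).T, 1.0)
    for W, B in zip(Ws, Bs):
        mean_for_users = (A @ h_i) / deg_u      # average over linked commodities
        mean_for_items = (A.T @ h_u) / deg_i    # average over linked users
        h_u, h_i = (relu(mean_for_users @ W.T + h_u @ B.T),
                    relu(mean_for_items @ W.T + h_i @ B.T))
    return h_u, h_i
```

The returned rows of `h_u` and `h_i` play the role of the hidden-layer values of each target user and each target commodity after `len(Ws)` layers.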

Step S2422, an expression of the distance between each target commodity and each target user is constructed according to the expression of the hidden layer of each target commodity and the expression of the hidden layer of each target user.

In one embodiment of the present invention, the distance between each target item and each target user may be a cosine distance.

Then, the distance y_ij between the ith target user and the jth target commodity may be expressed as:

where h_i^k is the value of the hidden layer of the k-th layer in the graph neural network of the ith target user, and h_j^k is the value of the hidden layer of the k-th layer in the graph neural network of the jth target commodity.
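The expression is not reproduced in the text above. For reference, the standard cosine similarity between the two hidden-layer vectors, written with the symbols h_i^k and h_j^k assumed here for the user and commodity hidden values, is given below; whether this embodiment uses the similarity itself or a quantity derived from it is not stated.

```latex
y_{ij} = \frac{h_i^{k} \cdot h_j^{k}}{\lVert h_i^{k} \rVert \, \lVert h_j^{k} \rVert}
```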

Step S2423, a second loss function is constructed according to the expression of the distance between each target commodity and each target user and the merged interactive data.

In one embodiment of the present invention, the second loss function L2 may be expressed as:

L2 = ∑_ij (y_ij - P_ij)²

where P_ij is the data corresponding to the ith target user and the jth target commodity in the merged interaction data, and y_ij is the distance between the ith target user and the jth target commodity.
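The second loss function can be evaluated as in the sketch below. It assumes that y_ij is the cosine similarity of the two hidden-layer vectors; the function names and the small constant `eps` guarding against zero-length vectors are hypothetical additions.

```python
import numpy as np

def cosine_matrix(h_user, h_item, eps=1e-12):
    """y[i, j] = cosine similarity between user i and commodity j hidden vectors."""
    u = h_user / np.maximum(np.linalg.norm(h_user, axis=1, keepdims=True), eps)
    v = h_item / np.maximum(np.linalg.norm(h_item, axis=1, keepdims=True), eps)
    return u @ v.T

def second_loss(h_user, h_item, P):
    """L2 = sum over i, j of (y_ij - P_ij)^2."""
    y = cosine_matrix(h_user, h_item)
    return float(((y - P) ** 2).sum())
```

Minimizing this value with respect to the layer coefficients pushes y_ij toward 1 for user-commodity pairs marked 1 in the merged interaction data and toward 0 for the others.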

Step S2424, solving the second loss function to determine the values of the undetermined parameters of each graph neural network.

The second loss function is solved to obtain the values of the undetermined parameters of each graph neural network. From each graph neural network, the value of the hidden layer of the k-th layer in the graph neural network of each target user and the value of the hidden layer of the k-th layer in the graph neural network of each target commodity can then be obtained.

Further, taking the value of each target user in the hidden layer of the corresponding graph neural network as the user characteristic of the corresponding target user; and taking the value of each target commodity in the hidden layer of the corresponding graph neural network as the commodity characteristic of the corresponding target commodity.

Specifically, the value of the hidden layer of the k-th layer in the graph neural network of the ith target user may be used as the user characteristic of the ith target user, and the value of the hidden layer of the k-th layer in the graph neural network of the jth target commodity may be used as the commodity characteristic of the jth target commodity.

In an embodiment of the present invention, based on the preset graph neural network, the user characteristic of each target user obtained according to the merged interactive data shown in table 6 may be as shown in table 7, and the commodity characteristic of each target commodity obtained according to the merged interactive data shown in table 6 may be as shown in table 8:

TABLE 7

User number	User features
U1	(0.1,0.2,0,0.4)
U2	(1,0.4,0.2,0.6)
U3	(0.3,0.6,0,0.2)
U4	(0.4,0.8,0.3,0.3)
U5	(0.1,0.1,0.2,0.1)
U6	(0.9,0.7,0.2,0.6)

TABLE 8

Commodity number	Commodity features
P1	(0.3,0.2,0.1,0.5)
P2	(0.3,0.1,0.1,0.8)
P3	(0.7,0.8,0.2,0.5)
P4	(0.3,0.6,0.4,0.4)

Step S2500, constructing a new sample set according to the user characteristics, the commodity characteristics and the initial sample set.

In one embodiment of the present invention, constructing a new sample set according to the user characteristics, the commodity characteristics and the initial sample set may include adding the user characteristics in Table 7 and the commodity characteristics in Table 8 to the sample data table shown in Table 4, so as to obtain a new sample data table as shown in Table 9.

TABLE 9
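How the features are appended to each original sample can be sketched as follows. The layout of an original sample (a dict with `user`, `item`, `features` and `label` keys) and the function name `build_new_sample` are assumptions made for illustration; the feature vectors shown are taken from Tables 7 and 8.

```python
# Feature vectors copied from Tables 7 and 8 (first two rows of each).
USER_FEATURES = {"U1": (0.1, 0.2, 0, 0.4), "U2": (1, 0.4, 0.2, 0.6)}
ITEM_FEATURES = {"P1": (0.3, 0.2, 0.1, 0.5), "P2": (0.3, 0.1, 0.1, 0.8)}

def build_new_sample(sample, user_features, item_features):
    """Append the graph-derived user and commodity features to one original sample.

    `sample` is assumed to be a dict with keys "user", "item", "features", "label".
    """
    return {
        "user": sample["user"],
        "item": sample["item"],
        "features": (tuple(sample["features"])
                     + tuple(user_features[sample["user"]])
                     + tuple(item_features[sample["item"]])),
        "label": sample["label"],
    }
```

The label is carried over unchanged, so the new sample set differs from the initial sample set only in the added feature columns.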

Step S2600, training a final commodity recommendation model by adopting the machine learning algorithm based on the new sample set.

In an embodiment of the present invention, the initial commodity recommendation model may be retrained based on the new sample set to obtain the final commodity recommendation model.

In one embodiment of the invention, the commodity recommendation can be performed on the user according to the final commodity recommendation model. Specifically, the method may further include steps S3100 to S3400 as follows:

step S3100, obtaining a feature value of a selected feature of at least one preset candidate commodity corresponding to the user to be recommended and new interaction data of a plurality of interaction operations executed by the user to be recommended for each candidate commodity.

Step S3200, the user characteristics of the user to be recommended and the commodity characteristics of each candidate commodity are obtained according to the new interaction data.

The step of obtaining the user characteristics of the user to be recommended and the commodity characteristics of the candidate commodity according to the new interaction data may include:

combining the new interactive data of various interactive operations according to the final commodity recommendation model to obtain new combined interactive data; and obtaining the user characteristics of the user to be recommended and the commodity characteristics of the candidate commodities according to the new combined interactive data based on the graph neural network. Specifically, reference may be made to the foregoing steps S2300 and S2400, which are not described again here.

Step S3300, based on the final commodity recommendation model, obtaining a recommendation score of each candidate commodity corresponding to the user to be recommended according to the feature value of the selected feature of each candidate commodity corresponding to the user to be recommended, the user feature of the user to be recommended and the commodity feature of the candidate commodity.

Specifically, the feature value of the selected feature of each candidate commodity corresponding to each user to be recommended, the user feature of the user to be recommended, and the commodity feature of the candidate commodity may be input into the final commodity recommendation model, so that the recommendation score of each candidate commodity corresponding to the user to be recommended can be obtained.

Step S3400, selecting candidate commodities whose recommendation scores meet a preset recommendation condition as recommended commodities, and recommending them to the user to be recommended.

In an embodiment of the present invention, the step of selecting a candidate product whose recommendation score meets a preset recommendation condition as a recommended product and recommending the candidate product to the user to be recommended includes steps S3410 and S3420 shown below:

step S3410, sorting the candidate products in a descending order according to the recommendation score, and obtaining a sorting order of each candidate product.

Step S3420, selecting the candidate commodities with the sorting order according with the preset sorting range, and recommending the candidate commodities to the user to be recommended as the recommended commodities.

In an embodiment of the present invention, the ranking range may be set in advance according to an application scenario or a specific requirement, for example, the ranking range may be 1 to 3, and then candidate commodities with ranking orders of 1, 2, and 3 may be selected and recommended to the user to be recommended as the recommended commodity.
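The ranking and selection can be sketched as below. The function name `recommend` and the dict-based input are hypothetical; how ties between equal scores are ordered is not specified in the text, and this sketch simply keeps the input order for ties.

```python
def recommend(scores, rank_range=(1, 3)):
    """Return candidates whose descending-score rank lies in rank_range (inclusive).

    `scores` maps a candidate commodity identifier to its recommendation score.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    first, last = rank_range
    return [item for rank, (item, _) in enumerate(ranked, start=1)
            if first <= rank <= last]
```

With a ranking range of 1 to 3, the three highest-scoring candidate commodities are returned in descending order of score.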

In one embodiment of the present invention, the method may further comprise:

and displaying the candidate commodities and the sequencing order of each candidate commodity for the selection of the user to be recommended.

< apparatus embodiment >

In the present embodiment, a training apparatus 5000 for a commodity recommendation model is provided, as shown in fig. 5, and includes a data obtaining module 5100, an initial training module 5200, a data merging module 5300, a feature generating module 5400, a sample constructing module 5500, and a final training module 5600.

The data obtaining module 5100 is configured to obtain interaction data of a target user performing various interaction operations on a target commodity, and an initial sample set used for training a commodity recommendation model. Each original sample in the initial sample set comprises a plurality of selected features and labels, and the labels represent whether specified interactive operations are executed by a target user aiming at a target commodity when the corresponding original sample is generated.

The initial training module 5200 is configured to train an initial commodity recommendation model based on the initial sample set by using a preset machine learning algorithm.

The data merging module 5300 is configured to merge the interaction data of the multiple interaction operations according to the initial commodity recommendation model and the initial sample set to obtain merged interaction data.

The feature generation module 5400 is configured to obtain, based on a preset graph neural network, a user feature of each target user and a commodity feature of each target commodity according to the merged interaction data.

The sample construction module 5500 is configured to construct a new sample set according to the user characteristics, the commodity characteristics, and the initial sample set.

The final training module 5600 is configured to train a final commodity recommendation model based on the new sample set by using a machine learning algorithm.

In an embodiment of the present invention, the data merging module 5300 is further configured to:

based on the initial commodity recommendation model, obtaining a prediction matching degree between each target user and each target commodity according to the selected characteristics;

and training a preset machine learning model according to the predicted matching degree matrix and each interaction matrix to obtain combined interaction data.

In an embodiment of the present invention, training a preset machine learning model according to the prediction matching degree matrix and each interaction matrix, and obtaining merged interaction data includes:

establishing an expression of a combined interaction matrix according to the interaction matrix by taking undetermined parameters of a machine learning model as variables;

constructing a first loss function according to the prediction matching degree matrix and the expression of the merged interaction matrix;

solving a first loss function, and determining the value of undetermined parameters of the machine learning model to obtain a combined interaction matrix;

In an embodiment of the present invention, obtaining the merged interactive data according to the merged interactive matrix includes:

determining a total number of elements and a non-null rate in an interaction matrix of a specified interaction operation; wherein the non-null rate is a ratio of the number of non-zero elements to the total number of elements;

determining a score threshold value according to the total number of elements and the non-null value rate;

and adjusting the element values of the merged interaction matrix that are greater than the score threshold to a first setting value, and adjusting the element values that are less than or equal to the score threshold to a second setting value, to obtain the merged interaction data.

In one embodiment of the invention, the expression of the merged interaction matrix is represented as:

In one embodiment of the invention, the first loss function is expressed as:

In one embodiment of the present invention, the feature generation module 5400 may further be configured to:

In one embodiment of the present invention, training a graph neural network for each target user and each target commodity based on merged interaction data comprises:

for each target commodity and each target user, respectively taking the parameters to be determined of the neural network of the corresponding graph as variables, and constructing an expression of a hidden layer; the undetermined parameters of each graph neural network on the same layer are the same;

In one embodiment of the invention, each target commodity and each target user are taken as target nodes in turn;

the expression of the hidden layer of the target node is expressed as:

h⁰＝x

where h^k is the value of the hidden layer of the target node at the k-th layer, x is the initial value of the target node, N(v) represents the nodes connected to the target node in the graph neural network, σ is the activation function, and W_k and B_k are the undetermined coefficients of the k-th layer of the graph neural network.

In one embodiment of the invention, the second loss function is expressed as:

L2 = ∑_ij (y_ij - P_ij)²

where P_ij is the element value corresponding to the ith target user and the jth target commodity in the merged interaction matrix, and y_ij is the distance between the ith target user and the jth target commodity.

In one embodiment of the invention, the machine learning algorithm is the GBDT algorithm.

In one embodiment of the present invention, the sample construction module 5500 may also be configured to:

In an embodiment of the present invention, the training apparatus 5000 for the commodity recommendation model may further include:

a module for acquiring a feature value of a selected feature of at least one preset candidate commodity corresponding to a user to be recommended, and new interaction data of the user to be recommended performing various interactive operations on each candidate commodity;

and a module for selecting candidate commodities whose recommendation scores meet a preset recommendation condition and recommending them to the user to be recommended as recommended commodities.

In one embodiment of the present invention, selecting candidate commodities whose recommendation scores meet a preset recommendation condition as recommended commodities to be recommended to a user to be recommended includes:

sorting the candidate commodities in a descending order according to the recommendation score, and acquiring the sorting order of each candidate commodity;

and selecting candidate commodities with the sorting order meeting the preset sorting range, and recommending the candidate commodities to a user to be recommended as recommended commodities.

and a module for displaying the candidate commodities and the sorting order of each candidate commodity.

It will be appreciated by those skilled in the art that the training apparatus 5000 for the commodity recommendation model may be implemented in various ways. For example, it may be implemented by configuring a processor with instructions: the instructions may be stored in ROM and read from the ROM into a programmable device when the device starts up. Alternatively, the training apparatus 5000 may be built into a dedicated device (e.g., an ASIC). The training apparatus 5000 may be divided into mutually independent units, or the units may be combined together. The training apparatus 5000 may be implemented by one of the implementations described above, or by a combination of two or more of them.

In this embodiment, the training apparatus 5000 for the merchandise recommendation model may have various implementation forms, for example, the training apparatus 5000 for the merchandise recommendation model may be any functional module running in a software product or an application providing the model training service, or a peripheral insert, a plug-in, a patch, etc. of the software product or the application, or the software product or the application itself.

< System >

In this embodiment, as shown in fig. 6, a system 6000 including at least one computing device 6100 and at least one storage device 6200 is also provided. The at least one storage device 6200 is configured to store executable instructions; the instructions are for controlling the at least one computing device 6100 to perform the training method of the commodity recommendation model according to any embodiment of the invention.

In this embodiment, the system 6000 may be a device such as a mobile phone, a tablet computer, a palm computer, a desktop computer, a notebook computer, a workstation, a game machine, or a distributed system formed by a plurality of devices.

< computer-readable storage Medium >

In this embodiment, there is also provided a computer-readable storage medium on which a computer program is stored, the computer program, when being executed by a processor, implementing the training method of the commodity recommendation model according to any embodiment of the present invention.

The present invention may be an apparatus, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.

The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.

The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.

Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims

1. A training method of a commodity recommendation model comprises the following steps:

2. The method of claim 1, wherein the merging interaction data of a plurality of interaction operations according to the initial commodity recommendation model and the initial sample set to obtain merged interaction data comprises:

3. The method of claim 2, wherein the training a preset machine learning model according to the prediction matching degree matrix and each interaction matrix to obtain the merged interaction data comprises:

4. The method of claim 3, the deriving the merged interaction data from the merged interaction matrix comprising:

5. The method of claim 3, the expression of the merged interaction matrix is represented as:

6. The method of claim 3, the first loss function being represented as:

7. The method of claim 1, wherein the obtaining of the user characteristic of each target user and the commodity characteristic of each target commodity according to the merged interaction data based on the preset graph neural network comprises:

8. A training device for a commodity recommendation model comprises:

9. A system comprising at least one computing device and at least one storage device, wherein the at least one storage device is to store instructions for controlling the at least one computing device to perform the method of any of claims 1 to 7.

10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.