CN115018552A

CN115018552A - Method for determining click rate of product

Info

Publication number: CN115018552A
Application number: CN202210754365.8A
Authority: CN
Inventors: 陈恩红; 张裕人; 金斌斌; 王皓; 侯旻; 于润龙
Original assignee: University of Science and Technology of China USTC
Current assignee: University of Science and Technology of China USTC
Priority date: 2022-06-28
Filing date: 2022-06-28
Publication date: 2022-09-06

Abstract

The present disclosure provides a method for determining a click rate of a product. The method comprises: obtaining a product data training sample set, wherein the product data training sample set comprises a plurality of training samples, and each training sample comprises a plurality of product feature vectors and an interactive behavior sequence; for each training sample, processing the plurality of product feature vectors by using a mean clustering model to obtain a plurality of clustering center vectors; determining a sampling probability corresponding to each clustering center vector according to a preset feature vector of a preset target product and the plurality of clustering center vectors corresponding to the interactive behavior sequence; sampling the interactive behavior sequence according to the plurality of sampling probabilities to obtain a plurality of interactive behavior subsequences; training a deep interest network by using the preset feature vector and the plurality of interactive behavior subsequences to obtain a trained click rate prediction model; and inputting the product feature vector of a product to be predicted and the interactive behavior sequence of a user to be predicted into the click rate prediction model, and outputting a prediction result.

Description

Method for determining click rate of product

Technical Field

The present disclosure relates to the field of neural network technologies, and in particular, to a method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product for determining a click rate of a product.

Background

Click rate prediction is a user behavior prediction technology that has emerged in the information era. It is widely used in the internet industry and underlies online applications such as online advertising and e-commerce recommendation. It aims to predict the potential demands of different users and to recommend suitable, high-quality content for them to click.

In real-world scenarios, owing to the rapid development of information technology, the behavior records that a user accumulates on an internet platform keep growing, the length of the behavior sequence grows explosively, and the user's needs are diverse, so that the user's behavior record data is characterized by large quantity and wide interest.

In implementing the concept of the present disclosure, the inventors found that the related art has at least the following problem: when a conventional click rate prediction model is used to predict the probability that a user clicks a certain product, the prediction accuracy is low.

Disclosure of Invention

In view of this, the disclosed embodiments provide a method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product for determining a click rate of a product.

One aspect of the embodiments of the present disclosure provides a method for determining a click rate of a product, including:

obtaining a product data training sample set, wherein the product data training sample set comprises a plurality of training samples, each training sample comprises a plurality of product feature vectors with preset dimensionality and an interactive behavior sequence, and the interactive behavior sequence represents the time and sequence of interactive behaviors of a user and a plurality of products;

for each training sample, processing the plurality of product feature vectors by using a mean clustering model to obtain a plurality of clustering center vectors, wherein the mean clustering model is obtained by pre-training a K-means clustering model, and each clustering center vector is a feature vector capable of representing a plurality of product feature vectors in the same class;

determining a sampling probability corresponding to each clustering center vector according to a preset feature vector of a preset target product and a plurality of clustering center vectors corresponding to the interactive behavior sequence;

sampling the interactive behavior sequence according to the plurality of sampling probabilities to obtain a plurality of interactive behavior subsequences;

training a deep interest network by using the preset feature vector and the plurality of interactive behavior subsequences to obtain a trained click rate prediction model;

and inputting the product feature vector of a product to be predicted and the interactive behavior sequence of a user to be predicted into the click rate prediction model, and outputting a prediction result, wherein the prediction result represents the probability that the user to be predicted clicks the product to be predicted.

According to an embodiment of the present disclosure, the plurality of product feature vectors are obtained by:

screening the obtained multiple product data to obtain screened product data;

for each of the screened product data, in a case where the product attribute of the product data meets a first preset condition, respectively performing one-hot encoding on a plurality of categories of the product data to obtain a first one-hot vector corresponding to each category, wherein the product attribute comprises product price, product brand, or purchase time;

splicing the plurality of first one-hot vectors to obtain a spliced feature vector of the product data;

in a case where the product attribute of the product data does not meet the first preset condition, processing the product data by using a bucketing method to obtain category vectors of a plurality of categories;

performing one-hot encoding on the category vector of each category to obtain a second one-hot vector corresponding to each category;

splicing the plurality of second one-hot vectors to obtain a spliced feature vector of the product data;

and mapping the spliced feature vector to the preset dimensionality to obtain the product feature vector.
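The feature construction steps above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the first preset condition distinguishes discrete attributes (such as brand) from continuous ones (such as price), that both kinds of one-hot vectors are spliced into one vector, and that the mapping to the preset dimensionality is a linear projection. The names `one_hot`, `bucketize`, `build_feature_vector`, and `projection` are hypothetical and do not come from the disclosure.

```python
from bisect import bisect_right
import random


def one_hot(index, size):
    """Return a one-hot list of length `size` with 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def bucketize(value, boundaries):
    """Map a continuous value to a bucket index using sorted boundaries."""
    return bisect_right(boundaries, value)


def build_feature_vector(brand_id, num_brands, price, price_boundaries, projection):
    # Discrete attribute (assumed to meet the first preset condition): one-hot directly.
    brand_vec = one_hot(brand_id, num_brands)
    # Continuous attribute (assumed not to meet it): bucketize, then one-hot the bucket.
    bucket = bucketize(price, price_boundaries)
    price_vec = one_hot(bucket, len(price_boundaries) + 1)
    # Splice (concatenate) the one-hot vectors.
    spliced = brand_vec + price_vec
    # Map to the preset dimensionality; each row of `projection` yields one output dimension.
    return [sum(w * x for w, x in zip(row, spliced)) for row in projection]


# Illustrative usage with made-up sizes: 3 brands, 3 price buckets, preset dimension 4.
random.seed(0)
num_brands, boundaries, preset_dim = 3, [10.0, 100.0], 4
projection = [
    [random.uniform(-1.0, 1.0) for _ in range(num_brands + len(boundaries) + 1)]
    for _ in range(preset_dim)
]
feature_vector = build_feature_vector(1, num_brands, 42.0, boundaries, projection)
```

In a trained system the projection would be learned (for example, as an embedding layer) rather than drawn at random; the random matrix here only makes the sketch self-contained.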

According to an embodiment of the present disclosure, the above-mentioned screening a plurality of acquired product data to obtain screened product data includes:

determining the number of times of interaction between the user and each product according to the interaction behavior sequence;

and determining the product with the interaction times larger than a preset threshold value as the screened product data.

According to an embodiment of the present disclosure, the product feature vector is obtained by processing product information of a product by using an embedding layer, and the sampling probability is obtained by processing with a sampling module.

According to an embodiment of the present disclosure, the method further includes:

sequentially training an initial neural network by using samples in an optimized training sample set to obtain a first loss value corresponding to the optimized training sample, wherein the samples in the optimized training sample set comprise positive samples and negative samples, the positive samples comprise a plurality of interactive behavior subsequences of a user, and the negative samples comprise a plurality of interactive behavior subsequences of at least one other user;

updating the model parameters of the embedding layer, the model parameters of the sampling module and the model parameters of the initial neural network by using a stochastic gradient descent algorithm under the condition that the first loss value does not meet a first convergence threshold value;

and under the condition that the first loss value meets the first convergence threshold value, determining the updated model parameters of the embedding layer as the target model parameters of the embedding layer, determining the updated model parameters of the sampling module as the target model parameters of the sampling module, and determining the initial neural network after updating the model parameters as a trained contrastive learning target model.

According to an embodiment of the present disclosure, the first loss value $L^{C}$ is calculated based on the following equation:

$$D_{\omega}(p, q) = \exp\left(p^{T} \cdot W \cdot q\right)$$

wherein $p$ and $q$ represent interactive behavior subsequences obtained by sampling the same user twice; $P^{l}$ represents the joint probability distribution of the sampled user subsequences, i.e. $(p, q) \sim P^{l}$; the negative sample represents an interactive behavior subsequence obtained by randomly sampling from the interactive behavior sequence of at least one other user; $D_{\omega}$ is a discriminator with parameters $\omega$ in the initial neural network and is defined as a log-bilinear model; and $p^{T}$ denotes the transpose of $p$.
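To make the role of the log-bilinear discriminator concrete, the following Python sketch evaluates D(p, q) = exp(p^T W q) and plugs it into an InfoNCE-style contrastive loss. The loss form is an assumption chosen for illustration, since the text here gives only the discriminator; `p`, `q`, and the negatives are assumed to be fixed-length vector representations of subsequences, and the function names are hypothetical.

```python
import math


def bilinear_score(p, W, q):
    """Log-bilinear discriminator: D(p, q) = exp(p^T W q)."""
    Wq = [sum(w * x for w, x in zip(row, q)) for row in W]
    return math.exp(sum(a * b for a, b in zip(p, Wq)))


def contrastive_loss(p, q, negatives, W):
    """InfoNCE-style loss (an assumed form, not taken from the disclosure):
    -log( D(p, q) / (D(p, q) + sum_k D(p, n_k)) ),
    where (p, q) come from the same user and n_k from other users."""
    positive = bilinear_score(p, W, q)
    negative = sum(bilinear_score(p, W, n) for n in negatives)
    return -math.log(positive / (positive + negative))


# Illustrative usage with a 2-dimensional identity matrix as W.
W = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], W)
misaligned_loss = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]], W)
```

Under this assumed loss, a positive pair that scores higher than the negatives gives a smaller loss, so `aligned_loss` is smaller than `misaligned_loss`.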

According to an embodiment of the present disclosure, the processing a plurality of the product feature vectors by using a mean clustering model to obtain a plurality of clustering center vectors includes:

processing a plurality of product feature vectors by using a mean clustering model to obtain a plurality of product feature component vectors corresponding to each product feature vector;

selecting a plurality of target product feature component vectors from a plurality of product feature component vectors of a plurality of products according to a preset selection condition;

and obtaining a plurality of clustering center vectors according to the plurality of target product characteristic component vectors, wherein the number of the clustering center vectors is less than that of the product characteristic vectors.

According to an embodiment of the present disclosure, the determining, according to a preset feature vector of a preset target product and a plurality of the cluster center vectors corresponding to the interaction behavior sequence, a sampling probability corresponding to each of the cluster center vectors includes:

determining a similarity score between the preset feature vector and each clustering center vector according to the preset feature vector and the interaction behavior sequence;

determining the relative time difference between the preset target product and each clustering center vector according to the interaction behavior sequence;

determining a sampling probability corresponding to each of the products based on a plurality of the similarity scores and a plurality of the relative time differences.

According to an embodiment of the present disclosure, the determining a relative time difference between the preset target product and each of the cluster center vectors according to the interaction behavior sequence includes:

measuring the time sequence relation of a plurality of clustering center vectors corresponding to the interactive behavior sequence by adopting relative time difference to obtain a time stamp set, wherein each time stamp in the time stamp set corresponds to the time of the interactive behavior between a user and one clustering center vector;

and determining the relative time difference between the preset target product and each clustering center vector based on the timestamp set.

According to an embodiment of the present disclosure, the determining a sampling probability corresponding to each of the products according to a plurality of the similarity scores and a plurality of the relative time differences includes:

converting each of the relative time differences into a time sequence score;

and calculating the sampling probability of each product according to the similarity score and the time sequence score corresponding to the relative time difference.
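One way to combine the two scores can be sketched in Python as follows. The concrete choices are assumptions made only for illustration: the similarity score is a dot product, the time sequence score is a linear penalty on the relative time difference, and the two are summed and normalized with a softmax. The function name and the `decay` parameter are hypothetical.

```python
import math


def sampling_probabilities(target_vec, centers, timestamps, target_time, decay=0.1):
    """Return one sampling probability per cluster center vector."""
    # Similarity score between the preset feature vector and each cluster center.
    similarity = [sum(a * b for a, b in zip(target_vec, c)) for c in centers]
    # Time sequence score: a larger relative time difference gives a lower score.
    time_scores = [-decay * (target_time - t) for t in timestamps]
    # Combine both scores and normalize with a numerically stable softmax.
    logits = [s + ts for s, ts in zip(similarity, time_scores)]
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative usage: three cluster centers interacted with at times 9, 9 and 1.
probabilities = sampling_probabilities(
    target_vec=[1.0, 0.0],
    centers=[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
    timestamps=[9.0, 9.0, 1.0],
    target_time=10.0,
)
```

In this example the first center is both similar and recent, so it receives the highest probability; the third is similar but old, and the second is recent but dissimilar.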

According to an embodiment of the present disclosure, the training of the deep interest network by using the preset feature vector and the plurality of interactive behavior subsequences to obtain a trained click rate prediction model includes:

inputting a preset feature vector and a plurality of product feature vectors corresponding to each of the interactive behavior sub-sequences into an attention mechanism layer to obtain a plurality of correlation weights, wherein one of the correlation weights represents the correlation between the preset feature vector and one of the product feature vectors;

inputting the plurality of correlation weights into a pooling layer to obtain a target characterization vector, wherein the target characterization vector characterizes the relationship between the user and the preset target product;

inputting the target characterization vector into a multilayer perceptron layer, and outputting a training prediction result;

calculating a second loss value according to the training prediction result and the loss function;

under the condition that the second loss value does not meet a second convergence threshold value, updating the model parameters of the deep interest network by using a stochastic gradient descent algorithm;

and under the condition that the second loss value meets the second convergence threshold value, determining the updated deep interest network as the trained click rate prediction model.
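The forward pass described above (attention mechanism layer, pooling layer, perceptron, loss) can be sketched in plain Python. This is a deliberately simplified illustration under stated assumptions: the correlation weight is a dot product rather than a learned attention unit, the pooling is a weighted sum, a single linear layer with a sigmoid stands in for the multilayer perceptron, and the loss function is assumed to be binary cross-entropy. All function and parameter names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_weights(target_vec, item_vecs):
    """Correlation weight between the preset feature vector and each behavior item."""
    return [dot(target_vec, v) for v in item_vecs]


def weighted_sum_pooling(weights, item_vecs):
    """Pool the item vectors into one target characterization vector."""
    dim = len(item_vecs[0])
    return [sum(w * v[d] for w, v in zip(weights, item_vecs)) for d in range(dim)]


def predict_click(target_vec, item_vecs, layer_weights, layer_bias):
    weights = attention_weights(target_vec, item_vecs)
    user_vec = weighted_sum_pooling(weights, item_vecs)
    features = user_vec + target_vec  # concatenate user and target representations
    return sigmoid(dot(layer_weights, features) + layer_bias)


def binary_cross_entropy(prediction, label, eps=1e-12):
    return -(label * math.log(prediction + eps)
             + (1 - label) * math.log(1.0 - prediction + eps))


# Illustrative usage with a two-item subsequence and hand-picked layer weights.
prediction = predict_click(
    target_vec=[1.0, 0.0],
    item_vecs=[[1.0, 0.0], [0.0, 1.0]],
    layer_weights=[0.5, 0.0, 0.5, 0.0],
    layer_bias=0.0,
)
loss_if_clicked = binary_cross_entropy(prediction, 1)
loss_if_not_clicked = binary_cross_entropy(prediction, 0)
```

During training, the loss would be compared against the second convergence threshold, and the parameters would be updated by stochastic gradient descent until the threshold is met.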

Another aspect of the embodiments of the present disclosure provides a product click rate determining apparatus, including:

an acquisition module, configured to acquire a product data training sample set, wherein the product data training sample set comprises a plurality of training samples, each training sample comprises a plurality of product feature vectors with preset dimensionality and an interactive behavior sequence, and the interactive behavior sequence represents the time and the sequence of interactive behaviors of a user and a plurality of products;

a clustering module, configured to, for each training sample, process a plurality of product feature vectors using a mean clustering model to obtain a plurality of clustering center vectors, where the mean clustering model is obtained by pre-training a K-means clustering model, and each clustering center vector is a feature vector that can represent a plurality of product feature vectors in the same class;

the first determining module is used for determining the sampling probability corresponding to each clustering center vector according to a preset feature vector of a preset target product and a plurality of clustering center vectors corresponding to the interactive behavior sequence;

the second determining module is used for sampling the interactive behavior sequence according to the plurality of sampling probabilities to obtain a plurality of interactive behavior subsequences;

the training module is used for training a deep interest network by utilizing the preset feature vector and the plurality of interactive behavior subsequences to obtain a trained click rate prediction model;

and the prediction module is used for inputting the product feature vector of a product to be predicted and the interactive behavior sequence of a user to be predicted into the click rate prediction model and outputting a prediction result, wherein the prediction result represents the probability that the user to be predicted clicks the product to be predicted.

Another aspect of an embodiment of the present disclosure provides an electronic device including: one or more processors; memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described above.

Another aspect of embodiments of the present disclosure provides a computer-readable storage medium storing computer-executable instructions for implementing the method as described above when executed.

Another aspect of an embodiment of the present disclosure provides a computer program product comprising computer executable instructions for implementing the method as described above when executed.

According to the embodiments of the present disclosure, the plurality of product feature vectors are processed by the mean clustering model so that specific product feature vectors are replaced by the determined clustering centers, which helps eliminate noise and makes it convenient to calculate the sampling probability between the preset target product and each clustering center. Interactive behavior subsequences with high correlation, low noise, high information content, and persistence can then be generated based on the sampling probabilities, and the click rate prediction model trained with these interactive behavior subsequences and the preset feature vector achieves high prediction accuracy in use.

Drawings

The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:

FIG. 1 schematically illustrates an exemplary system architecture for applying a product click-through rate determination method according to an embodiment of the present disclosure;

FIG. 2 schematically illustrates a flow chart A of a method of product click-through rate determination according to an embodiment of the present disclosure;

FIG. 3 schematically illustrates a flow chart B of a method of product click-through rate determination according to an embodiment of the present disclosure;

FIG. 4 schematically illustrates a flow diagram for obtaining a product feature vector according to an embodiment of the present disclosure;

FIG. 5 schematically illustrates a recommendation flow diagram for a product according to an embodiment of the present disclosure;

FIG. 6 schematically shows a block diagram of a product click-through rate determination apparatus according to an embodiment of the present disclosure; and

FIG. 7 schematically shows a block diagram of an electronic device implementing a method for product click-through rate determination according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, operations, and/or components, but do not preclude the presence or addition of one or more other features, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.

Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).

In real-world scenarios, owing to the rapid development of information technology, the behavior records that a user accumulates on an internet platform keep growing, the length of the behavior sequence grows explosively, and the user's needs are diverse, so that the user's behavior record data is characterized by large quantity and wide interest. Large quantity means that the user's behavior records are numerous and span a long period of time, which poses obvious performance challenges. Wide interest means that the behavior records reflect a variety of user needs: two behavior records of the same user may be closely related or entirely unrelated, which makes it difficult for traditional data analysis methods to capture the user's current interest accurately.

The inventors found through research that most related click rate prediction methods truncate the behavior sequence to the user's recent behaviors; methods such as directly applying an attention mechanism to long sequence data incur a huge time cost in the inference stage, tend to degrade the click rate prediction accuracy, and are difficult to apply in practice to click rate prediction tasks with strict time performance requirements.

Therefore, designing a method capable of efficiently extracting information and improving the accuracy of click rate prediction becomes an urgent problem to be solved by those skilled in the art.

In view of this, embodiments of the present disclosure provide a method for determining a product click rate. The method comprises: obtaining a product data training sample set, wherein the product data training sample set comprises a plurality of training samples, each training sample comprises a plurality of product feature vectors with preset dimensionality and an interactive behavior sequence, and the interactive behavior sequence represents the time and the sequence of interactive behaviors of a user and a plurality of products; for each training sample, processing the plurality of product feature vectors by using a mean clustering model to obtain a plurality of clustering center vectors, wherein the mean clustering model is obtained by pre-training a K-means clustering model, and each clustering center vector is a feature vector capable of representing a plurality of product feature vectors in the same class; determining the sampling probability corresponding to each clustering center vector according to the preset feature vector of the preset target product and the plurality of clustering center vectors corresponding to the interactive behavior sequence; sampling the interactive behavior sequence according to the plurality of sampling probabilities to obtain a plurality of interactive behavior subsequences; training a deep interest network by using the preset feature vector and the plurality of interactive behavior subsequences to obtain a trained click rate prediction model; and inputting the product feature vector of a product to be predicted and the interactive behavior sequence of a user to be predicted into the click rate prediction model, and outputting a prediction result, wherein the prediction result represents the probability that the user to be predicted clicks the product to be predicted.

FIG. 1 schematically illustrates an exemplary system architecture 100 to which a product click-through rate determination method may be applied, according to an embodiment of the disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.

As shown in FIG. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired and/or wireless communication links, and so forth.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various messaging client applications, such as a click-through rate prediction type application, a web browser application, a search type application, an instant messaging tool, a mailbox client, and/or social platform software, etc. (by way of example only).

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and otherwise process received data such as a click rate prediction request, and feed back a processing result (e.g., the click rate, a web page, information, or data obtained or generated according to the user request) to the terminal device.

It should be noted that the product click rate determination method provided by the embodiments of the present disclosure may generally be executed by the server 105. Accordingly, the product click rate determination apparatus provided by the embodiments of the present disclosure may generally be disposed in the server 105. The product click rate determination method provided by the embodiments of the present disclosure may also be executed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the product click rate determination apparatus provided by the embodiments of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Alternatively, the product click rate determination method provided by the embodiments of the present disclosure may also be executed by the terminal device 101, 102, or 103, or by another terminal device different from the terminal device 101, 102, or 103. Correspondingly, the product click rate determination apparatus provided by the embodiments of the present disclosure may also be disposed in the terminal device 101, 102, or 103, or in another terminal device different from the terminal device 101, 102, or 103.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

FIG. 2 schematically shows a flowchart A of a method of product click-through rate determination according to an embodiment of the present disclosure. FIG. 3 schematically shows a flowchart B of a product click-through rate determination method according to an embodiment of the disclosure.

As shown in fig. 2 and 3, the product click rate determining method includes operations S201 to S206.

In operation S201, a product data training sample set is obtained, where the product data training sample set includes a plurality of training samples, each training sample includes a plurality of product feature vectors with preset dimensions and an interactive behavior sequence, and the interactive behavior sequence represents a time and an order of an interactive behavior between a user and a plurality of products.

In operation S202, for each training sample, a mean clustering model is used to process a plurality of product feature vectors to obtain a plurality of clustering center vectors, where the mean clustering model is obtained by pre-training a K-means clustering model, and each clustering center vector is a feature vector that can represent a plurality of product feature vectors in the same class.

In operation S203, a sampling probability corresponding to each cluster center vector is determined according to a preset feature vector of a preset target product and a plurality of cluster center vectors corresponding to the interactive behavior sequence.

In operation S204, the interactive behavior sequence is sampled according to the multiple sampling probabilities, so as to obtain multiple interactive behavior subsequences.

In operation S205, a deep interest network is trained by using the preset feature vector and the plurality of interactive behavior subsequences, so as to obtain a click rate prediction model after training.

In operation S206, the product feature vector of the product to be predicted and the interaction behavior sequence of the user to be predicted are input into the click rate prediction model, and a prediction result is output, where the prediction result represents a probability that the user to be predicted clicks the product to be predicted.

According to the embodiment of the disclosure, the product feature vector is generated according to product data, and the product data refers to information data of each product in the user interaction sequence, including but not limited to the type, price and the like of the product, and is used for extracting the characterization information of the product. The product feature vectors in the plurality of training samples may be the same.

According to an embodiment of the present disclosure, the interactive behavior sequence may refer to n interactive behaviors of a user u, and the behavior sequence of the user can be represented by symbols.

According to the embodiment of the disclosure, the acquired product data and the interaction behavior sequence may be within a preset time period, and the preset time period may be specifically set according to actual needs, for example, may be three years. The preset dimension may be determined according to actual conditions, and may be set to 50 dimensions, for example.

According to the embodiments of the present disclosure, when the mean clustering model is trained in advance, the plurality of product feature vectors described in the present disclosure may be used as the pre-training samples, or a plurality of other product feature vectors may be used as the pre-training samples.

According to the embodiments of the present disclosure, when the mean clustering model processes the product feature vectors, a corresponding mapping table can be generated, and the mapping table is used to determine the clustering center vector closest to the preset feature vector.
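The clustering step and the mapping table can be illustrated with a small, self-contained K-means sketch in Python. It is only an illustration under assumptions: it fits the centers directly on the given vectors instead of loading a pre-trained model, and the helper names `kmeans`, `nearest_center`, and `build_mapping_table` are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(vec, centers):
    """Index of the cluster center closest to `vec`."""
    return min(range(len(centers)), key=lambda k: squared_distance(vec, centers[k]))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns k cluster center vectors."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest_center(v, centers)].append(v)
        for idx, members in enumerate(clusters):
            if members:  # keep the previous center if a cluster is empty
                dim = len(members[0])
                centers[idx] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return centers


def build_mapping_table(product_vectors, centers):
    """Map each product id to the index of its closest cluster center."""
    return {pid: nearest_center(vec, centers) for pid, vec in product_vectors.items()}


# Illustrative usage: four products forming two obvious groups.
product_vectors = {"a": [0, 0], "b": [0, 1], "c": [10, 10], "d": [10, 11]}
centers = kmeans(list(product_vectors.values()), k=2)
mapping_table = build_mapping_table(product_vectors, centers)
```

With the mapping table in place, looking up the cluster of a product is a dictionary access, and the number of cluster centers (2) is smaller than the number of product feature vectors (4), as the method requires.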

According to the embodiments of the present disclosure, the plurality of product feature vectors in a training sample are processed by the mean clustering model to obtain clustering centers, each capable of representing a plurality of product feature vectors in the same class. The sampling probability between the preset feature vector and each clustering center vector corresponding to the interactive behavior sequence is then determined according to the preset feature vector of the preset target product. Based on the plurality of sampling probabilities, a small number of related products are sampled from the interactive behavior sequence to form interactive behavior subsequences with high correlation, low noise, and high information content. The sampling may be performed according to the sampling probability; for example, a plurality of sampling probability intervals are set, and the products falling within one sampling probability interval are combined into one interactive behavior subsequence. The deep interest network is trained by using the obtained preset feature vector $e_t$ and the plurality of interactive behavior subsequences to obtain the trained click rate prediction model.
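The interval-based example of sampling can be sketched in Python as follows. The interval boundaries and the function name `split_by_probability_intervals` are illustrative assumptions; the sketch only shows how products whose sampling probabilities fall into the same interval could be grouped into one subsequence while keeping their original interaction order.

```python
def split_by_probability_intervals(sequence, probabilities, boundaries):
    """Group the items of a behavior sequence by sampling-probability interval.

    `boundaries` are sorted interval edges; an item with probability `prob`
    goes to the interval whose index equals the number of edges <= prob.
    """
    subsequences = [[] for _ in range(len(boundaries) + 1)]
    for item, prob in zip(sequence, probabilities):
        interval_index = sum(prob >= edge for edge in boundaries)
        subsequences[interval_index].append(item)
    return subsequences


# Illustrative usage: three intervals [0, 0.1), [0.1, 0.3), [0.3, 1].
subsequences = split_by_probability_intervals(
    sequence=["p1", "p2", "p3", "p4"],
    probabilities=[0.05, 0.4, 0.15, 0.4],
    boundaries=[0.1, 0.3],
)
```

Here `p2` and `p4` share the highest-probability interval and therefore end up in the same subsequence.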

According to the embodiment of the disclosure, when the click rate prediction model is used, the product feature vector of the product to be predicted and the interaction behavior sequence of the user to be predicted can be input into the click rate prediction model, and the prediction result of the probability that the user to be predicted clicks the product to be predicted is output, for example, the probability that the user to be predicted clicks the product to be predicted is 92%.

According to the embodiment of the disclosure, the plurality of product feature vectors are processed through the mean clustering model, so that specific product feature vectors are replaced by the determined clustering centers. This eliminates noise and makes it convenient to calculate the sampling probability between the preset target product and each clustering center. Interactive behavior subsequences with high correlation, low noise, high information content and persistence can then be generated based on the sampling probabilities, and the click rate prediction model trained with these interactive behavior subsequences and the preset feature vector has high prediction accuracy in use.

According to the embodiment of the disclosure, screening the plurality of acquired product data to obtain screened product data comprises the following operations:

The number of interactions between the user and each product is determined according to the interaction behavior sequence. A product whose number of interactions is greater than a preset threshold is determined as screened product data.

According to the embodiment of the present disclosure, the preset threshold may be set according to actual requirements; for example, in order to process long sequence data, the preset threshold may be set to 50.
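A minimal sketch of this screening step follows. It assumes that interaction counts are pooled over the behavior sequences of all users and that each behavior is stored as a `(product_id, timestamp)` pair; the function name `screen_products` and the toy data are hypothetical.

```python
from collections import Counter

def screen_products(behavior_sequences, threshold):
    """Return the product ids interacted with more than `threshold` times.

    behavior_sequences: iterable of per-user lists of (product_id, timestamp).
    """
    counts = Counter(
        product_id
        for sequence in behavior_sequences
        for product_id, _timestamp in sequence
    )
    return {product_id for product_id, n in counts.items() if n > threshold}

sequences = [
    [("p1", 1), ("p2", 2), ("p1", 3)],
    [("p1", 1), ("p3", 5)],
]
kept = screen_products(sequences, threshold=1)
```

With the threshold of 50 given as an example above, the same call would keep only products that appear more than 50 times.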

According to the embodiment of the disclosure, the product feature vector is obtained by processing the product information of the product with the embedding layer, and the sampling probability is obtained by processing with a sampling module.

According to an embodiment of the present disclosure, the method for determining the product click rate further includes the following operations:

The initial neural network is trained by sequentially using the samples in the optimized training sample set to obtain a first loss value corresponding to the optimized training samples, where the samples in the optimized training sample set comprise positive samples and negative samples, the positive samples comprise a plurality of interactive behavior subsequences of the user, and the negative samples comprise a plurality of interactive behavior subsequences of at least one other user.

In a case where the first loss value does not meet the first convergence threshold, the model parameters of the embedding layer, the model parameters of the sampling module and the model parameters of the initial neural network are updated by using a stochastic gradient descent algorithm. In a case where the first loss value meets the first convergence threshold, the updated model parameters of the embedding layer are determined as the target model parameters of the embedding layer, the updated model parameters of the sampling module are determined as the target model parameters of the sampling module, and the initial neural network with updated model parameters is determined as the trained contrastive learning target model.

According to the embodiment of the disclosure, the auxiliary task formed by the above operations draws on the contrastive learning method proposed in the field of image processing. The idea is to add noise to the original data, change its details, and so on, to obtain negative samples that differ from the original data, and then to let the model learn, through self-supervised learning, to distinguish the original data from the negative samples, thereby effectively enhancing the learning effect of the model without requiring more data.

According to the embodiment of the disclosure, the interests of different users often differ greatly, while the interests of the same user tend to be stable and differ little. Based on this common-sense assumption, the embodiment samples the interactive behavior sequence of each user multiple times, takes the interactive behavior subsequences sampled from the same user as positive samples and those sampled from different users as negative samples, and constructs a self-supervised contrastive learning target model for training in a form similar to the InfoNCE loss function.

According to an embodiment of the present disclosure, the first loss value L^C is as shown in equations (1) and (2):

D_ω(p, q) = exp(p^T · W · q) (2)

where p and q represent the interactive behavior subsequences obtained by sampling the same user twice; P^l represents the joint probability distribution of the sampled user subsequences, i.e. (p, q) ~ P^l;

the negative sample is an interactive behavior subsequence randomly sampled from the interactive behavior sequence of at least one other user; D_ω is a discriminator with parameters ω in the initial neural network and is defined as a log-bilinear model; and p^T denotes the transpose of p.

According to the embodiment of the disclosure, in each round of training, two samples are drawn for each user as the positive samples p and q, and five negative samples are drawn from the interactive behavior sequences of other users for self-supervised contrastive learning. The parameters ω are optimized by a stochastic gradient descent algorithm executed through an Adam optimizer, and the parameters of models such as the embedding layer and the sampling module are optimized and updated through a back propagation mechanism. This enhances the model's ability to capture and differentiate user interests, strengthens the click rate prediction model's memory of the personal interests of each user, and effectively improves prediction accuracy.
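The contrastive objective can be sketched as follows. The sketch assumes that each interactive behavior subsequence has already been encoded as a fixed-length vector and that the loss takes the usual InfoNCE form, the negative log of the positive score divided by the sum of the positive and negative scores. That form is an assumption made for illustration and is not taken from equation (1). The function names, the identity matrix used for W, and the example vectors are hypothetical.

```python
import math
import random

def bilinear_score(p, W, q):
    """D_omega(p, q) = exp(p^T W q) for plain-list vectors and a list-of-rows matrix."""
    Wq = [sum(w * x for w, x in zip(row, q)) for row in W]
    return math.exp(sum(a * b for a, b in zip(p, Wq)))

def contrastive_loss(p, q, negatives, W):
    """Assumed InfoNCE-style loss: -log(pos / (pos + sum of negative scores))."""
    pos = bilinear_score(p, W, q)
    neg = sum(bilinear_score(p, W, n) for n in negatives)
    return -math.log(pos / (pos + neg))

random.seed(0)
dim = 4
W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
p = [0.5, 0.1, -0.2, 0.3]   # first subsequence sampled from the user
q = [0.4, 0.2, -0.1, 0.3]   # second subsequence sampled from the same user
# Five negatives, matching the number of negative samples described above.
negatives = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
loss = contrastive_loss(p, q, negatives, W)
```

In training, W and the encoders that produce p and q would be updated by the optimizer; here they are fixed so that only the shape of the computation is shown.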

As shown in fig. 3, processing a plurality of product feature vectors by using a mean clustering model to obtain a plurality of clustering center vectors includes the following operations:

The plurality of product feature vectors are processed by using the mean clustering model to obtain a plurality of product feature component vectors corresponding to each product feature vector. A plurality of target product feature component vectors are selected from the plurality of product feature component vectors of the plurality of products according to a preset selection condition. A plurality of clustering center vectors are obtained according to the plurality of target product feature component vectors, where the number of clustering center vectors is less than the number of product feature vectors.

According to the embodiment of the disclosure, the clustering center vector can be one of a plurality of product feature vectors, or can be a clustering center vector formed by splicing a plurality of product feature vectors, and the spliced clustering center vector can better represent a plurality of product feature vectors in the category.

According to the embodiment of the disclosure, a K-means clustering algorithm is used for clustering training based on all product feature vectors, wherein the number C of clustering center vectors can be set to one tenth of the total number of products. For M products, the loss function of the K-means clustering method is shown in equation (3):

where e_i is the product feature vector of product i, and c_j is the cluster center vector corresponding to the plurality of products i of the same class.
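For reference, the textbook K-means objective written with the symbols defined above takes the following form, with M products and C cluster centers. It is given here as an assumption; the exact normalization used in equation (3) may differ.

```latex
\mathcal{L}_{\text{K-means}}
  = \sum_{i=1}^{M} \min_{j \in \{1, \dots, C\}} \left\lVert e_i - c_j \right\rVert_2^{2}
```

Minimizing this quantity moves each center c_j toward the mean of the product feature vectors assigned to it.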

According to the embodiment of the disclosure, during each round of training, clustering training is performed first to ensure that the clustering center vector always has better representativeness. Through the steps, a product-clustering center mapping table and K clustering center vectors can be obtained, and the mapping table and the K clustering center vectors obtained in the last training round can be stored for online deployment.
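The clustering step and the resulting product-to-cluster-center mapping table can be sketched with plain Lloyd iterations. This is a simplified illustration: the initial centers are simply the first k vectors, the toy two-dimensional vectors stand in for the 50-dimensional product feature vectors, and all names and product ids are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Plain Lloyd iterations; initial centers are the first k vectors."""
    centers = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest center for every product vector.
        assignment = [
            min(range(k), key=lambda j: squared_distance(v, centers[j]))
            for v in vectors
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment

product_vectors = {
    "shirt_a": [0.0, 0.1],
    "phone_a": [5.0, 5.1],
    "shirt_b": [0.1, 0.0],
    "phone_b": [5.1, 5.0],
}
ids = list(product_vectors)
centers, assignment = kmeans([product_vectors[i] for i in ids], k=2)
# Product -> cluster center index, the table that would be stored for deployment.
mapping_table = dict(zip(ids, assignment))
```

A production system would more likely use a library implementation with a better initialization scheme; the point here is only the pair of outputs, the center vectors and the mapping table.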

According to the embodiment of the disclosure, when the clustering center vector is a spliced clustering center vector, the preset selection condition may refer to selection according to different product types. For example, a preset number of target product feature component vectors are selected from the plurality of product feature component vectors of the products of each kind, and the selected target product feature component vectors of each kind are spliced to obtain one or more clustering center vectors of that kind.

According to the embodiment of the disclosure, when the clustering center vector is one of the plurality of product feature vectors, one or more clustering center vectors capable of representing different categories can be directly selected from the plurality of product feature vectors.

As shown in fig. 3, determining a sampling probability corresponding to each cluster center vector according to a preset feature vector of a preset target product and a plurality of cluster center vectors corresponding to the interaction behavior sequence includes the following operations:

The similarity score between the preset feature vector and each cluster center vector is determined according to the preset feature vector and the interaction behavior sequence. The relative time difference between the preset target product and each cluster center vector is determined according to the interaction behavior sequence. A sampling probability corresponding to each product is determined based on the plurality of similarity scores and the plurality of relative time differences.

According to an embodiment of the present disclosure, determining a sampling probability corresponding to each product from the plurality of similarity scores and the plurality of relative time differences comprises the operations of:

For each relative time difference, the relative time difference is converted into a timing score. The sampling probability of each product is calculated based on the timing score and the similarity score corresponding to the relative time difference.

According to the embodiment of the present disclosure, before the above operation is performed, a plurality of products i in the interactive behavior sequence are replaced by the determined plurality of clustering center vectors i.

According to the embodiment of the disclosure, given the preset feature vector e_t of the preset target product, the similarity between each cluster center vector i in the interactive behavior sequence of the user and the preset target product t is first measured by a weighted vector inner product to obtain the similarity score r_i = (W_a e_i) · (W_b e_t)^T, where W_a and W_b are the corresponding weights and T is the time at which the interaction occurred.

According to the embodiment of the disclosure, determining the relative time difference between the preset target product and each cluster center vector according to the interaction behavior sequence comprises the following operations:

The timing relation of the plurality of clustering center vectors corresponding to the interactive behavior sequence is measured by the relative time difference to obtain a timestamp set, where each timestamp in the timestamp set corresponds to the time of the interactive behavior between the user and one clustering center vector. The relative time difference between the preset target product and each cluster center vector is determined based on the timestamp set.

According to the embodiment of the disclosure, the timing relation of each clustering center vector i in the interactive behavior sequence is measured by the relative time difference. Formally, for the interactive behavior sequence of user u, the timestamp of each behavior is obtained, and these timestamps are recorded as a timestamp set.

According to an embodiment of the present disclosure, the minimum time difference in the interactive behavior sequence is also recorded.

The relative time difference l_i between clustering center vector i in the interactive behavior sequence and the preset target product t is thereby obtained.

The relative time difference is embedded into a timing score by an encoder, and the timing score and the similarity score are then weighted and summed to obtain the final sampling score R_i = r_i + W_c · E(l_i), where W_c is the summation weight. In this way, the sampling score of each clustering center vector i in the interactive behavior sequence of user u, given the preset target product t, can be calculated and recorded.

Thus, with a method similar to Softmax, the sampling weight of each product i, i.e. the sampling probability, can be obtained, and the sampling probability is shown in equation (4).
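The path from similarity score to sampling probability can be sketched as follows. Several parts are assumptions made for illustration: the relative time difference is taken as the target time minus the interaction timestamp, the encoder E is replaced by a hypothetical decay function `timing_encoder`, W_c is a scalar, and the final normalization is an ordinary softmax. The source only states that a Softmax-like method is used.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def similarity_score(e_i, e_t, W_a, W_b):
    """r_i = (W_a e_i) . (W_b e_t), a weighted inner product."""
    return dot(matvec(W_a, e_i), matvec(W_b, e_t))

def timing_encoder(l_i):
    """Hypothetical stand-in for the encoder E: favors recent interactions."""
    return 1.0 / (1.0 + l_i)

def sampling_probabilities(centers, timestamps, e_t, t_target, W_a, W_b, w_c):
    scores = []
    for e_i, ts in zip(centers, timestamps):
        l_i = t_target - ts  # assumed definition of the relative time difference
        r_i = similarity_score(e_i, e_t, W_a, W_b)
        scores.append(r_i + w_c * timing_encoder(l_i))  # R_i = r_i + W_c * E(l_i)
    # Softmax with max-subtraction for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

identity = [[1.0, 0.0], [0.0, 1.0]]
centers = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # cluster center vectors in the sequence
timestamps = [1, 2, 9]
probs = sampling_probabilities(
    centers, timestamps, e_t=[1.0, 0.0], t_target=10,
    W_a=identity, W_b=identity, w_c=0.5,
)
```

In this toy run the first and third entries are equally similar to the target, and the more recent third entry receives the higher probability because of its timing score.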

According to the embodiment of the disclosure, the sampling weights obtained in the last round of training, with each user's interactive behavior sequence expressed as a plurality of clustering centers, are saved. During online inference, the product to be predicted can be mapped directly to a clustering center, and the saved sampling weights can then be looked up directly. This saves the overhead of calculating the sampling weights, so that sampling with very little time consumption and the subsequent click rate prediction can be performed directly on the user sequence, ensuring time efficiency.

Fig. 4 schematically illustrates a flow chart for obtaining a product feature vector according to an embodiment of the present disclosure.

As shown in fig. 4, a plurality of product feature vectors are obtained through operations S401 to S407 as follows.

In operation S401, the obtained multiple product data are subjected to a screening process, so as to obtain screened product data.

In operation S402, for each of the plurality of screened product data, in a case where the product attribute of the product data meets a first preset condition, one-hot encoding is performed on each of a plurality of categories of the product data to obtain a first one-hot vector corresponding to each category, where the product attribute includes a product price, a product brand, or a purchase time.

In operation S403, the plurality of first one-hot vectors are spliced to obtain a spliced feature vector of the product data.

In operation S404, in a case where the product attribute of the product data does not meet the first preset condition, the product is processed by using a bucketing method to obtain category vectors of a plurality of categories.

In operation S405, one-hot encoding is performed on the category vector of each category to obtain a second one-hot vector corresponding to each category.

In operation S406, the plurality of second one-hot vectors are spliced to obtain a spliced feature vector of the product data.

In operation S407, the spliced feature vector is mapped to a preset dimension to obtain a product feature vector.

According to the embodiment of the disclosure, data with extremely low occurrence frequency is difficult to learn, so the disclosure can remove product data with low occurrence frequency in the interactive behavior sequence of all users.

According to an embodiment of the present disclosure, the first preset condition may refer to whether a product price, a product brand, a purchase time, or a product category is continuous.

In an embodiment of the present disclosure, when it is determined that the categories of the plurality of screened product data are relatively discrete, one-hot encoding may be performed on each product data to obtain a corresponding first one-hot vector. For example, if there are 1882 screened products in total, each product has a one-hot vector of length 1882 to characterize its category.

In another embodiment, when it is determined that the product prices, product brands, and purchase times of the plurality of screened product data are continuous, bucketing is performed on the plurality of screened product data, for example by dividing the product data into a low-price region, a mid-price region, and a high-price region, and one-hot encoding is then performed on the converted category features to obtain corresponding second one-hot vectors.

According to the embodiment of the disclosure, all the one-hot feature vectors of each product are spliced and mapped by an embedding layer (Embedding) into a dense feature vector with a dimension of 50, namely the product feature vector, which is used as the final characterization of the product. Formally, the final characterization of product i is e_i = E(x_i), where E is an embedding layer encoder with weights, and x_i is the spliced feature vector of the product.
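The encoding pipeline of operations S402 to S407 can be sketched as follows. The price cut points, the number of categories, and the randomly initialized linear map standing in for the trained embedding layer E are all hypothetical; only the output dimension of 50 comes from the text.

```python
import random

def one_hot(index, size):
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

def price_bucket(price, cut_points=(50.0, 200.0)):
    """Map a continuous price to 0 (low), 1 (mid) or 2 (high); cut points are assumed."""
    return sum(price >= c for c in cut_points)

def spliced_feature(category_index, n_categories, price):
    """Concatenate the category one-hot vector with the price-bucket one-hot vector."""
    return one_hot(category_index, n_categories) + one_hot(price_bucket(price), 3)

class Embedding:
    """Linear map x -> W x with random weights, standing in for a trained layer."""

    def __init__(self, input_dim, output_dim, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(input_dim)]
            for _ in range(output_dim)
        ]

    def __call__(self, x):
        return [sum(w * v for w, v in zip(row, x)) for row in self.weights]

x_i = spliced_feature(category_index=2, n_categories=5, price=120.0)
embed = Embedding(input_dim=len(x_i), output_dim=50)
e_i = embed(x_i)  # dense 50-dimensional product feature vector
```

Because x_i is a concatenation of one-hot vectors, multiplying by the weight matrix amounts to summing the embedding columns of the active features.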

According to the embodiment of the disclosure, a deep interest network is trained by using a preset feature vector and a plurality of interactive behavior subsequences to obtain a trained click rate prediction model, which comprises the following operations:

The preset feature vector and the plurality of product feature vectors corresponding to each interactive behavior subsequence are input into an attention mechanism layer to obtain a plurality of correlation weights, where one correlation weight represents the correlation between the preset feature vector and one product feature vector.

The plurality of correlation weights are input into the pooling layer to obtain a target characterization vector, where the target characterization vector represents the relation between the user and the preset target product. The target characterization vector is input into a multilayer perceptron layer, and a training prediction result is output.

A second loss value is calculated according to the training prediction result and the loss function. In a case where the second loss value does not meet the second convergence threshold, the model parameters of the deep interest network are updated by using a stochastic gradient descent algorithm. In a case where the second loss value meets the second convergence threshold, the updated deep interest network is determined as the trained click rate prediction model.

According to the embodiment of the disclosure, predicting whether the user is willing to click the preset target product is the target task of the disclosure. As described above, a short sequence of user behavior has been obtained by sampling, so the click rate prediction module can complete the prediction by using only the short sequence and the preset feature vector of the preset target product.

According to the embodiment of the disclosure, in order to obtain a faster prediction speed and a more stable and accurate prediction effect, the disclosure selects the attention-based DIN (Deep Interest Network) model as the click rate prediction module. The model adopts an Embedding & MLP structure. First, the preset feature vector e_t of the preset target product and the clustering center vectors e_i in the interactive behavior subsequence are obtained. Then, the correlation weight between the preset target product and each clustering center vector e_i in the interactive behavior subsequence is obtained through an attention mechanism, and sum pooling is performed with the correlation weights to obtain a joint vector representation of the user and the preset target product as the target characterization vector. The target characterization vector is input into a multilayer perceptron (MLP), and the training of the click rate prediction model is finally completed with a plurality of training samples.

It should be noted that, when the target characterization vector is input into the multilayer perceptron, the feature vector obtained by splicing and flattening the target characterization vector and the additional features of the preset target product may be used as the final target characterization vector, where the additional features include, but are not limited to, the merchant location of the preset target product, the popularity of the product, and the promotion content of the product.
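The forward pass described above (attention weights, sum pooling, multilayer perceptron) can be sketched in simplified form. This is not the DIN reference implementation. The relevance weight is reduced to a sigmoid of an inner product, the pooled vector is concatenated with e_t before the perceptron, and the layer sizes and random weights are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_weights(e_t, subsequence):
    """Simplified relevance weight per item: sigmoid of the inner product with e_t."""
    return [1.0 / (1.0 + math.exp(-dot(e_t, e_i))) for e_i in subsequence]

def sum_pool(weights, subsequence):
    """Weighted sum of the subsequence vectors, one value per dimension."""
    dim = len(subsequence[0])
    return [sum(w * e_i[d] for w, e_i in zip(weights, subsequence)) for d in range(dim)]

def mlp(x, layers):
    """ReLU hidden layers followed by a single sigmoid output unit."""
    for W, b in layers[:-1]:
        x = [max(0.0, dot(row, x) + bias) for row, bias in zip(W, b)]
    W, b = layers[-1]
    return 1.0 / (1.0 + math.exp(-(dot(W[0], x) + b[0])))

def predict_click(e_t, subsequence, layers):
    pooled = sum_pool(attention_weights(e_t, subsequence), subsequence)
    return mlp(pooled + e_t, layers)  # assumed: concatenate user interest and target

rng = random.Random(0)

def dense(n_in, n_out):
    """Randomly initialized (weights, biases) pair for one layer."""
    return ([[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)], [0.0] * n_out)

e_t = [0.2, -0.1, 0.4]
subsequence = [[0.1, 0.0, 0.3], [-0.2, 0.5, 0.1]]  # cluster center vectors e_i
layers = [dense(6, 8), dense(8, 1)]
p_click = predict_click(e_t, subsequence, layers)
```

Additional features of the target product, such as those listed above, could be appended to the concatenated vector in the same way before it enters the perceptron.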

The click rate prediction model is trained and optimized by adopting a cross entropy loss function, and is specifically shown in a formula (5):

where B is the product data training sample set with data volume D; x is a target characterization vector; p(x) ∈ [0, 1] represents the click probability predicted by the multilayer perceptron for a sample; and y is the label of the sample, which takes the value 1 or 0 to indicate that the sample actually belongs to the positive class or the negative class, respectively.
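For reference, the standard binary cross-entropy loss written with the symbols defined above takes the following form. It is given as an assumption based on those definitions and may differ in detail from formula (5).

```latex
L = -\frac{1}{D} \sum_{(x, y) \in B}
    \Bigl( y \log p(x) + (1 - y) \log\bigl(1 - p(x)\bigr) \Bigr)
```

For a single positive sample with p(x) = 0.92, the contribution is -log(0.92), approximately 0.083; the loss grows without bound as the predicted probability of the true class approaches 0.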

According to the embodiment of the disclosure, based on this loss function, the Adam optimizer is likewise used to perform stochastic gradient descent updates in each round of training until the second loss value meets the second convergence threshold.

FIG. 5 schematically shows a recommendation flow diagram for a product according to an embodiment of the disclosure.

As shown in fig. 5, because the click rate prediction model learns, during training, the long-term user preferences implicit in the interactive behavior sequence and stores a small number of parameters such as the corresponding cluster center vectors and sampling probabilities, strongly correlated interactive behavior subsequences can be sampled quickly during prediction and efficient, accurate click rate prediction can be performed, which satisfies the high timeliness requirement of an online environment. The model can therefore be used through operations S501 to S504.

In operation S501, the click-through rate prediction model is deployed to the e-commerce platform, so as to determine the click-through rate of each product by the product click-through rate determination method.

In operation S502, the e-commerce platform may recommend each product to the user according to the predicted click rate of the user, so as to increase the purchase amount of the product.

In operation S503, the e-commerce platform may locally update the product feature vectors and the interaction behavior sequences and retrain the click rate prediction model every preset time interval.

In operation S504, the click rate prediction model deployed online is updated, so that the click rate prediction model continuously tracks the latest interests of the user and the prediction accuracy is guaranteed.

According to an embodiment of the present disclosure, the sequence of interaction behaviors is determined by:

An average sequence length is determined from the plurality of initial interaction behavior sequences. The initial interactive behavior sequence of each user is truncated or padded according to the average sequence length to obtain the interactive behavior sequence.

According to the embodiment of the disclosure, the average length N of the plurality of initial interactive behavior sequences is recorded in advance, and the initial interactive behavior sequences are then truncated or padded so that all the interactive behavior sequences have length N. The clustering center vectors corresponding to all products in the interactive behavior sequence are then spliced in order to obtain the feature vector of each user, which is used as the final characterization u_i of the user, as shown in equation (6):

Through the above processing, the product feature vectors and the interactive behavior sequences are processed into low-dimensional, dense embedded vectors, which facilitates parameter training and calculation. With the clustering method, the final characterization of the user contains only the feature vectors of strongly representative products, namely the obtained clustering center vectors, which helps eliminate noise; and using clustering centers in place of specific products in subsequent calculation optimizes the storage and computing performance of online deployment.
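The length normalization and the construction of the user characterization can be sketched as follows. The sketch assumes that truncation keeps the most recent N behaviors, that padded positions contribute a zero vector, and that the average length is rounded to an integer; none of these details are specified in the text. All names and toy data are hypothetical.

```python
def fit_to_length(sequence, length, pad_id=None):
    """Truncate to the most recent `length` items or right-pad with pad_id."""
    if len(sequence) >= length:
        return sequence[-length:]
    return sequence + [pad_id] * (length - len(sequence))

def user_characterization(sequence, mapping_table, centers, dim):
    """Concatenate the cluster center vector of every position in the sequence."""
    u = []
    for product_id in sequence:
        if product_id is None:
            u.extend([0.0] * dim)  # assumed: zero vector for padded positions
        else:
            u.extend(centers[mapping_table[product_id]])
    return u

initial_sequences = {"u1": ["a", "b", "c", "d", "e"], "u2": ["b"]}
# Average sequence length N over all users, rounded to an integer.
n = round(sum(len(s) for s in initial_sequences.values()) / len(initial_sequences))
sequences = {u: fit_to_length(s, n) for u, s in initial_sequences.items()}

mapping_table = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 0}  # product -> cluster index
centers = [[1.0, 0.0], [0.0, 1.0]]                         # cluster center vectors
u_vectors = {
    u: user_characterization(s, mapping_table, centers, dim=2)
    for u, s in sequences.items()
}
```

Every user vector has the same length, N times the cluster center dimension, which is what makes batch training straightforward.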

FIG. 6 schematically shows a block diagram of a product click rate determination apparatus according to an embodiment of the present disclosure.

As shown in FIG. 6, the product click rate determination apparatus 600 includes an obtaining module 601, a clustering module 602, a first determining module 603, a second determining module 604, a training module 605, and a predicting module 606.

The obtaining module 601 is configured to obtain a product data training sample set, where the product data training sample set includes a plurality of training samples, each training sample includes a plurality of product feature vectors with preset dimensions and an interactive behavior sequence, and the interactive behavior sequence represents time and sequence of an interactive behavior between a user and a plurality of products.

The clustering module 602 is configured to, for each training sample, process a plurality of product feature vectors using a mean clustering model to obtain a plurality of clustering center vectors, where the mean clustering model is obtained by pre-training a K-means clustering model, and each clustering center vector is a feature vector that can represent a plurality of product feature vectors in the same class.

The first determining module 603 is configured to determine a sampling probability corresponding to each cluster center vector according to a preset feature vector of a preset target product and a plurality of cluster center vectors corresponding to the interaction behavior sequence.

The second determining module 604 is configured to perform sampling processing on the interactive behavior sequence according to the multiple sampling probabilities to obtain multiple interactive behavior subsequences.

The training module 605 is configured to train the deep interest network by using the preset feature vector and the plurality of interactive behavior subsequences to obtain a trained click rate prediction model.

The prediction module 606 is configured to input the product feature vector of the product to be predicted and the interaction behavior sequence of the user to be predicted into the click rate prediction model, and output a prediction result, where the prediction result represents a probability that the user to be predicted clicks the product to be predicted.

According to the embodiment of the present disclosure, the product click rate determining apparatus 600 further includes a screening module, a first encoding module, a first splicing module, a bucket dividing module, a second encoding module, a second splicing module, and a mapping module.

The screening module is used for screening the obtained plurality of product data to obtain screened product data.

The first encoding module is used for, for each screened product data, performing one-hot encoding on each of a plurality of categories of the product data in a case where the product attribute of the product data meets a first preset condition, to obtain a first one-hot vector corresponding to each category, where the product attribute includes a product price, a product brand, or a purchase time.

The first splicing module is used for splicing the plurality of first one-hot vectors to obtain a spliced feature vector of the product data.

The bucket dividing module is used for processing the product by using a bucketing method in a case where the product attribute of the product data does not meet the first preset condition, to obtain category vectors of a plurality of categories.

The second encoding module is used for performing one-hot encoding on the category vector of each category to obtain a second one-hot vector corresponding to each category.

The second splicing module is used for splicing the plurality of second one-hot vectors to obtain a spliced feature vector of the product data.

The mapping module is used for mapping the spliced feature vector to a preset dimension to obtain a product feature vector.

According to an embodiment of the present disclosure, a screening module includes a first determining unit and a screening unit.

The first determining unit is used for determining the number of interactions between the user and each product according to the interaction behavior sequence.

The screening unit is used for determining a product whose number of interactions is greater than the preset threshold as screened product data.

According to an embodiment of the present disclosure, the product click rate determining apparatus 600 further includes an optimization training module, a first updating module and a second updating module.

The optimization training module is used for training the initial neural network by sequentially using the samples in the optimization training sample set to obtain a first loss value corresponding to the optimization training samples, where the samples in the optimization training sample set comprise positive samples and negative samples, the positive samples comprise a plurality of interactive behavior subsequences of the user, and the negative samples comprise a plurality of interactive behavior subsequences of at least one other user.

The first updating module is used for updating the model parameters of the embedding layer, the model parameters of the sampling module and the model parameters of the initial neural network by using a stochastic gradient descent algorithm in a case where the first loss value does not meet the first convergence threshold.

The second updating module is used for, in a case where the first loss value meets the first convergence threshold, determining the updated model parameters of the embedding layer as the target model parameters of the embedding layer, determining the updated model parameters of the sampling module as the target model parameters of the sampling module, and determining the initial neural network with updated model parameters as the trained contrastive learning target model.

According to an embodiment of the present disclosure, the clustering module 602 includes a first obtaining unit, a selecting unit, and a second obtaining unit.

The first obtaining unit is used for processing the plurality of product feature vectors by using the mean clustering model to obtain a plurality of product feature component vectors corresponding to each product feature vector.

The selecting unit is used for selecting a plurality of target product feature component vectors from the plurality of product feature component vectors of the plurality of products according to a preset selection condition.

The second obtaining unit is used for obtaining a plurality of clustering center vectors according to the plurality of target product feature component vectors, where the number of clustering center vectors is less than the number of product feature vectors.

According to an embodiment of the present disclosure, the first determination module 603 includes a second determination unit, a third determination unit, and a fourth determination unit.

The second determining unit is used for determining the similarity score between the preset feature vector and each clustering center vector according to the preset feature vector and the interaction behavior sequence.

The third determining unit is used for determining the relative time difference between the preset target product and each clustering center vector according to the interaction behavior sequence.

The fourth determining unit is used for determining a sampling probability corresponding to each product according to the plurality of similarity scores and the plurality of relative time differences.

According to an embodiment of the present disclosure, the third determining unit includes an obtaining subunit and a determining subunit.

The obtaining subunit is used for measuring, by means of relative time differences, the temporal relationship among the plurality of clustering center vectors corresponding to the interactive behavior sequence to obtain a timestamp set, wherein each timestamp in the timestamp set corresponds to the time of an interactive behavior between the user and one clustering center vector.

The determining subunit is used for determining the relative time difference between the preset target product and each clustering center vector based on the timestamp set.

According to an embodiment of the present disclosure, the fourth determining unit includes a converting subunit and a calculating subunit.

The converting subunit is used for converting each relative time difference into a timing score.

The calculating subunit is used for calculating the sampling probability of each product according to the timing score corresponding to the relative time difference and the similarity score.
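The way a similarity score and a relative time difference may be combined into a sampling probability can be sketched as follows. The concrete formulas are assumptions made only for illustration: the similarity score is taken as a dot product, the timing score as an exponential decay of the relative time difference, and the two are multiplied and normalized so that the probabilities sum to one. The function name `sampling_probabilities` and the parameter `decay` are hypothetical.

```python
import math

def sampling_probabilities(target_vec, center_vecs, timestamps, target_time, decay=0.1):
    """Combine similarity and recency into one sampling probability per center vector."""
    # Similarity score between the preset feature vector and each center (assumed: dot product).
    sims = [sum(a * b for a, b in zip(target_vec, c)) for c in center_vecs]
    # Relative time difference between the target product and each interaction.
    time_diffs = [target_time - t for t in timestamps]
    # Timing score: larger for more recent interactions (assumed: exponential decay).
    timing = [math.exp(-decay * d) for d in time_diffs]
    # Unnormalized weight, then normalization into probabilities.
    weights = [math.exp(s) * g for s, g in zip(sims, timing)]
    total = sum(weights)
    return [w / total for w in weights]

probs = sampling_probabilities(
    target_vec=[1.0, 0.0],
    center_vecs=[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    timestamps=[1.0, 9.0, 9.0],
    target_time=10.0,
)
```

In this example the second entry is both similar to the target and recent, so it receives the highest sampling probability; the first is similar but old, and the third is recent but dissimilar.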

According to an embodiment of the present disclosure, the training module 605 includes an input unit, a pooling unit, a perception unit, a calculating unit, an updating unit, and a fifth determining unit.

The input unit is used for inputting the preset feature vector and a plurality of product feature vectors corresponding to each interactive behavior subsequence into the attention mechanism layer to obtain a plurality of correlation weights, wherein each correlation weight represents the correlation between the preset feature vector and one product feature vector.

The pooling unit is used for inputting the plurality of correlation weights into a pooling layer to obtain a target characterization vector, wherein the target characterization vector represents the relation between the user and the preset target product.

The perception unit is used for inputting the target characterization vector into the multilayer perceptron layer and outputting a training prediction result.

The calculating unit is used for calculating a second loss value according to the training prediction result and the loss function.

The updating unit is used for updating the model parameters of the deep interest network by using a stochastic gradient descent algorithm when the second loss value does not meet the second convergence threshold.

The fifth determining unit is used for determining the updated deep interest network as the trained click rate prediction model when the second loss value meets the second convergence threshold.
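The data flow through the attention mechanism layer, the pooling layer, and the multilayer perceptron layer may be illustrated by the following simplified forward pass. Several simplifications are assumed: the correlation weight is a dot product rather than a learned attention unit, the pooling layer is a weighted sum of the behavior vectors, the multilayer perceptron is reduced to a single linear layer with a sigmoid output, and the loss function is taken to be binary cross-entropy. The names `din_forward`, `log_loss`, `mlp_w`, and `mlp_b` are hypothetical.

```python
import math

def din_forward(target, behaviors, mlp_w, mlp_b):
    """Return a predicted click probability for one target product and one subsequence."""
    # Attention layer: one correlation weight per behavior vector (assumed: dot product).
    weights = [sum(t * b for t, b in zip(target, beh)) for beh in behaviors]
    # Pooling layer: weighted sum of behavior vectors gives the target characterization vector.
    pooled = [
        sum(w * beh[d] for w, beh in zip(weights, behaviors))
        for d in range(len(target))
    ]
    # Perceptron layer (single linear layer here) over the pooled and target vectors.
    features = pooled + list(target)
    logit = sum(w * f for w, f in zip(mlp_w, features)) + mlp_b
    return 1.0 / (1.0 + math.exp(-logit))

def log_loss(pred, label):
    """Binary cross-entropy, used here as an assumed form of the second loss value."""
    eps = 1e-12
    return -(label * math.log(pred + eps) + (1 - label) * math.log(1 - pred + eps))

p_click = din_forward(
    target=[1.0, 0.0],
    behaviors=[[1.0, 0.0], [0.0, 1.0]],
    mlp_w=[0.5, 0.5, 0.5, 0.5],
    mlp_b=0.0,
)
second_loss = log_loss(p_click, label=1)
```

The second loss value computed this way would then drive the stochastic gradient descent update until the second convergence threshold is met.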

Any of the modules, units, and sub-units according to embodiments of the present disclosure, or at least part of the functionality of any of them, may be implemented in one module. Any one or more of the modules, units, and sub-units according to the embodiments of the present disclosure may be split into a plurality of modules for implementation. Any one or more of the modules, units, and sub-units according to the embodiments of the present disclosure may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or by any one of, or a suitable combination of, software, hardware, and firmware. Alternatively, one or more of the modules, units, and sub-units according to embodiments of the present disclosure may be at least partially implemented as computer program modules which, when executed, may perform the corresponding functions.

For example, any number of the obtaining module 601, the clustering module 602, the first determining module 603, the second determining module 604, the training module 605, and the prediction module 606 may be combined and implemented in one module/unit/sub-unit, or any one of them may be split into a plurality of modules/units/sub-units. Alternatively, at least part of the functionality of one or more of these modules/units/sub-units may be combined with at least part of the functionality of other modules/units/sub-units and implemented in one module/unit/sub-unit. According to an embodiment of the present disclosure, at least one of the obtaining module 601, the clustering module 602, the first determining module 603, the second determining module 604, the training module 605, and the prediction module 606 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or by any one of, or a suitable combination of, software, hardware, and firmware. Alternatively, at least one of the obtaining module 601, the clustering module 602, the first determining module 603, the second determining module 604, the training module 605, and the prediction module 606 may be at least partially implemented as a computer program module which, when executed, may perform the corresponding function.

It should be noted that the product click rate determining device in the embodiments of the present disclosure corresponds to the product click rate determining method in the embodiments of the present disclosure. For details of the device, refer to the description of the method, which is not repeated here.

Fig. 7 schematically shows a block diagram of an electronic device adapted to implement the above-described method according to an embodiment of the present disclosure. The electronic device shown in fig. 7 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701, which can perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to embodiments of the present disclosure.

In the RAM 703, various programs and data necessary for the operation of the electronic device 700 are stored. The processor 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. The processor 701 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. It is noted that the programs may also be stored in one or more memories other than the ROM 702 and the RAM 703. The processor 701 may also perform various operations of method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.

According to an embodiment of the present disclosure, the electronic device 700 may also include an input/output (I/O) interface 705, which is also connected to the bus 704. The electronic device 700 may also include one or more of the following components connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 710 as necessary, so that a computer program read from it is installed into the storage section 708 as necessary.

According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by the processor 701, performs the above-described functions defined in the system of the embodiment of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.

The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.

According to an embodiment of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium. Examples may include, but are not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the preceding. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 702 and/or the RAM 703 and/or one or more memories other than the ROM 702 and the RAM 703 described above.

Embodiments of the present disclosure also include a computer program product comprising a computer program, the computer program containing program code for performing the method provided by the embodiments of the present disclosure. When the computer program product is run on an electronic device, the program code causes the electronic device to implement the method of determining a click rate of a product provided by the embodiments of the present disclosure.

The computer program, when executed by the processor 701, performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure. The above described systems, devices, modules, units, etc. may be implemented by computer program modules according to embodiments of the present disclosure.

In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted in the form of a signal on a network medium, distributed, downloaded and installed via the communication section 709, and/or installed from the removable medium 711. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.

In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages. In particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special-purpose hardware-based systems which perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions. Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations are within the scope of the present disclosure.

The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims

1. A method of product click-through rate determination, comprising:

obtaining a product data training sample set, wherein the product data training sample set comprises a plurality of training samples, each training sample comprises a plurality of product feature vectors of a preset dimension and an interactive behavior sequence, and the interactive behavior sequence represents the time and the order of interactive behaviors between a user and a plurality of products;

determining the sampling probability corresponding to each clustering center vector according to the preset feature vector of a preset target product and the plurality of clustering center vectors corresponding to the interactive behavior sequence;

training a deep interest network by using the preset feature vectors and the plurality of interactive behavior subsequences to obtain a trained click rate prediction model;

2. The method of claim 1, wherein the plurality of product feature vectors are obtained by:

screening the obtained multiple product data to obtain screened product data;

for each of the plurality of screened product data, performing one-hot encoding on a plurality of categories of the product data when the product attributes of the product data meet a first preset condition, to obtain a first one-hot vector corresponding to each category, wherein the product attributes comprise product price, product brand, or purchase time;

concatenating the first one-hot vectors to obtain a concatenated feature vector of the product data;

when the product attributes of the product data do not meet the first preset condition, processing the product data by using a bucketing method to obtain category vectors of a plurality of categories;

and mapping the concatenated feature vector to a preset dimension to obtain the product feature vector.
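For illustration only, the encoding steps recited above may be sketched as follows. The sketch assumes that the first preset condition is "the attribute is categorical" (here, the brand), that a continuous attribute (here, the price) is bucketed by fixed boundaries and then one-hot encoded, and that the mapping to the preset dimension is a fixed linear projection. The names `one_hot`, `bucket_index`, `encode_product`, and all sample values are hypothetical.

```python
def one_hot(index, size):
    """Return a one-hot vector of the given size with a 1.0 at position index."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def bucket_index(value, boundaries):
    """Return the bucket number of value given ascending bucket boundaries."""
    return sum(value >= b for b in boundaries)

def encode_product(brand, price, brands, price_boundaries, projection):
    # Categorical attribute: direct one-hot encoding.
    brand_vec = one_hot(brands.index(brand), len(brands))
    # Continuous attribute: bucketing first, then one-hot encoding of the bucket.
    price_vec = one_hot(bucket_index(price, price_boundaries), len(price_boundaries) + 1)
    # Concatenate the per-attribute vectors.
    concatenated = brand_vec + price_vec
    # Map the concatenated vector to the preset dimension (one output per projection row).
    return [sum(w * x for w, x in zip(row, concatenated)) for row in projection]

projection = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [0.5, 0.5, 0.5, 0.5, 0.5],
]
feature_vec = encode_product(
    brand="globex",
    price=50.0,
    brands=["acme", "globex"],
    price_boundaries=[10.0, 100.0],
    projection=projection,
)
```

In a trained model the projection matrix would correspond to learned embedding-layer parameters rather than the fixed values used here.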

3. The method according to claim 2, wherein the step of performing screening processing on the acquired plurality of product data to obtain screened product data includes:

4. The method according to claim 1 or 2, wherein the product feature vector is obtained by processing product information of a product by using an embedding layer; the sampling probability is obtained by processing by using a sampling module;

the method further comprises the following steps:

training an initial neural network by sequentially utilizing samples in an optimized training sample set to obtain a first loss value corresponding to the optimized training samples, wherein the samples in the optimized training sample set comprise a positive sample and a negative sample, the positive sample comprises a plurality of interactive behavior subsequences of a user, and the negative sample comprises a plurality of interactive behavior subsequences of at least one other user;

and when the first loss value meets the first convergence threshold, determining the updated model parameters of the embedding layer as the target model parameters of the embedding layer, determining the updated model parameters of the sampling module as the target model parameters of the sampling module, and determining the initial neural network with updated model parameters as a trained contrastive learning target model.

5. The method of claim 4, wherein the first loss value $L^{C}$ is as shown in equations (1) and (2):

$$D_{\omega}(p, q) = \exp\left(p^{T} \cdot W \cdot q\right) \tag{2}$$
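As a small numeric illustration of equation (2), the bilinear score between two vectors p and q under a weight matrix W may be computed as follows. The function name `bilinear_score` and the sample values are hypothetical, and the sketch does not reproduce equation (1).

```python
import math

def bilinear_score(p, W, q):
    """Compute exp(p^T W q) for vectors p, q and matrix W given as a list of rows."""
    Wq = [sum(w * x for w, x in zip(row, q)) for row in W]
    return math.exp(sum(a * b for a, b in zip(p, Wq)))

identity = [[1.0, 0.0], [0.0, 1.0]]
score_same = bilinear_score([1.0, 0.0], identity, [1.0, 0.0])
score_orthogonal = bilinear_score([1.0, 0.0], identity, [0.0, 1.0])
```

With W set to the identity matrix, identical unit vectors yield a score of e, while orthogonal vectors yield a score of 1, so more closely aligned vectors receive a larger score.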

6. The method of claim 1, wherein said processing a plurality of said product feature vectors using a mean clustering model to obtain a plurality of cluster center vectors comprises:

processing the plurality of product feature vectors by using a mean clustering model to obtain a plurality of product feature component vectors corresponding to each product feature vector;

and obtaining a plurality of clustering center vectors according to the plurality of target product feature component vectors, wherein the number of clustering center vectors is less than the number of product feature vectors.

7. The method of claim 1, wherein the determining a sampling probability corresponding to each of the cluster center vectors according to a preset feature vector of a preset target product and a plurality of the cluster center vectors corresponding to the sequence of interaction behaviors comprises:

determining similarity scores of the preset feature vectors and each clustering center vector according to the preset feature vectors and the interaction behavior sequences;

determining a sampling probability corresponding to each of the products based on the plurality of similarity scores and the plurality of relative time differences.

8. The method of claim 7, wherein said determining a relative time difference of said preset target product and each of said cluster center vectors according to said sequence of interaction behaviors comprises:

determining a relative time difference between the preset target product and each of the cluster center vectors based on the set of timestamps.

9. The method of claim 7 or 8, wherein said determining a sampling probability corresponding to each of said products from a plurality of said similarity scores and a plurality of said relative time differences comprises:

for each of the relative time differences, converting the relative time difference into a timing score;

and calculating the sampling probability of each product according to the time sequence score and the similarity score corresponding to the relative time difference.

10. The method of claim 1, wherein the training of the deep interest network using the preset feature vectors and the plurality of interactive behavior subsequences to obtain a trained click-through rate prediction model comprises:

inputting a preset feature vector and a plurality of product feature vectors corresponding to each interactive behavior subsequence into an attention mechanism layer to obtain a plurality of correlation weights, wherein one correlation weight represents the correlation between the preset feature vector and one product feature vector;

calculating a second loss value according to the training prediction result and a loss function;