WO2020093781A1

WO2020093781A1 - Multimedia resource estimated click through rate determination method and apparatus, and server

Info

Publication number: WO2020093781A1
Application number: PCT/CN2019/105452
Authority: WO
Inventors: 牛亚男
Original assignee: 北京达佳互联信息技术有限公司
Priority date: 2018-11-06
Filing date: 2019-09-11
Publication date: 2020-05-14
Also published as: CN109408724B; CN109408724A

Abstract

A multimedia resource estimated click through rate determination method and apparatus, and a server, relating to the field of information recommendation. The method comprises: obtaining user behavior information of the current user (301); obtaining multimedia attribute information of a first multimedia resource (302), the first multimedia resource being a multimedia resource to be recommended to the current user; calling a click through rate estimation model (303), the click through rate estimation model comprising an embedding layer and a click through rate estimation network, the embedding layer comprising a weight matrix corresponding to at least one information type, and the click through rate estimation network being used for taking an embedding vector output by the embedding layer as an input and outputting an estimated click through rate of the multimedia resource; and inputting the user behavior information and the multimedia attribute information into the click through rate estimation model, and outputting the estimated click through rate of the user on the first multimedia resource (304). According to the method, the accuracy of click through rate estimation can be improved.

Description

Method, device and server for determining estimated click rate of multimedia resources

This disclosure claims priority to Chinese patent application No. 201811314076.6, filed with the Chinese Patent Office on November 06, 2018 and titled "Multimedia Resource Estimated Click Rate Determination Method, Device and Server", the entire contents of which are incorporated in this application by reference.

Technical field

The present disclosure relates to the field of information recommendation, in particular to a method, device and server for determining the estimated click rate of multimedia resources.

Background

In an information recommendation system, the related art estimates the click-through rate (CTR) of multimedia resources and displays the multimedia resources with a higher estimated click-through rate to users, so as to increase the probability that users click the multimedia resources and improve the accuracy of information recommendation.

When estimating the click-through rate of a multimedia resource, the related art commonly uses a deep neural network (DNN) click-through rate estimation model. The input data of the model can be divided into different fields (also called domains), for example the multimedia resource field and the user field. As shown in the schematic diagram of the click-through rate estimation model in FIG. 1, the input data can be converted into low-dimensional embedding vectors through an embedding layer, where different fields correspond to different weight matrices in the embedding layer. Each piece of feature information is mapped to its embedding vector through the corresponding weight matrix, and there is a one-to-one correspondence between feature information and embedding vectors. The weight matrix corresponding to a field can therefore also be called the embedding mapping table of that field. Finally, the embedding vectors are input into the deep neural network, which outputs the estimated click-through rate of the multimedia resource.

However, within a field, some feature information may be scarce. For example, when the feature information in the user field includes the identifiers of images the user has clicked, a user who clicks images only a few times yields only a few collected image identifiers. When the click-through rate estimation model is trained, a small amount of feature information leads to insufficient learning of the corresponding weight matrix; that is, the resulting embedding vectors are weakly representative, which lowers the accuracy of the estimated click-through rate.

Summary of the invention

The present disclosure provides a method, device and server for determining the estimated click rate of multimedia resources, which can solve the problem of low accuracy of the estimated click rate.

According to a first aspect of an embodiment of the present disclosure, a method for determining a multimedia resource estimated click rate includes:

Obtaining user behavior information of a user;

Acquiring multimedia attribute information of a first multimedia resource, where the first multimedia resource is a multimedia resource to be recommended to the user;

Calling a click-through rate prediction model, where the click-through rate prediction model includes an embedding layer and a click-through rate prediction network, the embedding layer includes a weight matrix corresponding to at least one information type, and the click-through rate prediction network is used to take the embedding vector output by the embedding layer as an input and output the estimated click rate of the multimedia resource;

Inputting the user behavior information and the multimedia attribute information into the click-through rate estimation model, and outputting the user's estimated click-through rate for the first multimedia resource.

Optionally, inputting the user behavior information and the multimedia attribute information into the click-through rate estimation model, and outputting the user's estimated click-through rate to the first multimedia resource includes:

For each information type, input the information of the user behavior information and the multimedia attribute information that belong to the information type into the weight matrix corresponding to the information type in the embedding layer, and output at least one embedding vector;

Input the at least one embedding vector output by the embedding layer into the click-through rate estimation network, and output the user's estimated click-through rate for the first multimedia resource.

Optionally, the training method of the click-through rate prediction model includes:

Acquiring the initial model of the click-through rate estimation model;

Obtaining at least one training sample, the training sample including multimedia attribute information of a second multimedia resource, user behavior information of a sample user when browsing the second multimedia resource, and the sample user's click status on the second multimedia resource, the click status being either clicked or not clicked;

Training the initial model based on the at least one training sample to obtain the click-through rate prediction model.

Optionally, the initial model includes an initial embedding layer and an initial click rate prediction network, and the initial embedding layer includes an initial weight matrix corresponding to at least one information type;

Training the initial model based on the at least one training sample to obtain the click-through rate estimation model includes:

For the initial weight matrix corresponding to each information type, adjust the parameters of the initial weight matrix based on the training samples containing the information type to obtain the weight matrix corresponding to the information type after training;

Adjust the parameters of the initial click-through rate estimation network based on the at least one training sample to obtain a trained click-through rate estimation network;

Based on the weight matrix corresponding to at least one information type after training and the trained click-through rate estimation network, the click-through rate estimation model is obtained.

Optionally, during the parameter adjustment of the initial weight matrix corresponding to each information type, when the number of training samples containing a first information type is less than the number of training samples containing a second information type, the learning rate corresponding to the first information type is greater than the learning rate corresponding to the second information type.

Optionally, the information type includes work identification, author identification, and / or style identification.

Optionally, the user behavior information includes click history information, attention information and / or favorite information, where the click history information is used to indicate multimedia attribute information of the multimedia resources clicked by the user, the attention information is used to indicate multimedia attribute information of the multimedia resources the user follows, and the favorite information is used to indicate multimedia attribute information of the multimedia resources the user has added to favorites.

According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for determining an estimated click rate of multimedia resources, including:

The obtaining unit is configured to obtain user behavior information of the user;

The obtaining unit is further configured to obtain multimedia attribute information of a first multimedia resource, where the first multimedia resource is a multimedia resource to be recommended to the user;

The calling unit is configured to call a click-through rate prediction model, where the click-through rate prediction model includes an embedding layer and a click-through rate prediction network, the embedding layer includes a weight matrix corresponding to at least one information type, and the click-through rate prediction network is used to take the embedding vector output by the embedding layer as an input and output the estimated click rate of the multimedia resource;

The determining unit is configured to input the user behavior information and the multimedia attribute information into the click-through rate estimation model, and output the estimated click-through rate of the user to the first multimedia resource.

Optionally, the determining unit is configured to:

Input at least one embedding vector output by the embedding layer into the click-through rate estimation network, and output the user's estimated click-through rate to the first multimedia resource.

Optionally, the device further includes a training unit, the training unit is configured to:

Acquiring the initial model of the click-through rate estimation model;

Optionally, the initial model includes an initial embedding layer and an initial click rate estimation network, and the initial embedding layer includes an initial weight matrix corresponding to at least one information type;

The training unit is configured to:

According to a third aspect of the embodiments of the present disclosure, a server is provided, including:

One or more processors;

One or more memories for storing one or more processor executable instructions;

Wherein, the one or more processors are configured as:

Obtain user behavior information of a user;

The user behavior information and the multimedia attribute information are input to the click-through rate estimation model, and the user's estimated click-through rate for the first multimedia resource is output.

According to a fourth aspect of the embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, where instructions in the storage medium, when executed by a processor of a server, enable the server to execute a method for determining a multimedia resource estimated click rate, the method including:

Obtaining user behavior information of a user;

Input the user behavior information and the multimedia attribute information into the click-through rate estimation model, and output the estimated click-through rate of the user to the first multimedia resource.

According to a fifth aspect of the embodiments of the present disclosure, an application program / computer program product is provided. When the application program / computer program product runs on a server, the server is caused to perform a method for determining a multimedia resource estimated click rate, the method including:

Obtain the user behavior information of the current user;

Acquiring multimedia attribute information of a first multimedia resource, where the first multimedia resource is a multimedia resource to be recommended to the current user;

The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:

Within the user behavior information and the multimedia attribute information, the embedding vectors for information of the same information type are determined by the same weight matrix in the embedding layer. That weight matrix is therefore learned from more data, which improves the representativeness of the embedding vectors. As a result, when the click rate of a multimedia resource is estimated based on the method of this embodiment, the accuracy of the estimated click rate is improved.

It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.

Brief description of the drawings

In order to explain the embodiments of the present disclosure and the technical solutions of the prior art more clearly, the drawings required by the embodiments and the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.

Fig. 1 is a schematic diagram of a click-through rate estimation model according to an exemplary embodiment.

Fig. 2 is a diagram of an implementation environment according to an exemplary embodiment.

Fig. 3 is a flow chart of a method for determining a multimedia resource estimated click rate according to an exemplary embodiment.

Fig. 4 is a flow chart of a method for determining an estimated click rate of a multimedia resource according to an exemplary embodiment.

Fig. 5 is a schematic diagram of an application program interface according to an exemplary embodiment.

Fig. 6 is a schematic diagram of a click rate prediction model according to an exemplary embodiment.

Fig. 7 is a schematic diagram of a click rate prediction model according to an exemplary embodiment.

Fig. 8 is a flow chart of a method for training a click rate prediction model according to an exemplary embodiment.

Fig. 9 is a flow chart of a method for training a click rate prediction model according to an exemplary embodiment.

Fig. 10 is a block diagram of a device for determining an estimated click rate of a multimedia resource according to an exemplary embodiment.

Fig. 11 is a block diagram of a device for determining an estimated click rate of a multimedia resource according to an exemplary embodiment.

Detailed description

In order to make the objectives, technical solutions, and advantages of the present disclosure clearer, the disclosure will be further described in detail below with reference to the accompanying drawings and embodiments. Obviously, the described embodiments are only a part of the embodiments of the present disclosure, but not all the embodiments. Based on the embodiments in the present disclosure, all other embodiments obtained by a person of ordinary skill in the art without creative efforts fall within the protection scope of the present disclosure.

This embodiment provides an implementation environment diagram of a method for determining a multimedia resource estimated click rate. The implementation environment diagram is shown in FIG. 2. The implementation environment may include multiple terminals 201 and a server 202 for providing services for the multiple terminals 201. A plurality of terminals 201 are connected to the server 202 through a wireless or wired network, and the plurality of terminals 201 may be computer devices or smart terminals that can access the server 202. The terminal 201 may be installed with an application program for recommending multimedia resources (such as pictures, short videos, etc.), and the user may log in to the above application program. The server 202 can provide background services for the above-mentioned applications and record user behavior information of each user. The server 202 may also have at least one database for storing click-through rate estimation models, multimedia resources and corresponding multimedia attribute information, user behavior information of various users, and so on.

This embodiment provides a method for determining the estimated click rate of multimedia resources. The method may be implemented by a server. As shown in the flowchart of the method in FIG. 3, the processing flow of the method may include the following steps:

In step S301, the server acquires user behavior information of the user.

In step S302, the server acquires multimedia attribute information of the first multimedia resource.

The first multimedia resource is a multimedia resource to be recommended to the user.

In step S303, the server calls the click-through rate estimation model.

The click-through rate prediction model includes an embedding layer and a click-through rate prediction network. The embedding layer includes a weight matrix corresponding to at least one information type. The click-through rate prediction network is used to take the embedding vector output by the embedding layer as input and output the estimated click-through rate of the multimedia resource.

In step S304, the server inputs user behavior information and multimedia attribute information into the click-through rate estimation model, and outputs the user's estimated click-through rate to the first multimedia resource.

Optionally, inputting the user behavior information and the multimedia attribute information into the click-through rate estimation model, and outputting the user's estimated click-through rate for the first multimedia resource, includes:

For each information type, input information belonging to the information type in the user behavior information and multimedia attribute information into the weight matrix corresponding to the information type in the embedding layer, and output at least one embedding vector;

The at least one embedding vector output from the embedding layer is input into the click rate estimation network, and the user's estimated click rate for the first multimedia resource is output.

Optionally, the training methods of the CTR prediction model include:

Get the initial model of the click-through rate estimation model;

Obtain at least one training sample. The training sample includes multimedia attribute information of the second multimedia resource, user behavior information of the sample user when browsing the second multimedia resource, and the click status of the sample user on the second multimedia resource, the click status being either clicked or not clicked;

The initial model is trained based on at least one training sample to obtain a click-through rate estimation model.

Training the initial model based on at least one training sample to obtain a click-through rate estimation model, including:

For the initial weight matrix corresponding to each information type, the parameters of the initial weight matrix are adjusted based on the training samples containing the information type to obtain the trained weight matrix corresponding to the information type;

Adjust the parameters of the initial click-through rate estimation network based on at least one training sample to obtain the trained click-through rate estimation network;

Based on the weight matrix corresponding to the at least one information type after training and the click-through rate prediction network after training, a click-through rate prediction model is obtained.
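The training procedure above can be illustrated with a deliberately small sketch. It is not the model of this disclosure: the click-through rate estimation network is replaced by a dot product plus a bias so that the gradients can be written by hand, only the work identification type is modeled, and the names `train` and `predict`, the sample layout, and all hyperparameters are assumptions chosen for illustration. What it does show is one weight matrix (`table`) being adjusted by every sample that contains a work identifier, whether the identifier came from the click history or from the candidate resource.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def _row(table, work_id, dim, rng):
    # Lazily create the embedding row for an identifier (an assumption of this sketch).
    if work_id not in table:
        table[work_id] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return table[work_id]


def _user_vector(table, history, dim):
    # Mean of the embedding rows of the clicked works; history must be non-empty.
    return [sum(table[w][j] for w in history) / len(history) for j in range(dim)]


def predict(table, bias, history, candidate):
    cand_vec = table[candidate]
    user_vec = _user_vector(table, history, len(cand_vec))
    return sigmoid(sum(u * c for u, c in zip(user_vec, cand_vec)) + bias)


def train(samples, dim=2, learning_rate=0.1, epochs=200, seed=0):
    """samples: list of (click_history, candidate_work_id, clicked) with clicked in {0, 1}."""
    rng = random.Random(seed)
    table, bias = {}, 0.0
    for _ in range(epochs):
        for history, candidate, clicked in samples:
            for w in history:
                _row(table, w, dim, rng)
            cand_vec = _row(table, candidate, dim, rng)
            user_vec = _user_vector(table, history, dim)
            # Derivative of the log loss with respect to the logit.
            error = predict(table, bias, history, candidate) - clicked
            # Both gradients are computed before any row is modified.
            grad_cand = [error * u for u in user_vec]
            grad_hist = [error * c / len(history) for c in cand_vec]
            for j in range(dim):
                cand_vec[j] -= learning_rate * grad_cand[j]
            for w in history:
                for j in range(dim):
                    table[w][j] -= learning_rate * grad_hist[j]
            bias -= learning_rate * error
    return table, bias


# Hypothetical training samples: the same history, one clicked and one unclicked candidate.
samples = [(["w1", "w2"], "w3", 1), (["w1", "w2"], "w4", 0)]
table, bias = train(samples)
p_clicked = predict(table, bias, ["w1", "w2"], "w3")
p_not_clicked = predict(table, bias, ["w1", "w2"], "w4")
```

Because history identifiers and candidate identifiers share `table`, each row receives updates from both roles, which is the effect the shared weight matrix is intended to have.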

Optionally, during the parameter adjustment of the initial weight matrix corresponding to each information type, when the number of training samples containing the first information type is less than the number of training samples containing the second information type, the learning rate corresponding to the first information type is greater than the learning rate corresponding to the second information type.
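One way to satisfy the learning-rate condition is to scale a base rate by a decreasing function of the sample count. The disclosure only requires that the rarer information type receive the larger rate; the inverse square-root rule, the function name `learning_rates`, and the sample counts below are assumptions made for illustration.

```python
import math


def learning_rates(sample_counts, base_rate=0.01):
    """Return one learning rate per information type; fewer samples gives a larger rate."""
    if any(count <= 0 for count in sample_counts.values()):
        raise ValueError("every information type needs at least one training sample")
    largest = max(sample_counts.values())
    # The most frequent type keeps base_rate; rarer types are scaled up.
    return {
        info_type: base_rate * math.sqrt(largest / count)
        for info_type, count in sample_counts.items()
    }


# Hypothetical numbers of training samples containing each information type.
counts = {"work_id": 1_000_000, "author_id": 250_000, "style_id": 10_000}
rates = learning_rates(counts)
```

Any other strictly decreasing mapping from sample count to learning rate would meet the stated condition equally well.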

Optionally, the information type includes work identification, author identification and / or style identification.

Optionally, the user behavior information includes click history information, attention information and / or favorite information, the click history information includes information recorded when the user clicks on any multimedia resource, the attention information includes information recorded when the user clicks on the attention option, and the favorite information includes Information recorded when the user clicks the favorite option.

In this embodiment, the method for determining the estimated click rate of multimedia resources is introduced in conjunction with a specific implementation. The method may be implemented by a server. As shown in the flowchart of the method in FIG. 4, the processing flow of the method may include the following steps:

In step S401, the server acquires user behavior information of the user.

When the user starts the application, the terminal can send a multimedia resource acquisition request to the server; alternatively, when the user searches for multimedia resources through the application, the terminal can also send the multimedia resource acquisition request to the server. When the server receives the multimedia resource acquisition request sent by the terminal, it may trigger the processing logic for recommending multimedia resources, which is carried out by the method for determining the estimated click rate of multimedia resources provided in this embodiment. This embodiment does not limit the specific manner of triggering the processing logic for recommending multimedia resources.

At this time, the server may obtain the user behavior information of the user from the stored user behavior information of each user according to the user's identification information. In a possible implementation manner, the user behavior information may include click history information, attention information and / or favorite information. The click history information may be used to represent multimedia attribute information of the multimedia resources clicked by the user, the attention information may be used to represent multimedia attribute information of the multimedia resources followed by the user, and the favorite information may be used to represent multimedia attribute information of the multimedia resources the user has added to favorites.

Optionally, for the multimedia attribute information of the multimedia resource recorded by the server, the information type may include work identification, author identification, and / or style identification.

The following describes the process of the server recording the above information:

As shown in the schematic diagram of the application program interface shown in FIG. 5, the terminal can display the multimedia resources provided by the application program, and the display interface of the multimedia resources can include attention options and favorite options.

When the user clicks to view the multimedia resource, the server may receive a request to load the multimedia resource, and then may add corresponding multimedia attribute information to the user's click history information for storage. For example, when the user clicks to watch a short video, the server may add the work identification, author identification, and / or style identification of the short video to the user's click history information.

If the user wants to subscribe to the multimedia resource, they can click the attention (follow) option in the display interface. The server may then receive a request to add attention to the multimedia resource, and add the corresponding multimedia attribute information to the user's attention information for storage. Of course, the attention option may apply not only to a multimedia resource but also to the author or style of a multimedia resource, which is not limited in this embodiment. For example, if the attention option applies to the author of the multimedia resource, the server can add the author identification of the multimedia resource to the user's attention information; thereafter, when the author updates their works, the user can receive the corresponding update notification, achieving the effect of a subscription.

If the user likes the multimedia resource, they can click the favorite option in the display interface. The processing by which the server adds the corresponding favorite information is similar to the processing of adding attention information, and will not be repeated here.

Of course, the above solutions for recording click history information, attention information, and favorite information are all optional solutions, that is, the user behavior information recorded by the server may be one or more of click history information, attention information, and favorite information. The type of information contained therein may also be one or more of a work identification, an author identification and a style identification, and may also include other information types. The specific information type is not limited in this embodiment.
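The recording process described above can be sketched as a small in-memory store. The store layout, the function name `record_behavior`, and the field names are hypothetical; the disclosure does not specify how the server persists this information.

```python
from collections import defaultdict

# Hypothetical store: user id -> behavior kind -> list of multimedia attribute records.
behavior_store = defaultdict(
    lambda: {"click_history": [], "attention": [], "favorite": []}
)


def record_behavior(user_id, kind, work_id=None, author_id=None, style_id=None):
    """Append the attribute fields that are present to the user's record of the given kind."""
    attributes = {"work_id": work_id, "author_id": author_id, "style_id": style_id}
    entry = {name: value for name, value in attributes.items() if value is not None}
    behavior_store[user_id][kind].append(entry)
    return entry


# The user watches a short video, then follows its author.
record_behavior("user_1", "click_history", work_id="w42", author_id="a7", style_id="comedy")
record_behavior("user_1", "attention", author_id="a7")
```

Each record may carry any subset of the information types, matching the statement that work, author, and style identification are all optional.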

In step S402, the server acquires multimedia attribute information of the first multimedia resource.

The first multimedia resource is a multimedia resource to be recommended to the user. The multimedia attribute information may be provided by the author when uploading the corresponding multimedia resource, or may be generated automatically by the server based on the author and content of the multimedia resource. This embodiment does not limit the generation method of the multimedia attribute information.

The first multimedia resource may include multiple multimedia resources; for example, it may be a currently popular multimedia resource, or it may be a search result obtained from a search by the user. Since the number of multimedia resources that the user can view is limited, the server can determine which first multimedia resources to display, and can also sort the displayed first multimedia resources so that the multimedia resources that better match the user's needs are displayed first. In this embodiment, the estimated click rate of the multimedia resource is used to achieve this purpose. The higher the estimated click rate, the greater the possibility that the user clicks the multimedia resource, that is, the better the multimedia resource meets the user's needs.

When determining the estimated click rate of a first multimedia resource, the server may obtain the multimedia attribute information of the first multimedia resource from the stored multimedia resources and their corresponding multimedia attribute information, according to the identification information of the first multimedia resource.

In step S403, the server calls the click-through rate estimation model.

As shown in the schematic diagram of the click-through rate prediction model in FIG. 6, the click-through rate prediction model provided in this embodiment may include an embedding layer and a click-through rate prediction network. The embedding layer may include a weight matrix corresponding to at least one information type. The click-through rate prediction network can be used to take the embedding vector output by the embedding layer as an input and output the estimated click rate of the multimedia resource. The click-through rate prediction network may be a deep neural network, a convolutional neural network, or the like. The technician can design the click-through rate prediction network according to requirements, and the specific network structure is not limited in this embodiment.

Of course, the server can also store multiple click-through rate prediction models and call a click-through rate prediction model that meets a preset condition. For example, when the server also stores the accuracy of each click-through rate prediction model, it can call any click-through rate prediction model whose accuracy is greater than a preset threshold. This embodiment does not limit the specific method of calling the click-through rate prediction model.

The click-through rate prediction model can be periodically trained, and the latest click-through rate prediction model obtained by training can be stored in the server. When the server needs to determine the estimated click rate of the first multimedia resource, it can call the latest click rate prediction model.
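The two calling strategies described above (an accuracy threshold, and preferring the most recently trained model) can be combined as in the following sketch. Combining them is an assumption of this example, as are the record fields `name`, `accuracy`, and `trained_at` and the function name `select_model`.

```python
def select_model(stored_models, accuracy_threshold):
    """Return the most recently trained model whose accuracy exceeds the threshold, or None."""
    eligible = [m for m in stored_models if m["accuracy"] > accuracy_threshold]
    if not eligible:
        return None
    return max(eligible, key=lambda m: m["trained_at"])


# Hypothetical stored models; trained_at is an increasing training-round counter.
stored_models = [
    {"name": "ctr_v1", "accuracy": 0.71, "trained_at": 1},
    {"name": "ctr_v2", "accuracy": 0.78, "trained_at": 2},
    {"name": "ctr_v3", "accuracy": 0.64, "trained_at": 3},
]
chosen = select_model(stored_models, accuracy_threshold=0.70)
```

Here the newest model, `ctr_v3`, is skipped because its accuracy is below the threshold, so the newest eligible model is returned instead.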

In step S404, for each information type, the server inputs the information belonging to the information type in the user behavior information and multimedia attribute information into the weight matrix corresponding to the information type in the embedding layer, and outputs at least one embedding vector.

The server may input the user behavior information obtained in the above step S401 and the multimedia attribute information of the first multimedia resource obtained in step S402 into the embedding layer of the click rate prediction model.

In the embedding layer, the input data can be encoded so that it suits neural network computation. Generally speaking, the vector obtained after encoding has a large dimension, so dimensionality reduction can be performed on the encoded vector through the weight matrix in the embedding layer, reducing the IO (Input-Output) overhead during processing. For example, one-hot encoding a piece of feature information of the input data yields the vector (0,0,0,1,0,0,0,0,0), which can be converted into the embedding vector (0.145,0.152) through the weight matrix.
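The one-hot example can be reproduced in a few lines. Multiplying a one-hot vector by the weight matrix selects a single row, which is why the weight matrix can be read as a lookup table. The 9x2 matrix below is hypothetical except for row 3, which is set to the values quoted in the text.

```python
def one_hot(index, size):
    """Encode a categorical index as a one-hot vector."""
    vector = [0.0] * size
    vector[index] = 1.0
    return vector


def embed(one_hot_vector, weight_matrix):
    """Vector-matrix product: (size,) x (size, dim) -> (dim,)."""
    dim = len(weight_matrix[0])
    return [
        sum(x * row[j] for x, row in zip(one_hot_vector, weight_matrix))
        for j in range(dim)
    ]


# Hypothetical 9x2 weight matrix; only row 3 comes from the example in the text.
weights = [[0.01 * i, 0.02 * i] for i in range(9)]
weights[3] = [0.145, 0.152]

encoded = one_hot(3, 9)             # (0,0,0,1,0,0,0,0,0)
embedding = embed(encoded, weights)
```

In practice the product is not computed explicitly; the row is fetched directly by index, which gives the same vector at far lower cost.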

Each information type has a corresponding weight matrix. In user behavior information and multimedia attribute information, the server can obtain the vector corresponding to the feature information of the same information type, input the vector corresponding to the same information type into the corresponding weight matrix, and output the embedded vector.

As mentioned above in step S401, in a possible implementation manner, the information type may include a work identification, an author identification and / or a style identification. Exemplarily, as shown in the schematic diagram of the click-through rate estimation model in FIG. 7, the user behavior information is click history information, in which each piece of feature information is a work identification clicked by the user, and the feature information of the multimedia attribute information is the work identification of the multimedia resource. The embedding mapping table shown in FIG. 7 is the weight matrix corresponding to the work identification. The server can determine the embedding vectors for both the work identifications in the user behavior information and the work identification in the multimedia attribute information through this one embedding mapping table. That is, all work identification information uses the same weight matrix in the embedding layer, regardless of whether it comes from the multimedia resource domain or the user domain.
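The sharing described above can be sketched as an embedding layer keyed by information type rather than by domain. The class name `EmbeddingLayer`, the lazy random initialization of rows, and the identifiers are assumptions for illustration only.

```python
import random


class EmbeddingLayer:
    """One embedding mapping table per information type, shared by all domains."""

    def __init__(self, info_types, dim=4, seed=0):
        self.dim = dim
        self._rng = random.Random(seed)
        self.tables = {info_type: {} for info_type in info_types}

    def lookup(self, info_type, identifier):
        table = self.tables[info_type]
        if identifier not in table:
            # Rows are created on first use (an assumption of this sketch).
            table[identifier] = [self._rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        return table[identifier]


layer = EmbeddingLayer(["work_id", "author_id", "style_id"])

click_history = ["w17", "w42"]   # user domain: works the user has clicked
candidate_work = "w42"           # multimedia resource domain: work to be scored

user_vectors = [layer.lookup("work_id", w) for w in click_history]
candidate_vector = layer.lookup("work_id", candidate_work)
```

Because both lookups go through the `work_id` table, the candidate work and the matching entry in the click history resolve to the very same row, so training signal from either domain updates it.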

In step S405, the server inputs at least one embedding vector output from the embedding layer into the click-through rate estimation network, and outputs the user's estimated click-through rate to the first multimedia resource.

After the server determines each embedding vector of the input data, it can input the embedding vector into the click-through rate estimation network, perform data processing through each network node in the click-through rate estimation network, and output the estimated click-through rate of the first multimedia resource.
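A minimal forward pass through an estimation network might look as follows. The disclosure leaves the network structure open, so the single hidden layer, the ReLU and sigmoid activations, the concatenation of embedding vectors, and every weight value here are assumptions of this sketch.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(x * w for x, w in zip(inputs, row)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def estimate_ctr(embedding_vectors, hidden_w, hidden_b, out_w, out_b):
    """Concatenate the embedding vectors, apply one hidden layer, and squash to (0, 1)."""
    features = [x for vector in embedding_vectors for x in vector]
    hidden = relu(dense(features, hidden_w, hidden_b))
    logit = dense(hidden, [out_w], [out_b])[0]
    return sigmoid(logit)


# Hypothetical inputs: two 2-dimensional embedding vectors and hand-picked weights.
embeddings = [[0.145, 0.152], [0.2, -0.1]]
hidden_w = [[0.5, -0.3, 0.8, 0.1], [-0.2, 0.4, 0.3, 0.7]]
hidden_b = [0.0, 0.1]
out_w = [1.2, -0.6]
out_b = -0.5
ctr = estimate_ctr(embeddings, hidden_w, hidden_b, out_w, out_b)
```

The sigmoid output keeps the estimate between 0 and 1, so it can be read as a click probability and compared across candidate resources.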

For each first multimedia resource, the estimated click rate can be determined through the above steps S402-S405. The estimated click rates of a plurality of first multimedia resources may be determined in parallel. This embodiment does not limit the order in which the estimated click rates of the plurality of first multimedia resources are determined.

After the server determines the estimated click rate of each first multimedia resource, it may sort the first multimedia resources in descending order of estimated click rate, and may send each first multimedia resource and its corresponding rank to the terminal. The terminal may then display the received first multimedia resources in that order. Optionally, when the number of multimedia resources to be displayed is a preset number, the server may send the top-ranked preset number of first multimedia resources and their corresponding ranks to the terminal, and the terminal may display that preset number of first multimedia resources.
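The sorting and truncation described above can be sketched as follows. The function name `rank_resources`, the resource identifiers, and the click-rate values are illustrative assumptions, not part of the disclosure.

```python
def rank_resources(estimated_ctr, preset_number=None):
    """Sort resources by estimated click rate, largest first.

    estimated_ctr maps a resource identifier to its estimated click rate.
    Returns (rank, resource_id) pairs, optionally limited to the top
    `preset_number` entries.
    """
    ranked = sorted(estimated_ctr.items(), key=lambda item: item[1], reverse=True)
    if preset_number is not None:
        ranked = ranked[:preset_number]
    return [(rank, resource_id) for rank, (resource_id, _) in enumerate(ranked, start=1)]


# Hypothetical estimated click rates for three candidate resources.
candidates = {"video_1": 0.10, "video_2": 0.70, "video_3": 0.40}
to_send = rank_resources(candidates, preset_number=2)
```

Here `to_send` holds the two highest-ranked resources together with the order in which the terminal would display them.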

Of course, to reduce the consumption of network resources, the server may send the terminal an abbreviated form of the first multimedia resource. For example, when the first multimedia resource is a picture, the server may send an image thumbnail; when the first multimedia resource is a short video, the server may send a preview image. This embodiment does not limit the specific form in which the first multimedia resource is sent.

After the terminal displays the first multimedia resource, the current user may first see the first multimedia resource with a higher estimated click rate. Therefore, the method for determining the estimated click rate of the multimedia resource provided by this embodiment can improve the user's click rate of the recommended multimedia resource. When the above method is applied to an application, the user retention rate of the application can be improved.

In this embodiment, in the user behavior information and multimedia attribute information, the information of the same information type can be determined by the same weight matrix in the embedding layer, which can improve the representativeness of the embedding vector. Therefore, when the click rate of the multimedia resource is estimated based on the method of this embodiment, the accuracy of the estimated click rate can be improved.

The above embodiment describes the process of determining the estimated click rate using the click-through rate estimation model. Before the click-through rate estimation model is used, it can be trained. This embodiment provides a training method for the click-through rate estimation model, which can be implemented by a server. As shown in the flowchart of the training method in FIG. 8, the processing flow of this method may include the following steps:

In step S801, the server acquires the initial model of the click-through rate estimation model.

Corresponding to the click-through rate estimation model described in the above embodiment, the initial model may include an initial embedding layer and an initial click-through rate estimation network, and the initial embedding layer includes an initial weight matrix corresponding to at least one information type.

The initial model of the click-through rate estimation model can be stored in the server. The initial model may be a machine learning model designed by a technician to determine an estimated click rate, which takes user behavior information and multimedia attribute information as inputs, predicts the user's click rate on multimedia resources, and outputs the estimated click rate. However, since the model parameters in the initial model are all preset initial values, the accuracy of the predicted click-through rate is low, and the initial model needs to be trained.

In step S802, the server acquires at least one training sample.

The training sample may include multimedia attribute information of the second multimedia resource, user behavior information of the sample user when browsing the second multimedia resource, and the click status of the sample user on the second multimedia resource. The click status may be clicked or not clicked. That is, the second multimedia resource may refer to a multimedia resource that was previously displayed to the sample user.

The following describes the process by which the server records the information in the training samples:

Whenever the server sends the second multimedia resource for presentation to the terminal, it may record the multimedia attribute information of the second multimedia resource. Of course, the server may also record the identification information of the second multimedia resource, and the identification information may be used to obtain multimedia attribute information of the second multimedia resource.

At this time, the server can also obtain the user behavior information of the user, and record it in correspondence with the above information recorded by the server (such as the multimedia attribute information of the second multimedia resource or the identification information of the second multimedia resource).

When the user clicks to view any second multimedia resource, the server may receive a request to load that second multimedia resource. The server may then record the click status of the second multimedia resource as clicked, in correspondence with the information described above; for example, it may record the multimedia attribute information, the user behavior information, and the click status of the second multimedia resource together.

When the terminal closes the display interface, it can send a display close notification to the server. When the server receives the display close notification, it can identify the unclicked second multimedia resources among the second multimedia resources it sent, and record their click status as unclicked. Alternatively, after sending the second multimedia resources to the terminal for display, the server may identify the unclicked second multimedia resources once a preset duration has elapsed. This embodiment does not limit the specific manner in which the server identifies the unclicked second multimedia resources.

When the server trains the click-through rate estimation model, it can obtain, as a training sample, the multimedia attribute information of the second multimedia resource recorded in the above process, the user behavior information when the user browsed the second multimedia resource, and the user's click status on the second multimedia resource. Optionally, the server may use a training sample whose click status is clicked as a positive sample, and a training sample whose click status is unclicked as a negative sample.
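The recording process above can be sketched with a small in-memory log. The class names `TrainingSample` and `ImpressionLog`, their methods, and the sample field values are hypothetical; a real server would persist these records and would decide when an impression counts as unclicked (display close notification or preset duration) as described in the text.

```python
from dataclasses import dataclass


@dataclass
class TrainingSample:
    resource_attributes: dict   # multimedia attribute information
    user_behavior: dict         # user behavior information at display time
    clicked: bool = False       # click status; unclicked until a click arrives


class ImpressionLog:
    """Collects one training sample per displayed second multimedia resource."""

    def __init__(self):
        self._samples = {}

    def record_impression(self, resource_id, resource_attributes, user_behavior):
        self._samples[resource_id] = TrainingSample(resource_attributes, user_behavior)

    def record_click(self, resource_id):
        self._samples[resource_id].clicked = True

    def split_samples(self):
        """Return (positive samples, negative samples) by click status."""
        samples = list(self._samples.values())
        positives = [s for s in samples if s.clicked]
        negatives = [s for s in samples if not s.clicked]
        return positives, negatives


log = ImpressionLog()
log.record_impression("r1", {"work_id": "w1"}, {"click_history": ["w9"]})
log.record_impression("r2", {"work_id": "w2"}, {"click_history": ["w9"]})
log.record_click("r1")   # the server received a load request for r1
positives, negatives = log.split_samples()
```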

In step S803, the server trains the initial model based on at least one training sample, and obtains a click-through rate estimation model.

For each training sample, the server can input the multimedia attribute information and user behavior information into the initial model, and perform data processing based on the model parameters of each network node in the initial model, to obtain the estimated click rate of the initial model for the second multimedia resource. Then, the server may determine the gradient of each model parameter in the initial model according to the user's click status on the second multimedia resource in the training sample and the corresponding estimated click rate. The server can determine the correction value of each model parameter according to its gradient, and adjust each model parameter based on the correction value, that is, perform error back propagation.
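As an illustration of one such parameter adjustment, the sketch below uses a single-layer logistic model in place of the full network and assumes a log-loss objective. The source does not specify the loss function or the network structure, so both are assumptions, as are the names `predict` and `train_step`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Estimated click rate of a single-layer logistic model (a stand-in network)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_step(weights, bias, features, clicked, learning_rate=0.1):
    """One gradient step on the log loss for one training sample."""
    estimated = predict(weights, bias, features)
    error = estimated - (1.0 if clicked else 0.0)   # gradient of log loss w.r.t. the logit
    # Correction value = learning_rate * gradient, subtracted from each parameter.
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


weights, bias = [0.0, 0.0], 0.0
features = [1.0, 2.0]                 # hypothetical embedded inputs
before = predict(weights, bias, features)
weights, bias = train_step(weights, bias, features, clicked=True)
after = predict(weights, bias, features)
```

After one step on a clicked sample, the estimated click rate for the same input moves toward 1, which is the behavior the correction values are meant to produce.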

In a possible implementation manner, as shown in the flowchart of the training method of the click-through rate estimation model shown in FIG. 9, step S803 may include steps S8031-S8033:

In step S8031, for the initial weight matrix corresponding to each information type, the server adjusts the parameters of the initial weight matrix based on the training samples containing the information type to obtain the weight matrix corresponding to the trained information type.

Each model parameter in the weight matrix can be adjusted during the training process. As introduced in the above embodiments, feature information of the same information type determines its embedding vector based on the same weight matrix, and the estimated click rate is then determined from that embedding vector. Correspondingly, during the training process, for the initial weight matrix corresponding to each information type, the server can obtain the training samples whose estimated click rates were determined through that initial weight matrix, together with the corresponding estimated click rates. Then, according to the estimated click rates and the users' actual click status, the server can calculate the gradient of each model parameter in the initial weight matrix, determine the corresponding correction value according to the gradient, and adjust each model parameter. After the training, the obtained weight matrix corresponding to each information type can be used to determine the embedding vector for the corresponding information.

Compared with a scheme that divides the weight matrices by the multimedia resource domain and the user domain, here the information of the same information type uses the same weight matrix, so all training samples containing that information type can be used to train the weight matrix, allowing the weight matrix to be fully learned.

In step S8032, the server adjusts the parameters of the initial click-through rate estimation network based on at least one training sample to obtain the trained click-through rate estimation network.

In general, each training sample is processed through the click-through rate estimation network. Therefore, the server can obtain each training sample and the corresponding estimated click-through rate, and adjust the parameters of the initial click-through rate estimation network. The specific processing is as described above and is not repeated here.

In step S8033, the server obtains a click rate prediction model based on the weight matrix corresponding to the at least one information type after training and the click rate prediction network after training.

When the condition for ending training is reached (for example, the preset number of training iterations is reached or the value of the loss function is less than the target value), the server can obtain each weight matrix in the current embedding layer and the current click-through rate estimation network to constitute the click-through rate estimation model, and can store the click-through rate estimation model. When the server needs to estimate the click rate of a multimedia resource, it can obtain the stored click-through rate estimation model for processing.
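The two example end conditions can be sketched as a simple loop. The function `run_training` and its `step` callback are hypothetical; `step` stands for one training iteration that returns the current loss value.

```python
def run_training(step, max_steps, target_loss):
    """Run `step` until the loss drops below `target_loss` or `max_steps` is reached.

    Returns (number of steps performed, last loss value).
    """
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    loss = None
    steps_done = 0
    for steps_done in range(1, max_steps + 1):
        loss = step()
        if loss < target_loss:
            break
    return steps_done, loss


# Hypothetical loss values standing in for real training iterations.
losses = iter([0.9, 0.5, 0.2, 0.1])
steps_done, final_loss = run_training(lambda: next(losses), max_steps=10, target_loss=0.3)
```

In this run the loop stops at the third iteration because the loss has fallen below the target value; with a smaller `max_steps` it would stop on the iteration count instead.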

Of course, after this, the server can also train the stored click-through rate estimation model again. The training process is the same as the above process. The server continuously updates the click-through rate estimation model, which can improve the accuracy of the click-through rate estimation model.

In a possible implementation, in the above parameter adjustment process of the initial weight matrix corresponding to each information type, when the number of training samples containing a first information type is smaller than the number of training samples containing a second information type, the learning rate corresponding to the first information type is greater than the learning rate corresponding to the second information type.

In the above step S8031, when determining the correction value of each model parameter in the initial weight matrix according to the gradient, the server may adjust the learning rate according to the gradient. For example, the learning rate may be adjusted using the AdaGrad (Adaptive Gradient) algorithm, an adaptive learning rate method.

In general, if fewer training samples contain an information type, the gradient determined for that type is smaller, that is, the gradient changes more gently, so the server can increase the learning rate corresponding to the information type, which increases the magnitude of the correction value and gives the weight matrix a fuller gradient update. In this way, the model parameters can be fully learned even when the features are sparse, and the accuracy of the model can be improved.
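A minimal sketch of the AdaGrad update rule shows how this effect arises. The class below is a simplified scalar version written for illustration; the parameter names, the base learning rate, and the gradient values are assumptions, and the source does not state which AdaGrad variant or hyperparameters are used.

```python
import math


class AdaGrad:
    """Scalar AdaGrad: each parameter accumulates its own squared gradients."""

    def __init__(self, base_learning_rate=0.1, epsilon=1e-8):
        self.base_learning_rate = base_learning_rate
        self.epsilon = epsilon
        self.accumulated = {}   # parameter name -> sum of squared gradients

    def effective_rate(self, name):
        """Learning rate currently applied to the named parameter."""
        total = self.accumulated.get(name, 0.0)
        return self.base_learning_rate / (math.sqrt(total) + self.epsilon)

    def update(self, name, value, gradient):
        """Accumulate the squared gradient, then apply the scaled correction."""
        self.accumulated[name] = self.accumulated.get(name, 0.0) + gradient ** 2
        return value - self.effective_rate(name) * gradient


optimizer = AdaGrad(base_learning_rate=0.1)
frequent, rare = 1.0, 1.0
for _ in range(100):     # a parameter that appears in many training samples
    frequent = optimizer.update("frequent_work", frequent, 0.5)
for _ in range(2):       # a parameter that appears in few training samples
    rare = optimizer.update("rare_work", rare, 0.5)
```

The rarely updated parameter has accumulated less squared gradient, so its effective learning rate stays larger than that of the frequently updated one.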

Of course, the above method for adjusting the learning rate can also be applied in the above step S8032, so that the model parameters in the click-through rate estimation network can also be adjusted adaptively according to the gradient.

In a possible implementation, the training goal of the initial model by the server may be to maximize AUC (Area Under the ROC Curve, area under the ROC curve; ROC, Receiver Operating Characteristic, receiver operating characteristics).

If a training sample whose click status is clicked is called a first training sample and a training sample whose click status is unclicked is called a second training sample, then AUC may refer to the probability that a first training sample is ranked before a second training sample.

After the server determines the estimated click-through rate for each training sample through the initial model, it can arrange the training samples in descending order of estimated click-through rate, and then determine the value of AUC according to how many first training samples are ranked before second training samples and the total number of training samples. The larger the AUC, the more first training samples are ranked before second training samples, that is, the higher the accuracy of the click-through rate estimation model.
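One common way to compute this ranking probability is to compare every clicked sample with every unclicked sample. The sketch below uses that pairwise definition, counting ties as half; the function name `pairwise_auc` and the example scores are assumptions, and the source does not fix a particular formula.

```python
def pairwise_auc(scores, labels):
    """Fraction of (clicked, unclicked) pairs in which the clicked sample scores higher.

    scores: estimated click rates; labels: 1 for clicked, 0 for unclicked.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one clicked and one unclicked sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a correctly ordered pair
    return wins / (len(positives) * len(negatives))


# Hypothetical estimated click rates and click status for four training samples.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
auc = pairwise_auc(scores, labels)   # 3 of the 4 pairs are ordered correctly
```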

Of course, the server can also determine the AUC based on other methods. For example, after the ROC curve is established based on the estimated click-through rate of the training sample, the area under the ROC curve is calculated by the integration method. This embodiment does not limit the specific method for determining the AUC.
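The integration approach can be sketched by building the ROC points from the sorted estimates and summing trapezoids. The helper names `roc_points` and `trapezoid_auc` and the example data are illustrative assumptions.

```python
def roc_points(scores, labels):
    """Return ROC points (false positive rate, true positive rate), thresholds descending."""
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("ROC needs at least one clicked and one unclicked sample")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Consume all samples that share this score before emitting a point.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / total_neg, true_pos / total_pos))
    return points


def trapezoid_auc(points):
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


scores = [0.9, 0.8, 0.3, 0.1]   # hypothetical estimated click rates
labels = [1, 0, 1, 0]           # 1 = clicked, 0 = unclicked
curve = roc_points(scores, labels)
area = trapezoid_auc(curve)
```

For this data the integrated area equals the pairwise ranking probability of the same samples, as expected from the two equivalent definitions of AUC.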

Experiments show that the method provided in this embodiment can significantly improve AUC, that is, the accuracy of the click rate prediction model obtained by the method in this embodiment is improved.

In this embodiment, the weight matrix corresponding to an information type can be trained based on training samples containing that information type. Since the feature information of the same information type is not divided by the multimedia resource domain and the user domain, the training samples can be fully utilized, so that the weight matrix of the embedding layer can be fully learned, the representativeness of the embedding vector can be improved, and the accuracy of the click-through rate estimation model can in turn be improved.

FIG. 10 is a block diagram of a device for determining an estimated click rate of a multimedia resource according to an exemplary embodiment. Referring to FIG. 10, the device includes an obtaining unit 1010, a calling unit 1020, and a determining unit 1030.

The obtaining unit 1010 is configured to obtain user behavior information of the user;

The obtaining unit 1010 is further configured to obtain multimedia attribute information of a first multimedia resource, where the first multimedia resource is a multimedia resource to be recommended to the user;

The calling unit 1020 is configured to call a click-through rate estimation model. The click-through rate estimation model includes an embedding layer and a click-through rate estimation network. The embedding layer includes a weight matrix corresponding to at least one information type, and the click-through rate estimation network is used to take the embedding vector output by the embedding layer as an input and output the estimated click rate of the multimedia resource;

The determining unit 1030 is configured to input the user behavior information and the multimedia attribute information into the click-through rate estimation model, and output the user's estimated click-through rate to the first multimedia resource.

Optionally, the determining unit 1030 is configured to:

Acquiring the initial model of the click-through rate estimation model;

Obtaining at least one training sample, the training sample including multimedia attribute information of the second multimedia resource, user behavior information when the user browses the second multimedia resource, and the user's click status on the second multimedia resource, the click status being clicked or not clicked;

The training unit is configured as:

Regarding the device in the above embodiment, the specific manner in which each module performs operations has been described in detail in the embodiment related to the method, and will not be elaborated here.

In this embodiment, in the user behavior information and multimedia attribute information, the information of the same information type can be determined by the same weight matrix in the embedding layer, which can improve the representativeness of the embedding vector. Therefore, when the click rate of the multimedia resource is predicted based on the method of this embodiment, the accuracy of the click rate prediction model is improved.

Fig. 11 is a block diagram of an apparatus 1100 for determining an estimated click rate of a multimedia resource according to an exemplary embodiment. For example, the device 1100 may be provided as a server. Referring to FIG. 11, the device 1100 includes a processing component 1122, which further includes one or more processors, and memory resources represented by the memory 1132, for storing instructions executable by the processing component 1122, such as application programs. The application program stored in the memory 1132 may include one or more modules each corresponding to a set of instructions. In addition, the processing component 1122 is configured to execute an instruction to execute the above method for determining the estimated click rate of the multimedia resource.

The device 1100 may also include a power component 1126 configured to perform power management of the device 1100, a wired or wireless network interface 1150 configured to connect the device 1100 to a network, and an input/output (I/O) interface 1158. The device 1100 can operate based on an operating system stored in the memory 1132, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ or the like.

In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided, such as a memory including instructions, which can be executed by a processor in the server to complete the method for determining the estimated click rate of the multimedia resource. For example, the computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, or the like.

In an exemplary embodiment, an application program / computer program product is also provided, which includes one or more instructions, and the one or more instructions may be executed by a processor of the server to complete the above-mentioned method for determining the estimated click rate of multimedia resources.

Those skilled in the art will readily conceive of other embodiments of the present disclosure after considering the description and practicing the contents disclosed herein. This application is intended to cover any variations, uses, or adaptive changes of the present disclosure that follow the general principles of the present disclosure and include common general knowledge or customary technical means in the technical field not disclosed in the present disclosure. The description and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are pointed out by the following claims.

It should be understood that the present disclosure is not limited to the precise structure that has been described above and shown in the drawings, and various modifications and changes can be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

A method for determining the estimated click rate of multimedia resources, which includes:

Obtaining user behavior information of a user;

Acquiring multimedia attribute information of a first multimedia resource, where the first multimedia resource is a multimedia resource to be recommended to the user;

Calling a click-through rate prediction model, the click-through rate prediction model includes an embedding layer and a click-through rate prediction network, the embedding layer includes a weight matrix corresponding to at least one information type, and the click-through rate prediction network is used to take the embedding vector output by the embedding layer as an input and output the estimated click rate of the multimedia resource;

The user behavior information and the multimedia attribute information are input to the click-through rate estimation model, and the user's estimated click-through rate for the first multimedia resource is output.
The method according to claim 1, wherein inputting the user behavior information and the multimedia attribute information into the click-through rate estimation model and outputting the user's estimated click-through rate for the first multimedia resource includes:

For each information type, input the information of the user behavior information and the multimedia attribute information that belong to the information type into the weight matrix corresponding to the information type in the embedding layer, and output at least one embedding vector;

Input at least one embedding vector output by the embedding layer into the click-through rate estimation network, and output the user's estimated click-through rate to the first multimedia resource.
The method according to claim 1, wherein the training method of the click-through rate prediction model comprises:

Acquiring the initial model of the click-through rate estimation model;

Obtaining at least one training sample, the training sample including multimedia attribute information of a second multimedia resource, user behavior information of a sample user when browsing the second multimedia resource, and the sample user's click status on the second multimedia resource, the click status being clicked or not clicked;

Training the initial model based on the at least one training sample to obtain the click-through rate prediction model.
The method according to claim 3, wherein the initial model includes an initial embedding layer and an initial click-through rate estimation network, and the initial embedding layer includes an initial weight matrix corresponding to at least one information type;

The training the initial model based on the at least one training sample to obtain the estimated click-through rate model includes:

For the initial weight matrix corresponding to each information type, adjust the parameters of the initial weight matrix based on the training samples containing the information type to obtain the weight matrix corresponding to the information type after training;

Adjust the parameters of the initial click-through rate estimation network based on the at least one training sample to obtain a trained click-through rate estimation network;

Based on the weight matrix corresponding to at least one information type after training and the trained click-through rate estimation network, the click-through rate estimation model is obtained.
The method according to claim 4, wherein during the parameter adjustment of the initial weight matrix corresponding to each information type, when the number of training samples containing a first information type is smaller than the number of training samples containing a second information type, the learning rate corresponding to the first information type is greater than the learning rate corresponding to the second information type.
The method according to any one of claims 1-5, wherein the information type includes a work identification, an author identification, and / or a style identification.
The method according to any one of claims 1-5, wherein the user behavior information includes click history information, attention information, and / or favorite information, and the click history information is used to represent multimedia of a multimedia resource clicked by the user Attribute information, the attention information is used to represent multimedia attribute information of the multimedia resource that the user focuses on, and the favorite information is used to represent multimedia attribute information of the multimedia resource that the user prefers.
A device for determining the estimated click rate of multimedia resources is characterized by comprising:

The obtaining unit is configured to obtain user behavior information of the user;

The obtaining unit is further configured to obtain multimedia attribute information of a first multimedia resource, where the first multimedia resource is a multimedia resource to be recommended to the user;

The calling unit is configured to call a click-through rate prediction model, the click-through rate prediction model includes an embedding layer and a click-through rate prediction network, the embedding layer includes a weight matrix corresponding to at least one information type, and the click-through rate prediction network is used to take the embedding vector output by the embedding layer as an input and output the estimated click rate of the multimedia resource;

The determining unit is configured to input the user behavior information and the multimedia attribute information into the click-through rate estimation model, and output the estimated click-through rate of the user to the first multimedia resource.
The apparatus according to claim 8, wherein the determining unit is configured to:

For each information type, input the information of the user behavior information and the multimedia attribute information that belong to the information type into the weight matrix corresponding to the information type in the embedding layer, and output at least one embedding vector;

Input at least one embedding vector output by the embedding layer into the click-through rate estimation network, and output the user's estimated click-through rate to the first multimedia resource.
The apparatus according to claim 8, wherein the apparatus further comprises a training unit, the training unit is configured to:

Acquiring the initial model of the click-through rate estimation model;

Obtaining at least one training sample, the training sample including multimedia attribute information of a second multimedia resource, user behavior information of a sample user when browsing the second multimedia resource, and the sample user's click status on the second multimedia resource, the click status being clicked or not clicked;

Training the initial model based on the at least one training sample to obtain the click-through rate prediction model.
The apparatus according to claim 10, wherein the initial model includes an initial embedding layer and an initial click-through rate prediction network, and the initial embedding layer includes an initial weight matrix corresponding to at least one information type;

The training unit is configured to:

For the initial weight matrix corresponding to each information type, adjust the parameters of the initial weight matrix based on the training samples containing the information type to obtain the weight matrix corresponding to the information type after training;

Adjust the parameters of the initial click-through rate estimation network based on the at least one training sample to obtain a trained click-through rate estimation network;

Based on the weight matrix corresponding to at least one information type after training and the trained click-through rate estimation network, the click-through rate estimation model is obtained.
The apparatus according to claim 11, wherein during the parameter adjustment of the initial weight matrix corresponding to each information type, when the number of training samples containing a first information type is smaller than the number of training samples containing a second information type, the learning rate corresponding to the first information type is greater than the learning rate corresponding to the second information type.
The device according to any one of claims 8-12, wherein the information type includes a work identification, an author identification, and / or a style identification.
The device according to any one of claims 8-12, wherein the user behavior information includes click history information, attention information, and / or favorite information, and the click history information is used to represent multimedia of a multimedia resource clicked by the user Attribute information, the attention information is used to represent multimedia attribute information of the multimedia resource that the user focuses on, and the favorite information is used to represent multimedia attribute information of the multimedia resource that the user prefers.
A server, characterized in that it includes:

One or more processors;

One or more memories for storing one or more processor executable instructions;

Wherein, the one or more processors are configured as:

Obtaining user behavior information of a user;

Acquiring multimedia attribute information of a first multimedia resource, where the first multimedia resource is a multimedia resource to be recommended to the user;

Calling a click-through rate prediction model, the click-through rate prediction model includes an embedding layer and a click-through rate prediction network, the embedding layer includes a weight matrix corresponding to at least one information type, and the click-through rate prediction network is used to take the embedding vector output by the embedding layer as an input and output the estimated click rate of the multimedia resource;

The user behavior information and the multimedia attribute information are input to the click-through rate estimation model, and the user's estimated click-through rate for the first multimedia resource is output.
A non-transitory computer-readable storage medium, characterized in that when instructions in the storage medium are executed by a processor of a server, the server is enabled to perform a method for determining an estimated click rate of a multimedia resource, the method including:

Obtaining user behavior information of a user;

Acquiring multimedia attribute information of a first multimedia resource, where the first multimedia resource is a multimedia resource to be recommended to the user;

Calling a click-through rate prediction model, the click-through rate prediction model includes an embedding layer and a click-through rate prediction network, the embedding layer includes a weight matrix corresponding to at least one information type, and the click-through rate prediction network is used to take the embedding vector output by the embedding layer as an input and output the estimated click rate of the multimedia resource;

The user behavior information and the multimedia attribute information are input to the click-through rate estimation model, and the user's estimated click-through rate for the first multimedia resource is output.