CN114722268A

CN114722268A - Media resource processing method and device, storage medium and electronic equipment

Info

Publication number: CN114722268A
Application number: CN202110004963.9A
Authority: CN
Inventors: 凌程; 王亚龙; 王瑞; 夏锋; 林乐宇
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-01-04
Filing date: 2021-01-04
Publication date: 2022-07-08

Abstract

The invention discloses a media resource processing method and device, a storage medium, and electronic equipment. The method comprises the following steps: inputting the position characteristics of sample media resources, the resource characteristics of the sample media resources, and the user characteristics of sample users into a first training neural network to obtain a first predicted click rate output by the first training neural network; inputting the resource characteristics of the sample media resources and the user characteristics of the sample users into a second training neural network to obtain a second predicted click rate output by the second training neural network; and adjusting model parameters in the second training neural network according to the first predicted click rate, the second predicted click rate, and the actual click result. The adjusted second training neural network can eliminate the bias introduced by the position information of the media resources, so that media resources of interest to the user can be recommended accurately, which solves the technical problem in the prior art that the accuracy of recommending media resources according to user behavior is low.

Description

Media resource processing method and device, storage medium and electronic equipment

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for processing media resources, a storage medium, and an electronic device.

Background

In a recommendation scenario, recommended items are mostly presented to the user in the form of a list, as in common e-commerce, video, and news recommendation. For each user request, the ranking model scores and ranks all items in the recalled candidate set, and the highest-scoring items are displayed to the user. This gives rise to a position bias problem: the user is more inclined to click on top-ranked items, and this inclination is unrelated to the user's real interests. If the click rate is counted per position, the top-ranked positions are found to have the highest click rate. As a result, the user may overlook items of actual interest and simply click on the top-ranked items. Because user behavior is an important feature in constructing the model input, the model learns user behavior that carries position bias, and the items the user is most interested in cannot be ranked at the top during real-time scoring.

At present, position bias is eliminated mainly by a feature-based method, in which a position feature is added during offline training and a default feature replaces the position feature during online prediction. Another approach is the PAL (position-bias aware learning) model: during offline training, it models the position feature separately, outputs a value through a shallow tower (the value can be understood as the probability of being exposed), and finally multiplies this value by the predicted click rate (pCTR) part; only the pCTR part is used for online prediction.
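The PAL structure described above can be sketched as follows. This is a minimal illustration only: the function names and the use of a single logit per tower are assumptions made for the example, not details given in the source.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def pal_training_output(position_logit: float, ctr_logit: float) -> float:
    """Offline training: exposure probability (shallow position tower) times pCTR."""
    return sigmoid(position_logit) * sigmoid(ctr_logit)


def pal_online_output(ctr_logit: float) -> float:
    """Online prediction: the position tower is dropped and only pCTR is used."""
    return sigmoid(ctr_logit)


# Hypothetical logits chosen only for illustration.
training_value = pal_training_output(position_logit=1.0, ctr_logit=0.0)
online_value = pal_online_output(ctr_logit=0.0)
```

Because the position tower only rescales the pCTR output, the position feature never interacts with the other features inside the pCTR network, which is the limitation the following paragraphs point out.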

The feature-based approach uses the exposure position feature during offline training and a default feature online, which results in a high AUC during offline training and a low AUC during online prediction.

The PAL model introduces position information by modeling the position feature independently and combining it through a final multiplication, and uses only the pCTR part during online prediction. This model structure weakens the influence of position features, which cannot be obtained during online prediction, and reduces the gap between the training AUC and the prediction AUC. However, because the position features are modeled independently, they lack deep interaction with the other features, which impairs the fitting capability of the model, so accuracy is low when media resources are recommended according to the PAL model.

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

The embodiment of the invention provides a media resource processing method and device, a storage medium and electronic equipment, and at least solves the technical problem that in the prior art, the accuracy of recommending media resources according to user behaviors is low.

According to an aspect of the embodiments of the present invention, there is provided a method for processing a media resource, including: inputting the position characteristics of sample media resources, the resource characteristics of the sample media resources and the user characteristics of a sample user into a first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing the actual arrangement positions of the sample media resources in a group of media resources pushed to the sample user, and the first predicted click rate is used for representing the predicted probability of the sample user clicking the sample media resources; inputting the resource characteristics of the sample media resources and the user characteristics of the sample users into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for expressing the predicted probability of the sample users clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure; determining a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether the sample user actually clicks the sample media resource; and adjusting the model parameters in the second training neural network according to the loss value of the target loss function.

According to another aspect of the embodiments of the present invention, there is also provided a processing apparatus of a media resource, including: a first input unit, configured to input a location feature of a sample media resource, a resource feature of the sample media resource, and a user feature of a sample user into a first training neural network, so as to obtain a first predicted click rate output by the first training neural network, where the location feature is used to represent an actual arrangement location of the sample media resource in a group of media resources pushed to the sample user, and the first predicted click rate is used to represent a predicted probability of the sample user clicking the sample media resource; a second input unit, configured to input the resource characteristics of the sample media resource and the user characteristics of the sample user into a second training neural network, so as to obtain a second predicted click rate output by the second training neural network, where the second predicted click rate is used to represent a predicted probability that the sample user clicks the sample media resource, and the first training neural network and the second training neural network have a same model structure; a determining unit, configured to determine a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, and an actual click result, where the actual click result is used to indicate whether the sample user actually clicks the sample media resource; and the adjusting unit is used for adjusting the model parameters in the second training neural network according to the loss value of the target loss function.

According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, where the computer program is configured to execute the above processing method of a media resource when running.

According to still another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the above processing method of a media resource through the computer program.

In the embodiment of the invention, the position characteristics of sample media resources, the resource characteristics of the sample media resources and the user characteristics of sample users are input into a first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing the actual arrangement positions of the sample media resources in a group of media resources pushed to the sample users, and the first predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources; the resource characteristics of the sample media resources and the user characteristics of the sample users are input into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure; a loss value of a target loss function of the second training neural network is determined according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether a sample user actually clicks a sample media resource; and the model parameters in the second training neural network are adjusted according to the loss value of the target loss function. Because the loss value is determined from the first predicted click rate output by the first training neural network, the second predicted click rate output by the second training neural network and the actual click result, and the model parameters in the second training neural network are adjusted according to that loss value, the adjusted second training neural network can eliminate the bias introduced by media resource position information. That is, when media resources are recommended to users, the influence of the position information of the media resources can be eliminated and the media resources of interest to the users can be recommended more accurately, which solves the technical problem in the prior art that the accuracy of recommending media resources according to user behaviors is low.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention and do not constitute a limitation of the invention. In the drawings:

FIG. 1 is a schematic diagram of an application environment of an alternative method for processing a media asset according to an embodiment of the invention;

FIG. 2 is a flow chart of an alternative method of processing a media asset according to an embodiment of the invention;

FIG. 3 is a schematic illustration of an alternative sorted display of a set of media assets in a page in accordance with an embodiment of the invention;

FIG. 4 is a schematic diagram of an alternative first training neural network, according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an alternative second training neural network, according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of an alternative third training neural network, according to an embodiment of the present invention;

FIG. 7 is a flow diagram of an alternative click rate prediction model for eliminating position bias based on knowledge distillation in accordance with embodiments of the present invention;

FIG. 8 is a block diagram of an alternative click rate prediction model that eliminates position bias based on knowledge distillation, according to an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of an alternative media asset processing device according to an embodiment of the invention;

FIG. 10 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

For a better understanding of the embodiments provided herein, some of the terms are set forth below:

Feed stream recommendation: a form of content recommendation over aggregated information that is propagated to subscribers dynamically and in real time through Feed streams; it is an effective way for users to acquire information streams.

CTR: click-through rate. In a recommendation system, the recalled content subsets are typically sorted by click-through rate and then distributed in conjunction with a policy.

Position bias: in a recommendation system, the attention each item receives is influenced by its display position. Items near the front are usually noticed more easily by the user than items further back and are also clicked more easily, so the model's perception of the user's preference is biased and the estimated CTR is inaccurate.

Knowledge distillation: a student network learns the knowledge of a teacher network, that is, the output of the student network is fitted to the output of the teacher network.

AUC: the probability that, when one positive sample and one negative sample are drawn at random from the positive sample set and the negative sample set respectively, the predicted value of the positive sample is greater than the predicted value of the negative sample. It may be used to evaluate ranking quality offline.
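The AUC definition above can be illustrated by counting pairs directly. The sketch below is an illustration under stated assumptions: the function name and sample scores are hypothetical, and counting a tie as half a win is a common convention that the source does not specify.

```python
from itertools import product


def pairwise_auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs in which the positive sample scores higher.

    Ties are counted as half a win (an assumed convention).
    """
    pairs = list(product(positive_scores, negative_scores))
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)


# Hypothetical predicted click rates for clicked (positive) and unclicked (negative) samples.
auc = pairwise_auc([0.9, 0.6], [0.7, 0.2])
```

In this example three of the four pairs are ordered correctly, so the value is 0.75.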

According to an aspect of the embodiments of the present invention, a method for processing a media resource is provided. As an optional implementation manner, the method may be applied, but is not limited, to the environment shown in FIG. 1, which includes a terminal 102, a network 104, and a server 106.

The server 106 inputs the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users into the first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing the actual arrangement positions of the sample media resources in a group of media resources pushed to the sample users, and the first predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources; inputs the resource characteristics of the sample media resources and the user characteristics of the sample users into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure; determines a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether a sample user actually clicks a sample media resource; and adjusts the model parameters in the second training neural network according to the loss value of the target loss function. Because the loss value is determined from the first predicted click rate output by the first training neural network, the second predicted click rate output by the second training neural network and the actual click result, and the model parameters in the second training neural network are adjusted according to that loss value, the adjusted second training neural network can eliminate the bias introduced by media resource position information. That is, when media resources are recommended to users, the influence of the position information of the media resources can be eliminated and the media resources of interest to the users can be recommended more accurately, which solves the technical problem in the prior art that the accuracy of recommending media resources according to user behaviors is low.

It should be noted that the method for processing the media resource may be performed by, but is not limited to, the terminal 102 alone, the server 106 alone, or the terminal 102 and the server 106 in cooperation.

Optionally, in this embodiment, the terminal 102 may be a terminal device configured with a target client, and may include but is not limited to at least one of the following: mobile phones (such as Android mobile phones, iOS mobile phones, etc.), notebook computers, tablet computers, palm computers, MID (Mobile Internet Devices), PAD, desktop computers, smart televisions, etc. The target client may be a video client, an instant messaging client, a browser client, an educational client, etc. The network 104 may include, but is not limited to, a wired network and a wireless network, where the wired network includes a local area network, a metropolitan area network, and a wide area network, and the wireless network includes Bluetooth, Wi-Fi, and other networks that enable wireless communication. The server 106 may be a single server, a server cluster composed of a plurality of servers, or a cloud server. The above is only an example, and this is not limited in this embodiment.

As an optional implementation manner, as shown in FIG. 2, the method for processing a media resource includes:

Step S202, inputting the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users into a first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing the actual arrangement positions of the sample media resources in a group of media resources pushed to the sample users, and the first predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources.

Step S204, inputting the resource characteristics of the sample media resources and the user characteristics of the sample users into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for expressing the predicted probability of the sample users clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure.

Step S206, determining a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether a sample user actually clicks the sample media resource.

Step S208, adjusting model parameters in the second training neural network according to the loss value of the target loss function.
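Steps S202 to S208 can be sketched as a single training step. The example below is a minimal sketch under several assumptions that do not come from the source: both networks are reduced to single-layer logistic models, the first network is held fixed during the step, the loss is a weighted sum of a squared-difference term and a cross-entropy term, and all names, weights and feature values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, features):
    """Single-layer logistic model standing in for a training neural network."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)))


def distill_step(student_w, teacher_w, position, resource, user, clicked,
                 alpha=0.6, lr=0.1):
    # S202: the first network receives position, resource and user features.
    p_first = predict(teacher_w, position + resource + user)
    # S204: the second network receives resource and user features only.
    x = resource + user
    p_second = predict(student_w, x)
    # S206: loss from the two predictions and the actual click result.
    soft = (p_first - p_second) ** 2
    hard = -(clicked * math.log(p_second) + (1 - clicked) * math.log(1 - p_second))
    loss = alpha * soft + (1 - alpha) * hard
    # S208: gradient of the loss with respect to the second network's logit,
    # then one gradient-descent update of the second network's weights.
    d_logit = (alpha * (-2.0 * (p_first - p_second) * p_second * (1 - p_second))
               + (1 - alpha) * (p_second - clicked))
    new_w = [w - lr * d_logit * f for w, f in zip(student_w, x)]
    return new_w, loss


# Hypothetical one-dimensional features and weights, for illustration only.
teacher_w = [-0.5, 0.8, 0.3]
w0 = [0.0, 0.0]
w1, loss1 = distill_step(w0, teacher_w, [1.0], [1.0], [1.0], clicked=1)
w2, loss2 = distill_step(w1, teacher_w, [1.0], [1.0], [1.0], clicked=1)
```

Only the second network's weights are updated, and the position feature never appears among its inputs, so the updated network can score items without any position information.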

Optionally, in this embodiment, the media resource processing method may be used in, but is not limited to, recommendation scenes, such as the recommendation of media resources in WeChat "see-at-a-glance", the recommendation of e-commerce commodities, video recommendation, news recommendation, and the recommendation of announcement resources in a game.

In this embodiment, the position characteristic information of a sample media resource represents the actual arrangement position of the sample media resource in a group of media resources pushed to the user. Taking a group of media resources in WeChat "see-at-a-glance" as an example, FIG. 3 is a schematic diagram of a group of media resources displayed in sorted order in a page. In FIG. 3, each media resource occupies a position on the page. For example, "Ali reinforcement learning rearrangement practice" is at the frontmost position on the page, which indicates that this media resource is the one most frequently clicked among the WeChat user's friends, or that the WeChat user most often clicks resource information associated with it. That is, when a group of media resources is displayed in a page, the position of a media resource in the page may be understood as its position characteristic information, which may be represented by, but is not limited to, coordinate information.

In this embodiment, the resource characteristics of the sample media resources are used to represent the characteristics of the sample media resources, and may include, but are not limited to, the type of the sample media resource, such as financial information, sports information, entertainment information, game resources, and so on. The user characteristics may represent attribute information of the user, and may include, but are not limited to, the gender, age, educational background and occupation of the user, the game stage of the user, and the like.

In this embodiment, the position characteristics of the sample media resources, the resource characteristics of the sample media resources, and the user characteristics of the sample users are input to a first training neural network to obtain a first predicted click rate output by the first training neural network. The first training neural network may include, but is not limited to, a teacher network in knowledge distillation. The first predicted click rate represents the predicted probability that the sample user clicks the sample media resource; for example, the predicted probability that the user clicks the sample media resource "Ali reinforcement learning rearrangement practice" may be 0.6. FIG. 4 is a schematic structural diagram of the first training neural network: the position information S1 of the sample media resource 1, the user characteristics 1 and 2 of the user A, and the resource characteristic (finance class L1) of the sample media resource 1 are input into the first training neural network, which outputs the first predicted click rate. It should be noted that the user characteristics are not limited to user characteristic 1 and user characteristic 2 and may be user characteristics of various dimensions; likewise, the resource characteristics of the sample media resource 1 may be information of various dimensions. The more characteristics are provided, the more accurate the output first predicted click rate is. The above is merely an example, and this is not limited in this embodiment.

In this embodiment, the resource characteristics of the sample media resources and the user characteristics of the sample users are input to a second training neural network to obtain a second predicted click rate output by the second training neural network, where the first training neural network and the second training neural network have the same model structure. The second training neural network may include, but is not limited to, a student network in knowledge distillation. The second predicted click rate represents the predicted probability that the sample user clicks the sample media resource; for example, the predicted probability that the user clicks the sample media resource "Ali reinforcement learning rearrangement practice" may be 0.6. FIG. 5 is a schematic structural diagram of the second training neural network: the user characteristics 1 and 2 of the user A and the resource characteristic (finance class L1) of the sample media resource 1 are input into the second training neural network, which outputs the second predicted click rate. It should be noted that the user characteristics are not limited to user characteristic 1 and user characteristic 2 and may be user characteristics of various dimensions; likewise, the resource characteristics of the sample media resource 1 may be information of various dimensions. The more characteristics are provided, the more accurate the output second predicted click rate is. The above is merely an example, and this is not limited in this embodiment.

Optionally, in this embodiment, determining a loss value of the target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, and the actual click result may include: determining the loss value of the target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, the predicted exposure rate and the actual click result, wherein the predicted exposure rate is determined according to the position characteristics of the sample media resources, and the predicted exposure rate is used for representing the probability that the sample media resources are pushed to the sample user.

The position characteristics of the sample media resources may be input into a third training neural network to obtain the predicted exposure rate output by the third training neural network.

FIG. 6 is a schematic structural diagram of the third training neural network. The position information S1 of the sample media resource 1 is input into the third training neural network to obtain the predicted exposure rate output by the third training neural network, where the predicted exposure rate represents the probability that the sample media resource is pushed to the sample user. For example, the probability that the sample media resource "Ali reinforcement learning rearrangement practice" is pushed to the sample user is 0.8.

Optionally, in this embodiment, determining a loss value of the target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, the predicted exposure rate, and the actual click result may include: determining a first loss value according to the first predicted click rate and the second predicted click rate; determining a second loss value according to the second predicted click rate, the predicted exposure rate and the actual click result; and determining the loss value of the target loss function of the second training neural network based on the first loss value and the second loss value.

In this embodiment, the first loss value may be a loss value formed by fitting a second predicted click rate to the first predicted click rate, and the second loss value may be a loss value of the second predicted click rate with respect to an actual click result.

Optionally, in this embodiment, determining a loss value of the target loss function of the second training neural network according to the first loss value and the second loss value may include: determining the loss value of the target loss function of the second training neural network by the following formula:

$$L = \alpha L^{\text{soft}} + (1 - \alpha) L^{\text{hard}}$$

where $L^{\text{soft}}$ represents the first loss value, $L^{\text{hard}}$ represents the second loss value, $L$ represents the loss value of the target loss function of the second training neural network, and $\alpha$ represents a preset weight with $0 < \alpha < 1$.

α may also take the value 0 or 1, and may typically be 0.6. The above is merely an example, and this is not limited in this embodiment.
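As a worked illustration of the weighted sum, using hypothetical values that are not taken from the source: suppose the first loss value is 0.04, the second loss value is 0.51, and α = 0.6. Then

$$L = 0.6 \times 0.04 + (1 - 0.6) \times 0.51 = 0.024 + 0.204 = 0.228$$

With α = 1 only the first loss value (fitting the first predicted click rate) would contribute, and with α = 0 only the second loss value (fitting the actual click result) would contribute.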

According to the embodiment provided by the application, the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users are input into a first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing the actual arrangement positions of the sample media resources in a group of media resources pushed to the sample users, and the first predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources; the resource characteristics of the sample media resources and the user characteristics of the sample users are input into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure; a loss value of a target loss function of the second training neural network is determined according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether a sample user actually clicks a sample media resource; and the model parameters in the second training neural network are adjusted according to the loss value of the target loss function. Because the loss value is determined from the first predicted click rate output by the first training neural network, the second predicted click rate output by the second training neural network and the actual click result, and the model parameters in the second training neural network are adjusted according to that loss value, the adjusted second training neural network can eliminate the bias introduced by media resource position information. That is, when media resources are recommended to users, the influence of the position information of the media resources can be eliminated and the media resources of interest to the users can be recommended more accurately, which solves the technical problem in the prior art that the accuracy of recommending media resources according to user behaviors is low.

Optionally, in this embodiment, the adjusted second training neural network may be used for pushing media resources. That is, with the influence of the position characteristic information removed, a group of media resources is sorted and the sorted group is displayed in order on the page, as shown in fig. 3.
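The sorting step can be illustrated with a minimal sketch. It assumes a scoring function that sees only resource and user features; `rank_media_resources` and `toy_predict_ctr` are hypothetical names introduced here, and the toy scorer merely stands in for the adjusted second training neural network.

```python
from typing import Callable, Dict, List


def rank_media_resources(
    candidates: List[Dict],
    user: Dict,
    predict_ctr: Callable[[Dict, Dict], float],
) -> List[Dict]:
    """Sort candidates by predicted click rate, highest first.

    Only resource and user features reach the scorer, so the ordering
    does not depend on the slot in which an item was shown before.
    """
    return sorted(candidates, key=lambda item: predict_ctr(item, user), reverse=True)


def toy_predict_ctr(item: Dict, user: Dict) -> float:
    # Hypothetical stand-in for the adjusted second training neural network.
    return 0.9 if item["category"] in user["interests"] else 0.1


user = {"age": 25, "interests": {"entertainment"}}
candidates = [
    {"id": "a", "category": "sports"},
    {"id": "b", "category": "entertainment"},
    {"id": "c", "category": "finance"},
]
ranked = rank_media_resources(candidates, user, toy_predict_ctr)
print([item["id"] for item in ranked])
```

The sorted list would then be rendered top to bottom on the page.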

Optionally, in this embodiment, determining the first loss value according to the first predicted click rate and the second predicted click rate may include: determining the square of the difference between the first predicted click rate and the second predicted click rate as the first loss value; or determining the sum or product of the second predicted click rate and the predicted exposure rate as a predicted output value, and determining the square of the difference between the first predicted click rate and the predicted output value as the first loss value.

Optionally, in this embodiment, determining the second loss value according to the second predicted click rate, the predicted exposure rate, and the actual click result may include: determining the sum or product of the second predicted click rate and the predicted exposure rate as a predicted output value; and determining the cross entropy of the predicted output value and the actual click result as a second loss value.
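As a hedged illustration of the two loss values described above, the sketch below computes them for a single sample. The function names and the sample numbers are assumptions made for this example; clamping the predicted output value before taking logarithms is also an assumption, added so that the sum form cannot leave the (0, 1) range.

```python
import math


def predicted_output(second_ctr: float, exposure: float, combine: str = "product") -> float:
    """Combine the second predicted click rate with the predicted exposure rate."""
    if combine == "product":
        return second_ctr * exposure
    if combine == "sum":
        return second_ctr + exposure
    raise ValueError("combine must be 'sum' or 'product'")


def first_loss(first_ctr: float, student_value: float) -> float:
    """Squared difference between the first predicted click rate and the student-side value.

    student_value is either the second predicted click rate or the predicted output value.
    """
    return (first_ctr - student_value) ** 2


def second_loss(output: float, clicked: int, eps: float = 1e-7) -> float:
    """Binary cross entropy between the predicted output value and the actual click result."""
    p = min(max(output, eps), 1.0 - eps)  # clamp so that log() stays finite
    return -(clicked * math.log(p) + (1 - clicked) * math.log(1.0 - p))


# Illustrative numbers only.
first_ctr, second_ctr, exposure, clicked = 0.30, 0.50, 0.40, 1
out = predicted_output(second_ctr, exposure)  # product form: 0.5 * 0.4
l_soft = first_loss(first_ctr, out)           # squared gap to the first predicted click rate
l_hard = second_loss(out, clicked)            # cross entropy against the click label
print(out, l_soft, l_hard)
```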

Optionally, in this embodiment, adjusting the model parameters in the second training neural network according to the loss value of the target loss function may include: adjusting the model parameters in the second training neural network in the direction that reduces the loss value of the target loss function.

In this embodiment, the loss function measures the degree of inconsistency between the model's predicted value and the true value. It is a non-negative real-valued function, and the smaller the loss function, the better the robustness of the model. Training the model is a process of continuous iterative computation, during which a gradient descent optimization algorithm can be used so that the loss function becomes smaller and smaller. Gradient descent is an optimization algorithm that progressively reduces a loss function and can be used to solve for the model parameters of a machine learning algorithm, that is, to solve an unconstrained optimization problem.
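A minimal sketch of gradient descent follows, assuming a one-feature logistic model and a cross-entropy loss; the data, learning rate and iteration count are illustrative and are not taken from the embodiment.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(w: float, b: float, samples: List[Tuple[float, int]]):
    """Mean binary cross entropy and its gradient for p = sigmoid(w * x + b)."""
    n = len(samples)
    loss = grad_w = grad_b = 0.0
    for x, y in samples:
        p = sigmoid(w * x + b)
        loss += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        grad_w += (p - y) * x
        grad_b += p - y
    return loss / n, grad_w / n, grad_b / n


samples = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]  # (feature, click label)
w, b, learning_rate = 0.0, 0.0, 0.5
history = []
for _ in range(50):
    loss, grad_w, grad_b = loss_and_grad(w, b, samples)
    history.append(loss)
    # Step against the gradient, i.e. in the direction that reduces the loss.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
print(history[0], history[-1])
```

Each iteration moves the parameters against the gradient, so the recorded loss shrinks from one step to the next.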

Optionally, in this embodiment, the method may further include: obtaining the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users, wherein the resource characteristics of the sample media resources comprise resource characteristics of one or more dimensions, and the user characteristics of the sample users comprise user characteristics of one or more dimensions.

Obtaining the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users may include:

obtaining the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users, wherein the resource characteristics of the sample media resources comprise the types of the sample media resources, and the user characteristics of the sample users comprise the age and gender of the sample users.

Optionally, determining a loss value of the target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, and the actual click result may include:

S1, inputting the first predicted click rate output by the first training neural network, the second predicted click rate output by the second training neural network and the actual click result into a target fully connected layer module;

S2, determining a first loss value between the first predicted click rate and the second predicted click rate through the target fully connected layer module;

S3, determining a second loss value between the second predicted click rate and the actual click result through the target fully connected layer module;

S4, determining the loss value of the target loss function according to the first loss value and the second loss value through the target fully connected layer module.

Prediction with the neural network can be divided into the training of the neural network and the use of the trained neural network.

In this embodiment, during training, the fully connected layer receives the first predicted click rate, the second predicted click rate and the actual click result. From these it may determine a first loss value between the first predicted click rate and the second predicted click rate and a second loss value between the second predicted click rate and the actual click result, and then determine the loss value of the target loss function according to the first loss value and the second loss value. The model parameters are adjusted according to this loss value to complete the training of the neural network. To increase the accuracy of pushing, the predicted exposure rate may also be used during training; that is, the predicted exposure rate may be input to the fully connected layer and used as a parameter for determining the loss value of the loss function.

In actual use, the fully connected layer receives the second predicted click rate and then outputs a pushing result. It should be noted that the fully connected layer can also receive the predicted exposure rate, in which case the pushing position of the media resource is determined according to the predicted exposure rate and the first predicted click rate.

It should be noted that the fully connected layer may be a layer inside the second training neural network or may lie outside it; that is, the fully connected layer may be part of the neural network structure, or may be the next operation structure applied after the neural network outputs its data.

Optionally, the inputting the position feature of the sample media resource, the resource feature of the sample media resource, and the user feature of the sample user into the first training neural network to obtain the first predicted click rate output by the first training neural network may include: generating a cross feature vector of the position feature, the resource feature and the user feature; and determining a first predicted click rate according to the cross feature vector.

In this embodiment, a cross feature vector can be generated from the position feature, the resource feature and the user feature. The cross feature represents the position information, the resource information and the user information at the same time, so the feature data are fused better and a more accurate first predicted click rate can be obtained. For example, crossing the three features can capture that a 25-year-old woman tends to click entertainment news displayed in the upper left: the user features are age 25 and gender female, the position feature is the upper-left position, and the resource feature (category) is entertainment news.
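The embodiment does not spell out how the cross feature vector is built. One possible construction, sketched below as an assumption in the spirit of factorization machines, takes the element-wise product of every pair of field embeddings and concatenates the results. The function name `cross_feature_vector` and the two-dimensional embeddings are hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def cross_feature_vector(fields: Dict[str, List[float]]) -> List[float]:
    """Concatenate the element-wise products of every pair of field embeddings."""
    crossed: List[float] = []
    # Sort by field name so that the output layout is deterministic.
    for (_, left), (_, right) in combinations(sorted(fields.items()), 2):
        crossed.extend(x * y for x, y in zip(left, right))
    return crossed


# Hypothetical 2-dimensional embeddings for the three feature fields.
fields = {
    "position": [1.0, 0.0],  # e.g. the upper-left slot
    "resource": [0.5, 2.0],  # e.g. entertainment news
    "user": [3.0, 1.0],      # e.g. age 25, female
}
vec = cross_feature_vector(fields)
print(vec)
```

A downstream layer would map this vector to the first predicted click rate.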

As an alternative embodiment, the technical solution of the present application is described in detail below by taking Take a Look in WeChat as an example; that is, recommending a set of articles is used to illustrate how the model parameters of the second training neural network are adjusted and how articles are recommended to the user.

Step S1: the position characteristics of the sample article, the characteristics of the article and the user characteristics of the sample user are input into a first training neural network to obtain a first predicted click rate output by the first training neural network. Specifically, the position feature W1 of article 1, the feature P1 of article 1, the age feature n1 of user B and the gender feature x1 of user B are input into the first training neural network to obtain the first predicted click rate. The user characteristics of user B are not limited to the age feature n1 and the gender feature x1 and may include user characteristics of multiple different dimensions; likewise, the resource characteristics of article 1 may include information of multiple dimensions. The more characteristics are provided, the more accurate the output first predicted click rate is.

Step S2: the feature P1 of article 1, the age feature n1 of user B and the gender feature x1 of user B are input into a second training neural network to obtain a second predicted click rate output by the second training neural network. As in step S1, the user characteristics of user B and the resource characteristics of article 1 may each cover multiple dimensions, and the more characteristics are provided, the more accurate the output second predicted click rate is.

Step S3: a loss value of a target loss function of the second training neural network is determined according to the first predicted click rate, the second predicted click rate and the actual click result, and the model parameters in the second training neural network are adjusted according to the loss value of the target loss function.

As an alternative embodiment, the technical solution of the present application is described below by taking a game announcement in a game application as an example:

Step S1: the position characteristics of the sample announcement resource, the characteristics of the announcement resource and the user characteristics of the sample user are input into a first training neural network to obtain a first predicted click rate output by the first training neural network. Specifically, the position feature W2 of announcement resource 2, the feature P2 of announcement resource 2, the age feature n1 of user C, the gender feature x1 of user C, the game level of user C and the damage s1 dealt by user C within a period of time are input into the first training neural network to obtain the first predicted click rate. The user characteristics of user C are not limited to these and may include user characteristics of multiple different dimensions; likewise, the resource characteristics of announcement resource 2 may include information of multiple dimensions. The more characteristics are provided, the more accurate the output first predicted click rate is.

Step S2: the feature P2 of announcement resource 2, the age feature n1 of user C, the gender feature x1 of user C, the game level of user C and the damage s1 dealt by user C within a period of time are input into the second training neural network to obtain a second predicted click rate output by the second training neural network. As in step S1, the user characteristics of user C and the resource characteristics of announcement resource 2 may each cover multiple dimensions, and the more characteristics are provided, the more accurate the output second predicted click rate is.

The adjusted second training neural network can be used as a target neural network, and the target neural network can be used for recommending game announcement information in a game application, that is, for recommending to the user information in which the user is interested. Because the pushed content is content in which the user is interested, the conversion rate of the information is increased.

Optionally, the present application further provides a click rate prediction model that eliminates position bias based on knowledge distillation; fig. 7 is a flowchart of this model.

In this embodiment, the click rate prediction model may be applied to, but is not limited to, recommending information in Take a Look in WeChat.

Take a Look in WeChat is a feed-stream recommendation product that can recommend different kinds of content, such as official-account articles, videos and news. The click rate prediction model (comprising the first training neural network, the second training neural network and the third training neural network) estimates the click rate of each article, video and news item, the items are ranked according to the estimated click rate, and the content with a high click rate is finally recommended to the user.

In this embodiment, the teacher network (equivalent to the first training neural network) uses a feature-based method: the position feature (the position feature of the sample media resource) and the other features (equivalent to the resource feature of the sample media resource and the user feature of the sample user) are taken as the input of the teacher network's deep recommendation model DeepFM (the first training neural network model), so that the position feature and the other features are crossed in multiple dimensions. The student network (equivalent to the second training neural network) adopts a PAL model structure, models the position feature separately, and adds or multiplies the results obtained by the neural network DNN (equivalent to the third training neural network) and by DeepFM to obtain the final output. Cross entropy is calculated against the soft target output by the teacher network and against the hard target given by the real label, and the two are linearly weighted to give the loss function of the student network:

$$L = \alpha L^{(soft)} + (1 - \alpha) L^{(hard)}$$

where $L^{(soft)}$ represents the first loss value, $L^{(hard)}$ represents the second loss value, $L$ represents the loss value of the target loss function of the second training neural network, and $\alpha$ represents a preset weight with $0 < \alpha < 1$.
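As a worked illustration of the weighting, assume a preset weight of 0.6 (the value cited above as typical), a first loss value of 0.01 and a second loss value of 1.61; the two loss values are made-up numbers chosen only for the arithmetic:

$$L = 0.6 \times 0.01 + (1 - 0.6) \times 1.61 = 0.006 + 0.644 = 0.650$$

With these numbers the second loss value dominates the total even though it carries the smaller weight, because it is far larger than the first loss value.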

In this embodiment, the output of the teacher network is transferred to the student network through knowledge distillation, so that the student network learns the multi-dimensional crossing of the position feature with the other features. The student network models the position feature separately, and only the pCTR part output by DeepFM is used in online prediction, which reduces the impact of position features being unavailable online. The network structure used here is a modified version of DeepFM. Fig. 8 is a structural diagram of the click rate prediction model that eliminates position bias based on knowledge distillation; as shown in fig. 8, the model includes a teacher network, a student network and a neural network. The position characteristics, the user characteristics of the user and the resource characteristics of the media resources are input into the teacher network to obtain a first predicted click rate output by the teacher network. The position characteristics are input into the neural network model to obtain a predicted exposure rate output by the neural network model. The user characteristics of the user and the resource characteristics of the media resources are input into the student network to obtain a second predicted click rate output by the student network. A first loss value is determined according to the first predicted click rate and the second predicted click rate; a loss value of a target loss function of the student network is determined according to the first predicted click rate, the second predicted click rate, the predicted exposure rate and an actual click result; and the model parameters in the student network are adjusted according to the loss value of the target loss function.
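The training flow described above can be sketched end to end under strong simplifying assumptions: single-layer logistic models stand in for DeepFM and the DNN, the teacher and position models use fixed hypothetical weights, the product form of the predicted output value is used, and the gradient is estimated by finite differences rather than backpropagation. All names, weights and data below are illustrative.

```python
import math
from typing import List, Tuple

ALPHA = 0.6  # preset weight; 0.6 is cited in the text as a typical value

Sample = Tuple[float, float, float, int]  # (position, resource, user, clicked)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def teacher_ctr(pos: float, res: float, usr: float) -> float:
    """Stand-in teacher: sees position, resource and user features."""
    return sigmoid(-0.8 * pos + 1.2 * res + 0.5 * usr)


def exposure_rate(pos: float) -> float:
    """Stand-in position network: exposure decays with the slot index."""
    return sigmoid(2.0 - 0.7 * pos)


def student_ctr(w: List[float], res: float, usr: float) -> float:
    """Stand-in student: sees only resource and user features."""
    return sigmoid(w[0] * res + w[1] * usr + w[2])


def target_loss(w: List[float], batch: List[Sample]) -> float:
    total = 0.0
    for pos, res, usr, clicked in batch:
        p_student = student_ctr(w, res, usr)
        out = p_student * exposure_rate(pos)  # predicted output value (product form)
        out = min(max(out, 1e-7), 1.0 - 1e-7)
        soft = (teacher_ctr(pos, res, usr) - p_student) ** 2
        hard = -(clicked * math.log(out) + (1 - clicked) * math.log(1.0 - out))
        total += ALPHA * soft + (1.0 - ALPHA) * hard
    return total / len(batch)


def numeric_grad(w: List[float], batch: List[Sample], h: float = 1e-5) -> List[float]:
    """Central finite-difference gradient; a shortcut for this sketch only."""
    grad = []
    for i in range(len(w)):
        up, down = w[:], w[:]
        up[i] += h
        down[i] -= h
        grad.append((target_loss(up, batch) - target_loss(down, batch)) / (2.0 * h))
    return grad


batch = [(0.0, 1.0, 1.0, 1), (1.0, 1.0, 0.0, 1), (3.0, 0.0, 1.0, 0), (4.0, 0.0, 0.0, 0)]
w = [0.0, 0.0, 0.0]
loss_before = target_loss(w, batch)
for _ in range(200):
    grad = numeric_grad(w, batch)
    w = [wi - 0.5 * gi for wi, gi in zip(w, grad)]  # adjust only the student
loss_after = target_loss(w, batch)


def online_pctr(res: float, usr: float) -> float:
    """Online prediction uses the student alone; no position feature is needed."""
    return student_ctr(w, res, usr)


print(loss_before, loss_after, online_pctr(1.0, 1.0))
```

Only the student weights are updated in this sketch, and `online_pctr` never reads a position feature, mirroring the point that only the pCTR part is used online.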

It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.

According to another aspect of the embodiment of the present invention, there is also provided a processing apparatus of a media resource for implementing the processing method of the media resource. As shown in fig. 9, the processing device of the media resource includes: a first input unit 91, a second input unit 93, a determination unit 95, and an adjustment unit 97.

The first input unit 91 is configured to input the position feature of the sample media resource, the resource feature of the sample media resource, and the user feature of the sample user into the first training neural network, so as to obtain a first predicted click rate output by the first training neural network, where the position feature is used to represent an actual arrangement position of the sample media resource in a group of media resources pushed to the sample user, and the first predicted click rate is used to represent a predicted probability that the sample user clicks the sample media resource.

The second input unit 93 is configured to input the resource characteristics of the sample media resources and the user characteristics of the sample user into a second training neural network to obtain a second predicted click rate output by the second training neural network, where the second predicted click rate is used to represent a predicted probability that the sample user clicks the sample media resources, and the first training neural network and the second training neural network have the same model structure.

The determining unit 95 is configured to determine a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, and an actual click result, where the actual click result is used to indicate whether the sample user actually clicks the sample media resource.

The adjusting unit 97 is configured to adjust the model parameters in the second training neural network according to the loss value of the target loss function.

Optionally, in this embodiment, the determining unit 95 may include: a first determining module, configured to determine a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, the predicted exposure rate and an actual click result, wherein the predicted exposure rate is the exposure rate determined according to the position characteristics of the sample media resources, and the predicted exposure rate is used for representing the probability that the sample media resources are pushed to the sample user.

By the embodiment provided by the application, the first input unit 91 inputs the position characteristics of the sample media resources, the resource characteristics of the sample media resources, and the user characteristics of the sample user into the first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing actual arrangement positions of the sample media resources in a group of media resources pushed to the sample user, and the first predicted click rate is used for representing a predicted probability of the sample user clicking the sample media resources; the second input unit 93 inputs the resource characteristics of the sample media resources and the user characteristics of the sample users into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure; the determining unit 95 determines a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether a sample user actually clicks a sample media resource; the adjusting unit 97 adjusts the model parameters in the second trained neural network according to the loss value of the target loss function. 
In this way, the loss value of the target loss function of the second training neural network is determined according to the first predicted click rate output by the first training neural network, the second predicted click rate output by the second training neural network and the actual click result, and the model parameters in the second training neural network are adjusted according to the loss value. The adjusted second training neural network can eliminate the bias introduced by the position information of the media resources; that is, when media resources are recommended to a user, the influence of the position information is removed and the media resources in which the user is interested are recommended more accurately, which solves the technical problem in the prior art that recommending media resources according to user behavior has low accuracy.

Optionally, the apparatus may further include: a third input unit, configured to input the position characteristics of the sample media resources into a third training neural network to obtain the predicted exposure rate output by the third training neural network.

The first determining module may include: the first determining submodule is used for determining a first loss value according to the first predicted click rate and the second predicted click rate; the second determining submodule is used for determining a second loss value according to a second predicted click rate, a predicted exposure rate and an actual click result; and the third determining submodule is used for determining the loss value of the target loss function of the second training neural network according to the first loss value and the second loss value.

It should be noted that the third determining submodule is further configured to determine the loss value of the target loss function of the second training neural network by the following formula:

$$L = \alpha L^{(soft)} + (1 - \alpha) L^{(hard)}$$

where $L^{(soft)}$ represents the first loss value, $L^{(hard)}$ represents the second loss value, $L$ represents the loss value of the target loss function of the second training neural network, and $\alpha$ represents a preset weight with $0 < \alpha < 1$.

The first determining sub-module is further configured to perform the following operations: determining a square of a difference between the first predicted click rate and the second predicted click rate as a first loss value; or determining the sum or product of the second predicted click rate and the predicted exposure rate as a predicted output value; a first loss value is determined as a square of a difference between the first predicted click rate and the predicted output value.

The second determining sub-module is further configured to perform the following operations: determining the sum or product of the second predicted click rate and the predicted exposure rate as a predicted output value; and determining the cross entropy of the predicted output value and the actual click result as a second loss value.

Optionally, the adjusting unit 97 may include: an adjusting module, configured to adjust the model parameters in the second training neural network in the direction that reduces the loss value of the target loss function.

Optionally, the apparatus may further include: an obtaining unit, configured to obtain the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users, wherein the resource characteristics of the sample media resources comprise resource characteristics of one or more dimensions, and the user characteristics of the sample users comprise user characteristics of one or more dimensions.

The obtaining unit may include: an obtaining module, configured to obtain the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users, wherein the resource characteristics of the sample media resources comprise the types of the sample media resources, and the user characteristics of the sample users comprise the age and gender of the sample users.

Optionally, the determining unit may further include: a first input module, configured to input the first predicted click rate output by the first training neural network, the second predicted click rate output by the second training neural network and the actual click result into a target fully connected layer module; a second determining module, configured to determine a first loss value between the first predicted click rate and the second predicted click rate through the target fully connected layer module; a third determining module, configured to determine a second loss value between the second predicted click rate and the actual click result through the target fully connected layer module; and a fourth determining module, configured to determine the loss value of the target loss function according to the first loss value and the second loss value through the target fully connected layer module.

Optionally, the first input unit 91 may include: a generating module, configured to generate a cross feature vector of the position feature, the resource feature and the user feature; and a fifth determining module, configured to determine the first predicted click rate according to the cross feature vector.

According to another aspect of the embodiment of the present invention, there is also provided an electronic device for implementing the processing method of the media resource, where the electronic device may be the terminal device or the server shown in fig. 1. The present embodiment takes the electronic device as a server as an example for explanation. As shown in fig. 10, the electronic device comprises a memory 1002 and a processor 1004, the memory 1002 having stored therein a computer program, the processor 1004 being arranged to execute the steps of any of the method embodiments described above by means of the computer program.

Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.

Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:

S1, inputting the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users into a first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing the actual arrangement positions of the sample media resources in a group of media resources pushed to the sample users, and the first predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources;

S2, inputting the resource characteristics of the sample media resources and the user characteristics of the sample users into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure;

S3, determining a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether a sample user actually clicks a sample media resource;

S4, adjusting the model parameters in the second training neural network according to the loss value of the target loss function.

Optionally, it can be understood by those skilled in the art that the structure shown in fig. 10 is only illustrative, and the electronic device may also be a terminal device such as a smartphone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. Fig. 10 illustrates one structure of the electronic device and is not limiting. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in fig. 10, or have a different configuration from that shown in fig. 10.

The memory 1002 may be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for processing a media resource in the embodiment of the present invention. The processor 1004 executes various functional applications and data processing by running the software programs and modules stored in the memory 1002, thereby implementing the above-described method for processing a media resource. The memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1002 may further include memory located remotely from the processor 1004, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1002 may be used specifically, but not exclusively, to store information such as the first training neural network, the second training neural network, the position characteristics of the sample media resources, the resource characteristics of the sample media resources, the user characteristics of the sample users, and the loss value of the target loss function. As an example, as shown in fig. 10, the memory 1002 may include, but is not limited to, the first input unit 91, the second input unit 93, the determining unit 95, and the adjusting unit 97 of the processing device of the media resource. In addition, the memory 1002 may further include, but is not limited to, other module units of the processing device of the media resource, which are not described again in this example.

Optionally, the transmission device 1006 is configured to receive or send data via a network. Examples of the network may include wired networks and wireless networks. In one example, the transmission device 1006 includes a Network Interface Controller (NIC), which can be connected to other network devices and a router via a network cable so as to communicate with the Internet or a local area network. In another example, the transmission device 1006 is a Radio Frequency (RF) module, which is used to communicate with the Internet wirelessly.

In other embodiments, the terminal device or the server may be a node in a distributed system. The distributed system may be a blockchain system, which may be formed by a plurality of nodes connected through network communication. The nodes may form a Peer-To-Peer (P2P) network, and any type of computing device, such as a server, a terminal, or another electronic device, can become a node in the blockchain system by joining the Peer-To-Peer network.

According to an aspect of the present application, a computer program product or computer program is provided, which comprises computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the method for processing a media resource provided in the above aspects or in their various alternative implementations. The computer program is arranged to perform, when executed, the steps of any of the above method embodiments.

Optionally, in this embodiment, the above computer-readable storage medium may be configured to store a computer program for executing the steps of the above method for processing a media resource.

Optionally, in this embodiment, a person skilled in the art will understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.

Claims

1. A method for processing media resources, comprising:

inputting the position characteristics of sample media resources, the resource characteristics of the sample media resources and the user characteristics of a sample user into a first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing the actual arrangement positions of the sample media resources in a group of media resources pushed to the sample user, and the first predicted click rate is used for representing the predicted probability of the sample user clicking the sample media resources;

inputting the resource features of the sample media resources and the user features of the sample users into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for representing the predicted probability of the sample users clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure;

determining a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether the sample user actually clicks the sample media resource;

and adjusting the model parameters in the second training neural network according to the loss value of the target loss function.
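By way of non-limiting illustration, the data flow recited in claim 1 may be sketched as follows. The class `TinyCtrModel`, the feature values and the feature dimensions are hypothetical and chosen only for this sketch; a single logistic unit stands in for each training neural network, and "the same model structure" is interpreted here as the two networks being instances of the same class that differ only in input width.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyCtrModel:
    """Hypothetical stand-in for a training neural network: one logistic unit."""

    def __init__(self, n_inputs, seed):
        rng = random.Random(seed)
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
        self.bias = 0.0

    def predict(self, features):
        # Weighted sum followed by a sigmoid gives a value in (0, 1).
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return sigmoid(z)


# Illustrative sample (all values are made up for this sketch).
position_features = [1.0, 0.0, 0.0]  # one-hot: shown in slot 0 of 3
resource_features = [0.2, 0.7]
user_features = [0.5, 0.1]
actual_click = 1                     # the sample user did click

# First network sees position, resource and user features.
first_network = TinyCtrModel(n_inputs=7, seed=0)
# Second network sees only resource and user features.
second_network = TinyCtrModel(n_inputs=4, seed=1)

first_predicted_ctr = first_network.predict(
    position_features + resource_features + user_features
)
second_predicted_ctr = second_network.predict(resource_features + user_features)
```

The two predicted click rates and `actual_click` are the three quantities from which the loss value of the target loss function is determined; only the parameters of `second_network` would then be adjusted.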

2. The method of claim 1, wherein determining a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result comprises:

and determining a loss value of the target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, the predicted exposure rate and the actual click result, wherein the predicted exposure rate is determined according to the position characteristics of the sample media resources, and the predicted exposure rate is used for representing the probability that the sample media resources are pushed to the sample user.

3. The method of claim 2, further comprising:

inputting the position characteristics of the sample media resources into a third training neural network to obtain the predicted exposure rate output by the third training neural network.
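As a minimal sketch only, the third training neural network of claim 3 may be pictured as a model whose sole input is the position characteristic. The class `PositionExposureModel`, the one-hot encoding of the position and the per-position logits below are assumptions made for illustration and are not taken from the claims.

```python
import math


class PositionExposureModel:
    """Hypothetical stand-in for the third training neural network."""

    def __init__(self, n_positions):
        # One learnable logit per arrangement position (assumed form).
        self.logits = [0.0] * n_positions

    def predict(self, position_one_hot):
        # Select the logit of the active position and squash it into (0, 1).
        z = sum(l * p for l, p in zip(self.logits, position_one_hot))
        return 1.0 / (1.0 + math.exp(-z))


exposure_model = PositionExposureModel(n_positions=3)
exposure_model.logits = [2.0, 0.0, -2.0]  # made-up values: earlier slots favored

top_slot_exposure = exposure_model.predict([1.0, 0.0, 0.0])
middle_slot_exposure = exposure_model.predict([0.0, 1.0, 0.0])
bottom_slot_exposure = exposure_model.predict([0.0, 0.0, 1.0])
```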

4. The method of claim 2, wherein determining a loss value of the target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate, the predicted exposure rate and the actual click result comprises:

determining a first loss value according to the first predicted click rate and the second predicted click rate;

determining a second loss value according to the second predicted click rate, the predicted exposure rate and the actual click result;

and determining the loss value of the target loss function of the second training neural network according to the first loss value and the second loss value.

5. The method of claim 4, wherein determining the loss value of the target loss function of the second training neural network according to the first loss value and the second loss value comprises:

determining the loss value of the target loss function of the second training neural network by the following formula:

L = αL^(soft) + (1 - α)L^(hard)
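The formula of claim 5 may be illustrated by the following sketch. Reading L^(soft) as the first loss value and L^(hard) as the second loss value of claim 4, and restricting α to the interval [0, 1], are assumptions of this sketch; the function name `target_loss` is hypothetical.

```python
def target_loss(soft_loss, hard_loss, alpha):
    """Weighted combination L = alpha * L_soft + (1 - alpha) * L_hard."""
    if not 0.0 <= alpha <= 1.0:
        # Assumed constraint: alpha acts as a mixing weight.
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return alpha * soft_loss + (1.0 - alpha) * hard_loss


# Illustrative values only.
combined = target_loss(soft_loss=0.04, hard_loss=0.5, alpha=0.25)
```

With α close to 1 the second training neural network mainly follows the first predicted click rate, and with α close to 0 it mainly follows the actual click result.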

6. The method of claim 4, wherein determining a first loss value according to the first predicted click rate and the second predicted click rate comprises:

determining a square of a difference between the first predicted click rate and the second predicted click rate as the first loss value; or

determining a sum or product of the second predicted click rate and the predicted exposure rate as a predicted output value, and determining a square of a difference between the first predicted click rate and the predicted output value as the first loss value.
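The two alternatives of claim 6 may be sketched as follows. The function names and the `combine` parameter are hypothetical and are introduced only for this illustration.

```python
def first_loss_direct(first_ctr, second_ctr):
    """Squared difference between the two predicted click rates."""
    return (first_ctr - second_ctr) ** 2


def first_loss_with_exposure(first_ctr, second_ctr, exposure, combine="product"):
    """Squared difference between the first predicted click rate and the
    predicted output value (sum or product of second CTR and exposure rate)."""
    if combine == "product":
        predicted_output = second_ctr * exposure
    elif combine == "sum":
        predicted_output = second_ctr + exposure
    else:
        raise ValueError("combine must be 'product' or 'sum'")
    return (first_ctr - predicted_output) ** 2


# Illustrative values only.
loss_a = first_loss_direct(0.3, 0.5)
loss_b = first_loss_with_exposure(0.3, 0.5, 0.8, combine="product")
```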

7. The method of claim 4, wherein determining a second loss value based on the second predicted click rate, the predicted exposure rate, and the actual click result comprises:

determining a sum or product of the second predicted click rate and the predicted exposure rate as a predicted output value;

and determining the cross entropy of the predicted output value and the actual click result as the second loss value.
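As a hedged sketch of claim 7, the second loss value may be computed as a binary cross entropy. The function name `second_loss`, the `combine` parameter and the clipping constant `eps` are assumptions of this sketch; clipping is added because a sum of two rates can exceed 1, where the logarithm would otherwise be undefined.

```python
import math


def second_loss(second_ctr, exposure, clicked, combine="product", eps=1e-7):
    """Cross entropy between the predicted output value and the click label."""
    if combine == "product":
        predicted_output = second_ctr * exposure
    elif combine == "sum":
        predicted_output = second_ctr + exposure
    else:
        raise ValueError("combine must be 'product' or 'sum'")
    # Keep the value strictly inside (0, 1) so both logarithms are defined.
    predicted_output = min(max(predicted_output, eps), 1.0 - eps)
    return -(
        clicked * math.log(predicted_output)
        + (1 - clicked) * math.log(1.0 - predicted_output)
    )


# Illustrative values only: a clicked sample with predicted output 0.5 * 0.8.
loss_clicked = second_loss(0.5, 0.8, clicked=1)
loss_not_clicked = second_loss(0.5, 0.8, clicked=0)
```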

8. The method of any one of claims 1 to 7, wherein the adjusting the model parameters in the second training neural network according to the loss value of the target loss function comprises:

and adjusting the model parameters in the second training neural network in a direction that reduces the loss value of the target loss function.
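One way to adjust parameters in a loss-reducing direction is a gradient descent step, sketched below. This is an illustrative assumption, not the claimed procedure: the second training neural network is reduced to a single logistic unit, the predicted exposure rate is omitted, the first predicted click rate is held fixed, and the names `gradient_step`, `alpha` and `learning_rate` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def target_loss(first_ctr, second_ctr, clicked, alpha):
    soft = (first_ctr - second_ctr) ** 2
    hard = -(clicked * math.log(second_ctr)
             + (1 - clicked) * math.log(1.0 - second_ctr))
    return alpha * soft + (1.0 - alpha) * hard


def gradient_step(weights, bias, features, first_ctr, clicked, alpha, learning_rate):
    s = predict(weights, bias, features)
    # Chain rule through the sigmoid: d(soft)/dz = -2 (t - s) s (1 - s),
    # d(hard)/dz = s - y.
    grad_z = (-2.0 * alpha * (first_ctr - s) * s * (1.0 - s)
              + (1.0 - alpha) * (s - clicked))
    new_weights = [w - learning_rate * grad_z * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * grad_z
    return new_weights, new_bias


# Illustrative values only.
features = [0.2, 0.7, 0.5, 0.1]
weights, bias = [0.0, 0.0, 0.0, 0.0], 0.0
first_ctr, clicked, alpha = 0.7, 1, 0.5

loss_before = target_loss(first_ctr, predict(weights, bias, features), clicked, alpha)
weights, bias = gradient_step(weights, bias, features, first_ctr, clicked, alpha, 0.1)
loss_after = target_loss(first_ctr, predict(weights, bias, features), clicked, alpha)
```

Only the parameters of the second network are updated in this sketch, which matches the claim language that the model parameters in the second training neural network are adjusted.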

9. The method according to any one of claims 1 to 7, further comprising:

obtaining the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users, wherein the resource characteristics of the sample media resources comprise resource characteristics of one or more dimensions, and the user characteristics of the sample users comprise user characteristics of one or more dimensions.

10. The method of claim 9, wherein the obtaining the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users comprises:

acquiring the position characteristics of the sample media resources, the resource characteristics of the sample media resources and the user characteristics of the sample users, wherein the resource characteristics of the sample media resources comprise the types of the sample media resources, and the user characteristics of the sample users comprise the ages and the sexes of the sample users.

11. The method of any one of claims 1 to 7, wherein determining a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result comprises:

inputting the first predicted click rate output by the first training neural network, the second predicted click rate output by the second training neural network and the actual click result into a target fully-connected layer module;

determining, by the target fully-connected layer module, a first loss value between the first predicted click rate and the second predicted click rate;

determining, by the target fully-connected layer module, a second loss value between the second predicted click rate and the actual click result;

determining, by the target fully-connected layer module, a loss value of the target loss function according to the first loss value and the second loss value.

12. The method of any one of claims 1 to 7, wherein the inputting the position characteristics of sample media resources, the resource characteristics of the sample media resources and the user characteristics of a sample user into a first training neural network to obtain a first predicted click rate output by the first training neural network comprises:

generating a cross feature vector of the position characteristics with the resource characteristics and the user characteristics;

and determining the first predicted click rate according to the cross feature vector.
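Claim 12 does not fix how the cross feature vector is formed. The sketch below assumes one common construction, the flattened outer product of a one-hot position vector with the concatenated resource and user features, followed by a single logistic unit; the function names and all values are hypothetical.

```python
import math


def cross_features(position_features, other_features):
    """Flattened outer product: every position entry times every other entry."""
    return [p * o for p in position_features for o in other_features]


def predict_from_cross(cross_vector, weights, bias=0.0):
    """Logistic score over the cross feature vector (assumed scoring form)."""
    z = bias + sum(w * x for w, x in zip(weights, cross_vector))
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative values only: position slot 0 of 2, then resource + user features.
position_features = [1.0, 0.0]
resource_and_user_features = [0.2, 0.7, 0.5]

cross_vector = cross_features(position_features, resource_and_user_features)
first_predicted_ctr = predict_from_cross(cross_vector, weights=[0.0] * len(cross_vector))
```

With a one-hot position vector, this construction gives each arrangement position its own copy of the resource and user features, so the first training neural network can learn position-specific weights.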

13. An apparatus for processing media resources, comprising:

a first input unit, configured to input the position characteristics of sample media resources, the resource characteristics of the sample media resources and the user characteristics of a sample user into a first training neural network to obtain a first predicted click rate output by the first training neural network, wherein the position characteristics are used for representing the actual arrangement positions of the sample media resources in a group of media resources pushed to the sample user, and the first predicted click rate is used for representing the predicted probability of the sample user clicking the sample media resources;

a second input unit, configured to input the resource characteristics of the sample media resources and the user characteristics of the sample user into a second training neural network to obtain a second predicted click rate output by the second training neural network, wherein the second predicted click rate is used for representing the predicted probability of the sample user clicking the sample media resources, and the first training neural network and the second training neural network have the same model structure;

a determining unit, configured to determine a loss value of a target loss function of the second training neural network according to the first predicted click rate, the second predicted click rate and an actual click result, wherein the actual click result is used for indicating whether the sample user actually clicks the sample media resources;

and an adjusting unit, configured to adjust the model parameters in the second training neural network according to the loss value of the target loss function.

14. A computer-readable storage medium, comprising a stored program, wherein the program when executed performs the method of any of claims 1 to 12.

15. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 12 by means of the computer program.