CN114819000B - Feedback information estimation model training method and device and electronic equipment

Info

Publication number: CN114819000B
Application number: CN202210746663.2A
Authority: CN
Inventors: 应元翔; 谢淼; 解浪
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2022-06-29
Filing date: 2022-06-29
Publication date: 2022-10-21
Anticipated expiration: 2042-06-29
Also published as: CN114819000A

Abstract

The disclosure relates to a feedback information estimation model training method and device and electronic equipment, belonging to the technical field of deep learning. The method comprises: obtaining a first feature set and a sample data set; calibrating an estimation result output by a feedback information estimation model according to a calibration model corresponding to the feature of each dimension in the first feature set, to obtain a calibration result of the sample data set; determining at least one target feature from the first feature set based on the calibration result; and adding the target feature to the training process of the feedback information estimation model. The accuracy of the calibration result obtained based on the calibration model corresponding to the target feature is greater than the accuracy of the estimation result, which indicates that calibrating the estimation result with that calibration model yields a more accurate result. Adding the feature corresponding to that calibration model to the training process of the feedback information estimation model can therefore effectively improve the iteration efficiency of model training and the accuracy of the model.

Description

Feedback information estimation model training method and device and electronic equipment

Technical Field

The disclosure relates to the technical field of deep learning, and in particular relates to a feedback information estimation model training method and device and electronic equipment.

Background

With the gradual maturing of deep learning technology, deep learning has replaced traditional machine learning algorithms and become the technology of first choice in machine learning. In essence, deep learning learns more useful features by constructing a machine learning model with multiple hidden layers and training it on massive training data, thereby improving the accuracy of the model output.

In the related art, in order to improve the performance of a model, the feature set of the model is often optimized by a feature selection method. For example, a technician manually selects features based on parameters such as the missing rate, relevance, and information value of each feature. As another example, original high-dimensional data is mapped to low-dimensional data by data dimension reduction while noise in the data is filtered out, so that a feature set with a higher degree of abstraction is obtained.

However, the above methods often require a large amount of manpower, computing power and time, and cannot ensure that the selected features actually benefit the model, resulting in low accuracy of the trained model.

Disclosure of Invention

The present disclosure provides a feedback information estimation model training method and device and electronic equipment, which can effectively improve the iteration efficiency of model training and the accuracy of the model. The technical solutions of the disclosure are as follows.

According to a first aspect of the embodiments of the present disclosure, a feedback information estimation model training method is provided, the method including:

obtaining a first feature set and a sample data set, wherein the first feature set comprises features of multiple dimensions of media resources, the sample data set comprises multiple sample media resources, and each sample media resource comprises feature values of the features of the multiple dimensions of the sample media resource;

calibrating, based on the calibration model corresponding to each feature in the first feature set, the estimation result of the sample data set obtained based on the feedback information estimation model, to obtain the calibration result of the sample data set corresponding to each calibration model;

determining at least one target feature from the first feature set based on the estimation result of the sample data set and the calibration result of the sample data set corresponding to each calibration model to obtain a second feature set, wherein the accuracy of the calibration result obtained based on the calibration model corresponding to the target feature is greater than the accuracy of the estimation result;

and training the feedback information estimation model based on the second feature set and the sample data set to obtain the trained feedback information estimation model.

According to this technical solution, after the first feature set and the sample data set are obtained, the estimation result output by the feedback information estimation model is calibrated according to the calibration model corresponding to the feature of each dimension in the first feature set to obtain the calibration result of the sample data set, at least one target feature is determined from the first feature set based on the calibration result, and the target feature is added to the training process of the feedback information estimation model. Because the accuracy of the calibration result obtained based on the calibration model corresponding to the target feature is greater than that of the estimation result, calibrating the estimation result with that calibration model yields a more accurate result. Adding the feature corresponding to that calibration model to the training process of the feedback information estimation model can therefore effectively improve the iteration efficiency of model training and the accuracy of the model.

In some embodiments, the method further comprises:

acquiring, based on a first feature of the media resource, feature values of the first feature from the sample data set, wherein the first feature refers to a feature of any dimension of the media resource in the first feature set;

dividing the sample data set into a plurality of buckets based on the feature values of the first feature;

determining a calibration model corresponding to the first feature based on the sample media resources corresponding to each bucket.

In this way, the server traverses the features of each dimension in the first feature set to obtain a calibration model corresponding to each feature, so as to facilitate the subsequent screening of at least one target feature from the first feature set.
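As an illustrative, non-limiting sketch of the bucketing steps above, the following Python code fits one calibration factor per bucket for a single feature. The form of the calibration model is not specified in this passage; the per-bucket factor (sum of labels divided by sum of estimation results) and the names `fit_bucket_calibration`, `calibrate`, and `boundaries` are assumptions made only for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def fit_bucket_calibration(feature_values, scores, labels, boundaries):
    """Fit one calibration factor per bucket of a single feature.

    boundaries: sorted cut points; a value v falls in bucket
    bisect_right(boundaries, v). The factor sum(label) / sum(score)
    is an assumed calibration form, not taken from the disclosure.
    """
    score_sum = defaultdict(float)
    label_sum = defaultdict(float)
    for v, s, y in zip(feature_values, scores, labels):
        b = bisect_right(boundaries, v)
        score_sum[b] += s
        label_sum[b] += y
    # Buckets whose total score is zero keep a neutral factor of 1.0.
    return {
        b: (label_sum[b] / score_sum[b]) if score_sum[b] > 0 else 1.0
        for b in score_sum
    }


def calibrate(score, feature_value, boundaries, factors):
    """Scale a raw score by its bucket's factor and clip to [0, 1]."""
    b = bisect_right(boundaries, feature_value)
    return min(1.0, max(0.0, score * factors.get(b, 1.0)))
```

Repeating the fit once per feature in the first feature set would give one such calibration model per feature, which is the traversal described above.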

In some embodiments, the determining at least one target feature from the first feature set based on the estimation result of the sample data set and the calibration result of the sample data set corresponding to each calibration model to obtain a second feature set includes:

acquiring a first evaluation value of the feedback information estimation model based on the estimation result of the sample data set and the label information of the sample data set, wherein the first evaluation value indicates the accuracy of the feedback information estimation model;

acquiring a second evaluation value of a first calibration model based on the calibration result of the sample data set corresponding to the first calibration model and the label information of the sample data set, wherein the first calibration model is the calibration model corresponding to the feature of any dimension in the first feature set, and the second evaluation value indicates the accuracy of the first calibration model;

and determining the feature corresponding to the first calibration model as a target feature in the case that the second evaluation value is larger than the first evaluation value.

The larger an evaluation value is, the closer the output of the corresponding model is to the true value, that is, the better that model is. When the second evaluation value is larger than the first evaluation value, the calibration result output by the first calibration model is more accurate than the estimation result output by the feedback information estimation model; in other words, calibrating the estimation result with the first calibration model yields a more accurate result. The feature corresponding to the first calibration model is therefore determined as a target feature and added to the subsequent training process of the feedback information estimation model, which can effectively improve the iteration efficiency of model training and the accuracy of the model.
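The comparison of evaluation values can be sketched as follows. The concrete evaluation metric is not named in this passage; the negative mean log loss used here (so that a larger value means a more accurate model) and the names `evaluation_value` and `select_target_features` are assumptions for illustration only.

```python
import math


def evaluation_value(predictions, labels, eps=1e-12):
    """Negative mean log loss: larger means closer to the labels (assumed metric)."""
    total = 0.0
    for p, y in zip(predictions, labels):
        p = min(1.0 - eps, max(eps, p))  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def select_target_features(scores, calibrated_by_feature, labels):
    """Return the features whose calibration result beats the raw estimation result.

    scores: estimation results of the sample data set.
    calibrated_by_feature: feature name -> calibration results of the same samples.
    """
    first_value = evaluation_value(scores, labels)
    return [
        feature
        for feature, calibrated in calibrated_by_feature.items()
        if evaluation_value(calibrated, labels) > first_value
    ]
```

The returned list plays the role of the second feature set; features whose calibration model does not improve on the estimation result are left out.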

In some embodiments, the method further comprises:

acquiring an evaluation value of the calibration model corresponding to each feature based on the sample data set, the feedback information estimation model and the calibration model corresponding to each feature;

and determining, among the evaluation values of the calibration models corresponding to the features, the calibration model with the maximum evaluation value as a target calibration model, wherein the target calibration model is used for calibrating the estimation result of a media resource obtained based on the online feedback information estimation model.

In this way, because the evaluation value of a model serves as its evaluation criterion, and a larger evaluation value means the output of the corresponding model is closer to the true value, the calibration model with the largest evaluation value among the plurality of calibration models is determined to be the target calibration model. The output of the target calibration model is the closest to the true value, so applying it to the online service can improve the accuracy of the calibration result.
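A minimal sketch of this selection, assuming the evaluation values have already been computed and are held in a mapping from feature name to evaluation value (the function name and the mapping layout are hypothetical):

```python
def pick_target_calibration_model(evaluation_values):
    """Return the feature whose calibration model has the largest evaluation value."""
    if not evaluation_values:
        raise ValueError("no calibration models to choose from")
    return max(evaluation_values, key=evaluation_values.get)
```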

In some embodiments, the method further comprises:

acquiring sample data in an online time period based on the online feedback information estimation model and the target calibration model;

acquiring a third evaluation value of the online feedback information estimation model based on the online feedback information estimation model and the sample data in the online time period, wherein the third evaluation value indicates the accuracy of the online feedback information estimation model;

acquiring a fourth evaluation value of the target calibration model based on the online feedback information estimation model, the target calibration model and the sample data in the online time period, wherein the fourth evaluation value indicates the accuracy of the target calibration model;

and taking the target calibration model offline in the case that the third evaluation value is larger than the fourth evaluation value.

In this way, the server can promptly evaluate whether the target calibration model is still needed according to the sample data of the most recent period, and take the target calibration model offline promptly when the online service does not need it, which saves computing resources and improves the operating efficiency of the server. Similarly, the server can bring the target calibration model online promptly when the sample data of the most recent period shows that the online service needs it, which improves the accuracy of the online service.
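The offline decision can be sketched as below. The evaluation metric is again an assumption (a negative Brier score, so that larger is better), as are the function names and the `(score, cali_score, label)` tuple layout of the online sample data.

```python
def negative_brier(predictions, labels):
    """Negative mean squared error against 0/1 labels; larger is better (assumed metric)."""
    return -sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)


def review_online_calibration(samples):
    """Decide whether the target calibration model should stay online.

    samples: iterable of (score, cali_score, label) tuples collected
    during the online time period.
    """
    scores, cali_scores, labels = zip(*samples)
    third_value = negative_brier(scores, labels)        # online estimation model
    fourth_value = negative_brier(cali_scores, labels)  # target calibration model
    return "offline" if third_value > fourth_value else "keep"
```

Running the same check periodically on the most recent window of sample data would also indicate when a calibration model that was taken offline is worth bringing back.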

In some embodiments, the training the feedback information prediction model based on the second feature set and the sample data set to obtain the trained feedback information prediction model includes:

based on the second feature set, adjusting the network structure of the feedback information estimation model to obtain the adjusted feedback information estimation model;

and training the adjusted feedback information estimation model based on the sample data set to obtain the trained feedback information estimation model.

In some embodiments, the adjusting the network structure of the feedback information prediction model based on the second feature set to obtain the adjusted feedback information prediction model includes at least one of:

based on the first target characteristic in the second characteristic set, adding a network layer for processing the first target characteristic in the feedback information estimation model to obtain the adjusted feedback information estimation model, wherein the first target characteristic refers to a characteristic which does not exist in the feedback information estimation model;

and adjusting a network layer used for processing the second target characteristic in the feedback information estimation model based on the second target characteristic in the second characteristic set to obtain the adjusted feedback information estimation model, wherein the second target characteristic refers to the characteristic existing in the feedback information estimation model.

Adding the first target feature to the feedback information estimation model enables the trained feedback information estimation model to output a more accurate estimation result. As for the second target feature, the fact that it already exists in the feedback information estimation model indicates that the model has not learned this feature sufficiently. The server therefore strengthens the learning of the second target feature by adjusting the network structure of the feedback information estimation model, so that the trained feedback information estimation model can output a more accurate estimation result.

In some embodiments, the training the adjusted feedback information estimation model based on the sample data set to obtain the trained feedback information estimation model includes:

training the adjusted feedback information estimation model based on the sample data set until an iteration cutoff condition is met to obtain an intermediate feedback information estimation model;

in the case that a calibration model corresponding to the second feature set has been obtained based on the sample data set, the feedback information estimation model and the second feature set, continuing to train the intermediate feedback information estimation model based on the sample data set until a target condition is met, to obtain the trained feedback information estimation model, wherein the target condition is that the accuracy of the estimation result obtained based on the intermediate feedback information estimation model is greater than or equal to the accuracy of the calibration result.

The feedback information estimation model obtained through this training fully learns the target features in the second feature set, which can improve the accuracy of the model. Moreover, because the calibration model based on the target features can produce a calibration result closer to the true value, training the feedback information estimation model with the target features can reduce the scale of model training and greatly improve the iteration efficiency of model training.
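The two training phases can be sketched with caller-supplied callables. Everything named here (`train_two_phase`, `train_step`, `estimate_value`, the fixed `base_steps` cutoff, and the `max_extra_steps` safety cap) is a hypothetical stand-in; the disclosure does not fix the iteration cutoff condition or how accuracy is measured.

```python
def train_two_phase(train_step, estimate_value, calibration_value,
                    base_steps, max_extra_steps):
    """Train in two phases and return the number of extra steps taken.

    Phase 1: run base_steps training steps (assumed iteration cutoff),
    giving the intermediate model.
    Phase 2: keep training until the model's evaluation value reaches
    that of the calibration result, or the safety cap is hit.
    """
    for _ in range(base_steps):
        train_step()
    extra = 0
    while estimate_value() < calibration_value and extra < max_extra_steps:
        train_step()
        extra += 1
    return extra
```

For instance, with a toy model whose evaluation value rises by one per step, `base_steps=5` and a calibration evaluation value of 8, the second phase runs three more steps before the target condition holds.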

According to a second aspect of the embodiments of the present disclosure, there is provided a feedback information prediction model training apparatus, including:

an obtaining unit configured to perform obtaining a first feature set and a sample data set, the first feature set including features of multiple dimensions of a media resource, the sample data set including multiple sample media resources, each sample media resource including feature values of the features of the multiple dimensions of the sample media resource;

a calibration unit configured to calibrate, based on the calibration model corresponding to each feature in the first feature set, the estimation result of the sample data set obtained based on the feedback information estimation model, to obtain a calibration result of the sample data set corresponding to each calibration model;

the determining unit is configured to determine at least one target feature from the first feature set based on the pre-estimation result of the sample data set and the calibration result of the sample data set corresponding to each calibration model, so as to obtain a second feature set, wherein the accuracy of the calibration result obtained based on the calibration model corresponding to the target feature is greater than that of the pre-estimation result;

and the training unit is configured to train the feedback information prediction model based on the second feature set and the sample data set to obtain the trained feedback information prediction model.

In some embodiments, the apparatus further comprises a calibration model determination unit configured to perform:

based on the sample media resources corresponding to each bucket, a calibration model corresponding to the first feature is determined.

In some embodiments, the determining unit is configured to perform:

In some embodiments, the apparatus further comprises a target calibration model determination unit configured to perform:

and determining the calibration model corresponding to the maximum evaluation value in the evaluation values of the calibration models corresponding to each feature as a target calibration model, wherein the target calibration model is used for calibrating the estimation result of the media resource obtained based on the feedback information estimation model which is on-line.

In some embodiments, the apparatus further comprises a target calibration model offline unit configured to perform:

In some embodiments, the training unit is configured to perform:

In some embodiments, the training unit is configured to perform at least one of:

based on a first target characteristic in the second characteristic set, adding a network layer for processing the first target characteristic in the feedback information estimation model to obtain an adjusted feedback information estimation model, wherein the first target characteristic refers to a characteristic which does not exist in the feedback information estimation model;

and adjusting a network layer used for processing the second target characteristics in the feedback information estimation model based on the second target characteristics in the second characteristic set to obtain the adjusted feedback information estimation model, wherein the second target characteristics refer to characteristics existing in the feedback information estimation model.

In some embodiments, the training unit is configured to perform:

training the adjusted feedback information prediction model based on the sample data set until an iteration cutoff condition is met to obtain an intermediate feedback information prediction model;

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:

one or more processors;

a memory for storing program code executable by the processor;

wherein the processor is configured to execute the program code to implement the feedback information estimation model training method.

According to a fourth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein when program codes of the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the above-mentioned feedback information prediction model training method.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is a schematic diagram of an implementation environment of a feedback information estimation model training method according to an embodiment of the present disclosure;

Fig. 2 is a schematic diagram of a feedback information prediction model training system according to an embodiment of the present disclosure;

Fig. 3 is a flowchart of a feedback information prediction model training method provided in an embodiment of the present disclosure;

Fig. 4 is a flowchart of another feedback information prediction model training method provided by the embodiment of the disclosure;

Fig. 5 is a block diagram of a training apparatus for a feedback information prediction model according to an embodiment of the present disclosure;

Fig. 6 is a block diagram of a server provided by an embodiment of the present disclosure.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the accompanying drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in other sequences than those illustrated or described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

The data involved in the present disclosure may be data authorized by the user or fully authorized by all parties.

Fig. 1 is a schematic diagram of an implementation environment of a feedback information prediction model training method provided in an embodiment of the present disclosure, and referring to fig. 1, the implementation environment includes: a terminal 101 and a server 102.

The terminal 101 and the server 102 can be directly or indirectly connected through a wired network or a wireless network, and the disclosure is not limited thereto. In some embodiments, the terminal 101 is a smartphone, tablet, laptop, desktop computer, or the like, but is not so limited. The terminal 101 can provide information required by the feedback information prediction model training method, such as training parameters, an initial deep learning model, a sample data set and the like, to the server 102. Terminal 101 generally refers to one of a plurality of terminals, and the disclosed embodiments are illustrated only with terminal 101.

The server 102 is an independent physical server, or a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a web service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform. The server 102 is configured to execute the feedback information estimation model training method provided by the embodiments of the present disclosure and to perform model training based on the information provided by the terminal 101. In some embodiments, there may be more or fewer servers 102, which is not limited in the embodiments of the present disclosure. Of course, the server 102 may also include other functional servers to provide more comprehensive and diverse services.

Before the feedback information estimation model training method provided by the embodiments of the present disclosure is introduced, a training system for the feedback information estimation model is introduced below for ease of understanding. The training system can automatically carry out the calibration, iteration and upgrade of the feedback information estimation model. The feedback information estimation model provided by the embodiments of the present disclosure is a Logistic Regression (LR)-based deep learning model used for estimating the feedback behavior of an object toward a media resource; the feedback behavior may also be understood as the interaction behavior of the object with the media resource. For example, the media resource is a video, a live stream, a picture, or the like, and the feedback behavior of the object toward the media resource includes clicking, liking, favoriting, viewing for longer than a target duration, and the like, which is not limited in the embodiments of the present disclosure. Illustratively, taking a video as the media resource and a click as the feedback behavior, the feedback information estimation model can estimate the probability that an object clicks a certain video. In some embodiments, the feedback information estimation model is constructed based on a Convolutional Neural Network (CNN); in other embodiments, it has another structure, for example one constructed based on a Deep Neural Network (DNN) or a Recurrent Neural Network (RNN). The structure of the feedback information estimation model is not limited in the embodiments of the present disclosure.

The above-described training system will be briefly described with reference to fig. 2. Fig. 2 is a schematic diagram of a feedback information prediction model training system according to an embodiment of the present disclosure. As shown in fig. 2, the training system includes the following four modules: the system comprises an online service module, a data collection module, an offline training module and a model evaluation module. The calibration, iteration and upgrading of the feedback information estimation model are realized through the synergistic effect of the four modules. The functions of these four modules are described below with reference to fig. 2.

(1) Online service module.

The online service module acquires a plurality of candidate media resources according to an online resource acquisition request. For each candidate media resource, it estimates the feedback behavior of the object toward that candidate media resource through the feedback information estimation model to obtain an estimation result, and then calibrates the estimation result through a calibration model to obtain a calibration result. Finally, it determines a target media resource according to the calibration result of each candidate media resource and provides the target media resource to the object.

Illustratively, for any one candidate media resource, the process of the online service module obtaining the calibration result of the candidate media resource includes the following three stages.

In the first stage, feature extraction is performed on the resource acquisition request and the candidate media resource to obtain the feature set $X = \{x_1, x_2, \ldots, x_p\}$ corresponding to the feedback information estimation model and the feature set $Y = \{y_1, y_2, \ldots, y_q\}$ corresponding to the calibration model, where p and q are positive integers.

The feature sets X and Y each comprise features of multiple dimensions. For example, taking a video as the media resource, the features of the multiple dimensions include, but are not limited to, portrait information of the object that initiated the resource acquisition request, context information at the time the resource acquisition request was initiated, the type of the video, and the video content.

In the second stage, the feature set X is input into the feedback information estimation model to obtain the estimation result $\mathrm{score} = M(X)$ of the candidate media resource.

The estimation result is numerical information. For example, taking a video as the candidate media resource, the estimation result is the probability that the object clicks the video (a value between 0 and 1, or expressed as a percentage, which is not limited here).

In the third stage, the estimation result of the candidate media resource and the feature set Y are input into the calibration model to obtain the calibration result $\mathrm{cali\_score} = \mathrm{Cal}(\mathrm{score}, Y)$ of the candidate media resource.

The calibration result is numerical information. For example, if the estimation result of the candidate media resource is 0.8, the calibration result of the candidate media resource after calibration by the calibration model is 0.85.

It should be noted that the principle of the online service module implementing the three phases is embodied by the embodiment of the subsequent training method, and is not described herein again.
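For orientation only, the three stages can be sketched as a single serving function. The feature extractor, the model M and the calibration function Cal are passed in as callables, and choosing the candidate with the highest calibration result as the target media resource is an assumption; the disclosure states only that the target is determined according to the calibration results.

```python
def serve_request(request, candidates, extract_features, model, calibrate):
    """Return (target candidate, its calibration result) for one request."""
    best, best_score = None, float("-inf")
    for candidate in candidates:
        # Stage 1: feature sets X (for the model) and Y (for calibration).
        x, y = extract_features(request, candidate)
        # Stage 2: score = M(X).
        score = model(x)
        # Stage 3: cali_score = Cal(score, Y).
        cali_score = calibrate(score, y)
        if cali_score > best_score:
            best, best_score = candidate, cali_score
    return best, best_score
```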

(2) Data collection module.

The data collection module is used for collecting data generated by the online service module in the online service process. The collected data can be used as a sample data set for training a feedback information pre-estimation model and a calibration model. Illustratively, the data collected by the data collection module is described below with reference to Table 1.

TABLE 1

As shown in Table 1, for each resource acquisition request processed by the online service module, the data collection module generates a sample data set $D = \{d_1, d_2, \ldots, d_n\}$ based on the resource acquisition request and each candidate media resource corresponding to the resource acquisition request, where n is a positive integer and the sample data set includes a plurality of sample media resources. The sample ID of each sample media resource indicates the sample media resource and the corresponding resource acquisition request. For example, in the sample ID 0001abcd, 0001 uniquely identifies the sample media resource and abcd uniquely identifies the resource acquisition request. Illustratively, any one sample media resource includes the following parts.

1. The resource acquisition request corresponding to the sample media resource.

2. The feature set $X = \{x_1, x_2, \ldots, x_p\}$ corresponding to the feedback information estimation model, where p is a positive integer.

3. The feature set $Y = \{y_1, y_2, \ldots, y_q\}$ corresponding to the calibration model, where q is a positive integer.

4. The estimation result $\mathrm{score} = M(X)$ output by the feedback information estimation model.

5. The calibration result $\mathrm{cali\_score} = \mathrm{Cal}(\mathrm{score}, Y)$ output by the calibration model.

6. The label information (label) corresponding to the sample media resource. Illustratively, taking a video as the sample media resource and a feedback information estimation model that estimates the click-through rate as an example, the label information is either 1 or 0, where 1 indicates that the sample media resource was clicked by the object and 0 indicates that it was not.
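The six parts listed above map naturally onto a record type. The following is a sketch only: the class name, field names, and the fixed four-character split of the sample ID (inferred from the single example 0001abcd) are assumptions, not a format defined by the disclosure.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class SampleMediaResource:
    sample_id: str           # e.g. "0001abcd"
    request: Dict[str, Any]  # 1. resource acquisition request
    x: Dict[str, Any]        # 2. feature set X for the estimation model
    y: Dict[str, Any]        # 3. feature set Y for the calibration model
    score: float             # 4. score = M(X)
    cali_score: float        # 5. cali_score = Cal(score, Y)
    label: int               # 6. 1 = clicked, 0 = not clicked

    def split_id(self, resource_len: int = 4) -> Tuple[str, str]:
        """Split the ID into (resource part, request part); the length is assumed."""
        return self.sample_id[:resource_len], self.sample_id[resource_len:]
```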

(3) Offline training module.

And the offline training module is used for training and optimizing the feedback information pre-estimation model and the calibration model according to the sample data set collected by the data collection module. This part will be described in detail in the following embodiments of the training method, and will not be described herein.

(4) Model evaluation module.

The model evaluation module is used for evaluating the feedback information estimation model and the calibration model during online service and offline training, so as to guide model iteration. In online service, the model evaluation module uses posterior data to compare the estimation result output by the online feedback information estimation model with the calibration result output by the online calibration model, so as to determine whether the calibration model needs to be taken offline. In offline training, the model evaluation module evaluates the training effects of the feedback information estimation model and the calibration model respectively and guides model iteration. This part is described in detail in the following embodiments of the training method and is not repeated here.

On the basis of introducing the feedback information estimation model training system provided by the embodiment of the disclosure, the feedback information estimation model training method provided by the embodiment of the disclosure is introduced through several method embodiments.

Fig. 3 is a flowchart of a method for training a feedback information prediction model according to an embodiment of the present disclosure. As shown in fig. 3, the method is performed by a server and includes the following steps 301 to 304.

In step 301, the server obtains a first feature set and a sample data set.

In an embodiment of the disclosure, the first feature set includes features of multiple dimensions of the media resource. For example, taking the media resource as a picture, the features of the multiple dimensions include: portrait information of the object, picture type, picture content, and the like. For another example, taking the media resource as a video, the features of the multiple dimensions include: portrait information of the object, video length, video content, video type, and the like, which are not limited in this disclosure. In some embodiments, the first feature set is selected manually by a developer. In some embodiments, the first feature set is obtained by a feature engineering method such as a filter method (Filter), a wrapper method (Wrapper), or an embedded method (Embedded), which is not limited in the embodiments of the present disclosure.

The sample data set comprises a plurality of sample media resources, each comprising feature values of the features of the multiple dimensions of that sample media resource. In some embodiments, each sample media resource further comprises: the estimation result of the sample media resource obtained based on the online feedback information estimation model, the calibration result of the sample media resource obtained based on the online calibration model, and the label information of the sample media resource. Schematically, referring to the training system shown in fig. 2, the sample data set is obtained by collecting online data, and the content included in each sample media resource is shown in Table 1, which is not described herein again.

In step 302, the server calibrates the estimation result of the sample data set obtained based on the feedback information estimation model based on the calibration model corresponding to each feature in the first feature set, so as to obtain the calibration result of the sample data set corresponding to each calibration model.

In the embodiment of the disclosure, the feedback information prediction model is used for predicting the feedback behavior of the object on the media resource to obtain the prediction result of the media resource. For the feature of any dimension in the first feature set, the calibration model corresponding to the feature is used for calibrating the estimation result obtained based on the feedback information estimation model to obtain a calibration result.

In step 303, the server determines at least one target feature from the first feature set based on the pre-estimated result of the sample data set and the calibration result of the sample data set corresponding to each calibration model, to obtain a second feature set.

In the embodiment of the present disclosure, the accuracy of the calibration result obtained based on the calibration model corresponding to the target feature is greater than the accuracy of the estimation result. For each calibration model, the server judges, based on the estimation result of the sample data set and the calibration result of the sample data set corresponding to the calibration model, whether the accuracy of the calibration result output by the calibration model is greater than the accuracy of the estimation result output by the feedback information estimation model. If it is, calibrating the estimation result with this calibration model yields a more accurate result, so the feature corresponding to the calibration model is determined as a target feature. In other words, this process may also be understood as using the calibration models to screen out the target features from the first feature set, which provides guidance for feature selection of the feedback information estimation model and thereby improves both the iteration efficiency and the accuracy of the feedback information estimation model.

In step 304, the server trains the feedback information estimation model based on the second feature set and the sample data set, so as to obtain a trained feedback information estimation model.

In the embodiment of the present disclosure, the trained feedback information estimation model is used for predicting the feedback behavior of the object on the media resource during online service. For example, taking the media resource as a picture, applying the trained feedback information estimation model to a picture recommendation system can improve the click rate of the pictures and provide service guidance for the picture recommendation system. For another example, taking the media resource as a video, applying the trained feedback information estimation model to a video recommendation system can increase the click rate of the videos, the viewing duration of users, the number of daily active users (DAU), and the like, and provide service guidance for the video recommendation system. The embodiment of the present disclosure does not limit the application scenario of the feedback information estimation model.

According to the technical scheme provided by the embodiment of the disclosure, after a first feature set and a sample data set are obtained, according to a calibration model corresponding to the feature of each dimension in the first feature set, a pre-estimation result output by a feedback information pre-estimation model is calibrated to obtain a calibration result of the sample data set, based on the calibration result, at least one target feature is determined from the first feature set, and the target features are added into a training process of the feedback information pre-estimation model. In the process, the accuracy of the calibration result obtained based on the calibration model corresponding to the target characteristic is greater than that of the estimation result, which shows that a more accurate result can be obtained after the estimation result is calibrated by the calibration model, so that the iteration efficiency of model training and the accuracy of the model can be effectively improved by adding the characteristic corresponding to the calibration model into the training process of the feedback information estimation model.

Fig. 3 is a basic flowchart of the present disclosure, and the following further explains a scheme provided by the present disclosure based on a specific implementation, and fig. 4 is a flowchart of another feedback information estimation model training method provided by an embodiment of the present disclosure, and as shown in fig. 4, the method is executed by a server and includes the following steps 401 to 407.

In step 401, the server obtains a first feature set and a sample data set.

In an embodiment of the disclosure, the first set of features comprises features of multiple dimensions of a media asset, the sample set of data comprises a plurality of sample media assets, each sample media asset comprising feature values of the features of the multiple dimensions of the sample media asset. Step 401 is the same as step 301, and therefore is not described herein again.

In step 402, the server obtains a calibration model corresponding to each feature based on the first feature set and the sample data set.

In an embodiment of the present disclosure, the sample data set includes training samples and test samples. The training sample is used to determine a calibration model corresponding to each feature, and the test sample is used to determine a second feature set in subsequent steps, for example, the server uses 80% of sample media resources in the sample data set as the training sample, and uses 20% of sample media resources in the sample data set as the test sample, which is not limited in the embodiment of the present disclosure. In this step 402, the server traverses the feature of each dimension in the first feature set, and obtains a calibration model corresponding to each feature based on a feature value of each feature in a training sample of the sample data set.
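Illustratively, the 80%/20% division described above can be sketched as follows. This is a minimal sketch rather than the disclosed implementation: the function name `split_samples`, the random shuffling, and the fixed seed are assumptions added for illustration, since the disclosure does not specify how samples are assigned to the two parts.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(
    samples: Sequence[T], train_ratio: float = 0.8, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Split the sample data set into training samples and test samples.

    Assumption: samples are shuffled before the split; the seed only makes
    the sketch reproducible.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

With ten sample media resources and the default ratio, eight are used as training samples and two as test samples.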

This process is described below by taking the feature of any dimension in the first feature set as an example, and schematically, the feature of this dimension is referred to as the first feature, and includes the following steps 4021 to 4023.

In step 4021, a feature value of a first feature of a media resource is obtained from the sample data set based on the first feature.

Wherein the first feature refers to a feature of any dimension of the media resource in the first feature set. The server obtains all feature values of the first feature from the training samples of the sample data set based on the first feature.

In step 4022, based on the eigenvalue of the first characteristic, bucket division is performed on the sample data set to obtain a plurality of buckets.

And the server constructs a mapping function from the characteristic value to the sub-bucket based on the acquired characteristic value of the first characteristic, and sub-buckets are performed on the sample data set to obtain a plurality of buckets. For example, the range of the eigenvalue of the first characteristic is 0 to 100, the sample media resources corresponding to the eigenvalue within the range of 0 to 10 are placed into the first bucket, the sample media resources corresponding to the eigenvalue within the range of 11 to 20 are placed into the second bucket, and so on.
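Illustratively, the mapping function in the example above can be sketched as follows. The function name `bucket_of` is hypothetical, and the sketch assumes integer feature values in the range 0 to 100 with zero-based bucket indices, so that values 0 to 10 fall into bucket 0 (the first bucket), values 11 to 20 into bucket 1 (the second bucket), and so on.

```python
def bucket_of(value: int) -> int:
    """Map an integer feature value in [0, 100] to a zero-based bucket index."""
    if not 0 <= value <= 100:
        raise ValueError("feature value out of the assumed range 0..100")
    # 0..10 -> bucket 0, 11..20 -> bucket 1, ..., 91..100 -> bucket 9
    return 0 if value <= 10 else (value - 1) // 10
```

Other bucketing schemes, such as equal-frequency buckets, could equally serve as the mapping function; the disclosure only requires a mapping from feature values to buckets.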

In step 4023, a calibration model corresponding to the first feature is determined based on the sample media assets corresponding to each bucket.

For any bucket, the server obtains the estimation result and the label information of the sample media resource in the bucket, divides the sum of the label information corresponding to the bucket by the sum of the estimation result to obtain the calibration coefficient corresponding to the bucket, and determines the calibration model corresponding to the first characteristic based on the calibration coefficient corresponding to each bucket.

Illustratively, denote the set of buckets corresponding to the first feature as G and each bucket as g. The calibration coefficient of each bucket is calculated by the following formula (1).

$$\mathrm{cali\_factor}(g) = \frac{\sum_{d:\ \mathrm{bin}(z(d)) = g} \mathrm{label}(d)}{\sum_{d:\ \mathrm{bin}(z(d)) = g} \mathrm{score}(d)} \tag{1}$$

In formula (1), cali_factor(g) is the calibration coefficient of the bucket g, the numerator is the sum of the label information label of the sample media resources d in the bucket g, the denominator is the sum of the estimation results score of the sample media resources d in the bucket g, bin represents the mapping function that maps a feature value to its bucket, and z represents the first feature.
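Illustratively, the per-bucket calibration coefficient of formula (1) can be computed as in the following minimal sketch. The function name `calibration_factors` and the parallel-list inputs are assumptions for illustration. Skipping buckets whose estimation results sum to zero is also an assumption, since the disclosure does not state how that case is handled.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def calibration_factors(
    buckets: Sequence[Hashable],
    labels: Sequence[int],
    scores: Sequence[float],
) -> Dict[Hashable, float]:
    """Return cali_factor(g) = sum(label) / sum(score) for each bucket g."""
    label_sum: Dict[Hashable, float] = defaultdict(float)
    score_sum: Dict[Hashable, float] = defaultdict(float)
    for g, y, s in zip(buckets, labels, scores):
        label_sum[g] += y
        score_sum[g] += s
    # Assumption: buckets with a zero score sum get no coefficient.
    return {g: label_sum[g] / score_sum[g] for g in score_sum if score_sum[g] > 0}
```

A coefficient greater than 1 means the estimation model underestimates the feedback rate in that bucket, and a coefficient smaller than 1 means it overestimates it.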

Through the step 402, the server traverses the features of each dimension in the first feature set to obtain a calibration model corresponding to each feature, so as to facilitate the subsequent screening of at least one target feature from the first feature set. For example, the first feature set includes features of 10 dimensions, and through the above step 402, the server obtains 10 calibration models, each corresponding to a plurality of calibration coefficients.

In step 403, the server calibrates the estimation result of the sample data set obtained based on the feedback information estimation model based on the calibration model corresponding to each feature in the first feature set, so as to obtain the calibration result of the sample data set corresponding to each calibration model.

In the embodiment of the disclosure, the server inputs the test sample in the sample data set into the feedback information prediction model to obtain the prediction result of the sample data set. And for the calibration model corresponding to any feature in the first feature set, the server calibrates the pre-estimated result of the sample data set based on the calibration model to obtain the calibration result of the sample data set corresponding to the calibration model. In some embodiments, the feedback information prediction model is an initial feedback information prediction model that is not online, and in other embodiments, the feedback information prediction model is an online feedback information prediction model, that is, after the feedback information prediction model is online for a period of time, the server executes the current process according to the acquired sample data set, which is not limited in this disclosure.

In step 404, the server determines at least one target feature from the first feature set based on the pre-estimated result of the sample data set and the calibration result of the sample data set corresponding to each calibration model, to obtain a second feature set.

In the embodiment of the present disclosure, the accuracy of the calibration result obtained based on the calibration model corresponding to the target feature is greater than the accuracy of the estimation result. The server judges the accuracy of the estimation result and of the calibration result of the sample data set based on the label information of the sample data set, so as to determine at least one target feature and obtain a second feature set. In some embodiments, the server measures the accuracy of the estimation result and the calibration result of the sample data set by the area under the receiver operating characteristic (ROC) curve (AUC). It should be understood that, in deep learning, AUC is commonly used as an evaluation criterion of a model: the larger the AUC, the closer the result output by the corresponding model is to the true value, that is, the better the model.

Taking the calibration model corresponding to the feature of any dimension in the first feature set as an example, the process of determining the target feature by the server is described below, and schematically, the calibration model corresponding to the feature of the dimension is referred to as the first calibration model, and includes steps 4041 to 4043 described below.

In step 4041, a first evaluation value of the feedback information prediction model is obtained based on the prediction result of the sample data set and the label information of the sample data set.

Wherein the first evaluation value indicates the accuracy of the feedback information prediction model. Illustratively, the first evaluation value is AUC of the feedback information prediction model, and the server obtains an ROC curve of the feedback information prediction model based on a prediction result of the sample data set and tag information of the sample data set, calculates an area under the ROC curve, and obtains AUC of the feedback information prediction model, that is, the first evaluation value.

In step 4042, a second evaluation value of the first calibration model is obtained based on the calibration result of the sample data set corresponding to the first calibration model and the label information of the sample data set.

Wherein the second evaluation value indicates an accuracy of the first calibration model. Illustratively, the second evaluation value is an AUC of the first calibration model, and the server obtains an ROC curve of the first calibration model based on the calibration result of the sample data set and the tag information of the sample data set, calculates an area under the ROC curve, and obtains the AUC of the first calibration model, that is, the second evaluation value.

In step 4043, in the case where the second evaluation value is larger than the first evaluation value, the feature corresponding to the first calibration model is determined as the target feature.

In the case where the second evaluation value is greater than the first evaluation value, the accuracy of the calibration result output by the first calibration model is greater than the accuracy of the estimation result output by the feedback information estimation model, that is, calibrating the estimation result with the first calibration model yields a more accurate result. The feature corresponding to the first calibration model is therefore determined as a target feature and added into the subsequent training process of the feedback information estimation model, which can effectively improve the iteration efficiency of model training and the accuracy of the model.
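Illustratively, steps 4041 to 4043 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names `auc` and `is_target_feature` are hypothetical, and the AUC is computed by the pairwise formulation (the fraction of positive and negative pairs in which the positive sample has the higher score, with ties counted as one half), which is equal to the area under the ROC curve.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUC; equals the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def is_target_feature(
    scores: Sequence[float], cali_scores: Sequence[float], labels: Sequence[int]
) -> bool:
    """Return True when the calibration model beats the estimation model."""
    first_eval = auc(scores, labels)        # step 4041: first evaluation value
    second_eval = auc(cali_scores, labels)  # step 4042: second evaluation value
    return second_eval > first_eval         # step 4043: comparison
```

The pairwise computation is quadratic in the number of samples and is chosen here only for readability; a sort-based computation would be preferable for large sample data sets.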

In step 405, the server determines a target calibration model based on the calibration model corresponding to each feature in the first set of features.

In the embodiment of the present disclosure, the target calibration model is used to calibrate the estimation result of the media resource obtained based on the online feedback information estimation model. Illustratively, with reference to the training system described above in FIG. 2, the target calibration model can be applied to the online service module of the training system.

In some embodiments, the server obtains a calibration model corresponding to each feature in the first set of features based on a process similar to step 404 described above, and determines a target calibration model from the obtained plurality of calibration models. Illustratively, the server acquires an evaluation value of the calibration model corresponding to each feature based on the sample data set, the feedback information pre-estimation model and the calibration model corresponding to each feature; and determining the calibration model corresponding to the maximum evaluation value in the evaluation values of the calibration model corresponding to each feature as a target calibration model. Since the evaluation value (i.e., AUC) can be used as an evaluation criterion of the model, and the larger the evaluation value is, the closer the result output by the model corresponding to the evaluation value is to the true value, the calibration model corresponding to the largest evaluation value among the plurality of calibration models is determined as the target calibration model, and the result output by the target calibration model is the closest to the true value, so that when the target calibration model is applied to the online service, the accuracy of the calibration result can be improved.
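Illustratively, selecting the calibration model with the maximum evaluation value can be sketched as follows. The function name `select_target_calibration_model` and the dictionary keyed by feature name are assumptions for illustration. When several calibration models share the maximum evaluation value, this sketch keeps the first one encountered, which the disclosure does not specify.

```python
from typing import Mapping


def select_target_calibration_model(evaluations: Mapping[str, float]) -> str:
    """Return the feature whose calibration model has the largest evaluation value."""
    if not evaluations:
        raise ValueError("no calibration models to choose from")
    return max(evaluations, key=lambda feature: evaluations[feature])
```

For instance, with purely illustrative evaluation values `{"a": 0.71, "b": 0.74, "c": 0.69}`, the calibration model corresponding to feature `"b"` is selected.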

In some embodiments, when the target calibration model is applied to the online service, the server determines whether the target calibration model needs to be offline based on sample data in an online time period. Schematically, this process includes the following steps 1 to 4.

Step 1, obtaining sample data in an online time period based on an online feedback information estimation model and a target calibration model. The online time period can be set according to actual requirements, for example, the online time period is 1 day, that is, the server acquires corresponding sample data after the feedback information estimation model and the target calibration model are online for 1 day.

Step 2, obtaining a third evaluation value of the online feedback information estimation model based on the online feedback information estimation model and the sample data in the online time period, wherein the third evaluation value indicates the accuracy of the online feedback information estimation model.

Step 3, obtaining a fourth evaluation value of the target calibration model based on the online feedback information estimation model, the target calibration model and the sample data in the online time period, wherein the fourth evaluation value indicates the accuracy of the target calibration model.

Step 4, taking the target calibration model offline in the case where the third evaluation value is greater than the fourth evaluation value. When the third evaluation value is greater than the fourth evaluation value, the accuracy of the estimation result output by the feedback information estimation model is greater than the accuracy of the calibration result output by the target calibration model, that is, the online service does not need to calibrate the estimation result through the target calibration model, so in this case the target calibration model is taken offline in time.

It should be noted that the specific implementation of the above steps is the same as that of the above steps 401 to 404, and therefore the detailed description thereof is omitted here. In this way, the server can evaluate the necessity of the target calibration model in time according to the sample data of the most recent period, and take the target calibration model offline in time when the online service does not need it, which saves computing resources and improves the operating efficiency of the server. Similarly, according to the sample data of the most recent period, the server can also bring the target calibration model online in time when the online service needs it, which improves the accuracy of the online service.
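Illustratively, the offline decision of steps 1 to 4 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, each record of the online time period is assumed to be a tuple (estimation result, calibration result, label), and both evaluation values are assumed to be AUC values computed by the pairwise formulation.

```python
from typing import Sequence, Tuple

Record = Tuple[float, float, int]  # (score, cali_score, label), assumed layout


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUC; ties between a positive and a negative count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def should_take_offline(records: Sequence[Record]) -> bool:
    """Return True when the target calibration model should be taken offline."""
    scores = [r[0] for r in records]
    cali_scores = [r[1] for r in records]
    labels = [r[2] for r in records]
    third_eval = auc(scores, labels)        # accuracy of the online estimation model
    fourth_eval = auc(cali_scores, labels)  # accuracy of the target calibration model
    return third_eval > fourth_eval
```

The strict comparison mirrors step 4: when the two evaluation values are equal, this sketch keeps the target calibration model online.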

In addition, in the embodiment of the present disclosure, the server sequentially executes the steps 402 to 405 in this order. That is, after the calibration model corresponding to each feature in the first feature set is obtained, the second feature set and the target calibration model are determined.

In some embodiments, the server processes the features of each dimension in the first feature set in sequence, and in each round compares the calibration model corresponding to the current feature with the target calibration model obtained so far, so as to update the target calibration model. For example, suppose the first feature set includes features A and B of 2 dimensions. The server obtains the calibration model corresponding to feature A based on the sample data set, and obtains AUC1 of the calibration model corresponding to feature A and AUC2 of the feedback information estimation model based on the calibration model corresponding to feature A and the feedback information estimation model. When AUC1 is greater than AUC2, the server determines feature A as a target feature and takes the calibration model corresponding to feature A as the target calibration model. Similarly, the server processes feature B to obtain AUC3 of the calibration model corresponding to feature B, determines feature B as a target feature when AUC3 is greater than AUC2, compares AUC3 with AUC1, and takes the calibration model corresponding to feature B as the target calibration model when AUC3 is greater than AUC1.

Through the steps 401 to 405, the server screens out at least one target feature from the first feature set based on the first feature set, the sample data set and the feedback information pre-estimation model, and obtains a target calibration model so as to improve the accuracy of a calibration result during online service. In other words, the process from step 401 to step 405 may also be understood as a training process for the calibration model, that is, the first feature set, the sample data set and the feedback information estimation model are used to both screen out at least one target feature and train to obtain the target calibration model.

The following is a schematic description of the training process of the calibration model, that is, the above steps 401 to 405, based on the following pseudo code.

Train_Calibration_Model(LR_Model, Z, D):  // train the calibration model
    output_features = {}  // output features, that is, the target features
    cali_model = null  // target calibration model
    max_auc = -1  // maximum AUC obtained so far
    // the sample data set D comprises training samples D_train and test samples D_test
    for z in Z:  // for the feature z of each dimension in the first feature set Z
        for d in D_train:  // for each sample media resource in the training samples
            obtain the feature value z(d) of the feature z
        bin(z) -> G  // divide the feature values of z into buckets, construct the mapping function bin, and denote the set of buckets as G
        for g in G:  // for each bucket in the set
            cali_factor(g) = sum(label(d)) / sum(score(d)), over all d with bin(z(d)) = g  // see formula (1)
        for d in D_test:  // for each sample media resource in the test samples
            score(d) = LR_Model(d)  // estimation result
            cali_score(d) = score(d) × cali_factor(bin(z(d)))  // calibration result = estimation result × calibration coefficient
        if auc(Score, Label) < auc(Cali_Score, Label) then:  // if the AUC of the feedback information estimation model < the AUC of the calibration model
            output_features = output_features + {z}  // z is a target feature
        if auc(Cali_Score, Label) > max_auc then:  // if the AUC of the calibration model > the maximum AUC
            cali_model = cali_factor  // the calibration model is given by its calibration coefficients
            max_auc = auc(Cali_Score, Label)  // update the maximum AUC
    return output_features, cali_model  // return the output features and the target calibration model

Illustratively, during this training process, the input information acquired by the server includes a first feature set Z = {z_1, z_2, …, z_m}, where m is a positive integer, a sample data set D = {d_1, d_2, …, d_n}, and a feedback information estimation model LR_Model. The sample data set D comprises training samples D_train and test samples D_test.

For each feature z in the first set of features, the following steps are performed.

In the first step, the training samples D_train are traversed to obtain the feature values of the feature z.

In the second step, the obtained feature values are divided into buckets, and a mapping function bin(z) from feature values to buckets is constructed.

In the third step, the calibration coefficient cali_factor(g) of each bucket g is calculated to obtain the calibration model corresponding to the feature z; see formula (1) for details.

In the fourth step, the estimation result score and the calibration result cali_score of each sample media resource are obtained based on the test samples, the feedback information estimation model and the calibration model corresponding to the feature z. A first evaluation value of the feedback information estimation model and a second evaluation value of the calibration model are then obtained. When the second evaluation value is greater than the first evaluation value, the feature z is beneficial to the feedback information estimation model and is added into the output features output_features, that is, the target features are obtained. Further, the calibration model corresponding to the maximum evaluation value among the plurality of calibration models is determined as the target calibration model, and the target calibration model is output. It should be noted that the training process has been described in detail in steps 401 to 405, and therefore the description thereof is omitted here.
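Illustratively, the four steps above can be sketched as one runnable loop. This is a minimal sketch under stated assumptions, not the disclosed implementation: each sample is assumed to be a dictionary with the hypothetical keys `"features"` and `"label"`, `lr_model` is any callable returning an estimation result, `bin_fns` supplies one mapping function per feature, and buckets that were not seen in the training samples are assumed to use a calibration coefficient of 1.0.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Hashable, List, Optional, Sequence, Tuple

Sample = Dict[str, Any]  # assumed layout: {"features": {...}, "label": 0 or 1, ...}


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUC; ties between a positive and a negative count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def train_calibration_model(
    lr_model: Callable[[Sample], float],
    features: Sequence[str],
    bin_fns: Dict[str, Callable[[Any], Hashable]],
    d_train: Sequence[Sample],
    d_test: Sequence[Sample],
) -> Tuple[List[str], Optional[Tuple[str, Dict[Hashable, float]]]]:
    output_features: List[str] = []
    cali_model = None
    max_auc = -1.0
    test_scores = [lr_model(d) for d in d_test]
    test_labels = [d["label"] for d in d_test]
    base_auc = auc(test_scores, test_labels)  # first evaluation value
    for z in features:
        bin_fn = bin_fns[z]
        label_sum: Dict[Hashable, float] = defaultdict(float)
        score_sum: Dict[Hashable, float] = defaultdict(float)
        for d in d_train:  # steps one to three: bucket and accumulate sums
            g = bin_fn(d["features"][z])
            label_sum[g] += d["label"]
            score_sum[g] += lr_model(d)
        cali_factor = {
            g: label_sum[g] / score_sum[g] for g in score_sum if score_sum[g] > 0
        }
        # Step four: calibrate the test samples (unseen buckets use 1.0, an assumption).
        cali_scores = [
            s * cali_factor.get(bin_fn(d["features"][z]), 1.0)
            for s, d in zip(test_scores, d_test)
        ]
        cali_auc = auc(cali_scores, test_labels)  # second evaluation value
        if base_auc < cali_auc:
            output_features.append(z)  # z is a target feature
        if cali_auc > max_auc:
            cali_model = (z, cali_factor)  # best calibration model so far
            max_auc = cali_auc
    return output_features, cali_model
```

The function returns the target features together with the feature name and calibration coefficients of the calibration model that has the largest AUC, which corresponds to `output_features` and `cali_model` in the pseudo code.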

In step 406, the server adjusts the network structure of the feedback information estimation model based on the second feature set, so as to obtain the adjusted feedback information estimation model.

In the embodiment of the present disclosure, the server adjusts the network structure of the feedback information estimation model by comparing whether the target feature in the second feature set is the same as the feature of the feedback information estimation model, so as to obtain the adjusted feedback information estimation model. In some embodiments, the feedback information prediction model is an initial feedback information prediction model that is not online, and in other embodiments, the feedback information prediction model is an online feedback information prediction model, that is, after the feedback information prediction model is online for a period of time, the server executes the current process according to the obtained second feature set, which is not limited in this disclosure.

Illustratively, step 406 includes the following two cases.

Case one: the target feature in the second feature set does not exist in the feedback information estimation model.

In this case, the server adds the target feature to the feedback information prediction model. Illustratively, taking the first target feature as an example, the first target feature refers to a feature that is not present in the feedback information prediction model. And the server adds a network layer for processing the first target characteristic in the feedback information estimation model based on the first target characteristic to obtain the adjusted feedback information estimation model. For example, a network layer for processing the first target feature is added at an input layer of the feedback information prediction model. By adding the first target characteristics into the feedback information estimation model, the trained feedback information estimation model can output more accurate estimation results.

Case two: the target feature in the second feature set already exists in the feedback information estimation model.

In this case, the feedback information estimation model has not learned the target feature sufficiently, and the server strengthens the learning of the target feature by adjusting the network structure of the feedback information estimation model, so that the trained feedback information estimation model can output more accurate estimation results. Illustratively, taking a second target feature as an example, the second target feature refers to a feature that already exists in the feedback information estimation model. Based on the second target feature, the server adjusts the network layer used for processing the second target feature in the feedback information estimation model to obtain the adjusted feedback information estimation model. For example, the network layer for processing the second target feature is connected to the output layer of the feedback information estimation model by means of a shortcut connection (shortcut). For another example, the network layer for processing the second target feature is moved to a position close to the output layer of the feedback information estimation model. The embodiment of the disclosure does not limit this, and other methods of adjusting the model network structure to strengthen feature learning may also be applied in this process.

In step 407, the server trains the adjusted feedback information prediction model based on the sample data set to obtain the trained feedback information prediction model.

In the embodiment of the disclosure, the server trains the adjusted feedback information estimation model based on all sample media resources in the sample data set. In some embodiments, the server trains the adjusted feedback information estimation model based on a part of the sample media resources in the sample data set. For example, the server selects, as training samples, the sample media resources whose generation time falls within the last 10 days. Training the model on recently generated sample media resources in this way can improve the accuracy of the model.
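As a rough illustration of selecting recent samples, the following Python sketch filters a sample list by generation time. The `generated_at` field name and the `recent_samples` helper are hypothetical; the disclosure only states that samples generated within a recent window (for example, 10 days) may be used.

```python
from datetime import datetime, timedelta


def recent_samples(samples, now, window_days=10):
    """Keep samples whose generation time lies within the last `window_days` days.

    `samples` is assumed to be a list of dicts with a `generated_at` datetime.
    """
    cutoff = now - timedelta(days=window_days)
    return [s for s in samples if s["generated_at"] >= cutoff]
```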

In some embodiments, during training of the feedback information estimation model, the server attempts to train a calibration model based on the feedback information estimation model, following the same process as steps 401 to 405 above, until the accuracy of the estimation result output by the feedback information estimation model is greater than or equal to the accuracy of the calibration result output by the calibration model. This process includes the following steps 4071 and 4072.

In step 4071, the adjusted feedback information estimation model is trained based on the sample data set until an iteration cutoff condition is satisfied, to obtain an intermediate feedback information estimation model.

Illustratively, this process is performed by the server, and taking the ith iteration in the training process as an example (i is a positive integer), the process of the server training to obtain the intermediate feedback information estimation model includes the following steps 4071-1 to 4071-3.

Step 4071-1, inputting the sample media resource into the adjusted feedback information estimation model to obtain the estimation result of the sample media resource.

Step 4071-2, calculating a loss value based on the estimation result of the sample media resource and the label information. And the server constructs a loss function based on the difference between the estimation result and the label information, and calculates and obtains a loss value corresponding to the sample media resource based on the loss function. It should be noted that the manner of constructing the loss function by the server is not limited to the above manner, and the loss function in the embodiment of the present disclosure may be various loss functions commonly used in the training process of the deep learning model, for example, an absolute value loss function, a cosine similarity loss function, a square loss function, a cross entropy loss function, and the like, which is not limited in the embodiment of the present disclosure.

Step 4071-3, in a case where the loss value or the iteration satisfies the iteration cutoff condition, outputting the intermediate feedback information estimation model; in a case where the loss value or the iteration does not satisfy the iteration cutoff condition, adjusting the network parameters of the model and performing the (i+1)-th iteration based on the adjusted feedback information estimation model. The iteration cutoff condition is that the loss value (also called an error value) is smaller than a set threshold, which can be set according to actual requirements. In some embodiments, the iteration cutoff condition is that the number of iterations reaches a target number, or that the training duration reaches a target duration, which is not limited in the embodiments of the present disclosure.
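Steps 4071-1 to 4071-3 can be sketched as a small training loop in Python. This is only an illustration under stated assumptions: the model is a plain logistic regression, the loss is cross entropy (one of the loss functions listed above), and the names `train`, `loss_threshold`, `max_iterations`, and `lr` are hypothetical. The loop stops when the loss falls below the threshold or when the iteration count reaches its target.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(preds, labels, eps=1e-12):
    """Mean binary cross-entropy between estimates and labels."""
    total = sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(preds, labels)
    )
    return -total / len(labels)


def train(features, labels, loss_threshold=0.3, max_iterations=1000, lr=0.5):
    """Full-batch gradient descent with two cutoff conditions (assumed setup)."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(labels)
    for iteration in range(1, max_iterations + 1):
        # Step 4071-1: obtain the estimation result for every sample.
        preds = [
            sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
            for row in features
        ]
        # Step 4071-2: compute the loss from estimates and label information.
        loss = cross_entropy(preds, labels)
        # Step 4071-3: stop if the cutoff condition holds, otherwise update.
        if loss < loss_threshold:
            break
        for j in range(len(weights)):
            grad = sum((p - y) * row[j] for p, y, row in zip(preds, labels, features)) / n
            weights[j] -= lr * grad
        bias -= lr * sum(p - y for p, y in zip(preds, labels)) / n
    return weights, bias, loss, iteration
```

The returned `iteration` makes it possible to tell which cutoff condition ended training: a value below `max_iterations` means the loss threshold was reached first.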

In step 4072, under the condition that the calibration model corresponding to the second feature set is obtained based on the sample data set, the feedback information prediction model and the second feature set, the intermediate feedback information prediction model is trained based on the sample data set until a target condition is met, and the trained feedback information prediction model is obtained.

The target condition is that the accuracy of the estimation result obtained based on the intermediate feedback information estimation model is greater than or equal to the accuracy of the calibration result. The server trains the calibration model based on the intermediate feedback information estimation model, following a process similar to steps 401 to 405 above, until the accuracy of the estimation result output by the intermediate feedback information estimation model is greater than or equal to the accuracy of the calibration result output by the calibration model. During this process, if the accuracy of the estimation result output by the intermediate feedback information estimation model is lower than the accuracy of the calibration result output by the calibration model, the intermediate feedback information estimation model has still not sufficiently learned the target features in the second feature set. The network structure of the model is therefore adjusted and the training process shown in step 4071 is repeated until the target condition is met.

It should be noted that, in the embodiment of the present disclosure, the server performs steps 406 and 407 in sequence, that is, it first adjusts the network structure of the feedback information estimation model based on the second feature set and then trains the model.

In some embodiments, the server traverses the target features in the second feature set and processes them one by one. In each round of processing, the network structure of the feedback information estimation model is adjusted based on the current target feature and the adjusted feedback information estimation model is trained, and so on, until the trained feedback information estimation model is obtained.

Through the above steps 406 and 407, the server trains the feedback information estimation model based on the second feature set. The following pseudo code schematically describes the training process of the feedback information estimation model, that is, the above steps 406 and 407.

Train_LR_Model(W, D, base_model):  // train the feedback information estimation model
    output_model = base_model  // initialize the output model to the base model
    do:
        for w in W:  // for each target feature w in the second feature set W
            if w in output_model then:  // the feature w already exists in the model
                output_model = adjust the network structure and train the new model
            else:
                output_model = add w to the model and train
        f, cali_model = Train_Calibration_Model(base_model, W, D_train)  // train the calibration model
    while cali_model = null  // calibration model is null
    return output_model  // return the output model

Illustratively, during this training process, the input information acquired by the server includes a second feature set $W = \{w_1, w_2, \ldots, w_k\}$, where k is a positive integer, and a sample data set $D = \{d_1, d_2, \ldots, d_n\}$, where n is a positive integer.

The output model is initialized to base_model. Each feature w in the second feature set is examined: if the feature is not yet present in the model, it is added to the model; if the feature is already present in the model, the network structure of the model is adjusted to strengthen the learning of this feature. The intermediate feedback information estimation model obtained by training is then assigned to output_model. The calibration model is trained based on the intermediate feedback information estimation model and the second feature set until the accuracy of the estimation result output by the intermediate feedback information estimation model is greater than or equal to the accuracy of the calibration result output by the calibration model, and the trained feedback information estimation model is output. It should be noted that the training process has been described in detail in steps 406 to 407, and is therefore not repeated here.
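One possible reading of the pseudo code is sketched below in Python. Several points are assumptions rather than statements of the disclosure: the model is represented as a dict with a `features` set; the operations `add_feature`, `strengthen_feature`, `train`, and `train_calibration_model` are passed in as hypothetical callables; `train_calibration_model` is assumed to return `None` once no calibration model is more accurate than the current model (the target condition), which ends the loop; the calibration model is trained against the current intermediate model, following the prose description; and `max_rounds` is an added safety bound.

```python
def train_lr_model(target_features, train_data, base_model,
                   add_feature, strengthen_feature, train,
                   train_calibration_model, max_rounds=10):
    """Sketch of the outer training loop of steps 406 and 407 (assumed interfaces)."""
    output_model = base_model
    for _ in range(max_rounds):
        for w in target_features:
            if w in output_model["features"]:
                # Feature already in the model: adjust the structure to strengthen it.
                output_model = strengthen_feature(output_model, w)
            else:
                # Feature not yet in the model: add it.
                output_model = add_feature(output_model, w)
            output_model = train(output_model, train_data)
        # Try to find a calibration model that still beats the current model.
        cali_model = train_calibration_model(output_model, target_features, train_data)
        if cali_model is None:
            # Assumed target condition: the model is at least as accurate
            # as any calibration result, so training stops.
            break
    return output_model
```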

Through steps 401 to 407, the server screens the first feature set by using the calibration models to obtain a second feature set, and trains the feedback information estimation model based on the second feature set. Illustratively, the feedback information estimation model is $M(x_1, x_2, x_3, x_4, \ldots, x_h)$, and adding a target feature $w$ in the second feature set to the feedback information estimation model yields the adjusted feedback information estimation model $M'(x_1, x_2, x_3, x_4, \ldots, x_h, w)$, where h is a positive integer. The relationship between $M$ and $M'$ is $M' = M \times \mathrm{Cali\_Score}(w)$, that is, $M'$ is equivalent to adding one multiplication layer on top of the model $M$. Since $M'$ uses the feature set $\{x_1, x_2, x_3, x_4, \ldots, x_h, w\}$ and $M'$ is more accurate than $M$, there exists a model trained on the feature set $\{x_1, x_2, x_3, x_4, \ldots, x_h, w\}$ whose accuracy is higher than that of the model $M$. Therefore, training by this method can effectively improve the accuracy of the model.
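The multiplicative relationship between M and M' can be made concrete with a small Python sketch. The way the calibration score is computed here (the ratio of observed positive labels to summed estimates among samples sharing the same value of w) is an assumption chosen for illustration, and the sample fields `w`, `pred`, and `label` and both function names are hypothetical; the disclosure only states that M' equals M multiplied by Cali_Score(w).

```python
from collections import defaultdict


def fit_cali_score(samples):
    """Per-bucket multiplicative factor: sum of labels / sum of estimates (assumed form)."""
    pred_sum = defaultdict(float)
    label_sum = defaultdict(float)
    for s in samples:
        pred_sum[s["w"]] += s["pred"]
        label_sum[s["w"]] += s["label"]
    return {b: label_sum[b] / pred_sum[b] for b in pred_sum if pred_sum[b] > 0}


def calibrated(pred, w, cali_score):
    """M' = M x Cali_Score(w); unseen buckets leave the estimate unchanged."""
    return pred * cali_score.get(w, 1.0)
```

For instance, if the model estimates 0.2 for two samples in one bucket and one of them is a positive sample, the factor for that bucket is 2.5 and the calibrated estimate becomes 0.5, which matches the observed rate.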

In addition, based on the training system shown in fig. 2, the calibration, iteration and upgrade of the feedback information estimation model are realized through the cooperation among the online service module, the data collection module, the offline training module and the model evaluation module. In some embodiments, at preset time intervals, the offline training module trains the calibration model according to the sample data set received by the data collection module and, after obtaining the target features, trains the feedback information estimation model by using the target features. For example, the preset time interval is 30 minutes, which is not limited herein.

In addition, in some embodiments, applying the trained feedback information estimation model and the target calibration model to an online service can effectively improve the accuracy of the model, so that the finally determined target media resources better meet the requirement. This process is briefly described below and includes the following steps A to D.

Step A, the server responds to a resource acquisition request aiming at the media resource to acquire a plurality of candidate media resources.

Step B, the server acquires a feature set corresponding to each candidate media resource based on the resource acquisition request and the candidate media resources.

The feature set corresponding to each candidate media resource comprises features corresponding to the online feedback information pre-estimation model and features corresponding to the target calibration model, and the target calibration model is used for calibrating a pre-estimation result of the media resource obtained based on the online feedback information pre-estimation model.

Step C, the server acquires the calibration result of each candidate media resource based on the feature set corresponding to each candidate media resource, the online feedback information estimation model and the target calibration model.

For any candidate media resource, the process of the server obtaining the calibration result of the candidate media resource includes: inputting the characteristics corresponding to the online feedback information estimation model in the characteristic set corresponding to the candidate media resource into the online feedback information estimation model to obtain the estimation result of the candidate media resource; inputting the estimated result and the characteristics corresponding to the target calibration model in the characteristic set corresponding to the candidate media resource into the target calibration model to obtain the calibration result of the candidate media resource.

Step D, the server determines the target media resources based on the calibration result of each candidate media resource.

The server ranks the candidate media resources by the magnitude of their calibration results, determines the top N media resources (N is a positive integer) as the target media resources, and returns the target media resources as the response to the resource acquisition request.
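Steps A to D can be sketched in Python as follows. The candidate layout (`id`, `model_features`, `calibration_features`) and the `predict` and `calibrate` callables are hypothetical stand-ins for the online feedback information estimation model and the target calibration model; only the order of operations (estimate, calibrate, rank, take the top N) follows the description.

```python
def select_target_resources(candidates, predict, calibrate, top_n):
    """Rank candidates by calibrated estimate and return the top N ids (assumed interfaces)."""
    scored = []
    for c in candidates:
        # Step C, part 1: estimate with the online model's own features.
        pred = predict(c["model_features"])
        # Step C, part 2: calibrate using the estimate and the calibration features.
        cal = calibrate(pred, c["calibration_features"])
        scored.append((cal, c["id"]))
    # Step D: sort by calibration result, largest first, and keep the top N.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [resource_id for _, resource_id in scored[:top_n]]
```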

In summary, according to the technical scheme provided by the embodiment of the present disclosure, after the first feature set and the sample data set are obtained, according to the calibration model corresponding to the feature of each dimension in the first feature set, the estimation result output by the feedback information estimation model is calibrated to obtain the calibration result of the sample data set, based on which, at least one target feature is determined from the first feature set, and the target features are added to the training process of the feedback information estimation model. In the process, the accuracy of the calibration result obtained based on the calibration model corresponding to the target characteristic is greater than that of the estimation result, which shows that a more accurate result can be obtained after the estimation result is calibrated by the calibration model, so that the iteration efficiency of model training and the accuracy of the model can be effectively improved by adding the characteristic corresponding to the calibration model into the training process of the feedback information estimation model.

Fig. 5 is a block diagram of a training apparatus for a feedback information estimation model according to an embodiment of the present disclosure. Referring to fig. 5, the apparatus includes an obtaining unit 501, a calibration unit 502, a determining unit 503, and a training unit 504.

An obtaining unit 501 configured to perform obtaining a first feature set and a sample data set, the first feature set including features of multiple dimensions of a media resource, the sample data set including multiple sample media resources, each sample media resource including feature values of the features of the multiple dimensions of the sample media resource.

A calibration unit 502 configured to perform calibrating, based on a calibration model corresponding to each feature in the first feature set, the estimation result of the sample data set obtained based on the feedback information estimation model, to obtain a calibration result of the sample data set corresponding to each calibration model.

A determining unit 503 configured to perform determining at least one target feature from the first feature set based on the pre-estimation result of the sample data set and the calibration result of the sample data set corresponding to each calibration model, to obtain a second feature set, wherein the accuracy of the calibration result obtained based on the calibration model corresponding to the target feature is greater than the accuracy of the pre-estimation result.

A training unit 504 configured to perform training on the feedback information prediction model based on the second feature set and the sample data set, so as to obtain the trained feedback information prediction model.

In some embodiments, the determining unit 503 is configured to perform:

acquiring a second evaluation value of the first calibration model based on a calibration result of the sample data set corresponding to the first calibration model and label information of the sample data set, wherein the first calibration model is a calibration model corresponding to any dimension of the first feature set, and the second evaluation value indicates the accuracy of the first calibration model;

acquiring sample data in an online time period based on the feedback information estimation model and the target calibration model which are online;

acquiring a fourth evaluation value of the target calibration model based on the feedback information pre-estimation model which is online, the target calibration model and sample data in the online time period, wherein the fourth evaluation value indicates the accuracy of the target calibration model;

in the case where the third evaluation value is larger than the fourth evaluation value, the target calibration model is taken off line.

In some embodiments, the training unit 504 is configured to perform:

based on the second characteristic set, adjusting the network structure of the feedback information estimation model to obtain the adjusted feedback information estimation model;

In some embodiments, the training unit 504 is configured to perform at least one of:

In some embodiments, the training unit 504 is configured to perform:

According to the technical scheme provided by the embodiment of the disclosure, after a first feature set and a sample data set are obtained, according to a calibration model corresponding to the feature of each dimension in the first feature set, a pre-estimation result output by a feedback information pre-estimation model is calibrated to obtain a calibration result of the sample data set, based on the calibration result, at least one target feature is determined from the first feature set, and the target features are added into a training process of the feedback information pre-estimation model. In the process, the accuracy of the calibration result obtained based on the calibration model corresponding to the target characteristic is greater than that of the estimation result, which indicates that a more accurate result can be obtained after the estimation result is calibrated by the calibration model, so that the iteration efficiency of model training and the accuracy of the model can be effectively improved by adding the characteristic corresponding to the calibration model into the training process of the feedback information estimation model.

It should be noted that: in the feedback information estimation model training device provided in the above embodiment, when training a model, only the division of each function module is illustrated, and in practical applications, the function distribution may be completed by different function modules according to needs, that is, the internal structure of the apparatus is divided into different function modules to complete all or part of the above-described functions. In addition, the feedback information estimation model training device and the feedback information estimation model training method provided in the above embodiments belong to the same concept, and the specific implementation process thereof is described in detail in the method embodiments, and will not be described herein again.

Fig. 6 is a block diagram of a server provided by an embodiment of the present disclosure. As shown in fig. 6, the server 600 may vary considerably depending on its configuration or performance, and may include one or more processors (Central Processing Units, CPUs) 601 and one or more memories 602, where at least one piece of program code is stored in the one or more memories 602 and is loaded and executed by the one or more processors 601 to implement the processes executed by the server in the feedback information estimation model training method provided by the above method embodiments. The server 600 may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described herein again.

In an exemplary embodiment, a computer-readable storage medium including program code is also provided, such as the memory 602 including program code, where the program code is executable by the processor 601 of the server 600 to perform the feedback information estimation model training method. Optionally, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact-Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.

It should be noted that information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals referred to in this application are authorized by the user or sufficiently authorized by various parties, and the collection, use, and processing of the relevant data is required to comply with relevant laws and regulations and standards in relevant countries and regions. For example, the sample media assets referred to in this application are all acquired with sufficient authorization.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A feedback information estimation model training method is characterized by comprising the following steps:

acquiring a first feature set and a sample data set, wherein the first feature set comprises features of multiple dimensions of media resources, the sample data set comprises multiple sample media resources, and each sample media resource comprises feature values of the features of the multiple dimensions of the sample media resource;

determining at least one target feature from the first feature set based on the pre-estimation result of the sample data set and the calibration result of the sample data set corresponding to each calibration model to obtain a second feature set, wherein the accuracy of the calibration result obtained based on the calibration model corresponding to the target feature is greater than that of the pre-estimation result;

and training the feedback information prediction model based on the second characteristic set and the sample data set to obtain the trained feedback information prediction model.

2. The method for training the feedback information estimation model according to claim 1, further comprising:

acquiring a feature value of a first feature from the sample data set based on the first feature of the media resource, wherein the first feature refers to a feature of any dimension of the media resource in the first feature set;

determining a calibration model corresponding to the first feature based on the sample media assets corresponding to each bucket.

3. The method according to claim 1, wherein the determining at least one target feature from the first feature set based on the pre-estimation result of the sample data set and the calibration result of the sample data set corresponding to each calibration model to obtain a second feature set comprises:

acquiring a second evaluation value of a first calibration model based on a calibration result of the sample data set corresponding to the first calibration model and label information of the sample data set, wherein the first calibration model is a calibration model corresponding to a feature of any dimension in the first feature set, and the second evaluation value indicates the accuracy of the first calibration model;

4. The method for training the feedback information estimation model according to claim 1, further comprising:

and determining the calibration model corresponding to the maximum evaluation value in the evaluation values of the calibration models corresponding to the characteristics as a target calibration model, wherein the target calibration model is used for calibrating the estimation result of the media resources obtained based on the feedback information estimation model which is on-line.

5. The method for training the feedback information estimation model according to claim 4, further comprising:

acquiring a third evaluation value of the feedback information pre-estimation model which is online based on the feedback information pre-estimation model which is online and sample data in the online time period, wherein the third evaluation value indicates the accuracy of the feedback information pre-estimation model which is online;

and in the case that the third evaluation value is larger than the fourth evaluation value, taking the target calibration model off line.

6. The method for training the feedback information prediction model according to claim 1, wherein the training the feedback information prediction model based on the second feature set and the sample data set to obtain the trained feedback information prediction model comprises:

training the adjusted feedback information prediction model based on the sample data set to obtain the trained feedback information prediction model.

7. The method for training the feedback information estimation model according to claim 6, wherein the adjusting the network structure of the feedback information estimation model based on the second feature set to obtain the adjusted feedback information estimation model includes at least one of:

based on a first target characteristic in the second characteristic set, adding a network layer for processing the first target characteristic in the feedback information estimation model to obtain the adjusted feedback information estimation model, wherein the first target characteristic refers to a characteristic which does not exist in the feedback information estimation model;

8. The method for training the feedback information estimation model according to claim 6, wherein the training the adjusted feedback information estimation model based on the sample data set to obtain the trained feedback information estimation model comprises:

under the condition that a calibration model corresponding to the second feature set is obtained based on the sample data set, the feedback information pre-estimation model and the second feature set, training the intermediate feedback information pre-estimation model based on the sample data set until a target condition is met, and obtaining the trained feedback information pre-estimation model, wherein the target condition is that the accuracy of a pre-estimation result obtained based on the intermediate feedback information pre-estimation model is greater than or equal to the accuracy of the calibration result.

9. A device for training a feedback information prediction model is characterized by comprising:

the calibration unit is configured to execute a calibration model corresponding to each feature in the first feature set, calibrate the estimation result of the sample data set obtained based on the feedback information estimation model, and obtain the calibration result of the sample data set corresponding to each calibration model;

10. The apparatus for training the feedback information estimation model according to claim 9, further comprising a calibration model determining unit configured to perform:

11. The apparatus according to claim 9, wherein the determining unit is configured to perform:

12. The apparatus for training the feedback information estimation model according to claim 9, further comprising a target calibration model determining unit configured to perform:

and determining the calibration model corresponding to the maximum evaluation value in the evaluation values of the calibration models corresponding to each characteristic as a target calibration model, wherein the target calibration model is used for calibrating the pre-estimation result of the media resource obtained based on the feedback information pre-estimation model which is on-line.

13. The apparatus for training the feedback information prediction model according to claim 12, further comprising a target calibration model offline unit configured to perform:

acquiring a third evaluation value of the feedback information pre-estimation model which is on-line based on the feedback information pre-estimation model which is on-line and sample data in the on-line time period, wherein the third evaluation value indicates the accuracy of the feedback information pre-estimation model which is on-line;

14. The feedback information prediction model training apparatus according to claim 9, wherein the training unit is configured to perform:

15. The apparatus according to claim 14, wherein the training unit is configured to perform at least one of:

16. The feedback information prediction model training apparatus according to claim 14, wherein the training unit is configured to perform:

17. An electronic device, characterized in that the electronic device comprises:

one or more processors;

a memory for storing the processor executable program code;

wherein the processor is configured to execute the program code to implement the feedback information prediction model training method according to any one of claims 1 to 8.

18. A computer-readable storage medium, wherein program code in the computer-readable storage medium, when executed by a processor of an electronic device, enables the electronic device to perform the feedback information prediction model training method of any one of claims 1 to 8.