CN112015990A

CN112015990A - Method and device for determining network resources to be recommended, computer equipment and medium

Info

Publication number: CN112015990A
Application number: CN202010899720.1A
Authority: CN
Inventors: 张永池
Original assignee: Guangzhou Baiguoyuan Information Technology Co Ltd
Current assignee: Guangzhou Baiguoyuan Information Technology Co Ltd
Priority date: 2020-08-31
Filing date: 2020-08-31
Publication date: 2020-12-01

Abstract

The embodiment of the invention discloses a method, a device, computer equipment and a medium for determining network resources to be recommended. The method comprises: obtaining recommendation probabilities of network resources determined by at least two recommendation models, wherein each recommendation model is generated by learning a single behavior index sample set; and determining, based on the recommendation probabilities, the network resources to be recommended from the network resources by adopting a comprehensive model, wherein the comprehensive model is determined based on each recommendation model and its corresponding target coefficient, the target coefficients are the group of reference coefficients, among at least two randomly generated groups of reference coefficients, that makes the recommendation effect value of the comprehensive model meet the expectation, and the recommendation effect value is an index for evaluating the recommendation effect of the comprehensive model. This avoids the problem in the related art that manually set model comprehensive parameters affect the model recommendation effect, and optimizes the model recommendation effect.

Description

Method and device for determining network resources to be recommended, computer equipment and medium

Technical Field

The embodiment of the invention relates to the field of computers, in particular to a method and a device for determining network resources to be recommended, computer equipment and a medium.

Background

In a recommendation scenario, in order to make a recommendation result closer to an actual preference of a user, network resource recommendation is generally performed based on a plurality of different behavior indexes, for example, the behavior indexes may be a click rate, a like rate, a share rate, an attention rate, a comment rate, and the like.

Network resource recommendation based on a plurality of different behavior indexes is usually performed by adopting a comprehensive model. The comprehensive model can be constructed as follows: a separate recommendation model is constructed for each behavior index, each recommendation model independently learns the sample set of its behavior index so that a plurality of recommendation models are obtained, and the recommendation models are then combined using manually specified model comprehensive parameters to obtain the comprehensive model. However, the model comprehensive parameters of a comprehensive model obtained in this manner are limited to manual selection, so the manually set model comprehensive parameters affect the model recommendation effect.

Disclosure of Invention

The embodiment of the invention provides a method, a device, computer equipment and a medium for determining network resources to be recommended, and solves the problem that the model recommendation effect is influenced by manually setting model comprehensive parameters in the related art.

In a first aspect, an embodiment of the present invention provides a method for determining a network resource to be recommended, including:

obtaining recommendation probabilities of network resources determined by at least two recommendation models, wherein each recommendation model is generated based on learning of a single behavior index sample set, and samples in each single behavior index sample set correspond to the same user behavior;

and determining, based on the recommendation probabilities, the network resources to be recommended from the network resources by adopting a comprehensive model, wherein the comprehensive model is determined based on each recommendation model and its corresponding target coefficient, the target coefficients are the group of reference coefficients, among at least two randomly generated groups of reference coefficients, that makes the recommendation effect value of the comprehensive model meet the expectation, and the recommendation effect value is an index for evaluating the recommendation effect of the comprehensive model.

In a second aspect, an embodiment of the present invention provides an apparatus for determining a network resource to be recommended, where the apparatus includes:

the recommendation probability obtaining module is used for obtaining recommendation probabilities of the network resources determined by at least two recommendation models, wherein each recommendation model is generated based on learning of a single behavior index sample set, and samples in each single behavior index sample set correspond to the same user behavior;

and the recommendation resource determining module is used for determining, based on the recommendation probabilities, the network resources to be recommended from the network resources by adopting a comprehensive model, wherein the comprehensive model is determined based on each recommendation model and its corresponding target coefficient, the target coefficients are the group of reference coefficients, among at least two randomly generated groups of reference coefficients, that makes the recommendation effect value of the comprehensive model meet the expectation, and the recommendation effect value is an index for evaluating the recommendation effect of the comprehensive model.

In a third aspect, an embodiment of the present invention provides a computer device, where the computer device includes:

one or more processors;

a memory for storing one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors implement the method for determining the network resource to be recommended according to any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the method for determining a network resource to be recommended according to any embodiment of the present invention is implemented.

The embodiment of the invention provides a method, a device, computer equipment and a medium for determining network resources to be recommended. According to the embodiment of the invention, the recommendation effect of the comprehensive model is evaluated on offline sample data, and the group of reference coefficients that makes the comprehensive model meet the expectation is selected from multiple groups of reference coefficients according to the recommendation effect and taken as the target coefficients. This avoids the problem in the related art that manually set model comprehensive parameters affect the model recommendation effect, optimizes the model recommendation effect, and improves user stickiness.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:

fig. 1 is a flowchart of a method for determining a network resource to be recommended according to an embodiment of the present invention;

fig. 2 is a flowchart of another method for determining a network resource to be recommended according to an embodiment of the present invention;

fig. 3 is a flowchart of a method for determining a network resource to be recommended according to an embodiment of the present invention;

fig. 4 is a schematic view of a determination process of a comprehensive model in the method for determining network resources to be recommended according to the embodiment of the present invention;

fig. 5 is a schematic diagram of an online adjustment process of a comprehensive model in the method for determining network resources to be recommended according to the embodiment of the present invention;

fig. 6 is a block diagram of a device for determining network resources to be recommended according to an embodiment of the present invention;

fig. 7 is a block diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures. In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.

Fig. 1 is a flowchart of a method for determining a network resource to be recommended according to an embodiment of the present invention. The method may be performed by a determining device of the network resource to be recommended, which may be implemented by software and/or hardware and is typically configured on the server side. As shown in fig. 1, the method includes:

Step 110: acquire the recommendation probabilities of the network resources determined by at least two recommendation models.

The recommendation models are machine learning models that perform network resource recommendation on the server side, and each recommendation model is generated by learning a single behavior index sample set. The samples in each single behavior index sample set correspond to the same user behavior. It should be noted that a network resource is resource data stored by the server and consumed by users. For example, the network resource may be a short video, a long video, a live video, a text article, a post, or similar data. A behavior index represents the proportion of the recommended network resources on which users performed a given operation within a set period of time. For example, the behavior indexes include click rate, like rate, share rate, attention rate, comment rate and the like. A single behavior index sample set is a set of samples of a single user behavior used to train a recommendation model, and may be generated by labeling samples. For example, labeling the samples that involve a click operation yields a click sample set, and a label in the click sample set may take the form "user A clicked video x". In a similar manner, a like sample set, a share sample set, an attention sample set and a comment sample set may be generated. For example, a label in the like sample set may take the form "user A liked video a", a label in the comment sample set may take the form "user B commented on video b", and so on. The click sample set, the like sample set, the share sample set, the attention sample set and the comment sample set can each be used as a single behavior index sample set.
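The labeling described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the exposure-log layout, the `build_sample_sets` helper and the 0/1 label encoding are assumptions introduced here. Each exposure (a resource shown to a user) becomes one sample in every single behavior index sample set, labeled by whether that behavior occurred.

```python
from collections import defaultdict

# Hypothetical exposure log: (user, resource, behaviors the user performed).
# An empty set means the resource was recommended but not operated on.
EXPOSURES = [
    ("user_A", "video_x", {"click", "like"}),
    ("user_A", "video_y", set()),
    ("user_B", "video_x", {"click", "comment"}),
    ("user_B", "video_z", {"share"}),
]

BEHAVIORS = ("click", "like", "share", "attention", "comment")


def build_sample_sets(exposures, behaviors=BEHAVIORS):
    """Return {behavior: [(user, resource, label), ...]} with label 1 or 0.

    Assumption: every exposure contributes one sample per behavior, labeled 1
    if the user performed that behavior on the resource and 0 otherwise.
    """
    sample_sets = defaultdict(list)
    for user, resource, performed in exposures:
        for behavior in behaviors:
            label = 1 if behavior in performed else 0
            sample_sets[behavior].append((user, resource, label))
    return dict(sample_sets)


sample_sets = build_sample_sets(EXPOSURES)
click_positives = [s for s in sample_sets["click"] if s[2] == 1]
print(click_positives)
```

Each value in `sample_sets` can then serve as the training data for one recommendation model, for example `sample_sets["click"]` for a click recommendation model.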

When the recommendation models are trained, one recommendation model is constructed for each single behavior index sample set. For example, when the recommendation model corresponding to the click behavior is trained, a model is constructed to learn the click sample set, yielding a click recommendation model. Likewise, when the recommendation model corresponding to the like behavior is trained, a model is constructed to learn the like sample set, yielding a like recommendation model; when the recommendation model corresponding to the sharing behavior is trained, a model is constructed to learn the share sample set, yielding a share recommendation model; when the recommendation model corresponding to the attention behavior is trained, a model is constructed to learn the attention sample set, yielding an attention recommendation model; and when the recommendation model corresponding to the comment behavior is trained, a model is constructed to learn the comment sample set, yielding a comment recommendation model.

It should be noted that the recommendation probability represents the probability, predicted by a recommendation model, that a network resource is operated on by the user. For example, the click recommendation model predicts the probability that the network resource is clicked by the user, the like recommendation model predicts the probability that it is liked, the share recommendation model predicts the probability that it is shared, the attention recommendation model predicts the probability that it is followed, and the comment recommendation model predicts the probability that it is commented on.

Illustratively, recommendation models constructed by different behavior indexes are selected based on requirements of a recommendation scene, and recommendation probabilities of network resources are predicted through the selected recommendation models respectively. For example, a click recommendation model, a like recommendation model, a share recommendation model, an attention recommendation model and a comment recommendation model are selected based on recommendation scene requirements, and the probability of a network resource being clicked, the probability of being liked, the probability of being shared, the probability of being attended and the probability of being commented are predicted through the models respectively. It should be noted that, in an actual recommendation scenario, more recommendation models or fewer recommendation models may be selected, and are not limited to the examples listed above.

Step 120: determine, based on the recommendation probabilities, the network resources to be recommended from the network resources by adopting a comprehensive model.

It should be noted that the comprehensive model is used for combining and ranking the output results of the recommendation models. Specifically, the comprehensive model computes a weighted sum of the recommendation probabilities that the recommendation models output for the same network resource to obtain the total recommendation probability of that network resource, then ranks the network resources by total recommendation probability, and determines and outputs the network resources to be recommended based on the ranking result.
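The weighted summation and ranking can be sketched as below. The function name `rank_resources`, the dictionary layout and the `top_n` cut-off are illustrative assumptions; the source only states that resources are ranked by total recommendation probability and that the output is chosen from that ranking.

```python
def rank_resources(probabilities, coefficients, top_n):
    """Combine per-model probabilities and return the top-ranked resources.

    probabilities: {model_name: {resource_id: predicted probability}}
    coefficients:  {model_name: weight of that model in the comprehensive model}
    """
    totals = {}
    for model_name, per_resource in probabilities.items():
        weight = coefficients[model_name]
        for resource_id, p in per_resource.items():
            # Weighted sum over models for the same resource.
            totals[resource_id] = totals.get(resource_id, 0.0) + weight * p
    ranked = sorted(totals, key=lambda r: totals[r], reverse=True)
    return ranked[:top_n], totals


# Hypothetical model outputs for three resources.
probabilities = {
    "click": {"v1": 0.9, "v2": 0.4, "v3": 0.6},
    "like": {"v1": 0.1, "v2": 0.8, "v3": 0.5},
}
coefficients = {"click": 0.5, "like": 0.5}

top, totals = rank_resources(probabilities, coefficients, top_n=2)
print(top)
```

With these made-up numbers, `v2` ranks first even though `v1` has the highest click probability, which is the kind of trade-off the coefficients control.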

In this embodiment, the comprehensive model is determined based on each recommendation model and its corresponding target coefficient. The target coefficients are the group of reference coefficients, among at least two randomly generated groups of reference coefficients, that makes the recommendation effect value of the comprehensive model meet the expectation, and the recommendation effect value is an index for evaluating the recommendation effect of the comprehensive model. Optionally, the recommendation effect value of the comprehensive model may be determined based on the ranking result, determined by the comprehensive model, of the samples in each single behavior index sample set, together with the number of positive samples and the number of exposure samples of that single behavior index sample set.

Specifically, at least two groups of reference coefficients are randomly generated, each group containing as many coefficients as there are model comprehensive coefficients in a preset comprehensive model expression. Each group of reference coefficients is then substituted into the comprehensive model expression to obtain at least two comprehensive models. For example, the comprehensive model expression is f(m₁, m₂, m₃, ...) = x₁*m₁ + x₂*m₂ + x₃*m₃ + ..., where m₁, m₂, m₃, ... denote the recommendation models, each obtained by learning a single behavior index sample set, and x₁, x₂, x₃, ... denote the comprehensive model coefficients that adjust the model output results; at this point the values of the comprehensive model coefficients are not yet determined. At least two groups of reference coefficients are generated according to the number of comprehensive model coefficients. Substituting each group of reference coefficients into f(m₁, m₂, m₃, ...) yields at least two comprehensive models whose coefficient values are now determined, namely the randomly generated reference coefficients.

Specifically, one of the at least two comprehensive models is selected as the current comprehensive model. The current comprehensive model is used to predict the sample recommendation probability of each sample in each single behavior index sample set. The samples in the corresponding single behavior index sample set are ranked by sample recommendation probability, and the recommendation effect value is calculated from the ranking result of the positive samples, the number of positive samples of the single behavior index sample set and the number of exposure samples. Optionally, the recommendation effect value may be the AUC (Area Under ROC Curve) index of the model. The physical meaning of the AUC is the probability that, for a randomly selected pair of one positive example and one negative example, the score of the positive example is greater than the score of the negative example; the higher the AUC, the better the recommendation effect. AUC can be calculated using the following formula:

$$\mathrm{AUC} = \frac{\sum_{i \in P} \mathrm{rank}_i - \frac{|P|\,(|P|+1)}{2}}{|P| \cdot |N|}$$

where rank_i represents the rank of positive sample i in the ranking result (samples ranked in ascending order of sample recommendation probability, as this form of the formula requires), P represents the set of positive samples, |P| is the number of positive samples, and |N| is the number of exposure samples.

It should be noted that the number of positive samples refers to the number of samples recommended to and operated by the user. The number of exposure samples refers to the number of samples recommended to the user but not operated by the user. For example, for the case where the single behavior index sample set is a click sample set, the number of positive samples is the number of samples recommended to and clicked on by the user. The number of exposed samples is the number of samples recommended to the user but not clicked on by the user. Similarly, for the case where the single behavior index sample set is a shared sample set, the positive sample number is the number of samples recommended to and shared by the user. The number of exposed samples is the number of samples recommended to the user but not shared by the user.
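A minimal sketch of a rank-based AUC computation using the quantities named above (the ranks of the positive samples, |P| and |N|). It assumes the standard rank-sum form, AUC = (sum of positive ranks - |P|(|P|+1)/2) / (|P|*|N|), with ranks assigned in ascending order of score. The function name `rank_auc` and the average-rank handling of tied scores are illustrative choices, not taken from the source.

```python
def rank_auc(scores, labels):
    """AUC from the rank sum of positive samples.

    scores: predicted sample recommendation probabilities
    labels: 1 for positive samples, 0 for exposure samples (shown, not operated)
    Ranks are 1-based in ascending score order; tied scores share their
    average rank (an assumption, the source does not discuss ties).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one exposure sample")

    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of samples tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


print(rank_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

In the toy call, three of the four positive/exposure pairs are ordered correctly, so the result equals the pairwise-probability interpretation of AUC.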

After the recommendation effect values of a comprehensive model on the individual single behavior index sample sets have been calculated, the weighted sum of these values is calculated according to the preset weight of each behavior index and is used as the recommendation effect value of that comprehensive model. The comprehensive model with the maximum recommendation effect value is taken as the final online model, and the reference coefficients it contains are the group of reference coefficients that makes the recommendation effect value meet the expectation. For example, for the click behavior, assume that its weight is b₁; then the recommendation effect value auc₁ corresponding to single behavior index sample set 1 has weight b₁. For the like behavior, assume that its weight is b₂; then the recommendation effect value auc₂ corresponding to single behavior index sample set 2 has weight b₂. For the sharing behavior, assume that its weight is b₃; then the recommendation effect value auc₃ corresponding to single behavior index sample set 3 has weight b₃. Similarly, if there are k behaviors, the recommendation effect value auc_k corresponding to single behavior index sample set k has weight b_k. The recommendation effect value of the comprehensive model is then auc_total = b₁*auc₁ + b₂*auc₂ + ... + b_k*auc_k. The recommendation effect values of the comprehensive models are compared, and the reference coefficients of the comprehensive model with the maximum recommendation effect value are taken as the target coefficients.
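The selection step can be sketched as follows. The behavior weights and per-behavior AUC values below are made-up illustrative numbers, not measurements, and `total_effect` is a hypothetical helper name.

```python
def total_effect(per_behavior_auc, behavior_weights):
    """auc_total = b1*auc1 + b2*auc2 + ... + bk*auck."""
    return sum(behavior_weights[b] * auc for b, auc in per_behavior_auc.items())


# Preset weights b_i of the behavior indexes (illustrative values).
behavior_weights = {"click": 0.5, "like": 0.3, "share": 0.2}

# Per-behavior AUC of two candidate comprehensive models (illustrative values).
candidates = {
    "f1": {"click": 0.70, "like": 0.60, "share": 0.55},
    "f2": {"click": 0.68, "like": 0.66, "share": 0.60},
}

effect_values = {
    name: total_effect(aucs, behavior_weights) for name, aucs in candidates.items()
}
# The candidate with the largest auc_total supplies the target coefficients.
best = max(effect_values, key=effect_values.get)
print(best, effect_values[best])
```

Here `f2` is selected even though `f1` has the higher click AUC, because the weighted sum also rewards its like and share AUC.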

In this embodiment, the recommendation probabilities of the network resources determined by at least two recommendation models are obtained, and the network resources to be recommended are determined from the network resources based on the recommendation probabilities by adopting the comprehensive model. The comprehensive model is the one, among the comprehensive models obtained by combining at least two randomly generated groups of reference coefficients with the recommendation models, whose recommendation effect value evaluated on offline sample data meets the expectation. According to the embodiment of the invention, the recommendation effect of the comprehensive model is evaluated on offline sample data, and the group of reference coefficients that makes the comprehensive model meet the expectation is selected from multiple groups of reference coefficients according to the recommendation effect and taken as the target coefficients. This avoids the problem in the related art that manually set model comprehensive parameters affect the model recommendation effect, optimizes the model recommendation effect, and improves user stickiness.

Fig. 2 is a flowchart of another method for determining a network resource to be recommended according to an embodiment of the present invention. On the basis of the above embodiment, the present embodiment further defines the step of determining the comprehensive model whose recommendation effect value meets the expectation. As shown in fig. 2, the method comprises the following steps:

Step 210: randomly generate at least two groups of reference coefficients, each group containing as many coefficients as there are model comprehensive coefficients in the preset comprehensive model expression.

It should be noted that the integrated model expression may be set manually and embedded in executable computer instructions, and indicates how the recommendation models are combined. For example, the integrated model expression may be a linear weighting function, in which case the model comprehensive coefficients are the weighting coefficients of the recommendation models. The number of recommendation models in the integrated model expression is equal to the number of model comprehensive coefficients. The embodiment of the present invention does not specifically limit the form of the integrated model expression.

It should be noted that the number of sets of randomly generated reference coefficients is not particularly limited in the embodiment of the present invention. For example, assume that 4 sets of reference coefficients are generated, that is, the values of the integrated model coefficients (x₁, x₂, x₃, ..., x_k) are respectively (a₁, a₂, a₃, ..., a_k), (c₁, c₂, c₃, ..., c_k), (d₁, d₂, d₃, ..., d_k) and (e₁, e₂, e₃, ..., e_k).

In the embodiment of the invention, at least two groups of reference coefficients are randomly generated according to the number of model comprehensive coefficients, so that the recommendation effect values of the resulting comprehensive models can subsequently be compared and the group of reference coefficients that makes the recommendation effect value meet the expectation can be determined from the comparison result as the target coefficients, thereby optimizing the recommendation effect.

Step 220: substitute each group of reference coefficients into the comprehensive model expression to obtain at least two comprehensive models.

Exemplarily, in the case where the reference coefficients are (a₁, a₂, a₃, ..., a_k), (c₁, c₂, c₃, ..., c_k), (d₁, d₂, d₃, ..., d_k) and (e₁, e₂, e₃, ..., e_k), each set of reference coefficients is substituted into the integrated model expression, resulting in the following 4 integrated models.

f₁(m₁, m₂, m₃, ..., m_k) = a₁*m₁ + a₂*m₂ + a₃*m₃ + ... + a_k*m_k;

f₂(m₁, m₂, m₃, ..., m_k) = c₁*m₁ + c₂*m₂ + c₃*m₃ + ... + c_k*m_k;

f₃(m₁, m₂, m₃, ..., m_k) = d₁*m₁ + d₂*m₂ + d₃*m₃ + ... + d_k*m_k;

f₄(m₁, m₂, m₃, ..., m_k) = e₁*m₁ + e₂*m₂ + e₃*m₃ + ... + e_k*m_k.
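Steps 210 and 220 can be sketched as below. The source does not state how the random coefficients are drawn, so sampling uniformly from [0, 1] is an assumption made here, as are the helper names `generate_reference_groups` and `make_integrated_model` and the fixed seed used for reproducibility.

```python
import random


def generate_reference_groups(num_coefficients, num_groups, seed=None):
    """Randomly generate num_groups coefficient groups of length num_coefficients.

    Assumption: coefficients are drawn uniformly from [0, 1].
    """
    rng = random.Random(seed)
    return [
        [rng.uniform(0.0, 1.0) for _ in range(num_coefficients)]
        for _ in range(num_groups)
    ]


def make_integrated_model(coefficients):
    """Return f(m1, ..., mk) = x1*m1 + ... + xk*mk for fixed coefficients x."""
    def integrated(model_outputs):
        if len(model_outputs) != len(coefficients):
            raise ValueError("one model output is required per coefficient")
        return sum(x * m for x, m in zip(coefficients, model_outputs))
    return integrated


# Four candidate models over k = 3 recommendation models, like f1..f4 above.
groups = generate_reference_groups(num_coefficients=3, num_groups=4, seed=0)
models = [make_integrated_model(g) for g in groups]

# Score one resource whose click/like/share probabilities are 0.9, 0.1, 0.3.
scores = [f([0.9, 0.1, 0.3]) for f in models]
print(scores)
```

Each entry of `models` corresponds to one candidate comprehensive model whose recommendation effect value can then be evaluated offline.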

Step 230: for each comprehensive model, update the corresponding reference coefficients one by one by a preset value, determine the recommendation effect values of the corresponding comprehensive model before and after each update operation, and take the reference coefficients that maximize the recommendation effect value during the update process as the target coefficients.

Specifically, one of at least two integrated models is arbitrarily selected as the current integrated model. It should be noted that, since each of the at least two integrated models is selected as the current integrated model, it is not important to select which integrated model is selected as the current integrated model first.

One of the two directions, increasing the value or decreasing the value, is selected arbitrarily, and the reference coefficients of the current comprehensive model are updated one by one by a preset value in that direction. Each time a reference coefficient is updated, a new comprehensive model is obtained. Optionally, the updates of the different comprehensive models may be performed in series or in parallel. For example, in the serial case, after the reference coefficients of the first selected comprehensive model have been updated, a similar update operation is performed on the second selected comprehensive model, and so on until the reference coefficients of each of the at least two comprehensive models have been updated. Alternatively, the one-by-one update of the reference coefficients is executed for all comprehensive models in parallel.

It should be noted that, in the embodiment of the present invention, specific values of the preset values are not specifically limited. For example, the same preset value is used when the same reference coefficient is updated, and the same preset value or different preset values may be used when different reference coefficients in the same integrated model are updated. Under the condition of updating the reference coefficients in different comprehensive models, the same preset numerical value can be adopted, and different preset numerical values can also be adopted.

In the embodiment of the present invention, arbitrarily selecting one of the two directions and updating the reference coefficients of the current comprehensive model one by one by the preset value can be understood as follows: first, any one reference coefficient of the current comprehensive model is increased or decreased by the preset value while the remaining reference coefficients are kept unchanged, and then the recommendation effect values of the comprehensive model before and after the update operation are determined. If the update operation decreases the recommendation effect value of the comprehensive model, the reference coefficient is instead updated by the preset value in the direction opposite to the currently selected direction. Only one reference coefficient is updated at a time so that, while searching for the target coefficient that makes the recommendation effect value locally optimal, changes in the other reference coefficients do not influence the recommendation effect value.

In the embodiment of the invention, the recommendation effect value of a comprehensive model can be determined by the following steps: determining the sample recommendation probabilities of each single behavior index sample set by adopting the comprehensive model, and determining the ranking result of the samples in the corresponding single behavior index sample set according to the sample recommendation probabilities; and calculating the recommendation effect value of the comprehensive model according to the ranking result and the number of positive samples and the number of exposure samples of the single behavior index sample set.

Specifically, one of the single behavior index sample sets is arbitrarily selected as the current sample set, and the current comprehensive model is used to score each sample in the current sample set, which determines the recommendation probability of each sample. The label of the current sample set identifies which single behavior index sample set it is; the labels may include click, like, share, attention, comment, and the like. The ranking result of the samples in the current sample set is determined according to the recommendation probability of each sample, and the sum of the ranks of the positive samples is calculated. The sum of the ranks of the positive samples, the number of positive samples and the number of exposure samples are substituted into the AUC formula to calculate the auc of the current sample set. Each of the remaining single behavior index sample sets is then taken in turn as the current sample set, and its auc is calculated in the same manner. Finally, the recommendation effect value auc_total of the current comprehensive model is calculated based on the auc values of all the single behavior index sample sets.

For example, for the reference coefficients (a₁, a₂, a₃, ..., a_k) of f₁(m₁, m₂, m₃, ...), before the update operation f₁(m₁, m₂, m₃, ...) is used to predict the sample recommendation probabilities of each single behavior index sample set, and the recommendation effect value auc of f₁(m₁, m₂, m₃, ...) for each behavior index is calculated from the sample recommendation probabilities of each single behavior index sample set, the number of positive samples and the number of exposure samples of that single behavior index sample set, and the AUC calculation formula. Then, the recommendation effect value auc_total of the integrated model is calculated based on the preset weight values of the respective recommendation models and the recommendation effect values auc of the respective behavior indexes.

When the current updating operation is the first updating operation, the recommended effect value of the comprehensive model before the first updating operation is used as the recommended effect value of the corresponding comprehensive model before the current updating operation; under the condition that the updating operation is not the first updating operation, taking the recommended effect value of the comprehensive model after the last updating operation as the recommended effect value of the corresponding comprehensive model before the updating operation; and determining the recommended effect value of the corresponding comprehensive model after each updating operation. And comparing the recommended effect values of the comprehensive model before and after the updating operation to determine whether to end the updating process for the current reference coefficient in the current comprehensive model.

In the embodiment of the invention, under the condition that the recommendation effect value is increased by the updating operation, the reference coefficient corresponding to the updating operation is continuously updated according to the preset value; and under the condition that the updating operation reduces the recommendation effect value, taking the numerical value of the reference coefficient before the current updating operation as the target coefficient of the corresponding recommendation model.

For example, for any comprehensive model, after the value of x₁ is updated, if the recommendation effect value of the comprehensive model is greater than the value before the update, x₁ continues to be updated by the preset value until, after an update operation on x₁, the recommendation effect value of the comprehensive model is smaller than the value before that update operation; the value of the reference coefficient before that update operation is then taken as the target coefficient corresponding to x₁. In other words, when a value of x₁ makes the recommendation effect value of the comprehensive model reach its maximum over the update process for x₁, that value is determined to make the recommendation effect of the comprehensive model locally optimal and is taken as the target coefficient of x₁. The reference coefficient corresponding to recommendation model m₁ in the comprehensive model f₁(m₁,m₂,m₃...) is updated according to the target coefficient of x₁.

After the target coefficient of x₁ that makes the recommendation effect of the comprehensive model locally optimal is determined, a reference coefficient, for example x₃, is randomly selected from x₂, x₃, ……, x_k. After an update operation on x₃, if the recommendation effect value of the comprehensive model is greater than the value before the update operation, x₃ continues to be updated by the preset value until, after an update operation on x₃, the recommendation effect value of the comprehensive model is smaller than the value before that update operation; the value of the reference coefficient before that update operation is then taken as the target coefficient corresponding to x₃. In other words, when a value of x₃ makes the recommendation effect value of the comprehensive model reach its maximum over the update process for x₃, that value is determined to make the recommendation effect of the comprehensive model locally optimal and is taken as the target coefficient of x₃. The target coefficients corresponding to the remaining reference coefficients are determined in a similar manner. The reference coefficients of the respective recommendation models in the comprehensive model f₁(m₁,m₂,m₃...) are updated according to the target coefficients.

In the embodiment of the present invention, when the current updating operation is the first updating, if the current updating operation decreases the recommended effect value, the reference coefficient corresponding to the current updating operation is updated in a direction opposite to the current direction.

Specifically, for the first update, there is no particular limitation on whether the reference coefficient is first updated in the direction of increasing value or in the direction of decreasing value. For a given reference coefficient of the current comprehensive model, if updating it first in the increasing direction reduces the recommendation effect value of the comprehensive model, the reference coefficient is then updated in the decreasing direction until the recommendation effect value reaches a local optimum. Conversely, if updating it first in the decreasing direction reduces the recommendation effect value of the comprehensive model, the reference coefficient is then updated in the increasing direction until the recommendation effect value reaches a local optimum.

For example, in the first update operation, x₁ is updated in the direction of increasing value to obtain a new comprehensive model. The recommendation effect value of the new comprehensive model after x₁ is increased, denoted auc_total', is calculated by a similar procedure. auc_total' is then compared with auc_total. If auc_total' is less than auc_total, it is determined that the update operation reduced the recommendation effect value of the comprehensive model, and the reference coefficient corresponding to the update operation is adjusted in the direction opposite to that update operation. If auc_total' is greater than auc_total, it is determined that the update operation improved the recommendation effect of the comprehensive model, and the reference coefficient corresponding to the update operation is adjusted in the same direction as that update operation. For the other reference coefficients in (x₁, x₂, x₃, ……, x_k), the adjustment direction is determined in a manner similar to that for x₁, and details are not repeated here.
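The update of a single reference coefficient, including the reversal of direction after an unsuccessful first update, might look like the sketch below. It is illustrative only: `refine_coefficient` and the `score` callback (standing in for the calculation of auc_total) are hypothetical names, and the loop stops when the score fails to increase rather than strictly decreases, an assumption made here so that a flat region cannot cause an endless loop.

```python
from typing import Callable, List


def refine_coefficient(
    coeffs: List[float],
    index: int,
    score: Callable[[List[float]], float],
    step: float,
) -> List[float]:
    """Move coeffs[index] by `step` at a time until the score stops increasing."""
    best = list(coeffs)  # work on a copy; the caller's list is left untouched
    best_score = score(best)
    for direction in (+1.0, -1.0):  # try the increasing direction first, then the opposite
        improved = False
        while True:
            trial = list(best)
            trial[index] += direction * step
            trial_score = score(trial)
            if trial_score > best_score:
                best, best_score, improved = trial, trial_score, True
            else:
                break  # keep the value from before this update as the target coefficient
        if improved:
            break  # the first direction already led to a local optimum
    return best
```

As a usage example, with `score = lambda c: -(c[0] - 1.0) ** 2` and a step of 0.25, starting from `[0.0]` the coefficient climbs to 1.0 and stops there, because the next step to 1.25 lowers the score.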

Step 240, when all the reference coefficients in all the comprehensive models have been replaced by the corresponding target coefficients, determining the comprehensive model that meets expectations according to the recommendation effect values of the at least two comprehensive models.

Specifically, under the condition that each reference coefficient in each comprehensive model is replaced by a corresponding target coefficient, the recommendation effect values of each comprehensive model are compared, and the comprehensive model corresponding to the maximum recommendation effect value is used as the target comprehensive model.

For example, after each reference coefficient in each comprehensive model is replaced with the corresponding target coefficient, a plurality of locally optimal comprehensive models are obtained. The recommendation effect values auc_total of the locally optimal comprehensive models are compared, and the comprehensive model with the maximum recommendation effect value is taken as the target comprehensive model. The target coefficients of the recommendation models in the target comprehensive model give the best recommendation effect among the comprehensive models formed from the randomly generated groups of reference coefficients.

Step 250, acquiring the recommendation probabilities of the network resources determined by each recommendation model included in the target comprehensive model.

Step 260, determining the network resources to be recommended from the network resources based on the recommendation probabilities by using the target comprehensive model.

Specifically, the target comprehensive model is brought online so that network resources are recommended according to online user data through the target comprehensive model. For example, online user data is input into the target comprehensive model, and the recommendation models included in the target comprehensive model determine the recommendation probabilities of the network resources based on the online user data. The recommendation probability output by each recommendation model is adjusted by the target coefficient corresponding to that recommendation model, and the adjusted recommendation probabilities of the same network resource are combined. The network resources are then sorted in descending order of the combined recommendation probability, and the top-ranked network resources are taken as the network resources to be recommended. It should be noted that the embodiment of the present invention does not limit how many network resources are selected as the network resources to be recommended; the number may be set according to the actual recommendation scenario.
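A minimal sketch of this combination and ranking step follows. It assumes a linear combination (each model's probability multiplied by its target coefficient, then summed per resource); the function name `recommend_top_n` and the dictionary layout are hypothetical choices for illustration.

```python
from typing import Dict, List


def recommend_top_n(
    model_probs: Dict[str, Dict[str, float]],
    target_coeffs: Dict[str, float],
    top_n: int,
) -> List[str]:
    """Weight each model's probabilities, sum them per resource, return the top-ranked ids."""
    combined: Dict[str, float] = {}
    for model_name, probs in model_probs.items():
        weight = target_coeffs[model_name]  # target coefficient of this recommendation model
        for resource_id, prob in probs.items():
            combined[resource_id] = combined.get(resource_id, 0.0) + weight * prob
    # Descending order of the combined recommendation probability.
    ranked = sorted(combined, key=lambda r: combined[r], reverse=True)
    return ranked[:top_n]
```

The value of `top_n` is left to the caller, in line with the text above, which does not fix the number of resources to recommend.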

The embodiment of the invention randomly generates at least two groups of reference coefficients and constructs at least two comprehensive models; adjusts the reference coefficients of each comprehensive model one by one by decreasing (or increasing) their values to obtain comprehensive models whose recommendation effect is locally optimal; calculates the recommendation effect values of these locally optimal comprehensive models; takes the locally optimal comprehensive model with the maximum recommendation effect value as the target comprehensive model and brings it online; and uses the target comprehensive model to determine the network resources to be recommended among the network resources based on online user data. This avoids the problem that manually defined reference coefficients degrade the recommendation effect of the comprehensive model: multiple groups of reference coefficients are generated automatically, and the group of target coefficients that optimizes the recommendation effect of the comprehensive model is selected from them based on offline data, so that the recommendation effect of the recommendation model can be effectively improved.

Fig. 3 is a flowchart of a method for determining a network resource to be recommended according to an embodiment of the present invention. On the basis of the above embodiment, the present embodiment further describes the step of adjusting online the target coefficients of the comprehensive model whose recommendation effect value meets expectations. As shown in fig. 3, the method comprises the steps of:

Step 301, generating at least two comprehensive models based on a preset comprehensive model expression.

Illustratively, at least two sets of corresponding numbers of reference coefficients are randomly generated based on the number of model synthesis coefficients in a preset synthesis model expression. And respectively combining each group of reference coefficients to the comprehensive model expression to obtain at least two comprehensive models.

Step 302, updating the corresponding reference coefficients of each comprehensive model one by one according to a preset value, determining the recommendation effect value of the corresponding comprehensive model before and after each update operation, and taking the reference coefficient that maximizes the recommendation effect value during the update process as the target coefficient.

Step 303, when each reference coefficient in each comprehensive model has been replaced by the corresponding target coefficient, determining the target comprehensive model that meets expectations according to the recommendation effect values of the at least two comprehensive models.

Step 304, dividing the user data into a control group and an experimental group according to the user identification.

It should be noted that the user data includes user behavior data, user characteristics, and network resource characteristics. The user characteristic data is an abstraction of user information and is used for representing one or a class of users in the network resource recommendation process. The user behavior data represents browsing preferences of the user in browsing network resources. A network resource feature is an abstraction of network resource information used to represent one or a class of network resources in a resource recommendation process.

The user identification is a kind of user characteristic data. In particular, the user identification may be a user number userID. For example, it is set in advance that the user data of a user whose user number satisfies a set condition belongs to the experimental group. It should be noted that the embodiment of the present invention does not specifically limit the content of the set condition; the set condition may be defined on the user number itself, on any one or more digits of the user number, or on a combination of the two. For example, the set condition may be that the user data of a user whose user number ends in an odd digit belongs to the experimental group. Alternatively, the set condition may be that the user data of a user whose user number ends in an even digit belongs to the experimental group. Alternatively, the set condition may be that the user data of a user whose last digit is a multiple of a certain number belongs to the experimental group. Alternatively, the set condition may be that the user data of a user whose last digit is a certain number belongs to the experimental group, and so on. The user data other than the user data of the experimental group is used as the control group.
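One of the set conditions mentioned above (an odd last digit of the user number) can be sketched as follows. The helper names `last_digit_is_odd` and `split_users` are hypothetical, integer user numbers are assumed, and any other predicate on the user number could be passed in as the condition.

```python
from typing import Callable, Iterable, List, Tuple


def last_digit_is_odd(user_id: int) -> bool:
    """Example set condition: the user number ends in an odd digit."""
    return (user_id % 10) % 2 == 1


def split_users(
    user_ids: Iterable[int],
    condition: Callable[[int], bool] = last_digit_is_odd,
) -> Tuple[List[int], List[int]]:
    """Users satisfying the condition form the experimental group; the rest form the control group."""
    experimental: List[int] = []
    control: List[int] = []
    for uid in user_ids:
        (experimental if condition(uid) else control).append(uid)
    return experimental, control
```

Because the assignment depends only on the user number, the same user always falls into the same group across observation periods.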

Step 305, determining a first network resource recommended to the users of the experimental group through the target comprehensive model, and determining a second network resource recommended to the users of the control group through the original comprehensive model.

In the embodiment of the invention, the target comprehensive model is the comprehensive model whose recommendation effect value meets expectations, as determined in the above embodiment. The original comprehensive model is the comprehensive model currently used for network resource recommendation.

Specifically, the user data included in the experimental group is input into the target comprehensive model, and the recommendation probability of the network resource is determined based on the input user data through each recommendation model included in the target comprehensive model. And adjusting the recommendation probability output by the corresponding recommendation model according to the target coefficient of each recommendation model in the target comprehensive model, then, synthesizing the recommendation probabilities output by different recommendation models of the same network resource, and sequencing the network resources based on the synthesized recommendation probability. And determining the first network resource recommended to the user corresponding to the experimental group according to the sorting result.

For example, the user data corresponding to the experimental group includes: the user A clicks the video a, the user characteristics of the user A and the video characteristics of the video a; the user B clicks the video a, the user characteristics of the user B and the video characteristics of the video a; the user A shares the video a, the user characteristics of the user A and the video characteristics of the video a; the user C clicks the video a, the user characteristics of the user C and the video characteristics of the video a; the user C reviews the video a, the user characteristics of the user C and the video characteristics of the video a; … … user X focuses on video m, user characteristics of user X and video characteristics of video m.

The user data of the experimental group is input into the target comprehensive model, and the recommendation models included in the target comprehensive model determine, based on the input user data, the probability of each video in the video library being clicked, liked, shared, followed, and so on. Then, the product of the target coefficient corresponding to each recommendation model and the output of that recommendation model is calculated. The products corresponding to the click, like, share, follow, and other operations for the same video are then added to obtain a comprehensive score for the video, and the videos with the highest comprehensive scores are taken as the first network resources recommended to the users of the experimental group.

Specifically, the user data included in the control group is input into the original comprehensive model, and the recommendation probability of the network resource is determined based on the input user data through each recommendation model included in the original comprehensive model. The recommendation probability output by each recommendation model is adjusted according to the coefficient of that recommendation model in the original comprehensive model, the recommendation probabilities output by different recommendation models for the same network resource are then combined, and the network resources are sorted based on the combined recommendation probability. The second network resource recommended to the users of the control group is determined according to the sorting result. It should be noted that the manner of determining the second network resource is similar to the manner of determining the first network resource, and is not described herein again.

Step 306, obtaining user behavior data for the first network resource and the second network resource within a set time period, and respectively determining behavior indexes corresponding to the experimental group and the control group according to the user behavior data.

It should be noted that, the embodiment of the present invention does not limit the specific value of the set time period, and the duration of the set time period may be set according to actual needs.

The user behavior data is data representing the operation behavior of the user on the recommended network resources. The user behavior data comprises a user identifier, a user operation behavior, and a user operation object. For example, the user behavior data may take the form "user A clicked (or shared, followed, liked, etc.) video c".

Specifically, the user behavior data of the users of the experimental group on the first network resource within the set time period is obtained, and the user behavior data of the users of the control group on the second network resource within the set time period is obtained. For example, data on behaviors such as clicking, liking, following, or sharing performed on the first network resource by the users of the experimental group within the set time period is obtained, and data on behaviors such as clicking, liking, following, or sharing performed on the second network resource by the users of the control group within the set time period is obtained.

In the embodiment of the invention, the behavior index represents the proportion of the operated network resources in the set time period to the network resources recommended to the user.

Specifically, the behavior indexes corresponding to the experimental group may be determined from the user behavior data as follows. The numbers of clicked, liked, shared, and followed network resources among the first network resources are determined according to the user behavior data. The proportion of clicked network resources among the first network resources is calculated as the click rate of the experimental group; the proportion of liked network resources is calculated as the like rate of the experimental group; the proportion of shared network resources is calculated as the sharing rate of the experimental group; and the proportion of followed network resources is calculated as the attention rate of the experimental group.

Similarly, the behavior indexes corresponding to the control group may be determined from the user behavior data as follows. The numbers of clicked, liked, shared, and followed network resources among the second network resources are determined according to the user behavior data. The proportion of clicked network resources among the second network resources is calculated as the click rate of the control group; the proportion of liked network resources is calculated as the like rate of the control group; the proportion of shared network resources is calculated as the sharing rate of the control group; and the proportion of followed network resources is calculated as the attention rate of the control group.

Step 307, calculating the ratio of the corresponding behavior indexes of the experimental group and the control group.

Specifically, the ratio of the click rate of the experimental group to the click rate of the control group is calculated, the ratio of the like rate of the experimental group to the like rate of the control group is calculated, the ratio of the sharing rate of the experimental group to the sharing rate of the control group is calculated, and the ratio of the attention rate of the experimental group to the attention rate of the control group is calculated.
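The behavior indexes and their ratios can be illustrated with the sketch below. The names `behavior_rates` and `index_ratios` and the set-based data layout are assumptions for illustration; returning infinity when the control rate is zero is also a choice made here, since the text does not say how that case is handled.

```python
from typing import Dict, Set


def behavior_rates(recommended: Set[str], operated: Dict[str, Set[str]]) -> Dict[str, float]:
    """Share of the recommended resources that received each behavior (click, like, ...)."""
    if not recommended:
        raise ValueError("no recommended resources")
    return {
        behavior: len(ids & recommended) / len(recommended)
        for behavior, ids in operated.items()
    }


def index_ratios(experimental: Dict[str, float], control: Dict[str, float]) -> Dict[str, float]:
    """Experimental-group rate divided by control-group rate, per behavior index."""
    ratios: Dict[str, float] = {}
    for behavior, exp_rate in experimental.items():
        ctl_rate = control[behavior]
        # Assumption: an undefined ratio (zero control rate) is reported as infinity.
        ratios[behavior] = exp_rate / ctl_rate if ctl_rate > 0 else float("inf")
    return ratios
```

A ratio below 1 for an index means the target comprehensive model performed worse on that index than the original comprehensive model during the set time period.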

Step 308, judging whether the ratio of the corresponding behavior indexes of the experimental group and the comparison group is smaller than 1, if so, executing step 309, otherwise, executing step 310.

Specifically, whether the ratio of the click rate, the ratio of the like rate, the ratio of the sharing rate and the ratio of the attention rate are smaller than 1 is respectively compared.

Step 309, increasing by a set value the target coefficient corresponding to a behavior index whose ratio is smaller than 1 to obtain a new target comprehensive model, and returning to step 305.

It should be noted that the value of the set value can be set according to actual needs, and the embodiment of the present invention does not limit the value.

Specifically, a behavior index whose ratio is smaller than 1 is arbitrarily selected, the sum of the set value and the target coefficient of the recommendation model corresponding to that behavior index is calculated, and the value of that target coefficient is replaced by the sum. For example, if the ratio of the click rates of the experimental group and the control group is less than 1, the set value is added to the target coefficient x_i of the click recommendation model corresponding to the click index, and set value + x_i is taken as the new target coefficient x_i. For recommendation models whose behavior index ratio is greater than 1, the values of the target coefficients are kept unchanged. A new target comprehensive model is thereby obtained, and execution returns to step 305 until the ratio of the click rates of the experimental group and the control group is greater than 1. One behavior index is then selected from the remaining behavior indexes whose ratio is less than 1, and a similar adjustment operation is performed.
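One round of this adjustment could be sketched as follows. The function name `adjust_target_coeffs` is hypothetical, and where the text selects a lagging index arbitrarily, this sketch simply takes the first one in dictionary order.

```python
from typing import Dict


def adjust_target_coeffs(
    target_coeffs: Dict[str, float],
    ratios: Dict[str, float],
    set_value: float,
) -> Dict[str, float]:
    """Raise the coefficient of one behavior index whose experimental/control ratio is below 1."""
    new_coeffs = dict(target_coeffs)  # coefficients of all other models stay unchanged
    lagging = [behavior for behavior in target_coeffs if ratios[behavior] < 1.0]
    if lagging:
        # Assumption: "arbitrarily selected" is realized as "first lagging index".
        new_coeffs[lagging[0]] += set_value
    return new_coeffs
```

In a full loop, the returned coefficients would define the new target comprehensive model, and the ratios would be measured again after the next set time period before calling the function once more.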

Step 310, when the ratio of each behavior index is greater than 1, updating the original comprehensive model by using the new target comprehensive model.

Illustratively, if the ratio of all the behavior indexes of the experimental group and the control group is greater than 1, the original comprehensive model is updated by using a new target comprehensive model with the ratio greater than 1.

Step 311, obtaining the recommendation probabilities of the network resources determined by each recommendation model included in the target comprehensive model.

Step 312, determining the network resources to be recommended from the network resources based on the recommendation probabilities by using the target comprehensive model.

According to the embodiment of the invention, the user data is divided into an experimental group and a control group, which are input into the target comprehensive model and the original comprehensive model respectively, to obtain the first network resource recommended to the users of the experimental group as determined by the target comprehensive model and the second network resource recommended to the users of the control group as determined by the original comprehensive model. The ratio of each behavior index for the first network resource to the corresponding behavior index for the second network resource within a set time period is determined. When a ratio is smaller than 1, the target coefficient of the recommendation model corresponding to that behavior index is increased by the set value to obtain a new target comprehensive model; when the ratios of all the behavior indexes are larger than 1, the original comprehensive model is updated with the new target comprehensive model. According to the technical scheme of the embodiment of the invention, the on-line indexes of the target comprehensive model within the set time period are determined and compared with the on-line indexes of the original comprehensive model, and each target coefficient of the comprehensive model is dynamically adjusted according to the change, so that the comprehensive model matches the actual on-line indexes.

Fig. 4 is a schematic diagram of the process of determining the comprehensive model in the method for determining network resources to be recommended according to the embodiment of the present invention. As shown in fig. 4, in the recommendation scenario, the single behavior index samples corresponding to each behavior index are obtained from the big data samples of users. That is, each of the k behavior indexes (e.g., click rate, approval rate, sharing rate, attention rate, comment rate, etc.) has its own samples (e.g., index 1 samples, index 2 samples, ……, index k samples), and the sample set corresponding to each behavior index is referred to as a single behavior index sample set. k recommendation models (such as model m₁, model m₂, model m₃, ……, model m_k) can be trained from the k single behavior index sample sets. The user can set the synthesis function f(m₁,m₂,m₃...) of these recommendation models. For example, if the recommendation models are to be combined linearly, f(m₁,m₂,m₃...) = x₁*m₁ + x₂*m₂ + …… + x_k*m_k is set, where x₁, x₂, ……, x_k are the comprehensive model coefficients whose values are to be determined. The embodiment of the invention can evaluate the recommendation effect of the comprehensive model for each behavior index based on offline data through a random algorithm and a coordinate descent method, and thereby search for the optimal comprehensive model coefficients based on the recommendation effect. It should be noted that the random algorithm is used to generate at least two sets of comprehensive model coefficients. Random algorithms include the numerical probability algorithm, the Monte Carlo algorithm, the Las Vegas algorithm, the Sherwood algorithm, and the like.

Specifically, the comprehensive model may be set as f(m₁,m₂,m₃...) = x₁*m₁ + x₂*m₂ + …… + x_k*m_k. At least two sets of comprehensive model coefficients, called reference coefficients, may be generated by a random algorithm. Each set of reference coefficients is substituted into the comprehensive model f(m₁,m₂,m₃...) to obtain at least two comprehensive models. The recommendation effect value of each comprehensive model is evaluated using the index 1 samples, the index 2 samples, ……, and the index k samples.

It should be noted that, for a recommendation algorithm, the recommendation effect value usually adopts the auc index. Each comprehensive model f(m₁,m₂,m₃...) is used to score the samples of each behavior index, and the labels in the samples can be used to determine whether the samples are click rate samples, like rate samples, sharing rate samples, attention rate samples, or comment rate samples, so as to determine which recommendation model the calculated auc corresponds to. If the calculated auc corresponds to the click recommendation model, it is recorded as auc₁. Similarly, if the calculated auc corresponds to the like recommendation model, it is recorded as auc₂. In this way, the k auc values corresponding to the k index samples, i.e., auc₁, auc₂, ……, auc_k, can be calculated.

The k indexes may differ in importance, and the user may preset the weighting coefficients b₁, b₂, ……, b_k of the different indexes. For example, a recommendation scenario may have indexes such as the user click rate, approval rate, comment rate, sharing rate, and attention rate. If the business mainly wants to improve the approval rate and the attention rate and is less concerned with the click rate, comment rate, and sharing rate, the weighting coefficients corresponding to the approval rate and attention rate indexes, for example b₁ and b₂, can be increased, and the weighting coefficients corresponding to the click rate, comment rate, and sharing rate indexes, for example b₃, b₄, and b₅, can be decreased. The resulting evaluation index auc_total then better reflects the approval rate and attention rate indexes, so that improving auc_total better improves the approval rate and the attention rate.

According to the preset weighting coefficients of the different indexes, the comprehensive recommendation effect value auc_total = b₁*auc₁ + b₂*auc₂ + …… + b_k*auc_k can be obtained. This auc_total is used as a score to measure how good a set of comprehensive model coefficients x₁, x₂, ……, x_k is.
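The weighted combination can be written as a short sketch. The function name `auc_total` mirrors the symbol in the text; the list-based interface, with weights and auc values given in the same index order, is an assumption.

```python
from typing import Sequence


def auc_total(aucs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum b_1*auc_1 + ... + b_k*auc_k over the k behavior indexes."""
    if len(aucs) != len(weights):
        raise ValueError("one weighting coefficient is needed per behavior index")
    return sum(b * a for b, a in zip(weights, aucs))
```

Note that the text does not require the weighting coefficients to sum to 1, so auc_total is a relative score for comparing coefficient sets rather than an AUC in the range 0 to 1.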

The method of finding the optimal coefficients that make the recommendation effect of the comprehensive model meet expectations may include:

1. Several sets of starting model synthesis parameters are randomly generated. For example, the randomly generated values of (x₁, x₂, x₃, ……, x_k) are (a₁, a₂, a₃, ……, a_k), (c₁, c₂, c₃, ……, c_k), (d₁, d₂, d₃, ……, d_k), and (e₁, e₂, e₃, ……, e_k). It should be noted that the embodiment of the present invention does not limit the number of sets of randomly generated reference coefficients; the number of sets may be set according to actual needs.

2. For each randomly generated set of starting model synthesis parameters, the recommendation effect value auc_total of the corresponding comprehensive model is calculated. Then, using a coordinate descent method, one parameter is updated in one direction at a time; for example, x₁ is updated so that x₁ = x₁ ± Δ, where Δ is the step size of each coordinate descent step, an empirical value that can be adjusted based on practical considerations. If updating the parameter increases auc_total, the updated parameter replaces the original parameter, and the update operation on the same parameter is repeated until an update reduces auc_total, at which point the parameter value before that update makes auc_total locally optimal. The remaining parameters are selected in turn and updated in the same manner, so that each set of starting parameters yields a locally optimal auc_total.

3. For the several randomly generated sets of starting model synthesis parameters, several locally optimal auc_total values are obtained by the coordinate descent method. The model synthesis parameters corresponding to the maximum auc_total are taken as the final result, namely the target coefficients of the comprehensive model.
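The outer loop of the three steps above (random starting sets, local refinement of each, selection of the maximum) can be sketched as follows. This is a hedged illustration: `pick_best_start` is a hypothetical name, `refine` stands in for the per-coefficient update procedure of step 2 and is supplied by the caller, `score` stands in for the calculation of auc_total, and drawing starting values uniformly from [0, 1) is an assumption because the text does not specify a range.

```python
import random
from typing import Callable, List, Tuple


def pick_best_start(
    score: Callable[[List[float]], float],
    refine: Callable[[List[float]], List[float]],
    k: int,
    n_starts: int,
    seed: int = 0,
) -> Tuple[float, List[float]]:
    """Refine several random starting coefficient sets and keep the best-scoring one."""
    rng = random.Random(seed)  # seeded so that the search is reproducible
    candidates: List[Tuple[float, List[float]]] = []
    for _ in range(n_starts):
        start = [rng.random() for _ in range(k)]  # step 1: random reference coefficients
        local_best = refine(start)                # step 2: locally optimal coefficients
        candidates.append((score(local_best), local_best))
    # Step 3: the coefficient set with the maximum score becomes the target coefficients.
    return max(candidates, key=lambda c: c[0])
```

Using several starting points reduces the risk that a single coordinate-wise search settles on a poor local optimum, which is the motivation the text gives for generating more than one set of reference coefficients.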

After the comprehensive model is online, daily user data of the comprehensive model can be collected, and then daily index performance of the comprehensive model is determined. The target coefficient is adjusted according to the index performance of each day, so that the index performance of the comprehensive model can be matched with the actual on-line index.

Specifically, in a recommendation scenario, when a model update event is triggered, the user data newly generated on that day is divided into an experimental group and a control group. The user data corresponding to the experimental group is input into the new comprehensive model, and the user data corresponding to the control group is input into the original comprehensive model. A first network resource recommended to the users of the experimental group is determined through the new comprehensive model, and a second network resource recommended to the users of the control group is determined through the original comprehensive model. User behavior data of the experimental group users on the first network resource within a set time period is acquired, and each online index corresponding to the experimental group is determined according to that data. User behavior data of the control group users on the second network resource within the set time period is acquired, and each online index corresponding to the control group is determined according to that data. For each online index, the ratio of its value in the experimental group to its value in the control group is calculated. When the ratio of an online index is less than 1, the target coefficient of the recommendation model corresponding to that online index is adjusted.
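By way of example only, the ratio comparison described above may be sketched as follows. The index names and the dictionary layout are hypothetical, and the sketch assumes that every online index of the control group is non-zero.

```python
from typing import Dict, List


def index_ratios(experiment: Dict[str, float],
                 control: Dict[str, float]) -> Dict[str, float]:
    """Ratio of each online index of the experimental group to that of the control group."""
    # Assumption: both groups report the same index names and control values are non-zero.
    return {name: experiment[name] / control[name] for name in experiment}


def indexes_to_adjust(ratios: Dict[str, float]) -> List[str]:
    """Names of the online indexes whose ratio is less than 1."""
    return [name for name, ratio in ratios.items() if ratio < 1.0]


# Hypothetical online indexes measured over the set time period.
experiment_group = {"click": 0.12, "like": 0.03}
control_group = {"click": 0.10, "like": 0.04}
ratios = index_ratios(experiment_group, control_group)
needs_adjustment = indexes_to_adjust(ratios)
```

Here only the index whose experimental value falls below the control value is selected, so only the target coefficient of its recommendation model would be adjusted.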

Fig. 5 is a schematic diagram of an online adjustment process of the comprehensive model in the method for determining network resources to be recommended according to the embodiment of the present invention. As shown in fig. 5, after the comprehensive model has been in online gray-scale release for a period of time t, online index 1 to online index k corresponding to the new comprehensive model may be obtained, and online index 1 to online index k corresponding to the original comprehensive model may also be obtained. The ratio of the new model to the original model is calculated for online index 1, online index 2, …, online index k respectively. If a ratio is less than 1, the change of that online index is determined to be negative, and the target coefficient x_i of the corresponding recommendation model m_i is set to x_i = x_i + Δ, where Δ is the adjustment step size and can be set as small as possible. After x_i is adjusted, the change of the online index is monitored during the next period t. If the change of the online index is still negative, the target coefficient of the corresponding recommendation model m_i is again set to x_i = x_i + Δ, and the change of the online index is monitored during the following period t, and so on until the index changes positively. That is, the target coefficient of the recommendation model corresponding to a negatively changing online index is adjusted once every period t. If the ratio is greater than 1, the change of the online index is determined to be positive, and the target coefficient of the corresponding recommendation model m_i does not need to be adjusted.

Here, Δ is independent of the Δ used in the coordinate descent method; the two may take the same value or different values.

It should be noted that the target coefficients of the recommendation models corresponding to all negatively changed online indexes are adjusted with reference to the above method.
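A minimal sketch of the periodic adjustment described above is given below for illustration. The `observe_ratios` callable, which stands in for one monitoring period t and returns the ratio of each online index, and the `max_rounds` limit are assumptions of this sketch and are not part of the embodiment.

```python
from typing import Callable, Dict

# Hypothetical callback: given the current target coefficients, run one monitoring
# period t and return the experimental/control ratio of each online index.
ObserveRatios = Callable[[Dict[str, float]], Dict[str, float]]


def adjust_online(coeffs: Dict[str, float], observe_ratios: ObserveRatios,
                  delta: float, max_rounds: int) -> Dict[str, float]:
    """Add delta to the coefficient of every negatively changing index, once per period."""
    coeffs = dict(coeffs)  # do not modify the caller's dictionary
    for _ in range(max_rounds):
        ratios = observe_ratios(coeffs)
        negative = [name for name, ratio in ratios.items() if ratio < 1.0]
        if not negative:
            break  # no index changes negatively, so no further adjustment is needed
        for name in negative:
            coeffs[name] += delta
    return coeffs
```

The coefficients keyed by index name correspond to x_i of recommendation model m_i. The sketch leaves a coefficient unchanged once the ratio of its online index is no longer less than 1.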

Online gray-scale release means that a certain portion of traffic (i.e., the data of users whose user identifiers satisfy a set condition in the user data) is given to the newly launched comprehensive model as the experimental group, and the remaining traffic (the user data other than that of the experimental group) serves as the control group. The index performance of the new comprehensive model processing the user data of the experimental group is monitored and compared with the index performance of the original model processing the user data of the control group. If the change in index performance meets a preset condition, the new comprehensive model is formally brought online.
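One possible form of the set condition on the user identifier is sketched below purely as an assumption: the identifier is hashed into one of 100 buckets and the lowest buckets form the experimental group. The embodiment does not specify the condition, and the function name and the default percentage are hypothetical.

```python
import hashlib


def assign_group(user_id: str, experiment_percent: int = 10) -> str:
    """Assign a user to the experimental or control group from a hash of the identifier."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return "experiment" if bucket < experiment_percent else "control"
```

Because the bucket depends only on the identifier, the same user falls into the same group in every monitoring period, which keeps the indexes of the two groups comparable.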

It should be noted that the trigger condition of the model update event may be a timing trigger, a periodic trigger, or a manual trigger, and the embodiment of the present invention is not limited in particular.

According to the embodiment of the invention, the recommendation effect of the comprehensive model is evaluated through the off-line data, and then the comprehensive model with the recommendation effect meeting the expectation is selected as the comprehensive model to be on-line. After the comprehensive model is online, the target coefficient of the comprehensive model is dynamically adjusted according to the change condition of the online index, the recommendation effect of the model can be greatly improved, and the user experience is improved.

Fig. 6 is a block diagram of a structure of a device for determining a network resource to be recommended according to an embodiment of the present invention, where the device may be implemented by software and/or hardware, and may be generally integrated at a server, and may determine the network resource to be recommended by executing the method for determining a network resource to be recommended provided by the present invention. As shown in fig. 6, the apparatus includes:

a recommendation probability obtaining module 610, configured to obtain recommendation probabilities of network resources determined by at least two recommendation models, where each recommendation model is generated based on learning a single behavior index sample set, and samples in each single behavior index sample set correspond to the same user behavior;

and a recommended resource determining module 620, configured to determine, by using a comprehensive model, a network resource to be recommended from the network resources based on the recommendation probabilities, where the comprehensive model is determined based on each recommendation model and a corresponding target coefficient, the target coefficients are the set, among at least two randomly generated sets of reference coefficients, that makes the recommendation effect value of the comprehensive model meet expectations, and the recommendation effect value is an index for evaluating the recommendation effect of the comprehensive model.

The device for determining network resources to be recommended obtains the recommendation probabilities of the network resources determined by at least two recommendation models, and uses the comprehensive model to determine the network resources to be recommended from the network resources based on those recommendation probabilities. The comprehensive model is the one, among the comprehensive models obtained by combining at least two randomly generated sets of reference coefficients with the recommendation models, whose recommendation effect value evaluated on offline sample data meets expectations. According to the embodiment of the invention, the recommendation effect of the comprehensive model is evaluated on offline sample data, and the set that makes the comprehensive model meet expectations is selected from the multiple sets of reference coefficients as the target coefficients. This avoids the problem in the related art that manually set model comprehensive parameters affect the model recommendation effect, optimizes the model recommendation effect, and improves user stickiness.

It should be noted that, in the embodiment of the apparatus for determining a network resource to be recommended, each unit and each module included in the embodiment are only divided according to functional logic, but are not limited to the above division, as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

The embodiment of the invention provides a computer device in which the device for determining network resources to be recommended provided by the embodiment of the invention can be integrated. The computer device includes one or more processors and a memory for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method for determining the network resource to be recommended provided by the embodiment of the invention. Fig. 7 is a block diagram of a computer device according to an embodiment of the present invention. In fig. 7, one processor is taken as an example: the computer device 700 includes a processor 710 and a memory 720, which are connected by a bus or other means, and fig. 7 takes connection by a bus as an example.

The embodiment of the present invention further provides a computer-readable storage medium, on which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the method for determining a network resource to be recommended according to any embodiment of the present invention is implemented.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for determining a network resource to be recommended is characterized by comprising the following steps:

2. The method of claim 1, further comprising, before obtaining the recommendation probabilities of the network resources determined by the at least two recommendation models:

generating at least two comprehensive models based on a preset comprehensive model expression;

for each comprehensive model, updating the corresponding reference coefficients one by one according to a preset numerical value, determining the recommended effect value of the corresponding comprehensive model before and after each updating operation, and taking the reference coefficient which enables the recommended effect value to be maximum in the updating operation process as a target coefficient;

and under the condition that each reference coefficient in each comprehensive model is replaced by a corresponding target coefficient, determining the target comprehensive model meeting expectations according to the recommended effect values of at least two comprehensive models.

3. The method of claim 2, wherein generating at least two integrated models based on a preset integrated model expression comprises:

randomly generating at least two groups of reference coefficients with corresponding quantity based on the quantity of model comprehensive coefficients in a preset comprehensive model expression;

and respectively combining each group of reference coefficients to the comprehensive model expression to obtain at least two comprehensive models.

4. The method according to claim 2, wherein the using the reference coefficient that maximizes the recommended effect value during the update operation as the target coefficient comprises:

under the condition that the recommended effect value is increased by the current updating operation, continuously updating the reference coefficient corresponding to the current updating operation according to a preset numerical value;

and under the condition that the recommended effect value is reduced by the current updating operation, taking the numerical value of the reference coefficient corresponding to the current updating operation before the current updating operation as a target coefficient.

5. The method according to claim 2, wherein said updating the corresponding reference coefficients one by one according to the preset value comprises:

and selecting any one direction of the numerical value increase direction and the numerical value decrease direction, and updating the reference coefficients of the corresponding comprehensive models one by one according to a preset numerical value.

6. The method of claim 5, after determining the recommended effectiveness value of the corresponding integrated model before and after each update operation, further comprising:

and when the updating operation is the first updating, if the updating operation reduces the recommended effect value, updating the reference coefficient corresponding to the updating operation along the direction opposite to the numerical value adjusting direction of the updating operation.

7. The method of claim 2, wherein determining the recommended effectiveness value of the integrated model before and after each updating operation comprises:

when the current updating operation is the first updating operation, taking the recommended effect value of the comprehensive model before the first updating operation as the recommended effect value of the corresponding comprehensive model before the current updating operation;

under the condition that the updating operation is not the first updating operation, taking the recommended effect value of the comprehensive model after the last updating operation as the recommended effect value of the corresponding comprehensive model before the updating operation;

and determining the recommended effect value of the corresponding comprehensive model after each updating operation.

8. The method of claim 7, wherein determining the recommendation effect value for the integrated model comprises:

determining the sample recommendation probability of each single behavior index sample set by adopting the comprehensive model, and determining the sequencing result of the samples in the corresponding single sample set according to the sample recommendation probability;

and calculating the recommendation effect value of the comprehensive model according to the sequencing result and the number of positive samples and the number of exposure samples corresponding to the single behavior index sample set.

9. The method of claim 2, wherein determining the target comprehensive model satisfying the expectation according to the recommended effect values of at least two comprehensive models when each reference coefficient in each comprehensive model is replaced by the corresponding target coefficient comprises:

and under the condition that each reference coefficient in each comprehensive model is replaced by the corresponding target coefficient, comparing the recommendation effect values of the comprehensive models, and taking the comprehensive model corresponding to the maximum recommendation effect value as the target comprehensive model.

10. The method according to claim 9, further comprising, after taking the integrated model corresponding to the maximum recommendation effect value as the target integrated model:

dividing user data into a control group and an experimental group according to a user identifier, wherein the user data comprises user behavior data, user characteristics and network resource characteristics;

determining a first network resource recommended to the users of the experimental group through the target comprehensive model, and determining a second network resource recommended to the users of the control group through the original comprehensive model;

acquiring user behavior data for the first network resource and the second network resource in a set time period, and respectively determining behavior indexes corresponding to the experimental group and the control group according to the user behavior data, wherein the behavior indexes represent the proportion of the network resources operated on in the set time period to the network resources recommended to the user;

when the ratio of the behavior indexes of the experimental group to the control group is smaller than 1, increasing a set value for a target coefficient corresponding to the behavior index with the ratio smaller than 1 to obtain a new target comprehensive model, and returning to the step of determining the first network resource recommended to the user of the experimental group through the target comprehensive model;

and under the condition that the ratio of all the behavior indexes of the experimental group and the control group is greater than 1, updating the original comprehensive model by adopting a new target comprehensive model.

11. An apparatus for determining a network resource to be recommended, comprising:

12. A computer device, characterized in that the computer device comprises:

one or more processors;

a memory for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for determining network resources to be recommended according to any one of claims 1-10.

13. A computer-readable storage medium having stored thereon computer-executable instructions, which, when executed by a processor, implement a method for determining network resources to be recommended according to any one of claims 1-10.