CN114399030A - Recommendation model training method, media information recommendation method, device and equipment - Google Patents

Recommendation model training method, media information recommendation method, device and equipment Download PDF

Info

Publication number
CN114399030A
Authority
CN
China
Prior art keywords
training
text
training text
attribute
comparison result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210054275.8A
Other languages
Chinese (zh)
Inventor
叶永洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210054275.8A priority Critical patent/CN114399030A/en
Publication of CN114399030A publication Critical patent/CN114399030A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a recommendation model training method, a media information recommendation method, a device, and equipment, belonging to the technical field of artificial intelligence. The method comprises the following steps: acquiring a first prediction result of each training text being recommended based on a first network model, and acquiring a second prediction result of each training text being recommended based on a second network model; obtaining, for any training text pair, a first comparison result of the first prediction results of the two training texts in the pair and a second comparison result of the second prediction results, where the first comparison result represents the difference between the first prediction results and the second comparison result represents the difference between the second prediction results; and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model. The recommendation model can learn the differences between the first prediction results and the second prediction results of different training texts, which improves the accuracy of the recommendation model and further improves the recommendation accuracy of the media information.

Description

Recommendation model training method, media information recommendation method, device and equipment
Technical Field
The embodiments of the application relate to the technical field of artificial intelligence, and in particular to a recommendation model training method, a media information recommendation method, a device, and equipment.
Background
With the development of artificial intelligence technology, applications provide more and more functions, and recommendation is an important one of them. The recommendation function determines media information to be recommended from a plurality of pieces of media information and recommends it, where the media information to be recommended can be determined from the plurality of pieces of media information using a recommendation model.
In the related art, a first prediction result of each training text may be predicted based on a first network model, and a second prediction result of each training text may be predicted based on a second network model, where the model parameters of the first network model and the second network model are different, and both prediction results represent the possibility that the training text is recommended. Then, for each training text, a loss value is determined based on the first prediction result and the second prediction result of that training text. Finally, the second network model is adjusted based on the loss value corresponding to each training text to obtain a recommendation model.
Because the loss value of a training text is computed only from that text's own first prediction result and second prediction result, the accuracy of the recommendation model obtained from such loss values is limited, which affects the recommendation accuracy of the media information.
Disclosure of Invention
The embodiments of the application provide a recommendation model training method, a media information recommendation method, a device, and equipment, which can be used to solve the problem in the related art that the recommendation accuracy of media information is low because the accuracy of the recommendation model is low.
In one aspect, an embodiment of the present application provides a training method for a recommendation model, where the method includes:
acquiring a first recommended prediction result of each training text based on a first network model, and acquiring a second recommended prediction result of each training text based on a second network model, wherein model parameters of the first network model and the second network model are different;
for any training text pair, obtaining a first comparison result of the first prediction results and a second comparison result of the second prediction results of the two training texts in the pair, wherein the first comparison result is used for representing the difference between the first prediction results, the second comparison result is used for representing the difference between the second prediction results, and each training text pair includes any two of the training texts;
and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model.
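The key idea of the method above is that the student (second network model) is adjusted to match the teacher's pairwise differences rather than its per-text outputs. The following is a minimal sketch of such a comparison-based loss, assuming an MSE penalty on the gap between the two sets of pairwise differences; the patent does not commit to a particular loss form, so the function name and loss shape here are illustrative assumptions:

```python
import numpy as np

def pairwise_comparison_loss(first_preds, second_preds):
    """Penalize the gap between the first comparison results (teacher
    pairwise differences) and the second comparison results (student
    pairwise differences) over all training text pairs in a batch."""
    p1 = np.asarray(first_preds, dtype=float)
    p2 = np.asarray(second_preds, dtype=float)
    diff1 = p1[:, None] - p1[None, :]  # first comparison results
    diff2 = p2[:, None] - p2[None, :]  # second comparison results
    return float(np.mean((diff2 - diff1) ** 2))
```

Note that when the student's pairwise differences exactly match the teacher's, the loss is zero even if the absolute predictions differ by a constant offset, which is precisely the "learn the difference between predictions" behavior described above.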
On the other hand, an embodiment of the present application provides a media information recommendation method, where the method includes:
acquiring a plurality of media information, wherein the media information comprises a target text;
acquiring a recommendation result of the target text in each piece of media information based on a recommendation model, wherein the recommendation model is trained based on any one of the above recommendation model training methods;
and determining the media information to be recommended meeting the recommendation condition from the plurality of pieces of media information based on the recommendation result of the target text in each piece of media information, and recommending the media information to be recommended.
In another aspect, an embodiment of the present application provides a training apparatus for recommending a model, where the apparatus includes:
the acquisition module is used for acquiring a first prediction result of each training text being recommended based on a first network model and acquiring a second prediction result of each training text being recommended based on a second network model, wherein the model parameters of the first network model and the second network model are different;
the acquisition module is further configured to obtain, for any training text pair, a first comparison result of the first prediction results and a second comparison result of the second prediction results of the two training texts in the pair, wherein the first comparison result is used to represent the difference between the first prediction results, the second comparison result is used to represent the difference between the second prediction results, and each training text pair includes any two of the training texts;
and the adjusting module is used for adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model.
In a possible implementation manner, the acquisition module is configured to obtain labeling results of the two training texts in any training text pair, wherein the labeling results are obtained through labeling and are used to represent whether the training texts are recommended; and determine a first comparison result of the training text pair based on the labeling results and the first prediction results of the two training texts in the pair.
In a possible implementation manner, the acquisition module is configured to determine, in response to the labeling results of the two training texts in any training text pair being different, a first comparison result of the training text pair based on target data; and determine, in response to the labeling results of the two training texts in the pair being the same, a first comparison result of the pair based on the first prediction results of the two training texts in the pair.
In a possible implementation manner, the acquisition module is configured to determine, in response to the difference between the labeling results of the two training texts in any training text pair being a first reference value, a first comparison result of the training text pair based on first data; and determine, in response to the difference between the labeling results being a second reference value, a first comparison result of the pair based on second data, wherein the first reference value is greater than the second reference value, and the first data is greater than the second data.
In a possible implementation manner, the acquisition module is configured to determine, in response to the difference between the first prediction results of the two training texts in any training text pair being smaller than a third reference value, a first comparison result of the training text pair based on third data; determine, in response to the difference being not less than the third reference value and not greater than a fourth reference value, a first comparison result of the pair based on fourth data, wherein the third reference value is less than the fourth reference value, and the third data is less than the fourth data; and determine, in response to the difference being greater than the fourth reference value, a first comparison result of the pair based on fifth data, wherein the fourth data is smaller than the fifth data.
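The three-way threshold rule above can be sketched as follows; the reference values and the data values are illustrative placeholders, since the patent leaves their concrete values unspecified:

```python
def first_comparison_result(pred_a, pred_b,
                            third_ref=0.1, fourth_ref=0.5,
                            third_data=0.0, fourth_data=0.5, fifth_data=1.0):
    """Map the gap between two first prediction results to a first
    comparison result via the three-way threshold rule described above.
    All threshold and data values here are assumed placeholders."""
    gap = abs(pred_a - pred_b)
    if gap < third_ref:       # gap below the third reference value
        return third_data
    if gap <= fourth_ref:     # gap between the third and fourth reference values
        return fourth_data
    return fifth_data         # gap above the fourth reference value
```

Larger gaps between the teacher's predictions thus map to larger comparison results, giving the student a stronger pairwise signal where the teacher is more decisive.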
In a possible implementation manner, the acquisition module is configured to determine a target difference value of the second prediction results of the two training texts in any training text pair, and determine a second comparison result of the training text pair based on the target difference value.
In a possible implementation manner, the acquisition module is configured to determine, based on the first network model, a first attribute feature of each attribute in any training text, wherein the first attribute feature of an attribute is used to describe that attribute; perform feature fusion processing on the first attribute features of the attributes in the training text to obtain the first text feature of the training text; and determine a first prediction result of the training text based on the first attribute features of the attributes in the training text and the first text feature of the training text.
In one possible implementation, the acquisition module is used for at least one of the following:
performing feature fusion processing on the first attribute features of the attributes in any training text based on a multilayer perceptron to obtain the first text feature of the training text; and performing feature cross processing on the first attribute features of the attributes in the training text to obtain the first text feature of the training text.
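The two fusion options above (MLP fusion and feature crossing) might look like the following sketch; the shapes and the FM-style inner-product form of the crossing are assumptions for illustration, not the patent's exact operations:

```python
import numpy as np

def feature_cross(attr_feats):
    """FM-style feature crossing: inner products of every pair of
    per-attribute features. attr_feats has shape (m, d), one row per
    attribute (feature field)."""
    m = attr_feats.shape[0]
    return np.array([attr_feats[i] @ attr_feats[j]
                     for i in range(m) for j in range(i + 1, m)])

def mlp_fuse(attr_feats, layer_weights):
    """MLP fusion: concatenate the attribute features and pass them
    through ReLU-activated dense layers to get one text feature."""
    x = attr_feats.reshape(-1)
    for w in layer_weights:
        x = np.maximum(w @ x, 0.0)
    return x
```

In practice the two outputs could also be concatenated, matching the concatenation layer mentioned in the model structure later in the description.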
In a possible implementation manner, the acquisition module is configured to determine, based on the second network model, a second attribute feature of each attribute in any training text, wherein the second attribute feature of an attribute is used to describe that attribute; perform feature fusion processing on the second attribute features of the attributes in the training text to obtain the second text feature of the training text; and determine a second prediction result of the training text based on the second attribute features of the attributes in the training text and the second text feature of the training text.
In a possible implementation manner, the adjusting module is configured to obtain a labeling result of each training text, and adjust the second network model based on the first comparison result and the second comparison result of each training text pair, and the labeling result and the second prediction result of each training text, to obtain the recommendation model.
In a possible implementation manner, the adjusting module is configured to determine a first attribute feature of each attribute in each training text based on the first network model, determine a second attribute feature of each attribute in each training text based on the second network model, and adjust the second network model based on the first comparison result and the second comparison result of each training text pair and the first attribute feature and the second attribute feature of each attribute in each training text to obtain the recommendation model.
In a possible implementation manner, the adjusting module is configured to obtain a first text feature and a second text feature of each training text, and adjust the second network model based on the first comparison result and the second comparison result of each training text pair and the first text feature and the second text feature of each training text to obtain the recommendation model.
In a possible implementation manner, the obtaining module is further configured to obtain a labeling result of each training text;
and the adjusting module is further used for adjusting the first network model based on the labeling result and the first prediction result of each training text to obtain an adjusted first network model.
In another aspect, an embodiment of the present application provides a media information recommendation device, where the device includes:
the acquisition module is used for acquiring a plurality of pieces of media information, wherein the media information includes a target text;
the acquisition module is further configured to acquire a recommendation result of the target text in each piece of media information based on a recommendation model, wherein the recommendation model is trained based on any one of the above recommendation model training methods;
and the recommendation module is used for determining, from the plurality of pieces of media information, the media information to be recommended that meets the recommendation conditions based on the recommendation results of the target texts, and recommending the media information to be recommended.
In a possible implementation manner, the obtaining module is configured to determine, for a target text in any piece of media information, attribute features of each attribute in the target text based on the recommendation model, where the attribute features of any attribute are used to describe any attribute; performing feature fusion processing on the attribute features of each attribute in the target text to obtain text features of the target text; and determining a recommendation result of the target text based on the attribute features of the attributes in the target text and the text features of the target text.
In another aspect, an embodiment of the present application provides an electronic device, where the electronic device includes a processor and a memory, where the memory stores at least one program code, and the at least one program code is loaded and executed by the processor, so that the electronic device implements any one of the above recommendation model training methods or any one of the above media information recommendation methods.
In another aspect, a computer-readable storage medium is provided, in which at least one program code is stored, and the at least one program code is loaded and executed by a processor, so as to enable a computer to implement any one of the above-mentioned training methods for recommendation models or any one of the above-mentioned media information recommendation methods.
In another aspect, a computer program or a computer program product is provided, where at least one computer instruction is stored in the computer program or the computer program product, and the at least one computer instruction is loaded and executed by a processor, so as to enable a computer to implement any one of the above training methods for recommendation models or any one of the above media information recommendation methods.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
in the technical scheme provided by the embodiment of the application, the first comparison result of any training text pair is used for representing the difference of the first prediction results of two training texts in any training text pair, and the second comparison result of any training text pair is used for representing the difference of the second prediction results of two training texts in any training text pair. The recommendation model is obtained based on the first comparison result and the second comparison result of each training text pair, so that the recommendation model can learn the difference between the first prediction result and the second prediction result of different training texts, and the prediction capability of the recommendation model is improved, so that the accuracy of the recommendation model is improved, and the recommendation accuracy of the media information is further improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of an implementation environment of a training method of a recommendation model or a media information recommendation method according to an embodiment of the present application;
FIG. 2 is a flowchart of a training method for a recommendation model according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a media information recommendation method according to an embodiment of the present application;
FIG. 4 is a schematic illustration of asynchronous distillation provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of an asynchronous distillation training process provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a display page provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a training apparatus for recommending a model according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a media information recommendation device according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings. First, terms related to the embodiments of the present application will be explained and explained.
Distillation learning: a training mode for deep learning models. Two models are involved: a strong model (also called the teacher model) and a weak model (also called the student model). The teacher model transmits the knowledge it has learned to the student model via a supervised loss, a process called distillation. The focus of distillation learning is the design of the distillation loss function.
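As a concrete example of the supervised loss the teacher provides, a common choice (an assumption here, not the patent's formula) is the cross-entropy of the student's probabilities against the teacher's soft targets:

```python
import math

def kd_loss(teacher_probs, student_probs):
    """Cross-entropy of student predictions against the teacher's soft
    targets; minimizing it pushes the student toward the teacher."""
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
```

The loss is minimized (at the teacher distribution's entropy) when the student reproduces the teacher's probabilities exactly.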
Asynchronous distillation: a separate process (which may be called the teacher process) is established to train the teacher model, while another process carries out the distillation training and loads the latest teacher model from the teacher process.
Encoding (Embedding): discrete input information is converted into a trainable vector via table lookup; the dimension of the vector is called the embedding size.
Pairwise Loss: a loss used by ranking algorithms in recommendation systems. Input samples within a batch are compared pairwise and converted to labels of 0 or 1 for model learning.
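A minimal sketch of pairwise loss under those terms, assuming a RankNet-style logistic form (the source does not give the exact formula, so both function names and the loss shape are illustrative):

```python
import math

def pairwise_labels(scores):
    """Compare in-batch samples pairwise: the ordered pair (i, j) gets
    label 1 when sample i should rank above sample j, else 0."""
    n = len(scores)
    return [((i, j), 1 if scores[i] > scores[j] else 0)
            for i in range(n) for j in range(n) if i != j]

def pairwise_logistic_loss(score_i, score_j, label):
    """Logistic loss on the score difference of one ordered pair."""
    p = 1.0 / (1.0 + math.exp(-(score_i - score_j)))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
```

Because the loss depends only on score differences, it trains the model to order samples correctly rather than to reproduce absolute score values.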
Fig. 1 is a schematic diagram of an implementation environment of the recommendation model training method or the media information recommendation method provided in an embodiment of the present application. As shown in Fig. 1, the implementation environment includes an electronic device 11, and the recommendation model training method or the media information recommendation method in the embodiments of the present application may be executed by the electronic device 11. Illustratively, the electronic device 11 may include at least one of a terminal device or a server.
The terminal device may be at least one of a smartphone, a gaming console, a desktop computer, a tablet computer, and a laptop portable computer. The server may be one server, or a server cluster formed by multiple servers, or any one of a cloud computing platform and a virtualization center, which is not limited in this embodiment of the present application. The server can be in communication connection with the terminal device through a wired network or a wireless network. The server may have functions of data processing, data storage, data transceiving, and the like, and is not limited in the embodiment of the present application.
The recommendation model training method and the media information recommendation method provided by the embodiments of the present application are implemented based on Artificial Intelligence (AI) technology. AI is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning, and decision making.
Artificial intelligence is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, and intelligent transportation.
With the research and progress of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, the Internet of Vehicles, and intelligent transportation.
Based on the foregoing implementation environment, an embodiment of the present application provides a recommendation model training method, which may be executed by the electronic device 11 in Fig. 1. As shown in the flowchart of Fig. 2, the method includes steps 201 to 203.
Step 201, acquiring a first prediction result of each training text being recommended based on a first network model, and acquiring a second prediction result of each training text being recommended based on a second network model, wherein the model parameters of the first network model and the second network model are different.
The first network model and the second network model have the same model structure but different model parameters. The embodiments of the present application do not limit the model structure of the first network model or the second network model. Illustratively, the model structures of the first network model and the second network model each include a coding layer, a feature crossing layer, a Multi-Layer Perceptron (MLP) layer, a concatenation layer, and a fully connected layer. The coding layer of the first network model is a DeepFM (factorization-machine-based deep neural network) model with embedding dimension 128, and the coding layer of the second network model is a DeepFM model with embedding dimension 16. The dimensions of the hidden units in the MLP layer of the first network model are 512, 256, 128, and 64, respectively, while the dimensions of the hidden units in the MLP layer of the second network model are 128, 64, 32, and 16, respectively.
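The teacher/student sizes given above can be summarized as configuration; the parameter-count helper below is only a rough illustration (dense weights plus biases, ignoring the DeepFM encoder and other layers) of why the second, student-side model is much lighter:

```python
# Dimensions quoted in the example above; everything else about the
# model structure is abstracted away.
TEACHER_CFG = {"embedding_size": 128, "mlp_hidden_units": [512, 256, 128, 64]}
STUDENT_CFG = {"embedding_size": 16, "mlp_hidden_units": [128, 64, 32, 16]}

def mlp_param_count(input_dim, hidden_units):
    """Rough dense-layer parameter count (weights plus biases)."""
    total, d = 0, input_dim
    for h in hidden_units:
        total += d * h + h
        d = h
    return total
```

With these dimensions the student's MLP tower has far fewer parameters than the teacher's, which is what makes it cheap enough to serve online after distillation.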
In the embodiments of the present application, a plurality of training texts can be obtained. Any training text can be a text in media information, where the media information includes at least one of audio, pictures, and text. On the one hand, the plurality of training texts are input into the first network model, and the first prediction result of each training text is predicted by the first network model. On the other hand, the plurality of training texts are input into the second network model, and the second prediction result of each training text is predicted by the second network model. The first prediction result and the second prediction result of a training text can be vectors representing the possibility that the training text is recommended, or probabilities representing that possibility. The vector and the probability in the embodiments of the present application may be collectively referred to as a logit; the vector is an unnormalized value, and the probability is a normalized value. For a classification problem, the dimension of the vector and the number of probabilities both equal the number of classes; for example, for a binary classification problem, the dimension of the vector is 2, and there are two probabilities.
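The relation between the unnormalized vector (logit) and the normalized probabilities is the standard softmax; for the binary case described above, a 2-dimensional logit vector yields two probabilities:

```python
import math

def softmax(logits):
    """Normalize an unnormalized logit vector into probabilities that
    sum to 1; subtracting the max keeps the exponentials stable."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```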
In the embodiments of the present application, the manner of obtaining the first prediction result of each training text based on the first network model is similar to the manner of obtaining the second prediction result of each training text based on the second network model; the following description therefore focuses on obtaining the second prediction result from the perspective of the second network model.
In a possible implementation manner, obtaining a recommended second prediction result of each training text based on a second network model includes: determining second attribute features of the attributes in any training text based on the second network model, wherein the second attribute features of any attribute are used for describing any attribute; performing feature fusion processing on the second attribute features of each attribute in any training text to obtain the second text features of any training text; and determining a second prediction result of any training text based on the second attribute characteristics of the attributes in any training text and the second text characteristics of any training text.
For any training text, the training text includes at least one attribute; an attribute may be called a feature field (Field), and an attribute may be a keyword extracted from the training text, or a sentence, a paragraph, etc. in the training text. The embodiments of this application do not limit the attributes of the training text; for example, the attributes of the training text include, but are not limited to, the title of the media information, the body of the media information, the author of the media information, the tag of the media information, the source of the training text, and the like. The tag of the media information indicates the category to which the media information belongs, such as food, advertisement, animal, or movie, and the source of the training text indicates whether the training text comes from the text part, the audio part, or the picture part of the media information.
For any one of the training texts, a second attribute feature of each attribute in the training text can be determined by using the coding layer of the second network model. Optionally, the coding layer of the second network model includes two kinds of coding tables. One kind is a low-dimensional coding table, which may be referred to as a Wide Embedding Table and is denoted W_{0S}. The other kind is a high-dimensional coding table, which may be referred to as a Quad Embedding Table and is denoted W_S.
Assume that any one training text includes m attributes, where m is a positive integer. The low-dimensional coding table in the coding layer of the second network model may project each attribute in the training text to a second low-dimensional attribute feature of that attribute. When the dimension of the second low-dimensional attribute feature is 1, the second low-dimensional attribute features of the m attributes can be denoted E_{0S} ∈ R^m, where E_{0S} is the collection of second low-dimensional attribute features and R is the set of real numbers. The high-dimensional coding table in the coding layer of the second network model may project each attribute in the training text to a second high-dimensional attribute feature of that attribute. When the dimension of the second high-dimensional attribute feature is d_S, the second high-dimensional attribute features of the m attributes can be denoted E_S ∈ R^{m×d_S}, where E_S is the collection of second high-dimensional attribute features. It should be noted that the second attribute features of each attribute include both the second low-dimensional attribute feature and the second high-dimensional attribute feature of that attribute.
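The two coding tables can be sketched as embedding lookup tables. A minimal NumPy sketch follows; the vocabulary size, attribute ids, and dimensions are illustrative assumptions, not values from the application:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, m, d_s = 100, 4, 8   # assumed vocabulary size, attribute count, feature dimension

W_0S = rng.normal(size=(vocab, 1))    # low-dimensional (wide) coding table
W_S = rng.normal(size=(vocab, d_s))   # high-dimensional coding table

attr_ids = np.array([3, 17, 42, 99])  # the m attribute ids of one training text
E_0S = W_0S[attr_ids].reshape(m)      # second low-dimensional attribute features, in R^m
E_S = W_S[attr_ids]                   # second high-dimensional attribute features, in R^(m x d_s)
```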
And then, the second network model performs feature fusion processing on the second attribute features of each attribute in any training text to obtain the second text features of any training text.
Optionally, performing feature fusion processing on the second attribute features of each attribute in any training text to obtain the second text feature of the training text includes at least one of the following: performing feature fusion processing on the second attribute features of each attribute in the training text based on a multilayer perceptron to obtain the second text feature of the training text; and performing feature cross processing on the second attribute features of each attribute in the training text to obtain the second text feature of the training text.
The multilayer perceptron layer of the second network model is a multilayer perceptron, i.e., a network structure formed by multiple fully connected layers stacked together. Feature fusion processing is performed on the second attribute features of each attribute in any training text based on the multilayer perceptron to obtain the second text feature of the training text.
Optionally, feature fusion processing is performed on the second high-dimensional attribute features of each attribute in any training text based on the multilayer perceptron to obtain the second text feature of the training text. Assume the multilayer perceptron has l layers in total (numbered 0 to l−1), and the network parameters of the i-th layer are W_S^{(i)} and b_S^{(i)}. The multilayer perceptron can then be represented as formula (1):

x^{(i+1)} = σ(W_S^{(i)} x^{(i)} + b_S^{(i)})    (1)

where x^{(i+1)} denotes the features of each attribute in the training text output by the i-th layer of the multilayer perceptron, x^{(i)} denotes the features of each attribute in the training text input to the i-th layer, and σ denotes an activation function. The input of layer 0 of the multilayer perceptron is the second high-dimensional attribute feature of each attribute in the training text, that is, x^{(0)} = E_S, where E_S denotes the second high-dimensional attribute features of the attributes in the training text. The output of layer l−1 of the multilayer perceptron is the second text feature of the training text, that is, x^{(l)} = D_S, where D_S is the second text feature of the training text and has dimension k_S.
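The multilayer-perceptron fusion can be sketched as below, assuming the per-attribute high-dimensional features are flattened before the first fully connected layer and ReLU is used as the activation (both are assumptions; the application does not fix these details):

```python
import numpy as np

def mlp_forward(E_S, weights, biases):
    # x^(0): the second high-dimensional attribute features, flattened.
    x = E_S.reshape(-1)
    for W, b in zip(weights, biases):
        x = np.maximum(0.0, W @ x + b)  # fully connected layer + ReLU
    return x  # D_S, the second text feature

rng = np.random.default_rng(1)
m, d_s, k_s = 4, 8, 16                   # illustrative dimensions
E_S = rng.normal(size=(m, d_s))
weights = [rng.normal(size=(32, m * d_s)), rng.normal(size=(k_s, 32))]
biases = [np.zeros(32), np.zeros(k_s)]
D_S = mlp_forward(E_S, weights, biases)  # has dimension k_S
```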
The feature interaction (Feature Interaction) layer of the second network model may perform feature cross processing on the second attribute features of each attribute in any training text to obtain the second text feature of the training text. The feature cross processing performs a Hadamard product operation on the second attribute features of every two attributes in the training text; the Hadamard product operation multiplies the corresponding elements of two vectors one by one (i.e., element-wise).
Optionally, the feature crossing layer of the second network model performs feature cross processing on the second high-dimensional attribute features of each attribute in any training text to obtain the second text feature of the training text, as shown in formula (2):

I_S = Σ_{i=1}^{m} Σ_{j=i+1}^{m} e_i^S ∘ e_j^S    (2)

where I_S is the second text feature of the training text, e_i^S is the second high-dimensional attribute feature of the i-th attribute in the training text, e_j^S is the second high-dimensional attribute feature of the j-th attribute, m is the number of attributes in the training text, and ∘ denotes the Hadamard product operation. Here e_i^S ∈ R^{d_S} and I_S ∈ R^{d_S}, where R is the set of real numbers and d_S is the feature dimension.
After the second attribute features of the attributes in any training text and the second text features of any training text are obtained, the splicing layer of the second network model splices the second attribute features of the attributes in any training text and the second text features of any training text to obtain the second splicing features of any training text.
Optionally, the second low-dimensional attribute features of each attribute in any training text are spliced with the second text features of the training text to obtain the second splicing feature of the training text. The second low-dimensional attribute features of each attribute in the training text are E_{0S}, the second text feature of the training text is I_S and/or D_S, and the second splicing feature of the training text is Z_S ∈ R^{m+d_S+k_S}, where R is the set of real numbers, m is the dimension of E_{0S}, d_S is the dimension of I_S, and k_S is the dimension of D_S. Optionally, Z_S = Concat(I_S, D_S, E_{0S}), where Concat denotes splicing (concatenation).
Then, the fully connected layer of the second network model determines a second prediction result of any training text based on the second splicing feature of the training text. Assume the weight and bias parameters of the fully connected layer of the second network model are W_{OS} and b_{OS}. The output of the fully connected layer of the second network model can then be represented as O_S = W_{OS} Z_S + b_{OS}, where Z_S is the second splicing feature of the training text. Optionally, O_S ∈ R^1, where R is the set of real numbers.
It should be noted that the output of the fully connected layer of the second network model may be used directly as the second prediction result of any training text, or the output of the fully connected layer may first be activated to obtain the second prediction result. For example, the sigmoid function may be used to activate the output of the fully connected layer of the second network model, in which case the second prediction result of any training text is σ_S = Sigmoid(O_S).
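The splicing layer, fully connected layer, and sigmoid activation can be sketched end to end; the dimensions and zero-valued parameters below are illustrative assumptions:

```python
import numpy as np

def predict(I_S, D_S, E_0S, W_O, b_O):
    # Splice the crossed feature, the perceptron feature, and the
    # low-dimensional attribute features, then apply the fully connected
    # layer and a sigmoid activation.
    Z_S = np.concatenate([I_S, D_S, E_0S])  # Z_S in R^(d_s + k_s + m)
    O_S = W_O @ Z_S + b_O                   # fully connected layer output
    return 1.0 / (1.0 + np.exp(-O_S))       # second prediction result in (0, 1)

d_s, k_s, m = 2, 3, 4
score = predict(np.zeros(d_s), np.zeros(k_s), np.zeros(m),
                np.zeros(d_s + k_s + m), 0.0)
```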
The following describes the content of obtaining the recommended first prediction result of each training text based on the first network model.
In one possible implementation manner, obtaining a recommended first prediction result of each training text based on the first network model includes: determining first attribute features of the attributes in any training text based on the first network model, wherein the first attribute features of any attribute are used for describing any attribute; performing feature fusion processing on the first attribute features of each attribute in any training text to obtain the first text features of any training text; and determining a first prediction result of any training text based on the first attribute features of the attributes in any training text and the first text features of any training text.
For any one of the training texts, a first attribute feature of each attribute in any one of the training texts may be determined using an encoding layer of the first network model. Optionally, the first attribute feature of each attribute includes a first low-dimensional attribute feature of each attribute and a first high-dimensional attribute feature of each attribute.
In a possible implementation manner, the feature fusion processing is performed on the first attribute feature of each attribute in any training text to obtain the first text feature of any training text, and the feature fusion processing includes at least one of the following: performing feature fusion processing on the first attribute features of each attribute in any training text based on a multilayer perceptron to obtain the first text features of any training text; and performing feature cross processing on the first attribute features of the attributes in any training text to obtain the first text features of any training text.
The first attribute features of each attribute in any training text can be subjected to feature fusion processing based on a multilayer perceptron layer of a first network model, and the first text features of any training text can be obtained. Optionally, the multi-layer perceptron layer based on the first network model performs feature fusion processing on the first high-dimensional attribute features of each attribute in any training text to obtain the first text feature of any training text.
The feature cross layer of the first network model may also be used to perform feature cross processing on the first attribute features of each attribute in any training text to obtain the first text features of any training text. Optionally, the feature crossing layer of the first network model performs feature crossing processing on the first high-dimensional attribute features of each attribute in any training text to obtain the first text feature of any training text.
And then, the splicing layer of the first network model splices the first attribute features of each attribute in any training text with the first text features of any training text to obtain the first splicing features of any training text. Optionally, the first low-dimensional attribute features of each attribute in any training text are spliced with the first text features of any training text by the splicing layer of the first network model, so as to obtain the first splicing features of any training text. Then, the fully-connected layer of the first network model determines a first prediction result of any training text based on the first splicing features of any training text.
In the embodiment of the application, the first network model and the second network model have the same model structure. Therefore, the manner of determining the first prediction result of any one of the training texts by the first network model is consistent with the manner of determining the second prediction result of any one of the training texts by the second network model, and the description about determining the second prediction result of any one of the training texts can be referred to above, and is not repeated here.
Step 202: for any training text pair, obtain a first comparison result of the first prediction results and a second comparison result of the second prediction results of the two training texts in the training text pair, where the first comparison result is used for representing the difference between the first prediction results, the second comparison result is used for representing the difference between the second prediction results, and each training text pair includes any two of the plurality of training texts.
Every two training texts in the plurality of training texts may constitute a training text pair. For any training text pair, based on the first prediction results of the two training texts in the any training text pair, determining the difference of the first prediction results between the two training texts in the any training text pair, namely determining the first comparison result of the any training text pair. Two ways of determining the first comparison result for any one training text pair, denoted as implementation a1 and implementation a2, respectively, are described below.
Implementation a1, obtaining a first comparison result of the first predicted results of the two training texts in any training text pair, includes: in response to the difference between the first predicted results of the two training texts in any one training text pair being smaller than the third reference value, determining a first comparison result of any one training text pair based on the third data; in response to the difference between the first predicted results of the two training texts in any one training text pair being not smaller than the third reference value and not larger than the fourth reference value, determining a first comparison result of any one training text pair based on fourth data, the third reference value being smaller than the fourth reference value, the third data being smaller than the fourth data; in response to a difference between the first predicted results of the two training texts in any one of the training text pairs being greater than a fourth reference value, a first comparison result of any one of the training text pairs is determined based on fifth data, the fourth data being less than the fifth data.
In this embodiment of the present application, a difference between first predicted results of two training texts in any training text pair may be calculated first, and a first comparison result of any training text pair may be determined based on the difference.
And when the difference between the first predicted results of the two training texts in any one training text pair is smaller than the third reference value, determining a first comparison result of any one training text pair based on the third data. For example, the third data is used as the first comparison result of any training text pair, or the third data is normalized to obtain the first comparison result of any training text pair. The normalization process is a process of normalizing data to [0,1 ].
In the embodiments of this application, the two training texts in any training text pair may be denoted i and j, the first prediction result of training text i may be denoted ŷ_1^{(i)}, and the first prediction result of training text j may be denoted ŷ_1^{(j)}. When ŷ_1^{(i)} − ŷ_1^{(j)} < −ε, then V_ij = −1. Here the third reference value is −ε (ε is an arbitrary positive number) and the third data is −1. V_ij may be used as the difference of the first prediction results between training text i and training text j, i.e., as the first comparison result of the training text pair. Alternatively, the first comparison result S_ij of the training text pair may be obtained according to S_ij = (V_ij + 1)/2.
And when the difference between the first predicted results of the two training texts in any one training text pair is not less than the third reference value and not more than the fourth reference value, determining the first comparison result of any one training text pair based on the fourth data. For example, the fourth data is used as the first comparison result of any training text pair, or the fourth data is normalized to obtain the first comparison result of any training text pair. Wherein the third reference value is smaller than the fourth reference value, and the third data is smaller than the fourth data.
In the embodiments of this application, the two training texts in any training text pair may be denoted i and j, the first prediction result of training text i may be denoted ŷ_1^{(i)}, and the first prediction result of training text j may be denoted ŷ_1^{(j)}. When −ε ≤ ŷ_1^{(i)} − ŷ_1^{(j)} ≤ ε, then V_ij = 0. Here the third reference value is −ε (ε is an arbitrary positive number), the fourth reference value is ε, and the fourth data is 0. V_ij may be used as the difference of the first prediction results between training text i and training text j, i.e., as the first comparison result of the training text pair. Alternatively, the first comparison result S_ij of the training text pair may be obtained according to S_ij = (V_ij + 1)/2.
When the difference between the first prediction results of the two training texts in any training text pair is greater than the fourth reference value, the first comparison result of any training text pair is determined based on the fifth data, for example, the fifth data is used as the first comparison result of any training text pair, or the fifth data is normalized to obtain the first comparison result of any training text pair. Wherein the fourth data is smaller than the fifth data.
In the embodiments of this application, the two training texts in any training text pair may be denoted i and j, the first prediction result of training text i may be denoted ŷ_1^{(i)}, and the first prediction result of training text j may be denoted ŷ_1^{(j)}. When ŷ_1^{(i)} − ŷ_1^{(j)} > ε, then V_ij = 1. Here the fourth reference value is ε (ε is an arbitrary positive number) and the fifth data is 1. V_ij may be used as the difference of the first prediction results between training text i and training text j, i.e., as the first comparison result of the training text pair. Alternatively, the first comparison result S_ij of the training text pair may be obtained according to S_ij = (V_ij + 1)/2.
It should be noted that the third reference value and the fourth reference value are used to measure whether the difference of the first prediction result between the two training texts is large enough. When the difference in the first prediction result between two training texts is smaller than the third reference value or larger than the fourth reference value, it indicates that the difference in the first prediction result between two training texts is large enough, and the two training texts can be regarded as a distinguishable text pair (i.e., positive sample). When the difference in the first prediction result between two training texts is not less than the third reference value and not greater than the fourth reference value, it indicates that the difference in the first prediction result between the two training texts is not large enough, and the two training texts may be regarded as an indistinguishable text pair (i.e., a negative sample). Therefore, the first comparison result of any training text pair carries the information that the training text pair is a positive sample or a negative sample, and the first comparison result of any training text pair is determined based on the third data, the fourth data or the fifth data, so that the first comparison result of any training text pair carries the ordering relationship between the two training texts. Since any training text pair may be a positive sample or a negative sample, the first comparison result of any training text pair carries the ordering relationship between the positive samples or between the negative samples.
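Implementation A1's three-way comparison can be sketched as follows, with ε = 0.1 as an illustrative choice (the application only requires ε to be an arbitrary positive number):

```python
def first_comparison(y1_i, y1_j, eps=0.1):
    # Compare the first prediction results of training texts i and j.
    diff = y1_i - y1_j
    if diff < -eps:        # smaller than the third reference value
        v = -1             # third data
    elif diff <= eps:      # between the third and fourth reference values
        v = 0              # fourth data (indistinguishable pair)
    else:                  # greater than the fourth reference value
        v = 1              # fifth data
    return (v + 1) / 2     # normalized first comparison result S_ij
```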
Since the first prediction result of the training text may be accurate or may not be accurate, the embodiment of the present application may further determine the first comparison result of any training text pair according to the labeling result of the training text, so as to modify the first comparison result in implementation a1 by using the labeling result of the training text, which can reduce transmission of wrong knowledge, thereby improving accuracy of the model. As shown in implementation a2 below.
Implementation a2, obtaining a first comparison result of the first predicted results of the two training texts in any training text pair, includes: obtaining labeling results of two training texts in any training text pair, wherein the labeling results are obtained through labeling and are used for representing whether the training texts are recommended or not; and determining a first comparison result of any training text pair based on the labeling results of the two training texts in any training text pair and the first prediction result of the two training texts in any training text pair.
In the embodiment of the application, the labeling result of each training text can be obtained, and the labeling result of any training text is used for representing whether any training text is recommended.
In one possible implementation, individual training texts are recommended to the sample user. For any training text, if the sample user clicks the training text, the labeling result of the training text is recommended, and if the sample user does not click the training text, the labeling result of the training text is not recommended.
In another possible implementation manner, for any training text, the click rate of the training text is determined based on the number of times the training text is clicked and the number of times the training text is displayed. If the click rate of the training text is greater than the click rate threshold value, the marking result of the training text is recommended, and if the click rate of the training text is not greater than the click rate threshold value, the marking result of the training text is not recommended.
Two different symbols can be used for representing that the marking result of the training text is recommended or not recommended. Illustratively, the numeral symbol 1 is used to indicate that the training text is labeled as recommended, and the numeral symbol 0 is used to indicate that the training text is labeled as not recommended.
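The click-rate labeling described above can be sketched as follows; the threshold value is an illustrative assumption:

```python
def label_from_clicks(clicks, impressions, ctr_threshold=0.05):
    # Returns 1 (recommended) when the click rate exceeds the threshold,
    # and 0 (not recommended) otherwise.
    ctr = clicks / impressions
    return 1 if ctr > ctr_threshold else 0
```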
By the method, the labeling result of each training text can be obtained, and the first comparison result of any training text pair is determined based on the labeling result of each of the two training texts in any training text pair and the first prediction result of each of the two training texts in any training text pair.
In one possible implementation manner, determining a first comparison result of any training text pair based on the labeling results of two training texts in any training text pair and the first prediction result of two training texts in any training text pair includes: in response to the fact that the labeling results of the two training texts in any training text pair are different, determining a first comparison result of any training text pair based on the target data; and in response to the labeling results of the two training texts in any one training text pair being the same, determining a first comparison result of any one training text pair based on a first prediction result of the two training texts in any one training text pair.
In this embodiment of the application, it may be determined whether the labeling results of the two training texts in any one training text pair are the same, and a first comparison result of any one training text pair is determined based on the determination result.
And when the labeling results of the two training texts in any one training text pair are different, determining a first comparison result of any one training text pair based on the target data.
Optionally, determining a first comparison result for any of the training text pairs based on the target data comprises: in response to the difference between the labeling results of the two training texts in any one training text pair being a first reference value, determining a first comparison result of any one training text pair based on the first data; and in response to the difference between the labeling results of the two training texts in any one training text pair being a second reference value, determining a first comparison result of any one training text pair based on second data, wherein the first reference value is greater than the second reference value, and the first data is greater than the second data.
In this embodiment of the present application, a difference between labeling results of two training texts in any training text pair may be calculated first, and a first comparison result of any training text pair may be determined based on the difference.
When the difference between the labeling results of the two training texts in any training text pair is the first reference value, determining the first comparison result of any training text pair based on the first data, for example, determining the first data as the first comparison result of any training text pair, or performing normalization processing on the first data to obtain the first comparison result of any training text pair.
In the embodiments of this application, the two training texts in any training text pair may be denoted i and j, the labeling result of training text i may be denoted Y^{(i)}, and the labeling result of training text j may be denoted Y^{(j)}. When Y^{(i)} − Y^{(j)} = 1, then V̄_ij = 1. Here the first reference value is 1 and the first data is 1. V̄_ij may be used as the difference of the first prediction results between training text i and training text j, i.e., as the first comparison result of the training text pair. Alternatively, normalization processing may be performed according to S_ij = (V̄_ij + 1)/2 to obtain the first comparison result S_ij of the training text pair.
When the difference between the labeling results of the two training texts in any training text pair is the second reference value, determining the first comparison result of any training text pair based on the second data, for example, determining the second data as the first comparison result of any training text pair, or performing normalization processing on the second data to obtain the first comparison result of any training text pair. The first reference value is larger than the second reference value, and the first data is larger than the second data.
In the embodiments of this application, the two training texts in any training text pair may be denoted i and j, the labeling result of training text i may be denoted Y^{(i)}, and the labeling result of training text j may be denoted Y^{(j)}. When Y^{(i)} − Y^{(j)} = −1, then V̄_ij = −1. Here the second reference value is −1 and the second data is −1. V̄_ij may be used as the difference of the first prediction results between training text i and training text j, i.e., as the first comparison result of the training text pair. Alternatively, normalization processing may be performed according to S_ij = (V̄_ij + 1)/2 to obtain the first comparison result S_ij of the training text pair.
And when the labeling results of the two training texts in any training text pair are the same, determining a first comparison result of any training text pair based on a first prediction result of the two training texts in any training text pair.
Optionally, when the labeling results of the two training texts in any one training text pair are the same, the first comparison result of any one training text pair is determined according to the implementation manner a 1. Please refer to the description of the implementation a1 in detail, which is not described herein.
In the embodiments of this application, the two training texts in any training text pair may be denoted i and j, the labeling result of training text i may be denoted Y^{(i)}, and the labeling result of training text j may be denoted Y^{(j)}. When Y^{(i)} − Y^{(j)} = 0, then V̄_ij = V_ij, where V_ij is the parameter determined according to implementation A1. V̄_ij may be used as the difference of the first prediction results between training text i and training text j, i.e., as the first comparison result of the training text pair. Alternatively, normalization processing may be performed according to S_ij = (V̄_ij + 1)/2 to obtain the first comparison result S_ij of the training text pair.
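Implementation A2, which corrects the prediction-based comparison with the labeling results, can be sketched by combining the cases above (ε = 0.1 is again an illustrative choice):

```python
def first_comparison_with_labels(y_i, y_j, y1_i, y1_j, eps=0.1):
    # y_i, y_j: labeling results (1 = recommended, 0 = not recommended);
    # y1_i, y1_j: first prediction results of the two training texts.
    if y_i - y_j == 1:       # first reference value -> first data
        v = 1
    elif y_i - y_j == -1:    # second reference value -> second data
        v = -1
    else:                    # identical labels: fall back to implementation A1
        diff = y1_i - y1_j
        v = -1 if diff < -eps else (0 if diff <= eps else 1)
    return (v + 1) / 2       # normalized first comparison result S_ij
```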
It can be understood that the first comparison result of any training text pair is determined based on one of the first data to the fifth data. Therefore, the first comparison result of any training text pair can be used as annotation information. Training the second network model with this annotation information can improve the performance and accuracy of the second network model.
In this embodiment of the application, a second comparison result of any training text pair may also be determined based on a second prediction result of two training texts in any training text pair, and the second comparison result is used to represent a difference between the second prediction results of the two training texts in any training text pair.
In one possible implementation manner, obtaining a second comparison result of second prediction results of two training texts in any training text pair includes: determining a target difference value of second prediction results of two training texts in any training text pair; a second comparison result for any of the training text pairs is determined based on the target difference value.
In the embodiments of this application, the difference between the second prediction results of the two training texts in any training text pair may be calculated first, and this difference is used as the target difference. Since a second prediction result is usually a small value, for example on the order of 0.1, the target difference is also small.
To improve the robustness of training and enhance the learning ability of the model, especially its ability to learn from training text pairs with S_ij = 1 (a training text pair with S_ij = 1 requires the difference between the first prediction results of its two training texts to be greater than the fourth reference value, and the first prediction results are typically small values), the embodiments of this application amplify the target difference with an amplification factor, where the amplification factor is a value in [0, 1].
Optionally, a second comparison result of any training text pair is determined according to the ratio of the target difference value to the amplification factor, as shown in the following formula (3), where the ratio is mapped into (0, 1) so that it can be used in the loss of formula (4).

P_ij = 1 / (1 + e^(−(σ_i^S − σ_j^S) / T))    (3)

Wherein P_ij reflects the difference of the second prediction results between the training text i and the training text j, i.e. the second comparison result of any one training text pair; σ_i^S is the second prediction result of the training text i; σ_j^S is the second prediction result of the training text j; and T ∈ [0, 1] is the amplification factor.
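As a minimal sketch (the function name is illustrative, not from the patent), the second comparison result of formula (3) can be computed by dividing the target difference by the amplification factor T and squashing it into (0, 1):

```python
import math

def pairwise_probability(score_i, score_j, t=0.5):
    """Second comparison result P_ij of formula (3): the target
    difference of the two second prediction results is amplified by
    dividing by the factor T in (0, 1], then mapped into (0, 1)."""
    target_diff = score_i - score_j  # target difference value
    return 1.0 / (1.0 + math.exp(-target_diff / t))
```

Because the second prediction results are small values (on the order of 0.1), a small T amplifies their difference: with T = 0.1, a difference of 0.1 is stretched to 1.0 before the mapping, pushing P_ij further from 0.5 than T = 1.0 would.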
It should be noted that, a training text pair may be referred to as a pair (pair), a first comparison result of the training text pair may be used as label information of the training text pair, and a second comparison result of the training text pair may be used as prediction information of the training text pair.
And step 203, adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model.
In the embodiment of the present application, a comparison result loss value is calculated based on the first comparison result of each training text pair and the second comparison result of each training text pair, as shown in the following formula (4). Wherein the comparison result loss value is used to characterize loss information between the first comparison result and the second comparison result.
L_σ = −∑_{i<j∈B} [S_ij·log(P_ij) + (1 − S_ij)·log(1 − P_ij)]    (4)

Wherein L_σ is the comparison result loss value, i and j respectively represent two training texts in any training text pair, S_ij is the first comparison result of any one training text pair, P_ij is the second comparison result of any one training text pair, and B represents the plurality of training texts (both i and j belong to B).
And after the comparison result loss value is calculated, the second network model is adjusted based on the comparison result loss value to obtain an adjusted second network model. If the adjusted second network model meets the training end condition (for example, a preset number of training iterations is reached), the adjusted second network model is the recommendation model. If the adjusted second network model does not meet the training end condition, the adjusted second network model is taken as the second network model for the next training, and the second network model continues to be adjusted in the manner of steps 201 to 203 until the recommendation model is obtained.
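A minimal sketch of the comparison result loss of formula (4). It assumes, for illustration only, that the first comparison result S_ij is derived from the teacher's prediction results by the same temperature-scaled mapping used for the student's P_ij; in the patent, S_ij can also be determined from the first to fifth data.

```python
import math

def comparison_loss(teacher_scores, student_scores, t=0.5):
    """Pairwise cross entropy of formula (4): the teacher's first
    comparison results S_ij act as labels, the student's second
    comparison results P_ij as predictions, summed over all i < j."""
    def prob(si, sj):
        return 1.0 / (1.0 + math.exp(-(si - sj) / t))

    loss, eps = 0.0, 1e-12  # eps guards log(0)
    n = len(teacher_scores)
    for i in range(n):
        for j in range(i + 1, n):
            s_ij = prob(teacher_scores[i], teacher_scores[j])
            p_ij = prob(student_scores[i], student_scores[j])
            loss -= s_ij * math.log(p_ij + eps) + (1 - s_ij) * math.log(1 - p_ij + eps)
    return loss
```

The loss is minimized when the student's pairwise comparisons match the teacher's, which is exactly what drives the adjustment of the second network model in step 203.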
In a possible implementation manner, the adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain the recommendation model includes: acquiring a labeling result of each training text; and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair, and the labeling result and the second prediction result of each training text pair to obtain a recommendation model.
The labeling result of each training text may be obtained, and the obtaining manner of the labeling result of any training text is described above and is not described herein again.
In the embodiment of the present application, a loss value between the labeling result and the second prediction result is determined based on the labeling result of each training text and the second prediction result of each training text, as shown in the following formula (5).
L_g = −[Y·log(σ_S) + (1 − Y)·log(1 − σ_S)]    (5)

Wherein L_g is the loss value between the labeling result and the second prediction result, Y is the labeling result of each training text, and σ_S is the second prediction result of each training text.
In the embodiment of the present application, the loss value of the comparison result is calculated based on the first comparison result and the second comparison result of each training text pair, and the calculation manner of the loss value of the comparison result is described above and is not described herein again.
And then, adjusting the second network model based on the loss value of the comparison result and the loss value between the labeling result and the second prediction result to obtain an adjusted second network model, and obtaining a recommendation model based on the adjusted second network model.
In a possible implementation manner, the adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain the recommendation model includes: determining a first attribute feature of each attribute in any training text based on the first network model, and determining a second attribute feature of each attribute in any training text based on the second network model; and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair and the first attribute characteristic and the second attribute characteristic of each attribute in each training text to obtain a recommendation model.
The determination of the first attribute feature of each attribute in any training text based on the first network model and the determination of the second attribute feature of each attribute in any training text based on the second network model have been described above, and are not described herein again.
The comparison result loss value is calculated based on the first comparison result of each training text pair and the second comparison result of each training text pair, and the calculation manner of the comparison result loss value has been described above, and is not described herein again.
An attribute feature loss value is calculated as a mean square error loss (also referred to as an L2 loss) based on the first attribute feature of each attribute in each training text and the second attribute feature of each attribute in each training text. Optionally, the attribute feature loss value is calculated based on the first high-dimensional attribute feature of each attribute in each training text and the second high-dimensional attribute feature of each attribute in each training text, as shown in the following formula (6). Wherein the attribute feature loss value is used to characterize a loss between the first attribute feature and the second attribute feature.
L_E = ‖W_E·E_S − E_T‖²    (6)

Wherein L_E represents the attribute feature loss value, E_S represents the second high-dimensional attribute feature of each attribute in each training text, E_T represents the first high-dimensional attribute feature of each attribute in each training text, and W_E is an alignment parameter used for aligning the dimension of E_S with that of E_T.
And then, adjusting the second network model based on the comparison result loss value and the attribute characteristic loss value to obtain an adjusted second network model, and obtaining a recommendation model based on the adjusted second network model.
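A minimal sketch of the attribute feature loss of formula (6) (function and parameter names are illustrative). The student feature E_S is first projected by the alignment parameter W_E into the teacher's dimension, then compared with E_T by mean squared error:

```python
def attribute_feature_loss(e_s, e_t, w):
    """Attribute feature (L2 / mean squared error) loss of formula (6):
    w is a len(e_t) x len(e_s) alignment matrix that maps the student's
    high-dimensional attribute feature e_s into the teacher's dimension."""
    projected = [sum(w[r][c] * e_s[c] for c in range(len(e_s)))
                 for r in range(len(e_t))]
    return sum((p, t) and (p - t) ** 2 for p, t in zip(projected, e_t)) / len(e_t)
```

The text feature loss values of formulas (7) and (8) follow the same pattern, with (D_S, D_T, W_D) and (I_S, I_T, W_I) in place of (E_S, E_T, W_E).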
In a possible implementation manner, the adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain the recommendation model includes: acquiring a first text feature and a second text feature of each training text; and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair and the first text characteristic and the second text characteristic of each training text to obtain a recommendation model.
The obtaining of the first text feature of each training text and the second text feature of each training text have been described above, and are not described herein again.
The comparison result loss value may be calculated based on the first comparison result of each training text pair and the second comparison result of each training text pair, and the calculation manner of the comparison result loss value has been described above, and is not described herein again.
And determining a text feature loss value according to the mean square error loss based on the first text feature of each training text and the second text feature of each training text. Wherein the text feature loss value is used to characterize a loss between the first text feature and the second text feature.
When the first text feature is obtained by performing feature fusion processing on the first attribute feature of each attribute in any training text based on the multi-layer perceptron, and the second text feature is obtained by performing feature fusion processing on the second attribute feature of each attribute in any training text based on the multi-layer perceptron, the text feature loss value is recorded as a text feature loss value corresponding to the multi-layer perceptron, and the text feature loss value corresponding to the multi-layer perceptron can be determined according to a mean square error loss mode shown in the following formula (7).
L_D = ‖W_D·D_S − D_T‖²    (7)

Wherein L_D is the text feature loss value corresponding to the multi-layer perceptron, D_T is the first text feature of each training text corresponding to the multi-layer perceptron, D_S is the second text feature of each training text corresponding to the multi-layer perceptron, and W_D is an alignment parameter used for aligning the dimension of D_S with that of D_T.
When the first text feature is obtained by performing feature intersection processing on the first attribute feature of each attribute in any training text, and the second text feature is obtained by performing feature intersection processing on the second attribute feature of each attribute in any training text, the text feature loss value is recorded as a text feature loss value corresponding to the feature intersection, and the text feature loss value corresponding to the feature intersection can be determined according to a mean square error loss manner shown in the following formula (8).
L_I = ‖W_I·I_S − I_T‖²    (8)

Wherein L_I is the text feature loss value corresponding to the feature intersection, I_S is the second text feature of each training text corresponding to the feature intersection, I_T is the first text feature of each training text corresponding to the feature intersection, and W_I is an alignment parameter used for aligning the dimension of I_S with that of I_T.
And then, adjusting the second network model based on the comparison result loss value and the text characteristic loss value (at least one of the text characteristic loss value corresponding to the multilayer perceptron and the text characteristic loss value corresponding to the characteristic intersection) to obtain an adjusted second network model, and obtaining a recommendation model based on the adjusted second network model.
It should be noted that the loss value of the second network model may be determined by using at least one of a loss value between the labeling result and the second prediction result, an attribute feature loss value, a text feature loss value corresponding to the multi-layer perceptron, a text feature loss value corresponding to the feature intersection, and a comparison result loss value. And adjusting the second network model based on the loss value of the second network model to obtain an adjusted second network model, and obtaining a recommendation model based on the adjusted second network model.
Alternatively, the loss value of the second network model is determined according to equation (9) shown below.
L_S = λ_F·(L_E + L_I + L_D) + λ_σ·L_σ + λ_g·L_g    (9)

Wherein L_S is the loss value of the second network model, L_E is the attribute feature loss value, L_I is the text feature loss value corresponding to the feature intersection, L_D is the text feature loss value corresponding to the multi-layer perceptron, L_σ is the comparison result loss value, and L_g is the loss value between the labeling result and the second prediction result. The attribute feature loss value, the text feature loss value corresponding to the multi-layer perceptron and the text feature loss value corresponding to the feature intersection all belong to the feature loss values, which are used for measuring the similarity between the intermediate hidden layers of the first network model and the second network model; λ_F is the weighting coefficient corresponding to the feature loss values, λ_σ is the weighting coefficient corresponding to the comparison result loss value, and λ_g is the weighting coefficient corresponding to the loss value between the labeling result and the second prediction result. The embodiments of the present application do not limit the magnitudes of λ_F, λ_σ and λ_g; illustratively, λ_F = λ_σ = 3.0 and λ_g = 1.0.
In a possible implementation manner, after predicting the first prediction result of each training text based on the first network model, the method further includes: acquiring a labeling result of each training text; and adjusting the first network model based on the labeling result and the first prediction result of each training text to obtain the adjusted first network model.
In the embodiment of the present application, a loss value between the labeling result and the first prediction result is determined based on the labeling result of each training text and the first prediction result of each training text, as shown in the following formula (10).
L_g^T = −[Y·log(σ_T) + (1 − Y)·log(1 − σ_T)]    (10)

Wherein L_g^T is the loss value between the labeling result and the first prediction result, Y is the labeling result of each training text, and σ_T is the first prediction result of each training text.
And then, the first network model is adjusted based on the loss value between the labeling result and the first prediction result to obtain the adjusted first network model. If the adjusted first network model meets the training end condition (for example, a preset number of training iterations is reached), the adjusted first network model can also be used as a recommendation model. If the adjusted first network model does not meet the training end condition, the adjusted first network model is taken as the first network model for the next training, and the first network model continues to be adjusted in the manner of steps 201 to 203 until the training end condition is met.
In the embodiment of the present application, the training mode of the first network model and the second network model is an asynchronous distillation training mode, the first network model may be used as a teacher model, and the second network model may be used as a student model. The asynchronous distillation training mode can avoid mutual interference of the teacher model and the student models, and the knowledge which is firstly and asynchronously learned by the teacher model is transferred to the student models during distillation training, so that the accuracy and the performance of the student models are improved.
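A minimal sketch of the asynchronous distillation schedule (plain callables stand in for the real training steps; all names are illustrative). The teacher pipeline runs first and stores its checkpoints; the student pipeline later trains against the latest stored teacher checkpoint, so the two data flows never block each other:

```python
def asynchronous_distillation(batches, train_teacher, train_student):
    """Teacher pipeline: routine training, checkpointed per batch.
    Student pipeline: distillation training against the newest teacher
    checkpoint, lagging behind in the time dimension."""
    teacher_ckpts, student_ckpts = [], []
    for batch in batches:                 # teacher pipeline runs first
        teacher_ckpts.append(train_teacher(batch))
    for batch in batches:                 # student pipeline lags behind
        latest_teacher = teacher_ckpts[-1]
        student_ckpts.append(train_student(batch, latest_teacher))
    return teacher_ckpts[-1], student_ckpts[-1]
```

In a real deployment the two loops would run as separate jobs sharing only the model libraries (Checkpoints), which is what makes the distillation asynchronous rather than lock-step.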
The first comparison result of any training text pair of the above method is used for representing the difference of the first prediction results of the two training texts in any training text pair, and the second comparison result of any training text pair is used for representing the difference of the second prediction results of the two training texts in any training text pair. The recommendation model is obtained based on the first comparison result and the second comparison result of each training text pair, so that the recommendation model can learn the difference between the first prediction result and the second prediction result of different training texts, and the prediction capability of the recommendation model is improved, so that the accuracy of the recommendation model is improved, and the recommendation accuracy of the media information is further improved.
Based on the foregoing implementation environment, an embodiment of the present application provides a media information recommendation method, which may be executed by the electronic device 11 in fig. 1. As shown in fig. 3, which is a flowchart of the media information recommendation method provided in the embodiment of the present application, the method includes steps 301 to 303.
Step 301, acquiring a plurality of media information, wherein the media information includes a target text.
In the embodiment of the present application, a plurality of pieces of media information may be acquired; any piece of media information includes at least one of audio, pictures and text, and the target text may be extracted from any piece of media information.
Step 302, obtaining recommendation results of the recommended target texts in each piece of media information based on the recommendation model.
The recommendation model is obtained based on the training method of the recommendation model provided by the embodiment.
In the embodiment of the application, a plurality of target texts are input into a recommendation model, and the recommendation result of each target text is predicted by the recommendation model. The recommendation result of any one target text may be a vector for characterizing the recommended possibility of any one target text, or may be a probability for characterizing the recommended possibility of any one target text.
In one possible implementation manner, obtaining recommendation results of recommendation of the target text in each piece of media information based on a recommendation model includes: for a target text in any piece of media information, determining attribute characteristics of each attribute in the target text based on a recommendation model, wherein the attribute characteristics of any attribute are used for describing any attribute; performing feature fusion processing on the attribute features of each attribute in the target text to obtain text features of the target text; and determining a recommendation result of the target text based on the attribute features of the attributes in the target text and the text features of the target text.
For the target text in any media information, the target text includes at least one attribute, and the attribute may be a keyword extracted from the target text, or a sentence, a paragraph, etc. in the target text. The attribute of the target text is not limited in the embodiments of the present application, and for example, the attribute of the target text includes, but is not limited to, a title of the media information, a body of the media information, an author of the media information, a tag of the media information, a source of the target text, and the like.
For any one target text, the encoding layer of the recommendation model can be utilized to determine the attribute characteristics of each attribute in any one target text. The determining mode of the attribute features of each attribute in any target text is the same as the determining mode of the second attribute features of each attribute in any training text, and is not described herein again.
And then, the recommendation model performs feature fusion processing on the attribute features of each attribute in any target text to obtain the text features of any target text. The determination method of the text feature of any target text is the same as the determination method of the second text feature of any training text, and is not repeated here.
After the attribute features of each attribute in any target text and the text features of any target text are obtained, the recommendation model splices the attribute features of each attribute in any target text and the text features of any target text to obtain the splicing features of any target text. The determination method of the splicing feature of any target text is the same as the determination method of the second splicing feature of any training text, and is not repeated here.
And then, the recommendation model determines the recommendation result of any target text based on the splicing characteristics of any target text. The determination method of the recommendation result of any target text is the same as the determination method of the second prediction result of any training text, and is not repeated here.
Step 303, determining the media information to be recommended meeting the recommendation condition from the plurality of media information based on the recommendation result of the target text in each piece of media information, and recommending the media information to be recommended.
In the embodiment of the application, for the recommendation result of the target text in any piece of media information, if the recommendation result is greater than a recommendation threshold, it is determined that any piece of media information is to-be-recommended media information meeting the recommendation condition, and if the recommendation result is not greater than the recommendation threshold, it is determined that any piece of media information is not to-be-recommended media information meeting the recommendation condition. In this way, the media information to be recommended meeting the recommendation condition is determined from the plurality of media information, and the media information to be recommended is sent to the client, so that the client recommends the media information to be recommended, namely the client displays the recommended media information.
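Step 303 can be sketched as a simple threshold filter (names are illustrative; in practice the scores come from the recommendation model of step 302):

```python
def select_media_to_recommend(media_scores, threshold=0.5):
    """Keep only the media information whose target-text recommendation
    result exceeds the recommendation threshold; the survivors are the
    media information to be recommended, sent on to the client."""
    return [media for media, score in media_scores if score > threshold]
```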
It should be noted that, in addition to obtaining one recommendation model (i.e., the recommendation model in step 302) after adjusting the second network model, another recommendation model can be obtained after adjusting the first network model. For the sake of convenience, the recommendation model obtained by adjusting the first network model is referred to as a first recommendation model, and the recommendation model obtained by adjusting the second network model is referred to as a second recommendation model (i.e., the recommendation model in step 302).
Optionally, the number of the to-be-recommended media information is multiple, after the to-be-recommended media information meeting the recommendation condition is determined, the recommendation result of the target text in each piece of the to-be-recommended media information is predicted based on the first recommendation model, and the target media information meeting the screening condition is determined from the multiple pieces of to-be-recommended media information based on the recommendation result of the target text in each piece of the to-be-recommended media information. If the recommendation result of the target text in any piece of to-be-recommended media information is greater than the screening threshold, determining that any piece of to-be-recommended media information is the target media information meeting the screening condition, and if the recommendation result of the target text in any piece of to-be-recommended media information is not greater than the screening threshold, determining that any piece of to-be-recommended media information is not the target media information meeting the screening condition.
It can be understood that the model structures of the first recommendation model and the second recommendation model are the same, and therefore, the manner of predicting the recommendation result of the target text in each piece of media information to be recommended based on the first recommendation model is similar to the manner of step 302, and the description of step 302 is given, and is not repeated here.
In the embodiment of the application, the second recommendation model is used for determining the media information to be recommended from the plurality of media information, the first recommendation model is used for determining the target media information from the plurality of media information to be recommended, and the accuracy of the media information to be recommended can be improved based on the principle of coarse screening and fine screening.
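The coarse-then-fine principle can be sketched as follows (both models are stand-ins here: the second/student recommendation model is a list of precomputed scores, the first/teacher recommendation model a callable; thresholds and names are illustrative):

```python
def two_stage_recommend(student_scored_media, score_with_teacher,
                        recommend_threshold=0.5, filter_threshold=0.7):
    """Coarse screening with the (smaller) second recommendation model,
    then fine screening of the survivors with the (larger) first
    recommendation model, per the coarse-ranking/fine-ranking split."""
    coarse = [m for m, s in student_scored_media if s > recommend_threshold]
    return [m for m in coarse if score_with_teacher(m) > filter_threshold]
```

Because the fine model only re-scores the small set that survives coarse screening, the expensive teacher-sized model never has to score the full candidate pool.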
The recommendation model of the method is obtained based on the first comparison result and the second comparison result of each training text pair, the first comparison result of any training text pair is used for representing the difference of the first prediction results of the two training texts in any training text pair, and the second comparison result of any training text pair is used for representing the difference of the second prediction results of the two training texts in any training text pair. The recommendation model can learn the difference between the first prediction result and the second prediction result of different training texts, and the prediction capability of the recommendation model is improved, so that the accuracy of the recommendation model is improved, and the recommendation accuracy of the media information is further improved.
The above method steps illustrate the training method of the recommendation model and the media information recommendation method in the embodiment of the present application, which are described in detail below with reference to a scenario. In a scenario of the embodiment of the present application (e.g., a subscription-number image-text recommendation scenario), the teacher model corresponds to the above-mentioned first network model, and the student model corresponds to the above-mentioned second network model. The encoding layer of the Teacher model is a DeepFM model with dimension 128, while the encoding layer of the Student model is a DeepFM model with dimension 16. The dimensions of the hidden units in the MLP layers of the teacher model are 512, 256, 128 and 64, respectively, while the dimensions of the hidden units in the MLP layers of the student model are 128, 64, 32 and 16, respectively. The teacher model serves fine ranking and the student model serves coarse ranking, that is, the above-mentioned "first determine the media information to be recommended from the plurality of media information by using the second recommendation model, and then determine the target media information from the plurality of media information to be recommended by using the first recommendation model".
Referring to fig. 4, fig. 4 is a schematic diagram of an asynchronous distillation provided in an embodiment of the present application. Asynchronous distillation comprises two asynchronous pipelines (Pipeline), namely a teacher Pipeline and a student Pipeline, wherein the teacher Pipeline is used for routine training of a teacher model, and the student Pipeline is used for distillation training of a student model. As can be seen from FIG. 4, the data flows of the teacher pipeline and the student pipeline are independent of each other, and the data flow of the teacher pipeline precedes the data flow of the student pipeline in the time dimension.
For the teacher pipeline, training texts are extracted from the database, teacher models are extracted from a teacher model library (the model library is also called Checkpoint), the teacher models are trained by the training texts to update the teacher models, and the updated teacher models are stored in the teacher model library.
And aiming at the student pipeline, extracting the training text from the database, extracting the teacher model from the teacher model library, and extracting the student model from the student model library. And training the student models by using the training texts and the teacher model to update the student models, and storing the updated student models in a student model library.
The mechanism of asynchronous distillation may cause the update of the student model to lag (e.g., by tens of minutes) behind the update of the teacher model. In the scenario of the embodiments of the present application, the impact of such a delay is negligible. Compared with a synchronous distillation mechanism, the student model trained by the asynchronous distillation mechanism has stronger fitting capability, can serve coarse ranking more effectively, and improves the recommendation effect.
Referring to fig. 5, fig. 5 is a schematic diagram of an asynchronous distillation training process according to an embodiment of the present application. As can be seen from FIG. 5, the model structures of the teacher model and the student model are completely the same, and only the difference of the model parameters exists between the teacher model and the student model, so that the isomorphic design of the model is beneficial to the effect of fitting the teacher model by the student model, and the accuracy of the student model is improved. In addition, the teacher model and the student model are independent from each other, and the coding layer, the characteristic cross layer, the multilayer perceptron layer, the splicing layer, the full-connection layer and the like of the teacher model and the student model are independent from each other.
In the embodiment of the present application, the training text includes a plurality of attributes, which are attribute 1, attribute 2, attribute 3, attribute …, and attribute m (m is a positive integer). On one hand, the training text can be input into the teacher model for training the teacher model, and on the other hand, the training text can also be input into the student model for training the student model by combining the teacher model. The teacher model and the student model have the same model structure, so that the teacher model processes the training texts in the same way as the student model processes the training texts. From the perspective of the student model, the processing mode of the student model on the training text is explained below.
After the training text is input into the student model, it is processed by the encoding layer of the student model, which comprises a deep encoding table and a wide encoding table. The deep encoding table is used to encode attribute 1, attribute 2, attribute 3, attribute … and attribute m in the training text into high-dimensional attribute features (corresponding to the second high-dimensional attribute features above), and the wide encoding table is used to encode them into low-dimensional attribute features (corresponding to the second low-dimensional attribute features above). The high-dimensional attribute features are passed through the multi-layer perceptron to obtain one text feature, and are subjected to feature intersection to obtain another text feature. The two text features (corresponding to the second text features above) are then spliced with the low-dimensional attribute features to obtain a splicing feature (corresponding to the second splicing feature above); the splicing feature is input into the fully-connected layer, and the fully-connected layer outputs a prediction result (corresponding to the second prediction result above), where the prediction result is the prediction result of the training text and is used to characterize the recommended possibility of the training text. In this way, the student model can output the prediction result of the training text. Based on the same principle, the teacher model can also output its prediction result of the training text (corresponding to the first prediction result above), which is not described in detail herein.
And for the teacher model, obtaining the labeling result of the training text, wherein the labeling result of the training text is used for representing whether the training text is recommended or not. Based on the labeling result of the training text and the prediction result of the training text output by the teacher model, a loss value (corresponding to the loss value between the labeling result and the first prediction result in the above text) is determined. The loss value is a loss value of the teacher model and is used for adjusting the teacher model.
For the student model, loss value 1 (corresponding to the comparison result loss value above) is determined based on the prediction result of the training text output by the teacher model and the prediction result of the training text output by the student model. The labeling result of the training text is obtained, and loss value 2 (corresponding to the loss value between the labeling result and the second prediction result above) is determined based on the labeling result and the prediction result output by the student model. Loss value 3 (corresponding to the text feature loss value for feature crossing above) is determined based on the text feature obtained by feature crossing of the high-dimensional attribute features in the teacher model and the corresponding text feature in the student model. Loss value 4 (corresponding to the text feature loss value for the multilayer perceptron above) is determined based on the text feature obtained after the high-dimensional attribute features pass through the multilayer perceptron in the teacher model and the corresponding text feature in the student model. Loss value 5 (corresponding to the attribute feature loss value above) is determined based on the high-dimensional attribute features in the teacher model and those in the student model. The loss value of the student model is determined based on loss values 1 to 5 and is used to adjust the student model.
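A minimal sketch of combining the five loss terms, assuming mean-squared errors and an illustrative weighting (the application fixes neither the loss forms nor the weights):

```python
# Assumed elementwise losses for the sketch.
def mse(x, y):
    return (x - y) ** 2

def l2(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def student_loss(teacher_pred, student_pred, label,
                 t_cross, s_cross, t_mlp, s_mlp, t_emb, s_emb,
                 w=(1.0, 1.0, 0.5, 0.5, 0.5)):
    loss1 = mse(teacher_pred, student_pred)  # comparison-result loss
    loss2 = mse(label, student_pred)         # labeling-result loss
    loss3 = l2(t_cross, s_cross)             # feature-cross text-feature loss
    loss4 = l2(t_mlp, s_mlp)                 # multilayer-perceptron text-feature loss
    loss5 = l2(t_emb, s_emb)                 # attribute-feature loss
    return sum(wi * li for wi, li in zip(w, (loss1, loss2, loss3, loss4, loss5)))

total = student_loss(0.8, 0.6, 1.0, [0.2], [0.1], [0.3], [0.3], [0.5], [0.4])
```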
In the embodiment of the application, the teacher model and the student model can be adjusted multiple times until the training end condition is met. At this point, both the teacher model and the student model can serve as recommendation models. The student model can be used to determine the recommendation result of the target text in each piece of media information, and the media information to be recommended that meets the recommendation condition is determined from the plurality of media information based on those recommendation results. The media information to be recommended is sent to the client so that the client recommends it; that is, the media information is recommended through the client.
Referring to fig. 6, fig. 6 is a schematic diagram of a display page provided in an embodiment of the present application. The display page belongs to a subscription-number image-text recommendation scenario, in which the page shows recommended media information; the recommended media information includes but is not limited to at least one of a picture, a text, and a video, and may be advertisement information. The display page includes an identifier of the power of the electronic device (as shown by reference numeral 601), an identifier of the connected wireless signal (as shown by reference numeral 602), an identifier of the connected base station signal (as shown by reference numeral 603), and the time "11:45". The display page is the display page of "subscription number messages"; the left side of the page includes a return control (as shown by reference numeral 604), and clicking the return control returns to the previous display page. The display page includes four pieces of recommended media information.
The first media information is video media information that includes, but is not limited to, the title "[Modeling Algorithm] Monte Carlo simulation algorithm", the author "author 1", a play control (as shown by reference numeral 605), and a delete control (as shown by reference numeral 606). The second media information is multimedia information containing a picture and text, including but not limited to the title "Interviewer: Question A", the author "author 2", a picture (as shown by reference numeral 607), and a delete control. The third media information is also multimedia information containing a picture and text, including but not limited to the title "Work: Question B", the author "author 3", a picture (as shown by reference numeral 608), and a delete control. The fourth media information is video media information including, but not limited to, the title "[Tools] interface management tools", the author "author 4", a play control, and a delete control.
In the embodiment of the application, an offline experiment and an online experiment are performed for the coarse-ranking stage of the subscription-number image-text recommendation scenario.
For the offline experiment, more than 200 million data records from a 10-hour window of one day are obtained as training texts, and more than 2 million records from the hour following that window are obtained as test texts.
In the embodiment of the application, three models are trained with the training texts. The first model is obtained by conventional training on the training texts and is denoted DFM_16_NORM; its coding layer is a 16-dimensional DeepFM model. The second model is obtained by asynchronous distillation training of a student model using the training method of the recommendation model provided in the embodiment of the application, and is denoted DFM_16_ASYNC_DISTILL; its coding layer is also a 16-dimensional DeepFM model. The third model is obtained by synchronous distillation training of a student model and is denoted DFM_16_SYNC_DISTILL; its coding layer is also a 16-dimensional DeepFM model.
It should be noted that in asynchronous distillation, a teacher model whose coding layer is a 128-dimensional DeepFM model is used for distillation training of the student model, and the loss value obtained from the labeling result of the training text and the prediction result output by the teacher model is used only to adjust the teacher model. In synchronous distillation, a teacher model whose coding layer is a 128-dimensional DeepFM model is likewise used for distillation training of the student model, but the loss value obtained from the labeling result and the teacher model's prediction result is used to adjust the student model in addition to the teacher model; that is, the teacher model and the student model are trained simultaneously.
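The distinction reduces to which loss terms reach the student; a hypothetical sketch (the names below are illustrative, not from this application):

```python
# Asynchronous distillation: the teacher's label loss adjusts only the
# (pre-trained) teacher. Synchronous distillation: it also adjusts the
# student, since both models train together.
def losses_applied_to_student(mode):
    common = ["distill_loss", "student_label_loss", "feature_losses"]
    if mode == "sync":
        return common + ["teacher_label_loss"]
    return common  # "async": teacher trained separately beforehand

assert "teacher_label_loss" in losses_applied_to_student("sync")
assert "teacher_label_loss" not in losses_applied_to_student("async")
```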
In the embodiment of the present application, a plurality of experiments are performed with DFM_16_NORM, DFM_16_ASYNC_DISTILL, and DFM_16_SYNC_DISTILL, respectively, and the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve is determined from the experimental results, yielding the results shown in Table 1 below.
TABLE 1
Model name AUC
DFM_16_NORM 0.717969
DFM_16_SYNC_DISTILL 0.721443
DFM_16_ASYNC_DISTILL 0.727317
As can be seen from Table 1, DFM_16_ASYNC_DISTILL, obtained by asynchronous distillation training, performs best. Compared with the conventionally trained DFM_16_NORM, the AUC of DFM_16_ASYNC_DISTILL increases by 1.3%. The AUC of DFM_16_ASYNC_DISTILL is higher than that of DFM_16_SYNC_DISTILL obtained by synchronous distillation training, and the AUC of DFM_16_SYNC_DISTILL is in turn higher than that of DFM_16_NORM. Therefore, both synchronous and asynchronous distillation training improve the fitting capability of the model. Because the teacher model in asynchronous distillation training is a "stronger instructor", the fitting capability of the model obtained by asynchronous distillation is higher than that of the model obtained by synchronous distillation.
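The AUC reported in Table 1 can be read as the probability that a randomly chosen positive sample is scored above a randomly chosen negative one; a minimal pairwise estimator:

```python
# Pairwise AUC estimate: fraction of (positive, negative) pairs ranked
# correctly, counting ties as half a win.
def auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

assert auc([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3]) == 1.0  # perfect ranking
```

On this scale, the 0.717969 → 0.727317 change in Table 1 corresponds to the roughly 1.3% relative AUC gain cited above.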
For the online experiment, the index used is the Click-Through Rate (CTR). Comparing DFM_16_SYNC_DISTILL with DFM_16_NORM over 7 days of continuous observation, the CTR of DFM_16_SYNC_DISTILL rises by 2% relative to DFM_16_NORM, verifying that the recommendation model obtained by the training method provided in the embodiment of the present application has more accurate recommendation capability.
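CTR is clicks over impressions, and the reported 2% rise is a relative lift. With made-up counts (purely illustrative, not experimental data):

```python
# Relative CTR lift between a baseline model and a distilled model.
def ctr(clicks, impressions):
    return clicks / impressions

baseline = ctr(5000, 100000)   # hypothetical DFM_16_NORM-style CTR
distilled = ctr(5100, 100000)  # hypothetical distilled-model CTR
lift = (distilled - baseline) / baseline
assert abs(lift - 0.02) < 1e-9  # a 2% relative lift
```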
Referring to fig. 7, fig. 7 is a schematic structural diagram of a training apparatus for a recommendation model according to an embodiment of the present application, and as shown in fig. 7, the apparatus includes:
an obtaining module 701, configured to obtain, based on a first network model, a first prediction result of whether each training text is recommended, and to obtain, based on a second network model, a second prediction result of whether each training text is recommended, where the model parameters of the first network model and the second network model are different;
the obtaining module 701 is further configured to obtain, for any training text pair, a first comparison result of the first prediction results and a second comparison result of the second prediction results of the two training texts in the pair, where the first comparison result represents the difference between the first prediction results, the second comparison result represents the difference between the second prediction results, and each training text pair includes any two of the training texts;
and an adjustment module 702, configured to adjust the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model.
In a possible implementation manner, the obtaining module 701 is configured to obtain the labeling results of the two training texts in any training text pair, where a labeling result is obtained by labeling and represents whether the training text is recommended; and to determine the first comparison result of the training text pair based on the labeling results and the first prediction results of the two training texts in the pair.
In a possible implementation manner, the obtaining module 701 is configured to determine, in response to the labeling results of the two training texts in any training text pair being different, the first comparison result of the pair based on target data; and, in response to the labeling results being the same, to determine the first comparison result of the pair based on the first prediction results of the two training texts in the pair.
In a possible implementation manner, the obtaining module 701 is configured to determine, in response to a difference between labeling results of two training texts in any one training text pair being a first reference value, a first comparison result of any one training text pair based on first data; and in response to the difference between the labeling results of the two training texts in any one training text pair being a second reference value, determining a first comparison result of any one training text pair based on second data, wherein the first reference value is greater than the second reference value, and the first data is greater than the second data.
In a possible implementation manner, the obtaining module 701 is configured to determine, in response to the difference between the first prediction results of the two training texts in any training text pair being smaller than a third reference value, the first comparison result of the pair based on third data; in response to the difference being not smaller than the third reference value and not larger than a fourth reference value, to determine the first comparison result based on fourth data, where the third reference value is smaller than the fourth reference value and the third data is smaller than the fourth data; and in response to the difference being larger than the fourth reference value, to determine the first comparison result based on fifth data, where the fourth data is smaller than the fifth data.
In a possible implementation manner, the obtaining module 701 is configured to determine a target difference value of the second prediction results of the two training texts in any training text pair, and to determine the second comparison result of the pair based on the target difference value.
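The rules in the last few implementations can be sketched as one piecewise function. Every reference value and data constant below is an illustrative assumption; the application only constrains their ordering:

```python
# Assumed constants: first data > second data; third < fourth reference
# value; third data < fourth data < fifth data.
FIRST_DATA, SECOND_DATA = 1.0, -1.0
THIRD_REF, FOURTH_REF = 0.1, 0.4
THIRD_DATA, FOURTH_DATA, FIFTH_DATA = 0.0, 0.5, 1.0

def first_comparison(label_a, label_b, pred_a, pred_b):
    if label_a != label_b:  # labels differ: pick target data by the sign
        return FIRST_DATA if (label_a - label_b) > 0 else SECOND_DATA
    diff = abs(pred_a - pred_b)  # labels equal: bucket the prediction gap
    if diff < THIRD_REF:
        return THIRD_DATA
    if diff <= FOURTH_REF:
        return FOURTH_DATA
    return FIFTH_DATA

def second_comparison(pred_a, pred_b):
    return pred_a - pred_b  # the target difference value

assert first_comparison(1, 0, 0.9, 0.2) == FIRST_DATA
assert first_comparison(1, 1, 0.9, 0.85) == THIRD_DATA
```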
In a possible implementation manner, the obtaining module 701 is configured to determine, based on the first network model, a first attribute feature of each attribute in any training text, where the first attribute feature of an attribute is used to describe that attribute; to perform feature fusion processing on the first attribute features of the attributes in the training text to obtain the first text feature of the training text; and to determine the first prediction result of the training text based on the first attribute features of the attributes and the first text feature of the training text.
In one possible implementation, the obtaining module 701 is configured to perform at least one of the following:
performing feature fusion processing on the first attribute features of each attribute in any training text based on a multilayer perceptron to obtain the first text features of any training text; and performing feature cross processing on the first attribute features of the attributes in any training text to obtain the first text features of any training text.
In a possible implementation manner, the obtaining module 701 is configured to determine, based on the second network model, a second attribute feature of each attribute in any training text, where the second attribute feature of an attribute is used to describe that attribute; to perform feature fusion processing on the second attribute features of the attributes in the training text to obtain the second text feature of the training text; and to determine the second prediction result of the training text based on the second attribute features of the attributes and the second text feature of the training text.
In one possible implementation, the adjustment module 702 is configured to obtain the labeling result of each training text, and to adjust the second network model based on the first comparison result and the second comparison result of each training text pair, and the labeling result and the second prediction result of each training text, so as to obtain the recommendation model.
In a possible implementation manner, the adjustment module 702 is configured to determine the first attribute feature of each attribute in any training text based on the first network model and the second attribute feature of each attribute based on the second network model; and to adjust the second network model based on the first comparison result and the second comparison result of each training text pair and the first and second attribute features of each attribute in each training text, so as to obtain the recommendation model.
In one possible implementation, the adjustment module 702 is configured to obtain the first text feature and the second text feature of each training text, and to adjust the second network model based on the first comparison result and the second comparison result of each training text pair and the first text feature and the second text feature of each training text, so as to obtain the recommendation model.
In a possible implementation manner, the obtaining module 701 is further configured to obtain a labeling result of each training text;
the adjustment model 702 is further configured to adjust the first network model based on the labeling result and the first prediction result of each training text, so as to obtain an adjusted first network model.
The first comparison result of any training text pair in the device is used for representing the difference of the first prediction results of the two training texts in any training text pair, and the second comparison result of any training text pair is used for representing the difference of the second prediction results of the two training texts in any training text pair. The recommendation model is obtained based on the first comparison result and the second comparison result of each training text pair, so that the recommendation model can learn the difference between the first prediction result and the second prediction result of different training texts, and the prediction capability of the recommendation model is improved, so that the accuracy of the recommendation model is improved, and the recommendation accuracy of the media information is further improved.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a media information recommendation device according to an embodiment of the present application, and as shown in fig. 8, the device includes:
an obtaining module 801, configured to obtain multiple pieces of media information, where the media information includes a target text;
the obtaining module 801 is further configured to obtain recommendation results recommended by the target texts in each piece of media information based on a recommendation model, where the recommendation model is obtained based on any one of the above training methods of the recommendation model;
and a recommendation module 802, configured to determine, from the plurality of pieces of media information, media information to be recommended that meets the recommendation condition based on the recommendation results of the target texts in the pieces of media information, and to recommend the media information to be recommended.
In a possible implementation manner, the obtaining module 801 is configured to determine, for a target text in any piece of media information, attribute features of each attribute in the target text based on a recommendation model, where the attribute features of any attribute are used to describe any attribute; performing feature fusion processing on the attribute features of each attribute in the target text to obtain text features of the target text; and determining a recommendation result of the target text based on the attribute features of the attributes in the target text and the text features of the target text.
The recommendation model of the device is obtained based on the first comparison result and the second comparison result of each training text pair, the first comparison result of any training text pair is used for representing the difference of the first prediction results of the two training texts in any training text pair, and the second comparison result of any training text pair is used for representing the difference of the second prediction results of the two training texts in any training text pair. The recommendation model can learn the difference between the first prediction result and the second prediction result of different training texts, and the prediction capability of the recommendation model is improved, so that the accuracy of the recommendation model is improved, and the recommendation accuracy of the media information is further improved.
It should be understood that, when the apparatuses provided in fig. 7 and 8 implement their functions, the division of the functional modules is merely illustrated, and in practical applications, the above functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus and method embodiments provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments for details, which are not described herein again.
Fig. 9 shows a block diagram of a terminal device 900 according to an exemplary embodiment of the present application. The terminal device 900 may be a portable mobile terminal such as: a smartphone, a tablet, a laptop, or a desktop computer. Terminal device 900 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so on.
In general, terminal device 900 includes: a processor 901 and a memory 902.
Processor 901 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so forth. The processor 901 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 901 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 901 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed by the display screen. In some embodiments, the processor 901 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 902 may include one or more computer-readable storage media, which may be non-transitory. The memory 902 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 902 is used to store at least one instruction for execution by the processor 901 to implement a method of training a recommendation model or a method of media information recommendation provided by method embodiments herein.
In some embodiments, the terminal device 900 may further include: a peripheral interface 903 and at least one peripheral. The processor 901, memory 902, and peripheral interface 903 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 903 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 904, display screen 905, camera assembly 906, audio circuitry 907, and power supply 908.
The peripheral interface 903 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 901 and the memory 902. In some embodiments, the processor 901, memory 902, and peripheral interface 903 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 901, the memory 902 and the peripheral interface 903 may be implemented on a separate chip or circuit board, which is not limited by this embodiment.
The Radio Frequency circuit 904 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 904 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 904 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 904 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 904 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, generations of mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 904 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 905 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 905 is a touch display screen, the display screen 905 also has the ability to capture touch signals on or over its surface. The touch signal may be input to the processor 901 as a control signal for processing. At this point, the display screen 905 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 905, disposed on the front panel of the terminal device 900; in other embodiments, there may be at least two display screens 905, respectively disposed on different surfaces of the terminal device 900 or in a folding design; in still other embodiments, the display screen 905 may be a flexible display, disposed on a curved or folded surface of the terminal device 900. The display screen 905 may even be arranged in a non-rectangular irregular figure, i.e. a shaped screen. The display screen 905 can be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 906 is used to capture images or video. Optionally, camera assembly 906 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 906 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
Audio circuit 907 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 901 for processing, or inputting the electric signals to the radio frequency circuit 904 for realizing voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different positions of the terminal apparatus 900. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 901 or the radio frequency circuit 904 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuit 907 may also include a headphone jack.
Power supply 908 is used to provide power to various components within terminal device 900. The power source 908 may be alternating current, direct current, disposable or rechargeable. When the power source 908 comprises a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal device 900 also includes one or more sensors 909. The one or more sensors 909 include, but are not limited to: an acceleration sensor 911, a gyro sensor 912, a pressure sensor 913, an optical sensor 914, and a proximity sensor 915.
The acceleration sensor 911 can detect the magnitude of acceleration in three coordinate axes of the coordinate system established with the terminal apparatus 900. For example, the acceleration sensor 911 may be used to detect the components of the gravitational acceleration in three coordinate axes. The processor 901 can control the display screen 905 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 911. The acceleration sensor 911 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 912 can detect the body direction and the rotation angle of the terminal device 900, and the gyro sensor 912 and the acceleration sensor 911 cooperate to acquire the 3D motion of the user on the terminal device 900. The processor 901 can implement the following functions according to the data collected by the gyro sensor 912: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 913 may be disposed on a side bezel of the terminal device 900 and/or underneath the display 905. When the pressure sensor 913 is disposed on the side frame of the terminal device 900, the holding signal of the terminal device 900 from the user can be detected, and the processor 901 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 913. When the pressure sensor 913 is disposed at a lower layer of the display screen 905, the processor 901 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 905. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The optical sensor 914 is used to collect the ambient light intensity. In one embodiment, the processor 901 may control the display brightness of the display 905 according to the ambient light intensity collected by the optical sensor 914. Specifically, when the ambient light intensity is high, the display brightness of the display screen 905 is increased; when the ambient light intensity is low, the display brightness of the display screen 905 is reduced. In another embodiment, the processor 901 may also dynamically adjust the shooting parameters of the camera assembly 906 according to the ambient light intensity collected by the optical sensor 914.
A proximity sensor 915, also called a distance sensor, is generally provided on the front panel of the terminal apparatus 900. The proximity sensor 915 is used to collect the distance between the user and the front surface of the terminal device 900. In one embodiment, when the proximity sensor 915 detects that the distance between the user and the front face of the terminal device 900 gradually decreases, the processor 901 controls the display 905 to switch from the bright screen state to the dark screen state; when the proximity sensor 915 detects that the distance between the user and the front surface of the terminal device 900 becomes gradually larger, the processor 901 controls the display 905 to switch from the breath screen state to the bright screen state.
Those skilled in the art will appreciate that the configuration shown in fig. 9 does not constitute a limitation of terminal device 900 and may include more or fewer components than shown, or combine certain components, or employ a different arrangement of components.
Fig. 10 is a schematic structural diagram of a server according to an embodiment of the present application. The server 1000 may vary considerably in configuration or performance, and may include one or more processors 1001 (for example, CPUs) and one or more memories 1002. The one or more memories 1002 store at least one program code, which is loaded and executed by the one or more processors 1001 to implement the training method of the recommendation model or the media information recommendation method provided in the foregoing method embodiments. Of course, the server 1000 may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input and output, and may further include other components for implementing the functions of the device, which are not described herein again.
In an exemplary embodiment, a computer-readable storage medium is further provided, in which at least one program code is stored, and the at least one program code is loaded and executed by a processor, so as to enable an electronic device to implement any one of the above-mentioned training methods for recommendation models or media information recommendation methods.
Alternatively, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program or a computer program product is further provided, in which at least one computer instruction is stored, and the at least one computer instruction is loaded and executed by a processor, so as to enable a computer to implement any one of the above-mentioned training methods for recommendation models or media information recommendation methods.
It should be understood that reference to "a plurality" herein means two or more. "And/or" describes the association relationship of the associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the principles of the present application should be included in the protection scope of the present application.

Claims (20)

1. A method of training a recommendation model, the method comprising:
acquiring a first prediction result indicating whether each training text is recommended based on a first network model, and acquiring a second prediction result indicating whether each training text is recommended based on a second network model, wherein model parameters of the first network model and the second network model are different;
for any training text pair, obtaining a first comparison result of the first prediction results and a second comparison result of the second prediction results of the two training texts in the training text pair, wherein the first comparison result is used for representing the difference between the first prediction results, the second comparison result is used for representing the difference between the second prediction results, and each training text pair comprises any two of the training texts;
and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model.
2. The method of claim 1, wherein obtaining a first comparison result of the first predicted results of the two training texts in any one of the training text pairs comprises:
obtaining labeling results of two training texts in any training text pair, wherein the labeling results are obtained through labeling and are used for representing whether the training texts are recommended or not;
and determining a first comparison result of any training text pair based on the labeling results of the two training texts in any training text pair and the first prediction result of the two training texts in any training text pair.
3. The method of claim 2, wherein determining the first comparison result of the any training text pair based on the labeling result of the two training texts in the any training text pair and the first prediction result of the two training texts in the any training text pair comprises:
in response to the fact that the labeling results of the two training texts in any training text pair are different, determining a first comparison result of any training text pair based on target data;
and in response to the labeling results of the two training texts in any one training text pair being the same, determining a first comparison result of the training text pair based on a first prediction result of the two training texts in the training text pair.
4. The method of claim 3, wherein determining a first comparison result for the any one of the pairs of training texts based on the target data comprises:
in response to the difference between the labeling results of the two training texts in any one training text pair being a first reference value, determining a first comparison result of any one training text pair based on first data;
and in response to the difference between the labeling results of the two training texts in any one training text pair being a second reference value, determining a first comparison result of any one training text pair based on second data, wherein the first reference value is greater than the second reference value, and the first data is greater than the second data.
5. The method of claim 3, wherein determining the first comparison result for the any one training text pair based on the first predicted result of the two training texts in the any one training text pair comprises:
in response to the difference between the first predicted results of the two training texts in any one training text pair being smaller than a third reference value, determining a first comparison result of any one training text pair based on third data;
in response to the difference between the first predicted results of the two training texts in any one training text pair being not less than a third reference value and not greater than a fourth reference value, determining a first comparison result of any one training text pair based on fourth data, the third reference value being less than the fourth reference value, the third data being less than the fourth data;
and in response to the difference between the first predicted results of the two training texts in any one training text pair being greater than a fourth reference value, determining a first comparison result of any one training text pair based on fifth data, the fourth data being smaller than the fifth data.
6. The method of claim 1, wherein obtaining a second comparison of the second predicted results of the two training texts in any one of the training text pairs comprises:
determining a target difference value of second prediction results of the two training texts in any training text pair;
determining a second comparison result for the any one of the training text pairs based on the target difference value.
7. The method of any one of claims 1 to 6, wherein obtaining the first prediction result indicating whether each training text is recommended based on the first network model comprises:
determining first attribute features of various attributes in any training text based on the first network model;
performing feature fusion processing on the first attribute features of each attribute in any training text to obtain the first text features of any training text;
and determining a first prediction result of the any training text based on the first attribute features of the attributes in the any training text and the first text features of the any training text.
8. The method according to claim 7, wherein the performing feature fusion processing on the first attribute feature of each attribute in any one of the training texts to obtain the first text feature of any one of the training texts comprises at least one of:
performing feature fusion processing on the first attribute features of each attribute in any training text based on a multilayer perceptron to obtain the first text features of any training text;
and performing feature cross processing on the first attribute features of each attribute in any training text to obtain the first text features of any training text.
9. The method according to any one of claims 1 to 6, wherein obtaining the second prediction result indicating whether each training text is recommended based on the second network model comprises:
determining second attribute features of the attributes in any training text based on the second network model, wherein the second attribute features of any attribute are used for describing any attribute;
performing feature fusion processing on the second attribute features of each attribute in any training text to obtain second text features of any training text;
and determining a second prediction result of the any training text based on the second attribute features of the attributes in the any training text and the second text features of the any training text.
10. The method according to any one of claims 1 to 6, wherein the adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model comprises:
acquiring the labeling result of each training text;
and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair, and the labeling result and the second prediction result of each training text, to obtain the recommendation model.
11. The method according to any one of claims 1 to 6, wherein the adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model comprises:
determining a first attribute feature of each attribute in any training text based on the first network model, and determining a second attribute feature of each attribute in any training text based on the second network model;
and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair and the first attribute characteristic and the second attribute characteristic of each attribute in each training text to obtain the recommendation model.
12. The method according to any one of claims 1 to 6, wherein the adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model comprises:
acquiring a first text feature and a second text feature of each training text;
and adjusting the second network model based on the first comparison result and the second comparison result of each training text pair and the first text characteristic and the second text characteristic of each training text to obtain the recommendation model.
13. The method according to any one of claims 1 to 6, wherein, after the first prediction result of each training text is obtained based on the first network model, the method further comprises:
acquiring the labeling result of each training text;
and adjusting the first network model based on the labeling result and the first prediction result of each training text to obtain the adjusted first network model.
14. A method for recommending media information, the method comprising:
acquiring a plurality of media information, wherein the media information comprises a target text;
acquiring, based on a recommendation model, a recommendation result indicating whether the target text in each piece of media information is recommended, wherein the recommendation model is obtained by the training method of the recommendation model according to any one of claims 1 to 13;
and determining the media information to be recommended meeting the recommendation condition from the plurality of pieces of media information based on the recommendation result of the target text in each piece of media information, and recommending the media information to be recommended.
15. The method of claim 14, wherein obtaining the recommendation result of the recommendation of the target text in each media information based on the recommendation model comprises:
for a target text in any piece of media information, determining attribute characteristics of each attribute in the target text based on the recommendation model, wherein the attribute characteristics of any attribute are used for describing any attribute;
performing feature fusion processing on the attribute features of each attribute in the target text to obtain text features of the target text;
and determining a recommendation result of the target text based on the attribute features of the attributes in the target text and the text features of the target text.
16. An apparatus for training a recommendation model, the apparatus comprising:
the acquisition module is used for acquiring a first prediction result indicating whether each training text is recommended based on a first network model and acquiring a second prediction result indicating whether each training text is recommended based on a second network model, wherein model parameters of the first network model and the second network model are different;
the acquisition module is further configured to obtain, for any training text pair, a first comparison result of the first prediction results and a second comparison result of the second prediction results of the two training texts in the training text pair, wherein the first comparison result is used for representing the difference between the first prediction results, the second comparison result is used for representing the difference between the second prediction results, and each training text pair comprises any two of the training texts;
and the adjusting module is used for adjusting the second network model based on the first comparison result and the second comparison result of each training text pair to obtain a recommendation model.
17. An apparatus for recommending media information, the apparatus comprising:
the system comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring a plurality of media information, and the media information comprises a target text;
the obtaining module is further configured to obtain, based on a recommendation model, a recommendation result indicating whether the target text in each piece of media information is recommended, wherein the recommendation model is obtained by the training method of the recommendation model according to any one of claims 1 to 13;
and the recommendation module is used for determining the media information to be recommended meeting recommendation conditions from the plurality of pieces of media information based on the recommendation results of the target texts in the pieces of media information, and recommending the media information to be recommended.
18. An electronic device, comprising a processor and a memory, wherein at least one program code is stored in the memory, and the at least one program code is loaded and executed by the processor to cause the electronic device to implement the training method of recommendation model according to any one of claims 1 to 13 or the media information recommendation method according to any one of claims 14 to 15.
19. A computer-readable storage medium having at least one program code stored therein, the at least one program code being loaded and executed by a processor to cause a computer to implement the method of training a recommendation model according to any one of claims 1 to 13 or the method of recommending media information according to any one of claims 14 to 15.
20. A computer program product having stored therein at least one computer instruction, which is loaded and executed by a processor, to cause a computer to implement a method of training a recommendation model according to any one of claims 1 to 13 or a method of media information recommendation according to any one of claims 14 to 15.
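Claims 1 to 6 above describe a pairwise scheme: for each pair of training texts, a "first comparison result" is derived from the first network's predictions and the labeling results, a "second comparison result" is derived from the second network's predictions, and the second network is adjusted to align the two. The Python sketch below is one possible reading of that scheme, not the patent's reference implementation; all reference values, data constants, and the squared-error alignment loss are assumed for illustration and are not fixed by the claims.

```python
def first_comparison(label_a, label_b, pred_a, pred_b):
    """Comparison target for a training text pair, following claims 3-5.

    label_a/label_b are assumed binary labeling results (1 = recommended);
    pred_a/pred_b are the first network's prediction scores. The reference
    values and data constants below are assumed examples; the claims only
    require the stated orderings (e.g. first data > second data).
    """
    FIRST_DATA, SECOND_DATA = 1.0, 0.0          # claim 4
    THIRD_REF, FOURTH_REF = 0.1, 0.5            # claim 5 thresholds
    THIRD_DATA, FOURTH_DATA, FIFTH_DATA = 0.0, 0.5, 1.0
    if label_a != label_b:
        # Claim 4: a larger label difference maps to the larger data value.
        return FIRST_DATA if (label_a - label_b) > 0 else SECOND_DATA
    # Claim 5: labels equal, so bucket by the first-prediction gap.
    gap = abs(pred_a - pred_b)
    if gap < THIRD_REF:
        return THIRD_DATA
    if gap <= FOURTH_REF:
        return FOURTH_DATA
    return FIFTH_DATA


def second_comparison(pred_a, pred_b):
    """Claim 6: based on the target difference of the second predictions."""
    return pred_a - pred_b


def pair_loss(first_cmp, second_cmp):
    """One plausible alignment objective for adjusting the second network;
    the claims do not commit to a specific loss function."""
    return (first_cmp - second_cmp) ** 2
```

In training, `pair_loss` would be summed over all training text pairs and minimized with respect to the second network's parameters, optionally combined with the label loss of claim 10 and the feature-alignment terms of claims 11 and 12.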
CN202210054275.8A 2022-01-18 2022-01-18 Recommendation model training method, media information recommendation method, device and equipment Pending CN114399030A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210054275.8A CN114399030A (en) 2022-01-18 2022-01-18 Recommendation model training method, media information recommendation method, device and equipment


Publications (1)

Publication Number Publication Date
CN114399030A true CN114399030A (en) 2022-04-26

Family

ID=81231796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210054275.8A Pending CN114399030A (en) 2022-01-18 2022-01-18 Recommendation model training method, media information recommendation method, device and equipment

Country Status (1)

Country Link
CN (1) CN114399030A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232506A (en) * 2020-09-10 2021-01-15 北京迈格威科技有限公司 Network model training method, image target recognition method, device and electronic equipment
CN112784978A (en) * 2019-11-08 2021-05-11 佳能株式会社 Method, device and system for training neural network and storage medium for storing instructions
CN113360777A (en) * 2021-08-06 2021-09-07 北京达佳互联信息技术有限公司 Content recommendation model training method, content recommendation method and related equipment
CN113435208A (en) * 2021-06-15 2021-09-24 北京百度网讯科技有限公司 Student model training method and device and electronic equipment
CN113434676A (en) * 2021-06-25 2021-09-24 平安国际智慧城市科技股份有限公司 Text relation extraction model training method, text relation extraction device and text relation extraction equipment


Similar Documents

Publication Publication Date Title
CN111897996B (en) Topic label recommendation method, device, equipment and storage medium
CN111552888A (en) Content recommendation method, device, equipment and storage medium
CN111737573A (en) Resource recommendation method, device, equipment and storage medium
CN111104980B (en) Method, device, equipment and storage medium for determining classification result
CN113515942A (en) Text processing method and device, computer equipment and storage medium
CN114332530A (en) Image classification method and device, computer equipment and storage medium
CN110162604B (en) Statement generation method, device, equipment and storage medium
CN112749728A (en) Student model training method and device, computer equipment and storage medium
CN110555102A (en) media title recognition method, device and storage medium
CN111368525A (en) Information searching method, device, equipment and storage medium
CN112733970B (en) Image classification model processing method, image classification method and device
CN111930964B (en) Content processing method, device, equipment and storage medium
CN111581958A (en) Conversation state determining method and device, computer equipment and storage medium
CN113569042A (en) Text information classification method and device, computer equipment and storage medium
CN114511864B (en) Text information extraction method, target model acquisition method, device and equipment
CN114547428A (en) Recommendation model processing method and device, electronic equipment and storage medium
CN114117206B (en) Recommendation model processing method and device, electronic equipment and storage medium
CN114281936A (en) Classification method and device, computer equipment and storage medium
CN111931075B (en) Content recommendation method and device, computer equipment and storage medium
CN114328815A (en) Text mapping model processing method and device, computer equipment and storage medium
CN113822084A (en) Statement translation method and device, computer equipment and storage medium
CN112287070A (en) Method and device for determining upper and lower position relation of words, computer equipment and medium
CN113486260B (en) Method and device for generating interactive information, computer equipment and storage medium
CN116958851A (en) Training method, device, equipment and storage medium for video aging model
CN112988984B (en) Feature acquisition method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination