CN115858911A

CN115858911A - Information recommendation method and device, electronic equipment and computer-readable storage medium

Info

Publication number: CN115858911A
Application number: CN202111122164.8A
Authority: CN
Inventors: 杜颖
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-09-24
Filing date: 2021-09-24
Publication date: 2023-03-28

Abstract

The embodiments of the application disclose an information recommendation method and device, electronic equipment, and a computer-readable storage medium, and relate to artificial intelligence, cloud technology, and multimedia technology. The method comprises the following steps: acquiring object information of a target object of a first application and at least one piece of information to be recommended; for each piece of information to be recommended, predicting, through a multi-target prediction model and based on the object information of the target object and the information to be recommended, a first recommendation value of the information to be recommended for each of at least two recommendation evaluation indexes, and obtaining a second recommendation value of the information to be recommended by fusing the first recommendation values corresponding to the information to be recommended; and determining target recommendation information of the target object from the at least one piece of information to be recommended according to the second recommendation value of each piece of information to be recommended. The scheme provided by the embodiments of the application can effectively improve the accuracy of information recommendation.

Description

Information recommendation method and device, electronic equipment and computer-readable storage medium

Technical Field

The present application relates to the field of artificial intelligence, cloud technology, and multimedia technology, and in particular, to an information recommendation method, apparatus, electronic device, computer-readable storage medium, and computer program product.

Background

With the development of science and technology, the internet has become an indispensable part of people's lives. In the era of rapid development of the internet, various information publishing platforms are also rapidly developed, and users can conveniently and rapidly acquire various information through the information publishing platforms. For example, through a news information publishing platform, a user can read news online and know messages in various fields in time.

In order to meet different requirements of different users, information recommendation is realized through some recommendation algorithms at present, so that different users can obtain different recommendation information. Although many different information recommendation algorithms exist, the accuracy of information recommendation remains to be improved.

Disclosure of Invention

The application aims to provide an information recommendation method and device, an electronic device and a computer-readable storage medium, which can effectively improve the information recommendation accuracy. In order to achieve the purpose, the technical scheme provided by the embodiment of the application is as follows:

In one aspect, the present application provides an information recommendation method, including:

acquiring object information of a target object of a first application and at least one piece of information to be recommended;

for each piece of information to be recommended, predicting a first recommendation value of the information to be recommended, which corresponds to each of at least two recommendation evaluation indexes, through a multi-target prediction model based on the object information of the target object and the information to be recommended;

for each piece of information to be recommended, a second recommendation value of the information to be recommended is obtained by fusing the first recommendation values corresponding to the information to be recommended;

determining target recommendation information of the target object from the at least one piece of information to be recommended according to a second recommendation value of each piece of information to be recommended;

the multi-target prediction model is obtained by training a neural network model based on a plurality of training samples and a target loss function of the model, the target loss function comprises loss functions corresponding to all evaluation indexes, and the value of the loss function of each evaluation index is determined according to the training loss values and the loss correction coefficients of the plurality of training samples corresponding to the evaluation indexes;

a training sample includes sample data of a sample object, and the loss correction coefficient of each of the evaluation indicators corresponding to the sample is determined based on target information included in the sample data of the sample, where the target information includes usage behavior information of the sample object corresponding to the second application.
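The steps of the method above (predict a first recommendation value per evaluation index, fuse them into a second recommendation value, then select the target recommendation information) can be sketched as follows. This is only an illustrative sketch: the names `recommend`, `fuse`, and `toy_predict` are hypothetical, the trained multi-target prediction model is replaced by a stand-in function, and fusion by weighted sum is an assumption, since the text above does not fix a particular fusion rule.

```python
from typing import Callable, Dict, List

# Hypothetical signature: maps (object_info, candidate) to one first
# recommendation value per evaluation index, keyed by index name.
Predictor = Callable[[dict, dict], Dict[str, float]]


def fuse(first_values: Dict[str, float], weights: Dict[str, float]) -> float:
    """Fuse per-index first values into a second value (assumed: weighted sum)."""
    return sum(weights[name] * value for name, value in first_values.items())


def recommend(object_info: dict, candidates: List[dict], predict: Predictor,
              weights: Dict[str, float], top_k: int = 1) -> List[dict]:
    """Score every candidate and return the top_k by second recommendation value."""
    scored = []
    for candidate in candidates:
        first_values = predict(object_info, candidate)
        scored.append((fuse(first_values, weights), candidate))
    # Sort by the fused (second) value only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:top_k]]


def toy_predict(object_info: dict, candidate: dict) -> Dict[str, float]:
    """Stand-in for the trained model: reads precomputed toy values."""
    return {"click_rate": candidate["ctr"], "read_duration": candidate["dur"]}


candidates = [
    {"id": "a", "ctr": 0.2, "dur": 0.9},
    {"id": "b", "ctr": 0.8, "dur": 0.1},
    {"id": "c", "ctr": 0.5, "dur": 0.5},
]
weights = {"click_rate": 0.5, "read_duration": 0.5}
top = recommend({"user": "u1"}, candidates, toy_predict, weights, top_k=2)
```

In this toy run the fused values are 0.55, 0.45, and 0.5, so candidates "a" and "c" are selected in that order.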

In another aspect, an embodiment of the present application provides an information recommendation device, including:

the information acquisition module is used for acquiring object information of a target object of the first application and at least one piece of information to be recommended;

the recommendation value determining module is used for predicting to obtain a first recommendation value of the information to be recommended, which corresponds to each of at least two recommendation evaluation indexes, through a multi-target prediction model based on the object information of the target object and the information to be recommended for each piece of information to be recommended, and obtaining a second recommendation value of the information to be recommended by fusing each first recommendation value corresponding to the information to be recommended;

the target information determining module is used for determining target recommendation information of the target object from the at least one piece of information to be recommended according to a second recommendation value of each piece of information to be recommended;

the multi-target prediction model is obtained by training a neural network model based on a plurality of training samples and a target loss function of the model, the target loss function comprises loss functions corresponding to all evaluation indexes, and the value of the loss function of each evaluation index is determined according to training loss values and loss correction coefficients of the plurality of training samples corresponding to the evaluation indexes;

a training sample includes sample data of a sample object, and a loss correction coefficient of each evaluation index corresponding to the sample is determined based on target information included in the sample data of the sample, the target information including usage behavior information of the sample object corresponding to the second application.

Optionally, the sample data of the sample object includes model input data of the sample object, labels of the model input data corresponding to the evaluation indexes, and target information of the sample object, where the model input data includes sample object information and sample recommendation information; the neural network model comprises a multi-task learning model and a first network model; the multi-target prediction model is obtained through training by a model training device in the following manner:

repeatedly executing the following operations on the neural network model based on each training sample until the total training loss value meets the training ending condition, and determining the multi-task learning model at the training ending as a multi-target prediction model:

for each training sample, inputting model input data of each training sample into a multitask learning model to obtain a prediction recommendation value of the sample corresponding to each evaluation index, and inputting target information of the sample into a first network model to obtain a loss correction coefficient of the sample corresponding to each evaluation index;

for each training sample, determining a training loss value of a loss function of the sample corresponding to each evaluation index based on a prediction recommended value and a label of the sample corresponding to each evaluation index, and determining a training loss value of the sample based on the training loss value and a loss correction coefficient of the sample corresponding to each evaluation index;

and determining a training total loss value based on the training loss value of each training sample, and if the training total loss value does not meet the training end condition, adjusting the model parameters of the neural network model.

Optionally, a loss correction coefficient of a training sample corresponding to an evaluation index characterizes a correlation between the sample and the evaluation index; for each training sample, the model training device, in determining the training loss value for that sample, is configured to:

and taking the loss correction coefficients of the sample corresponding to the evaluation indexes as first weights of the training loss values of the evaluation indexes, and performing weighted summation on the training loss values of the sample corresponding to the evaluation indexes to obtain the training loss value of the sample.
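The weighted summation described above can be illustrated with a short sketch. The function name `sample_loss` and the numeric values are hypothetical; in the described scheme the coefficients would be produced by the first network model from the target information of the sample, whereas here they are simply passed in.

```python
from typing import Sequence


def sample_loss(index_losses: Sequence[float],
                first_weights: Sequence[float]) -> float:
    """Weighted sum of the per-index training losses of one sample.

    index_losses[k]  : training loss of the sample for evaluation index k
    first_weights[k] : loss correction coefficient of the sample for index k,
                       used directly as the weight of that loss
    """
    if len(index_losses) != len(first_weights):
        raise ValueError("one weight is required per evaluation index")
    return sum(w * loss for w, loss in zip(first_weights, index_losses))


# Two evaluation indexes, e.g. click rate and reading duration (toy numbers).
loss = sample_loss(index_losses=[0.4, 1.2], first_weights=[0.9, 0.25])
```

With these toy numbers the sample loss is 0.9 * 0.4 + 0.25 * 1.2 = 0.66, so the index to which the sample is more strongly correlated contributes more of its loss.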

Optionally, the loss correction coefficient of a training sample corresponding to an evaluation index characterizes the possibility that the sample is a noise sample of the evaluation index; for each training sample, the model training device, in determining the training loss value for that sample, is configured to:

for each evaluation index, determining a second weight of the training loss value of the sample corresponding to the evaluation index based on the training correction coefficient of the sample corresponding to the evaluation index, wherein the second weight is in negative correlation with the training correction coefficient;

weighting the training loss value of the sample corresponding to each evaluation index based on the second weight of the sample corresponding to each evaluation index;

and determining the training loss value of the sample based on the weighted training loss value of the sample corresponding to each evaluation index.

Optionally, the target loss function further includes a regular correction term, and for each training sample, the model training apparatus, when determining the training loss value of the sample, is configured to:

determining the value of a regular correction term of the sample based on the training correction coefficient of the sample corresponding to each evaluation index;

and determining the training loss value of the sample based on the weighted training loss value of the sample corresponding to each evaluation index and the regular correction term.

Optionally, the expression of the regular correction term is: log S, wherein S represents the product of the training correction coefficients corresponding to the evaluation indexes.
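A minimal sketch of the noise-aware variant with the regular correction term is given below. The text only requires the second weight to be negatively correlated with the correction coefficient; taking the weight as the reciprocal `1 / s` is an assumption made for illustration, and the function names are hypothetical. The term log S is computed as a sum of logarithms, which equals the logarithm of the product of the coefficients.

```python
import math
from typing import Sequence


def second_weight(coeff: float) -> float:
    """Weight that decreases as the correction coefficient grows (assumed: 1/s)."""
    if coeff <= 0:
        raise ValueError("correction coefficient must be positive")
    return 1.0 / coeff


def regular_term(coeffs: Sequence[float]) -> float:
    """log S, where S is the product of the coefficients over all indexes."""
    return sum(math.log(s) for s in coeffs)


def noise_aware_sample_loss(index_losses: Sequence[float],
                            coeffs: Sequence[float]) -> float:
    """Weighted per-index losses plus the regular correction term."""
    if len(index_losses) != len(coeffs):
        raise ValueError("one coefficient is required per evaluation index")
    weighted = sum(second_weight(s) * loss
                   for loss, s in zip(index_losses, coeffs))
    return weighted + regular_term(coeffs)


# Toy example: the sample looks noisy for index 0 (s = 2.0) and clean for
# index 1 (s = 0.5), so the loss of index 0 is halved and that of index 1 doubled.
value = noise_aware_sample_loss(index_losses=[2.0, 1.0], coeffs=[2.0, 0.5])
```

Under this assumed form, the log S term penalizes large coefficients; without it, the loss could be reduced simply by increasing every coefficient, which would shrink all weighted losses toward zero.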

Optionally, the multi-target prediction model includes a second network model corresponding to each evaluation index; for each piece of information to be recommended, the recommendation value determining module may be configured to:

acquiring a first information characteristic of object information of a target object and a second information characteristic of the information to be recommended; and splicing the first information characteristic and the second information characteristic, respectively inputting the spliced information characteristics into the second network models of the evaluation indexes, and predicting to obtain a first recommendation value of the information to be recommended corresponding to the evaluation index through the second network model of each evaluation index.

Optionally, each second network model includes a first feature extraction layer, at least two second feature extraction layers, a feature fusion layer, and a recommendation prediction layer; for each evaluation index, the second network model corresponding to the evaluation index obtains a predicted recommendation value of the information to be recommended corresponding to the evaluation index by executing the following operations:

performing feature extraction on the spliced features of the information to be recommended through a first feature extraction layer to obtain first features, and performing feature extraction on the spliced features of the information to be recommended through at least one second feature extraction layer to obtain second features corresponding to each second feature extraction layer;

respectively determining the matching degree of each second feature and the first feature through a feature fusion layer, weighting the corresponding second features based on the matching degree, and fusing the weighted second features to obtain fused features;

and obtaining a first recommendation value of the information to be recommended, which corresponds to the evaluation index, through a recommendation value prediction layer based on the fused features.
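The feature fusion step above can be sketched as follows. The text does not specify how the matching degree is computed or how the weighted features are fused; this sketch assumes a dot product normalized with softmax as the matching degree and element-wise summation as the fusion, and it assumes that the first and second features have the same dimension. All function names are hypothetical, and the feature extraction layers themselves are omitted.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matching_degrees(first_feature: Sequence[float],
                     second_features: Sequence[Sequence[float]]) -> List[float]:
    """Matching degree of each second feature with the first feature."""
    return softmax([dot(first_feature, f) for f in second_features])


def fuse_features(first_feature: Sequence[float],
                  second_features: Sequence[Sequence[float]]) -> List[float]:
    """Weight each second feature by its matching degree and sum the results."""
    weights = matching_degrees(first_feature, second_features)
    dim = len(first_feature)
    return [sum(w * f[i] for w, f in zip(weights, second_features))
            for i in range(dim)]


first = [1.0, 0.0]
seconds = [[2.0, 0.0], [0.0, 2.0]]
degrees = matching_degrees(first, seconds)
fused = fuse_features(first, seconds)
```

In this toy example the second feature that aligns with the first feature receives the larger matching degree, so it dominates the fused feature that would then be passed to the prediction layer.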

Optionally, the information to be recommended includes multimedia information, and the at least two evaluation indexes include information reading duration and information click rate.

Optionally, the usage behavior information of one sample object corresponding to the second application includes at least one of:

the frequency with which the sample object accesses the second application;

the amount of information recommended to the sample object by the second application;

the amount of recommended information read by the sample object in the second application;

the length of time for which the sample object uses the second application;

interaction behavior information of the sample object with respect to the recommendation information in the second application.
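As an illustration of how the usage behavior items listed above might be packed into the target information fed to the first network model, a small sketch follows. The class name, field names, units, and the choice of filling missing items with 0.0 are all assumptions; the text only states that at least one of the items is included.

```python
from dataclasses import astuple, dataclass
from typing import List, Optional


@dataclass
class UsageBehavior:
    """Hypothetical container for a sample object's usage of the second application."""
    access_frequency: Optional[float] = None    # e.g. visits per day
    recommended_count: Optional[float] = None   # items recommended to the object
    read_count: Optional[float] = None          # recommended items actually read
    usage_duration: Optional[float] = None      # e.g. minutes of use
    interaction_count: Optional[float] = None   # e.g. likes, shares, comments

    def to_vector(self, missing: float = 0.0) -> List[float]:
        """Flatten to a fixed-length vector; absent items use `missing`."""
        return [missing if v is None else float(v) for v in astuple(self)]


behavior = UsageBehavior(access_frequency=3, recommended_count=40,
                         read_count=12, usage_duration=25.5)
vector = behavior.to_vector()
```

A fixed-length vector keeps the input shape of the first network model constant even when only some of the items are available for a given sample object.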

In another aspect, an embodiment of the present application further provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory, where the processor executes the computer program to implement the steps of the method provided by the embodiment of the present application.

In another aspect, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the computer program implements the steps of the method provided by the embodiment of the present application.

In yet another aspect, the present application further provides a computer program product, where the computer program product includes a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method provided by the present application.

The technical scheme provided by the embodiments of the application has the following beneficial effects. In the information recommendation method provided by the embodiments of the application, when the target recommendation information is determined, a trained multi-target prediction model predicts the recommendation value of each piece of information to be recommended for each of several different recommendation evaluation indexes. The recommendation values for the several indexes are considered together, so whether a piece of information to be recommended should be used as target recommendation information is measured from several different dimensions. When the multi-target prediction model is trained, each training sample contains, in addition to the model input data, the usage behavior information of the sample object with respect to an application, and this information is used to correct the training loss value of the corresponding training sample. The training loss of the model therefore better matches the actual behavior data of the sample objects, and the model can learn more accurate parameters from the different usage behavior information. This improves the performance of the model and the accuracy of the first recommendation values it predicts for the information to be recommended, provides a basis for selecting target recommendation information that better suits the target object, better meets application requirements, and improves the user experience.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.

Fig. 1 is a schematic flowchart of an information recommendation method according to an embodiment of the present application;

Fig. 2 is a schematic structural diagram of a neural network model according to an embodiment of the present application;

Fig. 3 is a system architecture diagram of an information recommendation system according to an embodiment of the present application;

Fig. 4 is a schematic flowchart of an information recommendation method according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a neural network model according to an embodiment of the present application;

Fig. 6 is a schematic flowchart of an information recommendation method according to an embodiment of the present application;

Fig. 7 is a schematic structural diagram of an information recommendation device according to an embodiment of the present application;

Fig. 8 is a schematic structural diagram of an electronic device to which an embodiment of the present application is applied.

Detailed Description

Embodiments of the present application are described below in conjunction with the drawings in the present application. It should be understood that the embodiments set forth below in connection with the drawings are exemplary descriptions for explaining technical solutions of the embodiments of the present application, and do not limit the technical solutions of the embodiments of the present application.

As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the terms "comprises" and/or "comprising", when used in this specification in connection with embodiments of the present application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof that are supported in the art. The term "and/or" as used herein indicates at least one of the items it joins; for example, "A and/or B" may be implemented as "A", as "B", or as "A and B".

The information recommendation method provided in the embodiments of the application addresses the problems that, in existing information recommendation scenarios, the personalized recommendation effect still needs improvement and the accuracy of information recommendation is not ideal.

Optionally, the information recommendation method provided in the embodiment of the present application may be implemented based on an Artificial Intelligence (AI) technology. For example, obtaining the first recommendation value of the information to be recommended corresponding to each recommendation evaluation index may be implemented by a trained neural network model (i.e., a multi-objective prediction model). AI is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. As artificial intelligence technology has been researched and developed in a wide variety of fields, it is believed that with the development of technology, artificial intelligence technology will be applied in more fields and will play an increasingly important role.

The information to be recommended in the embodiments of the present application may include text information, such as news containing text content, and the multi-target prediction model may be a neural network model based on Natural Language Processing (NLP). NLP is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science, and mathematics. Research in this field involves natural language, i.e., the language that people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.

Optionally, the data processing in the embodiments of the present application may be implemented based on cloud technology. For example, cloud technology may be used when the multi-target prediction model is obtained by training a neural network model, and the data computation involved in training may use cloud computing. Cloud technology is a hosting technology that unifies hardware, software, network, and other resources in a wide area network or a local area network to realize the computation, storage, processing, and sharing of data. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology, and the like that are applied under the cloud computing business model; these can form a resource pool that is used on demand and is flexible and convenient. Cloud computing technology will become an important support. In the narrow sense, cloud computing refers to a delivery and use mode of IT infrastructure, namely obtaining required resources through a network in an on-demand and easily extensible manner; in the broad sense, cloud computing refers to a delivery and use mode of services, namely obtaining required services through a network in an on-demand and easily extensible manner. Such services may be IT and software services, internet-related services, or other services. With the diversification of the internet, real-time data streams, and connected devices, and driven by demand for search services, social networks, mobile commerce, open collaboration, and the like, cloud computing has developed rapidly. Unlike earlier parallel distributed computing, the emergence of cloud computing is expected to drive, at a conceptual level, a revolutionary change in the whole internet model and the enterprise management model.

For better illustration and understanding of the solutions provided in the embodiments of the present application, some relevant technical terms referred to in the embodiments of the present application will be first introduced:

Recommendation system: a recommendation system is a tool that automatically connects users and items. It can help users find information that interests them in an information-overload environment, and can push information to the users who are interested in it.

Personalized news recommendation: recommending news of interest to a user according to the user's information (such as interest characteristics, reading behaviors, and the like).

CTR (Click-Through Rate) estimation module: also called the click-rate estimation module, it estimates the click rate of a candidate object set according to a feature list and a ranking model. In the embodiments of the application, the CTR model is used to predict the click rate of the information to be recommended.

MTL (multi-task learning): multiple related tasks are learned together, i.e., several tasks are learned at the same time.

Attention mechanism: the idea comes from the human visual attention mechanism. When perceiving things, people generally do not look at a scene from beginning to end every time, but observe a specific part according to their needs. When people find that what they want to observe often appears in a certain part of a scene, they learn to focus on that part when similar scenes appear again, putting more attention on the useful part. In essence, attention is weighting.

User activity: the frequency of a user's behavior when using a product, such as the daily login frequency, the number of items read, and the duration of use in a news APP (application).

The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.

Fig. 1 shows a schematic flowchart of an information recommendation method provided in an embodiment of the present application. The method may be executed by any electronic device, for example a terminal device. By executing the method, the terminal device can determine target recommendation information for a target object from the information to be recommended, so that target recommendation information that better meets the personalized requirements of the target object can subsequently be displayed to the target object, improving the user experience. The method may also be executed by a server; optionally, the server may be a cloud server. The method may be implemented as an application program, or as a plug-in or functional module of an existing application program that has an information recommendation function; for example, it may serve as a new functional module of a news application. By executing the method of the embodiments of the application, the target recommendation information for each user can be selected more accurately and then pushed to the user's terminal device and displayed to the user, which can improve the user's experience of the application and the user stickiness of the application. The terminal device includes a user terminal, and the user terminal includes, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, an intelligent household appliance, a vehicle-mounted terminal, a wearable electronic device, an AR/VR device, and the like.

As shown in fig. 1, the information recommendation method provided in the embodiment of the present application may include the following steps S110 to S140, and optionally, the method may be executed by a server.

Step S110: acquiring object information of a target object of a first application and at least one piece of information to be recommended.

The first application may be any application having an information recommendation function. The type of the application is not limited in the embodiments of the application; it may be an application that can be installed on a mobile terminal device or other computer device, or an online application such as a web application or an applet. The target object of the first application is any object that uses the application, i.e., a user, and may be a registered or logged-in object or an unregistered object. For example, when a user has not registered or logged in to the application, the server may, with the user's authorization, use the identifier of the terminal device used by the user as the identifier of the user and make information recommendations for that user.

The object information of the target object, i.e., the user portrait of the target object, includes, but is not limited to, attribute information of the object (for example, the age, sex, location, preferences, and other related information of the object that is known with the object's authorization) and historical behavior information corresponding to the first application, for example, operation behavior data related to the first application and operation behavior data of other applications of the same type as the first application. The operation behavior data may include, but is not limited to, the frequency with which the object (i.e., the user) uses the application and operation data on the application, such as the duration of each use, the number of recommendations (information already pushed to the user) read during each use, the number of recommendations clicked during each use, the time of each use (i.e., when the application was used), and the types of the recommendations.

As an alternative, the object information of the target object includes attribute information of the object and usage behavior information of the object corresponding to the first application.

The information to be recommended is candidate recommendation information. The embodiments of the present application do not limit how it is obtained: it may be information randomly selected from an information database, or information obtained by preliminary screening of the information database through some pre-screening means, that is, information obtained after at least one round of coarse ranking. For example, if the first application is a news application whose information database stores a large amount of news, the server of the first application may randomly take a set number of news items from the database as candidate information to be recommended, or may select from the database, according to the user portrait of the target user and the types of news the user usually reads, several news items of those types as the information to be recommended.

The embodiment of the present application is not limited to the type of the information to be recommended. In practical applications, the information to be recommended in different application scenarios is likely to be different. The information to be recommended may be multimedia information, and one piece of information to be recommended may include one or more items of text, image, audio, video, and the like. For example, the first application may be a news application, the information to be recommended may be news to be recommended, and the news may include text and may also include images.

Step S120: for each piece of information to be recommended, predicting, through a multi-target prediction model and based on the object information of the target object and the information to be recommended, a first recommendation value of the information to be recommended for each of at least two recommendation evaluation indexes.

The multi-target prediction model is obtained by training based on a plurality of training samples, each training sample comprises sample data of a sample object, a training loss value of each training sample is obtained based on a training loss value and a loss correction coefficient of the sample corresponding to each evaluation index, the loss correction coefficient of the sample corresponding to each evaluation index is associated with target information of the sample object, the target information comprises using behavior information of the sample object corresponding to a second application, and the second application is a first application or an application belonging to the same type as the first application.

In the embodiment of the application, the recommendation evaluation index is an index for evaluating whether information to be recommended is used as target recommendation information, and by adopting a plurality of different evaluation indexes, the information to be recommended can be evaluated from a plurality of different dimensions, so that the accuracy of final evaluation is improved.

Which evaluation indexes the at least two recommendation evaluation indexes specifically include is not limited in the embodiment of the present application and may be configured according to actual application requirements. The first recommendation value of a piece of information to be recommended corresponding to a recommendation evaluation index represents, in the dimension of that index, the possibility that the information will be taken as target recommendation information (i.e., the possibility of being recommended to the target user). The higher the first recommendation value, the higher that possibility in that dimension. The first recommendation value may also be referred to as a first score.

Optionally, the information to be recommended may be multimedia information, such as text information (e.g., news), and the at least two recommendation evaluation indexes may include, but are not limited to, an information reading duration and an information click rate. For one piece of information to be recommended, the first recommendation value corresponding to the information reading duration index represents the length of time the target object spends reading the information, that is, the time spent on the information; the first recommendation value corresponding to the information click rate represents the possibility that the information will be clicked by the target object.

In the embodiment of the application, a trained multi-target prediction model can be used to predict the first recommendation value of each piece of information to be recommended, for each piece of information to be recommended, the input of the model can be the object information of a target object and the piece of information to be recommended, and the output is the first recommendation value of the piece of information to be recommended corresponding to each evaluation index.

The embodiment of the application is not limited to a specific model structure of the multi-target prediction model, and the multi-target prediction model can be selected according to actual requirements, can be a multi-task learning model, can be obtained by training an initial multi-task learning model based on training samples, and can optionally adopt a multi-task learning model with an attention mechanism, so that when information feature extraction is carried out, the model can focus more attention on a useful part, and thus can learn features with better expression capability.

In an alternative embodiment of the present application, the multi-objective prediction model includes a second network model corresponding to each evaluation index; for each piece of information to be recommended, the predicting to obtain, through a multi-objective prediction model, a first recommended value of the piece of information to be recommended, which corresponds to each of at least two recommended evaluation indexes, based on the object information of the target object and the piece of information to be recommended, may include:

acquiring a first information characteristic of object information of the target object and a second information characteristic of the information to be recommended;

and splicing the first information characteristic and the second information characteristic, respectively inputting the spliced characteristics into the second network models of the evaluation indexes, and predicting to obtain a first recommendation value of the information to be recommended corresponding to the evaluation index through the second network model of each evaluation index.

The extraction of the first information feature and the second information feature may be implemented inside the multi-objective prediction model or outside it. That is, the multi-objective prediction model may include a feature extraction module for extracting the first information feature and the second information feature, or the two features may be obtained through word embedding (Embedding) or other feature extraction methods.

After the first information feature of the target object and the second information feature of the information to be recommended are obtained, the two features can be spliced and then input to each second network model, which predicts the first recommendation value corresponding to its evaluation index. Optionally, the first recommendation value may take the form of a score. For example, the output of each second network model may be a recommendation score, or a score interval, of the information to be recommended for the evaluation index corresponding to that model. If the value range of the recommendation score is 0 to 100, the first recommendation value may be a specific score, or the value range may be divided into 10 score intervals and the first recommendation value may correspond to one of the intervals.
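The score-interval form of the first recommendation value can be sketched as follows. This is only an illustration: the function name `score_to_interval` and the choice of equal-width intervals are assumptions, since the embodiment only states that the range 0 to 100 may be divided into 10 intervals.

```python
def score_to_interval(score: float, low: float = 0.0, high: float = 100.0,
                      n_intervals: int = 10) -> int:
    """Return the 0-based index of the interval that contains `score`.

    Assumption: the intervals have equal width; the source does not say so.
    """
    if not low <= score <= high:
        raise ValueError("score lies outside the value range")
    width = (high - low) / n_intervals
    # The upper bound of the range is assigned to the last interval.
    return min(int((score - low) / width), n_intervals - 1)
```

With the default arguments, a score of 55 falls into the interval with index 5, and a score of 100 falls into the last interval.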

The embodiment of the present application is not limited to the model structure of the second network model corresponding to each recommended evaluation index. The second network models corresponding to different recommended evaluation indexes may be completely parallel model branches, or may be models having a partially shared network structure, that is, some network structures are structures commonly included in a plurality of second network models corresponding to a plurality of evaluation indexes.

In an optional embodiment of the present application, each second network model includes a first feature extraction layer, at least one second feature extraction layer, a feature fusion layer, and a recommendation prediction layer;

for each evaluation index, the second network model corresponding to the evaluation index obtains a predicted recommendation value of the information to be recommended corresponding to the evaluation index by executing the following operations:

performing feature extraction on the spliced features of the information to be recommended through a first feature extraction layer to obtain first features, and performing feature extraction on the spliced features of the information to be recommended through at least two second feature extraction layers to obtain second features corresponding to each second feature extraction layer;

respectively determining the matching degree of each second feature and the first feature through the feature fusion layer, weighting the corresponding second features based on the matching degrees, and fusing the weighted second features to obtain fused features;

In order to better comprehensively utilize information of various aspects contained in the user information (namely, the object information) and the information to be recommended and extract and obtain multiple semantic information contained in the user information and the information to be recommended, when the first recommendation values corresponding to various evaluation indexes are obtained based on the spliced features, a neural network structure based on an attention mechanism can be adopted to extract and obtain information features with better expression capability. The model parameters of the first feature extraction layer and the second feature extraction layers are different, the features containing multilayer semantic information are extracted by adopting a plurality of feature extraction layers, the weight corresponding to each second feature can be determined by calculating the matching degree (such as similarity, and can be determined based on the distance between the first feature and the second feature) of the first feature and the second feature, optionally, the matching degree can be directly used as the weight, the second features are subjected to weighted fusion to obtain the fused features containing a plurality of layers and multiple semantic information, and the first recommendation value corresponding to the information to be recommended is further obtained based on the features.

Optionally, the second network models corresponding to different evaluation indexes may share the plurality of second feature extraction layers.

As an alternative example, fig. 2 shows a schematic diagram of the principle of obtaining the first recommendation value (the score value shown in fig. 2) of information to be recommended corresponding to one evaluation index. In this example, the second network model includes two second feature extraction layers. The spliced features are input into the first feature extraction layer and each second feature extraction layer to obtain features F1, F2, and F3 respectively corresponding to the three feature extraction layers. Based on F1, F2, and F3, the feature fusion layer calculates the similarity $S_{12}$ between F1 and F2 and the similarity $S_{13}$ between F1 and F3, and uses $S_{12}$ and $S_{13}$ to compute a weighted sum of F2 and F3, obtaining the fused feature $S_{12} \cdot F2 + S_{13} \cdot F3$. The fused feature is then input into the recommendation prediction layer to obtain the first recommendation value of the evaluation index corresponding to this second network model.
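The fusion step of this example can be sketched in a few lines. The sketch assumes cosine similarity as the matching degree and assumes that all features have the same dimension; the embodiment only says the matching degree may be a similarity based on the distance between features. The names `cosine_similarity` and `fuse_features` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Matching degree between two features (assumed here to be cosine similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuse_features(first: Vector, seconds: Sequence[Vector]) -> List[float]:
    """Weight each second feature by its matching degree with the first feature, then sum."""
    weights = [cosine_similarity(first, s) for s in seconds]
    fused = [0.0] * len(first)
    for weight, feature in zip(weights, seconds):
        for i, value in enumerate(feature):
            fused[i] += weight * value
    return fused


# F1 matches F2 exactly and is orthogonal to F3, so the fused feature equals F2.
f1, f2, f3 = [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]
fused = fuse_features(f1, [f2, f3])
```

Using the matching degree directly as the weight follows the option mentioned in the description; a normalized variant (for example a softmax over the matching degrees) would be a different design choice.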

In order to obtain a multi-target prediction model meeting application conditions, a large number of training samples need to be obtained, in this embodiment of the application, each training sample includes sample data of a sample object, where the sample data includes model input data (model input corresponding to a purpose of the training model), the model input data includes sample object information and sample recommendation information of the sample object, the sample data further includes a label of the model input data corresponding to each evaluation index, that is, the sample recommendation information corresponds to a label of each evaluation index, and the label corresponding to one evaluation index represents a true recommendation value of the sample recommendation information corresponding to the index.

When the multi-task learning model is trained, the model input data of each training sample can be input into the model, which predicts the recommendation value of the sample corresponding to each evaluation index. The training loss value of the model can then be calculated based on the label (namely the real recommendation value) and the predicted recommendation value of each training sample corresponding to each evaluation index, where the loss value represents the difference between the real and predicted recommendation values of the training samples across the evaluation indexes.

In an embodiment of the application, the sample data of a sample object further includes target information of the sample object, and the target information of a sample object includes usage behavior information of that sample object corresponding to the second application. The target information may also be referred to as the activity of the object or the activity of the user. It represents how the user uses the second application; that is, information related to the user's use of the second application may be taken as the target information of the user. Optionally, the second application may be the first application or an application of the same type as the first application, that is, an application highly similar to the first application. Using sample data of sample objects of the first application, or of an application of the same type, effectively guarantees the prediction performance of the trained multi-target prediction model when it is applied to the information to be recommended of the first application. Of course, if the generalization capability of the model is a concern, sample data corresponding to a plurality of applications may be used.

The usage behavior information may include, but is not limited to, one or more of the following: an access frequency with which the sample object accesses the second application; a recommended amount of information recommended to the sample object by the second application; a recommended information reading amount of the sample object in the second application; a usage duration of the second application by the sample object; interaction behavior information of the sample object with respect to the recommendation information in the second application.

For a sample object, the access frequency may refer to the number of times that the object uses the second application within a set time duration, for example, the number of times that the object logs in the second application every day. The recommended amount of information, which may also be referred to as an amount of information exposure, refers to the amount of information exposed/presented to the subject over a certain period of time. The recommended information reading amount is the number of information read/viewed/clicked by the object in the second application within a certain time length. The interactive behavior information refers to the interactive information generated by the object when using the second application and the recommendation information in the second application, and may include, but is not limited to, behavior information such as praise, forward, post comments, and the like.
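One possible way to hold these quantities before they are vectorized for the model is a small record type. The class name `UsageBehavior`, its field names, and the idea of reducing interaction behavior to a single count are all assumptions made for illustration; the embodiment does not prescribe a representation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UsageBehavior:
    """Usage behavior of one sample object in the second application (hypothetical layout)."""
    access_frequency: float    # e.g. logins per day within the set period
    recommended_amount: float  # pieces of information exposed to the object
    reading_amount: float      # recommended pieces read, viewed, or clicked
    usage_duration: float      # time spent in the application
    interaction_count: float   # likes, forwards, comments (assumed to be summed)

    def to_vector(self) -> List[float]:
        """Flatten the record into a feature vector in a fixed field order."""
        return [
            self.access_frequency,
            self.recommended_amount,
            self.reading_amount,
            self.usage_duration,
            self.interaction_count,
        ]


behavior = UsageBehavior(3, 120, 15, 42.5, 4)
vector = behavior.to_vector()
```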

In practical applications, different groups of people have different attributes, so their purposes in using an application usually differ. For example, some people have plenty of time available: the time they spend on the application is flexible, and interest accounts for a large share of their purpose in using (i.e., consuming content in) the application, so the length of the information recommended to them need not be restricted. Other people have limited time to spend on the application, and that time is roughly fixed, so acquiring information accounts for a larger share of their purpose; information that requires a long reading time is less likely to be read by them, and they would prefer to be recommended more short pieces of information.

In view of this, when training the multitask learning model, the information recommendation method provided in the embodiment of the present application further considers the usage behavior information of each sample object corresponding to the second application, and corrects the loss value (i.e., the value of the loss function) of the training sample corresponding to each recommendation evaluation index based on that information. The calculated training loss value of a training sample corresponding to each index therefore better matches how the sample object actually uses the second application. When the learning of the model is constrained by the total training loss (i.e., the value of the target loss function), the model can, based on the usage behavior corresponding to each sample, learn more accurate model parameters that better reflect the real usage of the application.

Correspondingly, when the first recommendation values of the information to be recommended, which correspond to the evaluation indexes, are predicted based on the trained multi-target prediction model, the prediction result can be more accurate, and a basis is provided for determining more accurate target recommendation information from the information to be recommended based on the first recommendation values subsequently.

The training total loss value is determined based on the loss function value corresponding to each evaluation index, for example, the loss function values corresponding to each evaluation index are added, or each evaluation index may be configured with a corresponding weight, and the loss function values corresponding to each evaluation index may be weighted and summed based on the weight corresponding to each evaluation index to obtain the training total loss value. When the model is trained, whether to end training may be determined based on whether the total training loss value satisfies a training end condition, where the training end condition may be configured according to requirements, and may include, but is not limited to, convergence of a target loss function or the number of times of training reaches a set number of times.
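The combination of per-index loss values and the check for ending training can be sketched as below. The function names are hypothetical, and the convergence test (change in total loss below a tolerance between two consecutive rounds) is an assumption; the embodiment leaves the training end condition configurable.

```python
from typing import Optional, Sequence


def total_training_loss(index_losses: Sequence[float],
                        index_weights: Optional[Sequence[float]] = None) -> float:
    """Combine the loss values of the evaluation indexes by plain or weighted summation."""
    if index_weights is None:
        index_weights = [1.0] * len(index_losses)
    if len(index_weights) != len(index_losses):
        raise ValueError("one weight per evaluation index is required")
    return sum(w * loss for w, loss in zip(index_weights, index_losses))


def should_stop(loss_history: Sequence[float], max_rounds: int,
                tolerance: float = 1e-4) -> bool:
    """End training once the round budget is used up or the total loss stops changing."""
    if len(loss_history) >= max_rounds:
        return True
    if len(loss_history) < 2:
        return False
    # Assumed convergence criterion: small change between consecutive rounds.
    return abs(loss_history[-1] - loss_history[-2]) < tolerance
```

For example, `total_training_loss([0.5, 1.5], [2.0, 1.0])` weights the first index twice as heavily as the second before summing.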

Step S130: and for each piece of information to be recommended, fusing the first recommendation values corresponding to the information to obtain a second recommendation value of the information.

Step S140: and determining target recommendation information of the target object from at least one piece of information to be recommended according to the second recommendation value of each piece of information to be recommended.

The first recommendation values corresponding to the multiple evaluation indexes reflect the possibility that the information to be recommended is selected in multiple different evaluation dimensions. To evaluate the information comprehensively, the first recommendation values of the dimensions can be fused to obtain a comprehensive recommendation value of the information, namely the second recommendation value, and the target recommendation information can then be determined from the pieces of information to be recommended based on their comprehensive recommendation values. The manner of fusing the first recommendation values may be configured according to actual application requirements. Optionally, the sum of the first recommendation values or their average may be used as the second recommendation value; alternatively, a weight may be preset for each evaluation index, and the second recommendation value may be obtained by weighting and summing the first recommendation values with those weights.

After the second recommendation value of each piece of information to be recommended is obtained, according to the sequence from high recommendation value to low recommendation value, a set number of pieces of information to be recommended which are ranked in the front can be determined as the target recommendation information, or the pieces of information to be recommended of which the second recommendation value is greater than or equal to a set threshold value can be determined as the target recommendation information.
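Steps S130 and S140 can be sketched together as follows. The names `second_value` and `select_targets`, the candidate identifiers, and the sample values are hypothetical; the sketch covers the sum and weighted-sum fusion options and both selection rules (top-N and threshold) described above.

```python
from typing import Dict, List, Optional, Sequence


def second_value(first_values: Sequence[float],
                 weights: Optional[Sequence[float]] = None) -> float:
    """Fuse first recommendation values: plain sum, or weighted sum if weights are given."""
    if weights is None:
        return sum(first_values)
    return sum(w * v for w, v in zip(weights, first_values))


def select_targets(candidates: Dict[str, Sequence[float]],
                   weights: Optional[Sequence[float]] = None,
                   top_n: Optional[int] = None,
                   threshold: Optional[float] = None) -> List[str]:
    """Rank candidates by second recommendation value, then apply threshold and/or top-N."""
    scored = {cid: second_value(values, weights) for cid, values in candidates.items()}
    ranked = sorted(scored, key=lambda cid: scored[cid], reverse=True)
    if threshold is not None:
        ranked = [cid for cid in ranked if scored[cid] >= threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


# Each candidate maps to [reading-duration value, click-rate value] (illustrative numbers).
news = {"a": [0.9, 0.2], "b": [0.4, 0.8], "c": [0.1, 0.1]}
top_two = select_targets(news, top_n=2)
```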

According to the information recommendation method provided by the embodiment of the application, when the target information to be recommended is determined, the recommendation of each piece of information to be recommended corresponding to a plurality of different recommendation evaluation indexes can be predicted through a trained multi-target prediction model, namely, the recommendation values of each piece of information to be recommended corresponding to a plurality of recommendation evaluation indexes are comprehensively considered, and whether one piece of information to be recommended is used as the target recommendation information or not is comprehensively measured from a plurality of different dimensions. Further, when a multi-target prediction model is obtained through training, in the method provided by the embodiment of the application, each training sample includes model input data, and the use behavior information of each sample object for the application is also considered, and the training loss value of the corresponding training sample is corrected by using the information, so that the training loss condition of the model can better accord with the actual behavior data of the sample object, the model can learn more accurate model parameters based on different use behavior information, the performance of the model is improved, the accuracy of predicting the first recommendation value corresponding to the information to be recommended through the model is improved, a basis is provided for screening the target recommendation information which better accords with the target object, the application requirements are better met, and the use perception of the user on the application is improved.

In an optional embodiment of the present application, the neural network model includes a multitask learning model and a first network model; the multi-target prediction model is obtained by training in the following way:

repeatedly executing the following operations on the neural network model based on each training sample until the total training loss value meets the training ending condition, and determining the multi-task learning model at the training ending time as a multi-target prediction model:

In this scheme, each predicted recommendation value corresponding to a training sample (that is, a first recommendation value corresponding to the sample recommendation information in the sample data) is predicted by the multitask learning model, and the loss correction coefficient corresponding to each recommendation evaluation index is predicted by the first network model. The input of the multitask learning model is the model input data (sample object information and sample recommendation information) of each training sample, or the data obtained by vectorizing it (which may be referred to as vectorized data), and the output is the predicted recommendation value corresponding to each evaluation index. The input of the first network model is the target information of the sample object in each training sample, or the vectorized data of that information, and the output is the loss correction coefficient corresponding to each evaluation index.

For each training, the training loss value of each training sample corresponding to each evaluation index (that is, the value of the loss function corresponding to each evaluation index) can be obtained by calculating the difference between the predicted recommended value of each training sample corresponding to each evaluation index and the standard label, the training loss value of the corresponding index is corrected by using the loss correction coefficient corresponding to each evaluation index, the training loss values corresponding to each corrected evaluation index are added to obtain the training loss value of the sample, and the training loss values of each training sample are summed to obtain the training total loss value of the model. If the training total loss value corresponding to a certain training meets the training end condition, the multi-task learning model at the moment can be used as a multi-target prediction model, or a plurality of test samples are adopted to test the multi-task learning model at the moment, the multi-task learning model meeting the preset testing end condition is used as the multi-target prediction model, and if the multi-task learning model does not meet the training end condition and the testing end condition, the model can be trained on the basis of the training samples continuously until the training end condition and the testing end condition are met. If the total loss value of the training does not meet the training end condition, the model parameters of the model can be adjusted by adopting a gradient descent algorithm, and the adjusted model is continuously trained.

In an optional embodiment of the present application, a loss correction coefficient of a training sample corresponding to an evaluation index characterizes a correlation between the sample and the evaluation index;

for each training sample, determining a training loss value of the sample based on the training loss value and the loss correction coefficient of the sample corresponding to each evaluation index, wherein the training loss value of the sample comprises the following steps:

That is, the weight of the training loss value of a training sample corresponding to each evaluation index may be predicted based on the target information of the training sample (i.e., the usage behavior information of the sample object corresponding to the second application). The weight of a training sample corresponding to an evaluation index may be understood as the importance of that sample, in the dimension of that index, for model training, that is, the confidence of the training sample for the evaluation index, or the reliability of the sample data. With this scheme, the model pays more attention during training to the training loss of the evaluation indexes for which a training sample has higher confidence, thereby improving the performance of the model.

Taking two recommendation evaluation indexes as an example, the training Loss value Loss of one training sample can be expressed as follows:

$$\mathrm{Loss} = \mathrm{loss1} \cdot s_1 + \mathrm{loss2} \cdot s_2 \tag{1}$$

where loss1 and loss2 respectively represent the training loss values of the training sample corresponding to the two evaluation indexes, and $s_1$ and $s_2$ respectively represent the training correction coefficients of loss1 and loss2.
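As a worked instance of formula (1), with values assumed purely for illustration (they do not come from the embodiment): suppose the training sample has loss1 = 0.4 for the first index and loss2 = 1.0 for the second, and the first network model outputs $s_1 = 0.5$ and $s_2 = 2.0$. Then

$$\mathrm{Loss} = 0.4 \cdot 0.5 + 1.0 \cdot 2.0 = 2.2$$

so the second index, which has the larger coefficient, contributes most of this sample's training loss.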

In an alternative embodiment of the present application, the loss correction factor of a training sample corresponding to an evaluation index characterizes the likelihood that the sample is a noise sample of the evaluation index;

for each training sample, determining a training loss value of the sample based on the training loss value and the loss correction coefficient of the sample corresponding to each evaluation index may include:

In this alternative, the noise parameter (i.e., the above-mentioned training correction coefficient) of the sample corresponding to the training loss value of each evaluation index may be predicted based on the usage behavior information of the sample object of the training sample corresponding to the second application, and the likelihood that the training sample is the noise sample on the evaluation index may be understood as the value of the noise parameter of one evaluation index corresponding to one training sample. Accordingly, after the training correction coefficients of the training samples corresponding to the respective evaluation indexes are predicted by the first network model, the corresponding weights may be determined based on the coefficients, for example, the inverse of the training correction coefficient of a training sample corresponding to an evaluation index or the inverse of the square of the training correction coefficient may be used as the weights of the sample corresponding to the evaluation index. Similarly, when the scheme is adopted to train the model, for each training sample, the model can pay less attention to the noise sample of the evaluation index, namely, the model can pay more attention to the training sample with higher confidence coefficient of the evaluation index, so that the purpose of improving the performance of the model is achieved.

In an optional embodiment of the present application, the target loss function further includes a regular correction term, and for each training sample, the method may further include:

determining a training loss value of the sample based on the weighted training loss values of the sample corresponding to the evaluation indexes, including:

Optionally, in practical application, an excessively large difference between the training correction coefficients of a training sample corresponding to the evaluation indexes can seriously unbalance the training of the model. For example, if the training correction coefficient of a certain evaluation index is 0 or approaches 0, the target loss function of the model may be difficult to converge, or the loss gradients of the loss functions corresponding to different evaluation indexes may differ too much. To avoid this, a regular correction term may be set during training to constrain the training correction coefficients of the evaluation indexes from differing too much. That is, the regular correction term penalizes the value of the total training loss, which improves the training speed of the model.

Optionally, the expression of the regular correction term may be logS, where S represents a product of training correction coefficients corresponding to the evaluation indexes.

Also taking two recommendation evaluation indexes as an example, the training Loss value Loss of one training sample can be expressed as follows:

$$\mathrm{Loss} = \mathrm{loss1} \cdot \frac{1}{2 s_1^2} + \mathrm{loss2} \cdot \frac{1}{2 s_2^2} + \log(s_1 s_2) \tag{2}$$

where loss1 and loss2 respectively represent the training loss values of the training sample corresponding to the two evaluation indexes, $s_1$ and $s_2$ respectively represent the training correction coefficients of loss1 and loss2, and $\log(s_1 s_2)$ represents the above-mentioned regular correction term.
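Formula (2) can be evaluated for any number of evaluation indexes with a short helper. This is a sketch under stated assumptions: the function name `noise_weighted_loss` is hypothetical, the logarithm is taken to be the natural logarithm (the source does not give a base), and the coefficients are required to be positive so that the logarithm is defined.

```python
import math
from typing import Sequence


def noise_weighted_loss(index_losses: Sequence[float],
                        noise_params: Sequence[float]) -> float:
    """Per-sample loss: sum of loss_i / (2 * s_i^2) plus log of the product of all s_i."""
    if len(index_losses) != len(noise_params):
        raise ValueError("one coefficient per evaluation index is required")
    if any(s <= 0.0 for s in noise_params):
        raise ValueError("coefficients must be positive for the log term")
    weighted = sum(loss / (2.0 * s * s) for loss, s in zip(index_losses, noise_params))
    # log(s_1 * s_2 * ...) computed as a sum of logs (natural log assumed).
    regular_term = sum(math.log(s) for s in noise_params)
    return weighted + regular_term


# Illustrative values only: loss1 = 2.0, loss2 = 8.0, s1 = 1.0, s2 = 2.0.
sample_loss = noise_weighted_loss([2.0, 8.0], [1.0, 2.0])
```

In this illustration each weighted term equals 1.0, so the result is 2.0 plus the natural logarithm of 2. A larger coefficient shrinks the weighted term of its index while increasing the regular correction term.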

The optional implementation schemes provided by the application can be suitable for any information recommendation system, and based on the scheme provided by the application, the accuracy of real personalized feedback of the user can be further improved, and the user requirements can be better met. In order to better illustrate the utility of the solution provided in the present application, an alternative embodiment of the present application is described below with reference to a specific application scenario.

The application scenario of this embodiment is a news recommendation scenario, the first application is a news app, the information to be recommended is news, and the target object is a user of the application. The recommendation evaluation indexes in the scene are illustrated by taking two examples, including the reading duration and the click rate of news. Based on the scheme provided by the embodiment of the application, the target news (namely the target recommendation information) which better meets the requirements of the user can be screened from a plurality of candidate recommendation news (namely the information to be recommended).

In a news recommendation scene, the data indexes that the application seeks to improve generally include two targets: the average reading duration and the average number of articles read. That is, not only should the reading duration of the user be increased, but also the number of articles the user reads. In this scene, the multi-target prediction model is a ranking model for obtaining multi-target gains, and its quality directly determines the gains in the data indexes. Although various information recommendation modes exist in the prior art, most of them treat all groups of people uniformly and have difficulty increasing both targets at once. For example, if the fusion is biased toward duration for the whole population, the average reading duration index increases while the average reading count index decreases or stays flat. This is because duration-biased fusion tends to recommend articles with long reading durations and gives them more weight in the evaluation, so for people with limited time, the number of articles read inevitably drops once the reading time of a single article grows.

Moreover, existing schemes do not adequately take the characteristics of different user groups into account and therefore cannot serve different groups well. Different groups have different attributes and different purposes for consuming news. Some users have plenty of time to spend, and the time they devote to news reading is elastic, so interest accounts for a larger share of their consumption purpose and they are more likely to read immersively. Other users have limited time to spend, and the time they devote to news reading is essentially fixed, so information acquisition accounts for a larger share of their consumption purpose, they are less likely to read immersively out of interest, and they prefer to obtain more short pieces of information. Because existing schemes ignore these differences, their recommendation effect needs to be improved.

The information recommendation method provided by the embodiment of the application fully considers the differences among user groups. During training of the multi-target prediction model, the training loss of each sample can be corrected based on the target information of that training sample, which yields a model trained adaptively according to user activity. Therefore, when the trained multi-target prediction model predicts the first recommendation values of the information to be recommended for the multiple indexes, its behavior can differ from group to group. For users with abundant, elastic time, the number of articles read stays essentially unchanged when the reading time per article is lengthened, so the duration index rises without the reading count falling, and the model can lean toward optimizing reading duration during prediction. For users with limited time, the model can give priority to optimizing the click rate and recommend shorter articles, so that with a fixed total reading time the number of articles read increases. In this way, the accuracy of the user's real, personalized interest feedback is improved, the multi-target data gains of the application are increased, and the user stickiness of the application is improved.

An alternative embodiment of the news (information in this scenario, or may also be referred to as an article) recommendation method in this application scenario is described in detail below.

Fig. 3 shows a schematic structural diagram of an alternative news recommendation system in the application scenario. As shown in fig. 3, the system includes a terminal device 10 of a user, a server side of the first application (the application server 20 shown in fig. 3), and a model training apparatus 30; the terminal device 10 and the application server 20 communicate with each other through a network. A news APP may be installed on the terminal device 10, and the user can read news by opening the client of the APP (the news client shown in the figure).

The model training device 30 may be configured to train a neural network model based on a training sample, so as to obtain a trained multi-target prediction model. The trained multi-target prediction model may be deployed in an application server 20, and the application server 20 may be configured to execute the information recommendation method provided in the embodiment of the present application, and predict the first recommendation value of each candidate recommended news based on the trained multi-target prediction model, so that the target news is determined from the multiple candidate recommended news based on the first recommendation value of each candidate recommended news, and may be pushed to a target user (which may be any user of the application), that is, to a terminal device of the user, and is displayed to the user through a news client.

The following describes a flow of a news recommendation method in the application scenario in conjunction with the news recommendation system shown in fig. 3, and as shown in fig. 4, the method may include the following steps 41 to 44.

Step 41 and step 42: sample collection and sample processing, i.e. obtaining a training data set (a large number of training samples).

Step 43: model training, i.e., training based on a training data set, results in a multi-objective predictive model (i.e., the available model shown in FIG. 4).

Step 44: online prediction with the model, i.e., selecting target recommendation information from the information to be recommended based on the available model.

Steps 41 to 43 may be executed by the model training apparatus 30, and step 44 by the application server 20. The embodiment of the application does not limit how the training samples are obtained. Optionally, with the user's authorization, relevant data may be collected, such as the user information of news APP users, how often and for how long each user uses the APP, and the user's reading behavior, click behavior, and exposure logs for news historically recommended to the user. The collected sample data is then processed, for example by data cleaning and sampling, to obtain training samples that meet the preset data format requirements, and the neural network model can be trained on these samples to obtain the available model.

The neural network model adopted during training comprises a multitask learning model and a first network model, the multitask learning model in the embodiment comprises a second network model (click rate estimation model for short) corresponding to click rate and a second network model (duration estimation model for short) corresponding to reading duration, the click rate estimation model is used for predicting a first recommended value of the training sample corresponding to the click rate, and the duration estimation model is used for predicting a first recommended value of the training sample corresponding to the reading duration. The model training device 30 trains the neural network model by using a training data set, and the trained multi-task learning model is the multi-target prediction model in the application scenario.

As an example, fig. 5 is a schematic diagram of the structure and training principle of a neural network model provided in an embodiment of the present application. As shown in fig. 5, the click rate estimation model in this example includes a first feature extraction layer A, 3 second feature extraction layers (i.e., feature extraction layers 1 to 3), an attention network A (the feature fusion layer in this example), and a recommended value prediction layer A. The duration estimation model includes a first feature extraction layer B, 3 second feature extraction layers (i.e., feature extraction layers 1 to 3), an attention network B, and a recommended value prediction layer B. The click rate estimation model and the duration estimation model share the 3 second feature extraction layers. The first network model in this example comprises a feature extraction network and a gating network (i.e., a Gate network) cascaded in sequence. The specific structure of each network layer may be selected according to actual needs and is not limited by this embodiment; for example, the recommended value prediction layer A and the recommended value prediction layer B may each adopt a Tower network.

In this embodiment, each training sample includes sample object information of a sample object and sample recommendation information (i.e., sample news), a label of the sample recommendation information corresponding to the click rate index (the click rate label), a label of the sample recommendation information corresponding to the reading duration index (the duration label), and target information of the sample object. The sample object information includes user portrait data of the sample object, which covers the target information as well as other related information.
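The composition of a training sample described above can be pictured with a small data structure. This is only an illustrative sketch: the class name, field names, and example values are hypothetical and are not defined by this embodiment.

```python
from dataclasses import dataclass, field

@dataclass
class TrainingSample:
    """One training sample; all field names here are hypothetical."""
    object_info: dict           # sample object information (user portrait data)
    recommendation_info: dict   # sample recommendation information (sample news)
    click_label: int            # label for the click rate index: 1 = clicked, 0 = not clicked
    duration_label: float       # label for the reading duration index, e.g. in seconds
    target_info: dict = field(default_factory=dict)  # e.g. user activity information

sample = TrainingSample(
    object_info={"age_group": "25-34"},
    recommendation_info={"news_id": "n1", "category": "tech"},
    click_label=1,
    duration_label=42.0,
    target_info={"visits_per_week": 5},
)
```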

When the neural network model shown in fig. 5 is trained based on a training data set, the sample object information, sample recommendation information, and target information of each training sample may be vectorized to obtain vectorized features. For each training sample, a first information feature of the sample object information and a second information feature of the sample recommendation information are spliced, and the spliced feature is input to the feature extraction layer A, the 3 second feature extraction layers, and the feature extraction layer B, respectively. For the click rate estimation model, the first feature extracted by the feature extraction layer A is used as the query vector of the attention network A, and the second features extracted by the 3 second feature extraction layers are used as the key vectors of the attention network A. The similarity between the query vector and each key vector is calculated, the similarities corresponding to the three key vectors are used as weights, and the three key vectors are weighted and summed to obtain the fused feature. The fused feature is input to the recommended value prediction layer A, which predicts the first recommended value of the sample corresponding to the click rate index (the predicted value shown in fig. 5). Similarly, the first recommended value of the sample corresponding to the reading duration index (the time length score shown in fig. 5) can be obtained through the attention network B and the recommended value prediction layer B. Based on the first recommended value and the label of the training sample corresponding to the click rate index, a first loss value loss_ctr corresponding to the click rate index, i.e., loss1 in the foregoing expression (1) or (2), can be calculated. Based on the first recommended value and the label of the training sample corresponding to the reading duration index, a second loss value loss_time corresponding to the reading duration index, i.e., loss2 in the foregoing expression (1) or (2), can be calculated.
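The attention-based feature fusion described above can be illustrated with a minimal sketch. It assumes dot-product similarity followed by softmax normalization, neither of which is specified by this embodiment; the function name and the toy vectors are hypothetical.

```python
import math

def attention_fuse(query, keys):
    """Fuse key vectors by their similarity to the query.

    Assumption: similarity = dot product, normalized with a softmax.
    """
    # Similarity between the query vector and each key vector.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Normalize the similarities into weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of the key vectors is the fused feature.
    dim = len(keys[0])
    fused = [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(dim)]
    return fused, weights

# Toy example: one query (from the first feature extraction layer)
# and three keys (from the three shared second feature extraction layers).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused, weights = attention_fuse(query, keys)
```

In this sketch the key most similar to the query receives the largest weight, so the fused feature leans toward the shared layer whose output best matches the index-specific feature.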

For each training sample, the target information of the sample, or the feature obtained by vectorizing the target information (the user activity feature shown in fig. 5), may be input into the feature extraction network of the independent first network model for learning. The learned feature then passes through a one-layer gating network to obtain a loss correction coefficient s₁ corresponding to the click rate index (i.e., the loss correction coefficient of loss_ctr) and a loss correction coefficient s₂ corresponding to the reading duration index (i.e., the loss correction coefficient of loss_time). The loss correction coefficients s₁ and s₂ may be two values in the range (0, 1).
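A one-layer gating network that outputs two coefficients in (0, 1) might look like the sketch below. The use of a linear unit followed by a sigmoid is an assumption made for illustration (the embodiment only states that the values lie between 0 and 1), and the weights, biases, and feature values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gate(features, weights, biases):
    """One gating layer: one linear unit per evaluation index, squashed into (0, 1).

    Assumption: sigmoid activation; the source does not name the activation.
    """
    return [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]

# Hypothetical learned user-activity feature and gate parameters.
activity_features = [0.8, 0.1, 0.3]
gate_weights = [[0.5, -0.2, 0.1],   # unit for the click rate index
                [-0.4, 0.3, 0.2]]   # unit for the reading duration index
gate_biases = [0.0, 0.1]
s1, s2 = gate(activity_features, gate_weights, gate_biases)
```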

For each training sample, the training loss value of each sample can be calculated by expression (1) or expression (2) in the foregoing. In this embodiment, the training loss value for training one training sample can be expressed as:

Loss = loss_ctr * s₁ + loss_time * s₂

or,

Loss = loss_ctr * (1/(2s₁²)) + loss_time * (1/(2s₂²)) + log(s₁s₂)
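The two alternative forms of the per-sample training loss can be written out as a short sketch. The function names are hypothetical, and the natural logarithm is assumed for the log term.

```python
import math

def weighted_loss(loss_ctr, loss_time, s1, s2):
    """First form: the loss correction coefficients act as weights."""
    return loss_ctr * s1 + loss_time * s2

def noise_aware_loss(loss_ctr, loss_time, s1, s2):
    """Second form: each loss is scaled by 1/(2*s^2), plus the term log(s1*s2).

    Assumption: natural logarithm.
    """
    return loss_ctr / (2 * s1 ** 2) + loss_time / (2 * s2 ** 2) + math.log(s1 * s2)

# Toy values for one training sample.
first_form = weighted_loss(0.4, 2.0, 0.5, 0.25)
second_form = noise_aware_loss(0.4, 2.0, 0.5, 0.25)
```

In the second form, a smaller coefficient inflates the scaled loss while the log term decreases, so the two parts pull the coefficient in opposite directions.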

Optionally, the label corresponding to the click rate index may be a binary label, i.e., clicked or not clicked, where a click is labeled 1 and a non-click is labeled 0. If the label of a training sample is 1, the sample object clicked the sample recommendation information. The loss function corresponding to this index may be a cross-entropy loss function. The label for the reading duration index may be a specific duration, and the loss function corresponding to this label may be a mean-square error (MSE) loss function.
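For a single training sample, the two per-index losses mentioned above could be computed as in the following sketch. The function names and the clipping constant are hypothetical; the duration values are arbitrary toy numbers.

```python
import math

def cross_entropy(pred, label, eps=1e-12):
    """Binary cross-entropy for the click rate index (label is 0 or 1)."""
    pred = min(max(pred, eps), 1.0 - eps)  # avoid log(0)
    return -(label * math.log(pred) + (1 - label) * math.log(1.0 - pred))

def squared_error(pred, target):
    """Squared error for the reading duration index; averaging over samples gives the MSE."""
    return (pred - target) ** 2

loss_ctr = cross_entropy(0.9, 1)       # confident and correct: small loss
loss_time = squared_error(30.0, 42.0)  # predicted 30 s, observed 42 s
```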

After the training loss value of each training sample is calculated, the training loss values of all the training samples can be added to obtain the total training loss of the neural network model. If the total training loss does not meet the training end condition, the model parameters are adjusted and the training process is repeated until the training end condition is met, and the multi-task learning model at the end of training is taken as the multi-target prediction model. Through the interaction between user-activity gating and the multi-task learning model, the model training approach provided by the application yields a model whose multi-target tendencies are optimized per user group, and realizes adaptive training based on user activity (i.e., the target information).

After the trained multi-target prediction model is obtained, the multi-target prediction model can be put into online use, specifically, the application server 20 can determine at least one target news which can be recommended to the target user from the multiple news to be recommended by calling the trained multi-target prediction model based on the user portrait of the target user and the multiple news to be recommended, and recommend the target news to the user.

Fig. 6 is a schematic diagram illustrating an alternative process for determining target news provided in this embodiment. As shown in fig. 6, the user portrait is the object information of the target object, and the content pool is a news database in which a large amount of news is stored. The news recommendation system of this embodiment may comprise a user portrait module, a recall module, a multi-index estimation module, and a rearrangement module. The user portrait of the target user is obtained through the user portrait module; the recall module screens coarsely ranked articles (i.e., news) from the content pool and outputs them to the multi-index estimation module; the multi-index estimation module finely ranks the recalled articles by predicting the second recommendation value of each; and the rearrangement module reorders the articles according to other ranking strategies before they are finally recommended to the user. The following describes the procedure for determining the articles to be finally recommended to the user with reference to these modules:

Step S61: recalling the news to be recommended through the recall module.

Optionally, based on the preference of the user, or according to the popularity of the news in the content pool, a preset number of news with higher popularity can be screened out from the news as the news to be recommended.
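Popularity-based recall of a preset number of news items can be sketched as follows. The `popularity` field, the function name, and the toy content pool are hypothetical; how popularity is measured is not specified here.

```python
import heapq

def recall_by_popularity(content_pool, n):
    """Return the n most popular news items, most popular first."""
    return heapq.nlargest(n, content_pool, key=lambda item: item["popularity"])

content_pool = [
    {"id": "n1", "popularity": 120},
    {"id": "n2", "popularity": 950},
    {"id": "n3", "popularity": 430},
    {"id": "n4", "popularity": 15},
]
candidates = recall_by_popularity(content_pool, 2)
```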

Step S62: multi-index estimation. By calling the trained multi-target prediction model, the multi-index estimation module predicts, based on the user portrait, a first recommendation value corresponding to the click rate index and a first recommendation value corresponding to the reading duration index for each piece of news to be recommended, and obtains a second recommendation value for each piece of news by fusing the recommendation values of the two indexes. The news to be recommended is then sorted in descending order of second recommendation value, and a specified number of target news items are selected according to the sorting result.
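The fuse-then-rank flow of this step can be sketched as below. The fusion rule is an assumption: a weighted sum with hypothetical weights is used purely for illustration, and both first recommendation values are assumed to have been scaled to [0, 1] beforehand. The function names and toy scores are likewise hypothetical.

```python
def fuse(ctr_score, duration_score, w_ctr=0.5, w_duration=0.5):
    """Fuse two first recommendation values into a second recommendation value.

    Assumption: weighted sum; the weights are hypothetical.
    """
    return w_ctr * ctr_score + w_duration * duration_score

def rank_candidates(first_values, top_n):
    """Sort candidates by second recommendation value (descending) and keep top_n."""
    second = {news_id: fuse(ctr, dur) for news_id, (ctr, dur) in first_values.items()}
    ranked = sorted(second, key=second.get, reverse=True)
    return ranked[:top_n], second

# news_id -> (click rate score, reading duration score), toy values
first_values = {"a": (0.9, 0.2), "b": (0.4, 0.9), "c": (0.1, 0.3)}
targets, second_values = rank_candidates(first_values, top_n=2)
```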

Optionally, the news to be recommended may include multiple types of news, and after sorting the news to be recommended according to the second recommendation value from large to small, a certain amount of target information may be screened out for each news type according to the sorting order.
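Selecting a fixed number of items per news type from an already sorted list can be sketched as follows. The `type` and `id` fields and the function name are hypothetical.

```python
from collections import defaultdict

def top_k_per_type(ranked_news, k):
    """Keep at most k items per news type, preserving the ranking order.

    ranked_news must already be sorted by second recommendation value, descending.
    """
    picked = defaultdict(list)
    for news in ranked_news:
        bucket = picked[news["type"]]
        if len(bucket) < k:
            bucket.append(news["id"])
    return dict(picked)

ranked_news = [
    {"id": "n1", "type": "sports"},
    {"id": "n2", "type": "tech"},
    {"id": "n3", "type": "sports"},
    {"id": "n4", "type": "sports"},
    {"id": "n5", "type": "tech"},
]
selection = top_k_per_type(ranked_news, k=2)
```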

Step S63: rearranging the screened target information according to a preset strategy; for example, target news of different content styles may be mixed and reordered. The rearranged target information is recommended to the target user, i.e., sent to the user terminal of the target user and displayed to the target user through the client of the news APP.
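One possible rearrangement strategy is a round-robin interleave across content styles, sketched below. This particular strategy is an assumption chosen for illustration; the embodiment leaves the preset strategy open, and the style names and function name are hypothetical.

```python
from itertools import zip_longest

def interleave_by_style(news_by_style):
    """Mix target news of different content styles in round-robin order."""
    missing = object()  # marks exhausted style lists
    mixed = []
    for group in zip_longest(*news_by_style.values(), fillvalue=missing):
        mixed.extend(item for item in group if item is not missing)
    return mixed

news_by_style = {
    "text": ["t1", "t2", "t3"],
    "video": ["v1"],
    "gallery": ["g1", "g2"],
}
feed = interleave_by_style(news_by_style)
```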

In the whole recommendation process, the multi-index estimation module plays a pivotal role. The estimation module depends on the multi-target prediction model (also called the ranking model), and the quality of that model determines the accuracy of the recommendation result. The multi-target prediction model adopted by the embodiment of the application can behave differently for different user groups, so that for one part of the population the reading duration increases while the number of articles read is maintained, and for the other part the number of articles read increases while the total reading duration is maintained. An optimization effect tailored to the different groups is thus achieved across the whole population; for this application scenario, both the reading duration index and the reading count (click rate) index of the news APP increase, and better personalized recommendation is achieved.

It can be understood that the scheme of the embodiment of the present application may be applied to the above-mentioned news recommendation scenarios, and may also be applied to the filtering of target recommendation information in other recommendation scenarios.

Based on the same principle as the method provided by the embodiment of the present application, the embodiment of the present application further provides an information recommendation apparatus, as shown in fig. 7, the information recommendation apparatus 100 may include an information acquisition module 110, a recommendation value determination module 120, and a target information determination module 130. Wherein:

an information obtaining module 110, configured to obtain object information of a target object of a first application and at least one piece of information to be recommended;

the recommendation value determining module 120 is configured to, for each piece of information to be recommended, predict, through a multi-objective prediction model, a first recommendation value of the piece of information to be recommended, which corresponds to each of at least two recommendation evaluation indexes, based on object information of a target object and the piece of information to be recommended, and obtain, by fusing the first recommendation values corresponding to the piece of information to be recommended, a second recommendation value of the piece of information to be recommended;

the target information determining module 130 is configured to determine target recommendation information of a target object from at least one piece of information to be recommended according to a second recommendation value of each piece of information to be recommended;

Optionally, the loss correction coefficient of a training sample corresponding to an evaluation index characterizes the possibility that the sample is a noise sample of the evaluation index; for each training sample, the model training device, in determining the training loss value for that sample, is to:

acquiring a first information characteristic of object information of a target object and a second information characteristic of the information to be recommended; and splicing the first information characteristic and the second information characteristic, respectively inputting the spliced information characteristic into a second network model of each evaluation index, and predicting through the second network model of each evaluation index to obtain a first recommendation value of the information to be recommended corresponding to the evaluation index.

an access frequency of the sample object to access the second application;

a length of time the sample object is used for the second application;

The apparatus of the embodiment of the present application may execute the method provided by the embodiment of the present application, and the implementation principle is similar, the actions executed by the modules in the apparatus of the embodiments of the present application correspond to the steps in the method of the embodiments of the present application, and for the detailed functional description of the modules of the apparatus, reference may be specifically made to the description in the corresponding method shown in the foregoing, and details are not repeated here.

Based on the same principle as the information recommendation method and apparatus provided in the embodiments of the present application, an embodiment of the present application further provides an electronic device (e.g., a server), where the electronic device may include a memory, a processor, and a computer program stored in the memory, and the processor executes the computer program to implement the steps of the method provided in any optional embodiment of the present application.

Optionally, fig. 8 shows a schematic structural diagram of an electronic device to which the embodiment of the present application is applicable. As shown in fig. 8, the electronic device 4000 includes a processor 4001 and a memory 4003. The processor 4001 is coupled to the memory 4003, for example via a bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004, which may be used for data interaction between the electronic device and other electronic devices, such as transmitting and/or receiving data. In practical applications, the transceiver 4004 is not limited to one, and the structure of the electronic device 4000 does not constitute a limitation on the embodiments of the present application.

The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 4001 may also be a combination that performs a computational function, including, for example, a combination of one or more microprocessors, a combination of a DSP and a microprocessor, or the like.

Bus 4002 may include a path that carries information between the aforementioned components. The bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 4002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 8, but this is not intended to represent only one bus or type of bus.

The memory 4003 may be a ROM (Read Only Memory) or another type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store a computer program and that can be read by a computer, without limitation.

The memory 4003 is used for storing the computer programs that implement the embodiments of the present application, and their execution is controlled by the processor 4001. The processor 4001 is used to execute the computer programs stored in the memory 4003 to implement the steps shown in the foregoing method embodiments.

Embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, and when being executed by a processor, the computer program may implement the steps and corresponding contents of the foregoing method embodiments.

Embodiments of the present application further provide a computer program product, which includes a computer program, and when the computer program is executed by a processor, the steps and corresponding contents of the foregoing method embodiments can be implemented.

It should be understood that, although each operation step is indicated by an arrow in the flowchart of the embodiment of the present application, the implementation order of the steps is not limited to the order indicated by the arrow. In some implementation scenarios of the embodiments of the present application, the implementation steps in the flowcharts may be performed in other sequences as desired, unless explicitly stated otherwise herein. In addition, some or all of the steps in each flowchart may include multiple sub-steps or multiple stages based on an actual implementation scenario. Some or all of these sub-steps or stages may be performed at the same time, or each of these sub-steps or stages may be performed at different times, respectively. In a scenario where execution times are different, an execution sequence of the sub-steps or the phases may be flexibly configured according to requirements, which is not limited in the embodiment of the present application.

The foregoing is only an optional implementation manner of a part of implementation scenarios in this application, and it should be noted that, for those skilled in the art, other similar implementation means based on the technical idea of this application are also within the protection scope of the embodiments of this application without departing from the technical idea of this application.

Claims

1. An information recommendation method, comprising:

for each piece of information to be recommended, predicting to obtain a first recommendation value of the information to be recommended, which corresponds to each evaluation index of at least two recommendation evaluation indexes, through a multi-target prediction model based on the object information of the target object and the information to be recommended;

2. The method of claim 1, wherein the sample data of the sample object includes model input data of the sample object, the model input data corresponding to the label tag of each of the evaluation indicators, and target information of the sample object, and the model input data includes sample object information and sample recommendation information; the neural network model comprises a multitask learning model and a first network model;

the multi-target prediction model is obtained by training in the following way:

repeatedly executing the following operations on the neural network model based on the training samples until the total training loss value meets a training ending condition, and determining a multi-task learning model at the training ending time as the multi-target prediction model:

for each training sample, inputting model input data of each training sample into the multi-task learning model to obtain a prediction recommendation value of the sample corresponding to each evaluation index, and inputting target information of the sample into the first network model to obtain a loss correction coefficient of the sample corresponding to each evaluation index;

for each training sample, determining a training loss value of the loss function of the sample corresponding to each evaluation index based on the predicted recommended value and the label of the sample corresponding to each evaluation index, and determining a training loss value of the sample based on the training loss value and the loss correction coefficient of the sample corresponding to each evaluation index;

and determining the training total loss value based on the training loss value of each training sample, and if the training total loss value does not meet the training end condition, adjusting the model parameters of the neural network model.

3. The method of claim 2, wherein the loss correction factor of a training sample corresponding to an evaluation index characterizes the correlation of the sample with the evaluation index;

for each training sample, the determining the training loss value of the sample based on the training loss value and the loss correction coefficient of the sample corresponding to each evaluation index comprises:

and taking the loss correction coefficient of the sample corresponding to each evaluation index as a first weight of the training loss value of each evaluation index, and performing weighted summation on the training loss values of the sample corresponding to each evaluation index to obtain the training loss value of the sample.

4. The method of claim 2, wherein the loss correction factor of a training sample corresponding to an evaluation index characterizes the likelihood that the sample is a noisy sample of the evaluation index;

for each training sample, determining the training loss value of the sample based on the training loss value and the loss correction coefficient of the sample corresponding to each evaluation index comprises:

5. The method of claim 4, wherein the target loss function further comprises a regular correction term, and for each of the training samples, the method further comprises:

determining the training loss value of the sample based on the weighted training loss value of the sample corresponding to each of the evaluation indexes comprises:

6. The method of claim 5, wherein the regular correction term is expressed as: log S, wherein S represents the product of the loss correction coefficients corresponding to the evaluation indexes.

7. The method according to any one of claims 1 to 6, wherein the multi-objective prediction model includes a second network model corresponding to each of the evaluation indices;

for each piece of information to be recommended, predicting a first recommendation value of the piece of information to be recommended, which corresponds to each evaluation index of at least two recommendation evaluation indexes, through a multi-objective prediction model based on the object information of the target object and the piece of information to be recommended, and the method comprises the following steps:

acquiring a first information characteristic of the object information of the target object and a second information characteristic of the information to be recommended;

and splicing the first information characteristic and the second information characteristic, respectively inputting the spliced information characteristics into the second network models of the evaluation indexes, and predicting to obtain a first recommendation value of the information to be recommended corresponding to the evaluation index through the second network model of each evaluation index.

8. The method of claim 7, wherein each of the second network models comprises a first feature extraction layer, at least two second feature extraction layers, a feature fusion layer, and a recommendation prediction layer;

for each evaluation index, the second network model corresponding to the evaluation index obtains a predicted recommendation value of the information to be recommended corresponding to the evaluation index by performing the following operations:

and obtaining a first recommended value of the information to be recommended, which corresponds to the evaluation index, through a recommended value prediction layer based on the fused features.

9. The method according to any one of claims 1 to 6, wherein the information to be recommended comprises multimedia information, and the at least two evaluation indexes comprise information reading duration and information click rate.

10. The method according to any one of claims 1 to 6, wherein the usage behavior information of one sample object corresponding to the second application comprises at least one of:

an access frequency of the sample object to the second application;

the usage duration of the second application by the sample object;

the sample object aims at the interaction behavior information of the recommendation information in the second application.

11. An information recommendation apparatus, characterized in that the apparatus comprises:

12. An electronic device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to implement the steps of the method of any of claims 1-10.

13. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 10.

14. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 10 when executed by a processor.