CN116452263A - Information recommendation method, device, equipment, storage medium and program product - Google Patents

Information recommendation method, device, equipment, storage medium and program product

Info

Publication number
CN116452263A
Authority
CN
China
Prior art keywords
information
recommendation
target object
model
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210006355.6A
Other languages
Chinese (zh)
Inventor
邵哲
蒋挺宇
潘军伟
张凌寒
陈细华
刘大鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210006355.6A priority Critical patent/CN116452263A/en
Publication of CN116452263A publication Critical patent/CN116452263A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0269Targeted advertisements based on user profile or attribute

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an information recommendation method, apparatus, device, storage medium, and program product. The embodiments of the application can be applied to scenarios such as cloud technology, artificial intelligence, intelligent transportation, and in-vehicle systems, and relate to artificial intelligence technology. The method comprises the following steps: in response to an information request from a target object, acquiring a pre-trained recommendation model, historical associated information of the target object, and the target object's historical behavior information on that associated information; screening candidate recommendation information for the target object; fine-tuning the pre-trained recommendation model based on the features of the historical associated information, the features of the target object, and the historical behavior information, to obtain a target prediction model specific to the target object; predicting, with the target prediction model, the target object's interest parameters for the candidate recommendation information; and screening target recommendation information from the candidate recommendation information based on the interest parameters and returning it to the target object. This approach can improve the accuracy of information recommendation.

Description

Information recommendation method, device, equipment, storage medium and program product
Technical Field
The present disclosure relates to cloud technologies, and in particular, to an information recommendation method, apparatus, device, storage medium, and program product.
Background
The purpose of information recommendation is to recommend information that the user may be interested in according to the interests and preferences of the user, so that the user can perform the conversion action expected by the information delivery party. For example, merchandise links are recommended according to the interests of the user to attract the user to place orders, or related articles are recommended according to the favorite public figures of the user to attract the user to forward, collect, etc.
In the related art, information recommendation is mostly implemented based on artificial intelligence technology: for example, a deep learning model is trained offline using data such as exposure, click, and conversion statistics together with user features; the trained deep learning model is then used online to screen information the user may be interested in, and the information is returned to the user, completing the information recommendation process.
However, a model trained in this way learns mostly general user features, i.e., what users have in common, so the information it selects is not necessarily of interest to any particular user, which ultimately lowers the accuracy of information recommendation.
Disclosure of Invention
The embodiment of the application provides an information recommendation method, device and equipment, a computer readable storage medium and a program product, which can improve the accuracy of information recommendation.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides an information recommendation method, which comprises the following steps:
responding to an information request of a target object, and acquiring a pre-training recommendation model, history associated information of the target object and history behavior information of the target object on the history associated information;
screening candidate recommendation information aiming at the target object;
fine-tuning the pre-training recommendation model based on the characteristics of the history associated information, the characteristics of the target object and the history behavior information to obtain a target prediction model corresponding to the target object;
predicting interest parameters of the target object aiming at the candidate recommendation information through the target prediction model;
and screening target recommendation information from the candidate recommendation information based on the interest parameters, and returning the target recommendation information to the target object.
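The five claimed steps can be sketched as a toy serving flow. Everything below is illustrative: the function and field names are hypothetical, and the per-request fine-tuning is reduced to a trivial history-based bias purely to show the shape of the pipeline, not the patent's actual model.

```python
def recommend(user, info_base, base_model, top_k=2):
    """Toy mirror of the claimed steps (all names hypothetical)."""
    # Steps 1-2: gather history and screen candidates (here: unseen items).
    candidates = [i for i in info_base if i not in user["history"]]
    # Step 3: stand-in for fine-tuning -- bias the shared model toward
    # categories the user has converted on before.
    liked = {cat for cat, _ in user["history"]}
    model = lambda item: base_model(item) + (1.0 if item[0] in liked else 0.0)
    # Step 4: predict an interest parameter for each candidate.
    ranked = sorted(candidates, key=model, reverse=True)
    # Step 5: return the top-K as the target recommendation information.
    return ranked[:top_k]

base_model = lambda item: item[1]                 # shared prior: popularity
user = {"history": [("sports", 0.9)]}
catalog = [("sports", 0.9), ("sports", 0.2), ("news", 0.8), ("games", 0.5)]
print(recommend(user, catalog, base_model))       # [('sports', 0.2), ('news', 0.8)]
```

The point of the sketch is the ordering of steps: the shared model is specialized per request before scoring, rather than one global model scoring every user identically.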
The embodiment of the application provides an information recommendation device, which comprises:
The information acquisition module is used for responding to an information request of a target object and acquiring a pre-training recommendation model, history associated information of the target object and history behavior information of the target object on the history associated information;
the information screening module is used for screening candidate recommendation information aiming at the target object;
the model fine-tuning module is used for fine-tuning the pre-training recommendation model based on the characteristics of the history associated information, the characteristics of the target object and the history behavior information to obtain a target prediction model corresponding to the target object;
the parameter prediction module is used for predicting interest parameters of the target object aiming at the candidate recommendation information through the target prediction model;
the information screening module is further configured to screen target recommendation information from the candidate recommendation information based on the interest parameter, and return the target recommendation information to the target object.
In some embodiments of the present application, the model fine tuning module is further configured to generate an embedded feature and a fine tuning feature based on the feature of the history associated information and the feature of the target object; generating input features of the pre-training recommendation model based on controlling fusion of the embedded features and the fine tuning features; and carrying out parameter adjustment on the pre-training recommendation model according to the input characteristics and the historical behavior information until the fine adjustment ending condition is reached, so as to obtain the target prediction model.
In some embodiments of the present application, the model fine tuning module is further configured to perform parameter prediction on the embedded feature to obtain a control parameter; the control parameters are used for representing whether the embedded features and the fine tuning features are subjected to feature fusion or not; and controlling the fusion of the fine tuning feature and the embedded feature according to the control parameter to obtain the input feature of the pre-training recommendation model.
In some embodiments of the present application, the model fine-tuning module is further configured to determine, when the control parameter indicates that feature fusion should be performed, the fusion result of the embedded feature and the fine-tuning feature as the input feature of the pre-trained recommendation model; and, when the control parameter indicates that feature fusion should not be performed, to determine the embedded feature as the input feature of the pre-trained recommendation model.
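The gate-controlled fusion in these embodiments can be sketched as follows. The gate function, the vector shapes, and the additive form of the fusion are illustrative assumptions, not the patent's prescribed implementation:

```python
import numpy as np

def gated_input_feature(embedded, fine_tune_feat, gate_net):
    """Build the pre-trained model's input feature under gate control.

    gate_net maps the embedded feature to a control parameter in {0, 1}:
    1 means "fuse the fine-tuning feature in", 0 means "pass the embedding
    through unchanged". All names here are illustrative.
    """
    g = gate_net(embedded)                      # control parameter per dim
    return g * (embedded + fine_tune_feat) + (1 - g) * embedded

emb = np.array([1.0, 2.0, 3.0])
ft  = np.array([0.5, 0.5, 0.5])
hard_gate = lambda e: (e > 1.5).astype(float)   # toy gate: fuse large dims
print(gated_input_feature(emb, ft, hard_gate))  # [1.  2.5 3.5]
```

With a hard 0/1 gate this reduces exactly to the two cases described: fusion result when the gate fires, the plain embedding otherwise.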
In some embodiments of the present application, the historical behavior information includes historical conversion information representing whether the target object performed a conversion operation on the historical associated information; the model fine-tuning module is further configured to predict, using the pre-trained recommendation model, a predicted conversion result of the target object for the historical associated information from the input features, and to adjust the parameters of the pre-trained recommendation model according to the difference between the predicted conversion result and the historical conversion information until a fine-tuning end condition is reached, obtaining the target prediction model.
In some embodiments of the present application, the model fine tuning module is further configured to integrate the features of the history associated information and the features of the target object to obtain an integrated sparse feature; transforming the integrated sparse features into dense features to obtain the embedded features; and extracting the characteristics of the embedded characteristics to obtain the fine tuning characteristics.
In some embodiments of the present application, the information screening module is further configured to preliminarily screen a plurality of recall information items from the information base using a recall model; rank the recalled items with a coarse-ranking model to obtain a coarse-ranking information sequence; and determine the first N items of the coarse-ranking information sequence as the candidate recommendation information, where N ≥ 1.
In some embodiments of the present application, the interest parameters include an estimated click-through rate and an estimated conversion rate; the information screening module is further configured to calculate an estimated recommendation profit index based on the estimated click-through rate and the estimated conversion rate, and to determine the K candidate recommendation information items with the largest estimated recommendation profit index as the target recommendation information, where K ≥ 1.
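One common instantiation of the estimated recommendation profit index, assumed here since the patent does not fix the formula, is the eCPM of a conversion-billed ad: pCTR × pCVR × bid × 1000. A toy top-K selection:

```python
def top_k_by_ecpm(candidates, k=1):
    """Rank candidates by an estimated revenue index.

    Assumed index (the patent only says it is computed from the estimated
    click-through rate and conversion rate):
        eCPM = pCTR * pCVR * bid * 1000
    i.e. expected revenue per thousand impressions for conversion billing.
    """
    def ecpm(c):
        return c["pctr"] * c["pcvr"] * c["bid"] * 1000
    return sorted(candidates, key=ecpm, reverse=True)[:k]

ads = [
    {"id": "a", "pctr": 0.05, "pcvr": 0.10, "bid": 8.0},   # eCPM = 40
    {"id": "b", "pctr": 0.02, "pcvr": 0.40, "bid": 6.0},   # eCPM = 48
    {"id": "c", "pctr": 0.08, "pcvr": 0.05, "bid": 9.0},   # eCPM = 36
]
print([c["id"] for c in top_k_by_ecpm(ads, k=2)])  # ['b', 'a']
```

Note that the highest-CTR ad ("c") loses here: ranking on the combined index trades click likelihood against conversion likelihood and bid.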
In some embodiments of the present application, the information recommendation apparatus further includes: a model training module; the model training module is used for acquiring operation data of each information in the information base and an initial recommendation model; the operation data at least comprises exposure data, click data and conversion data; and training the initial recommendation model regularly by utilizing the characteristics of each piece of information in the information base and the operation data to obtain the pre-training recommendation model.
In some embodiments of the present application, the parameter prediction module is further configured to transform the feature of the target object and the feature of the candidate recommendation information to obtain a transformed feature, and extract an adjustment feature from the transformed feature; generating fusion parameters aiming at the conversion characteristics and the adjustment characteristics, and controlling fusion of the conversion characteristics and the adjustment characteristics by utilizing the fusion parameters to obtain characteristics to be predicted; and predicting the feature to be predicted through the target prediction model, and predicting the interest parameter of the target object aiming at the candidate recommendation information.
In some embodiments of the present application, the generating the fusion parameter for the conversion feature and the adjustment feature is implemented by a parameter prediction model, and the performing parameter prediction on the embedded feature to obtain a control parameter is implemented by an initial parameter model;
the model training module is further used for generating sample values obeying preset distribution aiming at the generated control parameters; calculating an updated component of the control parameter according to the sample value; summing the control parameter and the update component to obtain the update parameter; and carrying out parameter optimization on the initial parameter model by utilizing the normalization result of the updated parameters until the optimization ending condition is reached, so as to obtain the parameter prediction model.
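The four reparameterization steps above (draw a sample from a preset distribution, form an update component, add it to the control parameter, normalize) match the shape of the Gumbel relaxation commonly used to make discrete gates differentiable. The sketch below assumes Gumbel noise and sigmoid normalization; the patent names neither, so both are assumptions:

```python
import math, random

def reparameterized_gate(logit, temperature=1.0, sampler=random.random):
    """Differentiable relaxation of a binary control parameter.

    Steps (mirroring the text): sample a value from a preset distribution
    (Gumbel noise here, an assumption), use it as the update component, sum
    it with the control parameter, and normalize the result with a sigmoid.
    """
    u = sampler()                               # sample from Uniform(0, 1)
    gumbel = -math.log(-math.log(u))            # update component
    updated = logit + gumbel                    # control parameter + component
    return 1.0 / (1.0 + math.exp(-updated / temperature))  # normalized gate

random.seed(0)
g = reparameterized_gate(logit=2.0)
print(0.0 < g < 1.0)  # a soft gate usable for gradient-based optimization
```

Because the randomness enters only through the additive noise term, gradients can flow through `logit` during training, which is exactly the "sample is non-differentiable" problem the reparameterization trick addresses.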
The embodiment of the application provides information recommendation equipment, which comprises the following components:
a memory for storing executable instructions;
and the processor is used for realizing the information recommendation method provided by the embodiment of the application when executing the executable instructions stored in the memory.
The embodiment of the application provides a computer readable storage medium, which stores executable instructions for implementing the information recommendation method provided by the embodiment of the application when the executable instructions are executed by a processor.
Embodiments of the present application provide a computer program product, including a computer program or instructions, which when executed by a processor implement the information recommendation method provided in the embodiments of the present application.
The embodiments of the present application have the following beneficial effects: upon receiving an information request from a target object, the information recommendation device fine-tunes the pre-trained recommendation model, which has already learned features common to all users, using the features of the target object, the features of the target object's historical associated information, and the historical behavior information. The result is a target prediction model that has learned the individual characteristics of the target object, i.e., a model customized to that object. The target prediction model can therefore predict the target object's interest parameters for candidate recommendation information more accurately, and the interest parameters are used to screen target recommendation information matching the target object's preferences, ultimately improving the accuracy of information recommendation.
Drawings
Fig. 1 is a schematic architecture diagram of an information recommendation system provided in an embodiment of the present application;
fig. 2 is a schematic structural diagram of the server in fig. 1 according to an embodiment of the present application;
fig. 3 is a flowchart of an information recommendation method according to an embodiment of the present application;
fig. 4 is a schematic illustration showing target recommendation information provided in an embodiment of the present application;
fig. 5 is a second flowchart of an information recommendation method provided in the embodiment of the present application;
FIG. 6 is a schematic diagram of a process for predicting control parameters provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a process for generating input features provided by an embodiment of the present application;
fig. 8 is a flowchart of a method for recommending information according to an embodiment of the present application;
FIG. 9 is a data flow diagram of advertisement pushing provided by an embodiment of the present application;
FIG. 10 is a diagram of a model structure at trim provided by an embodiment of the present application;
FIG. 11 is a graph showing the comparison of AUC provided in the examples of the present application;
fig. 12 is a schematic diagram of MSE comparison provided in an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail with reference to the accompanying drawings, and the described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.
Before further describing embodiments of the present application in detail, the terms and expressions that are referred to in the embodiments of the present application are described, and are suitable for the following explanation.
1) Cloud Technology (Cloud Technology) refers to a hosting Technology for integrating hardware, software, network and other series resources in a wide area network or a local area network to realize calculation, storage, processing and sharing of data.
Cloud technology is a general term for network technology, information technology, integration technology, management platform technology, application technology, and the like based on the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important backbone. Background services of technical network systems, such as video websites, picture websites, and portals, require large amounts of computing and storage resources. With the rapid development of the internet industry, each item may in the future have its own identifier that must be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong system backing, which can only be realized through cloud computing.
2) Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and other directions.
3) Machine Learning (ML) is a multi-domain interdisciplinary, involving multiple disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, etc. It is specially studied how a computer simulates or implements learning behavior of a human to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve own performance. Machine learning is the core of artificial intelligence, a fundamental approach to letting computers have intelligence, which is applied throughout various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, confidence networks, reinforcement learning, transfer learning, induction learning, teaching learning, and the like.
4) The factorizer (Factorization Machine, FM) is a machine learning algorithm based on matrix decomposition, aims to solve the problem of feature combination under sparse data, and is widely applied to click rate estimation models.
5) The deep neural network (Deep Neural Network, DNN) is a neural network with multiple hidden layers, which is one of the most efficient click rate prediction models.
6) Embedding conversion, a method of converting discrete features into continuous vectors. In general, a DNN model converts discrete features into embedding vectors, which, after concatenation or other processing, serve as the model's input layer.
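A minimal sketch of embedding conversion as just defined: each discrete feature id is looked up in a table of dense vectors and the results are concatenated into the model's input layer. The field names and table sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embedding tables for two discrete feature fields (sizes are arbitrary).
tables = {
    "gender": rng.normal(size=(2, 4)),    # 2 ids  -> 4-dim dense vectors
    "city":   rng.normal(size=(100, 4)),  # 100 ids -> 4-dim dense vectors
}

def embed(sample):
    """Look up each discrete id and concatenate, as in a DNN input layer."""
    return np.concatenate([tables[field][idx] for field, idx in sample.items()])

x = embed({"gender": 1, "city": 42})
print(x.shape)  # (8,)
```

In a real model the tables are trainable parameters; here they are fixed random values just to show the lookup-and-concatenate shape.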
7) The Re-parameterization Trick is a method for solving the problem that sampling operations are non-differentiable in deep learning models.
8) Fine-Tuning (Fine-Tuning) is the process of migrating parameters of an already trained model (e.g., a pre-trained model) to a new model to aid in the training of the new model. It is common practice to freeze part of the layers of the pre-training model, train the remaining network layers and the full connection layer.
9) eCPM (effective Cost Per Mille) refers to the effective revenue per thousand impressions and is a key quantitative indicator for assessing revenue.
10) Click-Through Rate (CTR) refers to the ratio of the number of times information is clicked after being presented to the number of times it is presented. Click-through rate is often used to measure the degree of interest in information.
11) Conversion Rate (CVR) is the proportion, among users who clicked on recommended information, of users who performed the conversion behavior expected by the information delivery party; the conversion behavior may be placing an order, favoriting, forwarding, or another behavior that positively affects the information.
12) "In response to" indicates the condition or state on which a performed operation depends; when the dependency is satisfied, the one or more operations performed may occur in real time or with a set delay. Unless otherwise specified, there is no restriction on the order in which multiple operations are performed.
The purpose of information recommendation is to recommend information that the user may be interested in according to the interests and preferences of the user, so that the user can perform the conversion action expected by the information delivery party. For example, merchandise links are recommended according to the interests of the user to attract the user to place orders, or related articles are recommended according to the favorite public figures of the user to attract the user to forward, collect, etc.
In the related art, information recommendation is mostly implemented based on artificial intelligence technology: for example, a deep learning model is trained offline using data such as exposure, click, and conversion statistics together with user features; the trained deep learning model is then used online to screen information the user may be interested in, and the information is returned to the user, completing the information recommendation process.
A model trained in this way learns mostly general user features, i.e., what users have in common, so the information it selects lacks personalization for the individual user, which ultimately lowers the accuracy of information recommendation.
Further, in the related art the amount of per-user feature data is usually small, so training a model on a user's features alone is very prone to overfitting, which makes model training difficult. The related art typically addresses this with a strategy of fine-tuning a model trained on a large dataset: the model is pre-trained on a larger source-domain dataset and then fine-tuned on a smaller target-domain dataset. This not only eases the training difficulty but also transfers the knowledge learned on the source-domain dataset to the target-domain model, yielding better model performance.
Common strategies for fine tuning models based on target domain datasets include: trimming based on regularization terms, trimming based on knowledge distillation, trimming based on gating network control, and the like.
Trimming based on a regularization term means that, instead of learning the model from scratch, regularization is used to implicitly limit the capacity of the network, namely the effective size of the search space, which aids optimization and avoids overfitting. The starting point of fine-tuning conveys information about the source domain and the source task, so the pre-trained model provides a Starting Point (SP) that localizes the function space effectively searched during fine-tuning. Work in this direction generally starts from the L2 regularization term to counter overfitting on small datasets during fine-tuning; the commonly used L2 regularization term is shown in formula (1):

Ω(w) = (α/2)·‖w‖₂²      (1)

and the regularization term anchored at the starting point of the pre-trained model (L2-SP) is shown in formula (2):

Ω(w) = (α/2)·‖w − w⁰‖₂²      (2)

where the starting point SP is w⁰, the knowledge already learned in pre-training; constraining the weights w not to stray far from w⁰ resists overfitting. This penalty is considerably more effective than the usual L2 penalty, and is also more efficient and easier to implement than the strategy of freezing the first layers of the network. Experiments show that L2-SP preserves the memory of the features learned on the source dataset; as a more general regularization method for the overfitting problem, it avoids learning spurious biases on the small dataset, requires no change to the model structure, is simple to implement, and is effective.
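A minimal numeric sketch of the two penalties, assuming the standard L2-SP form with strength α (the patent cites the formulas without restating them, so the exact coefficients are an assumption):

```python
import numpy as np

def l2_penalty(w, alpha=0.01):
    """Plain L2, formula (1): penalize distance from zero."""
    return 0.5 * alpha * float(np.sum(w ** 2))

def l2_sp_penalty(w, w0, alpha=0.01):
    """L2-SP, formula (2): penalize distance from the pre-trained
    starting point w0 instead of from zero."""
    return 0.5 * alpha * float(np.sum((w - w0) ** 2))

w0 = np.array([1.0, -2.0, 0.5])        # weights learned in pre-training
w  = np.array([1.1, -1.8, 0.5])        # weights during fine-tuning
print(l2_sp_penalty(w, w0))            # small: w stayed near the start point
print(l2_penalty(w))                   # larger: plain L2 pulls toward zero
```

The comparison shows the point of L2-SP: weights that remain close to the pre-trained solution are barely penalized, whereas plain L2 would keep pulling them toward zero and away from the transferred knowledge.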
Trimming based on knowledge distillation combines knowledge distillation techniques with fine-tuning techniques. Knowledge distillation can train a smaller network to reach the performance of a complex network on the source dataset or on a large amount of unlabeled data. For each sample in the new task, a pre-trained model is used as a teacher model (Teacher Model); compared with conventional fine-tuning, the output of the teacher model is used to constrain the samples of the new task. The training process in this approach is as follows:
the first step: recording the output of new data on the original network, the shared parameters being noted as θ s The parameters for a particular task are denoted as θ o For the newly added class, increasing the number of nodes of the corresponding full-connection layer, and randomly initializing the weight theta n
And a second step of: the network is trained and the loss function across all tasks is optimized. During training, first freeze θ s And theta o Then train theta n Directing convergence and then retraining all θ s 、θ o And theta n Until convergence.
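The two-phase freeze-then-joint-train schedule above can be mimicked with plain parameter groups. This is a toy sketch: the θ names follow the text, while the array shapes and the dictionary-based "freezing" are illustrative stand-ins for a real framework's `requires_grad` mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameter groups from the two-step procedure (shapes are arbitrary):
groups = {
    "theta_s": {"w": rng.normal(size=(4, 4)), "frozen": True},   # shared
    "theta_o": {"w": rng.normal(size=(4, 2)), "frozen": True},   # old task head
    "theta_n": {"w": rng.normal(size=(4, 3)), "frozen": False},  # new task head
}

def trainable(params):
    """Names of the groups that would receive gradient updates."""
    return [name for name, p in params.items() if not p["frozen"]]

# Phase 1: only the randomly initialized new head is updated.
print(trainable(groups))            # ['theta_n']

# Phase 2: unfreeze everything and train all groups jointly to convergence.
for p in groups.values():
    p["frozen"] = False
print(trainable(groups))            # ['theta_s', 'theta_o', 'theta_n']
```

In a real framework the same effect is obtained by toggling whether each parameter tensor participates in gradient computation between the two phases.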
Trimming based on knowledge distillation improves both the fine-tuning method and the fine-tuning optimization, gaining in efficiency and performance; because the new data is trained under a teacher model, the knowledge learned in the source domain is effectively retained. Its upper bound is joint learning, but it adopts a more efficient method and requires only target data as input.
In the gating network control fine tuning method, a model comprises three networks: the first is a trained pre-training network, the parameters of which are frozen and cannot be trained and updated; the second is a fine tuning network, which is initialized with a pre-trained model, can be fine tuned and updated; the third is a gating network for controlling the characteristics and paths of the network layer.
The gating network is divided into two parts: a feature control part and a network control part. The feature control part decides which features need to be retained or fine-tuned, and the network control part controls whether each layer of the upper network needs fine-tuning. The formula of the feature control part is shown in formula (3):

x_i^out = g_i ⊙ x_i^ft + (1 − g_i) ⊙ x_i^pre      (3)

where x_i^pre is the feature learned for the i-th feature field during pre-training, x_i^ft is the corresponding input during fine-tuning, g_i is the decision value of the i-th feature, taking the value 0 or 1 and predicted by the gate from the one-hot feature x_hot,i of the i-th feature field, i indexes the feature fields, and x_i^out is the output of the feature control part.
Although trimming based on regularization terms and trimming based on knowledge distillation can solve the overfitting caused by too little data, neither approach updates the data features selectively. For example, when the user data contains newly added features, or industry-specific features that differ from the source dataset, the knowledge learned by these two kinds of trimming can hardly guide the new user features, so the fine-tuned model cannot achieve good performance; that is, the fine-tuning effect is poor.
Further, although fine tuning based on gating-network control can update features selectively, this method needs to store the parameters of both the pre-training model and the fine-tuning model at the same time, so the parameter count of the model is large and the model consumes more memory resources during computation.
In addition, in order to consider the individuality of different users in information recommendation, adapted models are trained and stored offline for the different users, for use in online prediction, so that more storage resources are required.
The embodiment of the application provides an information recommendation method, device, equipment, a computer readable storage medium and a program product, which can improve the accuracy of information recommendation. The following describes exemplary applications of the information recommendation device provided in the embodiments of the present application, where the information recommendation device provided in the embodiments of the present application may be implemented as a notebook computer, a tablet computer, a desktop computer, a set-top box, a mobile device (for example, a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, a portable game device), and other various types of user terminals, may also be implemented as a server, and may also be implemented as a device cluster including the server and the terminal. Next, an exemplary application when the information recommendation apparatus is implemented as or in a server will be described.
Referring to fig. 1, fig. 1 is a schematic architecture diagram of an information recommendation system provided in the embodiment of the present application, in order to support an information recommendation application, in the information recommendation system 100, a terminal 400 is connected to a server 200 through a network 300, where the network 300 may be a wide area network or a local area network, or a combination of the two. In the information recommendation system 100, a database 500 is further provided for providing data support to the server 200. Database 500 may be independent of server 200 or may be configured in server 200. Fig. 1 shows an example in which database 500 is independent of server 200.
The terminal 400 is configured to generate an information request in response to an operation of a target object (i.e., a user) on the graphical interface 410, and to transmit the information request to the server 200 through the network 300.
The server 200 is configured to obtain a pre-training recommendation model in response to an information request of a target object, and obtain, from the database 500, historical association information of the target object, and historical behavior information of the target object for the historical association information; screening candidate recommendation information aiming at a target object; fine tuning the pre-training model based on the characteristics of the history associated information, the characteristics of the target object and the history behavior information to obtain a target prediction model of the target object; predicting interest parameters of the target object aiming at candidate recommendation information through a target prediction model; and screening target recommendation information from the candidate recommendation information based on the interest parameters, and returning the target recommendation information to the target object, namely, returning the target recommendation information to the terminal 400.
The terminal 400 is also used to present target recommendation information on the graphical interface 410.
In some embodiments, the server 200 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, big data, and artificial intelligence platforms. The terminal 400 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application.
Referring to fig. 2, fig. 2 is a schematic structural diagram of the server in fig. 1 provided in an embodiment of the present application, and the server 200 shown in fig. 2 includes: at least one processor 210, a memory 250, at least one network interface 220, and a user interface 230. The various components in the server 200 are coupled together by a bus system 240. It is understood that the bus system 240 is used to enable connection and communication between these components. In addition to the data bus, the bus system 240 includes a power bus, a control bus, and a status signal bus. However, for clarity of illustration, the various buses are labeled as the bus system 240 in fig. 2.
The processor 210 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor (for example, a microprocessor or any conventional processor), a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, discrete hardware components, or the like.
The user interface 230 includes one or more output devices 231, including one or more speakers and/or one or more visual displays, that enable presentation of media content. The user interface 230 also includes one or more input devices 232, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 250 optionally includes one or more storage devices physically located remote from processor 210.
Memory 250 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The non-volatile memory may be read-only memory (ROM, Read Only Memory) and the volatile memory may be random access memory (RAM, Random Access Memory). The memory 250 described in the embodiments of the present application is intended to comprise any suitable type of memory.
In some embodiments, memory 250 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 251 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
network communication module 252 for reaching other computing devices via one or more (wired or wireless) network interfaces 220, exemplary network interfaces 220 include: bluetooth, wireless compatibility authentication (Wi-Fi), universal serial bus (USB, universal Serial Bus), and the like;
a presentation module 253 for enabling presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 231 (e.g., a display screen, speakers, etc.) associated with the user interface 230;
an input processing module 254 for detecting one or more user inputs or interactions from one of the one or more input devices 232 and translating the detected inputs or interactions.
In some embodiments, the information recommending apparatus provided in the embodiments of the present application may be implemented in a software manner, and fig. 2 shows the information recommending apparatus 255 stored in the memory 250, which may be software in the form of a program, a plug-in, or the like, including the following software modules: information acquisition module 2551, information screening module 2552, model fine tuning module 2553, parameter prediction module 2554, and model training module 2555, which are logical, and thus can be arbitrarily combined or further split depending on the functions implemented. The functions of the respective modules will be described hereinafter.
In other embodiments, the information recommendation apparatus provided in the embodiments of the present application may be implemented in hardware. By way of example, the information recommendation apparatus provided in the embodiments of the present application may be a processor in the form of a hardware decoding processor that is programmed to perform the information recommendation method provided in the embodiments of the present application; for example, the processor in the form of a hardware decoding processor may employ one or more application specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.
In some embodiments, the terminal or the server may implement the information recommendation method provided in the embodiments of the present application by running a computer program. For example, the computer program may be a native program or a software module in an operating system; a native application (APP), i.e., a program that needs to be installed in an operating system to run, such as an information recommendation APP; an applet, i.e., a program that only needs to be downloaded into a browser environment to run; or an applet that can be embedded in any APP. In general, the computer program described above may be any form of application, module, or plug-in.
The embodiment of the application can be applied to various scenes such as cloud technology, artificial intelligence, intelligent transportation, vehicle-mounted and the like. Next, an information recommendation method provided in the embodiment of the present application will be described in conjunction with exemplary applications and implementations of a terminal provided in the embodiment of the present application.
Referring to fig. 3, fig. 3 is a flowchart illustrating a method for recommending information according to an embodiment of the present application, and the steps illustrated in fig. 3 will be described.
S101, responding to an information request of a target object, and acquiring a pre-training recommendation model, history associated information of the target object and history behavior information of the target object on the history associated information.
The embodiment of the application is implemented in a scene of recommending information to the target object, for example, recommending an ordering link for a commodity of interest to the target object, or an article in a field that the target object wants to learn about. In the embodiment of the application, the information recommendation device determines whether to start an information recommendation process for the target object according to whether a recommendation request is received. When the information recommendation device receives an information request from a target object, it responds to the information request to acquire a pre-training recommendation model that has already been pre-trained, historical associated information associated with the target object, and historical behavior information of the target object for the historical associated information.
The pre-training recommendation model is a model that has learned common characteristics of users with respect to information, for example, the common characteristics of users' interests and preferences. Further, the pre-training recommendation model may be trained using historical operation data on already-recommended information (from which it can be determined what type of information tends to be popular), or trained using the historical operation data combined with the portrait information of all users (from which the types of information popular with different user groups can be determined).
In other words, the pre-training recommendation model in the embodiment of the present application lacks learning of the individual features of the target object; therefore, the information screened directly by the pre-training recommendation model may not be what the target object is really interested in.
It is understood that the history related information of the target object refers to information that the target object has operated in a history period, that is, information related to the target object in the information base. The historical behavior information of the target object on the historical associated information refers to operations performed by the target object on the information, and the operations may be clicking, converting, shielding, and the like, which are not limited herein.
In the embodiment of the application, the target object is any online user who sends out an information request.
The pre-training recommendation model may be a convolutional neural network (Convolutional Neural Network) model or a deep neural network (Deep Neural Network) model, which is not limited herein.
S102, screening candidate recommendation information aiming at a target object.
The information recommendation device can preliminarily screen the information in the information base to obtain the candidate recommendation information for the target object, so that the information to be recommended to the target object can then be screened out from the candidate recommendation information.
It may be understood that the information recommendation device may determine the latest information in the information base as the candidate recommendation information, or may screen out information in the information base that is relatively similar to the history associated information and determine that information as the candidate recommendation information; this is not limited herein.
Further, in some embodiments, the information recommendation device may directly use information recalled from the information base by the recall model as candidate recommendation information, and in other embodiments, the information recommendation device may perform coarse ranking on information recalled by the recall model, and determine a result obtained after the coarse ranking as candidate recommendation information.
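The two variants in the paragraph above (recall only, or recall followed by coarse ranking) can be sketched as a small pipeline; the recall and scoring functions here are toy stand-ins, not from the patent:

```python
def screen_candidates(info_base, recall_fn, coarse_rank_fn=None, top_n=100):
    """Preliminary screening: recall from the information base, then
    optionally coarse-rank the recalled items and keep the top N."""
    recalled = recall_fn(info_base)
    if coarse_rank_fn is None:
        return recalled                      # recalled items used directly
    ranked = sorted(recalled, key=coarse_rank_fn, reverse=True)
    return ranked[:top_n]

# Toy recall: items flagged as recent; toy coarse score: stored popularity.
items = [{"id": 1, "recent": True, "pop": 0.2},
         {"id": 2, "recent": False, "pop": 0.9},
         {"id": 3, "recent": True, "pop": 0.8}]
candidates = screen_candidates(items,
                               recall_fn=lambda xs: [x for x in xs if x["recent"]],
                               coarse_rank_fn=lambda x: x["pop"],
                               top_n=2)
```

In production, `recall_fn` would typically be a trained recall model and `coarse_rank_fn` a lightweight scoring model.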
It is understood that the candidate recommendation information may include songs, small videos (videos with a duration of less than 5 minutes), articles, pictures, long videos (videos with a duration of more than 5 minutes), merchandise links, merchandise promotions, and the like, which are not limited herein.
Further, the candidate recommendation information does not refer to one specific piece of information, but refers generally to all the preliminarily screened information. In other words, the candidate recommendation information may include one or more pieces of preliminarily screened information, and the types of these pieces may be the same, for example all songs or all articles, or may differ, for example some pieces being articles and others being ordering links; this is not limited herein.
S103, fine tuning is conducted on the pre-training recommendation model based on the characteristics of the history associated information, the characteristics of the target object and the history behavior information, and a target prediction model corresponding to the target object is obtained.
After the information recommendation device determines the history associated information, it takes the features of the history associated information and the features of the target object as inputs to the pre-training recommendation model, uses the history behavior information as the supervision item of the pre-training recommendation model, and fine-tunes the pre-training recommendation model to obtain the target prediction model.
It should be noted that the history associated information and the history behavior information are both strongly related to the target object and can indicate the target object's preferences for different types of information. Therefore, fine-tuning the pre-training recommendation model with the features of the history associated information, the features of the target object, and the history behavior information both preserves the common features of different users and lets the model fully learn the individual features of the target object for different types of information. In this way, model customization for the target object is realized on the basis of the pre-training recommendation model, making it convenient to screen out information of interest using the target object's customized model.
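The setup above (features as inputs, historical behavior as supervision) can be illustrated with a deliberately simplified per-user fine-tuning loop; a single logistic unit stands in for the pre-training recommendation model, and all names and hyperparameters are illustrative assumptions:

```python
import math

def fine_tune(weights, feature_rows, behavior_labels, lr=0.1, epochs=50):
    """One-user fine-tuning sketch: features of the historical associated
    information (plus target-object features) are the inputs, and historical
    behavior (1 = acted on, 0 = not) is the supervision signal."""
    w = list(weights)  # start from the pre-trained parameters
    for _ in range(epochs):
        for x, y in zip(feature_rows, behavior_labels):
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))          # predicted interest
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi           # gradient of the log loss
    return w

# Pre-trained weights nudged toward this user's history: the user acted on
# the item whose first feature is high, not the one whose second is high.
tuned = fine_tune([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [1, 0])
```

A real implementation would instead update (a subset of) the pre-training recommendation model's parameters, but the input/supervision roles are the same.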
It is to be understood that the characteristics of the target object may be basic characteristics such as age, region, sex, etc. of the target object, or may be characteristics that the target object clicks on other platforms or is converted information, which is not limited herein.
It should be noted that the screening of the candidate recommendation information and the fine tuning of the pre-training recommendation model do not affect the calculation of the final interest parameters. Therefore, in some embodiments, the information recommendation device may execute S103 first and then S102, or execute S102 and S103 simultaneously; this is not limited herein.
S104, predicting interest parameters of the target object aiming at the candidate recommendation information through the target prediction model.
After the target prediction model is obtained, the information recommendation device analyzes the candidate recommendation information using the target prediction model to determine whether, or to what degree, the target object is interested in the candidate recommendation information, thereby obtaining the interest parameters of the target object, so that the information to be recommended to the target object can be selected according to the degree of interest.
In some embodiments, the information recommendation device may perform feature extraction on the candidate recommendation information, and then input features of the candidate recommendation information into the target prediction model to analyze interest parameters of the target object. In other embodiments, the information recommendation device may input the features of the target object into the target prediction model at the same time to obtain the interest parameter in addition to the features of the candidate recommendation information into the target prediction model, which is not limited herein.
It may be understood that the feature of the candidate recommendation information may be an encoding feature of a type tag of the candidate recommendation information, or may be an encoding feature of a word number and a duration of the candidate recommendation information, or may be a semantic feature of the candidate recommendation information determined by an artificial intelligence technology, which is not limited herein.
The interest parameter may be a degree of interest of the target object with respect to the candidate recommendation information, for example, 0.8, 0.5, etc., and may be a flag of whether the target object is interested in the candidate recommendation information, for example, not interested, very interested, etc., which is not limited herein.
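The scoring step of S104 can be sketched as follows, with a trivial callable standing in for the target prediction model; the optional concatenation of target-object features follows the second embodiment described above, and every name here is illustrative:

```python
def predict_interest(model, candidates, user_features=None):
    """Score each candidate with the target prediction model. `model` is any
    callable mapping a feature vector to an interest parameter; user features,
    when supplied, are concatenated with the candidate's features."""
    scores = []
    for cand in candidates:
        feats = cand["features"] + (user_features or [])
        scores.append(model(feats))
    return scores

# Toy "model": the mean of the feature vector stands in for a trained network.
toy_model = lambda feats: sum(feats) / len(feats)
scores = predict_interest(toy_model,
                          [{"features": [0.8, 0.6]}, {"features": [0.1, 0.3]}],
                          user_features=[0.4])
```

The output per candidate may be a degree (e.g. 0.8) or thresholded into an interested/not-interested flag, matching the two forms of interest parameter described above.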
S105, screening target recommendation information from the candidate recommendation information based on the interest parameters, and returning the target recommendation information to the target object.
The information recommendation device compares the interest parameters of the candidate recommendation information, screens out the maximum interest parameter, and determines the information corresponding to the maximum interest parameter among the candidate recommendation information as the target recommendation information; alternatively, it compares each interest parameter with a parameter threshold and screens out, from the candidate recommendation information, the information scoring above the threshold as the target recommendation information for the target object. Then, the information recommendation device sends the target recommendation information to the target object through the network, completing the response to the information request and realizing information recommendation.
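Both screening modes described above (maximum interest parameter, or everything above a threshold) can be sketched in a few lines; the names are illustrative:

```python
def select_targets(candidates, interest, threshold=None):
    """Screen target recommendation information from the candidates:
    with a threshold, keep every candidate scoring above it; otherwise
    return the single candidate with the maximum interest parameter."""
    paired = list(zip(candidates, interest))
    if threshold is not None:
        return [c for c, s in paired if s > threshold]
    best, _ = max(paired, key=lambda cs: cs[1])
    return [best]

cands = ["song", "article", "video"]
best = select_targets(cands, [0.2, 0.9, 0.6])                  # max-interest mode
above = select_targets(cands, [0.2, 0.9, 0.6], threshold=0.5)  # threshold mode
```

Threshold mode can return several pieces of target recommendation information at once, which matches the point above that the target recommendation information is not necessarily a single item.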
Fig. 4 is a schematic illustration showing target recommendation information provided in an embodiment of the present application. In the display interface 4-1 of the terminal of the target object, target recommendation information screened by the information recommendation device for the target object is displayed, wherein the target recommendation information comprises links 4-11 of commodities of interest of the target object and graphic profiles 4-12 of the commodities.
It should be noted that, the target recommendation information screened by the information recommendation device accords with the interests and preferences of the target object, so that the target object is highly likely to perform transformation behaviors on the target recommendation information, and the recommended information is more accurate.
It can be understood that, in contrast to the prior art, in which a model trained offline learns mostly universal features of users and thus lacks personalization when used for information recommendation, in the embodiment of the present application, when the information recommendation device receives the information request of the target object, it performs fine tuning, on the basis of the pre-training recommendation model that has learned the common features of users, using the features of the target object, the features of the target object's historical associated information, and the historical behavior information. This yields a target prediction model that has learned the personalized features of the target object, realizing customization for the target object. The target prediction model then predicts the target object's interest parameters for the candidate recommendation information more accurately, and the interest parameters are used to screen out target recommendation information that matches the target object's preferences, so that the accuracy of information recommendation is improved. Furthermore, in the embodiment of the present application, the model fine-tuning process starts when the request of the target object is received, that is, fine tuning is performed online in real time, so adapted models do not need to be trained and stored offline for different users, which saves storage resources.
Based on fig. 3, referring to fig. 5, fig. 5 is a second flowchart of an information recommendation method provided in an embodiment of the present application. In some embodiments of the present application, fine tuning the pre-training recommendation model based on the characteristics of the history associated information, the characteristics of the target object, and the history behavior information to obtain a target prediction model corresponding to the target object, that is, a specific implementation process of S103 may include: s1031 to S1033 are as follows:
S1031, generating an embedded feature and a fine-tuning feature based on the features of the history associated information and the features of the target object.
When the information recommendation device fine-tunes the pre-training recommendation model, it performs feature processing on the features of the history associated information and the features of the target object to obtain the embedded feature and the fine-tuning feature.
In some embodiments, the information recommendation device may first convert the features of the history associated information and the features of the target object into embedded features and then extract fine-tuning features from the embedded features. In other embodiments, the information recommendation device may also extract embedded features and trim features from both the features of the historical association information and the features of the target object.
S1032, based on controlling the fusion of the embedded features and the fine tuning features, the input features of the pre-training recommendation model are generated.
After the embedded feature and the fine-tuning feature are obtained, the information recommendation device controls their fusion so as to determine whether the pre-training recommendation model learns knowledge from the embedded feature and the fine-tuning feature together, or from the embedded feature alone.
In some embodiments, the information recommendation device may determine whether fusion with the fine-tuning feature is needed by analyzing the embedded feature. In other embodiments, the information recommendation device may determine whether to fuse the embedded feature and the fine-tuning feature by estimating the information amount of the features of the history associated information and the features of the target object, and comparing the estimated information amount with a corresponding threshold, for example, fusing when the information amount is less than the threshold and not fusing when it is greater.
S1033, carrying out parameter adjustment on the pre-training recommendation model according to the input characteristics and the historical behavior information to obtain a target prediction model.
Finally, the information recommendation device inputs the obtained input features into the pre-training recommendation model for prediction, and then adjusts the parameters of the pre-training recommendation model using the difference between the prediction result and the historical behavior information, until the fine-tuning end condition is reached, to obtain the target prediction model customized for the target object.
It should be noted that the fine-tuning end condition may be that the number of training iterations reaches a predetermined number, for example 10000, or that the accuracy during training reaches a predetermined accuracy, for example 99.99%; this is not limited herein.
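The end condition just described is a simple disjunction, which can be made concrete as follows; the default values mirror the examples in the text and are otherwise arbitrary:

```python
def should_stop(iteration, accuracy, max_iters=10000, target_acc=0.9999):
    """Fine-tuning end condition: stop when the iteration count reaches the
    predetermined number OR training accuracy reaches the predetermined level."""
    return iteration >= max_iters or accuracy >= target_acc

stop_now = should_stop(10000, 0.8)    # iteration budget exhausted
keep_going = should_stop(500, 0.97)   # neither condition met yet
```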
It can be understood that in the embodiment of the present application, whether the input feature for fine tuning is obtained by fusing the embedded feature and the fine-tuning feature, or directly from the embedded feature, the information recommendation device only needs to save the parameters of the pre-training recommendation model and does not need to generate or save the parameters of a separate fine-tuning model. The parameter count of the model is therefore smaller, and the memory resources consumed during computation are correspondingly smaller.
In some embodiments of the present application, based on controlling the fusion of the embedded feature and the fine tuning feature, the specific implementation process of generating the input feature of the pre-trained recommendation model, i.e. S1032, may include: s1032a-S1032b as follows:
s1032a, carrying out parameter prediction on the embedded features to obtain control parameters.
In some embodiments, the information recommendation device may input the embedded features into a model for parameter prediction to implement parameter prediction to obtain control parameters; in other embodiments, the information recommendation device may match the embedded features with feature templates corresponding to different control parameters to determine the control parameters of the embedded features.
Fig. 6 is a schematic diagram illustrating a process of predicting control parameters according to an embodiment of the present application. The information recommendation apparatus inputs the embedded features 6-1 (multidimensional features including a plurality of feature components 6-11) into the model 6-2 for parameter prediction, resulting in control parameters 6-3 of the respective embedded features 6-1. The model 6-2 is composed of a feature extraction layer 6-21 and a re-parameter calculation layer 6-22.
It should be noted that the control parameters are used to characterize whether the embedded features and the fine tuning features are feature fused. Illustratively, when the control parameter is 1, the embedded feature and the fine tuning feature are characterized to be fused, and when the control parameter is 0, the embedded feature and the fine tuning feature are not characterized to be fused.
S1032b, controlling the fusion of the fine-tuning feature and the embedded feature according to the control parameter to obtain the input feature of the pre-training recommendation model.
After the control parameter is obtained, the information recommendation device performs the corresponding operation on the embedded feature and the fine-tuning feature according to whether the control parameter indicates fusion, thereby obtaining the input feature of the pre-training recommendation model.
In the embodiment of the application, the information recommendation device can predict the control parameters from the embedded features, and then realize fusion control of the embedded features and the fine tuning features based on the control parameters to obtain the input features so as to facilitate fine tuning by using the input features later.
In some embodiments of the present application, controlling the fusion of the fine-tuning feature and the embedded feature according to the control parameter to obtain the input feature of the pre-training recommendation model, that is, the specific implementation process of S1032b, may include: S201 or S202, as follows:
S201, when the control parameter indicates that feature fusion is to be performed, determining the fusion result of the embedded feature and the fine-tuning feature as the input feature of the pre-training recommendation model.
The information recommendation device analyzes the control parameter; when the control parameter indicates that the embedded feature and the fine-tuning feature need to be fused, the two are fused into one feature, namely the input feature, by superposition or splicing.
S202, when the control parameter indicates that feature fusion is not to be performed, determining the embedded feature as the input feature of the pre-training recommendation model.
When the control parameter explicitly indicates that the embedded feature and the fine-tuning feature do not need to be fused, the information recommendation device discards the fine-tuning feature and directly determines the embedded feature as the input feature.
Illustratively, based on fig. 6 and referring to fig. 7, fig. 7 is a schematic diagram of the generation process of the input feature provided in an embodiment of the present application. After the embedded feature 6-1 is input into the model 6-2 to obtain the control parameter 6-3, the information recommendation device determines, in combination with the specific value of the control parameter 6-3, whether the embedded feature 6-1 is to be fused with the fine-tuning feature 7-1, thereby obtaining the input feature. When the value of the control parameter 6-3 is 1, the information recommendation device determines the fusion result 7-2 of the embedded feature 6-1 and the fine-tuning feature 7-1 as the input feature; when the value of the control parameter 6-3 is 0, the information recommendation device directly determines the embedded feature 6-1 as the input feature. Thus, the generation of the input feature is completed.
In the embodiment of the application, the information recommendation device fuses the embedded feature and the fine-tuning feature to obtain the input feature when the control parameter characterizes that feature fusion is to be performed, and directly takes the embedded feature as the input feature when the control parameter characterizes that feature fusion is not to be performed. In this way, the pre-training model only needs to compute either the fused input feature or the embedded feature directly, and does not need to forward-propagate both the embedded feature and the fine-tuning feature separately, so that the parameters corresponding to the fine-tuning feature need not participate in the computation, which reduces the number of parameters needed in computation and saves memory resources.
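The gated choice in S201/S202 can be sketched in a few lines of Python; the function name and the list-based feature representation are illustrative, not from the patent:

```python
def fuse_features(embedded, fine_tune, gate):
    """Control fusion of the embedded feature and the fine-tuning feature.

    gate == 1 (S201): superpose the two features to form the input feature;
    gate == 0 (S202): discard the fine-tuning feature and use the embedded
    feature directly as the input feature.
    """
    if gate == 1:
        return [e + f for e, f in zip(embedded, fine_tune)]
    return list(embedded)

embedded = [0.5, -0.25, 1.0]
fine_tune = [0.25, 0.5, -0.5]
assert fuse_features(embedded, fine_tune, 1) == [0.75, 0.25, 0.5]
assert fuse_features(embedded, fine_tune, 0) == embedded
```

Only the fused or the plain embedded feature is ever produced, so the downstream model sees a single input feature in either case.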
In some embodiments of the present application, the historical behavior information includes: and the history conversion information is used for representing whether the target object performs conversion operation aiming at the history association information. At this time, according to the input features and the historical behavior information, parameter adjustment is performed on the pre-training recommendation model until reaching the fine tuning end condition, to obtain the target prediction model, that is, the specific implementation process of S1033 may include: s1033a-S1033b are as follows:
s1033a, predicting a predicted conversion result of the target object aiming at the history associated information from the input characteristics by utilizing the pre-training recommendation model.
The information recommendation device inputs the input feature into the pre-training recommendation model, and the output of the pre-training recommendation model is the predicted conversion result for the target object, which characterizes whether the target object will perform a conversion operation for the history associated information.
In some embodiments of the present application, the history associated information includes at least: historical exposure information viewed by the target object and historical click information clicked by the target object. That is, the information recommendation device uses the pre-training recommendation model to predict whether the target object will perform a conversion operation for the information it viewed and the information it clicked during the historical period.
S1033b, carrying out parameter adjustment on the pre-training recommendation model according to the difference between the predicted conversion result and the historical conversion information until the fine adjustment ending condition is reached, and obtaining the recommendation prediction model.
The information recommendation device analyzes the difference between the predicted conversion result and the actual conversion behavior of the target object on the history associated information, namely the history conversion information, and then back-propagates the difference to adjust the parameters of the pre-training recommendation model, completing one iteration. This continues until the fine-tuning ending condition is reached, at which point the model customization for the target object is completed and the recommendation prediction model is obtained.
In the embodiment of the application, the information recommendation device takes the input feature as the input of the pre-training recommendation model and the history conversion information as the supervision item, so that the pre-training recommendation model can fully learn the personality characteristics of the target object.
In some embodiments of the present application, generating the embedded feature and the fine-tuning feature based on the feature of the history associated information and the feature of the target object, i.e. the specific implementation procedure of S1031, may include: s1031a to S1031c are as follows:
s1031a, integrating the characteristics of the history associated information and the characteristics of the target object to obtain integrated sparse characteristics.
In the embodiment of the present application, the features of the history associated information and the features of the target object are sparse features, and the information recommendation device integrates the features of the history associated information and the features of the target object into one feature by splicing or fusing, which is the integrated sparse feature.
Further, when continuous-value features exist among the features of the history associated information and the features of the target object, the information recommendation device discretizes the continuous-value features to obtain discretized sparse features, and then integrates these to obtain the integrated sparse features.
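A minimal sketch of such discretization, using hypothetical bucket boundaries (the patent only requires that continuous values become discrete features, not any particular bucketing scheme):

```python
import bisect

def discretize(value, boundaries):
    """Map a continuous value to a sparse bucket index via boundary search."""
    return bisect.bisect_right(boundaries, value)

# hypothetical height cut points in centimetres
height_buckets = [150, 160, 170, 180, 190]
assert discretize(178.5, height_buckets) == 3   # falls in the [170, 180) bucket
assert discretize(149.0, height_buckets) == 0   # below the first boundary
```

The resulting bucket index can then be treated like any other sparse categorical feature during integration.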
S1031b, transforming the integrated sparse features into dense features to obtain embedded features.
In some embodiments, the information recommendation device performs a transformation operation on the integrated sparse feature through a feature transformation matrix to obtain the denser embedded feature. In other embodiments, the information recommendation device may input the integrated sparse feature into a feature extraction model to obtain the denser embedded feature.
And S1031c, extracting the characteristics of the embedded characteristics to obtain fine adjustment characteristics.
Finally, the information recommendation device can continue to perform feature extraction on the embedded features through a feature extraction model or downsampling, and the extracted result is the fine-tuning feature.
In the embodiment of the application, the information recommendation device generates the integrated sparse feature from the features of the history associated information and the features of the target object, then converts the integrated sparse feature into the denser embedded feature, and finally extracts the fine tuning feature on the basis of the embedded feature. Thus, the information recommendation device completes the generation process of the embedded features and the fine-tuning features so as to facilitate the subsequent generation of the input features based on the embedded features and the fine-tuning features.
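The pipeline S1031a-S1031c can be sketched as follows; the embedding matrix, hidden-layer weights, dimensions, and mean-pooling are illustrative assumptions, since the patent leaves the concrete transformation open:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, K = 100, 8                      # hypothetical vocabulary size / embedding dim
E = rng.normal(size=(VOCAB, K))        # feature transformation (embedding) matrix
W = rng.normal(size=(K, K))            # hypothetical hidden layer for fine-tuning

def embed(sparse_ids):
    """S1031b: transform integrated sparse indices into one dense embedded feature."""
    return E[sparse_ids].mean(axis=0)

def fine_tune_feature(embedded):
    """S1031c: extract the fine-tuning feature from the embedded feature."""
    return np.maximum(embedded @ W, 0.0)   # ReLU-style hidden transform

ids = [3, 17, 42]                      # integrated sparse feature (active indices)
emb = embed(ids)
ft = fine_tune_feature(emb)
assert emb.shape == (K,) and ft.shape == (K,)
```

Both outputs are k-dimensional, which is what allows the later superposition-based fusion of the embedded feature and the fine-tuning feature.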
Based on fig. 3, referring to fig. 8, fig. 8 is a flowchart illustrating a third information recommendation method according to an embodiment of the present application. In some embodiments of the present application, the screening candidate recommendation information for the target object, that is, the specific implementation process of S102 may include: S1021-S1023 as follows:
S1021, preliminarily screening a plurality of recall information from the information base by using a recall model.
S1022, sorting the plurality of recall information through the coarse arrangement model to obtain a coarse arrangement information sequence.
The information recommendation device acquires the recall model and analyzes each piece of information in the information base with it so as to extract the plurality of pieces of recall information. Then, the information recommendation device scores the extracted recall information using the obtained coarse ranking model and ranks the recall information into a sequence from high score to low score, thereby obtaining the coarse ranking information sequence.
S1023, determining the first N pieces of information in the coarse arrangement information sequence as candidate recommendation information.
And finally, the information recommendation equipment extracts N pieces of information with highest scores in the coarse-ranking information sequence to serve as candidate recommendation information, wherein N is more than or equal to 1.
In the embodiment of the application, the information recommendation device can determine candidate recommendation information for the target object through recall and coarse ranking, so that the target recommendation information can be conveniently extracted from the candidate recommendation information.
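The recall-then-coarse-rank selection of S1021-S1023 reduces to a sort and truncation; the item identifiers and scores below are hypothetical:

```python
def coarse_rank(recalled, scores, n):
    """S1022-S1023: rank recalled items by coarse-ranking score, high to low,
    and keep the first N as candidate recommendation information."""
    ranked = sorted(recalled, key=lambda item: scores[item], reverse=True)
    return ranked[:n]

recalled = ["ad1", "ad2", "ad3", "ad4"]                     # hypothetical recall output
scores = {"ad1": 0.2, "ad2": 0.9, "ad3": 0.5, "ad4": 0.1}   # coarse-ranking scores
assert coarse_rank(recalled, scores, 2) == ["ad2", "ad3"]
```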
In some embodiments of the present application, the parameters of interest include: the pre-estimated click rate and the pre-estimated conversion rate, at this time, the step of screening the target recommendation information from the candidate recommendation information based on the interest parameter, that is, the specific implementation process of S105, may include: s1051 to S1052, as follows:
S1051, calculating an estimated recommended profit index based on the estimated click rate and the estimated conversion rate.
In some embodiments, the information recommendation device may multiply the estimated click rate and the estimated conversion rate to obtain the estimated recommended profit index. In other embodiments, the information recommendation device may weight the estimated click rate and the estimated conversion rate to obtain the estimated recommended revenue index.
The estimated recommended profit index characterizes profit conditions which can be brought by the target object after the candidate recommended information is converted.
S1052, determining K candidate recommendation information with the maximum estimated recommendation profit index as target recommendation information.
The information recommendation equipment screens out the maximum K estimated recommendation profit indexes from the calculated estimated recommendation profit indexes, and extracts candidate recommendation information corresponding to the K estimated recommendation profit indexes, so that target recommendation information is obtained. Wherein K is more than or equal to 1.
In the embodiment of the application, the information recommendation device can calculate the estimated recommended profit index based on the estimated click rate and the estimated conversion rate contained in the interest parameter, and then recommend the information with the larger estimated recommended profit index to the target object, so that the information recommended to the target object can be ensured to bring more reasonable profit.
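A sketch of S1051-S1052 using the multiplicative variant of the revenue index described above; the dictionary layout and ad identifiers are hypothetical:

```python
def estimated_revenue(pctr, pcvr):
    """S1051 (multiplicative variant): estimated recommended revenue index."""
    return pctr * pcvr

def select_targets(candidates, k):
    """S1052: keep the K candidates with the largest revenue index."""
    ranked = sorted(candidates,
                    key=lambda c: estimated_revenue(c["pctr"], c["pcvr"]),
                    reverse=True)
    return [c["ad_id"] for c in ranked[:k]]

ads = [
    {"ad_id": "a", "pctr": 0.10, "pcvr": 0.020},   # index 0.0020
    {"ad_id": "b", "pctr": 0.05, "pcvr": 0.080},   # index 0.0040
    {"ad_id": "c", "pctr": 0.20, "pcvr": 0.015},   # index 0.0030
]
assert select_targets(ads, 2) == ["b", "c"]
```

The weighted variant mentioned above would replace `estimated_revenue` with a weighted sum of pCTR and pCVR.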
In some embodiments of the present application, before the pre-trained recommendation model is acquired, i.e. before S101, the method may further include: s106 to S107, as follows:
s106, acquiring operation data of each piece of information in the information base and an initial recommendation model.
The operation data includes at least exposure data, click data, and conversion data. The exposure data refers to data such as the exposure times and exposure time of information, the click data refers to data such as the click times and click time of information, and the conversion data refers to information such as the conversion times and conversion time of information.
It is understood that the initial recommended model may be a model obtained by random initialization of parameters, i.e. a model that is not trained at all, or a model that has been trained in a historical time, which is not limited herein.
And S107, training the initial recommendation model at regular time by utilizing the characteristics and the operation data of each information in the information base to obtain a pre-training recommendation model.
In some embodiments, the information recommendation device uses the characteristics of each information as input of an initial recommendation model, uses the operation data as a supervision item, and regularly performs supervised training on the initial recommendation model to obtain a pre-trained recommendation model. In other embodiments, the information recommendation device may also use the characteristics and operation data of each information as input at the same time, and perform unsupervised training on the initial recommendation model to obtain a pre-trained recommendation model.
In the embodiment of the application, the information in the information base changes over time, and the corresponding operation data changes with it; therefore, the information recommendation device trains the initial recommendation model at regular intervals using the features and operation data of each piece of information, which is equivalent to regularly updating the pre-training recommendation model, ensuring that the base model used when customizing a model for the target object is always up to date.
In some embodiments of the present application, predicting, by the target prediction model, the interest parameter of the target object for the candidate recommendation information, that is, the specific implementation process of S104, may include: S1041-S1043 as follows:
s1041, converting the characteristics of the target object and the characteristics of the candidate recommendation information to obtain conversion characteristics, and extracting adjustment characteristics from the conversion characteristics.
It can be understood that in the embodiment of the present application, the generation process of the conversion feature is similar to the generation process of the embedded feature, and the generation process of the adjustment feature is similar to the generation process of the fine adjustment feature, which is not described herein.
S1042, generating fusion parameters aiming at the conversion characteristics and the adjustment characteristics, and controlling fusion of the conversion characteristics and the adjustment characteristics by utilizing the fusion parameters to obtain the characteristics to be predicted.
The information recommendation device performs prediction on the conversion feature to obtain a fusion parameter that characterizes whether the conversion feature and the adjustment feature are to be fused, and controls the fusion of the conversion feature and the adjustment feature based on the fusion parameter to obtain the feature to be predicted.
Illustratively, equation (4) is a process of calculating the feature to be predicted shown in the embodiments of the present application:
embedding_new = embedding_old + p(x) × f(embedding_old)  (4)

where embedding_old is the conversion feature, p(x) is the output fusion parameter, f(·) is a hidden-layer transform, f(embedding_old) is the adjustment feature, and embedding_new is the feature to be predicted.
S1043, predicting the feature to be predicted through a target prediction model, and predicting interest parameters of the target object aiming at candidate recommendation information.
And finally, the information recommendation equipment inputs the feature to be predicted into a target prediction model for calculation, and the output of the target prediction model is the interest parameter of the target object aiming at the candidate recommendation information.
In the embodiment of the application, the information recommendation device determines the interest parameter of the target object for the candidate recommendation information in the same manner as the prediction stage in the fine adjustment of the model, so that the interest parameter is conveniently used for screening the target recommendation information for the target object.
In some embodiments of the present application, generating the fusion parameters for the conversion feature and the adjustment feature is implemented by a parameter prediction model, performing parameter prediction on the embedded feature to obtain the control parameter, and implementing the parameter prediction on the embedded feature by an initial parameter model, where after obtaining the control parameter, before generating the fusion parameters for the conversion feature and the adjustment feature, that is, after S1032a, before S1042, the method may further include: s301 to S304 are as follows:
s301, generating sample values obeying preset distribution aiming at the generated control parameters.
The number of control parameters is the same as the number of embedded features and fine tuning features, and at this time, the information recommendation device generates a sample value for each control parameter, so as to obtain a number of sample values of the control parameters, where the sample values conform to a preset distribution.
It is understood that the preset distribution may refer to a U (0, 1) distribution, a gaussian distribution, or other types of distributions, which are not limited herein.
Illustratively, the information recommendation device may generate, for the n control parameters, independent sample values ϵ_1, …, ϵ_n obeying the U(0, 1) distribution.
S302, calculating an update component of the control parameter according to the sample value.
The information recommendation device may obtain the update component corresponding to each sample value by performing a logarithmic operation on the sample value, or may further perform a logarithmic operation (with negation) on the result of that logarithmic operation to obtain the update component of each sample value, which is not limited herein.
Illustratively, equation (5) is an equation for calculating an update component provided in the embodiments of the present application, as follows:
G_i = −log(−log(ϵ_i))  (5)

where ϵ_i refers to each sample value, and G_i refers to the update component corresponding to each control parameter.
S303, summing the control parameters and the update components to obtain update parameters.
And the information recommendation equipment adds each control parameter and the corresponding update component to obtain the update parameter corresponding to each control parameter.
Illustratively, when the sample values are ϵ_1, …, ϵ_n and the update components are G_1, …, G_n, the update parameter corresponding to each control parameter may be expressed as v′ = [v_1 + G_1, v_2 + G_2, …, v_n + G_n].
And S304, performing parameter optimization on the initial parameter model by using a normalization result of the updated parameters until an optimization ending condition is met, and obtaining a parameter prediction model.
Finally, the information recommendation device performs normalization calculation on each updated parameter to obtain a normalization result for each updated parameter, and then performs back propagation calculation on the obtained initial parameter model by taking the normalization result as a loss value to optimize parameters in the initial parameter model. And (3) carrying out loop iteration in such a way, and stopping until reaching the optimization ending condition to obtain the parameter prediction model.
Illustratively, the embodiments of the present application provide a formula for the normalization calculation, see equation (6):

σ_τ(v′_i) = exp(v′_i / τ) / Σ_{j=1}^{n} exp(v′_j / τ)  (6)

where v′_i represents the update parameter corresponding to each control parameter, and τ is a temperature parameter: the smaller τ is, the lower the temperature, and the closer the sampled sample is to a one-hot vector; σ_τ(v′_i) is the normalization result.
It is understood that the optimization ending condition may be the same as or different from the trimming ending condition, and the present application is not limited herein.
In the embodiment of the application, the information recommendation device may calculate an update component for each control parameter, then calculate an update parameter for each control parameter, and finally perform parameter optimization on the initial parameter model based on a normalization result of the update parameter, so that even if the control parameters are discontinuous and lead to difficulty in derivation, the initial parameter model can be trained normally to obtain a parameter prediction model capable of generating fusion parameters.
In the following, an exemplary application of the embodiments of the present application in a practical application scenario will be described.
The embodiment of the application is implemented in a scene of personalized advertisement recommendation for users: when a user opens an information page and refreshes the list, the advertisement system receives an advertisement request (information request) and screens suitable advertisements (target recommendation information) from the advertisement library for display. After the advertisement is displayed to the user, when the user clicks the advertisement, or even generates behaviors such as placing an order or activating an APP (conversion behaviors), the advertisement system automatically deducts the fee from the advertiser.
Referring to fig. 9, fig. 9 is a data flow chart of advertisement pushing according to an embodiment of the present application. In an online advertisement service scene, the advertisement platform 9-1 receives an advertisement request 9-2 sent by a user (target object), performs feature extraction and label generation 9-4 on the exposure, click, and conversion data 9-3 (historical behavior information) of the advertisements previously placed for the user (history associated information) to obtain the features of the advertisements (features of the history associated information) and thus the sample data, sends the obtained sample data into the target domain database 9-5, and performs fine-tuning in combination with the pre-trained source model 9-6 (pre-training recommendation model), using the fine-tuning method 9-7 provided by the embodiment of the application, to obtain the target model 9-8 (target prediction model). Finally, the target model outputs the prediction result (target recommendation information) and returns it to the user as the advertisement response 9-9.
Wherein exposure, click and conversion data (operational data) of all previously placed advertisements (individual information in the information base) are stored in the source database for timed model training. Each retrained model is loaded into a server (information recommendation device) of the advertising platform to customize the model for the user in real time.
In generating the sample data, the server may determine a single advertisement click record of the user as an input sample, and whether the user converted as the label. For each conversion by the user, if a click within a given window prior to the conversion can be found, that click is marked as a positive sample, and all other clicks are marked as negative samples. The features of each input sample include: user-side features, advertisement-side features, and context features, all of which are discrete; features that are originally continuous values are converted into discrete features. For example, height (which has innumerable values between 178 and 179) may be discretized into discrete features.
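The labeling rule above can be sketched as follows; this is a simplification using raw timestamps, since the patent does not fix the exact attribution bookkeeping:

```python
def label_clicks(click_times, conversion_times, window):
    """Label a click positive if some conversion happened within `window`
    time units after it; all other clicks are negative."""
    return [1 if any(0 <= t_conv - t_click <= window
                     for t_conv in conversion_times) else 0
            for t_click in click_times]

# clicks at t=10, 50, 200; one conversion at t=60, attribution window of 30
assert label_clicks([10, 50, 200], [60], window=30) == [0, 1, 0]
```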
The fine-tuning method provided by the embodiment of the application fine-tunes the features using a residual-like method, so that two sets of model parameters do not need to be saved and the parameter count of the model can be reduced.
Referring to fig. 10, fig. 10 is a diagram illustrating the model structure during fine-tuning according to an embodiment of the present application. First, the server preprocesses the sparse features 10-1, i.e., converts the sparse user features and advertisement features (integrated sparse features) into a dense, fixed-length k-dimensional embedding vector 10-2 (embedded feature); the embedding vector 10-2 remains unchanged during fine-tuning. The server then performs bypass fine-tuning on the feature (no bypass is provided during model pre-training; the feature fine-tuning bypass is added only during model fine-tuning) to fine-tune the embedding vector 10-2. Specifically, the embedding vector 10-2 is passed through a DNN hidden-layer transformation to obtain the fine-tuning feature 10-3; then the gating network (initial parameter model) outputs a 0/1 decision for the embedding vector 10-2 and the fine-tuning feature 10-3: an output of 1 means the bypass is connected, and the new embedding feature is the original embedding vector 10-2 plus the fine-tuning feature 10-3; an output of 0 means the bypass is disconnected, and the new embedding feature is the original embedding vector 10-2 itself. This yields the input 10-5 (input feature) of the dense vector generation layer 10-4. The server then forward-propagates the input 10-5 through the DNN layer 10-6 and passes the resulting feature through the Softmax function of the output layer to obtain a logit value indicating the user's intent 10-7 for the commodity (predicted conversion result), and then begins back propagation in combination with the label (historical conversion information), completing the fine-tuning.
The gating network is a lightweight network whose input is the original embedding vector after sparse feature preprocessing and whose output comprises n binary 0/1 values, where n is the number of features or components; it is used to control whether the feature fine-tuning bypass of each feature or component needs to be updated. This involves the problem that the gating (policy) network outputs discrete values, which are non-differentiable; the non-differentiability can be solved by using the Gumbel-Softmax trick.
The dense vector generation layer takes pairwise (element-wise) products of all n inputs to obtain (n-1)n/2 k-dimensional embeddings, and then superposes them into one k-dimensional dense vector as the input of the DNN layer. The DNN layer is a multi-layer fully connected neural network model. The output layer applies a Softmax function to the features processed by the DNN layer to obtain a logit value.
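A minimal sketch of the dense vector generation layer, interpreting the pairwise products as element-wise products (so that each pairwise term stays k-dimensional, as the superposition into one k-dimensional vector requires) and using the algebraic identity that avoids forming all (n-1)n/2 terms explicitly:

```python
import numpy as np

def bi_interaction(vectors):
    """Superpose pairwise element-wise products of n k-dimensional embeddings
    into one k-dimensional dense vector, using the identity
    sum_{i<j} v_i * v_j = 0.5 * ((sum_i v_i)^2 - sum_i v_i^2)."""
    v = np.asarray(vectors, dtype=float)
    s = v.sum(axis=0)
    return 0.5 * (s * s - (v * v).sum(axis=0))

vs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# pairwise products per dimension: (1*3 + 1*5 + 3*5, 2*4 + 2*6 + 4*6) = (23, 44)
assert np.allclose(bi_interaction(vs), [23.0, 44.0])
```

The identity reduces the cost from O(n²k) to O(nk), which matters when the number of feature fields n is large.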
During fine-tuning, except that the original embedding vector remains unchanged, the gating network and the other network parameters are learned together (parameter optimization is performed on the initial parameter model until the optimization ending condition is reached, obtaining the parameter prediction model). The embodiments of the present application use the Gumbel-Softmax trick to solve the derivation difficulty caused by the discrete output of the gating network. The core idea of the Gumbel-Softmax trick is the reparameterization trick (Reparameterization Trick). In deep learning, a neural network A is often used to generate a probability distribution; the form of the distribution is generally predetermined, and the neural network only needs to generate the statistical parameters D of the distribution. Next, a sample S is drawn from the probability distribution, fed into a subsequent neural network B for processing, and a differentiable loss function L is computed. However, because of this sampling step, end-to-end training is not directly possible: the derivative of L with respect to S can be obtained, and so can the derivative of D with respect to the parameters of A, but in general the derivative of S with respect to D cannot. The reparameterization trick is needed to obtain the derivative of S with respect to D. The process steps are as follows:
1. For the n-dimensional vector v output by the gating network (containing each control parameter), generate n independent samples ϵ_1, …, ϵ_n obeying the uniform distribution U(0, 1);
2. Compute G_i by equation (5);
3. Add them correspondingly to obtain a new value vector (containing the update parameter of each control parameter);
4. Normalize with the function of equation (6), compute the probabilities, and obtain the final category.
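The four steps above can be sketched as follows; the function name, fixed seed, and clamping of the uniform samples away from 0 and 1 (to keep the double logarithm finite) are implementation assumptions:

```python
import math
import random

def gumbel_softmax_probs(v, tau=1.0, seed=0):
    """Steps 1-4 above: perturb the gating scores v with Gumbel noise
    G_i = -log(-log(eps_i)) and normalize with a temperature-tau softmax."""
    rng = random.Random(seed)
    eps = [min(max(rng.random(), 1e-12), 1.0 - 1e-12) for _ in v]  # step 1
    g = [-math.log(-math.log(e)) for e in eps]                     # step 2, eq. (5)
    v_new = [vi + gi for vi, gi in zip(v, g)]                      # step 3
    m = max(x / tau for x in v_new)
    exps = [math.exp(x / tau - m) for x in v_new]                  # step 4, eq. (6)
    z = sum(exps)
    return [e / z for e in exps]

probs = gumbel_softmax_probs([2.0, 0.5, -1.0], tau=0.5)
assert abs(sum(probs) - 1.0) < 1e-9 and all(p > 0 for p in probs)
```

Because the sampled noise enters additively, gradients can flow back through v into the gating network, which is exactly what makes the discrete gate trainable.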
In the online prediction process, the server trains a new model every hour (trains the initial recommendation model at regular intervals to obtain the pre-training recommendation model) and pushes the model online. The specific flow is as follows:
1. The requester, namely the user, sends out an advertisement request; the recall and coarse ranking models perform preliminary screening on the advertisements and return an advertisement set (candidate recommendation information) to the fine ranking system.
2. The fine-ranking system inquires the characteristics of the user side and the advertisement side, inputs the characteristics into a network structure shown in fig. 10 after preprocessing, and calculates pCTR (estimated click rate) and pCVR (estimated conversion rate).
3. eCPM (the estimated recommended revenue index) is calculated from the pCTR and pCVR values obtained in step 2, all advertisements in the set are ranked, and the topK advertisements (the K candidates with the largest index, selected as target recommendation information) are selected for exposure.
To verify the effectiveness of advertisement recommendation in the embodiments of the present application, experiments were performed on the Criteo advertisement click dataset commonly used for advertisement recommendation and the MovieLens movie rating dataset. The pre-training models adopt the NFM and DeepFM models. The task on the Criteo advertisement click dataset is to predict whether a user clicks the advertisement, with the Area Under the Curve (AUC) on the test set as the evaluation index. The task on the MovieLens movie rating dataset is to predict the user's rating for a movie, which is a regression task, with the mean squared error (Mean Square Error, MSE) on the test set as the evaluation index.
Referring to fig. 11, fig. 11 is an AUC comparison graph provided in the embodiments of the present application. Using the NFM pre-training model on the Criteo advertisement click dataset, it can be seen from fig. 11 (horizontal axis: iteration periods, epochs; vertical axis: AUC) that the AUC of the conventional fine-tuning method is 0.7438 while the AUC of the embodiment of the present application is 0.7569, an improvement of 0.0131.
Fig. 12 is a schematic diagram of the MSE comparison provided in an embodiment of the present application. Using the DeepFM pre-training model on the MovieLens movie rating dataset, it can be seen from fig. 12 (horizontal axis: iteration periods, epochs; vertical axis: MSE) that the MSE of the conventional fine-tuning method is 0.9049 while the MSE of the embodiment of the present application is 0.8941, a reduction in error of 0.0108.
According to the method, the accuracy of recommending advertisements to users can be improved, and meanwhile, when the model is finely adjusted, the gating network is utilized to generate gating sparsity to determine whether feature fusion is carried out, so that two sets of model parameters do not need to be saved, the parameter quantity of the model can be reduced, the model is customized for the users on line in real time, offline training and storing of models of different users are not needed, and therefore offline storage resources are saved.
It will be appreciated that in embodiments of the present application, related data such as user characteristics, click, exposure, and conversion data of a user for an advertisement, etc., when the embodiments of the present application are applied to a specific product or technology, user permissions or consents need to be obtained, and the collection, use, and processing of related data need to comply with related laws and regulations and standards of related countries and regions.
Continuing with the description of an exemplary structure of the information recommendation device 255 provided in the embodiments of the present application implemented as software modules, in some embodiments, as shown in fig. 2, the software modules of the information recommendation device 255 stored in the memory 250 may include:
an information obtaining module 2551, configured to obtain, in response to an information request of a target object, a pre-training recommendation model, history associated information of the target object, and history behavior information of the target object for the history associated information;
An information screening module 2552, configured to screen candidate recommendation information for the target object;
the model fine tuning module 2553 is configured to fine tune the pre-training recommendation model based on the characteristics of the history associated information, the characteristics of the target object, and the history behavior information, so as to obtain a target prediction model corresponding to the target object;
a parameter prediction module 2554, configured to predict, by using the target prediction model, an interest parameter of the target object for the candidate recommendation information;
the information filtering module 2552 is further configured to filter target recommendation information from the candidate recommendation information based on the interest parameter, and return the target recommendation information to the target object.
In some embodiments of the present application, the model fine tuning module 2553 is further configured to generate an embedded feature and a fine tuning feature based on the feature of the history associated information and the feature of the target object; generating input features of the pre-training recommendation model based on controlling fusion of the embedded features and the fine tuning features; and carrying out parameter adjustment on the pre-training recommendation model according to the input characteristics and the historical behavior information until the fine adjustment ending condition is reached, so as to obtain the target prediction model.
In some embodiments of the present application, the model fine-tuning module 2553 is further configured to perform parameter prediction on the embedded feature to obtain a control parameter; the control parameters are used for representing whether the embedded features and the fine tuning features are subjected to feature fusion or not; and controlling the fusion of the fine tuning feature and the embedded feature according to the control parameter to obtain the input feature of the pre-training recommendation model.
In some embodiments of the present application, the model fine tuning module 2553 is further configured to, when the control parameter characterizes that feature fusion is performed, determine a fusion result of the embedded feature and the fine tuning feature as the input feature of the pre-training recommendation model; and when the control parameter characterizes that feature fusion is not performed, determine the embedded feature as the input feature of the pre-training recommendation model.
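The gating behavior described above can be sketched as follows. The hard threshold on a gating logit and the element-wise sum used as the fusion operation are assumptions for illustration; the embodiments only specify that a control parameter decides between the fused result and the plain embedded feature.

```python
import numpy as np

def gated_input(embedded, fine_tune, gate_logit):
    # Hypothetical hard gate: the control parameter g selects whether to fuse
    g = 1.0 if gate_logit > 0 else 0.0   # control parameter from the gating network
    fused = embedded + fine_tune         # one simple choice of feature fusion (element-wise sum)
    return g * fused + (1.0 - g) * embedded

emb = np.array([0.5, -0.2])
ft = np.array([0.1, 0.3])
print(gated_input(emb, ft, gate_logit=2.0))   # fusion performed: [0.6, 0.1]
print(gated_input(emb, ft, gate_logit=-1.0))  # fusion skipped: embedded feature unchanged
```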
In some embodiments of the present application, the historical behavior information includes: historical conversion information representing whether the target object performs a conversion operation for the historical association information; the model fine tuning module 2553 is further configured to predict, using the pre-training recommendation model, a predicted conversion result of the target object with respect to the history associated information from the input feature; and carry out parameter adjustment on the pre-training recommendation model according to the difference between the predicted conversion result and the historical conversion information until the fine tuning ending condition is reached, so as to obtain the target prediction model.
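A minimal sketch of parameter adjustment driven by the difference between the predicted conversion result and the historical conversion label. The one-weight sigmoid "model", the learning rate, and the fixed step count standing in for the fine-tuning end condition are all illustrative assumptions, not the embodiments' actual model.

```python
import math

def bce_loss(pred, label):
    # loss measuring the difference between predicted and historical conversion
    eps = 1e-7
    return -(label * math.log(pred + eps) + (1 - label) * math.log(1 - pred + eps))

# One hypothetical fine-tuning run on a single (input feature, historical conversion) pair.
w, x, label, lr = 0.1, 1.0, 1.0, 0.5           # w stands in for a tunable model parameter
start = 1 / (1 + math.exp(-w * x))
for _ in range(20):                            # fixed step count stands in for the end condition
    pred = 1 / (1 + math.exp(-w * x))
    grad = (pred - label) * x                  # gradient of BCE through a sigmoid unit
    w -= lr * grad                             # parameter adjustment
end = 1 / (1 + math.exp(-w * x))
print(start < end)  # True: prediction moved toward the historical conversion label
```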
In some embodiments of the present application, the model fine-tuning module 2553 is further configured to integrate the features of the history associated information and the features of the target object to obtain an integrated sparse feature; transforming the integrated sparse features into dense features to obtain the embedded features; and extracting the characteristics of the embedded characteristics to obtain the fine tuning characteristics.
In some embodiments of the present application, the information filtering module 2552 is further configured to preliminarily filter a plurality of recall information from the information base using a recall model; sequencing a plurality of recall information through a coarse ranking model to obtain a coarse ranking information sequence; the first N pieces of information in the coarse arrangement information sequence are determined to be the candidate recommendation information; n is more than or equal to 1.
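The recall, coarse-ranking, and top-N screening performed by module 2552 can be sketched as below. The threshold-based recall and the two scoring callables are hypothetical stand-ins for the recall model and the coarse ranking model.

```python
def candidate_pipeline(info_base, recall_score, coarse_score, n):
    recalled = [item for item in info_base if recall_score(item) > 0.5]   # preliminary recall
    coarse_ranked = sorted(recalled, key=coarse_score, reverse=True)      # coarse ranking
    return coarse_ranked[:n]                                              # first N as candidates

info_base = [{"id": i, "recall": i / 10, "coarse": (10 - i) / 10} for i in range(10)]
cands = candidate_pipeline(info_base,
                           lambda x: x["recall"],   # stand-in for the recall model
                           lambda x: x["coarse"],   # stand-in for the coarse ranking model
                           n=3)
print([c["id"] for c in cands])  # [6, 7, 8]
```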
In some embodiments of the present application, the interest parameters include: an estimated click rate and an estimated conversion rate; the information filtering module 2552 is further configured to calculate an estimated recommendation revenue index based on the estimated click rate and the estimated conversion rate; and determine the K pieces of candidate recommendation information with the largest estimated recommendation revenue indexes as the target recommendation information; wherein K is greater than or equal to 1.
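One plausible form of the estimated recommendation revenue index is the product of the estimated click rate and the estimated conversion rate, optionally weighted by a bid. This specific formula is an assumption for illustration; the embodiments only state that the index is calculated from the two estimates.

```python
def revenue_index(pctr, pcvr, bid=1.0):
    # assumed form: expected conversions per impression, optionally weighted by a bid
    return pctr * pcvr * bid

def screen_top_k(candidates, k):
    ranked = sorted(candidates,
                    key=lambda c: revenue_index(c["pctr"], c["pcvr"]),
                    reverse=True)
    return ranked[:k]                  # K items with the largest revenue index

candidates = [{"ad": "a", "pctr": 0.10, "pcvr": 0.50},
              {"ad": "b", "pctr": 0.30, "pcvr": 0.30},
              {"ad": "c", "pctr": 0.20, "pcvr": 0.10}]
print([c["ad"] for c in screen_top_k(candidates, 2)])  # ['b', 'a'] (indexes 0.09 and 0.05)
```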
In some embodiments of the present application, the information recommendation device 255 further includes: model training module 2555; the model training module 2555 is configured to obtain operation data of each information in the information base, and an initial recommendation model; the operation data at least comprises exposure data, click data and conversion data; and training the initial recommendation model regularly by utilizing the characteristics of each piece of information in the information base and the operation data to obtain the pre-training recommendation model.
In some embodiments of the present application, the parameter prediction module 2554 is further configured to convert the feature of the target object and the feature of the candidate recommendation information to obtain a converted feature, and extract an adjustment feature from the converted feature; generating fusion parameters aiming at the conversion characteristics and the adjustment characteristics, and controlling fusion of the conversion characteristics and the adjustment characteristics by utilizing the fusion parameters to obtain characteristics to be predicted; and predicting the feature to be predicted through the target prediction model, and predicting the interest parameter of the target object aiming at the candidate recommendation information.
In some embodiments of the present application, the generating the fusion parameter for the conversion feature and the adjustment feature is implemented by a parameter prediction model, and the performing parameter prediction on the embedded feature to obtain a control parameter is implemented by an initial parameter model;
the model training module 2555 is further configured to generate sample values obeying a preset distribution for the generated control parameters; calculating an updated component of the control parameter according to the sample value; summing the control parameter and the update component to obtain the update parameter; and carrying out parameter optimization on the initial parameter model by utilizing the normalization result of the updated parameters until the optimization ending condition is reached, so as to obtain the parameter prediction model.
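The training procedure for the parameter prediction model described above — sample a value from a preset distribution, derive an update component, add it to the control parameter, and normalize — resembles a relaxed (Gumbel/Concrete-style) gate. The sketch below uses logistic noise as the preset distribution and a sigmoid as the normalization; both choices are assumptions, since the embodiments do not fix either.

```python
import math
import random

def relaxed_gate(control_logit, temperature=1.0):
    u = random.random()                       # sample value obeying a preset distribution
    noise = math.log(u) - math.log(1.0 - u)   # update component derived from the sample
    updated = control_logit + noise           # sum of control parameter and update component
    # normalization result in (0, 1): differentiable, so the initial
    # parameter model can be optimized with ordinary gradient descent
    return 1.0 / (1.0 + math.exp(-updated / temperature))

random.seed(7)
print(relaxed_gate(0.0))  # a value strictly between 0 and 1
```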
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the information recommendation method according to the embodiment of the present application.
The present embodiments provide a computer readable storage medium storing executable instructions that, when executed by a processor, cause the processor to perform an information recommendation method provided by the embodiments of the present application, for example, an information recommendation method as shown in fig. 3.
In some embodiments, the computer readable storage medium may be an FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disc, or CD-ROM; it may also be any device including one or any combination of the above memories.
In some embodiments, the executable instructions may be in the form of programs, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to files in a file system, may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a hypertext markup language (HTML, hyper Text Markup Language) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
As an example, the executable instructions may be deployed to be executed on one computing device (information recommendation device) or on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
In summary, through the embodiments of the present application, when the information recommendation device receives an information request of a target object, fine-tuning is performed, on the basis of a pre-training recommendation model that has learned characteristics common to users, using the characteristics of the target object, the characteristics of the historical associated information of the target object, and the historical behavior information, to obtain a target prediction model that has learned the individual characteristics of the target object, thereby customizing the model for the target object. The target prediction model then predicts more accurately the interest parameters of the target object for the candidate recommendation information, and the interest parameters are used to screen target recommendation information that matches the target object's preferences, improving the accuracy of information recommendation. Because the model fine-tuning process is started when a request of the target object is received, that is, fine-tuning is performed online in real time, an adapted model does not need to be trained and stored offline for each user, which saves storage resources. Moreover, the information recommendation device only needs to store the parameters of the pre-training recommendation model and does not need to generate and store separate fine-tuned model parameters, so the parameter quantity of the model is smaller, and the memory resources consumed by the model during calculation are also smaller.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalent substitutions, improvements, etc. that are within the spirit and scope of the present application are intended to be included within the scope of the present application.

Claims (16)

1. An information recommendation method, characterized in that the information recommendation method comprises:
responding to an information request of a target object, and acquiring a pre-training recommendation model, history associated information of the target object and history behavior information of the target object on the history associated information;
screening candidate recommendation information aiming at the target object;
fine-tuning the pre-training recommendation model based on the characteristics of the history associated information, the characteristics of the target object and the history behavior information to obtain a target prediction model corresponding to the target object;
predicting interest parameters of the target object aiming at the candidate recommendation information through the target prediction model;
and screening target recommendation information from the candidate recommendation information based on the interest parameters, and returning the target recommendation information to the target object.
2. The method according to claim 1, wherein the fine tuning the pre-training recommendation model based on the characteristics of the history related information, the characteristics of the target object, and the history behavior information, to obtain a target prediction model corresponding to the target object, includes:
Generating embedded features and fine-tuning features based on the features of the history associated information and the features of the target object;
generating input features of the pre-training recommendation model based on controlling fusion of the embedded features and the fine tuning features;
and carrying out parameter adjustment on the pre-training recommendation model according to the input characteristics and the historical behavior information until the fine adjustment ending condition is reached, so as to obtain the target prediction model.
3. The method of claim 2, wherein the generating input features of the pre-trained recommendation model based on controlling the fusion of the embedded features and the fine-tuning features comprises:
carrying out parameter prediction on the embedded features to obtain control parameters; the control parameters are used for representing whether the embedded features and the fine tuning features are subjected to feature fusion or not;
and controlling the fusion of the fine tuning feature and the embedded feature according to the control parameter to obtain the input feature of the pre-training recommendation model.
4. A method according to claim 3, wherein said controlling the fusion of the fine tuning feature and the embedded feature in accordance with the control parameter to obtain the input feature of the pre-trained recommendation model comprises:
when the control parameter characterizes that feature fusion is performed, determining a fusion result of the embedded feature and the fine tuning feature as the input feature of the pre-training recommendation model;
when the control parameter characterizes that feature fusion is not performed, determining the embedded feature as the input feature of the pre-training recommendation model.
5. The method according to any one of claims 2 to 4, wherein the historical behavior information comprises: historical conversion information representing whether the target object performs conversion operation for the historical association information; and performing parameter adjustment on the pre-training recommendation model according to the input characteristics and the historical behavior information until reaching a fine tuning ending condition, so as to obtain the target prediction model, wherein the method comprises the following steps:
predicting a predicted conversion result of the target object aiming at the history associated information from the input characteristics by utilizing the pre-training recommendation model;
and carrying out parameter adjustment on the pre-training recommendation model according to the difference between the predicted conversion result and the historical conversion information until the fine tuning ending condition is reached, so as to obtain the target prediction model.
6. The method of claim 5, wherein the history associated information at least comprises: historical exposure information viewed by the target object and historical click information clicked by the target object.
7. The method of any of claims 2 to 4, wherein the generating embedded features and trim features based on the features of the historical association information and the features of the target object comprises:
integrating the characteristics of the history associated information and the characteristics of the target object to obtain integrated sparse characteristics;
transforming the integrated sparse features into dense features to obtain the embedded features;
and extracting the characteristics of the embedded characteristics to obtain the fine tuning characteristics.
8. The method according to any one of claims 1 to 4, wherein screening candidate recommendation information for the target object includes:
using the recall model to preliminarily screen a plurality of recall information from the information base;
sequencing a plurality of recall information through a coarse ranking model to obtain a coarse ranking information sequence;
the first N pieces of information in the coarse arrangement information sequence are determined to be the candidate recommendation information; n is more than or equal to 1.
9. The method of any one of claims 1 to 4, wherein the interest parameter comprises: an estimated click rate and an estimated conversion rate; and the screening target recommendation information from the candidate recommendation information based on the interest parameter comprises:
calculating an estimated recommendation revenue index based on the estimated click rate and the estimated conversion rate;
determining the K pieces of candidate recommendation information with the largest estimated recommendation revenue indexes as the target recommendation information; wherein K is greater than or equal to 1.
10. The method of any one of claims 1 to 4, wherein prior to the obtaining the pre-trained recommendation model, the method further comprises:
acquiring operation data of each piece of information in an information base and an initial recommendation model; the operation data at least comprises exposure data, click data and conversion data;
and training the initial recommendation model regularly by utilizing the characteristics of each piece of information in the information base and the operation data to obtain the pre-training recommendation model.
11. The method according to claim 3 or 4, wherein predicting, by the target prediction model, the interest parameter of the target object for the candidate recommendation information comprises:
Converting the characteristics of the target object and the characteristics of the candidate recommendation information to obtain conversion characteristics, and extracting adjustment characteristics from the conversion characteristics;
generating fusion parameters aiming at the conversion characteristics and the adjustment characteristics, and controlling fusion of the conversion characteristics and the adjustment characteristics by utilizing the fusion parameters to obtain characteristics to be predicted;
and predicting the feature to be predicted through the target prediction model, and predicting the interest parameter of the target object aiming at the candidate recommendation information.
12. The method of claim 11, wherein generating the fusion parameters for the conversion feature and the adjustment feature is performed by a parameter prediction model, and wherein performing parameter prediction on the embedded feature to obtain the control parameters is performed by an initial parameter model;
after the parameter prediction is performed on the embedded feature to obtain the control parameter, and before the fusion parameter is generated for the conversion feature and the adjustment feature, the method further includes:
generating sample values obeying preset distribution aiming at the generated control parameters;
calculating an updated component of the control parameter according to the sample value;
Summing the control parameter and the update component to obtain the update parameter;
and carrying out parameter optimization on the initial parameter model by utilizing the normalization result of the updated parameters until the optimization ending condition is reached, so as to obtain the parameter prediction model.
13. An information recommendation device, characterized in that the information recommendation device comprises:
the information acquisition module is used for responding to an information request of a target object and acquiring a pre-training recommendation model, history associated information of the target object and history behavior information of the target object on the history associated information;
the information screening module is used for screening candidate recommendation information aiming at the target object;
the model fine-tuning module is used for fine-tuning the pre-training recommendation model based on the characteristics of the history associated information, the characteristics of the target object and the history behavior information to obtain a target prediction model corresponding to the target object;
the parameter prediction module is used for predicting interest parameters of the target object aiming at the candidate recommendation information through the target prediction model;
the information screening module is further configured to screen target recommendation information from the candidate recommendation information based on the interest parameter, and return the target recommendation information to the target object.
14. An information recommendation device, characterized in that the information recommendation device comprises:
a memory for storing executable instructions;
a processor for implementing the information recommendation method of any one of claims 1 to 12 when executing executable instructions stored in said memory.
15. A computer readable storage medium storing executable instructions which when executed by a processor implement the information recommendation method of any one of claims 1 to 12.
16. A computer program product comprising a computer program or instructions which, when executed by a processor, implements the information recommendation method of any one of claims 1 to 12.
CN202210006355.6A 2022-01-05 2022-01-05 Information recommendation method, device, equipment, storage medium and program product Pending CN116452263A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210006355.6A CN116452263A (en) 2022-01-05 2022-01-05 Information recommendation method, device, equipment, storage medium and program product

Publications (1)

Publication Number Publication Date
CN116452263A true CN116452263A (en) 2023-07-18

Family

ID=87118797

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210006355.6A Pending CN116452263A (en) 2022-01-05 2022-01-05 Information recommendation method, device, equipment, storage medium and program product

Country Status (1)

Country Link
CN (1) CN116452263A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117132367A (en) * 2023-10-20 2023-11-28 腾讯科技(深圳)有限公司 Service processing method, device, computer equipment, storage medium and program product
CN117252664A (en) * 2023-11-10 2023-12-19 浙江口碑网络技术有限公司 Medicine recommendation reason generation method, device, medium and equipment

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117132367A (en) * 2023-10-20 2023-11-28 腾讯科技(深圳)有限公司 Service processing method, device, computer equipment, storage medium and program product
CN117132367B (en) * 2023-10-20 2024-02-06 腾讯科技(深圳)有限公司 Service processing method, device, computer equipment, storage medium and program product
CN117252664A (en) * 2023-11-10 2023-12-19 浙江口碑网络技术有限公司 Medicine recommendation reason generation method, device, medium and equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40094405

Country of ref document: HK