WO2022116862A1

WO2022116862A1 - Information pushing method and system, model training method, and related devices

Info

Publication number: WO2022116862A1
Application number: PCT/CN2021/132104
Authority: WO
Inventors: 李天浩; 陈大乾
Original assignee: 京东科技控股股份有限公司
Priority date: 2020-12-03
Filing date: 2021-11-22
Publication date: 2022-06-09
Also published as: CN112417293A

Abstract

The present disclosure relates to an information pushing method and system, a model training method, and related devices, which relate to the field of data processing. The information pushing method comprises: acquiring a trained target domain prediction model sent by an offline system, wherein the target domain prediction model is obtained by means of training using a prediction result of a source domain prediction model and user training data of a target domain, and the source domain prediction model is obtained by means of training using user training data of a source domain; and using the target domain prediction model to predict data to be detected of a user of the target domain, so as to obtain an information pushing result for the corresponding user. Therefore, when a new service or a new application scenario goes online, a corresponding prediction model can be quickly and accurately provided in the embodiments of the present disclosure.

Description

Information push method and system, model training method and related equipment

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on and claims priority to CN application No. 202011397968.4, filed on December 3, 2020, the disclosure of which is hereby incorporated into this application in its entirety.

TECHNICAL FIELD

The present disclosure relates to the field of data processing, and in particular, to an information push method and system, a model training method and related equipment.

BACKGROUND

An enterprise often has a matrix of products, each of which has user data for a specific business scenario. In the personalized recommendation methods of the related art, model training is usually performed based only on the data of the business scenario itself.

SUMMARY OF THE INVENTION

According to a first aspect of some embodiments of the present disclosure, an information push method is provided, including: acquiring a trained target domain prediction model sent by an offline system, wherein the target domain prediction model is obtained by training with a prediction result of a source domain prediction model and user training data of a target domain, and the source domain prediction model is obtained by training with user training data of a source domain; and using the target domain prediction model to predict user data to be measured in the target domain, so as to obtain an information push result for the corresponding user.

In some embodiments, the trained target domain prediction model sent by the offline system and an encoded value of that model are acquired periodically, and the information push method further includes: in a case where the encoded value of the acquired target domain prediction model differs from the encoded value of the currently used target domain prediction model, verifying the version of the acquired target domain prediction model; and in a case where the acquired target domain prediction model passes online verification, replacing the currently used target domain prediction model with the acquired target domain prediction model, wherein the online verification includes version verification.

In some embodiments, the online verification further includes model verification, and the information push method further includes: determining that the model verification passes in a case where the value of a preset parameter of the acquired target domain prediction model is within a preset range.

In some embodiments, the acquired target domain prediction model is a frozen graph file, in which the parameters of the target domain prediction model determined through training are converted into constants.

In some embodiments, the user data to be measured includes user characteristics and product characteristics, and the information push result for the corresponding user indicates whether a product is recommended to the user.

According to a second aspect of some embodiments of the present disclosure, there is provided a model training method for information push, including: training a source domain prediction model by using user training data of a source domain; inputting source domain feature data and target domain feature data corresponding to the same user into the source domain prediction model and a target domain prediction model, respectively, to obtain a source domain prediction result and a target domain prediction result corresponding to the user; and adjusting parameters of the target domain prediction model according to the difference between the source domain prediction result and the target domain prediction result, and the difference between a label value corresponding to the user and the target domain prediction result.

In some embodiments, adjusting the parameters of the target domain prediction model includes: adjusting the parameters of the target domain prediction model based on a loss function of the target domain prediction model, wherein the loss function of the target domain prediction model includes the cross-entropy between the source domain prediction result and the target domain prediction result, and the cross-entropy between the label value corresponding to the user and the target domain prediction result.

In some embodiments, the complexity of the target domain prediction model is higher than the complexity of the source domain prediction model.

In some embodiments, the number of dimensions of the input data of the target domain prediction model is greater than the number of dimensions of the input data of the source domain prediction model.

In some embodiments, the model training method further includes: acquiring first feature data and second feature data corresponding to each of multiple users, wherein the second feature data corresponding to a user includes features of some dimensions of the first feature data of the same user; training a first preliminary model by using the first feature data; inputting the first feature data and the second feature data corresponding to the same user into the first preliminary model and a second preliminary model, respectively, to obtain a first preliminary model prediction result and a second preliminary model prediction result corresponding to the user; adjusting parameters of the second preliminary model according to the difference between the first preliminary model prediction result and the second preliminary model prediction result, and the difference between the label value corresponding to the user and the second preliminary model prediction result; and using the adjusted second preliminary model as the source domain prediction model.

In some embodiments, the first preliminary model and the second preliminary model have the same network model structure except that the input layer is different.

In some embodiments, the source domain prediction result and the target domain prediction result corresponding to the user are recommendation results associated with the same item.

According to a third aspect of some embodiments of the present disclosure, there is provided an information push apparatus, including: an acquisition module configured to acquire a trained target domain prediction model sent by an offline system, wherein the target domain prediction model is obtained by training with a prediction result of a source domain prediction model and user training data of a target domain, and the source domain prediction model is obtained by training with user training data of a source domain; and a prediction module configured to use the target domain prediction model to predict user data to be measured in the target domain, so as to obtain an information push result for the corresponding user.

According to a fourth aspect of some embodiments of the present disclosure, there is provided a model training apparatus for information push, including: a source domain training module configured to train a source domain prediction model by using user training data of a source domain; and a target domain training module configured to input source domain feature data and target domain feature data corresponding to the same user into the source domain prediction model and a target domain prediction model, respectively, to obtain a source domain prediction result and a target domain prediction result corresponding to the user, and to adjust parameters of the target domain prediction model according to the difference between the source domain prediction result and the target domain prediction result, and the difference between a label value corresponding to the user and the target domain prediction result.

According to a fifth aspect of some embodiments of the present disclosure, an information push system is provided, including: an information push apparatus; and a model training apparatus for information push.

According to a sixth aspect of some embodiments of the present disclosure, there is provided an information push apparatus, including: a memory; and a processor coupled to the memory, the processor being configured to execute any one of the foregoing information push methods based on instructions stored in the memory.

According to a seventh aspect of some embodiments of the present disclosure, there is provided a model training apparatus for information push, including: a memory; and a processor coupled to the memory, the processor being configured to execute any one of the foregoing model training methods for information push based on instructions stored in the memory.

According to an eighth aspect of some embodiments of the present disclosure, there is provided a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements any one of the foregoing information push methods or any one of the foregoing model training methods for information push.

Other features of the present disclosure and advantages thereof will become apparent from the following detailed description of exemplary embodiments of the present disclosure with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

FIG. 1 shows a schematic flowchart of an information push method according to some embodiments of the present disclosure.

FIG. 2 shows a schematic flowchart of a model training method for information push according to some embodiments of the present disclosure.

FIG. 3 shows a schematic flowchart of a pre-training method according to some embodiments of the present disclosure.

FIG. 4 shows a schematic flowchart of a model verification method according to some embodiments of the present disclosure.

FIG. 5 shows a schematic structural diagram of an information pushing apparatus according to some embodiments of the present disclosure.

FIG. 6 shows a schematic structural diagram of a model training apparatus for information push according to some embodiments of the present disclosure.

FIG. 7 shows a schematic structural diagram of an information push system according to some embodiments of the present disclosure.

FIG. 8 shows a schematic structural diagram of a data processing apparatus according to some embodiments of the present disclosure.

FIG. 9 shows a schematic structural diagram of a data processing apparatus according to other embodiments of the present disclosure.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the disclosure, its application, or its uses. Based on the embodiments in the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present disclosure.

The relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.

Meanwhile, it should be understood that, for convenience of description, the various parts shown in the accompanying drawings are not drawn to scale.

Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be considered part of the granted specification.

In all examples shown and discussed herein, any specific value should be construed as illustrative only and not as limiting. Accordingly, other examples of exemplary embodiments may have different values.

It should be noted that like numerals and letters refer to like items in the following figures, so once an item is defined in one figure, it does not require further discussion in subsequent figures.

In some business scenarios, some products in the growing stage often have insufficient user data, which makes it difficult for the model to obtain good generalization ability during model training in this business scenario, resulting in poor personalization effect.

A technical problem to be solved by the embodiments of the present disclosure is: how to improve the accuracy of user recommendation in a new business scenario.

FIG. 1 shows a schematic flowchart of an information push method according to some embodiments of the present disclosure. As shown in FIG. 1 , the information pushing method of this embodiment includes steps S102 to S104.

In step S102, a trained target domain prediction model (the prediction network model) sent by the offline system is acquired, wherein the target domain prediction model is obtained by training with the prediction result of a source domain prediction model, which acts as its teacher network model, and the user training data of the target domain. The source domain prediction model is obtained by training with user training data of the source domain.

In some embodiments, "domains" represent different business scenarios. The source domain is, for example, a mature and usable business scenario with a large amount of user data, and the target domain is, for example, a new business scenario with a small amount of user data.

For example, a company's main business is e-commerce, and its e-commerce platform has sufficient user purchase data. To predict the likelihood that a user will purchase an item, and thus recommend items to the user, a prediction model can be trained using the user purchase data. Later, the company expands into financial services, and some users of the original e-commerce platform also sign up for these services. However, because the number of users who have signed up is limited, it is difficult to obtain a highly accurate model if the prediction model is trained only on the users' financial business data. In this case, the user purchase data can be regarded as the data of the source domain, and the users' financial business data can be regarded as the data of the target domain. The training results obtained from the source domain data are used to assist the training of the prediction model in the target domain.

In the related art, a neural network model is commonly trained according to the difference between the model's prediction result for the training data and the label value. However, because the amount of data in the target domain is insufficient, the training process here also draws on a source domain prediction model trained with a sufficient amount of data. When the target domain prediction model is trained, both its prediction accuracy on its own input data and the difference between its prediction results and those of the source domain prediction model are considered, so that the target domain prediction model can learn the knowledge already learned by the source domain prediction model. Therefore, even when the amount of data in the target domain is small, a target domain prediction model with higher accuracy can be obtained. The specific training method of the target domain prediction model is further introduced in the following embodiments.

In some embodiments, the acquired target domain prediction model is a frozen graph file, in which the parameters of the target domain prediction model determined through training are converted into constants. The online system therefore obtains a lighter model file, which helps the new version of the model go online more efficiently.

In step S104, the target domain prediction model is used to predict the user data to be measured in the target domain, so as to obtain an information push result for the corresponding user.

In some embodiments, the user data to be measured includes user characteristics, product characteristics, environmental characteristics (e.g., time characteristics, characteristics of other users, characteristics of related products, and characteristics of platform activities), and the like.

The information push result for the corresponding user indicates whether a product is recommended to the user. For example, the information push result takes one of two values: a first indicator, which indicates that the product is recommended to the user, and a second indicator, which indicates that the product is not recommended to the user. For example, the user data to be measured of user A includes the characteristics of user A and the characteristics of a certain shampoo, and the information push result is whether the shampoo is recommended to user A. In some embodiments, the target domain prediction model outputs a probability, and the recommendation result is determined by comparing that probability with a preset probability.
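The comparison between the predicted probability and the preset probability can be sketched as follows. This is a minimal illustration only: the function name `push_decision`, the default threshold of 0.5, the use of `>=` for the comparison, and the encoding of the two indicators as 1 and 0 are all assumptions that do not appear in the source.

```python
def push_decision(probability: float, threshold: float = 0.5) -> int:
    """Map a predicted probability to a push indicator.

    Returns 1 (first indicator: recommend) or 0 (second indicator: do not
    recommend). The threshold default and the 1/0 encoding are assumptions.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return 1 if probability >= threshold else 0


# Hypothetical model outputs for (user, product) pairs.
scores = {("user_A", "shampoo"): 0.83, ("user_A", "soap"): 0.12}
decisions = {pair: push_decision(p) for pair, p in scores.items()}
```

Only pairs whose decision is 1 would then have their product information sent to the user terminal.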

After the recommendation result is determined, a push interface between the front-end application module and the back-end server can be used to send the recommendation result to the user's terminal in a preset information format.

In some embodiments, when the information push result is to recommend the corresponding product to the user, the information of the corresponding product is sent to the user terminal; when the information push result is not to recommend the corresponding product to the user, the information of the corresponding product is not sent to the user.

The above embodiment makes effective use of cross-domain data, aggregates the knowledge that can be learned from multiple data islands, and uses the training results of the source domain prediction model to assist the training of the target domain prediction model, which further improves the generalization ability of the model. Therefore, when a new service or a new application scenario is launched, the embodiments of the present disclosure can quickly and accurately provide a corresponding prediction model.

The following describes an embodiment of the model training method for information push of the present disclosure with reference to FIG. 2 .

FIG. 2 shows a schematic flowchart of a model training method for information push according to some embodiments of the present disclosure. As shown in FIG. 2 , the model training method for information push in this embodiment includes steps S202 to S206.

In some embodiments, before the training process, the users' exposure logs, click logs, the forward index of product content, user profile feature logs, and the like are collected. After these data are collected, data fusion is performed, for example, through identifiers such as device numbers, and dirty data lacking valid features is removed, thereby obtaining multiple pieces of user data. In some embodiments, positive samples and negative samples can also be extracted from the obtained data according to a preset ratio; in addition, samples that do not meet preset conditions can be filtered out, for example, samples in which the number of times products were browsed is below a certain threshold.
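The preparation steps above (fusion by device number, removal of dirty data, and sampling by a preset ratio) can be sketched roughly as below. The data layout, the function name `build_samples`, the rule that a click makes a positive sample, and the ratio of two negatives per positive are illustrative assumptions rather than details taken from the source.

```python
import random


def build_samples(exposures, clicks, profiles, neg_per_pos=2, seed=0):
    """Fuse logs by device ID, drop dirty rows, and downsample negatives.

    exposures: list of (device_id, product_id) pairs shown to users
    clicks:    set of (device_id, product_id) pairs that were clicked
    profiles:  dict device_id -> feature dict (may be missing or empty)
    """
    positives, negatives = [], []
    for device_id, product_id in exposures:
        features = profiles.get(device_id)
        if not features:  # dirty data: no usable user features
            continue
        label = 1 if (device_id, product_id) in clicks else 0
        row = {"device_id": device_id, "product_id": product_id,
               "features": features, "label": label}
        (positives if label else negatives).append(row)
    # Keep at most neg_per_pos negatives for every positive sample.
    rng = random.Random(seed)
    keep = min(len(negatives), neg_per_pos * len(positives))
    return positives + rng.sample(negatives, keep)


exposures = [("d1", "p1"), ("d1", "p2"), ("d2", "p1"), ("d3", "p1"),
             ("d1", "p3"), ("d2", "p3")]
clicks = {("d1", "p1")}
profiles = {"d1": {"age": 30}, "d2": {"age": 41}, "d3": {}}
samples = build_samples(exposures, clicks, profiles, neg_per_pos=2)
```

In this toy input, device `d3` has an empty profile, so its exposure is discarded before sampling.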

In step S202, the source domain prediction model is trained using the user training data of the source domain.

In step S204, the source domain feature data and the target domain feature data corresponding to the same user are input into the source domain prediction model and the target domain prediction model, respectively, to obtain the source domain prediction result and the target domain prediction result corresponding to the user. That is, input the source domain feature data corresponding to a user into the source domain prediction model to obtain the source domain prediction result; then input the target domain feature data corresponding to the same user into the target domain prediction model to obtain the target domain prediction result.

In step S206, the parameters of the target domain prediction model are adjusted according to the difference between the source domain prediction result and the target domain prediction result, and the difference between the label value corresponding to the user and the target domain prediction result.

In some embodiments, the parameters of the target domain prediction model are adjusted based on the loss function of the target domain prediction model, wherein the loss function of the target domain prediction model includes the cross-entropy between the source domain prediction result and the target domain prediction result, and the cross-entropy between the label value corresponding to the user and the target domain prediction result. For example, Equation (1) is taken as the loss function of the target domain prediction model.

$$L = \mathrm{CE}(y, \mathrm{pred}) + \alpha\,\mathrm{CE}(q, \mathrm{pred}) \tag{1}$$

In formula (1), L represents the value of the loss function; CE(*,*) represents the calculation of cross entropy for the two variables in parentheses; y represents the label value; pred represents the prediction result of the target domain; q represents the prediction result of the source domain; α represents a preset parameter.
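The loss in Equation (1) can be sketched in a few lines. The sketch assumes a binary recommend/not-recommend output, so that CE is the binary cross-entropy and q may be a soft value between 0 and 1; the function names and the value alpha = 0.5 are hypothetical and not taken from the source.

```python
import math


def cross_entropy(target: float, pred: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy CE(target, pred); target may be a soft value in [0, 1]."""
    pred = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def target_domain_loss(y: float, q: float, pred: float, alpha: float = 0.5) -> float:
    """Equation (1): L = CE(y, pred) + alpha * CE(q, pred)."""
    return cross_entropy(y, pred) + alpha * cross_entropy(q, pred)


# The label says "recommend" (y = 1) and the source domain model agrees (q = 0.9).
loss_good = target_domain_loss(y=1.0, q=0.9, pred=0.85)
loss_bad = target_domain_loss(y=1.0, q=0.9, pred=0.20)
```

A target domain prediction that agrees with both the label and the source domain result (`loss_good`) is penalized less than one that contradicts them (`loss_bad`); setting `alpha` to 0 reduces the loss to ordinary supervised training.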

In some embodiments, the value of q is determined by the softmax layer represented by Equation (2).

$$q_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)} \tag{2}$$

In formula (2), q_i represents the probability of the i-th class in the classification result of the source domain prediction model; z_i represents the input to the softmax layer corresponding to the i-th class; j indexes the classes given by the source domain prediction model, and z_j represents the input to the softmax layer corresponding to the j-th class; T represents a preset "temperature" parameter, which controls how strongly the prediction result of the source domain prediction model is softened. In some embodiments, T=10, a value that the inventors' tests found to yield a good training effect. After q_i is determined for each class, the maximum value among them is taken as the value of q.
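A temperature-scaled softmax of the kind described for the source domain prediction model, q_i = exp(z_i / T) / sum_j exp(z_j / T), can be sketched as below. The function name, the example logits, and the max-subtraction step (a common numerical-stability measure) are assumptions added for illustration.

```python
import math


def softened_probabilities(logits, temperature=10.0):
    """Temperature-scaled softmax: q_i = exp(z_i / T) / sum_j exp(z_j / T)."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtracting the max does not change the result
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0]  # hypothetical softmax-layer inputs for two classes
sharp = softened_probabilities(logits, temperature=1.0)
soft = softened_probabilities(logits, temperature=10.0)
q = max(soft)  # the text takes the largest q_i as the value of q
```

With T = 10 the two class probabilities move closer together than with T = 1, which is the softening effect the temperature parameter is meant to provide.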

For example, a certain user has user purchase data on the e-commerce platform, which serves as the source domain, and financial business data on the financial business platform, which serves as the target domain. Take a recommendation concerning a certain mobile phone as an example. The source domain prediction result corresponding to the user is, for example, whether the mobile phone is recommended to the user, and the target domain prediction result corresponding to the user is, for example, whether a financial service for purchasing the mobile phone in installments is recommended to the user. If, for the same user and the same item, both the source domain prediction result and the target domain prediction result are "recommend", the output values of the source domain prediction model and the target domain prediction model can be considered the same.

In some embodiments, the complexity of the target domain prediction model is higher than the complexity of the source domain prediction model. The complexity of a model is measured, for example, by the number of layers of the model, the number of layers with preset mechanisms (e.g., attention mechanisms), the number of parameters, and so on.

In the related art, the model trained later is often less complex than the model trained earlier, or processes lower-dimensional data than the model trained earlier. This is so that the later-trained model can meet the requirements of lightweight operation. For example, when a server-side model is ported to a mobile terminal, the computing, storage, and data processing capabilities of the mobile terminal need to be considered. However, the embodiments of the present disclosure are applied to cross-domain learning, so the complexity of the target domain prediction model may be higher than that of the source domain prediction model, and the input data of the target domain prediction model may also be more complex. Therefore, even for new business scenarios, more complex information push strategies can be implemented.

In some embodiments, in order to further improve the efficiency of training the target domain prediction model, the source domain prediction model uses as few input dimensions as possible. In order to improve the training efficiency and also ensure the accuracy of the training, in some embodiments, the source domain prediction model is obtained through a pre-training process. An embodiment of the pre-training method of the present disclosure is described below with reference to FIG. 3 . The training idea of this embodiment is similar to the idea of training the target domain prediction model, both of which use the training result of one model to improve the training accuracy of another model.

FIG. 3 shows a schematic flowchart of a pre-training method according to some embodiments of the present disclosure. As shown in FIG. 3 , the pre-training method of this embodiment includes steps S302 to S310.

In step S302, first feature data and second feature data corresponding to each of multiple users are acquired, wherein the second feature data corresponding to a user includes features of some dimensions of the first feature data of the same user.

In some embodiments, the first feature data and the second feature data are both data of the source domain, and the difference lies in the number of features of the two. For example, if the first feature data is 1000-dimensional user purchase data, the second feature data takes part of the dimensions to form 100-dimensional user purchase data.

In step S304, a first preliminary model is trained by using the first feature data.

In step S306, the first feature data and the second feature data corresponding to the same user are input into the first preliminary model and the second preliminary model, respectively, to obtain the first preliminary model prediction result and the second preliminary model prediction result corresponding to the user.

In step S308, the parameters of the second preliminary model are adjusted according to the difference between the prediction result of the first preliminary model and the prediction result of the second preliminary model, and the difference between the label value corresponding to the user and the prediction result of the second preliminary model.

For example, as in the training process of the target domain prediction model, these differences can be represented by cross-entropy. In some embodiments, the parameters of the second preliminary model are adjusted based on the loss function of the second preliminary model, wherein the loss function of the second preliminary model includes the cross-entropy between the prediction result of the first preliminary model and the prediction result of the second preliminary model, and the cross-entropy between the label value corresponding to the user and the prediction result of the second preliminary model. For example, Equation (3) is taken as the loss function of the second preliminary model.

$$L_{\mathrm{pre}} = (1-\lambda)\,\mathrm{CE}(y', \mathrm{pred}') + \lambda\,\mathrm{CE}(q', \mathrm{pred}') \tag{3}$$

In formula (3), L_pre represents the value of the loss function; CE(*,*) represents the calculation of the cross-entropy for the two variables in parentheses; y' represents the label value; pred' represents the prediction result of the second preliminary model; q' represents the prediction result of the first preliminary model; λ represents a preset weighting parameter.
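As a short worked step, assume (the source does not state this) that the second preliminary model outputs a single probability pred' = σ(z) from a logit z, and that CE is the binary cross-entropy. The gradient of Equation (3) with respect to z then collapses to a single term:

```latex
\begin{aligned}
\mathrm{CE}(t, p) &= -t \log p - (1 - t)\log(1 - p), \qquad p = \sigma(z), \\
\frac{\partial\,\mathrm{CE}(t, p)}{\partial z} &= p - t, \\
\frac{\partial L_{\mathrm{pre}}}{\partial z}
  &= (1-\lambda)\,(\mathrm{pred}' - y') + \lambda\,(\mathrm{pred}' - q')
   = \mathrm{pred}' - \bigl[(1-\lambda)\,y' + \lambda\,q'\bigr].
\end{aligned}
```

Under these assumptions, minimizing Equation (3) amounts to fitting the second preliminary model to the blended target (1-λ)y' + λq', so λ sets how far the hard label is pulled toward the first preliminary model's prediction.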

Although the second preliminary model has fewer input dimensions, during training it also learns from the training results of the first preliminary model, which was trained on data with more dimensions. Therefore, the second preliminary model also achieves relatively high prediction accuracy.

In some embodiments, the first preliminary model and the second preliminary model have the same network model structure except for their input layers. Thus, the training process of the second preliminary model can focus more on extracting knowledge about the features it does not use.

In step S310, the adjusted second preliminary model is used as the source domain prediction model. The second preliminary model completes training after a number of adjustment iterations.

In some embodiments, the trained second preliminary model may also be tested. If the test accuracy rate is greater than the preset value, the second preliminary model is used as the source domain prediction model. If the test accuracy rate is not greater than the preset value, retraining may be selected; or it may be considered that the input features of the second preliminary model are insufficient to characterize the user, and the features need to be re-selected as the input features of the second preliminary model.

After the second preliminary model is obtained through the above pre-training process and used as the source domain prediction model, the source domain prediction model uses fewer input features yet has a prediction accuracy comparable to that of a model trained on higher-dimensional features, thereby indirectly improving the training efficiency of the target domain prediction model.
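The pre-training flow of steps S302 to S310 can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: both preliminary models are plain logistic regressions rather than neural networks, the data are synthetic, and λ = 0.5. Because binary cross-entropy is linear in its target, Equation (3) equals CE((1-λ)y' + λq', pred'), so the second model is simply trained against that blended target.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    return sigmoid(sum(w * v for w, v in zip(weights, x)))


def ce(t, p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))


def train(rows, targets, dim, epochs=200, lr=0.5):
    """Full-batch gradient descent on mean CE(target, pred) for a logistic model."""
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for x, t in zip(rows, targets):
            err = predict(w, x) - t  # d CE / d logit for a sigmoid output
            for k in range(dim):
                grad[k] += err * x[k] / len(rows)
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


rng = random.Random(0)
# S302: first feature data has 4 dimensions; second keeps its first 2 dimensions.
first = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
labels = [1.0 if x[0] + 0.5 * x[1] + 0.3 * x[2] > 0 else 0.0 for x in first]
second = [x[:2] for x in first]

lam = 0.5  # hypothetical value of the weight in Equation (3)
first_model = train(first, labels, dim=4)                        # S304
q = [predict(first_model, x) for x in first]                     # S306
blended = [(1 - lam) * y + lam * qi for y, qi in zip(labels, q)]
second_model = train(second, blended, dim=2)                     # S308


def loss_pre(w):
    """Mean of Equation (3) over the data set for second-model weights w."""
    return sum((1 - lam) * ce(y, predict(w, x)) + lam * ce(qi, predict(w, x))
               for x, y, qi in zip(second, labels, q)) / len(second)


accuracy = sum((predict(second_model, x) > 0.5) == (y == 1.0)
               for x, y in zip(second, labels)) / len(labels)
```

In a fuller version, `accuracy` would be measured on held-out data and compared with the preset value before the second model is adopted as the source domain prediction model (step S310).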

In some embodiments, the training process and the information pushing process of the target domain prediction model can be deployed in an offline system and an online system, respectively. The offline system can periodically update the trained target domain prediction model through the data accumulated during the business process, and send it to the online system for application.

In some embodiments, the online system may also verify the target domain prediction model before updating it. An embodiment of the model verification method of the present disclosure is described below with reference to FIG. 4.

FIG. 4 shows a schematic flowchart of a model verification method according to some embodiments of the present disclosure. As shown in FIG. 4, the model verification method of this embodiment includes steps S402 to S408.

In step S402, the trained target domain prediction model sent by the offline system, together with the encoded value of that model, is acquired periodically. In some embodiments, the encoded value is an MD5 encoded value.

In step S404, in the case that the encoded value of the acquired target domain prediction model is different from the encoded value of the currently used target domain prediction model, the version of the acquired target domain prediction model is verified.

For example, suppose the offline system sends the same newly trained version of the target domain prediction model twice. After the first transmission, the online system has already put the model online. If, on the second transmission, the online system repeated the process of putting the same model online, system efficiency would suffer and system resources would be wasted. Checking the encoded value therefore avoids putting the same model online repeatedly and saves system resources.
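The duplicate check on the encoded value can be sketched as follows. This is a minimal illustration, assuming that the model arrives as a byte string and that the encoded value is an MD5 hex digest; the function names `md5_of` and `differs_from_current` are hypothetical and do not come from the disclosure.

```python
import hashlib
from typing import Optional


def md5_of(model_bytes: bytes) -> str:
    """Return the MD5 hex digest of a serialized model file."""
    return hashlib.md5(model_bytes).hexdigest()


def differs_from_current(received_value: str,
                         current_value: Optional[str]) -> bool:
    """True when the received model is not the one already in use.

    current_value is None when no model has been put online yet, in which
    case any received model counts as different.
    """
    return received_value != current_value
```

With this check, a second transmission of an identical file yields the same digest and is skipped, so the process of putting the model online is not repeated.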

In step S406, based on the version verification result, determine the online verification result of the acquired target domain prediction model.

In some embodiments, the online verification further includes model verification, which checks whether the value of a preset parameter of the acquired target domain prediction model is within a preset range, for example, whether a key parameter is empty. In this way, an incorrectly sent object or a transmission error can be detected in time, which improves the stability of the system.
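A model verification of this kind might look like the sketch below. It assumes, purely for illustration, that the preset parameters are exposed as a name-to-number mapping and that each preset range is a closed interval; the function name `model_verification` and the parameter names used in the usage example are hypothetical.

```python
from typing import Mapping, Optional, Tuple


def model_verification(params: Mapping[str, Optional[float]],
                       preset_ranges: Mapping[str, Tuple[float, float]]) -> bool:
    """Return True only if every preset parameter is present, non-empty,
    and inside its preset closed range [low, high]."""
    for name, (low, high) in preset_ranges.items():
        value = params.get(name)
        if value is None:
            # Key parameter is missing or empty.
            return False
        if not (low <= value <= high):
            # Out of range; a NaN value also fails this comparison.
            return False
    return True
```

For example, `model_verification({"input_dim": 64.0}, {"input_dim": (1.0, 1024.0)})` returns `True`, whereas a mapping in which `input_dim` is `None` fails the check.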

In step S408, if the online verification of the acquired target domain prediction model is passed, the currently used target domain prediction model is replaced with the acquired target domain prediction model.

Through the above verification process, the stability of the process of putting the model online can be improved.
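Steps S402 to S408 can be tied together as in the following sketch. It is an illustration under stated assumptions rather than the implementation of the disclosure: the class names `ModelPackage` and `OnlineSystem`, the method `on_receive`, and the injected `model_check` callable are hypothetical, and the sketch treats a differing encoded value as a passed version verification.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModelPackage:
    payload: bytes       # serialized target domain prediction model
    encoded_value: str   # encoded value sent by the offline system, e.g. MD5


class OnlineSystem:
    def __init__(self, model_check: Callable[[bytes], bool]) -> None:
        self.current: Optional[ModelPackage] = None
        self._model_check = model_check

    def on_receive(self, package: ModelPackage) -> bool:
        """Handle one periodically acquired package (S402).

        Returns True if the package replaced the model in use."""
        # S404: compare the encoded value with that of the model in use.
        current_value = self.current.encoded_value if self.current else None
        version_ok = package.encoded_value != current_value
        # S406: online verification combines version and model verification.
        online_ok = version_ok and self._model_check(package.payload)
        # S408: replace the model in use only when online verification passes.
        if online_ok:
            self.current = package
        return online_ok
```

Because the replacement happens only after both checks succeed, a failed package leaves the model currently in use untouched.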

An embodiment of the information pushing apparatus is described below with reference to FIG. 5.

FIG. 5 shows a schematic structural diagram of an information pushing apparatus according to some embodiments of the present disclosure. As shown in FIG. 5, the information pushing apparatus 500 of this embodiment includes: an acquisition module 5100 configured to acquire a trained target domain prediction model sent by an offline system, wherein the target domain prediction model is obtained by training using a prediction result of a source domain prediction model and user training data of a target domain, and the source domain prediction model is obtained by training using user training data of a source domain; and a prediction module 5200 configured to use the target domain prediction model to predict user data to be tested of the target domain, so as to obtain an information push result for the corresponding user.

In some embodiments, the acquisition module 5100 is further configured to periodically acquire the trained target domain prediction model sent by the offline system and the encoded value of the target domain prediction model. The information pushing apparatus 500 further includes a verification module 5300 configured to: verify the version of the acquired target domain prediction model in the case that the encoded value of the acquired target domain prediction model is different from the encoded value of the currently used target domain prediction model; and, in the case that the acquired target domain prediction model passes online verification, replace the currently used target domain prediction model with the acquired target domain prediction model, wherein the online verification includes the version verification.

In some embodiments, the online verification further includes model verification, and the verification module 5300 is further configured to determine that the acquired target domain prediction model passes the model verification in the case that the value of a preset parameter of the acquired target domain prediction model is within a preset range.

In some embodiments, the acquired target domain prediction model is a frozen graph file, in which the parameters of the target domain prediction model determined through training have been converted into constants.

In some embodiments, the user data to be tested includes user features and product features, and the information push result for the corresponding user is a result indicating whether to recommend the product to the user.

The following describes an embodiment of the model training apparatus for information push of the present disclosure with reference to FIG. 6.

FIG. 6 shows a schematic structural diagram of a model training apparatus for information push according to some embodiments of the present disclosure. As shown in FIG. 6, the model training apparatus 600 of this embodiment includes: a source domain training module 6100 configured to train a source domain prediction model using user training data of the source domain; and a target domain training module 6200 configured to input source domain feature data and target domain feature data corresponding to the same user into the source domain prediction model and the target domain prediction model, respectively, to obtain a source domain prediction result and a target domain prediction result corresponding to the user, and to adjust the parameters of the target domain prediction model according to the difference between the source domain prediction result and the target domain prediction result, and the difference between the label value corresponding to the user and the target domain prediction result.

In some embodiments, the target domain training module 6200 is further configured to adjust the parameters of the target domain prediction model based on a loss function of the target domain prediction model, wherein the loss function of the target domain prediction model includes the cross entropy between the source domain prediction result and the target domain prediction result, and the cross entropy between the label value corresponding to the user and the target domain prediction result.
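A loss of this form can be sketched for a single sample as follows. The sketch assumes a binary prediction (a probability between 0 and 1) and simply adds the two cross-entropy terms; the weighting factor `distill_weight`, the clipping constant `eps`, and the function names are assumptions for illustration, since the disclosure only states which two terms the loss includes.

```python
import math


def binary_cross_entropy(target: float, predicted: float,
                         eps: float = 1e-12) -> float:
    """Cross entropy between a target probability and a predicted probability."""
    p = min(max(predicted, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def target_domain_loss(source_pred: float, target_pred: float, label: float,
                       distill_weight: float = 1.0) -> float:
    """Sum of two terms for one user:

    soft term: cross entropy between the source domain prediction result
               and the target domain prediction result;
    hard term: cross entropy between the label value and the target
               domain prediction result.
    """
    soft = binary_cross_entropy(source_pred, target_pred)
    hard = binary_cross_entropy(label, target_pred)
    return distill_weight * soft + hard
```

In training, the loss would be averaged over a batch and minimized with respect to the parameters of the target domain prediction model only, with the source domain prediction result treated as a fixed input.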

In some embodiments, the source domain training module 6100 is further configured to: acquire first feature data and second feature data corresponding to each of a plurality of users, wherein the second feature data corresponding to a given user includes features of some dimensions of that user's first feature data; train the first preliminary model using the first feature data; input the first feature data and the second feature data corresponding to the same user into the first preliminary model and the second preliminary model, respectively, to obtain a first preliminary model prediction result and a second preliminary model prediction result corresponding to the user; adjust the parameters of the second preliminary model according to the difference between the first preliminary model prediction result and the second preliminary model prediction result, and the difference between the label value corresponding to the user and the first preliminary model prediction result; and use the adjusted second preliminary model as the source domain prediction model.

The following describes an embodiment of the information push system of the present disclosure with reference to FIG. 7.

FIG. 7 shows a schematic structural diagram of an information push system according to some embodiments of the present disclosure. As shown in FIG. 7, the information push system 70 of this embodiment includes an information pushing apparatus 500 and a model training apparatus 600 for information push.

In some embodiments, the information pushing apparatus 500 is deployed in the online system of the information push system 70, and the model training apparatus 600 is deployed in the offline system of the information push system 70.

FIG. 8 shows a schematic structural diagram of a data processing apparatus according to some embodiments of the present disclosure, where the data processing apparatus is an information pushing apparatus or a model training apparatus for information push. As shown in FIG. 8, the data processing apparatus 80 of this embodiment includes a memory 810 and a processor 820 coupled to the memory 810, and the processor 820 is configured to execute, based on instructions stored in the memory 810, the information push method or the model training method for information push in any one of the foregoing embodiments.

The memory 810 may include, for example, a system memory, a fixed non-volatile storage medium, and the like. The system memory stores, for example, an operating system, an application program, a boot loader (Boot Loader), and other programs.

FIG. 9 shows a schematic structural diagram of a data processing apparatus according to other embodiments of the present disclosure, where the data processing apparatus is an information pushing apparatus or a model training apparatus for information push. As shown in FIG. 9, the data processing apparatus 90 in this embodiment includes a memory 910 and a processor 920, and may further include an input/output interface 930, a network interface 940, a storage interface 950, and the like. These interfaces 930, 940, and 950, the memory 910, and the processor 920 can be connected, for example, through a bus 960. The input/output interface 930 provides a connection interface for input and output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 940 provides a connection interface for various networked devices. The storage interface 950 provides a connection interface for external storage devices such as SD cards and USB flash drives.

Embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored, wherein, when the program is executed by a processor, any one of the foregoing information push methods or model training methods for information push is implemented.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

The present disclosure is described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The above descriptions are only preferred embodiments of the present disclosure and are not intended to limit the present disclosure. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present disclosure shall fall within the protection scope of the present disclosure.

Claims

1. An information push method, comprising:

acquiring a trained target domain prediction model sent by an offline system, wherein the target domain prediction model is obtained by training using a prediction result of a source domain prediction model and user training data of a target domain, and the source domain prediction model is obtained by training using user training data of a source domain; and

using the target domain prediction model to predict user data to be tested of the target domain, so as to obtain an information push result for the corresponding user.
2. The information push method according to claim 1, wherein the trained target domain prediction model sent by the offline system and the encoded value of the target domain prediction model are acquired periodically, and the information push method further comprises:

in the case that the encoded value of the acquired target domain prediction model is different from the encoded value of the currently used target domain prediction model, verifying the version of the acquired target domain prediction model; and

in the case that the acquired target domain prediction model passes online verification, replacing the currently used target domain prediction model with the acquired target domain prediction model, wherein the online verification includes the version verification.
3. The information push method according to claim 2, wherein the online verification further includes model verification, and the information push method further comprises:

in the case that the value of a preset parameter of the acquired target domain prediction model is within a preset range, determining that the acquired target domain prediction model passes the model verification.
4. The information push method according to claim 1, wherein the acquired target domain prediction model is a frozen graph file, and in the frozen graph file, the parameters of the target domain prediction model determined through training are converted into constants.
5. The information push method according to claim 1, wherein the user data to be tested includes user features and product features, and the information push result for the corresponding user is a result indicating whether to recommend the product to the user.
6. A model training method for information push, comprising:

training a source domain prediction model using user training data of a source domain;

inputting source domain feature data and target domain feature data corresponding to the same user into the source domain prediction model and a target domain prediction model, respectively, to obtain a source domain prediction result and a target domain prediction result corresponding to the user; and

adjusting the parameters of the target domain prediction model according to the difference between the source domain prediction result and the target domain prediction result, and the difference between the label value corresponding to the user and the target domain prediction result.
7. The model training method according to claim 6, wherein adjusting the parameters of the target domain prediction model comprises:

adjusting the parameters of the target domain prediction model based on a loss function of the target domain prediction model, wherein the loss function of the target domain prediction model includes the cross entropy between the source domain prediction result and the target domain prediction result, and the cross entropy between the label value corresponding to the user and the target domain prediction result.
8. The model training method according to claim 6, wherein the complexity of the target domain prediction model is higher than that of the source domain prediction model.
9. The model training method according to claim 6, wherein the number of dimensions of the input data of the target domain prediction model is greater than the number of dimensions of the input data of the source domain prediction model.
10. The model training method according to claim 6, wherein training the source domain prediction model using the user training data of the source domain comprises:

acquiring first feature data and second feature data corresponding to each of a plurality of users, wherein the second feature data corresponding to a given user includes features of some dimensions of that user's first feature data;

training a first preliminary model using the first feature data;

inputting the first feature data and the second feature data corresponding to the same user into the first preliminary model and a second preliminary model, respectively, to obtain a first preliminary model prediction result and a second preliminary model prediction result corresponding to the user;

adjusting the parameters of the second preliminary model according to the difference between the first preliminary model prediction result and the second preliminary model prediction result, and the difference between the label value corresponding to the user and the first preliminary model prediction result; and

using the adjusted second preliminary model as the source domain prediction model.
11. The model training method according to claim 10, wherein the first preliminary model and the second preliminary model have the same network model structure except for their input layers.
12. The model training method according to claim 6, wherein the source domain prediction result and the target domain prediction result corresponding to the user are recommendation results associated with the same item.
13. An information push device, comprising:

an acquisition module configured to acquire a trained target domain prediction model sent by an offline system, wherein the target domain prediction model is obtained by training using a prediction result of a source domain prediction model and user training data of a target domain, and the source domain prediction model is obtained by training using user training data of a source domain; and

a prediction module configured to use the target domain prediction model to predict user data to be tested of the target domain, so as to obtain an information push result for the corresponding user.
14. A model training device for information push, comprising:

a source domain training module configured to train a source domain prediction model using user training data of a source domain; and

a target domain training module configured to input source domain feature data and target domain feature data corresponding to the same user into the source domain prediction model and a target domain prediction model, respectively, to obtain a source domain prediction result and a target domain prediction result corresponding to the user, and to adjust the parameters of the target domain prediction model according to the difference between the source domain prediction result and the target domain prediction result, and the difference between the label value corresponding to the user and the target domain prediction result.
15. An information push system, comprising:

the information push device according to claim 13; and

the model training device for information push according to claim 14.
16. An information push device, comprising:

a memory; and

a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the information push method according to any one of claims 1 to 5.
17. A model training device for information push, comprising:

a memory; and

a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the model training method for information push according to any one of claims 6 to 12.
18. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the information push method according to any one of claims 1 to 5 or the model training method for information push according to any one of claims 6 to 12.