CN111949886B - Sample data generation method and related device for information recommendation

Info

Publication number: CN111949886B
Application number: CN202010887600.XA
Authority: CN
Inventors: 郝晓波; 葛凯凯; 刘雨丹; 唐琳瑶; 谢若冰; 张旭; 林乐宇
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-08-28
Filing date: 2020-08-28
Publication date: 2023-11-24
Anticipated expiration: 2040-08-28
Also published as: CN111949886A

Abstract

The embodiments of the application disclose an artificial-intelligence-based sample data generation method and related device for information recommendation. User behavior data is acquired from multiple product fields to construct an original training data set, and the original training data set is vectorized to obtain an initial user behavior feature vector for each of the product fields. Each product field to be expanded is taken in turn as the target product field, and a sample data candidate set corresponding to the target product field is generated from the initial user behavior feature vectors of the product fields. The sample data in the candidate set is mixed with the user behavior data in the original training data set to obtain a target training data set, which is used to train a recommendation model for the target product field. The method balances the sample proportions of different product fields, improves the training effect of the recommendation model, and thereby improves the recommendation effect in small-sample product fields.

Description

Sample data generation method and related device for information recommendation

Technical Field

The present application relates to the field of computers, and in particular, to a method and related apparatus for generating sample data for information recommendation.

Background

With the development of the internet, how to effectively screen and filter information, and accurately recommend information of interest to a user, such as information of movies, commodities or foods, is an important research topic.

Current recommendation methods are generally based on a specific product or application (APP), whose users are mostly the target users of that product or APP, so the user base is limited. In addition, even when a recommendation method based on multiple products or APPs is considered, the number of user behavior logs differs greatly between products; if such unequal volumes of user behavior logs are put together to train one multi-objective model, the model cannot be trained effectively.

Therefore, current recommendation models for information recommendation train poorly, so the recommendation effect is poor; in particular, the recommendation effect for products with small data volumes can hardly meet users' needs.

Disclosure of Invention

To solve the above technical problems, the application provides an artificial-intelligence-based sample data generation method for information recommendation. The method combines user behavior data from multiple product fields and produces pseudo samples to generate more sample data, thereby balancing the sample proportions of different product fields, improving the training effect of the recommendation model, and in turn improving the recommendation effect in small-sample product fields.

The embodiment of the application discloses the following technical scheme:

In one aspect, an embodiment of the present application provides a method for generating sample data for information recommendation, the method including:

acquiring user behavior data of a plurality of product fields, and constructing an original training data set;

vectorizing according to the original training data set to obtain an initial user behavior feature vector of each product field in the plurality of product fields;

taking the product fields to be expanded in the product fields as target product fields respectively, and generating a sample data candidate set corresponding to the target product fields according to the initial user behavior feature vector of each product field;

and mixing the sample data in the sample data candidate set with the user behavior data in the original training data set to obtain a target training data set, wherein the target training data set is used for training a recommendation model in the field of the target product.

In another aspect, an embodiment of the present application provides a sample data generating apparatus for information recommendation, including an acquisition unit, a determining unit, a generating unit, and a mixing unit:

the acquisition unit is used for acquiring user behavior data of a plurality of product fields and constructing an original training data set;

the determining unit is used for carrying out vectorization processing according to the original training data set to obtain an initial user behavior feature vector of each product field in the plurality of product fields;

the generating unit is used for respectively taking the product fields to be expanded in the product fields as target product fields and generating sample data candidate sets corresponding to the target product fields according to the initial user behavior feature vectors of each product field;

the mixing unit is used for mixing the sample data in the sample data candidate set with the user behavior data in the original training data set to obtain a target training data set, and the target training data set is used for training a recommendation model in the field of the target product.

In another aspect, an embodiment of the present application provides a sample data generating device for information recommendation, the device including a processor and a memory:

the memory is used for storing program codes and transmitting the program codes to the processor;

the processor is configured to execute the sample data generating method for information recommendation according to any one of the foregoing according to instructions in the program code.

In another aspect, an embodiment of the present application provides a computer-readable storage medium for storing program code for executing the sample data generating method for information recommendation described in any one of the foregoing.

According to the above technical solution, user behavior data of a plurality of product fields can be acquired to construct an original training data set, so that user behaviors in different product fields complement one another. Because a user is unlikely to use many products at the same time, user behavior features across product fields are sparse and the original training data set carries insufficient information; for product fields with little user behavior data in particular, it is difficult to train an effective recommendation model. The application therefore expands the amount of user behavior data by producing pseudo samples with a generation model. For example, vectorization may be performed on the original training data set to obtain an initial user behavior feature vector for each of the plurality of product fields. Each product field to be expanded is taken in turn as the target product field, and a sample data candidate set corresponding to the target product field, namely the pseudo samples, is generated from the initial user behavior feature vectors of the product fields. The sample data in the candidate set is mixed with the user behavior data in the original training data set to obtain a target training data set, and the recommendation model for the target product field is trained on this expanded data set. The solution thus combines user behavior data from multiple product fields and produces pseudo samples to generate more sample data, which balances the sample proportions of different product fields, improves the training effect of the recommendation model, and in turn improves the recommendation effect in small-sample product fields.

Drawings

In order to illustrate the embodiments of the application or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic application scenario diagram of a sample data generating method for information recommendation according to an embodiment of the present application;

FIG. 2 is a flowchart of a sample data generating method for information recommendation according to an embodiment of the present application;

FIG. 3 is an overall framework diagram of a sample data generation method for information recommendation according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a model structure of an FGM model according to an embodiment of the present application;

FIG. 5a is a schematic view of a "look at" recommended interface of an APP according to an embodiment of the present application;

FIG. 5b is a schematic diagram of a recommended interface of a reading APP according to an embodiment of the present application;

FIG. 6 is a flowchart of a method for generating sample data for information recommendation according to an embodiment of the present application;

FIG. 7 is a block diagram of a sample data generating device for information recommendation according to an embodiment of the present application;

FIG. 8 is a block diagram of a terminal device according to an embodiment of the present application;

FIG. 9 is a block diagram of a server according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the accompanying drawings.

In interest recommendation systems, conventional recommendation methods are based on a specific product or a specific APP, whose users are mostly the target users of that product, so the user base is limited.

For example, within a given APP a user may express only the interests related to that APP's content. In a video APP the user may like to watch variety shows, movies and similar video content, whereas in a reading APP the same user may be interested in books rather than variety shows or movies. User behavior within one product can therefore describe the user's interests only in a limited scenario and can hardly cover the user's overall interests. For instance, a video APP recommends video content the user may like, such as a TV drama, but does not recommend the original novel on which the drama is based, even though a user interested in the drama may well be interested in the novel too. Conventional recommendation methods can hardly cover such overall interests.

In addition, because the number of daily active users differs greatly between product fields, the amount of user behavior data also differs greatly; for example, the user behavior data of product field A may be more than 100 times that of product field B (for example, a reading APP). If such unequal amounts of user behavior data are put together to train one multi-objective model, the small amount of data is drowned out by the large amount of other data and the model cannot be trained effectively. Even when cross-domain recommendation is considered, the recommendation effect is poor, and for products with small data volumes in particular it can hardly meet users' needs.

Therefore, the embodiments of the application provide an artificial-intelligence-based sample data generation method for information recommendation. The method combines user behavior data from multiple product fields and produces pseudo samples to generate more sample data, thereby balancing the sample proportions of different product fields, improving the training effect of the recommendation model, and in turn improving the recommendation effect in small-sample product fields.

The method provided by the embodiments of the application relates to cloud technology, for example big data. Big data refers to data sets that cannot be captured, managed and processed with conventional software tools within a given time frame; it is a massive, fast-growing and diversified information asset that requires new processing modes in order to deliver stronger decision-making, insight discovery and process optimization capabilities. With the advent of the cloud era, big data has attracted increasing attention, and special techniques are needed to process large amounts of data effectively within a tolerable time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the internet, and scalable storage systems. One example is mining the online behavior data of users in various product fields.

The method provided by the embodiments of the application also relates to the field of artificial intelligence. Artificial intelligence (AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.

Artificial intelligence is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

In the embodiments of the application, the artificial intelligence technologies involved may include natural language processing and machine learning. Natural language processing (NLP) is an important direction in computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering by robots, and knowledge graphs.

Machine learning is an interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It studies how computers can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to keep improving their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout the various areas of artificial intelligence. Machine learning typically includes deep learning techniques, which in turn include artificial neural networks such as convolutional neural networks (CNN), recurrent neural networks (RNN) and deep neural networks (DNN).

In this embodiment, machine learning may be used to train a generation model, which is then used to generate the sample data candidate set; machine learning may also be used to train a recommendation model, which is then used to perform information recommendation.

The method provided by the embodiments of the application can be applied to various recommendation systems to realize information recommendation in a product field. For example, in the interface of the "watching at one" applet or the "reading" applet of a product, a user can browse articles and videos that the recommendation system recommends from those published on an official account platform and a video platform. The recommendation system uses features such as the user's age and gender, article categories and keywords, together with historical user behavior data, as the basis for the recommended content, so that each user receives individually personalized recommendations.

In order to facilitate understanding of the technical scheme of the application, the sample data generating method for information recommendation based on artificial intelligence provided by the embodiment of the application is introduced below in combination with an actual application scene.

Referring to FIG. 1, FIG. 1 is a schematic diagram of an application scenario of the sample data generation method for information recommendation provided by an embodiment of the application. The application scenario includes a terminal device 101 and a server 102. One or more products, such as a reading APP, can be installed on the terminal device 101. When the terminal device 101 opens the reading APP, the server 102 can return recommendation information to the terminal device 101 through a recommendation system, so that content is recommended to the user across domains. For example, in the reading APP, books such as novels may be recommended to the user, and films and TV dramas adapted from novels may be recommended as well.

The server 102 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The terminal device 101 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal device 101 and the server 102 may be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.

In order to implement cross-domain recommendation, the server 102 may acquire user behavior data of multiple product fields and construct an original training data set, so that user behaviors in different product fields complement one another and can be used to train a recommendation model. Because a user is unlikely to use many products at the same time, user behavior features across product fields are sparse and the original training data set carries insufficient information; for product fields with little user behavior data in particular, it is difficult to train an effective recommendation model. The server 102 can therefore expand the amount of user behavior data by producing pseudo samples.

For example, the server 102 may perform vectorization on the original training data set to obtain an initial user behavior feature vector for each of the plurality of product fields. Each product field to be expanded is then taken in turn as the target product field, and a sample data candidate set corresponding to the target product field, namely the pseudo samples, is generated from the initial user behavior feature vectors of the product fields.

The sample data in the candidate set is mixed with the user behavior data in the original training data set, supplementing the original user behavior data and yielding a target training data set. The server can then train the recommendation model for the target product field on this expanded data set, which improves the training effect of the recommendation model and in turn the recommendation effect in small-sample product fields.
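The mixing step described above can be illustrated with a minimal sketch. The function name `mix_datasets`, the "real"/"pseudo" source tag and the fixed shuffle seed are assumptions made for illustration only; the text does not specify how the two sets are combined beyond "mixing".

```python
import random


def mix_datasets(original, pseudo, seed=0):
    """Concatenate real and pseudo records and shuffle them together.

    The source tag is an illustrative addition, not part of the described
    method; it only makes the origin of each record inspectable.
    """
    mixed = [(rec, "real") for rec in original]
    mixed += [(rec, "pseudo") for rec in pseudo]
    random.Random(seed).shuffle(mixed)
    return mixed


# Hypothetical records in (user, domain, item) form.
original = [("u1", "video", "v1"), ("u2", "video", "v2"), ("u3", "reading", "n1")]
pseudo = [("u1", "reading", "n2"), ("u2", "reading", "n3")]
target_training_set = mix_datasets(original, pseudo)
```

In this toy example the small "reading" field grows from one record to three, which is the balancing effect the pseudo samples are meant to provide.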

It should be noted that, the method for generating sample data for information recommendation provided by the embodiment of the present application may also be executed by a terminal device, which is not limited by the present application. In addition, the method can be realized in an offline scene, and the generated target training data set can be provided for an online multi-domain recommendation model to use.

Next, a description will be given of a sample data generating method for information recommendation provided in an embodiment of the present application with reference to the accompanying drawings, using a server as an execution body.

Referring to fig. 2, fig. 2 shows a flowchart of a sample data generation method for information recommendation, the method comprising:

S201, acquiring user behavior data of a plurality of product fields, and constructing an original training data set.

The server can acquire user behavior data of a plurality of product fields and construct an original training data set. In one possible implementation, the user behavior data in the original training data set may be represented by a triple relationship data structure that captures the correspondence among a user, a product field and the content the user clicked. The triple may be written as (User, Domain, Item), where User represents a user, Domain represents a product field, and Item represents the content the user clicked under the corresponding Domain.

User behavior data across product fields can be formally defined through a triple relationship data structure, so that an original training data set can be conveniently constructed.
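As a minimal sketch of the (User, Domain, Item) triple, the records can be modeled as a small typed tuple. The class name `Behavior`, the helper `build_original_dataset` and the sample log entries are hypothetical and serve only to make the data structure concrete.

```python
from collections import Counter
from typing import NamedTuple


class Behavior(NamedTuple):
    """One (User, Domain, Item) record; field names follow the text."""
    user: str
    domain: str
    item: str


def build_original_dataset(click_log):
    """Turn raw (user, domain, item) tuples into Behavior records."""
    return [Behavior(u, d, i) for (u, d, i) in click_log]


# Hypothetical click log spanning two product fields.
click_log = [
    ("u1", "video", "drama_42"),
    ("u1", "reading", "novel_7"),
    ("u2", "video", "movie_3"),
]
dataset = build_original_dataset(click_log)

# Per-domain record counts make the imbalance between fields visible.
per_domain = Counter(b.domain for b in dataset)
```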

Referring to FIG. 3, FIG. 3 shows an overall framework diagram of a sample data generation method for information recommendation. The server may obtain user behavior data of a plurality of product fields from user click logs through the multi-product-field user behavior processing module (see S301 in FIG. 3) to construct the original training data set.

The original training data set may be constructed as follows: the multi-product-field user behavior processing module gathers the online user behavior data of users in the various product fields and constructs a three-dimensional candidate set (Domain, Item, label), where Domain represents the product field, Item represents the content the user clicked under the corresponding Domain, and label covers two behaviors, exposure with click and exposure without click. These labels are used to train the generation model that produces the pseudo samples.

In some cases, the acquired user behavior data may contain useless data that hardly reflects the user's interests. For example, a user who clicks every item of browsed content one by one provides little basis for analyzing interests. Thus, in some possible implementations, data processing operations such as data cleaning and extreme behavior filtering may be performed on the user behavior data of the multiple product fields before the original training data set is constructed.
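One way to picture extreme behavior filtering is to drop users who click nearly everything they are shown. The sketch below is an assumption about what such a filter could look like; the function name and the thresholds `max_click_rate` and `min_exposures` are illustrative and do not come from the text.

```python
from collections import defaultdict


def filter_extreme_users(records, max_click_rate=0.95, min_exposures=5):
    """Drop users who click (almost) everything they were shown.

    records: list of (user, domain, item, clicked) with clicked in {0, 1}.
    max_click_rate and min_exposures are illustrative thresholds.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for user, _domain, _item, label in records:
        shown[user] += 1
        clicked[user] += label
    extreme = {
        u for u in shown
        if shown[u] >= min_exposures and clicked[u] / shown[u] > max_click_rate
    }
    return [r for r in records if r[0] not in extreme]


# "bot" clicks all six exposures; "u1" shows ordinary mixed behavior.
records = [("bot", "video", f"v{i}", 1) for i in range(6)]
records += [("u1", "video", "v1", 1), ("u1", "video", "v2", 0)]
cleaned = filter_extreme_users(records)
```

The minimum-exposure guard keeps new users with only a handful of clicks from being discarded by accident.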

S202, vectorizing according to the original training data set to obtain an initial user behavior feature vector of each product field in the product fields.

The acquired user behavior data of the plurality of product fields may be used to train a recommendation model across product fields. However, because a user is unlikely to use many products at the same time, user behavior features across product fields are sparse and the original training data set carries insufficient information; for product fields with little user behavior data in particular, it is difficult to train an effective recommendation model. Thus, in order to expand the data volume of small-sample product fields and balance the sample proportions of different product fields, pseudo samples, namely a sample data candidate set, may be generated; see S202-S203.

To facilitate processing of the collected user behavior data, so that pseudo samples with characteristics similar to the user behavior data can be generated, the server can vectorize the user behavior data in the original training data set to obtain an initial user behavior feature vector for each product field.

S203, taking the product fields to be expanded in the product fields as target product fields respectively, and generating sample data candidate sets corresponding to the target product fields according to the initial user behavior feature vectors of the product fields.

In this embodiment, the user behavior data of all of the plurality of product fields may be expanded, that is, the product fields to be expanded are the plurality of product fields themselves. This can improve the recommendation effect both for product fields with small data volumes and for product fields with large data volumes.

However, for product fields whose data volume is already very large and whose coverage is comprehensive, expanding the user behavior data further is unlikely to improve the recommendation effect, or improves it only marginally. In this case, in order to reduce the amount of computation, pseudo samples may be generated only for product fields with small data volumes. The product fields to be expanded are then the small-data-volume product fields among the plurality of product fields, for example the product fields whose amount of user behavior data is less than a preset threshold.
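Selecting the product fields to expand by a preset threshold can be sketched as a simple count-and-compare. The function name `domains_to_expand` and the threshold value are hypothetical; the text only states that fields with less data than a preset threshold may be chosen.

```python
from collections import Counter


def domains_to_expand(dataset, threshold):
    """Return domains whose record count is below `threshold`."""
    counts = Counter(domain for _user, domain, _item in dataset)
    return sorted(d for d, n in counts.items() if n < threshold)


# 100 video records versus a single reading record.
dataset = [("u%d" % i, "video", "v%d" % i) for i in range(100)]
dataset += [("u1", "reading", "novel_7")]
targets = domains_to_expand(dataset, threshold=10)
```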

In one possible implementation, the server may input the initial user behavior feature vectors of each product field into a generation model, and the generation model generates the sample data candidate set by prediction from those vectors.

After the server inputs the initial user behavior feature vectors of each product field into the generation model, the generation model can screen them and extract the user behavior feature vectors most relevant to the input distribution of each product field, namely the effective user behavior feature vectors. Based on this, one possible implementation of S203 is to extract the effective user behavior feature vectors of each product field from the initial user behavior feature vectors, and to generate the sample data candidate set by prediction from the effective user behavior feature vectors of each product field.

The generation model may be any of various neural network models, for example a CNN or a DNN. In one possible implementation, the generation model may be further improved and optimized; for example, a Feature Generation for Multi-task Recommendation (FGM) model is proposed as the generation model used in the embodiments of the application. The embodiments of the application mainly take the FGM model as an example to describe the sample data generation method for information recommendation.

In some cases, the model structure of the FGM model may be as shown in FIG. 4. The generation model may include a domain encoder (DomainEncoder). In this case, the effective user behavior feature vectors of each product field may be extracted from the initial user behavior feature vectors as follows: for each product field, the domain encoder applies random removal to the initial user behavior feature vectors to obtain encoded user behavior feature vectors, and the effective user behavior feature vectors are then determined from the encoded user behavior feature vectors. The encoded user behavior feature vector can serve as the user's domain code in that product field.

In this embodiment, the generation model may include a plurality of DomainEncoders, with the initial user behavior feature vectors of each product field corresponding to one DomainEncoder. The initial user behavior feature vectors of each product field are input to the corresponding DomainEncoder to obtain the encoded user behavior feature vectors.

The structure of each DomainEncoder may be as shown in the dashed box on the right side of FIG. 4. The DomainEncoder includes a mask module. Based on this, the domain encoder may perform random removal on the initial user behavior feature vectors by randomly masking part of the input user behavior feature vectors, according to the following mask formula:

Here, X_t denotes the input initial user behavior feature vector, pos_t denotes the position (order) of the input initial user behavior feature vector, MASK denotes the representation of the randomly masked initial user behavior feature vector, and the result of the formula is the encoded user behavior feature vector.

Because this part belongs to the bottom-layer features of the model, it needs to be pre-trained on the user behavior data of the plurality of product fields to obtain abstract representations of the bottom-layer features; these feature parameters are then kept fixed when used by the upper layers.

Applying random removal to the initial user behavior feature vectors strengthens the associations among features, improves the generalization ability of the generation model, and enhances its inference capability for sample generation.
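The idea of randomly masking part of the input behavior vectors can be sketched as follows. This is a generic random-masking scheme written under stated assumptions, not necessarily the exact formula used by the DomainEncoder: the function name, the all-zero mask embedding and the masking probability are all hypothetical.

```python
import random


def random_mask(vectors, mask_vector, mask_prob=0.15, rng=None):
    """Replace each behavior vector with `mask_vector` with prob. mask_prob.

    Returns the masked sequence and the list of masked positions.
    """
    rng = rng or random.Random()
    masked, positions = [], []
    for pos, vec in enumerate(vectors):
        if rng.random() < mask_prob:
            masked.append(list(mask_vector))
            positions.append(pos)
        else:
            masked.append(list(vec))
    return masked, positions


behavior_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
MASK = [0.0, 0.0]  # hypothetical mask embedding
masked_seq, masked_pos = random_mask(
    behavior_vectors, MASK, mask_prob=0.5, rng=random.Random(0)
)
```

Recording the masked positions makes it possible to train a model to reconstruct the hidden behaviors from their context, which is the usual purpose of this kind of masking.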

Referring to the dashed box on the right side of FIG. 4, the DomainEncoder further includes an attention module. Based on this, the effective user behavior feature vectors may be determined from the encoded user behavior feature vectors by performing an attention operation between the domain identification vector of the product field, for example an identity (ID) vector, and the encoded user behavior feature vectors (that is, the mask result), thereby computing the effective user behavior feature vectors under that domain. The domain identification vector may be denoted as the Domain ID vector.

In general, the attention operation may determine a weight for each encoded user behavior feature vector from that vector and the domain identification vector of the product field. The weight reflects the correlation between the encoded user behavior feature vector and the product field, and the effective user behavior feature vectors are then determined according to these weights. For example, an encoded user behavior feature vector whose weight reaches a certain threshold may be determined to be an effective user behavior feature vector. The "matrix product (matmul)" in the right-hand dashed box of FIG. 4 represents the operation used by the attention operation.

Of course, the effective user behavior feature vectors may also be obtained in other ways. For example, the weight of each initial user behavior feature vector may be determined directly, without the mask processing of the DomainEncoder, and the effective user behavior feature vectors determined from those weights. The embodiments of the application do not limit this.
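A minimal sketch of weighting encoded behavior vectors against a domain ID vector and keeping those above a threshold is given below. It assumes plain dot-product scores normalized with a softmax; the function names, the example vectors and the `min_weight` value are hypothetical, and the actual attention module in FIG. 4 may differ.

```python
import math


def attention_weights(domain_vec, encoded_vecs):
    """Softmax over dot products between the domain vector and each
    encoded behavior vector (the dot products play the matmul role)."""
    scores = [sum(d * x for d, x in zip(domain_vec, v)) for v in encoded_vecs]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_effective(domain_vec, encoded_vecs, min_weight):
    """Keep the encoded vectors whose attention weight reaches min_weight."""
    weights = attention_weights(domain_vec, encoded_vecs)
    return [v for v, w in zip(encoded_vecs, weights) if w >= min_weight]


domain_id_vec = [1.0, 0.0]
encoded = [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]
weights = attention_weights(domain_id_vec, encoded)
effective = select_effective(domain_id_vec, encoded, min_weight=0.5)
```

Here only the first vector, which points in the same direction as the domain ID vector, receives a weight above 0.5 and is kept.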

It should be noted that the obtained effective user behavior feature vector may be added to the Domain ID vector, so as to ensure that the effective user behavior feature vector may be effectively transferred downward.

In fig. 4, product fields 1, … each correspond to a DomainEncoder. A gray circle represents a feature vector of the user (e.g., a user information vector), a white circle represents an initial user behavior feature vector, and a black circle represents a feature vector of a product field (e.g., a Domain ID vector). The initial user behavior feature vectors of each product field, together with the Domain ID vector of that product field, pass through the DomainEncoder to obtain the effective user behavior feature vector, which is combined with the Domain ID vector so that it can be effectively transferred downward.

In some cases, for a given product field such as the target product field, the effective user behavior feature vectors of the other product fields may differ in their degree of correlation with the target product field. Where the correlation is high, the generated pseudo samples have considerable reference value; where it is low, they have little. Therefore, to obtain reasonable pseudo samples, when the server predicts and generates the sample data candidate set according to the effective user behavior feature vectors, it may extract, from the effective user behavior feature vectors of each product field, the target user behavior feature vectors whose correlation with the target product field satisfies a preset condition, thereby filtering out irrelevant information. The server then performs prediction on the target user behavior feature vectors to generate the sample data candidate set.

In some possible implementations, referring to fig. 4, the generation model further includes a transformer computing layer. The target user behavior feature vector may be determined as follows: according to the effective user behavior feature vectors of each product field, the transformer computing layer obtains the influence weight of the effective user behavior feature vectors of each product field on the target product field, where the influence weight reflects the degree of correlation between the effective user behavior feature vectors of each product field and the target product field. Owing to the characteristics of the transformer computing layer, it can obtain multiple groups of influence weights of the effective user behavior feature vectors of each product field on the target product field, that is, the multi-head vectors are retained (this process may be called user behavior encoding (UserBehaviourEncode)). In this way, the multi-domain information of the user is retained as completely as possible, and information transmission loss is reduced while the effective information in the cross-domain user behavior feature vectors is amplified.

Then, the target user behavior feature vector is extracted from the effective user behavior feature vectors of each product field according to the influence weight and the effective user behavior feature vector of the target product field. For example, the influence weight may be multiplied by the effective user behavior feature vector, so that the representation of the user's cross-domain feature information (namely, the effective user behavior feature vectors of each product field) most relevant to the target product field is extracted, irrelevant information is filtered out, and the user's cross-domain feature information is abstracted into the user vector space of the user in the target product field.

The equation for the transformation calculation of the transformer computing layer is as follows:

where $d_t$ denotes the domain identification vector of the target product field, and $f_t$ denotes the sum of the effective user behavior feature vectors in the target product field, which can be regarded as the key of the attention. The multi-head vector after the transformation calculation can be regarded as the query of the attention. Normalization by softmax yields $\alpha_i$ (which can be used as the influence weight corresponding to each product field), and the original multi-head vectors (namely, the effective user behavior feature vectors of each product field) are then weighted and summed to obtain the target user behavior feature vector, which expresses the user vector space.
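The softmax normalization and weighted sum described here can be sketched as below. This is an assumption-laden illustration: the score is taken to be a plain dot product between each field's vector (the query) and the key, and the way the target domain identification vector enters the score is left out because it cannot be read from the text. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def target_user_vector(domain_vectors, key):
    """Weight each field's vector by softmax(query . key) and sum the results."""
    alphas = softmax([dot(query, key) for query in domain_vectors])
    dim = len(key)
    pooled = [sum(alpha * vec[j] for alpha, vec in zip(alphas, domain_vectors))
              for j in range(dim)]
    return alphas, pooled

# Hypothetical toy data: one effective vector per product field, and a key for the target field.
domain_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
f_t = [1.0, 0.0]
alphas, pooled = target_user_vector(domain_vectors, f_t)
```

Fields whose vectors align with the key receive larger influence weights and therefore contribute more to the pooled target user behavior feature vector.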

It should be noted that, unlike the parameters of the DomainEncoder, the parameters of this part do not need to be pre-trained; they are updated during training on the user behavior data of the plurality of product fields.

In connection with the model structure shown in fig. 4, the generative model further includes a feature fusion layer (concat) and a plurality of fully connected (FC) layers, such as FC 1 and FC 2. The target user behavior feature vector passes through concat, FC 1 and FC 2 to obtain the sample data candidate set corresponding to the target product field. As shown in fig. 4, the target user behavior feature vector is combined with the feature vector of the user in the target field to obtain the sample data candidate set, where P1, P2, …, Pn respectively denote the sample data in the sample data candidate set of the target product field. Finally, the sample data candidate set corresponding to each product field is obtained.
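The concat, FC 1 and FC 2 path can be illustrated with a small forward pass. The sketch below is not the model of fig. 4: the ReLU activation, the layer sizes, and the hand-picked weights are all assumptions chosen only to make the data flow visible.

```python
def concat(*vectors):
    """Feature fusion: join several vectors end to end."""
    fused = []
    for vec in vectors:
        fused.extend(vec)
    return fused

def fully_connected(x, weights, bias):
    """One FC layer with ReLU (assumed); each row of weights yields one output unit."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

# Hypothetical toy inputs and parameters (not from the source).
target_behavior_vector = [1.0, 2.0]
user_vector = [3.0]
fused = concat(target_behavior_vector, user_vector)
hidden = fully_connected(fused, weights=[[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]], bias=[0.0, 0.0])
sample = fully_connected(hidden, weights=[[2.0, 1.0]], bias=[0.5])
```

In a trained model the weights would be learned, and the final layer would produce the representations from which the candidate samples are read out.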

The generation model used in the embodiment of the present application may be trained using the original training data set (see S302 in fig. 3); that is, the target user behavior feature vector obtained through the above steps may be used to train the generation model. In one possible implementation, because the behavior in a recommendation system is discrete user behavior data that cannot be expressed by continuous vectors, it has to be characterized by producing possible sample data. As a result, once training has converged, the generation model may produce sample data identical to the real samples. To avoid generating such invalid sample data, while still producing sample data as close as possible to the real sample data, a sample distribution loss function may be introduced when training the generation model, where the smaller the value of the sample distribution loss function is, the larger the distribution gap between the predicted user behavior feature vector and the real user behavior feature vector is.

Based on this, the training of the generation model includes the following steps: generating, through the generation model, a predicted user behavior feature vector according to the initial user behavior feature vector; constructing a sample distribution loss function according to the predicted user behavior feature vector and the collected real user behavior feature vector; and training the generation model according to the sample distribution loss function. Training the generation model with the sample distribution loss function ensures that the generated sample data is similar to, yet different from, the real sample data, which ensures the effectiveness of the generated pseudo samples.

In one possible implementation, the sample distribution loss function may be obtained by performing a maximum mean difference (MMD, also known as maximum mean discrepancy) calculation on the predicted user behavior feature vector and the real user behavior feature vector. That is, a maximum mean difference distance (MMD distance) is determined according to the predicted user behavior feature vector and the real user behavior feature vector and is taken as the sample distribution loss function, so that the generation model learns a suitable distance space and the effectiveness of the generated pseudo samples is ensured. Of course, besides the MMD calculation, the sample distribution loss function may also be determined by other means such as Euclidean distance calculation, which is not limited in this embodiment.

The calculation formula of the MMD distance is shown as formula (3):

where MMD denotes the MMD distance; $\mathbb{E}$ is the expectation operator; $f(X_u)$ denotes the real user behavior feature vector of user $u$, namely the real sample data; $f(Y_u)$ denotes the predicted user behavior feature vector of user $u$, namely the predicted sample data; $f(\cdot)$ is an operation selected from a set of candidate operations; and $U$ denotes the set of users.
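As a sketch only: assuming formula (3) follows the standard definition of the maximum mean discrepancy over a class of functions, the symbols listed above would fit an expression of the following shape. The symbol $\mathcal{F}$ for the function class and the use of a supremum are assumptions taken from the standard definition, not from the original formula.

```latex
\mathrm{MMD} = \sup_{f \in \mathcal{F}} \left( \mathbb{E}_{u \in U}\left[ f(X_u) \right] - \mathbb{E}_{u \in U}\left[ f(Y_u) \right] \right)
```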

The MMD distance is introduced into the final model loss as a distance loss term, and the influence of the MMD loss $L_M$ is controlled through a hyperparameter $\lambda_M$. The finally constructed loss function can be represented by equation (4):

$$L = L_G + \lambda_M L_M \tag{4}$$

where $L$ denotes the loss function; $L_G$ denotes a general loss function, such as the mean square error calculated from the predicted user behavior feature vector and the target user behavior feature vector; $L_M$ denotes the MMD distance loss function; and $\lambda_M$ denotes a constant that may be determined empirically.
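The combination in equation (4) can be sketched in a few lines. The code below is illustrative only: it assumes a mean square error for the general loss and a biased empirical estimate of the squared MMD with an RBF kernel for the MMD term. The kernel choice, the `gamma` parameter, and all function names are assumptions that the text does not specify.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2); the kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    return sum(rbf_kernel(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))

def mmd_squared(real, predicted, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two sample sets."""
    return (mean_kernel(real, real, gamma)
            + mean_kernel(predicted, predicted, gamma)
            - 2.0 * mean_kernel(real, predicted, gamma))

def mse(predicted, target):
    """Mean square error over all vector components."""
    n = sum(len(p) for p in predicted)
    return sum((a - b) ** 2 for p, t in zip(predicted, target) for a, b in zip(p, t)) / n

def total_loss(predicted, target, real, lambda_m):
    """L = L_G + lambda_M * L_M, with L_G as MSE and L_M as the MMD estimate."""
    return mse(predicted, target) + lambda_m * mmd_squared(real, predicted)

# Hypothetical toy data: real vectors and predictions shifted by 0.5 in each component.
real = [[0.0, 0.0], [1.0, 1.0]]
shifted = [[0.5, 0.5], [1.5, 1.5]]
loss = total_loss(shifted, real, real, lambda_m=0.5)
```

Setting `lambda_m` to 0 reduces the total to the general loss alone, which is one simple way to check how much the MMD term contributes for a given batch.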

The trained generative model is saved (see S303 in fig. 3), e.g., the trained generative model may be saved in a database to generate pseudo-samples using the generative model.

S204, mixing the sample data in the sample data candidate set with the user behavior data in the original training data set to obtain a target training data set.

After the foregoing steps, the generation model has been trained offline, and a sample data candidate set is generated for each product field using the generation model (see S304 in fig. 3). The sample data is then mixed with the actual sample data of each product field, such as the user behavior data in the original training data set (see S305 in fig. 3), to obtain the target training data set. This supplements the number of samples of product fields with small data volumes and balances the sample proportions of the different product fields.
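The mixing step can be sketched as follows. This is a simplified illustration: the dictionary layout keyed by product field, the `"real"` and `"pseudo"` tags, and the shuffling are assumptions made for the example rather than requirements of the method.

```python
import random

def mix_samples(original, candidates, rng):
    """Combine real and generated samples per product field and shuffle them."""
    target = {}
    for domain, real_samples in original.items():
        mixed = [(s, "real") for s in real_samples]
        mixed += [(s, "pseudo") for s in candidates.get(domain, [])]
        rng.shuffle(mixed)  # avoid real and pseudo samples forming separate blocks
        target[domain] = mixed
    return target

# Hypothetical toy data: the "reading" field is small and receives two pseudo samples.
original = {"reading": ["r1", "r2"], "video": ["v1", "v2", "v3", "v4"]}
candidates = {"reading": ["p1", "p2"]}
target_set = mix_samples(original, candidates, random.Random(0))
```

After mixing, both fields in this toy example hold four samples each, which illustrates how the pseudo samples balance the proportions across fields.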

The obtained target training data set is provided to the online multi-domain recommendation model; that is, the recommendation model of the target product field is trained according to the target training data set, and recommendation information is returned to the terminal device through the recommendation model. Because the recommendation model of the target product field is trained on the target training data set, and the target training data set is generated based on user behavior data of multiple product fields, the recommendation model can recommend cross-field content more accurately. This improves the training effect of the recommendation model and, in turn, the recommendation effect in small-sample product fields.

Taking information recommendation in the target product field as an example, where the target product field is the "watching" feature of a certain APP or a reading APP, the recommendation interfaces may be as shown in fig. 5a and fig. 5b, respectively. The recommendation interfaces display information recommended to the user, for example, "entrepreneur: creating a civil sink brand×". If the recommendation model corresponding to the target product field is trained on the target training data set obtained in S201-S204, and the target training data set is determined based on the user behavior data of multiple product fields (such as an official account platform and a video platform), then the articles and videos recorded on the official account platform and the video platform can be browsed in the "watching" feature of the APP or in the reading APP.

After the online multi-domain recommendation model provides recommendation information to the user, the user may browse, click, and so on according to the recommendation information. The terminal device may return the user's clicks on the recommendation information to the server, so that, when the original training data set is constructed, the user behavior data of the plurality of product fields is obtained according to these clicks. Pseudo samples can thus be generated by incorporating the more accurate user behavior data generated online in real time. Such pseudo samples reflect the user's interests more accurately, which further improves the recommendation effect.

In addition, the method provided by the embodiment of the application can improve the cold start effect of users in certain product fields.

Next, a description will be given of a sample data generating method for information recommendation provided in the embodiment of the present application in connection with an actual application scenario. The application scenario may be when the user browses the reading APP, the reading APP recommends information to the user according to the age, sex, and historical user behavior data of the user. In order to achieve cross-domain recommendation and meet the needs of users, the embodiment of the application provides a cross-domain information recommendation method, and referring to fig. 6, the method comprises an offline process and an online service process, wherein the offline process is mainly used for generating sample data for information recommendation, the online service process is mainly used for training a recommendation model by using the generated sample data, and the recommendation model is used for recommending information to users. The method comprises the following steps:

S601, summarizing online user behavior data of users in various product fields by using a multi-product field user behavior processing module, and constructing an original training data set.

S602, inputting the original training data set into the FGM model, and training the FGM model.

S603, storing the FGM model.

S604, generating a sample data candidate set of each product field by using the trained FGM model.

S605, providing the sample data candidate set to the online recommendation model to complete training of the recommendation model.

S606, the user opens the reading APP on the terminal equipment.

S607, the terminal equipment acquires recommendation information determined by the server through a recommendation model.

S608, the terminal equipment displays the recommendation information to the user.

Wherein S601-S604 are offline processes, and S605-S608 are online service processes.

Based on the foregoing embodiment corresponding to fig. 2, an embodiment of the present application further provides a sample data generating device 700 for information recommendation, referring to fig. 7, where the device 700 includes an obtaining unit 701, a determining unit 702, a generating unit 703, and a mixing unit 704:

the obtaining unit 701 is configured to obtain user behavior data of a plurality of product fields, and construct an original training data set;

the determining unit 702 is configured to perform vectorization processing according to the original training data set to obtain an initial user behavior feature vector of each product field in the plurality of product fields;

the generating unit 703 is configured to take each product field to be expanded among the plurality of product fields as a target product field, and to generate a sample data candidate set corresponding to the target product field according to the initial user behavior feature vector of each product field;

The mixing unit 704 is configured to mix the sample data in the sample data candidate set with the user behavior data in the original training data set to obtain a target training data set, where the target training data set is used to train a recommendation model in the target product field.

In a possible implementation manner, the generating unit 703 is configured to:

and inputting the initial user behavior feature vector of each product field into a generation model, and predicting and generating the sample data candidate set according to the initial user behavior feature vector of each product field through the generation model.

In a possible implementation manner, the device further includes a training unit:

the training unit is used for generating a predicted user behavior feature vector according to the target user behavior feature vector through the generation model; constructing a sample distribution loss function according to the predicted user behavior feature vector and the collected real user behavior feature vector; the smaller the value of the sample distribution loss function is, the larger the distribution gap between the predicted user behavior feature vector and the real user behavior feature vector is represented; and training the generation model according to the distribution loss function.

In a possible implementation manner, the training unit is configured to:

determining a maximum mean difference distance according to the predicted user behavior feature vector and the real user behavior feature vector;

and taking the maximum mean difference distance as the sample distribution loss function.

In a possible implementation manner, the generating unit 703 is configured to:

extracting effective user behavior feature vectors of each product field from the initial user behavior feature vectors;

and predicting according to the effective user behavior feature vectors of each product field to generate the sample data candidate set.

In a possible implementation, the generating model includes a domain encoder, and the generating unit 703 is configured to:

for each product field, carrying out random removal processing on the initial user behavior feature vector through the domain encoder to obtain an encoded user behavior feature vector;

and determining the effective user behavior feature vector according to the encoded user behavior feature vector.

In a possible implementation manner, the generating unit 703 is configured to:

determining the weight of the coded user behavior feature vector according to the coded user behavior feature vector and the domain identification vector of the product domain;

And determining the effective user behavior feature vector according to the weight of the coded user behavior feature vector.

In a possible implementation manner, the generating unit 703 is configured to:

extracting target user behavior feature vectors with the correlation degree meeting preset conditions with the target product field from the effective user behavior feature vectors of each product field;

and predicting the target user behavior feature vector to generate the sample data candidate set.

In a possible implementation manner, the generation model includes a transformer computing layer, and the generating unit 703 is configured to:

according to the effective user behavior feature vectors of each product field, the influence weight of the effective user behavior feature vectors of each product field on the target product field is obtained through the transformer computing layer, and the influence weight is used for reflecting the correlation degree between the effective user behavior feature vectors of each product field and the target product field;

and extracting the target user behavior feature vector from the effective user behavior feature vector of each product field according to the influence weight and the effective user behavior feature vector of the target product field.

In one possible implementation manner, the product domain to be expanded is a product domain in which the number of user behavior data in the plurality of product domains is less than a preset threshold.
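Selecting the product domains to be expanded by a preset threshold can be sketched as below. The function name, the domain names, the counts, and the threshold value are all hypothetical and serve only to illustrate the comparison.

```python
def domains_to_expand(behavior_counts, threshold):
    """Return the product domains whose number of user behavior records is below the threshold."""
    return sorted(d for d, n in behavior_counts.items() if n < threshold)

# Hypothetical counts of user behavior data per product domain.
behavior_counts = {"video": 100000, "reading": 800, "official_account": 50000}
targets = domains_to_expand(behavior_counts, threshold=1000)
```

Each domain returned this way would then be treated in turn as a target product domain for sample generation.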

In a possible implementation manner, the training unit is further configured to:

training a recommendation model of the target product field according to the target training data set;

the apparatus further comprises a return unit:

and the return unit is used for returning the recommendation information to the terminal equipment through the recommendation model.

In a possible implementation manner, the obtaining unit 701 is further configured to:

acquiring the clicking condition of the user on the recommended information;

and acquiring user behavior data of a plurality of product fields according to the clicking conditions, and constructing an original training data set.

The embodiment of the application also provides sample data generating equipment for information recommendation, which is used for executing the sample data generating method for information recommendation. The apparatus is described below with reference to the accompanying drawings. Referring to fig. 8, the device may be a terminal device, taking the terminal device as a smart phone as an example:

Fig. 8 is a block diagram showing part of the structure of a smart phone serving as the terminal device provided by an embodiment of the present application. Referring to fig. 8, the smart phone includes: a radio frequency (RF) circuit 810, a memory 820, an input unit 830, a display unit 840, a sensor 850, an audio circuit 860, a wireless fidelity (WiFi) module 870, a processor 880, and a power supply 890. Those skilled in the art will appreciate that the smart phone structure shown in fig. 8 does not limit the smart phone, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.

The memory 820 may be used to store software programs and modules, and the processor 880 performs various functional applications and data processing of the smart phone by running the software programs and modules stored in the memory 820. The memory 820 may mainly include a storage program area that may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and a storage data area; the storage data area may store data (such as audio data, phonebooks, etc.) created according to the use of the smart phone, etc. In addition, memory 820 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

The processor 880 is a control center of the smart phone, connects various parts of the entire smart phone using various interfaces and lines, performs various functions of the smart phone and processes data by running or executing software programs and/or modules stored in the memory 820, and calling data stored in the memory 820. In the alternative, processor 880 may include one or more processing units; preferably, the processor 880 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 880.

In this embodiment, the processor 880 in the terminal device (e.g., the smart phone described above) may perform the following steps;

The sample data generating device for information recommendation provided in the embodiment of the present application may also be a server, as shown in fig. 9, fig. 9 is a block diagram of a server 900 provided in the embodiment of the present application, where the server 900 may have relatively large differences due to different configurations or performances, and may include one or more central processing units (Central Processing Units, abbreviated as CPU) 922 (e.g. one or more processors) and a memory 932, and one or more storage media 930 (e.g. one or more mass storage devices) storing application programs 942 or data 944. Wherein the memory 932 and the storage medium 930 may be transitory or persistent. The program stored in the storage medium 930 may include one or more modules (not shown), each of which may include a series of instruction operations on a server. Still further, the central processor 922 may be arranged to communicate with a storage medium 930 to execute a series of instruction operations in the storage medium 930 on the server 900.

The server 900 may also include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input/output interfaces 958, and/or one or more operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

In this embodiment, the central processor 922 in the server may perform the following steps:

According to an aspect of the present application, there is provided a computer-readable storage medium for storing a program code for executing the sample data generating method for information recommendation according to the foregoing embodiments.

According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods provided in the various alternative implementations of the above embodiments.

The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in part or all of the technical solution or in part in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.

The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

1. A sample data generation method for information recommendation, the method comprising:

mixing the sample data in the sample data candidate set with the user behavior data in the original training data set to obtain a target training data set, wherein the target training data set is used for training a recommendation model in the field of the target product;

according to the initial user behavior feature vector of each product field, generating a sample data candidate set corresponding to the target product field comprises the following steps:

inputting the initial user behavior feature vector of each product field into a generation model, and extracting the effective user behavior feature vector of each product field from the initial user behavior feature vector;

2. The method according to claim 1, wherein the training manner of the generation model comprises the following steps:

generating a predicted user behavior feature vector according to the initial user behavior feature vector through the generation model;

constructing a sample distribution loss function according to the predicted user behavior feature vector and the collected real user behavior feature vector; the smaller the value of the sample distribution loss function is, the larger the distribution gap between the predicted user behavior feature vector and the real user behavior feature vector is represented;

and training the generation model according to the distribution loss function.

3. The method of claim 2, wherein said constructing a sample distribution loss function from said predicted user behavior feature vector and an acquired real user behavior feature vector comprises:

4. The method of claim 1, wherein the generation model includes a domain encoder, and the extracting the effective user behavior feature vector of each product field from the initial user behavior feature vector comprises:

for each product field, performing random removal processing on the initial user behavior feature vector through the domain encoder to obtain an encoded user behavior feature vector;

5. The method of claim 4, wherein determining the effective user behavior feature vector from the encoded user behavior feature vector comprises:

6. The method of claim 1, wherein the generation model includes a transformer computing layer, and the extracting, from the effective user behavior feature vectors of each product field, a target user behavior feature vector whose correlation degree with the target product field satisfies a preset condition comprises:

according to the effective user behavior feature vector of each product field, the influence weight of the effective user behavior feature vector of each product field on the target product field is obtained through the transformer computing layer, and the influence weight is used for reflecting the correlation degree between the effective user behavior feature vector of each product field and the target product field;

and extracting the target user behavior feature vector from the effective user behavior feature vector of each product field according to the influence weight and the effective user behavior feature vector of the target product field.
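
By way of non-limiting illustration only, the influence weights and the extraction step of claim 6 can be sketched as follows. The sketch assumes that the influence weight is a softmax over scaled dot products between the target field's vector and each field's vector, and that the "preset condition" is a minimum weight; neither assumption is stated in this passage, and the names `influence_weights`, `extract_target_vector` and `min_weight` are hypothetical.

```python
import math
from typing import Dict, List


def influence_weights(domain_vectors: Dict[str, List[float]],
                      target: str) -> Dict[str, float]:
    """Softmax of scaled dot products, with the target vector as the query."""
    query = domain_vectors[target]
    scale = math.sqrt(len(query))
    scores = {name: sum(q * k for q, k in zip(query, vec)) / scale
              for name, vec in domain_vectors.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def extract_target_vector(domain_vectors: Dict[str, List[float]], target: str,
                          min_weight: float = 0.0) -> List[float]:
    """Weighted sum over the fields whose influence weight is at least min_weight."""
    weights = influence_weights(domain_vectors, target)
    kept = {name: w for name, w in weights.items() if w >= min_weight}
    if not kept:
        raise ValueError("no product field satisfies the assumed condition")
    norm = sum(kept.values())
    dim = len(domain_vectors[target])
    return [sum(kept[name] / norm * domain_vectors[name][i] for name in kept)
            for i in range(dim)]
```

Fields whose vectors point in a direction similar to the target field's vector receive larger weights, so they contribute more to the extracted vector, while weakly related fields can be filtered out by raising `min_weight`.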

7. The method of any of claims 1-6, wherein the product domain to be augmented is a product domain, among the plurality of product domains, whose quantity of user behavior data is less than a preset threshold.
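
By way of non-limiting illustration only, the selection in claim 7 can be sketched as a simple filter over per-field record counts. The field names and the threshold value used in the usage note are hypothetical.

```python
from typing import Dict, List


def domains_to_augment(record_counts: Dict[str, int], threshold: int) -> List[str]:
    """Return the product fields whose user behavior record count is below threshold."""
    return sorted(name for name, count in record_counts.items() if count < threshold)
```

For example, with counts of 100000, 50000 and 800 for three hypothetical fields and a threshold of 1000, only the field with 800 records is selected for augmentation.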

8. The method according to any one of claims 1-6, further comprising:

and returning recommendation information to the terminal equipment through the recommendation model.

9. The method of claim 8, wherein the method further comprises:

acquiring the user's click status for the recommendation information;

the acquiring user behavior data of a plurality of product fields and constructing an original training data set comprises:

10. A sample data generating device for information recommendation, characterized in that the device comprises an acquisition unit, a determination unit, a generation unit and a mixing unit:

the mixing unit is configured to mix the sample data in the sample data candidate set with the user behavior data in the original training data set to obtain a target training data set, wherein the target training data set is used for training a recommendation model for the target product field;

the generation unit is specifically configured to:

11. The apparatus of claim 10, further comprising a training unit;

the training unit is configured to: generate, through the generation model, a predicted user behavior feature vector according to the initial user behavior feature vector; construct a sample distribution loss function according to the predicted user behavior feature vector and a collected real user behavior feature vector, wherein a smaller value of the sample distribution loss function indicates a smaller distribution gap between the predicted user behavior feature vector and the real user behavior feature vector; and train the generation model according to the sample distribution loss function.

12. The apparatus of claim 11, wherein the training unit is configured to:

13. The apparatus of claim 10, wherein the generation model comprises a domain encoder, and the generation unit is configured to:

14. The apparatus of claim 13, wherein the generation unit is configured to:

15. The apparatus of claim 10, wherein the generation model comprises a deformer calculation layer, and the generation unit is configured to:

16. The apparatus of any one of claims 10-15, wherein the product domain to be augmented is a product domain, among the plurality of product domains, whose quantity of user behavior data is less than a preset threshold.

17. The apparatus according to any one of claims 10-15, wherein the apparatus further comprises:

the training unit is configured to train a recommendation model for the target product field according to the target training data set;

18. The apparatus of claim 17, wherein the acquisition unit is further configured to:

acquire the user's click status for the recommendation information;

19. A sample data generating device for information recommendation, the device comprising a processor and a memory:

the processor is configured to perform the method of any of claims 1-9 according to instructions in the program code.

20. A computer readable storage medium, characterized in that the computer readable storage medium stores program code for performing the method of any one of claims 1-9.