CN114298118A - Data processing method based on deep learning, related equipment and storage medium

Info

Publication number: CN114298118A
Application number: CN202011041771.7A
Authority: CN
Inventors: 赵猛
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-09-28
Filing date: 2020-09-28
Publication date: 2022-04-08
Anticipated expiration: 2040-09-28
Also published as: CN114298118B

Abstract

An embodiment of the invention provides a data processing method based on deep learning, related equipment, and a storage medium. The method comprises: acquiring a feature data set of a target user, wherein the feature data set comprises a plurality of user feature vectors describing features of the target user; acquiring, from a plurality of virtual class vectors, the virtual class vector with the highest similarity to a first label feature vector, wherein the first label feature vector is the feature vector of any one of a plurality of predefined classification labels; generating a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors; and determining the matching degree between the target user and the classification label corresponding to the first label feature vector according to the fusion feature vector and the first label feature vector. Indirect interaction between user features and label features can thus be achieved based on the virtual class vectors, which improves the accuracy of label matching through deep learning and helps mine user interests fully and accurately.

Description

Data processing method based on deep learning, related equipment and storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a data processing method based on deep learning, a related device, and a storage medium.

Background

The double-tower model structure is a deep learning network structure that is widely applied in the field of deep matching. When mining user interests, a double-tower model structure generally matches a user with labels. The structure comprises a user tower network and a label tower network, which are generally decoupled: embedding is performed within each network to obtain its own feature vectors, that is, the user feature vector is obtained in the user tower network and the label feature vectors are obtained in the label tower network. The user feature vector and the label feature vectors interact computationally only in the last layer of the double-tower model structure, which yields the matching degree between the user and each label. It can be seen that the user feature vector generated by the current double-tower model structure carries only a single kind of information and lacks interaction with other information that would enrich the user feature information, which easily results in low accuracy when predicting the matching degree between the user and each label.

Disclosure of Invention

The embodiment of the invention provides a data processing method based on deep learning, related equipment and a storage medium, which can realize indirect interaction between user characteristics and label characteristics based on virtual class vectors, thereby improving the accuracy of label matching and being beneficial to fully and accurately mining user interests.

In one aspect, an embodiment of the present invention provides a data processing method based on deep learning, where the method includes:

acquiring a feature data set of a target user, wherein the feature data set comprises a plurality of user feature vectors for describing features of the target user;

acquiring a virtual class vector with the highest similarity with a first label feature vector from a plurality of virtual class vectors, wherein the first label feature vector is a feature vector of any one of a plurality of predefined classification labels, and the plurality of virtual class vectors are generated according to the number and the dimensionality of the plurality of user feature vectors;

generating a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors;

and determining the matching degree between the target user and the classification label corresponding to the first label feature vector according to the fusion feature vector and the first label feature vector.

In another aspect, an embodiment of the present invention provides a data processing apparatus, where the apparatus includes:

an obtaining module, configured to obtain a feature data set of a target user, where the feature data set includes a plurality of user feature vectors used for describing features of the target user;

the obtaining module is further configured to obtain a virtual class vector with the highest similarity to a first label feature vector from a plurality of virtual class vectors, where the first label feature vector is a feature vector of any one of a plurality of predefined classification labels, and the plurality of virtual class vectors are generated according to the number and dimensionality of the plurality of user feature vectors;

a generating module, configured to generate a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors;

and the determining module is used for determining the matching degree between the target user and the classification label corresponding to the first label feature vector according to the fusion feature vector and the first label feature vector.

In still another aspect, an embodiment of the present invention provides a server, where the server includes a processor, a network interface, and a storage device, where the processor, the network interface, and the storage device are connected to each other, where the network interface is controlled by the processor to send and receive data, and the storage device is used to store a computer program, where the computer program includes program instructions, and the processor is configured to call the program instructions to execute the above deep learning-based data processing method.

In still another aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored, where the computer program includes program instructions, where the program instructions are executed by a processor to execute the above deep learning-based data processing method.

In yet another aspect, an embodiment of the invention discloses a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to execute the above deep learning-based data processing method.

In the embodiment of the invention, the server can obtain a feature data set of a target user, where the feature data set comprises a plurality of user feature vectors describing features of the target user; obtain, from a plurality of virtual class vectors, the virtual class vector with the highest similarity to a first label feature vector, where the first label feature vector is the feature vector of any one of a plurality of predefined classification labels; generate a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors; and determine the matching degree between the target user and the classification label corresponding to the first label feature vector according to the fusion feature vector and the first label feature vector. Indirect interaction between the user features and the label features can thus be realized based on the virtual class vectors, which improves the accuracy of label matching and facilitates fully and accurately mining user interests.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of a data processing method based on deep learning according to an embodiment of the present invention;

FIG. 2 is a schematic flowchart of another deep learning-based data processing method according to an embodiment of the present invention;

FIG. 3a is a schematic diagram of a training process of a label matching model according to an embodiment of the present invention;

FIG. 3b is a schematic diagram of a usage flow of a label matching model according to an embodiment of the present invention;

FIG. 3c is a schematic diagram of an overall process of matching a user with a label according to an embodiment of the present invention;

FIG. 3d is a schematic diagram of a network structure of a label matching model according to an embodiment of the present invention;

FIG. 3e is a schematic diagram of a network structure of a user tower in a label matching model according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a server according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Artificial intelligence (AI) is a theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision making.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

Machine learning (ML) is a multi-disciplinary field that involves probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specifically studies how a computer simulates or implements human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

Cloud computing, in the narrow sense, refers to a delivery and use mode of IT infrastructure, in which required resources are obtained through a network in an on-demand and easily extensible manner; in the broad sense, it refers to a delivery and use mode of services, in which required services are obtained through a network in an on-demand and easily extensible manner. Such services may be IT and software services, internet-related services, or other services. Cloud computing is a product of the development and fusion of traditional computer and network technologies such as grid computing, distributed computing, parallel computing, utility computing, network storage technologies, virtualization, and load balancing. With the diversification of the internet, real-time data streams, and connected devices, and driven by the demands of search services, social networks, mobile commerce, open collaboration, and the like, cloud computing has developed rapidly. Unlike earlier parallel distributed computing, the emergence of cloud computing is expected to drive, at the conceptual level, revolutionary changes in the whole internet model and in enterprise management models.

Big data refers to a data set that cannot be captured, managed, and processed by conventional software tools within a certain time range. It is a massive, fast-growing, and diversified information asset that can provide stronger decision-making power, insight discovery, and process optimization capability only through new processing modes. With the advent of the cloud era, big data has attracted more and more attention, and special techniques are needed to process large amounts of data effectively within a tolerable elapsed time. Technologies suited to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the internet, and scalable storage systems.

The scheme provided by the embodiment of the application relates to technologies such as machine learning of artificial intelligence and big data of cloud computing, and is specifically explained by the following embodiments:

The conventional double-tower model structure lacks information interaction between the user feature vector and the label feature vector when these vectors are obtained, which easily leads to low accuracy when predicting the matching degree between the user and each label. The embodiments of the invention are directed at this problem.

The data processing method based on deep learning provided by the embodiment of the invention is applicable to scenarios in which user interests are mined, and can accurately mine the interest characteristics of each user by obtaining the classification labels that match each user. The double-tower model structure involved may comprise two sub-networks, a user tower and a label tower: the user tower sub-network is used for generating the fusion feature vector of a user, and the label tower sub-network is used for generating the label feature vectors. In the user tower network, indirect interaction between user features and label features can be realized by means of the virtual class vectors, so that the resulting fusion feature vector can learn part of the relevant information in the labels. The accuracy of predicting the matching degree between the user and each label can therefore be improved while the user network and the label network remain decoupled, and prediction remains efficient.

The double-tower model structure may be a Deep Structured Semantic Model (DSSM), or a variation model of DSSM, such as a convolutional neural network CNN-DSSM, a long-short term memory network LSTM-DSSM, or the like, which is not limited in the present invention.

Referring to FIG. 1, which is a schematic flowchart of a data processing method based on deep learning according to an embodiment of the present invention, the method includes the following steps:

101. Acquire a feature data set of a target user, where the feature data set comprises a plurality of user feature vectors describing features of the target user.

The target user may be any user. The features of a user can be described from multiple dimensions, including basic attribute features, historical behavior features, statistical features, and the like. The basic attribute features may include age, gender, education level, region, and the like. The historical behavior features may include recommended content (such as recommended advertisements) that the user clicked or viewed within a period of time. The statistical features may refer to statistics of the historical behavior features, such as the number of times, the time, and the place at which the user clicked or viewed recommended content within a period of time.

Specifically, the server may obtain feature data of the target user, where the feature data includes one or more of basic attribute features, historical behavior features, and statistical features, and construct a plurality of user feature vectors of the target user according to the feature data, for example, may perform word vector embedding (embedding) on the feature data to obtain a plurality of user feature vectors, thereby obtaining a feature data set of the target user.

102. Acquire, from a plurality of virtual class vectors, the virtual class vector with the highest similarity to a first label feature vector, where the first label feature vector is the feature vector of any one of a plurality of predefined classification labels.

The plurality of virtual class vectors may be a plurality of initialization vectors generated according to the number and the dimensions of the plurality of user feature vectors. The plurality of classification tags may be defined according to actual needs, for example, the classification tags may refer to the classification of goods, and the plurality of classification tags may include cosmetics, electronic products, clothing, automobiles, home appliances, snacks, drinks, and the like; as another example, the category labels may refer to categories of interests, and the plurality of category labels may include sports, science and technology, health, finance, military, fashion, and the like. Word vector embedding (embedding) processing may be performed on each classification label to obtain a feature vector (denoted as a label feature vector) of each classification label.

103. Generate a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors.

Specifically, for any one classification label, the corresponding label feature vector is denoted as the first label feature vector. When predicting the matching degree between the user and that classification label, in order to construct a feature vector that sufficiently represents the user feature information, the server may obtain, from the plurality of virtual class vectors, the virtual class vector with the highest similarity to the first label feature vector, and may generate the fusion feature vector of the target user by using that virtual class vector and the plurality of user feature vectors. For example, the similarity between each virtual class vector and the first label feature vector can be calculated through a dot product operation, and the virtual class vector with the highest similarity (that is, the largest dot product value) is then selected from the plurality of virtual class vectors. The classification label corresponding to the first label feature vector can thus use the virtual class vector with the highest similarity as its representative vector and exchange information with the user feature vectors through that representative vector. This realizes indirect interaction between the user and the label, while the user network structure and the label network structure in the model remain decoupled and direct interaction is avoided.
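As an illustrative sketch only (not part of the original disclosure), the dot-product selection described above can be written as follows. The function names and the toy vectors are hypothetical; positions here are 0-based Python list positions.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def most_similar_index(virtual_class_vectors, label_vector):
    """Return the position of the virtual class vector whose dot product
    with the label feature vector is the largest."""
    scores = [dot(vc, label_vector) for vc in virtual_class_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy values, chosen only for illustration (hypothetical).
virtual_class_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5],
]
first_label_vector = [0.1, 0.9, 0.2]

# Position of the representative virtual class vector for this label.
best = most_similar_index(virtual_class_vectors, first_label_vector)
```

Because only the virtual class vectors are compared with the label feature vector, the user feature vectors never touch the label feature vector directly at this stage.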

104. Determine the matching degree between the target user and the classification label corresponding to the first label feature vector according to the fusion feature vector and the first label feature vector.

Specifically, the server may input the fusion feature vector and the first label feature vector into a matching layer of the label matching model for data processing, for example dot product processing, to obtain an output result of the matching layer, and may determine the matching degree between the target user and the classification label corresponding to the first label feature vector according to the output result. The matching degree may be a score value, which may range from 0 to 100 or may be normalized to the range 0 to 1. The label matching model may adopt the double-tower model structure described above.
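A minimal sketch of such a matching layer is given below. The dot product follows the text; the use of a sigmoid to normalize the score to the range 0 to 1 is an assumption, since the text does not state how the normalization is performed. All names and values are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matching_degree(fused_user_vector, label_vector):
    """Dot-product matching layer followed by a sigmoid.
    The sigmoid is an assumed normalization, not taken from the source."""
    raw = dot(fused_user_vector, label_vector)
    return 1.0 / (1.0 + math.exp(-raw))


# Toy vectors (hypothetical).
score = matching_degree([0.2, -0.1, 0.4], [0.5, 0.3, 0.1])

# The same score expressed on a 0 to 100 scale.
score_0_100 = 100.0 * score
```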

In some possible embodiments, the label matching model further includes a representation layer. The representation layer includes a user tower network and a label tower network; the user tower network is used for generating the fusion feature vector of the user, and the label tower network is used for generating the label feature vectors.

In the embodiment of the invention, the server can obtain a feature data set of a target user, where the feature data set comprises a plurality of user feature vectors describing features of the target user. The server then obtains, from the plurality of virtual class vectors, the virtual class vector with the highest similarity to a first label feature vector, where the first label feature vector is the feature vector of any one of a plurality of predefined classification labels, and generates a fusion feature vector of the target user by using that virtual class vector and the plurality of user feature vectors. The matching degree between the target user and the classification label corresponding to the first label feature vector is then determined according to the fusion feature vector and the first label feature vector. In this way, indirect interaction between the user features and the label features is realized based on the virtual class vectors while the user network and the label network remain decoupled, so that the user feature vector can learn part of the label information. This improves the accuracy of predicting the matching degree between the user and each label, keeps prediction efficient, and helps mine user interests fully and accurately.

Referring to FIG. 2, which is a schematic flowchart of another data processing method based on deep learning according to an embodiment of the present invention, the method includes the following steps:

201. Acquire a feature data set of a target user, where the feature data set comprises a plurality of user feature vectors describing features of the target user.

202. Acquire, from a plurality of virtual class vectors, the virtual class vector with the highest similarity to a first label feature vector, where the first label feature vector is the feature vector of any one of a plurality of predefined classification labels.

The specific implementation manner of steps 201 to 202 can refer to the related description of steps 101 to 102 in the foregoing embodiments, and is not described herein again.

203. Determine, from a plurality of candidate feature vectors, a target candidate feature vector corresponding to the virtual class vector with the highest similarity, where each of the plurality of candidate feature vectors is generated according to one of the plurality of virtual class vectors and the plurality of user feature vectors.

Specifically, the server may obtain the classification label corresponding to the first label feature vector and query the index-label mapping relationship table for the target index corresponding to that classification label. The index-label mapping relationship table includes each classification label and the index of the virtual class vector with the highest similarity to it. The server then takes the virtual class vector corresponding to the target index among the plurality of virtual class vectors as the virtual class vector with the highest similarity to the first label feature vector. Querying the index-label mapping relationship table thus makes it possible to quickly determine the virtual class vector most similar to each label feature vector. For example, suppose there are 100 classification labels and 5 virtual class vectors, denoted vc_1, vc_2, vc_3, vc_4, and vc_5, whose indexes are 1, 2, 3, 4, and 5, respectively. The index-label mapping relationship table then includes each of the 100 classification labels together with the index of the virtual class vector with the highest similarity to it. Assuming the target index is 3, the virtual class vector vc_3 is the virtual class vector with the highest similarity to the first label feature vector.
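The table lookup in this example can be sketched as follows. The dictionary layout, the label names, and the vector values are hypothetical; only the 1-based indexes and the target index 3 come from the example above, and only three of the 100 labels are shown.

```python
# Five virtual class vectors keyed by their indexes 1..5 (toy 2-dimensional values).
virtual_class_vectors = {
    1: [0.1, 0.2],
    2: [0.3, 0.1],
    3: [0.9, 0.4],
    4: [0.2, 0.7],
    5: [0.5, 0.5],
}

# Index-label mapping relationship table: classification label -> index of the
# virtual class vector with the highest similarity to that label.
index_label_table = {"sports": 3, "finance": 1, "fashion": 5}


def lookup_virtual_class_vector(label, table, vectors):
    """Return the target index for a label and the virtual class vector it names."""
    target_index = table[label]
    return target_index, vectors[target_index]


target_index, most_similar_vector = lookup_virtual_class_vector(
    "sports", index_label_table, virtual_class_vectors
)
```

A table lookup of this kind replaces the per-label similarity computation at prediction time, which is one way to read the statement that the most similar virtual class vector can be determined quickly.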

In some feasible embodiments, the server may process one virtual class vector and the plurality of user feature vectors by using an attention mechanism to obtain one candidate feature vector. In the attention computation, the virtual class vector serves as the query, and the plurality of user feature vectors serve as the keys and values. If there are n virtual class vectors, n candidate feature vectors can be obtained with the attention mechanism.

Taking the virtual class vector Vc_i and the user feature vectors (U_0, U_1, ..., U_m) as an example, the attention computed for the virtual class vector over the plurality of user feature vectors can be represented by the following formula:

[formula not reproduced in the source text]

where Vc_Emb_i is the candidate feature vector corresponding to the virtual class vector Vc_i. In this way, n candidate feature vectors corresponding to the n virtual class vectors are generated.
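The formula itself is not available in this text, so the following sketch assumes a standard softmax dot-product attention in which the virtual class vector is the query and the user feature vectors are both keys and values. The exact form used in the disclosure may differ; all function names are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, user_vectors):
    """Weighted sum of the user feature vectors (values), with weights given by
    softmax over query-key dot products. Assumed form, not from the source."""
    weights = softmax([dot(query, u) for u in user_vectors])
    dim = len(query)
    return [sum(w * u[d] for w, u in zip(weights, user_vectors)) for d in range(dim)]


def candidate_feature_vectors(virtual_class_vectors, user_vectors):
    """One candidate feature vector per virtual class vector (n in, n out)."""
    return [attend(vc, user_vectors) for vc in virtual_class_vectors]


# Toy values (hypothetical): two user feature vectors, three virtual class vectors.
user_vectors = [[1.0, 0.0], [0.0, 1.0]]
virtual_class_vectors = [[10.0, 0.0], [0.0, 10.0], [0.0, 0.0]]
candidates = candidate_feature_vectors(virtual_class_vectors, user_vectors)
```

In this toy setup, a virtual class vector aligned with one user feature vector yields a candidate dominated by that feature, while the zero query weights all user feature vectors equally.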

In some possible embodiments, the server may first find the virtual class vector with the highest similarity and then perform attention only on that virtual class vector and the plurality of user feature vectors. That is, the server processes the virtual class vector with the highest similarity and the plurality of user feature vectors by using an attention network to obtain the target candidate feature vector, and then generates the fusion feature vector of the target user by using the target candidate feature vector and the plurality of user feature vectors.

In some possible embodiments, the server may construct the index-label mapping relationship table by calculating similarities. A specific implementation may include: for the label feature vector corresponding to each of the plurality of classification labels, calculating the similarity between that label feature vector and each of the plurality of virtual class vectors; obtaining the target virtual class vector with the highest similarity and the index of that target virtual class vector; and creating the index-label mapping relationship table according to the index of the target virtual class vector, where the table includes each classification label and the index of its corresponding target virtual class vector.
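The construction of the mapping relationship table can be sketched as below. Dot-product similarity is used because the document names it elsewhere as an example; the 1-based indexes, label names, and vector values are illustrative assumptions.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_index_label_table(label_vectors, virtual_class_vectors):
    """Map each classification label to the 1-based index of the virtual class
    vector that has the highest dot-product similarity to its label vector."""
    table = {}
    for label, label_vector in label_vectors.items():
        scores = [dot(vc, label_vector) for vc in virtual_class_vectors]
        best_position = max(range(len(scores)), key=scores.__getitem__)
        table[label] = best_position + 1  # indexes start at 1 in this sketch
    return table


# Toy values (hypothetical).
label_vectors = {
    "sports": [1.0, 0.0],
    "finance": [0.0, 1.0],
    "fashion": [0.6, 0.7],
}
virtual_class_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]

table = build_index_label_table(label_vectors, virtual_class_vectors)
```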

In some possible embodiments, a specific implementation in which the server generates the virtual class vectors may include: obtaining the number and the dimensionality of the user feature vectors, and generating a corresponding plurality of randomly initialized vectors according to that number and dimensionality. The randomly initialized vectors can be used as the required virtual class vectors, which completes the generation and initialization of the virtual class vectors. For example, if each user has 5 user feature vectors and each is 32-dimensional, 5 randomly initialized 32-dimensional vectors may be generated and used as the virtual class vectors.
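A minimal sketch of this initialization follows. The Gaussian distribution, its standard deviation, and the optional seed are assumptions; the text only states that the vectors are randomly initialized with the same number and dimensionality as the user feature vectors.

```python
import random


def init_virtual_class_vectors(user_vectors, seed=None):
    """Create as many random vectors as there are user feature vectors,
    each with the same dimensionality as a user feature vector."""
    rng = random.Random(seed)
    count = len(user_vectors)
    dim = len(user_vectors[0])
    # Small Gaussian values; the distribution is an assumed choice.
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(count)]


# Example from the text: 5 user feature vectors, each 32-dimensional.
user_vectors = [[0.0] * 32 for _ in range(5)]
virtual_class_vectors = init_virtual_class_vectors(user_vectors, seed=0)
```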

204. Generate a fusion feature vector of the target user by using the target candidate feature vector and the plurality of user feature vectors.

Specifically, the server may process the plurality of user feature vectors by using a self-attention network to obtain a self-crossed feature vector of the target user, and then fuse the target candidate feature vector with the self-crossed feature vector to obtain the fusion feature vector of the target user. For example, the target candidate feature vector and the self-crossed feature vector may be averaged (reduce mean) to obtain the fusion feature vector of the target user.
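The fusion step can be sketched as follows. Two points are assumptions rather than statements of the source: the self-attention is taken to be plain softmax dot-product attention without learned projections, and its per-feature outputs are mean-pooled into a single self-crossed vector. Only the final averaging (reduce mean) of the two vectors comes from the text.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vectors(vectors):
    """Element-wise mean of equal-length vectors (reduce mean)."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def self_attention(vectors):
    """Each user feature vector attends to all user feature vectors."""
    outputs = []
    for query in vectors:
        weights = softmax([dot(query, key) for key in vectors])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(len(query))]
        )
    return outputs


def fuse(target_candidate, user_vectors):
    """Average the target candidate vector with the self-crossed vector."""
    # Mean pooling of the self-attention outputs is an assumed choice.
    self_crossed = mean_vectors(self_attention(user_vectors))
    return mean_vectors([target_candidate, self_crossed])


# Toy values (hypothetical).
fused = fuse([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```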

205. Input the fusion feature vector and the first label feature vector into a matching layer of a label matching model to obtain an output result of the matching layer.

206. Determine the matching degree between the target user and the classification label corresponding to the first label feature vector according to the output result.

In some feasible embodiments, if the target user is a user serving as a training sample, then when training the label matching model the server may obtain exposure click data of the target user for the classification label corresponding to the first label feature vector. The exposure click data describes the target user's actual click behavior on recommended content related to that classification label, that is, whether the user has viewed the recommended content. The server then obtains a loss value of the loss function of the label matching model according to the exposure click data and the matching degree, and adjusts the model parameters of the label matching model by using the loss value, for example by a gradient descent method, to complete the training of the label matching model.
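One training step can be sketched as below. The text does not name the loss function; binary cross-entropy over a sigmoid of the dot product is assumed here. For brevity the sketch treats the fused user vector and the label vector themselves as the trainable parameters, whereas in the described model the gradients would flow back into the tower networks. All names and values are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce_loss(predicted, clicked):
    """Binary cross-entropy between the predicted matching degree and the
    click flag (1 = clicked, 0 = not clicked). Assumed loss function."""
    eps = 1e-12
    return -(clicked * math.log(predicted + eps)
             + (1 - clicked) * math.log(1.0 - predicted + eps))


def sgd_step(user_vec, label_vec, clicked, lr=0.1):
    """One gradient descent step on both vectors; returns the updated vectors
    and the loss measured before the update."""
    predicted = sigmoid(dot(user_vec, label_vec))
    grad_logit = predicted - clicked  # derivative of BCE w.r.t. the dot product
    new_user = [u - lr * grad_logit * l for u, l in zip(user_vec, label_vec)]
    new_label = [l - lr * grad_logit * u for u, l in zip(user_vec, label_vec)]
    return new_user, new_label, bce_loss(predicted, clicked)


# Toy training loop on a single positive (clicked) sample.
user_vec, label_vec = [0.1, 0.2], [0.3, -0.1]
losses = []
for _ in range(50):
    user_vec, label_vec, loss = sgd_step(user_vec, label_vec, clicked=1)
    losses.append(loss)
```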

In some possible embodiments, as shown in fig. 3a, in the training process of the tag matching model, a plurality of user feature vectors, including a gender feature vector, an age feature vector, a behavior feature vector, a statistical interest feature vector, and the like, may be generated according to the feature input of the current user. The generated plurality of virtual class vectors (Virtual Cluster Embedding) includes n virtual class vectors: vector 1, vector 2, ..., vector n. Attention processing is performed on the plurality of user feature vectors based on each virtual class vector (i.e., Attention based on Virtual Cluster), and each virtual class vector corresponds to one aggregation vector (i.e., the above candidate feature vector), so that n candidate feature vectors (VC User Embedding) are obtained, namely candidate feature vector 1, candidate feature vector 2, ..., candidate feature vector n. On the one hand, the user tower network of the tag matching model performs self-crossing processing on the plurality of user feature vectors (i.e., user feature self-crossing) to obtain a self-crossed feature vector (User Interacted Embedding) of the user. On the other hand, the user tower network uses an attention mechanism to process each virtual class vector together with the plurality of user feature vectors to obtain the corresponding candidate feature vector; for a given tag feature vector, it performs a dot-product operation between the tag feature vector and each virtual class vector to determine the virtual class vector most similar to the tag feature vector (that is, the index of that virtual class vector), selects the corresponding target candidate feature vector from the n candidate feature vectors according to the index, and takes the target candidate feature vector as the feature vector obtained by indirect interaction between the user and the tag through the virtual class vectors. The user tower network then performs reduce-mean processing on the self-crossed feature vector and the target candidate feature vector to obtain a fused feature vector representing the user, performs a dot-product operation on the fused feature vector of the user and the tag feature vector to predict the matching degree between the user and the corresponding classification tag, and optimizes and adjusts the model parameters of the tag matching model according to the predicted matching degree and the user's actual clicks on the recommended content related to the classification tag, thereby training the tag matching model.
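The forward pass described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the model of fig. 3a: the attention is assumed to be plain dot-product attention with the virtual class vector as the query, the self-crossing step is replaced by a simple stand-in (the mean of pairwise element-wise products), and all function names are hypothetical. Indexes are 0-based here.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, feature_vecs):
    """Candidate feature vector: attention-weighted sum of the user feature vectors."""
    weights = softmax([dot(query, f) for f in feature_vecs])
    dim = len(query)
    return [sum(w * f[i] for w, f in zip(weights, feature_vecs)) for i in range(dim)]

def self_cross(feature_vecs):
    """Stand-in self-crossing: mean of pairwise element-wise products (an assumption)."""
    dim = len(feature_vecs[0])
    pairs = [(a, b) for i, a in enumerate(feature_vecs) for b in feature_vecs[i + 1:]]
    return [sum(a[k] * b[k] for a, b in pairs) / len(pairs) for k in range(dim)]

def match_degree(feature_vecs, virtual_vecs, tag_vec):
    # One candidate feature vector per virtual class vector.
    candidates = [attend(vc, feature_vecs) for vc in virtual_vecs]
    # Index of the virtual class vector most similar to the tag feature vector.
    best = max(range(len(virtual_vecs)), key=lambda i: dot(virtual_vecs[i], tag_vec))
    crossed = self_cross(feature_vecs)
    # "Reduce mean": element-wise mean of the self-crossed and target candidate vectors.
    fused = [(c + t) / 2.0 for c, t in zip(crossed, candidates[best])]
    return best, fused, dot(fused, tag_vec)

user_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
virtual_classes = [[1.0, 0.0], [0.0, 1.0]]
tag = [0.2, 0.9]
best_index, fused_vec, score = match_degree(user_features, virtual_classes, tag)
```

Note that the tag feature vector is used only to pick an index and in the final dot product, so the user-side computation of every candidate feature vector does not depend on the tag.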

In addition, in the training process, for each classification tag, after the virtual class vector with the highest similarity to the tag feature vector of the classification tag is found, the index of that virtual class vector can be obtained, and a mapping relation table of indexes and classification tags is then created according to the index. For example, suppose there are 100 classification tags and 5 virtual class vectors, denoted vc_1, vc_2, vc_3, vc_4, and vc_5, whose indexes (index) are 1, 2, 3, 4, and 5, respectively; the mapping relation table of indexes and tags then includes the value of each of the 100 classification tags and the index of its corresponding virtual class vector.
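A minimal sketch of building such a mapping relation table is given below. It assumes dot-product similarity, as used elsewhere in this description, and 1-based indexes as in the example; the tag names and vector values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_index_table(tag_vecs, virtual_vecs):
    """Map each classification tag to the 1-based index of its most similar virtual class vector."""
    table = {}
    for tag, vec in tag_vecs.items():
        sims = [dot(vec, vc) for vc in virtual_vecs]
        table[tag] = sims.index(max(sims)) + 1  # indexes run 1..n
    return table

virtual_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]   # vc_1, vc_2, vc_3
tag_vecs = {"sports": [0.9, 0.1], "games": [0.1, 0.8], "travel": [0.6, 0.6]}
table = build_index_table(tag_vecs, virtual_vecs)
```

Because the table depends only on the tag feature vectors and the virtual class vectors, it can be computed once after training and reused for every user at prediction time.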

In some possible embodiments, after the feature data of a certain user is obtained, the feature data is input into the tag matching model to obtain a self-crossed feature vector (User intersected feature) of the user and a candidate feature vector set (VC User intersected Set). The candidate feature vector set includes a plurality of candidate feature vectors, and each candidate feature vector is obtained by performing attention processing on one virtual class vector and the plurality of user feature vectors. For any classification tag, the index v_index of the virtual class vector with the highest similarity to the tag feature vector of the classification tag is obtained by querying the mapping relation table of indexes and classification tags, and the target candidate feature vector corresponding to the index v_index is determined from the candidate feature vector set. The self-crossed feature vector and the target candidate feature vector are then fused to obtain the fused feature vector of the user, and a dot-product operation is performed on the fused feature vector and the tag feature vector of the classification tag to obtain the score of the user for that classification tag. The score of the user for each classification tag is obtained in this manner, and the interest tags of the user are then determined according to the ranking of the scores, so that the interests of the user are mined fully and accurately.
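The prediction-time lookup can be illustrated as follows. The sketch assumes that the self-crossed feature vector and the candidate feature vector set have already been produced by the user tower, that fusion is an element-wise mean, and that the mapping table stores 1-based indexes; all names and values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def score_tags(self_crossed, candidate_set, tag_vecs, index_table):
    """Score one user against every classification tag using precomputed lookups."""
    scores = {}
    for tag, tag_vec in tag_vecs.items():
        v_index = index_table[tag]            # 1-based index from the mapping table
        target = candidate_set[v_index - 1]   # target candidate feature vector
        # Fusion by element-wise mean (an assumption for this sketch).
        fused = [(s + t) / 2.0 for s, t in zip(self_crossed, target)]
        scores[tag] = dot(fused, tag_vec)
    return scores

self_crossed = [0.4, 0.2]
candidate_set = [[1.0, 0.0], [0.0, 1.0]]
tag_vecs = {"sports": [1.0, 0.0], "games": [0.0, 1.0]}
index_table = {"sports": 1, "games": 2}
scores = score_tags(self_crossed, candidate_set, tag_vecs, index_table)
ranked = sorted(scores, key=scores.get, reverse=True)  # tags from highest to lowest score
```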

In some possible embodiments, the overall process of matching the user and the tag may be as shown in fig. 3c, taking advertisement recommendation as an example, which mainly includes: sample extraction, model training, model prediction and result selection.

Specifically, after interest targeting is performed for users, an advertisement log is obtained and a data set is constructed, and the data set is used to label the interests of the users. The labeling format may be (user, tag, 0/1), where (user, tag, 0) indicates that the user did not click the advertisement content corresponding to the tag, and (user, tag, 1) indicates that the user clicked the advertisement content corresponding to the tag. A feature data set of the user is then constructed by combining data such as the user's basic attributes, historical behaviors, and statistical features, and is used for model training. At prediction time, a feature data set of the user is constructed from the same kinds of data, namely basic attributes, historical behaviors, and statistical features, and is input into the model for prediction.
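For illustration, the (user, tag, 0/1) labeling format might be produced from exposure records as sketched below. The record field names (`user`, `tag`, `clicked`) and the one-sample-per-exposure policy are assumptions; the advertisement log format is not specified in this description.

```python
from collections import namedtuple

# One labeled training sample in the (user, tag, 0/1) format.
Sample = namedtuple("Sample", ["user", "tag", "clicked"])

def build_samples(ad_log):
    """Turn exposure records into (user, tag, 0/1) samples, one sample per exposure."""
    return [Sample(r["user"], r["tag"], 1 if r["clicked"] else 0) for r in ad_log]

# Hypothetical advertisement log entries.
ad_log = [
    {"user": "u1", "tag": "games", "clicked": True},
    {"user": "u1", "tag": "sports", "clicked": False},
    {"user": "u2", "tag": "games", "clicked": False},
]
samples = build_samples(ad_log)
```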

In some possible embodiments, the network structure of the tag matching model may be as shown in fig. 3d. The tag matching model adopts a double-tower structure and comprises a user tower network and a tag tower network. The user tower network is used for processing the input user features and the virtual class vectors to generate the fused feature vector of the user, and the tag tower network is used for generating the tag feature vectors.

In some possible embodiments, the network structure of the user tower in the tag matching model may be as shown in fig. 3e. The user tower may specifically be a user tower based on an eXtreme Deep Factorization Machine (xDeepFM) model. The user tower converts the features of each dimension of the user into corresponding feature vectors. On the one hand, the user tower performs self-crossing processing on the plurality of user feature vectors to obtain a self-crossed feature vector of the user; on the other hand, it processes each virtual class vector and the plurality of user feature vectors through an attention mechanism to obtain a plurality of candidate feature vectors, selects the candidate feature vector corresponding to the virtual class vector with the highest similarity to the tag feature vector, and fuses it with the self-crossed feature vector to obtain the fused feature vector of the user. The advantage is that the prediction of the fused feature vector of the user remains decoupled from the tag feature vector: on the premise that the user tower and the tag tower stay decoupled, interaction between the features of the user tower and the tag tower is realized based on the virtual class vectors, and the accuracy in predicting the matching degree between the user and each tag is improved.

It should be noted that, compared with a prior-art double-tower model structure that has no virtual class vectors, the method adds computation in only two places. First, when the user tower predicts the fused feature vector of the user, attention computation between the user feature vectors and the virtual class vectors is added; because GPU-based matrix computation is used in both training and prediction, the added computation is very limited. Second, dot-product computation between the tag feature vectors and the virtual class vectors is added, and the added computation is negligible. Therefore, indirect interaction between the user features and the tag features can be added based on the virtual class vectors while prediction remains efficient, and the accuracy of tag matching is effectively improved.

In the embodiment of the present invention, a server may obtain a feature data set of a target user, where the feature data set includes a plurality of user feature vectors for describing features of the target user, obtain a virtual class vector with the highest similarity to a first tag feature vector from the plurality of virtual class vectors, where the first tag feature vector is a feature vector of any one of a plurality of predefined class tags, determine a target candidate feature vector corresponding to the virtual class vector with the highest similarity from the plurality of candidate feature vectors, generate a fused feature vector of the target user using the target candidate feature vector and the plurality of user feature vectors, input the fused feature vector and the first tag feature vector into a matching layer of a tag matching model, obtain an output result of the matching layer, determine a matching degree between the target user and the class tag corresponding to the first tag feature vector according to the output result, indirect interaction between the user features and the label features can be achieved based on the virtual class vectors, so that the accuracy in label matching is improved, and full and accurate mining of user interests is facilitated.

In some feasible embodiments, based on the data processing method provided by the foregoing embodiment, an embodiment of the present invention further provides a content recommendation method, which specifically includes the following steps:

(a) Acquiring the matching degree between the user and each classification label in the predefined plurality of classification labels.

For a specific implementation of the matching degree between the user and the classification tag, reference may be made to the related description in the foregoing embodiment, which is not described herein again.

(b) Determining at least one classification label from the plurality of classification labels according to the matching degree, and taking the at least one classification label as an interest label of the user.

Specifically, the server may rank the matching degrees corresponding to the classification tags from high to low, determine at least one top-ranked classification tag, for example the classification tags ranked in the top three (top 3), and use the determined at least one classification tag as an interest tag of the user. The matching degree determined based on the foregoing embodiment can thus be used to mine the interests of the user fully and accurately, laying a good foundation for subsequent accurate content recommendation.

(c) Obtaining the classification label of the content to be recommended, and determining a recalled user from a user set by using the classification label of the content to be recommended and the interest label of each user.

The content to be recommended may specifically be advertisement data of a certain commodity, and the commodity may be a physical commodity such as a digital product, a cosmetic, or the like, or the commodity may also be a virtual commodity such as a game, a prop, or the like, which is not limited in the embodiment of the present invention.

Specifically, for the content to be recommended, the server may recall users according to the classification tag of the content to be recommended, that is, screen out, from the plurality of users included in the user set, the users who may be interested in the content to be recommended. For example, the interest tags of each user in the user set may be compared with the classification tag of the content to be recommended; if the interest tags include the classification tag of the content to be recommended, or the similarity between an interest tag and the classification tag of the content to be recommended is high (for example, reaches a preset similarity threshold), the corresponding user is taken as a recalled user, and all users that can be recalled are determined from the user set in this way.

(d) Pushing the content to be recommended to the terminal equipment corresponding to the recalled user.

Specifically, after the server determines the recalled user, the server can push the content to be recommended to the terminal device corresponding to the recalled user, so that the content is pushed directionally and accurately.
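Steps (b) and (c) can be sketched as follows. The example is an illustration only: it covers the case in which a user's interest tags include the classification tag of the content, and omits the similarity-threshold case; the user identifiers, tag names, and matching degrees are hypothetical.

```python
def top_k_tags(match_degrees, k=3):
    """Step (b): keep the k classification tags with the highest matching degree."""
    ranked = sorted(match_degrees.items(), key=lambda kv: kv[1], reverse=True)
    return [tag for tag, _ in ranked[:k]]

def recall_users(content_tag, interest_tags_by_user):
    """Step (c): recall every user whose interest tags include the content's tag."""
    return [u for u, tags in interest_tags_by_user.items() if content_tag in tags]

# Hypothetical matching degrees per user and classification tag.
match_degrees_by_user = {
    "u1": {"games": 0.91, "sports": 0.35, "travel": 0.62, "digital": 0.77},
    "u2": {"games": 0.10, "sports": 0.80, "travel": 0.55, "digital": 0.40},
}
interest = {u: top_k_tags(m, k=3) for u, m in match_degrees_by_user.items()}
recalled = recall_users("games", interest)
```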

In some possible embodiments, after determining the recalled users, the server may perform operations such as coarse ranking and fine ranking on the recalled users to further screen them, where coarse ranking refers to screening users with fewer features (for example, part of the features of the user and of the content to be recommended), and fine ranking refers to screening users with more features (for example, all features of the user and of the content to be recommended). Specifically, in the coarse ranking stage, for the recalled users, the server may input part of the features of each user and part of the features of the content to be recommended into the scoring model, obtain a score of the matching degree between each recalled user and the content to be recommended, and select the higher-ranked users from the recalled users according to the score ranking, thereby screening users through coarse ranking. In the fine ranking stage, for the users screened by the coarse ranking, the server may input all features of each user and all features of the content to be recommended into the scoring model to obtain a score of the matching degree between each screened user and the content to be recommended, and select the higher-ranked users from the screened users according to the score ranking. The content to be recommended can then be pushed to the terminal equipment corresponding to the users remaining after the fine ranking. The coarse ranking and the fine ranking narrow the range of push targets, which further improves the accuracy and the success rate of content recommendation.
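The two-stage screening can be sketched as follows. The scoring model is not specified in this description, so the example substitutes hypothetical precomputed scores for its outputs; the function names and cut-off sizes are likewise assumptions.

```python
def rank_and_keep(users, score_fn, keep):
    """Score each user, sort from high to low, and keep the top `keep` users."""
    return sorted(users, key=score_fn, reverse=True)[:keep]

def two_stage_filter(recalled, coarse_score, fine_score, coarse_keep, fine_keep):
    """Coarse ranking on partial features, then fine ranking on all features."""
    shortlisted = rank_and_keep(recalled, coarse_score, coarse_keep)
    return rank_and_keep(shortlisted, fine_score, fine_keep)

# Hypothetical scores standing in for the scoring model's outputs.
coarse = {"u1": 0.9, "u2": 0.4, "u3": 0.7, "u4": 0.2}    # from partial features
fine = {"u1": 0.5, "u2": 0.95, "u3": 0.8, "u4": 0.99}    # from all features
final_users = two_stage_filter(list(coarse), coarse.get, fine.get,
                               coarse_keep=2, fine_keep=1)
```

In this example u4 has the highest fine score but is never evaluated in the fine stage, because the coarse stage has already removed it; this is the trade-off that lets the more expensive scoring run on fewer users.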

Referring to fig. 4, a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention is shown, where the apparatus includes:

an obtaining module 401, configured to obtain a feature data set of a target user, where the feature data set includes a plurality of user feature vectors for describing features of the target user;

the obtaining module 401 is further configured to obtain, from a plurality of virtual class vectors, a virtual class vector with a highest similarity to a first tag feature vector, where the first tag feature vector is a feature vector of any one of a plurality of predefined class tags, and the plurality of virtual class vectors are generated according to the number and the dimensions of the plurality of user feature vectors;

a generating module 402, configured to generate a fused feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors;

a determining module 403, configured to determine, according to the fusion feature vector and the first tag feature vector, a matching degree between the target user and the class tag corresponding to the first tag feature vector.

Optionally, the determining module 403 is specifically configured to:

inputting the fusion characteristic vector and the first label characteristic vector into a matching layer of a label matching model to obtain an output result of the matching layer;

and determining the matching degree between the target user and the classification label corresponding to the first label feature vector according to the output result.

Optionally, the generating module 402 is specifically configured to:

determining a target candidate feature vector corresponding to the virtual class vector with the highest similarity from a plurality of candidate feature vectors, wherein each candidate feature vector in the plurality of candidate feature vectors is generated according to one virtual class vector in the plurality of virtual class vectors and the plurality of user feature vectors;

generating a fused feature vector of the target user using the target candidate feature vector and the plurality of user feature vectors.

Optionally, the generating module 402 is further configured to, for each virtual class vector in the plurality of virtual class vectors, process that virtual class vector and the plurality of user feature vectors by using an attention network to obtain the corresponding candidate feature vector, thereby obtaining the plurality of candidate feature vectors.

Optionally, the generating module 402 is specifically configured to:

performing data processing on the virtual class vector with the highest similarity and the plurality of user feature vectors by using an attention network to obtain the target candidate feature vector.

Optionally, the generating module 402 is specifically configured to:

performing data processing on the plurality of user feature vectors by using a self-attention network to obtain a self-crossed feature vector of the target user;

and performing fusion processing on the target candidate feature vector and the self-crossed feature vector to obtain a fusion feature vector of the target user.

Optionally, the obtaining module 401 is specifically configured to:

obtaining a classification label corresponding to the first label feature vector;

querying a target index corresponding to the classification label from an index-label mapping relation table, wherein the index-label mapping relation table comprises each classification label in the plurality of classification labels and an index of a corresponding virtual class vector with the highest similarity;

and taking the virtual class vector corresponding to the target index in the plurality of virtual class vectors as the virtual class vector with the highest similarity with the first label characteristic vector.

Optionally, the obtaining module 401 is further configured to, for a tag feature vector corresponding to each of the plurality of classification tags, obtain a target virtual class vector with a highest similarity to the tag feature vector from the plurality of virtual class vectors, and obtain an index of the target virtual class vector;

the generating module 402 is further configured to create an index-to-label mapping relationship table according to the index of the target virtual class vector, where the index-to-label mapping relationship table includes each classification label and an index of a corresponding target virtual class vector.

Optionally, the obtaining module 401 is further configured to:

acquiring exposure click data of the target user on the classification label corresponding to the first label feature vector;

obtaining a loss value of a loss function of the label matching model according to the exposure click data and the matching degree;

and adjusting the model parameters of the label matching model by using the loss value so as to finish the training of the label matching model.

Optionally, the obtaining module 401 is specifically configured to:

acquiring exposure click data and feature data of a target user, wherein the feature data comprises one or more of basic attribute features, historical behavior features and statistical features;

and generating a feature data set of the target user according to the exposure click data and the feature data.

Optionally, the obtaining module 401 is further configured to obtain the number and the dimensions of the plurality of user feature vectors;

the generating module 402 is further configured to generate a plurality of randomly initialized vectors according to the number and the dimension, and use the plurality of randomly initialized vectors as the plurality of virtual class vectors.
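One possible reading of this initialization is sketched below. It assumes that each virtual class vector has the same dimension as a user feature vector and that the number of virtual class vectors is supplied as a parameter; how that number is derived from the number of user feature vectors is not stated here, and the Gaussian initialization, the seed, and the function name are assumptions.

```python
import random

def init_virtual_class_vectors(user_feature_vecs, n_virtual, seed=0, scale=0.1):
    """Create n_virtual randomly initialized vectors matching the user feature dimension."""
    dim = len(user_feature_vecs[0])   # dimension taken from the user feature vectors
    rng = random.Random(seed)         # fixed seed so the sketch is reproducible
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_virtual)]

user_feature_vecs = [[0.0] * 8 for _ in range(4)]   # 4 user feature vectors of dimension 8
virtual_vecs = init_virtual_class_vectors(user_feature_vecs, n_virtual=5)
```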

Optionally, the tag matching model further includes a presentation layer, where the presentation layer includes a user tower network and a tag tower network, the user tower network is used to obtain the fused feature vector of the user, and the tag tower network is used to obtain the tag feature vector.

It should be noted that the functions of each functional module of the data processing apparatus according to the embodiment of the present invention may be specifically implemented according to the method in the foregoing method embodiment, and the specific implementation process may refer to the related description of the foregoing method embodiment, which is not described herein again.

Referring to fig. 5, a schematic structural diagram of a server according to an embodiment of the present invention is shown, where the server according to the embodiment of the present invention includes a power supply module and the like, and includes a processor 501, a storage device 502, and a network interface 503. The processor 501, the storage device 502, and the network interface 503 can exchange data with each other.

The storage device 502 may include a volatile memory, such as a random-access memory (RAM); the storage device 502 may also include a non-volatile memory, such as a flash memory, a solid-state drive (SSD), etc.; the storage device 502 may also include a combination of the above kinds of memories.

The processor 501 may be a Central Processing Unit (CPU). In one embodiment, the processor 501 may also be a Graphics Processing Unit (GPU). The processor 501 may also be a combination of a CPU and a GPU. In one embodiment, the storage device 502 is used to store program instructions. The processor 501 may call the program instructions to perform the following operations:

Optionally, the processor 501 is specifically configured to:

Optionally, the processor 501 is further configured to:

for each virtual class vector in the plurality of virtual class vectors, processing that virtual class vector and the plurality of user feature vectors by using an attention network to obtain the corresponding candidate feature vector, thereby obtaining the plurality of candidate feature vectors.

Optionally, the processor 501 is specifically configured to:

Optionally, the processor 501 is further configured to:

aiming at the label feature vector corresponding to each classification label in the classification labels, obtaining a target virtual class vector with the highest similarity between the target virtual class vector and the label feature vector from a plurality of virtual class vectors;

obtaining an index of the target virtual class vector;

and creating a mapping relation table of indexes and labels according to the indexes of the target virtual class vectors, wherein the mapping relation table of the indexes and the labels comprises each classification label and the index of the corresponding target virtual class vector.

Optionally, the processor 501 is further configured to:

Optionally, the processor 501 is specifically configured to:

Optionally, the processor 501 is further configured to:

acquiring the quantity and the dimensionality of the plurality of user feature vectors;

and generating a plurality of randomly initialized vectors according to the number and the dimension, and taking the plurality of randomly initialized vectors as the plurality of virtual class vectors.

In a specific implementation, the processor 501, the storage device 502, and the network interface 503 described in this embodiment of the present invention may execute the implementation described in the related embodiment of the data processing method based on deep learning provided in fig. 1 and fig. 2 in this embodiment of the present invention, or may execute the implementation described in the related embodiment of the data processing device provided in fig. 4 in this embodiment of the present invention, which is not described herein again.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing relevant hardware, where the program includes one or more instructions that can be stored in a computer storage medium, and when executed, the program may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps performed in the embodiments of the methods described above.

The above disclosure describes only some embodiments of the present application and certainly should not be taken as limiting the scope of the present application; equivalent variations made in accordance with the claims of the present application therefore still fall within the scope of the present application.

Claims

1. A data processing method based on deep learning is characterized by comprising the following steps:

2. The method according to claim 1, wherein the determining the matching degree between the target user and the class label corresponding to the first label feature vector according to the fused feature vector and the first label feature vector comprises:

3. The method according to claim 1 or 2, wherein the generating the fused feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors comprises:

4. The method of claim 3, wherein prior to determining the target candidate feature vector corresponding to the index from the plurality of candidate feature vectors, the method further comprises:

5. The method according to claim 1, wherein the generating the fused feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors comprises:

6. The method according to claim 3 or 5, wherein the generating a fused feature vector of the target user using the target candidate feature vector and the plurality of user feature vectors comprises:

7. The method according to claim 1, wherein the obtaining the virtual class vector with the highest similarity with the first tag feature vector from the plurality of virtual class vectors comprises:

8. The method according to claim 7, wherein before the obtaining the virtual class vector with the highest similarity with the first tag feature vector from the plurality of virtual class vectors, the method further comprises:

obtaining an index of the target virtual class vector;

9. The method of claim 2, further comprising:

10. The method of claim 9, wherein the obtaining the feature data set of the target user comprises:

11. The method according to claim 1, wherein before the obtaining the virtual class vector with the highest similarity with the first tag feature vector from the plurality of virtual class vectors, the method further comprises:

12. The method of claim 2, wherein the tag matching model further comprises a presentation layer, and wherein the presentation layer comprises a user tower network and a tag tower network, the user tower network is used for obtaining the fused feature vector of the user, and the tag tower network is used for obtaining the tag feature vector.

13. A data processing apparatus, characterized in that the apparatus comprises:

14. A server, characterized in that the server comprises a processor, a network interface and a storage device, wherein the processor, the network interface and the storage device are connected with each other, wherein the network interface is controlled by the processor for transceiving data, the storage device is used for storing a computer program, the computer program comprises program instructions, and the processor is configured to call the program instructions for executing the deep learning based data processing method according to any one of claims 1 to 12.

15. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions which are executed by a processor to perform the deep learning based data processing method according to any one of claims 1 to 12.