CN114298118B - Data processing method based on deep learning, related equipment and storage medium

Info

Publication number: CN114298118B
Application number: CN202011041771.7A
Authority: CN
Inventors: 赵猛
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-09-28
Filing date: 2020-09-28
Publication date: 2024-02-09
Anticipated expiration: 2040-09-28
Also published as: CN114298118A

Abstract

An embodiment of the invention provides a data processing method based on deep learning, a related device and a storage medium. The method comprises: obtaining a feature data set of a target user, the feature data set comprising a plurality of user feature vectors describing features of the target user; obtaining, from a plurality of virtual class vectors, the virtual class vector with the highest similarity to a first tag feature vector, the first tag feature vector being the feature vector of any one of a plurality of predefined classification tags; generating a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors; and determining, according to the fusion feature vector and the first tag feature vector, the matching degree between the target user and the classification tag corresponding to the first tag feature vector. Indirect interaction between user features and tag features can thus be achieved based on the virtual class vectors, which improves the accuracy of tag matching through deep learning and facilitates full and accurate mining of user interests.

Description

Data processing method based on deep learning, related equipment and storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a data processing method based on deep learning, a related device, and a storage medium.

Background

The dual-tower model structure is a deep learning network structure that is widely used in the field of deep matching. When mining user interests, the dual-tower model structure usually matches users with tags. It comprises a user tower network and a tag tower network, which are usually decoupled: embedding is performed separately within each network, so that the user feature vector is obtained in the user tower network and the tag feature vector is obtained in the tag tower network. The two vectors interact computationally only in the last layer of the dual-tower model structure, which yields the matching degree between the user and each tag. As a result, the user feature vector generated by the current dual-tower model structure carries limited information and lacks interaction with other information that could enrich it, which tends to lower the accuracy of the predicted matching degree between the user and each tag.

Disclosure of Invention

The embodiment of the invention provides a data processing method, related equipment and storage medium based on deep learning, which can realize indirect interaction between user features and tag features based on virtual class vectors, thereby improving the accuracy of tag matching and being beneficial to fully and accurately mining the interests of users.

In one aspect, an embodiment of the present invention provides a data processing method based on deep learning, where the method includes:

acquiring a feature data set of a target user, wherein the feature data set comprises a plurality of user feature vectors for describing features of the target user;

obtaining a virtual class vector with highest similarity with a first tag feature vector from a plurality of virtual class vectors, wherein the first tag feature vector is a feature vector of any one of a plurality of predefined classification tags, and the plurality of virtual class vectors are generated according to the number and the dimensions of the plurality of user feature vectors;

generating a fusion feature vector of the target user by utilizing the virtual class vector with the highest similarity and the plurality of user feature vectors;

and determining the matching degree between the target user and the classification label corresponding to the first label feature vector according to the fusion feature vector and the first label feature vector.

In another aspect, an embodiment of the present invention provides a data processing apparatus, including:

an acquisition module for acquiring a feature data set of a target user, the feature data set comprising a plurality of user feature vectors for describing features of the target user;

the acquisition module being further configured to obtain, from a plurality of virtual class vectors, a virtual class vector with the highest similarity to a first tag feature vector, where the first tag feature vector is a feature vector of any one of a plurality of predefined classification tags, and the plurality of virtual class vectors are generated according to the number and dimensions of the plurality of user feature vectors;

a generation module for generating a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors;

and a determining module for determining the matching degree between the target user and the classification label corresponding to the first label feature vector according to the fusion feature vector and the first label feature vector.

In yet another aspect, an embodiment of the present invention provides a server, where the server includes a processor, a network interface, and a storage device, where the processor, the network interface, and the storage device are connected to each other, where the network interface is controlled by the processor to send and receive data, and the storage device is used to store a computer program, where the computer program includes program instructions, and the processor is configured to invoke the program instructions to perform the data processing method based on deep learning.

In yet another aspect, an embodiment of the present invention provides a computer readable storage medium storing a computer program including program instructions executable by a processor to perform the above-described deep learning-based data processing method.

In yet another aspect, an embodiment of the present invention discloses a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the deep learning-based data processing method described above.

In the embodiment of the invention, the server can acquire the feature data set of the target user, wherein the feature data set comprises a plurality of user feature vectors for describing the features of the target user, a virtual class vector with highest similarity with a first tag feature vector is acquired from a plurality of virtual class vectors, the first tag feature vector is the feature vector of any one of a plurality of predefined classification tags, then a fusion feature vector of the target user is generated by utilizing the virtual class vector with highest similarity and the plurality of user feature vectors, and the matching degree between the target user and the classification tag corresponding to the first tag feature vector is determined according to the fusion feature vector and the first tag feature vector, so that indirect interaction between the user feature and the tag feature can be realized based on the virtual class vector, thereby improving the accuracy of tag matching and being beneficial to fully and accurately mining the user interests.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a data processing method based on deep learning according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of another data processing method based on deep learning according to an embodiment of the present invention;

FIG. 3a is a schematic diagram of a training process of a tag matching model according to an embodiment of the present invention;

FIG. 3b is a schematic diagram of a usage flow of a tag matching model according to an embodiment of the present invention;

FIG. 3c is a schematic diagram of an overall process of matching a user with a tag according to an embodiment of the present invention;

FIG. 3d is a schematic diagram of a network structure of a tag matching model according to an embodiment of the present invention;

FIG. 3e is a schematic diagram of a network structure of a user tower in a tag matching model according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a server according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operating/interaction systems, mechatronics and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing and machine learning/deep learning.

Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way of giving computers intelligence, and it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and learning from demonstration.

Cloud computing in the narrow sense refers to a delivery and usage mode of IT infrastructure, in which the required resources are obtained over a network in an on-demand, easily scalable manner; cloud computing in the broad sense refers to a delivery and usage mode of services, in which the required services are obtained over a network in an on-demand, easily scalable manner. Such services may be IT, software or internet related, or may be other services. Cloud computing is a product of the development and fusion of traditional computer and network technologies such as grid computing (Grid Computing), distributed computing (Distributed Computing), parallel computing (Parallel Computing), utility computing (Utility Computing), network storage (Network Storage Technologies), virtualization (Virtualization) and load balancing (Load Balancing). With the development of the internet, real-time data streams and the diversification of connected devices, and driven by demand for search services, social networks, mobile commerce, open collaboration and the like, cloud computing has developed rapidly. Unlike earlier parallel distributed computing, the emergence of cloud computing will conceptually drive a revolutionary change in the whole internet model and in enterprise management models.

Big data (Big Data) refers to a data set that cannot be captured, managed and processed by conventional software tools within a certain time range. It is a massive, fast-growing and diversified information asset that requires new processing modes in order to provide stronger decision-making, insight-discovery and process-optimization capabilities. With the advent of the cloud era, big data has attracted more and more attention, and special techniques are required to process large amounts of data effectively within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the internet and scalable storage systems.

The scheme provided by the embodiment of the application relates to technologies such as machine learning of artificial intelligence and big data of cloud computing, and is specifically described through the following embodiments:

In the existing dual-tower model structure, the user feature vector and the tag feature vector are obtained without any information interaction between the user features and the tag features, which easily leads to low accuracy when predicting the matching degree between a user and each tag. The embodiments of the invention are directed at this problem.

The data processing method based on deep learning provided by the embodiment of the invention can be applied to user interest mining scenarios, in which the interest characteristics of each user are mined accurately by obtaining the classification labels that match the user. The dual-tower model structure involved may include two sub-networks: a user tower for generating a fused feature vector of the user, and a tag tower for generating a tag feature vector. In the user tower sub-network, indirect interaction between the user features and the tag features is realized by means of the virtual class vectors, so that the resulting fusion feature vector can learn part of the relevant information in the tags. Because the user network and the tag network remain decoupled, the accuracy of predicting the matching degree between the user and each tag is improved while prediction remains efficient.

The dual-tower model structure can be a deep structured semantic model (Deep Structured Semantic Models, DSSM) or a variant model of DSSM, such as convolutional neural network CNN-DSSM, long-short-term memory network LSTM-DSSM, and the like, and the invention is not limited.

Please refer to FIG. 1, which is a schematic flow chart of a data processing method based on deep learning according to an embodiment of the present invention. The method includes the following steps:

101. Obtain a feature dataset of a target user, the feature dataset comprising a plurality of user feature vectors describing features of the target user.

The target user may be any user. The features of a user may be described along multiple dimensions, including basic attribute features, historical behavior features and statistical features. Basic attribute features may include age, gender, education level, region and the like. Historical behavior features may include recommended content (e.g., recommended advertisements) that the user clicked or viewed over a period of time. Statistical features may refer to statistics of the historical behavior features, such as the number of times, the time and the place at which the user clicked or viewed recommended content over a period of time.

Specifically, the server may obtain feature data of the target user, where the feature data includes one or more of basic attribute features, historical behavior features and statistical features, and construct a plurality of user feature vectors of the target user according to the feature data, for example, may perform word vector embedding (embedding) on the feature data to obtain a plurality of user feature vectors, so as to obtain a feature data set of the target user.
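As a rough illustration of how feature data might be turned into a plurality of user feature vectors, the following sketch looks up one embedding row per feature field. The field names, vocabulary sizes, the 32-dimensional embedding size and the random initialisation are all assumptions made for the example; the text does not specify them.

```python
import numpy as np

# Hypothetical feature fields and vocabulary sizes (illustrative assumptions only).
FIELD_VOCAB_SIZES = {"age_bucket": 10, "gender": 3, "clicked_ad_category": 50}
EMBEDDING_DIM = 32

rng = np.random.default_rng(0)
# One embedding table per field; in a real model these rows would be learned.
embedding_tables = {
    field: rng.normal(scale=0.1, size=(vocab, EMBEDDING_DIM))
    for field, vocab in FIELD_VOCAB_SIZES.items()
}


def build_user_feature_vectors(feature_ids):
    """Look up one embedding row per field and stack them into an (m, d) matrix."""
    return np.stack([embedding_tables[field][idx] for field, idx in feature_ids.items()])


# Feature data set of one user: m = 3 user feature vectors of dimension 32.
user_features = build_user_feature_vectors(
    {"age_bucket": 3, "gender": 1, "clicked_ad_category": 17}
)
```

Each row of `user_features` plays the role of one user feature vector in the steps that follow.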

102. Obtain, from a plurality of virtual class vectors, the virtual class vector with the highest similarity to a first tag feature vector, wherein the first tag feature vector is a feature vector of any one of a plurality of predefined classification tags.

The plurality of virtual class vectors may be a plurality of initialization vectors generated according to the number and dimensions of the plurality of user feature vectors. The plurality of classification tags may be defined according to actual requirements. For example, a classification tag may refer to a category of commodity, in which case the plurality of classification tags may include cosmetics, electronic products, clothing, automobiles, home appliances, snacks, drinks and the like; as another example, a classification tag may refer to a category of interest, in which case the plurality of classification tags may include sports, science, health, financial, military, time and the like. Word vector embedding (embedding) may be performed on each classification tag to obtain the feature vector of that tag (denoted as a tag feature vector).

103. Generate a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors.

Specifically, for any one of the classification labels, the corresponding label feature vector is denoted as the first label feature vector. When predicting the matching degree between the user and that classification label, in order to construct a feature vector that fully represents the user's feature information, the server may obtain, from the plurality of virtual class vectors, the virtual class vector with the highest similarity to the first label feature vector, and then generate the fusion feature vector of the target user by using that virtual class vector and the plurality of user feature vectors. For example, the similarity between each virtual class vector and the first tag feature vector may be calculated by a dot product operation, and the virtual class vector with the highest similarity (i.e., the largest dot product value) is then selected from the plurality of virtual class vectors. The classification tag thus takes the virtual class vector with the highest similarity as its representative vector, and this representative vector, rather than the tag feature vector itself, exchanges information with the user feature vectors. This realizes indirect interaction between the user and the tag while keeping the user network structure and the tag network structure in the model decoupled from each other and avoiding direct interaction.
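The dot-product selection described above can be sketched as follows. The function name and the toy vectors are illustrative assumptions, and the returned index is 0-based, as is usual in Python.

```python
import numpy as np


def most_similar_virtual_class(virtual_classes, tag_vector):
    """Return (index, vector) of the virtual class vector with the largest dot product."""
    scores = virtual_classes @ tag_vector  # one similarity score per virtual class vector
    best = int(np.argmax(scores))
    return best, virtual_classes[best]


# Toy data: three 2-dimensional virtual class vectors and one tag feature vector.
virtual_classes = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
tag_vector = np.array([0.2, 0.9])

index, representative = most_similar_virtual_class(virtual_classes, tag_vector)
```

Here the dot products are 0.2, 0.9 and 0.77, so the second virtual class vector becomes the tag's representative vector.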

104. Determine the matching degree between the target user and the classification label corresponding to the first label feature vector according to the fusion feature vector and the first label feature vector.

Specifically, the server may input the fused feature vector and the first tag feature vector into the matching layer of the tag matching model to perform data processing, for example, dot product processing, so as to obtain an output result of the matching layer, and determine, according to the output result, a matching degree between the target user and the classification tag corresponding to the first tag feature vector, where the matching degree may specifically be a score value, and the range of the score value may be 0-100, or may be 0-1 after normalization processing. The tag matching model can adopt the double-tower model structure.
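A minimal sketch of such a matching layer is given below. It takes the dot product of the two vectors and maps it into the range 0-1; the use of a sigmoid for that normalization is an assumption, since the text only says the score may be normalized.

```python
import numpy as np


def matching_degree(fused_user_vector, tag_vector):
    """Dot product of the two vectors, squashed into (0, 1) by a sigmoid (assumed here)."""
    logit = float(np.dot(fused_user_vector, tag_vector))
    return 1.0 / (1.0 + np.exp(-logit))


# Toy vectors: the dot product is 0.5 - 0.5 + 0.5 = 0.5.
score = matching_degree(np.array([0.5, -0.25, 1.0]), np.array([1.0, 2.0, 0.5]))
```

Multiplying the normalized score by 100 would give the 0-100 range mentioned above.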

In some possible implementations, the tag matching model further includes a representation layer comprising a user tower network and a tag tower network, the user tower network being used to generate the fused feature vector of the user and the tag tower network being used to generate the tag feature vector.

In the embodiment of the invention, the server can acquire the feature data set of the target user, the feature data set comprising a plurality of user feature vectors describing the features of the target user. The server then obtains, from the plurality of virtual class vectors, the virtual class vector with the highest similarity to the first tag feature vector, the first tag feature vector being the feature vector of any one of a plurality of predefined classification tags, and generates the fusion feature vector of the target user by using that virtual class vector and the plurality of user feature vectors. The matching degree between the target user and the classification tag corresponding to the first tag feature vector is then determined according to the fusion feature vector and the first tag feature vector. In this way, indirect interaction between the user features and the tag features is realized based on the virtual class vectors while the user network and the tag network remain decoupled, so that the user feature vector can learn part of the tag information. This improves the accuracy of predicting the matching degree between the user and each tag, keeps prediction efficient, and facilitates full and accurate mining of user interests.

Please refer to FIG. 2, which is a schematic flow chart of another data processing method based on deep learning according to an embodiment of the present invention. The method includes the following steps:

201. Obtain a feature dataset of a target user, the feature dataset comprising a plurality of user feature vectors describing features of the target user.

202. Obtain, from a plurality of virtual class vectors, the virtual class vector with the highest similarity to a first tag feature vector, wherein the first tag feature vector is a feature vector of any one of a plurality of predefined classification tags.

The specific implementation manner of the steps 201 to 202 may refer to the descriptions related to the steps 101 to 102 in the foregoing embodiments, which are not repeated here.

203. Determine, from a plurality of candidate feature vectors, a target candidate feature vector corresponding to the virtual class vector with the highest similarity, wherein each candidate feature vector in the plurality of candidate feature vectors is generated according to one of the plurality of virtual class vectors and the plurality of user feature vectors.

Specifically, the server may obtain the classification label corresponding to the first label feature vector and query the target index corresponding to that classification label from a mapping relation table of indexes and labels, where the table records, for each classification label, the index of the virtual class vector with the highest similarity to that label. The server then takes the virtual class vector corresponding to the target index among the plurality of virtual class vectors as the virtual class vector with the highest similarity to the first label feature vector. Querying the mapping relation table in this way allows the virtual class vector with the highest similarity to each label feature vector to be determined quickly. For example, suppose there are 100 classification labels and 5 virtual class vectors, denoted vc_1, vc_2, vc_3, vc_4 and vc_5, whose indexes (index) are 1, 2, 3, 4 and 5 respectively. The mapping relation table of indexes and labels then contains, for each of the 100 classification labels, the index value of its most similar virtual class vector. Assuming that the target index is 3, the virtual class vector vc_3 is the virtual class vector with the highest similarity to the first label feature vector.
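The table lookup in the example above can be sketched with a plain dictionary. The label names, the table contents and the toy vectors are hypothetical; the indexes are kept 1-based to match the vc_1 to vc_5 numbering.

```python
import numpy as np

# Hypothetical mapping relation table: classification label -> 1-based index of
# the virtual class vector most similar to that label.
index_by_label = {"sports": 3, "finance": 1, "health": 5}

# Rows stand in for vc_1 .. vc_5 (toy 2-dimensional values).
virtual_classes = np.arange(10.0).reshape(5, 2)


def lookup_virtual_class(label):
    """Fetch the most similar virtual class vector for a label without recomputing similarities."""
    target_index = index_by_label[label]
    return virtual_classes[target_index - 1]  # convert the 1-based index to a 0-based row


vc_for_sports = lookup_virtual_class("sports")
```

The lookup replaces a dot product against every virtual class vector with a single dictionary access, which is what makes the query fast.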

In some possible embodiments, the server may use an attention mechanism to process one virtual class vector together with the plurality of user feature vectors to obtain one candidate feature vector. When the attention is performed, the virtual class vector is used as the query and the plurality of user feature vectors are used as the keys and values. If there are n virtual class vectors, n candidate feature vectors can be obtained through the attention mechanism.

Taking the virtual class vector Vc_i and the user feature vectors (U_0, U_1, …, U_m) as an example, a specific implementation of the attention over the virtual class vector and the plurality of user feature vectors may be given by the following formula:

where Vc_Emb_i is the candidate feature vector generated for the virtual class vector Vc_i; n virtual class vectors correspondingly generate n candidate feature vectors.
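One common form of attention that is consistent with the roles described (Vc_i as query, the user feature vectors as keys and values) is unscaled dot-product attention with softmax weights. The expression below is an assumption offered for orientation, not necessarily the exact formula of the embodiment, which might for instance include a scaling factor or learned projections; the weights \alpha_{i,j} are notation introduced here.

```latex
\alpha_{i,j} = \frac{\exp\left(Vc_i \cdot U_j\right)}{\sum_{k=0}^{m} \exp\left(Vc_i \cdot U_k\right)},
\qquad
Vc\_Emb_i = \sum_{j=0}^{m} \alpha_{i,j}\, U_j
```

Under this assumed form, Vc_Emb_i is a weighted average of the user feature vectors in which the vectors most aligned with Vc_i receive the largest weights.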

In some possible embodiments, the server may first find the virtual class vector with the highest similarity and then perform attention only for that virtual class vector and the plurality of user feature vectors. That is, the attention network processes the virtual class vector with the highest similarity together with the plurality of user feature vectors to obtain the target candidate feature vector, and the fusion feature vector of the target user is then generated by using the target candidate feature vector and the plurality of user feature vectors.
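The two orderings (computing all n candidate feature vectors first, or attending only with the most similar virtual class vector) can be compared in a short sketch. Unscaled dot-product attention with softmax weights is assumed here; the exact attention form, the random toy data and the function names are illustrative only.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    weights = np.exp(x - np.max(x))
    return weights / weights.sum()


def attend(query, user_features):
    """Attention with a single query; the (m, d) user features act as keys and values."""
    weights = softmax(user_features @ query)  # (m,) attention weights
    return weights @ user_features            # (d,) weighted sum of the values


rng = np.random.default_rng(0)
user_features = rng.normal(size=(5, 32))    # m = 5 user feature vectors
virtual_classes = rng.normal(size=(5, 32))  # n = 5 virtual class vectors
tag_vector = rng.normal(size=32)

# Ordering 1: one candidate feature vector per virtual class vector.
candidates = np.stack([attend(vc, user_features) for vc in virtual_classes])

# Ordering 2: find the most similar virtual class vector first, then attend once.
best = int(np.argmax(virtual_classes @ tag_vector))
target_candidate = attend(virtual_classes[best], user_features)
```

Both orderings yield the same target candidate feature vector; the first is convenient when the candidate set is reused across many tags, while the second avoids computing candidates that are never selected.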

In some possible embodiments, the server may construct the mapping relation table of indexes and labels by calculating similarities. A specific implementation may include: for the tag feature vector corresponding to each classification tag in the plurality of classification tags, calculating the similarity between that tag feature vector and each of the plurality of virtual class vectors; obtaining the target virtual class vector with the highest similarity and the index of that target virtual class vector; and creating the mapping relation table of indexes and labels according to these indexes, so that the table contains each classification tag together with the index of its corresponding target virtual class vector.
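Building the table can be sketched as one dot-product-and-argmax pass per classification tag. The label names and toy vectors are hypothetical, and the stored indexes are 1-based to match the numbering used in the examples of this description.

```python
import numpy as np


def build_index_table(tag_vectors, virtual_classes):
    """Map each label to the 1-based index of its most similar virtual class vector."""
    table = {}
    for label, tag_vector in tag_vectors.items():
        similarities = virtual_classes @ tag_vector  # dot product with every virtual class
        table[label] = int(np.argmax(similarities)) + 1
    return table


virtual_classes = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
tag_vectors = {
    "sports": np.array([0.9, 0.1]),
    "finance": np.array([-0.5, 0.2]),
    "health": np.array([0.1, 0.8]),
}

index_table = build_index_table(tag_vectors, virtual_classes)
```

With these toy values the table maps "sports" to 1, "finance" to 3 and "health" to 2.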

In some possible implementations, the server may generate the virtual class vectors as follows: obtain the number and the dimension of the user feature vectors, generate a corresponding plurality of randomly initialized vectors according to that number and dimension, and use these randomly initialized vectors as the required virtual class vectors, thereby completing the generation and initialization of the virtual class vectors. For example, if each user has 5 user feature vectors of 32 dimensions, then 5 randomly initialized 32-dimensional vectors can be generated and used as the virtual class vectors.
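The 5-by-32 example can be sketched as below. Drawing from a normal distribution with a small standard deviation is an assumption; the text only requires random initialization matching the number and dimension of the user feature vectors.

```python
import numpy as np


def init_virtual_classes(user_features, seed=0):
    """Create one randomly initialised virtual class vector per user feature vector."""
    count, dim = user_features.shape
    rng = np.random.default_rng(seed)
    # The distribution and scale are illustrative choices, not taken from the text.
    return rng.normal(scale=0.1, size=(count, dim))


user_features = np.zeros((5, 32))  # stand-in for 5 user feature vectors of 32 dimensions
virtual_classes = init_virtual_classes(user_features)
```

In a trained model these vectors would presumably be updated together with the other model parameters.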

204. Generate a fusion feature vector of the target user by using the target candidate feature vector and the plurality of user feature vectors.

Specifically, the server may perform data processing on the plurality of user feature vectors by using the self-attention network to obtain a self-intersecting feature vector of the target user, and then perform fusion processing on the target candidate feature vector and the self-intersecting feature vector to obtain a fused feature vector of the target user. For example, an averaging (reduce mean) process may be performed on the target candidate feature vector and the self-intersecting feature vector to obtain a fused feature vector for the target user.
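The self-attention and reduce-mean steps might look like the sketch below. Single-head unscaled dot-product self-attention is assumed, and mean pooling is assumed as the way to collapse the attended vectors into a single self-intersecting feature vector, since the text does not say how that reduction is done.

```python
import numpy as np


def row_softmax(scores):
    """Numerically stable softmax applied to each row of a 2-D array."""
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    return weights / weights.sum(axis=1, keepdims=True)


def self_interaction(user_features):
    """Self-attention over the user feature vectors, mean-pooled to one vector (assumed)."""
    weights = row_softmax(user_features @ user_features.T)  # (m, m)
    attended = weights @ user_features                       # (m, d)
    return attended.mean(axis=0)                             # (d,)


def fuse(target_candidate, self_vector):
    """Reduce-mean fusion: element-wise average of the two vectors."""
    return np.mean(np.stack([target_candidate, self_vector]), axis=0)


rng = np.random.default_rng(1)
user_features = rng.normal(size=(5, 8))
self_vector = self_interaction(user_features)
fused = fuse(np.array([2.0, 4.0]), np.array([0.0, 0.0]))  # toy 2-dimensional inputs
```

Averaging keeps the fused vector in the same dimension as the tag feature vector, so the later dot product in the matching layer is well defined.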

205. Input the fusion feature vector and the first tag feature vector into a matching layer of a tag matching model to obtain an output result of the matching layer.

206. Determine the matching degree between the target user and the classification label corresponding to the first label feature vector according to the output result.

In some possible embodiments, if the target user is a user serving as a training sample, then when training the tag matching model the server may obtain exposure click data of the target user for the classification tag corresponding to the first tag feature vector. The exposure click data describes how the target user actually responded to the recommended content related to that classification tag, that is, whether the user clicked the recommended content. The server then obtains the loss value of the loss function of the tag matching model according to the exposure click data and the matching degree, and uses the loss value to adjust the model parameters of the tag matching model, for example by gradient descent, so as to complete the training of the tag matching model.
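A toy version of one such training update is sketched below. Binary cross-entropy on a sigmoid of the dot product is an assumed loss, since the text does not name the loss function, and the two vectors stand in for the model parameters that gradient descent would actually adjust.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bce_loss(pred, clicked):
    """Binary cross-entropy between the predicted matching degree and the click label."""
    eps = 1e-12
    return -(clicked * np.log(pred + eps) + (1 - clicked) * np.log(1 - pred + eps))


def sgd_step(fused, tag, clicked, lr=0.1):
    """One gradient-descent step; returns updated vectors and the loss before the step."""
    pred = sigmoid(fused @ tag)
    grad_logit = pred - clicked  # derivative of sigmoid + BCE with respect to the logit
    new_fused = fused - lr * grad_logit * tag
    new_tag = tag - lr * grad_logit * fused
    return new_fused, new_tag, bce_loss(pred, clicked)


fused = np.array([0.2, -0.1, 0.4])
tag = np.array([0.3, 0.5, -0.2])
clicked = 1.0  # exposure click data: the user clicked the recommended content

losses = []
for _ in range(3):
    fused, tag, loss = sgd_step(fused, tag, clicked)
    losses.append(loss)
```

Over the three steps the recorded loss decreases, which is the behaviour the parameter adjustment is meant to produce.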

In some possible embodiments, as shown in fig. 3a, the training process of the tag matching model may generate a plurality of user feature vectors from the feature input of the current user. The generated plurality of virtual class vectors (Virtual Cluster Embedding) comprises n virtual class vectors in total, namely vector 1, vector 2, ..., and vector n. Attention processing is performed on the plurality of user feature vectors based on each virtual class vector respectively (i.e., attention of the virtual cluster), and each virtual class vector corresponds to one aggregate vector (i.e., the candidate feature vector described above), so that n candidate feature vectors (VC User Embedding) are obtained, namely candidate feature vector 1, candidate feature vector 2, ..., and candidate feature vector n.

On the one hand, the user tower sub-network of the tag matching model performs self-intersecting processing on the plurality of user feature vectors (i.e., self-intersecting of the user features) to obtain the self-intersecting feature vector (User Interacted Embedding) of the user. On the other hand, the user tower sub-network adopts an attention mechanism to process each virtual class vector together with the plurality of user feature vectors to obtain the corresponding candidate feature vector. For a given tag feature vector, a dot product operation is performed between the tag feature vector and each virtual class vector to determine the virtual class vector most similar to the tag feature vector (i.e., the index of that virtual class vector); the corresponding target candidate feature vector is then selected from the n candidate feature vectors according to the index and serves as the feature vector through which the user interacts indirectly with the tag via the virtual class vector.

The user tower sub-network then performs reduce-mean (averaging) processing on the self-intersecting feature vector and the target candidate feature vector to obtain a fusion feature vector representing the user, and a dot product operation is performed on the fusion feature vector of the user and the tag feature vector to predict the matching degree between the user and the corresponding classification tag. The model parameters of the tag matching model are then optimized according to the predicted matching degree and the user's actual clicks on recommended content related to the classification tag, thereby training the tag matching model.
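The forward computation described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the softmax dot-product form of the attention, the element-wise mean used for fusion, and every function and variable name are assumptions, and the self-intersecting feature vector is taken as a given input rather than computed by an xDeepFM-style network.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, user_vecs):
    """Assumed attention form: weight each user feature vector by softmax(query . vec)."""
    weights = softmax([dot(query, v) for v in user_vecs])
    dim = len(user_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, user_vecs)) for i in range(dim)]

def most_similar_index(tag_vec, virtual_vecs):
    """Index of the virtual class vector with the largest dot product with the tag vector."""
    scores = [dot(tag_vec, vc) for vc in virtual_vecs]
    return max(range(len(scores)), key=scores.__getitem__)

def match_score(user_vecs, self_cross_vec, virtual_vecs, tag_vec):
    # One candidate feature vector per virtual class vector.
    candidates = [attend(vc, user_vecs) for vc in virtual_vecs]
    idx = most_similar_index(tag_vec, virtual_vecs)
    # Reduce-mean fusion of the self-intersecting vector and the target candidate.
    fused = [(a + b) / 2.0 for a, b in zip(self_cross_vec, candidates[idx])]
    return idx, dot(fused, tag_vec)

# Toy inputs (hypothetical values).
user_vecs = [[1.0, 0.0], [0.0, 1.0]]
self_cross_vec = [0.5, 0.5]
virtual_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
tag_vec = [0.0, 2.0]
idx, score = match_score(user_vecs, self_cross_vec, virtual_vecs, tag_vec)
```

In this toy case the second virtual class vector (zero-based index 1) is selected, and the score is the dot product of the fused vector with the tag feature vector. In training, such a score would be compared against the click label to adjust the model parameters.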

In addition, during training, after the virtual class vector with the highest similarity to the tag feature vector of each classification tag is found, the index of that virtual class vector can be obtained, and a mapping relation table of indexes and classification tags is created from these indexes. For example, suppose there are 100 classification tags and 5 virtual class vectors, denoted vc_1, vc_2, vc_3, vc_4, and vc_5, whose indexes are 1, 2, 3, 4, and 5 respectively; the mapping relation table then records, for each of the 100 classification tags, the value of the tag and the index of its corresponding virtual class vector.
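A minimal sketch of building such a mapping relation table is given below. The function name, the dictionary layout, and the toy vectors are assumptions made for illustration; the 1-based indexes follow the example above.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def build_index_table(tag_vectors, virtual_vecs):
    """Map each classification tag to the 1-based index of its most similar virtual class vector."""
    table = {}
    for label, tag_vec in tag_vectors.items():
        scores = [dot(tag_vec, vc) for vc in virtual_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        table[label] = best + 1  # indexes 1..n, as in the example (vc_1 .. vc_5)
    return table

# Five hypothetical virtual class vectors vc_1 .. vc_5 and three hypothetical tags.
virtual_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]]
tag_vectors = {
    "games": [2.0, -1.0],
    "cosmetics": [0.5, 3.0],
    "digital": [-1.0, -0.2],
}
index_table = build_index_table(tag_vectors, virtual_vecs)
```

Because the table is computed once after training, online prediction only needs a lookup instead of repeating the dot products for every request.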

In some possible embodiments, the use (i.e., online prediction) process of the tag matching model may be as shown in fig. 3b. After the feature data of a user is obtained, it is input into the tag matching model to obtain the self-intersecting feature vector (User Interacted Embedding) of the user and a candidate feature vector set (VC User Embedding Set). The candidate feature vector set includes a plurality of candidate feature vectors, each obtained by performing attention processing on one virtual class vector and the plurality of user feature vectors. For any classification tag, the index v_index of the virtual class vector with the highest similarity to the tag feature vector of that classification tag is obtained by querying the mapping relation table of indexes and classification tags, and the target candidate feature vector corresponding to v_index is determined from the candidate feature vector set. Fusion processing is then performed on the self-intersecting feature vector and the target candidate feature vector to obtain the fusion feature vector of the user, and a dot product is computed between the fusion feature vector and the tag feature vector of the classification tag to obtain the user's score for that classification tag. The classification tags can then be sorted by the user's score for each of them, so that the interests of the user are mined accurately.
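The online prediction flow can be sketched as follows, assuming the candidate feature vector set is keyed by the 1-based index of the virtual class vector and that fusion is an element-wise mean, as in the training description. All names and values are hypothetical.

```python
def score_labels(self_cross_vec, candidate_set, tag_vectors, index_table):
    """Score every classification tag for one user and return (tag, score) pairs, best first."""
    scores = {}
    for label, tag_vec in tag_vectors.items():
        v_index = index_table[label]          # lookup instead of recomputing similarity
        target = candidate_set[v_index]       # target candidate feature vector
        fused = [(a + b) / 2.0 for a, b in zip(self_cross_vec, target)]
        scores[label] = sum(f * t for f, t in zip(fused, tag_vec))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical per-user outputs of the user tower.
self_cross_vec = [1.0, 1.0]
candidate_set = {1: [1.0, 0.0], 2: [0.0, 1.0]}
# Hypothetical tag feature vectors and precomputed mapping relation table.
tag_vectors = {"games": [1.0, 0.0], "cosmetics": [0.0, 0.8], "travel": [0.5, 0.5]}
index_table = {"games": 1, "cosmetics": 2, "travel": 2}

ranking = score_labels(self_cross_vec, candidate_set, tag_vectors, index_table)
```

The user-side vectors are computed once per user, so scoring an additional tag costs one table lookup, one averaging step, and one dot product.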

In some possible embodiments, the overall process of matching the tag with the user may be as shown in fig. 3c, and mainly includes: sample extraction, model training, model prediction and result selection.

Specifically, after interest targeting is carried out for users, advertisement logs are obtained and a data set is constructed in which the interest of each user is labeled. The labeling format may be (user, tag, 0/1), where (user, tag, 0) indicates that the user did not click on the advertisement content corresponding to the tag, and (user, tag, 1) indicates that the user clicked on the advertisement content corresponding to the tag. A feature data set of the user is then constructed for model training by combining data such as the basic attributes, historical behaviors, and statistical features of the user. During model prediction, a feature data set of the user is likewise constructed from data such as the basic attributes, historical behaviors, and statistical features of the user and is input into the model for prediction.
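As an illustration of the (user, tag, 0/1) labeling format, the sketch below turns hypothetical exposure records into labeled samples. The log layout and the rule that any click among repeated exposures yields label 1 are assumptions, not details given in this description.

```python
def build_samples(ad_log):
    """ad_log: iterable of (user, tag, was_clicked) exposure records (hypothetical layout)."""
    clicked = {}
    for user, tag, was_clicked in ad_log:
        key = (user, tag)
        # Assumed merge rule: a pair is positive if any of its exposures was clicked.
        clicked[key] = clicked.get(key, False) or bool(was_clicked)
    return [(user, tag, 1 if c else 0) for (user, tag), c in clicked.items()]

ad_log = [
    ("u1", "games", False),
    ("u1", "games", True),
    ("u2", "games", False),
]
samples = build_samples(ad_log)
```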

In some possible embodiments, the network structure of the tag matching model may be as shown in fig. 3d. The tag matching model adopts a double-tower structure and comprises a user tower network and a tag tower network, wherein the user tower network is used for processing the input user features and virtual class vectors to generate the fusion feature vector of the user, and the tag tower network is used for generating tag feature vectors.

In some possible embodiments, the network structure of the user tower in the tag matching model may be as shown in fig. 3e. The user tower may specifically be based on an extreme deep factorization machine (xDeepFM) model. The user tower converts the features of each dimension of the user into corresponding feature vectors. On the one hand, the user tower performs self-intersecting processing on the plurality of user feature vectors to obtain the self-intersecting feature vector of the user; on the other hand, each virtual class vector and the plurality of user feature vectors are processed through an attention mechanism to obtain a plurality of candidate feature vectors, and the candidate feature vector corresponding to the virtual class vector with the maximum similarity to the tag feature vector is selected and fused with the self-intersecting feature vector to obtain the fusion feature vector of the user. It can be seen that the prediction of the fusion feature vector of the user is decoupled from the tag feature vector; while the user tower and the tag tower remain decoupled, interaction between the features of the two towers is realized based on the virtual class vectors, so that the accuracy of predicting the matching degree between the user and each tag is improved.

It should be noted that, compared with a prior-art double-tower model structure without virtual class vectors, the structure with the newly added virtual class vectors increases the amount of computation in only two places. First, the attention computation between the user feature vectors and the virtual class vectors is added when the user tower predicts the user fusion feature vector; this is a GPU-based matrix computation in both training and prediction, so the added computation is very limited. Second, the dot product computation between the tag feature vector and the virtual class vectors is added, and its cost is negligible. Therefore, indirect interaction between user features and tag features can be added based on the virtual class vectors while prediction remains efficient, and the accuracy of tag matching is effectively improved.

In the embodiment of the invention, the server can acquire the feature data set of the target user, the feature data set comprising a plurality of user feature vectors for describing the features of the target user. The server acquires, from the plurality of virtual class vectors, the virtual class vector with the highest similarity to the first tag feature vector, the first tag feature vector being the feature vector of any one of the predefined classification tags. The server determines, from the plurality of candidate feature vectors, the target candidate feature vector corresponding to the virtual class vector with the highest similarity, and generates the fusion feature vector of the target user by using the target candidate feature vector and the plurality of user feature vectors. The fusion feature vector and the first tag feature vector are input into a matching layer of the tag matching model to obtain an output result of the matching layer, and the matching degree between the target user and the classification tag corresponding to the first tag feature vector is determined according to the output result. In this way, indirect interaction between the user features and the tag features is realized based on the virtual class vectors, the accuracy of tag matching is improved, and the interests of the user are mined fully and accurately.

In some possible implementations, based on the data processing method provided by the foregoing embodiments, the embodiment of the present invention further provides a content recommendation method, which may specifically include the following steps:

(a) Obtaining the matching degree between the user and each of the predefined plurality of classification labels.

The specific implementation of the matching degree between the user and the classification label may refer to the related description in the foregoing embodiment, which is not repeated herein.

(b) Determining at least one classification label from the plurality of classification labels according to the matching degree, and taking the at least one classification label as the interest label of the user.

Specifically, the server may rank the matching degree corresponding to each classification label from high to low, determine at least one classification label ranked in front, for example, the classification label ranked in front three (top 3), and use the determined at least one classification label as the interest label of the user, so that the user interest can be fully and accurately discovered by using the matching degree determined by the method in the foregoing embodiment, and a good basis is laid for accurate content recommendation performed subsequently.
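The top-ranked selection in step (b) can be sketched as below. The function name and the toy matching degrees are hypothetical, and k = 3 follows the top-3 example in the text.

```python
def top_k_labels(match_degrees, k=3):
    """Return the k classification labels with the highest matching degree, best first."""
    ranked = sorted(match_degrees.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, _ in ranked[:k]]

# Hypothetical matching degrees of one user against five classification labels.
match_degrees = {"games": 0.91, "travel": 0.15, "cosmetics": 0.40, "digital": 0.77, "cars": 0.05}
interest_labels = top_k_labels(match_degrees, k=3)
```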

(c) Obtaining the classification label of the content to be recommended, and determining the recalled users from the user set by using the classification label of the content to be recommended and the interest label of each user.

The content to be recommended may be advertisement data of a certain commodity, and the commodity may be a physical commodity, such as a digital product, a cosmetic, or the like, or may be a virtual commodity, such as a game, a prop, or the like, which is not limited in the embodiment of the present invention.

Specifically, for the content to be recommended, the server may recall users according to the classification label of the content to be recommended, that is, select the users possibly interested in the content to be recommended from the plurality of users included in the user set. For example, the interest labels of each user in the user set may be compared with the classification label of the content to be recommended; if the interest labels include the classification label of the content to be recommended, or if the similarity between an interest label and the classification label of the content to be recommended is high (for example, reaches a preset similarity threshold), the corresponding user is taken as a recalled user. In this way, all users to be recalled are determined from the user set.
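A minimal sketch of this recall rule follows. The similarity function, the threshold value of 0.8, and all names are assumptions for illustration; the description only states that a preset similarity threshold may be used.

```python
def recall_users(user_interest_labels, content_label, similarity=None, threshold=0.8):
    """Recall users whose interest labels contain, or are similar enough to, the content label."""
    recalled = []
    for user, labels in user_interest_labels.items():
        if content_label in labels:
            recalled.append(user)
        elif similarity is not None and any(
            similarity(label, content_label) >= threshold for label in labels
        ):
            recalled.append(user)
    return recalled

def toy_similarity(a, b):
    """Hypothetical label similarity backed by a tiny lookup table."""
    pairs = {frozenset(("mobile games", "games")): 0.9}
    return pairs.get(frozenset((a, b)), 0.0)

users = {"u1": ["games", "travel"], "u2": ["mobile games"], "u3": ["cosmetics"]}
recalled = recall_users(users, "games", similarity=toy_similarity, threshold=0.8)
```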

(d) Pushing the content to be recommended to the terminal equipment corresponding to the recalled users.

Specifically, after determining the recalled user, the server can push the content to be recommended to the terminal equipment corresponding to the recalled user, so that the content is pushed directionally and accurately.

In some possible embodiments, after determining the recalled users, the server may perform operations such as coarse ranking and fine ranking on the recalled users to further screen them, where coarse ranking refers to screening users with fewer features (e.g., a portion of the features of the user and of the content to be recommended), and fine ranking refers to screening users with more features (e.g., all of the features of the user and of the content to be recommended). Specifically, in the coarse ranking stage, the server may input part of the features of each recalled user and part of the features of the content to be recommended into a scoring model to obtain a score for the matching degree between each recalled user and the content to be recommended, and select the higher-ranked portion of the recalled users according to the scores, thereby screening users through coarse ranking. In the fine ranking stage, for the users screened through coarse ranking, the server may input all of the features of each user and all of the features of the content to be recommended into a scoring model to obtain a score for the matching degree between each screened user and the content to be recommended, and select the higher-ranked portion of the screened users according to the scores, thereby screening users through fine ranking. The content to be recommended can then be pushed to the terminal equipment corresponding to the users remaining after fine ranking. Coarse ranking and fine ranking narrow the range of push targets, which further improves the accuracy and the success rate of content recommendation.
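The two-stage screening can be sketched as a simple cascade. The scoring callables stand in for the scoring model applied to partial and full features respectively; they, the cut-off sizes, and the toy scores are hypothetical.

```python
def cascade_rank(users, coarse_score, fine_score, coarse_keep, fine_keep):
    """Coarse ranking keeps the top coarse_keep users; fine ranking keeps the top fine_keep of those."""
    coarse = sorted(users, key=coarse_score, reverse=True)[:coarse_keep]
    return sorted(coarse, key=fine_score, reverse=True)[:fine_keep]

# Hypothetical scores: coarse uses part of the features, fine uses all of them.
coarse_scores = {"u1": 0.9, "u2": 0.2, "u3": 0.7, "u4": 0.6}
fine_scores = {"u1": 0.4, "u2": 1.0, "u3": 0.8, "u4": 0.9}

final_users = cascade_rank(
    ["u1", "u2", "u3", "u4"],
    coarse_score=coarse_scores.get,
    fine_score=fine_scores.get,
    coarse_keep=3,
    fine_keep=2,
)
```

Note that u2 is dropped in the coarse stage and is therefore never evaluated by the more expensive fine stage, which is the cost trade-off such a cascade makes.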

Referring to fig. 4, which is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention, the apparatus includes:

an obtaining module 401, configured to obtain a feature data set of a target user, where the feature data set includes a plurality of user feature vectors for describing features of the target user;

the obtaining module 401 is further configured to obtain a virtual class vector with highest similarity to a first tag feature vector from a plurality of virtual class vectors, where the first tag feature vector is a feature vector of any one of a predefined plurality of classification tags, and the plurality of virtual class vectors are generated according to the number and dimensions of the plurality of user feature vectors;

a generating module 402, configured to generate a fusion feature vector of the target user by using the virtual class vector with the highest similarity and the plurality of user feature vectors;

a determining module 403, configured to determine, according to the fusion feature vector and the first tag feature vector, a matching degree between the target user and the classification tag corresponding to the first tag feature vector.

Optionally, the determining module 403 is specifically configured to:

inputting the fusion feature vector and the first tag feature vector into a matching layer of a tag matching model to obtain an output result of the matching layer;

and determining the matching degree between the target user and the classification label corresponding to the first label feature vector according to the output result.

Optionally, the generating module 402 is specifically configured to:

determining a target candidate feature vector corresponding to the virtual class vector with the highest similarity from a plurality of candidate feature vectors, wherein each candidate feature vector in the plurality of candidate feature vectors is generated according to one of the plurality of virtual class vectors and the plurality of user feature vectors;

and generating a fusion feature vector of the target user by utilizing the target candidate feature vector and the plurality of user feature vectors.

Optionally, the generating module 402 is further configured to, for any one of the plurality of virtual class vectors, perform data processing on that virtual class vector and the plurality of user feature vectors by using an attention network, so as to obtain the plurality of candidate feature vectors.

Optionally, the generating module 402 is specifically configured to:

performing data processing on the virtual class vector with the highest similarity and the plurality of user feature vectors by using an attention network to obtain a target candidate feature vector.

Optionally, the generating module 402 is specifically configured to:

performing data processing on the plurality of user feature vectors by using a self-attention network to obtain self-intersecting feature vectors of the target user;

and carrying out fusion processing on the target candidate feature vector and the self-intersecting feature vector to obtain a fusion feature vector of the target user.

Optionally, the obtaining module 401 is specifically configured to:

obtaining a classification label corresponding to the first label feature vector;

querying a target index corresponding to the classification label from a mapping relation table of indexes and labels, wherein the mapping relation table of indexes and labels comprises each classification label in the plurality of classification labels and the index of its corresponding virtual class vector with the highest similarity;

and taking the virtual class vector corresponding to the target index in the plurality of virtual class vectors as the virtual class vector with the highest similarity with the first tag feature vector.

Optionally, the obtaining module 401 is further configured to obtain, for a tag feature vector corresponding to each of the plurality of classification tags, a target virtual class vector with the highest similarity with the tag feature vector from a plurality of virtual class vectors, and obtain an index of the target virtual class vector;

The generating module 402 is further configured to create an index-tag mapping table according to the index of the target virtual class vector, where the index-tag mapping table includes the index of each classification tag and the index of the corresponding target virtual class vector.

Optionally, the obtaining module 401 is further configured to:

acquiring exposure click data of the target user on the classification label corresponding to the first label feature vector;

acquiring a loss value of a loss function of the tag matching model according to the exposure click data and the matching degree;

and adjusting model parameters of the tag matching model by using the loss value to finish training the tag matching model.
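The description does not name the loss function. As one common choice for click labels, the sketch below assumes a sigmoid over the matching degree followed by binary cross-entropy; the function names and the small epsilon guard are illustrative assumptions.

```python
import math

def sigmoid(x):
    """Logistic function mapping a raw matching degree to a click probability."""
    return 1.0 / (1.0 + math.exp(-x))

def bce_loss(match_degree, clicked, eps=1e-12):
    """Binary cross-entropy between the predicted click probability and the 0/1 click label."""
    p = sigmoid(match_degree)
    return -(clicked * math.log(p + eps) + (1 - clicked) * math.log(1.0 - p + eps))

# A high matching degree is penalized lightly if the user clicked, heavily if not.
loss_clicked = bce_loss(3.0, 1)
loss_not_clicked = bce_loss(3.0, 0)
```

The resulting loss value would then drive a gradient-based update of the model parameters, including the virtual class vectors.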

Optionally, the obtaining module 401 is specifically configured to:

acquiring exposure click data and feature data of a target user, wherein the feature data comprises one or more of basic attribute features, historical behavior features and statistical features;

and generating a characteristic data set of the target user according to the exposure click data and the characteristic data.

Optionally, the obtaining module 401 is further configured to obtain the number and dimensions of the plurality of user feature vectors;

The generating module 402 is further configured to generate a plurality of randomly initialized vectors according to the number and the dimension, and use the plurality of randomly initialized vectors as the plurality of virtual class vectors.
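A minimal sketch of the random initialization is shown below. The dimension is taken from the user feature vectors so that the attention and dot product operations are well defined. How the count of virtual class vectors follows from the number of user feature vectors is not spelled out here, so the count is a caller-supplied value in this sketch; the Gaussian distribution, scale, and seed are likewise assumptions.

```python
import random

def init_virtual_class_vectors(user_vecs, n_virtual, seed=0, scale=0.1):
    """Create n_virtual randomly initialized vectors with the same dimension as the user feature vectors."""
    dim = len(user_vecs[0])
    rng = random.Random(seed)  # seeded for reproducibility
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_virtual)]

# Three hypothetical user feature vectors of dimension 4.
user_vecs = [[0.0] * 4, [1.0] * 4, [0.5] * 4]
virtual_vecs = init_virtual_class_vectors(user_vecs, n_virtual=5, seed=42)
```

These vectors would be treated as trainable parameters and updated together with the rest of the model.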

Optionally, the tag matching model further includes a representation layer, where the representation layer includes a user tower network and a tag tower network, the user tower network is used to obtain a fused feature vector of the user, and the tag tower network is used to obtain a tag feature vector.

It should be noted that, the functions of each functional module of the data processing apparatus according to the embodiments of the present invention may be specifically implemented according to the method in the embodiments of the method, and the specific implementation process may refer to the related description of the embodiments of the method, which is not repeated herein.

Referring to fig. 5, which is a schematic structural diagram of a server according to an embodiment of the present invention, the server includes structures such as a power module, as well as a processor 501, a storage device 502, and a network interface 503. Data may be exchanged among the processor 501, the storage device 502, and the network interface 503.

The storage device 502 may include volatile memory, such as random-access memory (RAM); the storage device 502 may also include non-volatile memory, such as flash memory or a solid-state drive (SSD); the storage device 502 may also include a combination of the above types of memory.

The processor 501 may be a central processing unit (CPU). In one embodiment, the processor 501 may also be a graphics processing unit (GPU). The processor 501 may also be a combination of a CPU and a GPU. In one embodiment, the storage device 502 is configured to store program instructions. The processor 501 may call the program instructions to perform the following operations:

Optionally, the processor 501 is specifically configured to:

Optionally, the processor 501 is further configured to:

for any virtual class vector in the plurality of virtual class vectors, performing data processing on that virtual class vector and the plurality of user feature vectors by using an attention network, so as to obtain the plurality of candidate feature vectors.

Optionally, the processor 501 is specifically configured to:

Optionally, the processor 501 is further configured to:

for the label feature vector corresponding to each classification label in the plurality of classification labels, obtaining a target virtual class vector with the highest similarity to the label feature vector from the plurality of virtual class vectors;

acquiring an index of the target virtual class vector;

and creating a mapping relation table of indexes and labels according to the index of the target virtual class vector, wherein the mapping relation table of indexes and labels comprises each classification label and the index of its corresponding target virtual class vector.

Optionally, the processor 501 is further configured to:

Optionally, the processor 501 is specifically configured to:

Optionally, the processor 501 is further configured to:

acquiring the number and the dimension of the plurality of user feature vectors;

generating a plurality of randomly initialized vectors according to the number and the dimension, and taking the plurality of randomly initialized vectors as the plurality of virtual class vectors.

In specific implementation, the processor 501, the storage device 502 and the network interface 503 described in the embodiments of the present invention may perform the implementation described in the related embodiments of the deep learning-based data processing method provided in fig. 1 and 2, and may also perform the implementation described in the related embodiments of the data processing device provided in fig. 4, which are not repeated herein.

Those skilled in the art will appreciate that all or part of the processes in the methods of the embodiments described above may be implemented by a computer program instructing related hardware. The computer program comprises one or more instructions and may be stored in a computer storage medium, and when executed, the program may comprise the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random-access memory (RAM), or the like.

Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the steps performed in the embodiments of the methods described above.

The foregoing disclosure is only illustrative of some of the embodiments of the present application and is not to be construed as limiting the scope of the appended claims; all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims

1. A data processing method based on deep learning, the method comprising:

2. The method of claim 1, wherein generating the fused feature vector for the target user using the highest similarity virtual class vector and the plurality of user feature vectors comprises:

3. The method according to claim 2, wherein before determining the target candidate feature vector corresponding to the virtual class vector with the highest similarity from the plurality of candidate feature vectors, the method further comprises:

4. The method of claim 1, wherein generating the fused feature vector for the target user using the highest similarity virtual class vector and the plurality of user feature vectors comprises:

5. The method of claim 2, wherein the generating the fused feature vector for the target user using the target candidate feature vector and the plurality of user feature vectors comprises:

6. The method of claim 1, wherein the obtaining the virtual class vector with the highest similarity to the first tag feature vector from the plurality of virtual class vectors comprises:

7. The method of claim 6, wherein prior to obtaining the virtual class vector having the highest similarity to the first tag feature vector from the plurality of virtual class vectors, the method further comprises:

acquiring an index of the target virtual class vector;

8. The method according to claim 1, wherein the method further comprises:

9. The method of claim 8, wherein the acquiring the feature data set of the target user comprises:

10. The method of claim 1, wherein prior to obtaining the virtual class vector having the highest similarity to the first tag feature vector from the plurality of virtual class vectors, the method further comprises:

11. The method of claim 1, wherein the tag matching model further comprises a representation layer comprising a user tower network and a tag tower network, the user tower network for obtaining a fused feature vector of a user, the tag tower network for obtaining a tag feature vector.

12. A data processing apparatus, the apparatus comprising:

the determining module is used for inputting the fusion feature vector and the first tag feature vector into a matching layer of a tag matching model to obtain an output result of the matching layer; and determining the matching degree between the target user and the classified label corresponding to the first label feature vector according to the output result.

13. A server comprising a processor, a network interface and a storage device, the processor, the network interface and the storage device being interconnected, wherein the network interface is controlled by the processor to receive and transmit data, the storage device being configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the deep learning based data processing method of any of claims 1-11.

14. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions for execution by a processor for performing the deep learning based data processing method according to any one of claims 1 to 11.