CN111368205A - Data recommendation method and device, computer equipment and storage medium - Google Patents

Data recommendation method and device, computer equipment and storage medium

Info

Publication number
CN111368205A
CN111368205A CN202010159124.XA
Authority
CN
China
Prior art keywords
data
nodes
user
sample
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010159124.XA
Other languages
Chinese (zh)
Other versions
CN111368205B (en)
Inventor
刘巍
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010159124.XA priority Critical patent/CN111368205B/en
Publication of CN111368205A publication Critical patent/CN111368205A/en
Application granted granted Critical
Publication of CN111368205B publication Critical patent/CN111368205B/en
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/958Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Abstract

The embodiment of the application discloses a data recommendation method and apparatus, a computer device and a storage medium, belonging to the field of computer technology. The method comprises: encoding a graph network based on a coding model to obtain feature vectors of a target user node and a plurality of data nodes; obtaining, based on a classification model, the association degree between each of the plurality of data nodes and the target user node according to those feature vectors; determining a target data node among the plurality of data nodes according to the obtained association degrees; and recommending the target data to the target user. The method improves the accuracy of the acquired feature vectors; because the target data node is determined for the target user node through the association degrees between the target user node and the data nodes, it can be determined accurately, improving both the accuracy of the determined target data node and the accuracy of data recommendation.

Description

Data recommendation method and device, computer equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a data recommendation method and device, computer equipment and a storage medium.
Background
The emergence and popularization of the internet have brought users a huge amount of information, largely meeting their information needs. However, as the amount of information grows sharply, users can no longer pick out the information they actually need from it, so the required information is generally recommended to them.
The related art provides a data recommendation method that recommends similar data to a user according to the user's history of purchased data. Because this recommendation method is simplistic, the accuracy of its data recommendation is poor.
Disclosure of Invention
The embodiment of the application provides a data recommendation method and device, computer equipment and a storage medium, and the accuracy of data recommendation can be improved. The technical scheme is as follows:
in one aspect, a data recommendation method is provided, the method including:
acquiring a graph network, wherein the graph network comprises a plurality of user nodes and a plurality of data nodes;
coding the graph network based on a coding model to obtain feature vectors of a target user node and the plurality of data nodes, wherein the target user node is any user node in the plurality of user nodes;
based on a classification model, respectively obtaining the association degrees of the plurality of data nodes and the target user node according to the feature vectors of the target user node and the plurality of data nodes;
and determining a target data node in the plurality of data nodes according to the obtained plurality of association degrees, and recommending the target data to a target user.
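The four claimed steps (encode the graph, score each data node against the target user node, select targets, recommend) can be pictured with a minimal sketch. The scoring function below is only a stand-in for the classification model, which the claims leave unspecified; the sigmoid-of-dot-product form and the threshold value are illustrative assumptions.

```python
import numpy as np

def recommend(user_vec, data_vecs, threshold=0.5):
    """Score each data node against the target user node and keep
    those whose association degree exceeds the threshold."""
    # Association degree: a stand-in classifier (sigmoid of dot product).
    scores = 1.0 / (1.0 + np.exp(-(data_vecs @ user_vec)))
    return [i for i, s in enumerate(scores) if s > threshold]

# Toy feature vectors (in the patent these come from the coding model).
user = np.array([1.0, 0.0])
items = np.array([[2.0, 0.0],   # strongly associated
                  [-2.0, 0.0],  # weakly associated
                  [0.5, 1.0]])
print(recommend(user, items))  # → [0, 2]
```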
In another aspect, a data recommendation apparatus is provided, the apparatus including:
a first acquisition module, configured to acquire a graph network, the graph network comprising a plurality of user nodes and a plurality of data nodes;
a first encoding processing module, configured to encode the graph network based on a coding model to obtain feature vectors of a target user node and the plurality of data nodes, the target user node being any one of the plurality of user nodes;
an association degree obtaining module, configured to obtain, based on a classification model, the association degree between each of the plurality of data nodes and the target user node according to the feature vectors of the target user node and the plurality of data nodes;
and a data recommendation module, configured to determine a target data node among the plurality of data nodes according to the obtained association degrees, and to recommend the target data to a target user.
Optionally, the coding model training module includes:
a graph network processing unit, configured to process the first sample graph network and the second sample graph network with a preset loss function to obtain a loss value;
and a coding model training unit, configured to train the coding model according to the loss value in response to the loss value being greater than a preset threshold.
Optionally, the apparatus further comprises:
the third acquisition module is used for acquiring the trained coding model;
a fourth obtaining module, configured to obtain a third sample graph network and multiple sample node sets, where the third sample graph network includes multiple sample user nodes and multiple sample data nodes, and each sample node set includes one sample user node, a positive sample node connected to the sample user node, and a negative sample node not connected to the sample user node;
a third encoding processing module, configured to perform encoding processing on the third sample graph network based on an encoding model, to obtain a feature vector of each node in the third sample graph network;
and a classification model training module, configured to train the classification model according to the feature vectors of the sample user nodes, the positive sample nodes and the negative sample nodes in the plurality of sample node sets.
Optionally, the data recommendation module includes:
and the data node determining unit is used for determining the data node with the association degree larger than a preset threshold value in the plurality of data nodes as the target data node.
In another aspect, a computer device is provided, which comprises a processor and a memory, wherein at least one program code is stored in the memory, and the at least one program code is loaded and executed by the processor to implement the data recommendation method according to the above aspect.
In another aspect, a computer-readable storage medium is provided, in which at least one program code is stored, the at least one program code being loaded and executed by a processor to implement the data recommendation method according to the above aspect.
The beneficial effects brought by the technical scheme provided by the embodiment of the application at least comprise:
the method, the device, the computer equipment and the storage medium provided by the embodiment of the application acquire the graph network, encode the graph network based on the encoding model, acquire the feature vectors of the target user node and the plurality of data nodes, improve the accuracy of the acquired feature vectors, respectively acquire the association degrees of the plurality of data nodes and the target user node according to the feature vectors of the target user node and the plurality of data nodes based on the classification model, determine the target data node in the plurality of data nodes according to the acquired association degrees, accurately determine the target data node for the target user node through the association degrees of the target user node and the plurality of data nodes, improve the accuracy of the determined target data node, recommend the target data to the target user, and accordingly improve the accuracy of data recommendation.
Moreover, the feature vectors of the target user node and the plurality of data nodes are obtained through a first coding sub-model and a second coding sub-model, which avoids over-fitting of the feature vectors, improves the stability of the model, and improves the accuracy of the feature vectors. The first feature vector is obtained through the first coding sub-model, so that the feature information of adjacent nodes is merged into each node's feature vector with different weights; data can therefore be recommended to the target user in a more targeted way based on the obtained feature vectors, improving the accuracy of data recommendation.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from them without creative effort.
Fig. 1 is a flowchart of a data recommendation method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a user interacting with an application server according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a graph network provided by an embodiment of the present application;
FIG. 4 is a flowchart of a coding model training method provided in an embodiment of the present application;
Fig. 5 is a flowchart of obtaining a second sample graph network according to an embodiment of the present application;
Fig. 6 is a flowchart of obtaining a second sample graph network according to an embodiment of the present application;
FIG. 7 is a flowchart of a classification model training method provided by an embodiment of the present application;
FIG. 8 is a flowchart of a data recommendation method provided in an embodiment of the present application;
fig. 9 is a schematic structural diagram of a data recommendation device according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a data recommendation device according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a terminal according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application more clear, the embodiments of the present application will be further described in detail with reference to the accompanying drawings.
The terms "first," "second," "third," and the like as used herein may be used herein to describe various concepts that are not limited by these terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, a first sample graph network may be referred to as a second sample graph network, and similarly, a second sample graph network may be referred to as a first sample graph network, without departing from the scope of the present application.
As used herein, "plurality" includes two or more; "each" refers to each one of the corresponding plurality; and "any" refers to any one of the plurality. For example, if the plurality of elements includes 3 elements, "each" refers to each of the 3 elements, and "any" refers to any one of the 3 elements, which may be the first, the second, or the third.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how a computer simulates or realizes human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental way to make computers intelligent, and is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction. According to the scheme provided by the embodiment of the application, the coding model and the classification model can be trained based on the machine learning technology of artificial intelligence, and the trained coding model and classification model are used to realize the method for recommending data to the user.
Deep learning: deep learning is a method based on characterization learning of data in machine learning. The learning samples may be represented in a variety of ways, such as identified by a vector, or more abstractly as a series of edges, specially shaped regions, and so forth. Tasks (e.g., face recognition or facial expression recognition) are more easily learned from the examples using some specific representation methods. The benefit of deep learning is to replace the manual feature acquisition with unsupervised or semi-supervised feature learning and hierarchical feature extraction efficient algorithms.
Unsupervised learning: in real life, problems often arise in which sufficient prior knowledge is lacking, so manual labeling is difficult or too costly. Naturally, we hope that computers can take over these tasks on our behalf, or at least provide some assistance. Solving various problems in pattern recognition from training samples whose classes are unknown (unlabeled) is referred to as unsupervised learning.
Recommendation system: a recommendation system is a personalized information recommendation system that recommends information, data and the like that may interest a user, according to the user's information needs, interests, and so on, for example recommending data to the user and helping the user complete the purchase of that data. Personalized recommendation recommends data and information of interest to the user according to the user's interest characteristics and historical behaviors. Recommendation systems are now widely used in many areas; the most typical of these, with good development and application prospects, is the field of e-commerce. Meanwhile, academic research interest in recommendation systems has remained consistently high, and the topic has gradually formed an independent discipline.
The data recommendation method provided by the embodiment of the application can be used in a computer device. The computer device includes a terminal or a server; the terminal may be any of various terminals such as a mobile phone, a computer, or a tablet computer, and the server may be a single server, a server cluster composed of multiple servers, or a cloud-computing server center.
The method provided by the embodiment of the application can be applied to scenarios of recommending data to users.
For example, in a movie recommendation scenario:
the computer equipment acquires the graph network through the movie playing application, and by adopting the data recommendation method provided by the embodiment of the application, a plurality of movies in the movie playing application are recommended for the target user, so that the recommended movies meet the preference of the user, and accurate information recommendation is realized.
For another example, in a cell phone recommendation scenario:
when a user purchases a mobile phone through a shopping website, after computer equipment acquires a graph network, the data recommendation method provided by the embodiment of the application is adopted to recommend the mobile phone to the user according to the preference of the user and a plurality of types of mobile phones in the shopping website, so that the recommended data meets the requirements of the user, and the attraction to the user is improved.
Fig. 1 is a flowchart of a data recommendation method provided in an embodiment of the present application, and is applied to a computer device, as shown in fig. 1, the method includes:
101. the computer device obtains a plurality of operation records.
Each operation record comprises a user identifier and a data identifier, indicating that the user corresponding to the user identifier has executed a preset operation on the data corresponding to the data identifier. The user identifier may be a user account number, a telephone number, a user nickname, or the like. The data identifier may be a data name, a data code, etc., and the data may be article data, movie data, song data, etc. The preset operation is, for example, an operation of the user viewing the data, adding the data to a shopping cart, or purchasing the data.
In one possible implementation, the user identification and the data identification included in different operation records are not identical. For example, in the obtained multiple operation records, a first operation record includes a user identifier 1 and a data identifier a, a second operation record includes a user identifier 2 and a data identifier B, a third operation record includes a user identifier 3 and a data identifier a, a fourth operation record includes a user identifier 2 and a data identifier C, the user identifiers and the data identifiers included in the first operation record and the second operation record are different, the user identifiers included in the first operation record and the third operation record are different, the data identifiers are the same, the user identifiers included in the second operation record and the fourth operation record are the same, and the data identifiers are different.
In addition, each operation record may further include user information corresponding to the user identifier and data information corresponding to the data identifier. The user information may be age, height, hobbies, place of residence, friend relationships, and the like, and the data information may be a data type, a data price, a data usage, and the like. For example, one operation record includes a user identifier, a data identifier, the user information of the user identifier, and the data information of the data identifier. The user information includes "female, place of residence XX, born in the 1980s, commutes to work, frequently watches movies, invests in stocks, and XX is a friend", and the data information includes "movie name, comedy type, movie duration, movie ticket price".
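A record with the fields described above might be modeled as follows; the class and field names are illustrative, not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class OperationRecord:
    user_id: str            # e.g. account number, phone number, or nickname
    data_id: str            # e.g. data name or data code
    user_info: dict = field(default_factory=dict)  # age, hobbies, friends...
    data_info: dict = field(default_factory=dict)  # type, price, usage...

rec = OperationRecord("user-1", "movie-A",
                      user_info={"friends": ["user-3"]},
                      data_info={"type": "comedy"})
print(rec.user_id, rec.data_id)  # → user-1 movie-A
```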
For the manner of obtaining the operation record, in a possible implementation manner, the computer device is an application server, and in response to a user corresponding to the user identifier executing a preset operation on data corresponding to the data identifier, the application server generates an operation record, where the operation record includes the user identifier and the data identifier.
For example, the terminal logs in the user identifier, and the terminal establishes a connection with the application server. The terminal sends a data information query request to the application server, the data information query request carries the user identifier and the data identifier, the application server sends the data information corresponding to the data identifier in the database to the terminal according to the data identifier, and the application server generates an operation record, wherein the operation record comprises the user identifier and the data identifier. As shown in fig. 2, if a plurality of user identifiers 201 perform a preset operation on a data identifier in an application, the application server generates a plurality of operation records, where each operation record includes a user identifier and a data identifier.
102. The computer equipment respectively creates a plurality of user nodes and a plurality of data nodes in the graph network according to a plurality of user identifications and a plurality of data identifications in a plurality of operation records, and connects the user nodes belonging to the same operation record with the data nodes to obtain the graph network.
The graph network is a representation of the connection relationships among a plurality of nodes. Because different operation records can include the same user identifier or the same data identifier, when creating nodes only one user node is created per distinct user identifier and only one data node per distinct data identifier, yielding a plurality of different user nodes and a plurality of different data nodes. Then, according to the plurality of operation records, the user node and the data node corresponding to the identifiers in the same operation record are connected, thereby obtaining the graph network.
In a possible implementation manner, the operation record further includes user information of a user identifier and data information of a data identifier, where the user information includes other user identifiers that are in a friend relationship with the user identifier, and the data information includes a data type, then the step 102 may include: respectively creating a plurality of user nodes and a plurality of data nodes in the graph network according to the plurality of user identifications and the plurality of data identifications, connecting the user nodes belonging to the same operation record with the data nodes, connecting the data nodes belonging to the same data type according to the user information of the plurality of user identifications and the data information of the plurality of data identifications, and connecting the user nodes belonging to the friend relationship to obtain the graph network.
For example, the first operation record includes a user identifier 1 and a data identifier a, the second operation record includes a user identifier 2 and a data identifier B, the third operation record includes a user identifier 3 and a data identifier a, the fourth operation record includes a user identifier 2 and a data identifier C, the fifth operation record includes a user identifier 3 and a data identifier C, and the user identifier 1 and the user identifier 3 are in a friend relationship, and the data identifier B and the data identifier C belong to the same data type, then 3 user nodes are created through the user identifier 1, the user identifier 2 and the user identifier 3, 3 data nodes are created through the data identifier a, the data identifier B and the data identifier C, the user nodes belonging to the same operation record are connected with the data nodes, the user nodes belonging to the friend relationship are connected, the data nodes belonging to the same data type are connected, the resulting graph network is shown in FIG. 3. User identifier 1 corresponds to user node 301, user identifier 2 corresponds to user node 302, user identifier 3 corresponds to user node 303, data identifier a corresponds to data node 304, data identifier B corresponds to data node 305, and data identifier C corresponds to data node 306.
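The construction described above (one node per distinct identifier, edges for same-record user-data pairs, friend pairs, and same-type data pairs) can be sketched as a plain adjacency map; `build_graph` and its tuple node labels are illustrative, not from the patent.

```python
def build_graph(records, friends=(), same_type=()):
    """Create one node per distinct user/data identifier and connect
    user-data pairs from the same record, friend pairs, and
    same-type data pairs (undirected adjacency sets)."""
    adj = {}
    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    for user_id, data_id in records:
        link(("user", user_id), ("data", data_id))
    for u1, u2 in friends:
        link(("user", u1), ("user", u2))
    for d1, d2 in same_type:
        link(("data", d1), ("data", d2))
    return adj

# The five operation records, friend pair, and same-type pair from the example.
g = build_graph([(1, "A"), (2, "B"), (3, "A"), (2, "C"), (3, "C")],
                friends=[(1, 3)], same_type=[("B", "C")])
print(len(g))  # → 6 (3 user nodes + 3 data nodes)
```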
103. And the computer equipment carries out coding processing on the graph network based on the coding model to obtain the characteristic vectors of the target user node and the plurality of data nodes.
The target user node is any user node in the plurality of user nodes. The feature vector of the user node is used for representing the vector of the user feature information, the feature vector of the data node is used for representing the vector of the data feature information, and the feature vector of the user node and the feature vector of the data node can comprise a plurality of dimensions.
The coding model is used for acquiring the characteristic vectors of the user nodes and the data nodes in the graph network, and the coding model aggregates the characteristic information between adjacent nodes according to the connection relation between the nodes in the graph network, so that the characteristic vectors of the target user nodes and the plurality of data nodes are obtained.
In one possible implementation, the step 103 may include: and coding the graph network based on the coding model to obtain the characteristic vector of each user node and the characteristic vector of each data node in the graph network.
In the embodiment of the application, the graph network is encoded based on the encoding model, and the feature vector of the target user node and the feature vectors of the data nodes are obtained, wherein the target user node can be any user node in the graph network. And coding the graph network based on the coding model, and acquiring the feature vector of each user node and the feature vector of each data node in the graph network.
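The aggregation of feature information between adjacent nodes that the coding model performs can be illustrated with a single simplified propagation step; a plain mean over neighbours stands in for the model's actual weighted aggregation, which the patent describes only in later passages.

```python
import numpy as np

def aggregate(features, adj):
    """One propagation step: each node's new vector is the mean of its own
    feature vector and its neighbours' (a simplified GCN-style update)."""
    out = {}
    for node, vec in features.items():
        neigh = [features[n] for n in adj.get(node, ())]
        out[node] = np.mean([vec, *neigh], axis=0)
    return out

feats = {"u1": np.array([1.0, 0.0]),   # a user node
         "dA": np.array([0.0, 1.0])}   # a data node connected to it
adj = {"u1": ["dA"], "dA": ["u1"]}
new = aggregate(feats, adj)
print(new["u1"])  # mean of [1, 0] and [0, 1]
```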
In addition, before step 103, the coding model needs to be trained, so that the graph network can be coded according to the trained coding model when step 103 is executed. In one possible implementation, as shown in fig. 4, the training process of the coding model may include the following steps 1031-1034:
1031. Obtain a first sample graph network, wherein the first sample graph network comprises a plurality of sample user nodes and a plurality of sample data nodes.
The first sample graph network is similar to the graph network in step 102, and is not described again here.
1032. And carrying out coding processing on the first sample graph network based on the coding model to obtain the feature vector of each node in the first sample graph network.
The step is similar to the step 103, and is not described herein again.
1033. And decoding the feature vectors of the plurality of nodes in the first sample graph network based on the decoding model to obtain a second sample graph network.
The second sample graph network comprises a plurality of sample user nodes and a plurality of sample data nodes, and the plurality of sample user nodes and the plurality of sample data nodes are the same as the plurality of sample user nodes and the plurality of sample data nodes in the first sample graph network.
The decoding model is used for converting the feature vectors of a plurality of user nodes and a plurality of data nodes into a graph network. Since the feature vectors of the plurality of sample user nodes and the plurality of sample data nodes are obtained through the first sample graph network, the connection relationship between each sample user node and each sample data node is merged into the feature vectors of the plurality of sample user nodes and the plurality of sample data nodes, and the plurality of sample user nodes and the plurality of sample data nodes are decoded through the decoding model, so that a second sample graph network can be obtained to represent the connection relationship between the plurality of sample user nodes and the plurality of sample data nodes.
In the embodiment of the present application, since the input of the coding model is a graph network, the output of the coding model is a feature vector of each node, and the feature vector cannot be compared with the graph network. Therefore, in the process of training the coding model, feature vectors of a plurality of nodes output by the coding model need to be decoded based on the decoding model, and the obtained feature vectors of a plurality of sample user nodes and a plurality of sample data nodes are converted into a graph network form, so that the coding model can be trained according to the difference between the two graph networks by comparing the input graph network with the output graph network.
For the decoding processing, in one possible implementation, a transposed feature vector H^T of the feature vector H of each sample user node is determined based on the decoding model. According to the feature vector H of each sample user node and the corresponding transposed feature vector H^T, the attribute value R corresponding to each sample user node is acquired, and the attribute values R corresponding to the plurality of sample user nodes are connected to obtain the second sample graph network. The feature vector H, the transposed feature vector H^T, and the attribute value R satisfy the following relationship:

R = σ(H·H^T)

where σ() is the sigmoid (logistic regression) function, which maps a real number into the interval (0, 1).
1034. The coding model is trained on the basis of the difference between the first sample graph network and the second sample graph network.
The second sample graph network is obtained by decoding the feature vectors of the plurality of sample user nodes. During training, while the accuracy of the coding model is still poor, the feature vectors it outputs for the sample user nodes differ from the true feature vectors of those nodes, so the second sample graph network obtained from them also differs from the first sample graph network.
In one possible implementation, this step 1034 may include: processing the first sample graph network and the second sample graph network with a preset loss function to obtain a loss value, and, in response to the loss value being greater than a preset threshold, training the coding model according to the loss value.
The preset loss function is a function for determining the difference between the two graph networks and can be any preset function. The loss value indicates the degree of difference between the two graph networks: the greater the loss value, the greater the difference; the smaller the loss value, the smaller the difference. The preset threshold can be any value set to represent the desired loss value for the coding model. In response to the loss value being greater than the preset threshold, the accuracy of the coding model does not yet meet the requirement and the coding model needs to be trained further; in response to the loss value being not greater than the preset threshold, the accuracy of the coding model meets the requirement and the training of the coding model can be stopped.
In one possible implementation, the preset loss function may be:
L = -∑_{i=1}^{n} g_i · log( exp(σ_i) / ∑_{j=1}^{n} exp(σ_j) )
where i represents a node in the first sample graph network, i is a positive integer greater than 0 and not greater than n, n is a positive integer greater than 0, and j likewise indexes the nodes; L represents the loss value, σ_i represents the attribute value of a node in the second sample graph network, g_i is the attribute value of the corresponding node in the first sample graph network, and exp() represents an exponential function with the natural constant e as the base.
In the embodiment of the present application, the coding model may be iteratively trained on a plurality of first sample graph networks in sequence. Taking the loss value as the reference during training, the current loss value is obtained from the current first sample graph network and the coding model is adjusted according to it; the adjusted coding model is then trained on the next first sample graph network and adjusted again according to the resulting loss value. The coding model is trained in this way following the order of the first sample graph networks. After multiple rounds of iterative training, in response to the loss value being not greater than the preset threshold, the training stops and the trained coding model is obtained; the feature vectors of the user nodes and data nodes can then be acquired based on the trained coding model.
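The iterative training procedure above can be sketched in Python. All callables here (`encode`, `decode`, `loss_fn`, `adjust`) are placeholders for the model operations described in the text, not real API names:

```python
def train_encoder(sample_graphs, encode, decode, loss_fn, adjust, threshold):
    """Iterate over first sample graph networks in order; after each one,
    compare the reconstructed (second) graph with the input graph, adjust
    the encoder, and stop once the loss is no longer above the threshold."""
    loss = None
    for first_graph in sample_graphs:
        features = encode(first_graph)       # feature vector per node
        second_graph = decode(features)      # reconstructed graph network
        loss = loss_fn(first_graph, second_graph)
        if loss <= threshold:                # accuracy meets the requirement
            break
        adjust(loss)                         # update the coding model
    return loss
```

The loop mirrors the text: adjustment happens only while the loss exceeds the preset threshold, and training stops as soon as it does not.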
In one possible implementation, the coding model includes a first coding sub-model and a second coding sub-model, and this step 103 may include: encoding the graph network based on the first coding sub-model to obtain a first feature vector of each node, and encoding the first feature vector of each node based on the second coding sub-model to obtain a second feature vector of each node.
The first coding sub-model may be a GAT (Graph Attention Network) or another model. The second coding sub-model may be a VAE (Variational Auto-Encoder) or another model.
The first coding sub-model aggregates the feature information between adjacent nodes according to the connection relationships between the nodes in the graph network, obtaining a first feature vector for each node in the graph network, and inputs these first feature vectors into the second coding sub-model. The second coding sub-model encodes the first feature vector of each node, updating it to obtain the second feature vector of each node. Encoding the first feature vectors with the second coding sub-model prevents the first feature vector of each node from over-fitting and improves the accuracy of the second feature vectors.
When the first coding sub-model is a GAT and the second coding sub-model is a VAE, the first coding sub-model performs a weighted summation of the feature information of each node and its adjacent nodes in the graph network through a weight matrix to obtain the first feature vector of each node, where the first feature vector comprises a mean vector and a standard deviation vector. The second coding sub-model determines a normal distribution function from the mean vector and the standard deviation vector in the first feature vector and encodes the first feature vector of each node according to that normal distribution function to obtain the second feature vector of each node. The weight matrix contains the weight between each node and its adjacent nodes, representing the degree of influence of the adjacent nodes on the node. The weight between two data nodes represents the degree of similarity of the two pieces of data: the larger the weight, the more similar they are. The weight between two user nodes represents the degree of friendship between two users: the larger the weight, the closer the relationship. The weight between a user node and a data node represents the user's degree of preference for the data: the larger the weight, the more the user likes the data. This realizes an attention mechanism.
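A simplified sketch of this weighted neighbor aggregation follows. It assumes (as a GAT does) that raw attention scores are softmax-normalized over each node's neighbors before the weighted summation, and that the adjacency matrix includes self-loops; the exact GAT scoring function is omitted:

```python
import numpy as np

def normalize_weights(raw, adj):
    """Softmax-normalize raw attention scores over each node's neighbors.

    raw: (n, n) unnormalized scores; adj: (n, n) adjacency matrix with
    self-loops (nonzero where nodes are connected).  Disconnected pairs
    get weight 0; each row of the result sums to 1.
    """
    scores = np.where(adj > 0, raw, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(scores) * (adj > 0)
    return e / e.sum(axis=1, keepdims=True)

def aggregate(X, raw, adj):
    # each node's new feature: weighted sum of its own and its
    # neighbors' features, per the attention weights
    return normalize_weights(raw, adj) @ X
```

With equal raw scores the aggregation reduces to averaging over neighbors; unequal scores let an influential neighbor contribute more, which is the attention mechanism described above.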
In one possible implementation manner, the first feature vector includes a mean vector m and a standard deviation vector n, and the second coding sub-model performs coding processing on the mean vector m and the standard deviation vector n according to the parameter e to obtain a second feature vector H, where the mean vector m, the standard deviation vector n, and the second feature vector H satisfy the following relationship:
H=n·e+m
where e is obtained from the normal distribution function corresponding to the mean vector m and the standard deviation vector n, that is, the probability density corresponding to a random value drawn in that normal distribution function.
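The relationship H = n·e + m matches the reparameterization step of a typical VAE. The sketch below is an assumption in that it draws e from a standard normal distribution, as a conventional VAE does, rather than computing a probability density; the function name is hypothetical:

```python
import numpy as np

def reparameterize(m, n, rng=None):
    """Compute the second feature vector H = n * e + m.

    m: mean vector; n: standard deviation vector; e: a random draw from
    the normal distribution (standard normal here, per the usual VAE
    reparameterization -- an assumption relative to the text above).
    """
    rng = np.random.default_rng() if rng is None else rng
    e = rng.standard_normal(m.shape)
    return n * e + m
```

Because randomness enters only through e, gradients can flow through m and n during training, which is the point of expressing H this way.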
In one possible implementation, the training process of the first coding sub-model and the second coding sub-model may include: acquiring a first sample graph network comprising a plurality of sample user nodes and a plurality of sample data nodes; encoding the first sample graph network based on the first coding sub-model to obtain a first feature vector of each node; encoding the first feature vector of each node based on the second coding sub-model to obtain a second feature vector of each node; decoding the second feature vectors of the nodes in the first sample graph network based on the decoding model to obtain a second sample graph network; and training the first coding sub-model and the second coding sub-model according to the difference between the first sample graph network and the second sample graph network.
As shown in fig. 5, the first sample graph network is input into the first encoding sub-model 501, the first encoding sub-model 501 outputs a first feature vector of each node, the first feature vector of each node is input into the second encoding sub-model 502, the second encoding sub-model outputs a second feature vector of each node, and the second feature vector of each node is input into the decoding model 503, so as to obtain the second sample graph network.
As shown in fig. 6, the first sample graph network is input into the first encoding sub-model 601, the first encoding sub-model 601 outputs a mean vector and a standard deviation vector of each node, the mean vector and the standard deviation vector of each node are aggregated by the second encoding sub-model 602, a second feature vector of each node is output, and the second feature vector of each node is input into the decoding model 603, so that a second sample graph network is obtained.
In addition, in the process of training the coding model, an SGD (Stochastic Gradient Descent) optimizer is used, with a learning rate of 0.001 and an Epoch (iteration count) of 10, to iteratively train the coding model.
In addition, in the process of training the coding model, the coding model can be trained through graph networks in Cora (a data set), CiteSeer (a data set), and Reddit (a data set). Different data sets contain different graph networks; the number of nodes, the number of edges connecting the nodes, and the number of dimensions of the feature vector of each node in each graph network are shown in Table 1.
TABLE 1
| Data set | Number of nodes | Number of edges | Number of dimensions |
| --- | --- | --- | --- |
| Cora | 2708 | 10556 | 1433 |
| CiteSeer | 3327 | 9104 | 3703 |
| Reddit | 232965 | 1146158892 | 602 |
In addition, the encoding model in the embodiment of the present application may be a model such as SC (Spectral Clustering), DW (DeepWalk), GAE (Graph Auto-Encoder), VGAE (Variational Graph Auto-Encoder), SAGEGAE (Sample And Aggregate Graph Auto-Encoder), and the like. In the training process, the accuracy of the coding model obtained by training different models on different data sets differs. The accuracy each model reaches on the different data sets is shown in Table 2.
TABLE 2
| Model | Cora | CiteSeer | Reddit |
| --- | --- | --- | --- |
| SC | 84.6±0.01 | 80.2±0.02 | 84.2±0.02 |
| DW | 83.1±0.01 | 80.5±0.02 | 84.4±0.001 |
| GAE | 83.91±0.49 | 78.7±0.01 | 82.2±0.02 |
| VGAE | 84.28±0.15 | 78.9±0.03 | 82.7±0.02 |
| SAGEGAE | 88.28±0.02 | 84.12±0.1 | 96.2±0.021 |
104. And the computer equipment respectively acquires the association degrees of the plurality of data nodes and the target user node according to the feature vectors of the target user node and the plurality of data nodes based on the classification model.
The classification model is used for acquiring the association degree between a user node and a data node and may be an LR classifier (Logistic Regression Classifier) or another model. The association degree represents how closely the user node and the data node are associated: the greater the association degree, the more the data corresponding to the data node conforms to the preference of the user corresponding to the user node; the smaller the association degree, the less it conforms to that preference.
In one possible implementation, this step 104 may include: and the computer equipment sequentially inputs the feature vectors of the target user node and each data node into a classification model, and the classification model processes the feature vectors of the target user node and each data node and outputs the association degree of the target user node and each data node.
Before step 104, the classification model needs to be trained, so that when step 104 is executed the feature vectors of the target user node and the plurality of data nodes can be processed by the trained classification model. In one possible implementation, as shown in fig. 7, the training process of the classification model may include the following steps 1041-1044:
1041. and acquiring the trained coding model.
Wherein, the trained coding model can be obtained by the training in steps 1031-1034.
1042. A third sample graph network and a plurality of sample node sets are obtained.
The third sample graph network comprises a plurality of sample user nodes and a plurality of sample data nodes, and each sample node set comprises one sample user node, a positive sample node connected with the sample user node and a negative sample node not connected with the sample user node. The positive sample nodes and the negative sample nodes both belong to the third sample graph network. For any sample user node, the data node connected with the sample user node in the sample data nodes is used as a positive sample node, and the data node not connected with the sample user node in the sample data nodes is used as a negative sample node. Since the third sample graph network includes a plurality of sample user nodes, for each sample user node, the plurality of sample data nodes are divided into positive sample nodes and negative sample nodes, and then a plurality of sample node sets can be obtained.
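The division into positive and negative sample nodes described above can be sketched directly; the record layout (a set of user-data edge pairs) is an assumption for illustration:

```python
def build_sample_sets(user_nodes, data_nodes, edges):
    """Split the data nodes into positive/negative samples per user node.

    edges: set of (user_node, data_node) pairs present in the third
    sample graph network.  For each sample user node, connected data
    nodes are positive samples; unconnected ones are negative samples.
    """
    sets = {}
    for u in user_nodes:
        pos = [d for d in data_nodes if (u, d) in edges]
        neg = [d for d in data_nodes if (u, d) not in edges]
        sets[u] = {"positive": pos, "negative": neg}
    return sets
```

One sample node set is produced per sample user node, matching the text's observation that a plurality of sample node sets follows from a plurality of sample user nodes.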
1043. And coding the third sample graph network based on the coding model to obtain the feature vector of each node in the third sample graph network.
This step is similar to step 103 described above and will not be described further herein.
1044. And training the classification model according to the feature vectors of the sample user nodes, the positive sample nodes and the negative sample nodes in the plurality of sample sets.
In the process of training the classification model, for each sample set, the classification model is trained according to the feature vectors of the sample user nodes, the feature vectors of the positive sample nodes and the feature vectors of the negative sample nodes in the sample set.
In one possible implementation, this step 1044 may include: determining the association degree of a sample user node and a positive sample node as a first association degree, and the association degree of the sample user node and a negative sample node as a second association degree; training the classification model with the feature vectors of the sample user node and the positive sample node as input and the first association degree as output; and training the classification model with the feature vectors of the sample user node and the negative sample node as input and the second association degree as output.
The first relevance degree is greater than the second relevance degree, and both the first relevance degree and the second relevance degree can be values set arbitrarily. For example, the first degree of association is set to 1, and the second degree of association is set to 0.
In one possible implementation, this step 1044 further includes: processing the feature vectors of the sample user node and the positive sample node based on the classification model to obtain their association degree; processing that association degree and the first association degree of the sample user node and the positive sample node with a preset loss function to obtain a loss value; and, in response to the loss value being greater than a preset threshold, training the classification model according to the loss value. Likewise, the feature vectors of the sample user node and the negative sample node are processed based on the classification model to obtain their association degree; that association degree and the second association degree of the sample user node and the negative sample node are processed with the preset loss function to obtain a loss value; and, in response to the loss value being greater than the preset threshold, the classification model is trained according to the loss value.
In another possible implementation, after step 1044, the method further includes: processing the feature vector of the sample user node, the feature vectors of the plurality of positive sample nodes, and the feature vectors of the plurality of negative sample nodes in each sample set based on the classification model to obtain the association degrees between the sample user node and the positive sample nodes and between the sample user node and the negative sample nodes in each sample set; determining, from these output association degrees together with the first association degrees of the positive sample nodes and the second association degrees of the negative sample nodes, the number of sample nodes for which the classification model output an accurate association degree; determining the accuracy of the classification model from that number and the total number of positive and negative sample nodes in the sample set; and, in response to the accuracy of the classification model being smaller than a preset threshold, continuing to train the classification model on the remaining sample sets. The preset threshold may be an arbitrarily set value.
For example, with 50 positive sample nodes, 50 negative sample nodes, a preset threshold of 90%, and 40 sample nodes for which the classification model output an accurate association degree, the accuracy of the classification model is 40%; since this is smaller than the preset threshold, the classification model continues to be trained.
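The accuracy computation in this example is simply the fraction of correctly scored sample nodes; a one-line sketch (function name illustrative):

```python
def classifier_accuracy(num_accurate, num_positive, num_negative):
    # fraction of sample nodes whose association degree the model got right
    return num_accurate / (num_positive + num_negative)
```

For the figures above, 40 accurate out of 50 + 50 sample nodes gives 0.4, i.e. 40%.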
105. And the computer equipment determines a target data node in the plurality of data nodes according to the obtained plurality of association degrees and recommends the target data to the target user.
The target user is a user corresponding to the target user node, and the target data is data corresponding to the target data node. And the computer equipment determines a target data node for the target user node from the plurality of data nodes according to the association degree of the target user node and each data node, and recommends the target data corresponding to the target data node to the target user corresponding to the target user node.
In this embodiment of the present application, one or more target data nodes may be determined for a target user node according to the association degree between the target user node and each data node, where the target data node may be a data node connected to the target user node in a graph network, or may be a data node unconnected to the target user node in the graph network.
For example, user 1 purchases data A, user 2 purchases data B, user 1 and user 2 are friends, and data A and data C belong to the same type of data; that is, in the graph network, user node 1 is connected with data node A, user node 2 is connected with data node B, user node 1 is connected with user node 2, and data node A is connected with data node C. When a target data node is determined for user node 1 based on the data recommendation method provided by the present application, data node A, data node B, and data node C may all be determined as target data nodes, and data A, data B, and data C are recommended to user 1.
In one possible implementation, this step 105 may include: and determining the data nodes with the association degree larger than a preset threshold value in the plurality of data nodes as target data nodes.
The preset threshold may be any preset value. Among the association degrees corresponding to the plurality of data nodes, if an association degree is greater than the preset threshold, the data node corresponding to it conforms to the preference of the target user node and is determined as a target data node.
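This thresholding step can be sketched as follows; the mapping-based interface is an illustrative assumption:

```python
def select_target_nodes(association, threshold):
    """Pick target data nodes: those whose association degree with the
    target user node exceeds the preset threshold.

    association: mapping data_node -> association degree in (0, 1).
    """
    return [node for node, score in association.items() if score > threshold]
```

Data corresponding to the returned nodes is then recommended to the target user.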
It should be noted that, in the embodiment of the present application, the graph network is obtained by processing the obtained multiple operation records, and in another embodiment, the graph network may be obtained in other manners without performing steps 101 and 102.
With the continuous expansion of electronic commerce, the number and types of data increase rapidly, and a user needs to spend a large amount of time finding the required data, causing the problem of information overload: users get lost browsing large amounts of unrelated data and information. To solve this problem, recommendation systems emerged. Compared with a search engine, a recommendation system performs personalized calculation by studying the user's interest preferences and finds the user's points of interest, thereby guiding the user to discover their own data requirements. A good recommendation system can not only provide personalized services for users but also establish a close relationship with them. In the internet era, recommendation systems are ubiquitous: they can recommend goods, movies, songs, news reports, hotels, and travel to the user, providing customized choices. Matrix decomposition, neighborhood methods, and various hybrid methods are used behind recommendation systems, giving them high accuracy and stability.
By the method, the recommendation system is realized, data recommendation can be performed for the user, and convenience is provided for the user.
The method provided by the embodiment of the present application acquires a graph network and encodes it based on the coding model to obtain the feature vectors of the target user node and the plurality of data nodes, improving the accuracy of the obtained feature vectors. Based on the classification model, the association degrees of the plurality of data nodes with the target user node are acquired according to the feature vectors of the target user node and the plurality of data nodes, and the target data node is determined among the plurality of data nodes according to the obtained association degrees. Through the association degrees between the target user node and the plurality of data nodes, the target data node can be determined accurately for the target user node, improving the accuracy of the determined target data node; the target data is then recommended to the target user, improving the accuracy of data recommendation.
The feature vectors of the target user node and the plurality of data nodes are obtained through the first coding sub-model and the second coding sub-model, which avoids over-fitting of the feature vectors, improves the stability of the model, and improves the accuracy of the feature vectors. The first feature vector obtained through the first coding sub-model merges the feature information of adjacent nodes into each node's feature vector according to different weights, so that data is recommended to the target user in a more targeted way according to the obtained feature vectors, improving the accuracy of data recommendation.
As shown in fig. 8, a flowchart of a data recommendation method provided in an embodiment of the present application includes:
1. and acquiring a sample graph network, wherein the sample graph network comprises a plurality of sample user nodes and a plurality of sample data nodes.
2. And training the initial coding model according to the obtained sample graph network to obtain the trained coding model.
3. And training the initial classification model according to the sample graph network and the trained coding model to obtain the trained classification model.
4. And processing the acquired graph network based on the trained coding model and classification model to recommend data to the target user.
Fig. 9 is a schematic structural diagram of a data recommendation device according to an embodiment of the present application, and as shown in fig. 9, the data recommendation device includes:
a first obtaining module 901, configured to obtain a graph network, where the graph network includes a plurality of user nodes and a plurality of data nodes;
a first encoding processing module 902, configured to perform encoding processing on a graph network based on an encoding model, to obtain feature vectors of a target user node and multiple data nodes, where the target user node is any user node in the multiple user nodes;
an association degree obtaining module 903, configured to obtain, based on the classification model, association degrees of the multiple data nodes and the target user node according to the feature vectors of the target user node and the multiple data nodes, respectively;
and the data recommendation module 904 is configured to determine a target data node of the plurality of data nodes according to the obtained plurality of association degrees, and recommend the target data to a target user.
The device provided by the embodiment of the present application acquires a graph network and encodes it based on the coding model to obtain the feature vectors of the target user node and the plurality of data nodes, improving the accuracy of the obtained feature vectors. Based on the classification model, the association degrees of the plurality of data nodes with the target user node are acquired according to those feature vectors, and the target data node is determined among the plurality of data nodes according to the obtained association degrees. Through these association degrees, the target data node can be determined accurately for the target user node, improving the accuracy of the determined target data node; the target data is then recommended to the target user, improving the accuracy of data recommendation.
Optionally, as shown in fig. 10, the first obtaining module 901 includes:
an operation record obtaining unit 9101, configured to obtain multiple operation records, where each operation record includes a user identifier and a data identifier, and indicates that a user corresponding to the user identifier performs a preset operation on data corresponding to the data identifier;
the node connecting unit 9102 is configured to create a plurality of user nodes and a plurality of data nodes in the graph network according to the plurality of user identifiers and the plurality of data identifiers in the plurality of operation records, and connect the user nodes and the data nodes that belong to the same operation record to obtain the graph network.
Optionally, each operation record further includes the user information of the user identifier and the data information of the data identifier, where the user information includes other user identifiers in a friend relationship with the user identifier and the data information includes a data type. The node connecting unit is further used for creating a plurality of user nodes and a plurality of data nodes in the graph network according to the plurality of user identifiers and the plurality of data identifiers and connecting user nodes and data nodes belonging to the same operation record; and for connecting data nodes belonging to the same data type and connecting user nodes in a friend relationship, according to the user information of the plurality of user identifiers and the data information of the plurality of data identifiers, to obtain the graph network.
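The graph construction just described — user-data edges from operation records, same-type data edges, and friend edges — can be sketched as below. The record layout (dicts with `user`, `data`, `friends`, `data_type` keys) is a hypothetical illustration, not the patent's data format:

```python
from collections import defaultdict

def build_graph(records):
    """Build the set of undirected graph-network edges from operation
    records.  Each edge is a frozenset of two (kind, id) node labels."""
    edges = set()
    type_groups = defaultdict(list)
    for r in records:
        # user node connected to the data node in the same operation record
        edges.add(frozenset((("user", r["user"]), ("data", r["data"]))))
        # user nodes in a friend relationship are connected
        for f in r["friends"]:
            edges.add(frozenset((("user", r["user"]), ("user", f))))
        type_groups[r["data_type"]].append(r["data"])
    # data nodes belonging to the same data type are connected
    for group in type_groups.values():
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                if a != b:
                    edges.add(frozenset((("data", a), ("data", b))))
    return edges
```

Using frozensets makes the edges undirected, so the friend edge recorded by either user deduplicates automatically.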
Optionally, as shown in fig. 10, the coding model includes a first coding sub-model and a second coding sub-model, and the first coding processing module 902 includes:
a first encoding processing unit 9201, configured to perform encoding processing on a graph network based on a first encoding sub-model, to obtain a first feature vector of each node;
a second encoding processing unit 9202, configured to perform encoding processing on the first feature vector of each node based on the second encoding sub-model, to obtain a second feature vector of each node.
Optionally, as shown in fig. 10, the apparatus further comprises:
a second obtaining module 905, configured to obtain a first sample graph network, where the first sample graph network includes a plurality of sample user nodes and a plurality of sample data nodes;
a second encoding processing module 906, configured to perform encoding processing on the first sample graph network based on the encoding model to obtain a feature vector of each node in the first sample graph network;
a decoding processing module 907, configured to perform decoding processing on the feature vectors of the multiple nodes in the first sample graph network based on the decoding model to obtain a second sample graph network;
the coding model training module 908 is configured to train a coding model according to a difference between the first sample graph network and the second sample graph network.
Optionally, as shown in fig. 10, the coding model training module 908 includes:
a graph network processing unit 9801, configured to process the first sample graph network and the second sample graph network by using a preset loss function to obtain a loss value;
and the coding model training unit 9802 is used for training the coding model according to the loss value when the loss value is larger than a preset threshold value.
Optionally, as shown in fig. 10, the apparatus further comprises:
a third obtaining module 909, configured to obtain the trained coding model;
a fourth obtaining module 910, configured to obtain a third sample graph network and multiple sample node sets, where the third sample graph network includes multiple sample user nodes and multiple sample data nodes, and each sample node set includes one sample user node, a positive sample node connected to the sample user node, and a negative sample node not connected to the sample user node;
a third encoding processing module 911, configured to perform encoding processing on the third sample graph network based on the encoding model, to obtain a feature vector of each node in the third sample graph network;
the classification model training module 912 is configured to train a classification model according to feature vectors of sample user nodes, positive sample nodes, and negative sample nodes in a plurality of sample sets.
Optionally, as shown in fig. 10, the data recommendation module 904 includes:
the data node determination unit 9401 is configured to determine a data node, of the plurality of data nodes, whose association degree is greater than a preset threshold as a target data node.
Fig. 11 is a schematic structural diagram of a terminal according to an embodiment of the present application, which can implement the operations executed by the computer device in the foregoing embodiments. The terminal 1100 may be a portable mobile terminal such as: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, a desktop computer, a head-mounted device, a smart television, a smart sound box, a smart remote controller, a smart microphone, or any other smart terminal. Terminal 1100 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so forth.
In general, terminal 1100 includes: a processor 1101 and a memory 1102.
Processor 1101 may include one or more processing cores, such as a 4-core processor, an 8-core processor, or the like. Memory 1102 may include one or more computer-readable storage media, which may be non-transitory, for storing at least one instruction for processor 1101 to implement the data recommendation methods provided by method embodiments herein.
In some embodiments, the terminal 1100 may further include: a peripheral interface 1103 and at least one peripheral. The processor 1101, memory 1102 and peripheral interface 1103 may be connected by a bus or signal lines. Various peripheral devices may be connected to the peripheral interface 1103 by buses, signal lines, or circuit boards. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1104, display screen 1105, and audio circuitry 1106.
The Radio Frequency circuit 1104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1104 communicates with communication networks and other communication devices via electromagnetic signals.
The display screen 1105 is used to display a UI (user interface). The UI may include graphics, text, icons, video, and any combination thereof. The display 1105 may be a touch display and may also be used to provide virtual buttons and/or a virtual keyboard.
The audio circuitry 1106 may include a microphone and a speaker. The microphone is used for collecting audio signals of a user and the environment, converting the audio signals into electric signals, and inputting the electric signals to the processor 1101 for processing, or inputting the electric signals to the radio frequency circuit 1104 to realize voice communication. For stereo capture or noise reduction purposes, multiple microphones may be provided, each at a different location of terminal 1100. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is then used to convert the electrical signal from the processor 1101 or the radio frequency circuit 1104 into an audio signal.
Those skilled in the art will appreciate that the configuration shown in fig. 11 does not constitute a limitation of terminal 1100, and may include more or fewer components than those shown, or may combine certain components, or may employ a different arrangement of components.
Fig. 12 is a schematic structural diagram of a server 1200 according to an embodiment of the present application. The server 1200 may vary considerably in configuration or performance, and may include one or more processors (CPUs) 1201 and one or more memories 1202, where the memory 1202 stores at least one instruction, and the at least one instruction is loaded and executed by the processors 1201 to implement the methods provided by the foregoing method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface, and may include other components for implementing the functions of the device, which are not described herein again.
The server 1200 may be used to perform the data recommendation method described above.
The embodiment of the present application further provides a computer device, where the computer device includes a processor and a memory, where the memory stores at least one program code, and the at least one program code is loaded and executed by the processor, so as to implement the data recommendation method of the foregoing embodiment.
The embodiment of the present application further provides a computer-readable storage medium, where at least one program code is stored in the computer-readable storage medium, and the at least one program code is loaded and executed by a processor, so as to implement the data recommendation method of the foregoing embodiment.
The embodiment of the present application further provides a computer program product storing at least one program code, and the at least one program code is loaded and executed by a processor to implement the data recommendation method of the foregoing embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only an alternative embodiment of the present application and should not be construed as limiting the present application, and any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A method for recommending data, the method comprising:
acquiring a graph network, wherein the graph network comprises a plurality of user nodes and a plurality of data nodes;
coding the graph network based on a coding model to obtain feature vectors of a target user node and the plurality of data nodes, wherein the target user node is any user node in the plurality of user nodes;
based on a classification model, respectively obtaining the association degrees of the plurality of data nodes and the target user node according to the feature vectors of the target user node and the plurality of data nodes;
and determining a target data node in the plurality of data nodes according to the obtained plurality of association degrees, and recommending data corresponding to the target data node to a target user.
2. The method of claim 1, wherein obtaining the graph network comprises:
acquiring a plurality of operation records, wherein each operation record comprises a user identifier and a data identifier and indicates that a user corresponding to the user identifier executes preset operation on data corresponding to the data identifier;
and respectively creating a plurality of user nodes and a plurality of data nodes in the graph network according to the plurality of user identifications and the plurality of data identifications in the plurality of operation records, and connecting the user nodes belonging to the same operation record with the data nodes to obtain the graph network.
3. The method according to claim 2, wherein each operation record further includes other user identifiers having a friend relationship with the user identifier and a data type of the data identifier; the creating a plurality of user nodes and a plurality of data nodes in the graph network according to the plurality of user identifiers and the plurality of data identifiers in the plurality of operation records, and connecting the user nodes belonging to the same operation record with the data nodes to obtain the graph network, includes:
respectively creating a plurality of user nodes and a plurality of data nodes in a graph network according to the plurality of user identifications and the plurality of data identifications, and connecting the user nodes belonging to the same operation record with the data nodes;
and connecting the data nodes belonging to the same data type, and connecting the user nodes belonging to the friend relationship to obtain the graph network.
4. The method of claim 1, wherein the coding model includes a first coding sub-model and a second coding sub-model, and the obtaining the feature vectors of the target user node and the plurality of data nodes by coding the graph network based on the coding model includes:
based on the first coding submodel, coding the graph network to obtain a first feature vector of each node;
and based on the second coding submodel, coding the first feature vector of each node to obtain a second feature vector of each node.
5. The method of claim 1, wherein before the encoding the graph network based on the encoding model to obtain the feature vectors of the target user node and the plurality of data nodes, the method further comprises:
the method comprises the steps of obtaining a first sample graph network, wherein the first sample graph network comprises a plurality of sample user nodes and a plurality of sample data nodes;
coding the first sample graph network based on the coding model to obtain a feature vector of each node in the first sample graph network;
decoding the feature vectors of a plurality of nodes in the first sample graph network based on a decoding model to obtain a second sample graph network;
training the coding model according to a difference between the first sample graph network and the second sample graph network.
6. The method of claim 5, wherein the training the coding model according to the difference between the first sample graph network and the second sample graph network comprises:
processing the first sample graph network and the second sample graph network by adopting a preset loss function to obtain a loss value;
and when the loss value is larger than a preset threshold value, training the coding model according to the loss value.
7. The method of claim 5, wherein after the training of the coding model according to the difference between the first sample graph network and the second sample graph network, the method further comprises:
acquiring a trained coding model;
the method comprises the steps of obtaining a third sample graph network and a plurality of sample node sets, wherein the third sample graph network comprises a plurality of sample user nodes and a plurality of sample data nodes, and each sample node set comprises one sample user node, a positive sample node connected with the sample user node and a negative sample node not connected with the sample user node;
coding the third sample graph network based on a coding model to obtain a feature vector of each node in the third sample graph network;
and training the classification model according to the feature vectors of the sample user nodes, the positive sample nodes and the negative sample nodes in the plurality of sample sets.
8. The method according to claim 1, wherein the determining a target data node among the plurality of data nodes according to the obtained plurality of association degrees comprises:
and determining the data nodes with the association degree larger than a preset threshold value in the plurality of data nodes as the target data nodes.
9. A data recommendation apparatus, characterized in that the apparatus comprises:
a first acquisition module, configured to acquire a graph network, wherein the graph network comprises a plurality of user nodes and a plurality of data nodes;
the first coding processing module is used for coding the graph network based on a coding model to obtain feature vectors of a target user node and the plurality of data nodes, wherein the target user node is any user node in the plurality of user nodes;
the association degree obtaining module is used for respectively obtaining the association degrees of the plurality of data nodes and the target user node according to the feature vectors of the target user node and the plurality of data nodes on the basis of a classification model;
and the data recommendation module is used for determining a target data node in the plurality of data nodes according to the obtained plurality of association degrees and recommending data corresponding to the target data node to a target user.
10. The apparatus of claim 9, wherein the first obtaining module comprises:
the operation record acquisition unit is used for acquiring a plurality of operation records, wherein each operation record comprises a user identifier and a data identifier and indicates that a user corresponding to the user identifier executes preset operation on data corresponding to the data identifier;
and the node connecting unit is used for respectively creating a plurality of user nodes and a plurality of data nodes in the graph network according to the plurality of user identifications and the plurality of data identifications in the plurality of operation records, and connecting the user nodes belonging to the same operation record with the data nodes to obtain the graph network.
11. The apparatus according to claim 9, wherein each operation record further includes other user identifiers having a friend relationship with the user identifier and a data type of the data identifier; the node connecting unit is further configured to create a plurality of user nodes and a plurality of data nodes in a graph network according to the plurality of user identifiers and the plurality of data identifiers, and connect the user nodes belonging to the same operation record with the data nodes; and connecting the data nodes belonging to the same data type, and connecting the user nodes belonging to the friend relationship to obtain the graph network.
12. The apparatus of claim 9, wherein the coding model comprises a first coding sub-model and a second coding sub-model, and wherein the first coding processing module comprises:
the first coding processing unit is used for coding the graph network based on the first coding submodel to obtain a first feature vector of each node;
and the second coding processing unit is used for coding the first feature vector of each node based on the second coding submodel to obtain a second feature vector of each node.
13. The apparatus of claim 9, further comprising:
the second acquisition module is used for acquiring a first sample graph network, and the first sample graph network comprises a plurality of sample user nodes and a plurality of sample data nodes;
the second coding processing module is used for coding the first sample graph network based on the coding model to obtain a feature vector of each node in the first sample graph network;
the decoding processing module is used for decoding the feature vectors of the plurality of nodes in the first sample graph network based on a decoding model to obtain a second sample graph network;
and the coding model training module is used for training the coding model according to the difference between the first sample graph network and the second sample graph network.
14. A computer device, characterized in that the computer device comprises a processor and a memory, wherein at least one program code is stored in the memory, and the at least one program code is loaded and executed by the processor to implement the data recommendation method according to any one of claims 1 to 8.
15. A computer-readable storage medium having at least one program code stored therein, the at least one program code being loaded and executed by a processor to implement the data recommendation method of any one of claims 1 to 8.
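The graph-construction step recited in claims 2 and 3 can be sketched in a few lines of pure Python. All record fields and identifiers below are invented for illustration and do not come from the patent:

```python
from collections import defaultdict

# Illustrative operation records (claims 2-3): each carries a user identifier,
# a data identifier, the user's friends, and the data's type (all hypothetical).
records = [
    {"user": "u1", "data": "d1", "friends": ["u2"], "data_type": "video"},
    {"user": "u2", "data": "d2", "friends": ["u1"], "data_type": "video"},
    {"user": "u3", "data": "d3", "friends": [],     "data_type": "music"},
]

graph = defaultdict(set)        # undirected adjacency over user and data nodes

def connect(a, b):
    graph[a].add(b)
    graph[b].add(a)

nodes_by_type = defaultdict(list)
for rec in records:
    connect(rec["user"], rec["data"])       # user-data edge (claim 2)
    for friend in rec["friends"]:
        connect(rec["user"], friend)        # friend-relationship edge (claim 3)
    nodes_by_type[rec["data_type"]].append(rec["data"])

for same_type in nodes_by_type.values():    # same-data-type edges (claim 3)
    for i, a in enumerate(same_type):
        for b in same_type[i + 1:]:
            connect(a, b)
```

The resulting adjacency is the graph network that the coding model of claim 1 would then encode into per-node feature vectors.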
CN202010159124.XA 2020-03-09 2020-03-09 Data recommendation method and device, computer equipment and storage medium Active CN111368205B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010159124.XA CN111368205B (en) 2020-03-09 2020-03-09 Data recommendation method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010159124.XA CN111368205B (en) 2020-03-09 2020-03-09 Data recommendation method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111368205A true CN111368205A (en) 2020-07-03
CN111368205B CN111368205B (en) 2021-04-06

Family

ID=71208829

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010159124.XA Active CN111368205B (en) 2020-03-09 2020-03-09 Data recommendation method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111368205B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307256A (en) * 2020-10-28 2021-02-02 有半岛(北京)信息科技有限公司 Cross-domain recommendation and model training method and device
CN112861963A (en) * 2021-02-04 2021-05-28 北京三快在线科技有限公司 Method, device and storage medium for training entity feature extraction model

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100312644A1 (en) * 2009-06-04 2010-12-09 Microsoft Corporation Generating recommendations through use of a trusted network
US20160155136A1 (en) * 2014-12-02 2016-06-02 Fair Isaac Corporation Auto-encoder enhanced self-diagnostic components for model monitoring
CN105786798A (en) * 2016-02-25 2016-07-20 上海交通大学 Natural language intention understanding method in man-machine interaction
CN108304359A (en) * 2018-02-06 2018-07-20 中国传媒大学 Unsupervised learning uniform characteristics extractor construction method
CN108446766A (en) * 2018-03-21 2018-08-24 北京理工大学 A kind of method of quick trained storehouse own coding deep neural network
US20180318718A1 (en) * 2012-06-29 2018-11-08 Zynga Inc. Social Network Data Analysis to Generate Incentives for Online Gaming
CN109242633A (en) * 2018-09-20 2019-01-18 阿里巴巴集团控股有限公司 A kind of commodity method for pushing and device based on bigraph (bipartite graph) network
CN109241412A (en) * 2018-08-17 2019-01-18 深圳先进技术研究院 A kind of recommended method, system and electronic equipment based on network representation study
CN109671433A (en) * 2019-01-10 2019-04-23 腾讯科技(深圳)有限公司 A kind of detection method and relevant apparatus of keyword
CN110209820A (en) * 2019-06-05 2019-09-06 腾讯科技(深圳)有限公司 User identifier detection method, device and storage medium
CN110321484A (en) * 2019-06-18 2019-10-11 中国平安财产保险股份有限公司 A kind of Products Show method and device
CN110415091A (en) * 2019-08-06 2019-11-05 重庆仙桃前沿消费行为大数据有限公司 Shop and Method of Commodity Recommendation, device, equipment and readable storage medium storing program for executing
CN110457587A (en) * 2019-08-16 2019-11-15 广东工业大学 A kind of topic recommended method, device, equipment and storage medium based on bipartite graph


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FATEMEH AKBARI 等: "Graph-Based Friend Recommendation in Social Networks Using Artificial Bee Colony", 《2013 IEEE 11TH INTERNATIONAL CONFERENCE ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING》 *
WANG HONGWEI: "Personalized Recommendation System Based on Network Feature Learning", 《China Doctoral Dissertations Full-text Database, Information Science and Technology (Monthly)》 *



Similar Documents

Publication Publication Date Title
CN111339443A (en) User label determination method and device, computer equipment and storage medium
CN111931062A (en) Training method and related device of information recommendation model
CN113761153B (en) Picture-based question-answering processing method and device, readable medium and electronic equipment
CN111241394B (en) Data processing method, data processing device, computer readable storage medium and electronic equipment
WO2021155691A1 (en) User portrait generating method and apparatus, storage medium, and device
CN112328849A (en) User portrait construction method, user portrait-based dialogue method and device
CN112528164B (en) User collaborative filtering recall method and device
CN111368205B (en) Data recommendation method and device, computer equipment and storage medium
CN113515942A (en) Text processing method and device, computer equipment and storage medium
CN111428091A (en) Encoder training method, information recommendation method and related device
CN111709398A (en) Image recognition method, and training method and device of image recognition model
CN112364937A (en) User category determination method and device, recommended content determination method and electronic equipment
US11763204B2 (en) Method and apparatus for training item coding model
CN111159380A (en) Interaction method and device, computer equipment and storage medium
CN110825902A (en) Method and device for realizing feature similarity search, electronic equipment and storage medium
CN117056474A (en) Session response method and device, electronic equipment and storage medium
CN116957128A (en) Service index prediction method, device, equipment and storage medium
CN112801053B (en) Video data processing method and device
CN116821781A (en) Classification model training method, text analysis method and related equipment
CN111860870A (en) Training method, device, equipment and medium for interactive behavior determination model
CN113704596A (en) Method and apparatus for generating a set of recall information
CN112347278A (en) Method and apparatus for training a characterization model
CN114372205B (en) Training method, device and equipment of characteristic quantization model
CN116756411A (en) Content item recommendation method, device, equipment and storage medium
CN116501993B (en) House source data recommendation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40025928; Country of ref document: HK)
GR01 Patent grant