CN112221159B - Virtual item recommendation method and device and computer readable storage medium


Info

Publication number
CN112221159B
Authority
CN
China
Prior art keywords
user
game
node
target
prop
Legal status
Active
Application number
CN202011131039.9A
Other languages
Chinese (zh)
Other versions
CN112221159A (en)
Inventor
蔡红云
程序
何峰
张发强
姚亮
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202011131039.9A
Publication of CN112221159A
Application granted
Publication of CN112221159B

Classifications

    • A - HUMAN NECESSITIES
    • A63 - SPORTS; GAMES; AMUSEMENTS
    • A63F - CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 - Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/70 - Game security or game management aspects
    • A63F13/79 - Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories


Abstract

The application discloses a virtual item recommendation method, a virtual item recommendation device and a computer readable storage medium, wherein a heterogeneous network of a target game is obtained; representing a target user node in the heterogeneous network as a user learning vector by using a heterogeneous network representation learning model, and representing each prop node in the heterogeneous network as a prop learning vector; acquiring first preference information of a target game user to each game virtual prop according to the user learning vector and the prop learning vector; acquiring a historical operation sequence of a target game user, and representing the historical operation sequence as a time sequence learning vector through a time sequence model; acquiring second preference information of the target game user to each game virtual prop according to the time sequence learning vector and the prop learning vector; and recommending the game virtual item to the target game user according to the first preference information and the second preference information. The game virtual item can be more accurately recommended to the game user.

Description

Virtual item recommendation method and device and computer readable storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a virtual item recommendation method and device and a computer readable storage medium.
Background
With the rapid development of artificial intelligence technology, intelligent recommendation algorithms are deployed on servers so that targeted recommendations can be made to users; taking a game as an example, game virtual items that a user may prefer can be recommended to the user.
In the prior art, a user's preference for game items is generally learned by an intelligent recommendation algorithm based on a graph characterization learning model: the algorithm represents the user and a game virtual item each as a vector, and finally predicts the user's degree of preference for the game item from the similarity of the two vectors.
In the course of research and practice on the prior art, the inventor of the present application found that the prediction accuracy of the prior art is low when predicting a user's preference for a game item.
Disclosure of Invention
The embodiment of the application provides a virtual item recommendation method and device and a computer-readable storage medium, which can improve the accuracy of predicting a user's preference for game items.
In order to solve the above technical problem, an embodiment of the present application provides the following technical solutions:
a virtual item recommendation method comprises the following steps:
the method comprises the steps of obtaining a heterogeneous network of a target game, wherein the heterogeneous network comprises user nodes of game users, prop nodes of game virtual props, first side connecting lines among the user nodes and second side connecting lines among the user nodes and the prop nodes, the first side connecting lines represent social relations among the game users, and the second side connecting lines represent historical operations of the game users on the game virtual props;
representing target user nodes of target game users as user learning vectors through a heterogeneous network representation learning model, and representing each prop node as a prop learning vector;
acquiring first preference information of the target game user to each game virtual item according to the user learning vector of the target user node and the item learning vector of each item node;
acquiring a historical operation sequence of the target game user, and representing the historical operation sequence as a time sequence learning vector through a time sequence model;
acquiring second preference information of the target game user to each game virtual item according to the time sequence learning vector of the historical operation sequence and the item learning vector of each item node;
and recommending game virtual props to the target game users according to the first preference information and the second preference information.
A virtual item recommendation device, comprising:
the network acquisition unit is used for acquiring a heterogeneous network of a target game, wherein the heterogeneous network comprises user nodes of game users, prop nodes of game virtual props, first side connecting lines among the user nodes and second side connecting lines among the user nodes and the prop nodes, the first side connecting lines represent social relations among the game users, and the second side connecting lines represent historical operations of the game users on the game virtual props;
the first characterization unit is used for characterizing target user nodes of a target game user into user learning vectors through a heterogeneous network characterization learning model and characterizing each prop node into a prop learning vector;
a first preference obtaining unit, configured to obtain first preference information of the target game user for each game virtual item according to the user learning vector of the target user node and the item learning vector of each item node;
the second characterization unit is used for acquiring a historical operation sequence of the target game user and characterizing the historical operation sequence into a time sequence learning vector through a time sequence model;
a second preference obtaining unit, configured to obtain, according to the time sequence learning vector of the historical operation sequence and the item learning vector of each item node, second preference information of the target game user for each game virtual item;
and the item recommending unit is used for recommending the game virtual item to the target game user according to the first preference information and the second preference information.
Optionally, in an embodiment, the first characterization unit is configured to:
acquiring a preset wandering path set for wandering sampling in a heterogeneous network, wherein the wandering path set comprises a plurality of different wandering paths;
carrying out wandering sampling on the target user node in the heterogeneous network according to the wandering path set to obtain a plurality of wandering node sequences;
the step of characterizing the target user node of the target game user as a user learning vector through the heterogeneous network characterization learning model comprises the following steps:
and inputting the plurality of walking node sequences into the heterogeneous network characterization learning model for characterization learning to obtain a user learning vector of the target user node.
Optionally, in an embodiment, the first characterization unit is configured to:
extracting a characteristic vector of each node aiming at each walking node sequence, and mapping the extracted characteristic vectors to the same characteristic space through the heterogeneous network characterization learning model;
splicing feature vectors mapped by all nodes in each wandering node sequence into spliced feature vectors through the heterogeneous network characterization learning model;
aggregating splicing feature vectors of a plurality of walking node sequences corresponding to the same walking path into a first aggregation feature vector through the heterogeneous network characterization learning model;
aggregating a plurality of first aggregation feature vectors corresponding to a plurality of different walking paths into a second aggregation feature vector through the heterogeneous network characterization learning model;
linearly converting the second aggregated feature vector into the user learning vector by the heterogeneous network characterization learning model.
Optionally, in an embodiment, the first characterization unit is configured to:
aiming at each wandering node sequence corresponding to the same wandering path, generating a corresponding first attention weight by using an attention mechanism according to the pre-distributed weight of the edge connecting line between adjacent nodes;
and according to the first attention weight of each walking node sequence corresponding to the same walking path, aggregating the splicing feature vectors of the multiple walking node sequences corresponding to the same walking path into the first aggregation feature vector through the heterogeneous network characterization learning model.
Optionally, in an embodiment, the virtual item recommendation apparatus further includes a weight assignment module, configured to:
according to the interaction records among the user nodes, weights are pre-distributed to first edge connecting lines among the user nodes;
and pre-distributing weights for a second edge connecting line between the user node and the prop node according to the historical operation of the user node on the prop node.
Optionally, in an embodiment, the timing model includes a bidirectional long-short term memory model, and the second characterization unit is configured to:
acquiring a prop learning vector of a target prop node corresponding to each historical operation in the historical operation sequence;
characterizing the prop learning vector of each target prop node as a time sequence vector through the bidirectional long-short term memory model;
acquiring an operating time ratio of each target prop node, and generating a corresponding second attention weight by using an attention mechanism according to the operating time ratio;
and aggregating the time sequence vectors of the plurality of target prop nodes corresponding to the historical operation sequence into the time sequence learning vector according to the second attention weight corresponding to each target prop node.
Optionally, in an embodiment, the first preference obtaining unit is configured to:
and calculating the similarity between the user learning vector of the target user node and the prop learning vector of each prop node, and setting the similarity as first preference information of the target game user to each game virtual prop.
Optionally, in an embodiment, the second preference obtaining unit is configured to:
and calculating the matching degree of the time sequence learning vector of the historical operation sequence and the prop learning vector of each prop node, and setting the matching degree as second preference information of the target game user to each game virtual prop.
Optionally, in an embodiment, the item recommending unit is configured to:
fusing the first preference information and the second preference information of the target game user to each game virtual item to obtain target preference information of the target game user to each game virtual item;
ranking the game virtual items according to the target game user's target preference information for each game virtual item to obtain a ranking result;
and determining a target game virtual item to be recommended according to the ranking result, and recommending the target game virtual item to the target game user.
Optionally, in an embodiment, the network obtaining unit is configured to:
obtaining the social relationship of each game user in the target game to obtain a social relationship set;
acquiring historical operation of each game user on the game virtual prop in the target game to obtain a historical operation set;
and constructing the heterogeneous network according to the social relationship set and the historical operation set.
Optionally, in an embodiment, the virtual item recommendation apparatus further includes a dimension allocation module, configured to:
acquiring the total number of user nodes and prop nodes in the heterogeneous network;
determining target dimensions of the time sequence learning vector, the user learning vector and the prop learning vector according to the total number, wherein the target dimensions are smaller than the total number;
the first characterization unit is configured to:
and representing the target user node of the target game user as a user learning vector of the target dimension through a heterogeneous network representation learning model.
Optionally, in an embodiment, the virtual item recommendation device further includes a joint training module, configured to:
the method comprises the steps of obtaining a sample heterogeneous network, wherein the sample heterogeneous network comprises sample user nodes of sample game users, sample prop nodes of sample game virtual props, third connecting lines among the sample user nodes and fourth connecting lines among the sample user nodes and the sample prop nodes, the third connecting lines represent social relations among the sample game users, and the fourth connecting lines represent historical operations of the sample game users on the sample game virtual props;
acquiring a historical operation sequence of each sample game user;
and performing joint training on the heterogeneous network characterization learning model and the time sequence model according to the sample heterogeneous network and the historical operation sequence of each sample game user.
Optionally, in an embodiment, the joint training module is configured to:
characterizing each sample user node as a sample user learning vector and each prop node as a sample prop learning vector by the heterogeneous network characterization learning model;
characterizing a sample historical operation sequence of each sample game user as a sample time sequence learning vector through the time sequence model;
obtaining a first loss value of the heterogeneous network characterization learning model according to the sample user learning vector of each sample user node and the sample prop learning vector of each sample prop node;
acquiring a second loss value of the time sequence model according to the sample time sequence learning vector of each sample historical operation sequence and the sample prop learning vector of each sample prop node;
and fusing the first loss value and the second loss value to obtain a fusion loss value, and performing joint training on the heterogeneous network characterization learning model and the time sequence model by taking the minimized fusion loss value as a constraint.
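For illustration only, the following sketch shows one possible way to fuse the two loss values before joint minimization; the function name, the weighted-sum form, and the trade-off coefficient alpha are assumptions made for this example and are not taken from the disclosure itself.

```python
def fuse_losses(first_loss: float, second_loss: float, alpha: float = 0.5) -> float:
    """Weighted fusion of the two loss values; minimizing the returned value jointly
    trains both models. alpha is an assumed trade-off hyperparameter: the disclosure
    only states that the two losses are fused and the fused value is minimized."""
    return alpha * first_loss + (1.0 - alpha) * second_loss
```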
A computer-readable storage medium, wherein the computer-readable storage medium stores a plurality of instructions, and the instructions are suitable for being loaded by a processor to execute the steps in the virtual item recommendation method.
In the application, a heterogeneous network of a target game is obtained; representing target user nodes of a target game user in a heterogeneous network into user learning vectors by using a heterogeneous network representation learning model, and representing each prop node in the heterogeneous network into a prop learning vector; acquiring first preference information of a target game user to each game virtual prop according to the user learning vector and the prop learning vector; acquiring a historical operation sequence of a target game user, and representing the historical operation sequence as a time sequence learning vector through a time sequence model; acquiring second preference information of the target game user to each game virtual prop according to the time sequence learning vector and the prop learning vector; and recommending the game virtual item to the target game user according to the first preference information and the second preference information. Therefore, the real preference of the game user to the game virtual prop can be more accurately reflected by combining the first preference information based on the heterogeneous network structure and the second preference information based on the operation time sequence, so that the game virtual prop can be more accurately recommended to the game user.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a scene schematic diagram of a virtual item recommendation system provided in an embodiment of the present application;
fig. 2 is a flow chart of a virtual item recommendation method provided in the embodiment of the present application;
FIG. 3 is a schematic diagram illustrating a learning vector of a user obtained by learning a heterogeneous network characterization learning model in an embodiment of the present application;
FIG. 4 is a schematic diagram of a one-way long short term memory cell according to an embodiment of the present application;
FIG. 5 is a diagram illustrating a learning vector obtained when a two-way long-short term memory model is configured as a time sequence model according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a heterogeneous network according to an embodiment of the present application;
FIG. 7 is an exemplary diagram of an item recommendation interface provided in an embodiment of the present application;
fig. 8 is another schematic flow chart of a virtual item recommendation method provided in the embodiment of the present application;
fig. 9 is a comparison diagram of the recommended effect when an on-line experiment is performed in the example of the present application.
Fig. 10 is a schematic structural diagram of a virtual item recommendation device provided in an embodiment of the present application;
fig. 11 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
It should be noted that Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline that involves a wide range of fields, covering both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how a computer simulates or realizes human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental way to endow computers with intelligence, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning. By using machine learning techniques and corresponding training data sets, network models realizing different functions can be obtained through training; for example, a network model for gender classification can be trained on one training data set, a network model for image optimization can be trained on another training data set, and so on.
At present, with the continuous development of artificial intelligence technology, a network model is deployed on electronic devices such as smart phones, tablet computers, servers and the like, so as to enhance the processing capability of the electronic devices. For example, the electronic device can optimize the images shot by the electronic device through the deployed image optimization model, so that the image quality is improved.
In the related art, an intelligent recommendation algorithm based on a graph characterization learning model is usually adopted to learn a user's preference for game items: the graph characterization learning model represents the user and a game virtual item each as a vector, and finally the user's degree of preference for the game item is predicted from the similarity of the two vectors. However, a preference predicted from such a single dimension can hardly reflect the user's actual preference accurately. In view of this, the application provides a virtual item recommendation method, a virtual item recommendation device, and a computer-readable storage medium.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, the present application further provides a virtual item recommendation system, as shown in fig. 1, the virtual item recommendation system includes a user terminal (the virtual item recommendation system may further include other user terminals except the illustrated user terminal, and the specific number is not limited herein), and a server, where the server is used to provide a game running service, and the user terminal (for example, a user-side terminal such as a smart phone, a tablet computer, and a desktop computer) is connected to the server through a communication network, and runs a game by using the running service provided by the server. The communication network between the user terminal and the server may be a wireless communication network, which includes one or more of a wireless wide area network, a wireless local area network, a wireless metropolitan area network, and a wireless personal area network, and may also be a wired communication network. It should be noted that network entities, such as routers, gateways, etc., included in the communication network are not shown in fig. 1. In the virtual item recommendation system shown in fig. 1, a user terminal may perform information interaction with a server through a communication network, for example, after a preference of a game user of the user terminal for a game virtual item is predicted, the server recommends the game virtual item to the user terminal according to the predicted preference.
For example, the server marks a game providing the running service as a target game, and first obtains a heterogeneous network of the target game, where the heterogeneous network includes user nodes of game users, item nodes of game virtual items, first edge connection lines between the user nodes, and second edge connection lines between the user nodes and the item nodes, the first edge connection lines represent social relationships between the game users, and the second edge connection lines represent historical operations of the game users on the game virtual items; then, the server sets the game user of the user terminal as a target game user, represents a target user node of the target game user as a user learning vector through a heterogeneous network representation learning model, and represents each prop node as a prop learning vector; acquiring first preference information of the target game user to each game virtual item according to the user learning vector of the target user node and the item learning vector of each item node; then, the server acquires a historical operation sequence of the target game user, and the historical operation sequence is represented as a time sequence learning vector through a time sequence model; acquiring second preference information of the target game user to each game virtual item according to the time sequence learning vector of the historical operation sequence and the item learning vector of each item node; and finally, the server recommends the game virtual item to the target game user of the user terminal according to the first preference information and the second preference information.
It should be noted that the scene schematic diagram of the virtual item recommendation system shown in fig. 1 is only an example, and the virtual item recommendation system and the scene described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not form a limitation on the technical solution provided in the embodiment of the present application.
The following are detailed below. The numbers in the following examples are not intended to limit the order of preference of the examples.
In an embodiment, the virtual item recommendation device will be described in terms of a virtual item recommendation device, which may be integrated in a server having a storage unit and a processor installed therein and having an arithmetic capability. Referring to fig. 2, fig. 2 is a schematic flowchart of a virtual item recommendation method according to an embodiment of the present application. The virtual item recommendation method comprises the following steps:
in step 110, a heterogeneous network of target games is obtained.
It should be noted that the target game refers to a game for which virtual item recommendation needs to be performed. The target game may be a game for which the server where the virtual item recommendation device is located provides running services, or a game for which other servers provide running services, including but not limited to online first-person shooting games, online role-playing games, multiplayer online battle arena games, and other types of games.
The heterogeneous network may be characterized by a structure graph, for example, G = (V, E, W, T), where G denotes the structure graph of the heterogeneous network, V is the node set, E is the edge set, W is the weight matrix of the edges, and T is a mapping function from edges or nodes to types. If there is an edge connecting line from node v to node v', then the edge connecting line e_{v,v'} = (v, v') ∈ E, and W_{v,v'} is the weight pre-assigned to the edge connecting line between node v and node v'. T(v) returns the type of node v, and T(e_{v,v'}) returns the type of the edge connecting line e_{v,v'}.
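For illustration only, a minimal Python container mirroring G = (V, E, W, T) might look as follows; the class and attribute names are assumptions made for this example, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

@dataclass
class HeterogeneousNetwork:
    nodes: Set[str] = field(default_factory=set)                          # V
    edges: Set[Tuple[str, str]] = field(default_factory=set)              # E
    weights: Dict[Tuple[str, str], float] = field(default_factory=dict)   # W
    node_types: Dict[str, str] = field(default_factory=dict)              # T(v)
    edge_types: Dict[Tuple[str, str], str] = field(default_factory=dict)  # T(e)

    def add_edge(self, v: str, v2: str, etype: str, weight: float = 1.0) -> None:
        # Record the edge connecting line, its pre-assigned weight, and its type.
        self.nodes.update({v, v2})
        self.edges.add((v, v2))
        self.weights[(v, v2)] = weight
        self.edge_types[(v, v2)] = etype
```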
In the embodiment of the application, the heterogeneous network of the target game can be generated according to the local related information of the server, and the generated heterogeneous network of the target game can also be directly obtained from other servers.
The heterogeneous network of the target game comprises user nodes of game users, prop nodes of game virtual props, first side connecting lines among the user nodes and second side connecting lines among the user nodes and the prop nodes, wherein the first side connecting lines represent social relations among the game users, and the second side connecting lines represent historical operations of the game users on the game virtual props.
For example, taking an online role-playing game as an example, social relationships among game users include but are not limited to friend relationships, teacher-apprentice relationships, family relationships, guild relationships, and team relationships; game virtual items include, but are not limited to, playable characters, appearance items, and the like; the historical operations of a game user on a game virtual item include, but are not limited to, clicking, purchasing, and using (either using it personally or gifting it to other game users), and the like.
In an optional embodiment, the step of obtaining a heterogeneous network of the target game includes:
(1) obtaining the social relationship of each game user in the target game to obtain a social relationship set;
(2) acquiring historical operation of each game user on a game virtual item in a target game to obtain a historical operation set;
(3) and constructing a heterogeneous network according to the social relationship set and the historical operation set.
In this embodiment, when the heterogeneous network of the target game is obtained, the social relationship of each game user in the target game may be obtained, so as to obtain a social relationship set; for example, for an online role-playing game, social relationships such as friend relationships, teacher-apprentice relationships, family relationships, guild relationships, and team relationships are used. In addition, the historical operation of each game user on the game virtual items in the target game is obtained, so as to obtain a historical operation set S = {(V_ui, t_i) | u ∈ U}, where V_ui denotes a historical operation performed by game user u, t_i denotes the execution time of the historical operation, and U denotes all game users in the target game.
As described above, after the social relationship set and the historical operation set of the target game are obtained, the heterogeneous network corresponding to the target game may be constructed according to the social relationship set and the historical operation set.
For example, referring to fig. 6, a diagram of a heterogeneous network of a networked role-playing game is shown.
As shown in fig. 6, the heterogeneous network includes four user nodes, namely user node 1 corresponding to game user 1, user node 2 corresponding to game user 2, user node 3 corresponding to game user 3, and user node 4 corresponding to game user 4, and also includes four prop nodes, namely prop node 1 corresponding to game virtual prop 1, prop node 2 corresponding to game virtual prop 2, prop node 3 corresponding to game virtual prop 3, and prop node 4 corresponding to game virtual prop 4. A first edge connecting line between user nodes represents the social relationship between the corresponding game users; for example, the first edge connecting line between user node 1 and user node 2 represents that the social relationship between game user 1 and game user 2 is "teacher and apprentice". A second edge connecting line between a user node and a prop node represents the historical operation of the corresponding game user on the game virtual prop; for example, the second edge connecting line between user node 1 and prop node 1 represents that the historical operation of game user 1 on game virtual prop 1 is "used". In addition, as shown in fig. 7, a category node may be further included in the heterogeneous network, and an edge connecting line between a category node and a prop node represents the category of the prop node; for example, the edge connecting line between prop node 1 and category node 1 represents that the category of prop node 1 is "prop category 1".
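For illustration only, the following sketch shows one possible way to assemble such a heterogeneous network from a social-relationship set and a historical-operation set, with a fig. 6-style example; the data layouts and names are assumptions made for this example.

```python
def build_heterogeneous_network(social_relations, history_operations):
    """social_relations: iterable of (user_a, user_b, relation);
    history_operations: iterable of (user, prop, operation).
    The tuple layouts are assumed for illustration."""
    net = {"node_types": {}, "edges": {}}
    for user_a, user_b, relation in social_relations:
        net["node_types"][user_a] = net["node_types"][user_b] = "user"
        net["edges"][(user_a, user_b)] = relation       # first edge connecting line
    for user, prop, operation in history_operations:
        net["node_types"][user] = "user"
        net["node_types"][prop] = "prop"
        net["edges"][(user, prop)] = operation          # second edge connecting line
    return net

# Fig. 6-style example: game user 1 and game user 2 have a teacher-apprentice
# relationship, and game user 1 has used game virtual prop 1.
example = build_heterogeneous_network(
    [("user_1", "user_2", "teacher-apprentice")],
    [("user_1", "prop_1", "used")],
)
```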
In step 120, the target user nodes of the target game users are characterized as user learning vectors and each prop node is characterized as a prop learning vector by the heterogeneous network characterization learning model.
It should be noted that, in the embodiment of the present application, a heterogeneous network characterization learning model is trained in advance, and the heterogeneous network characterization learning model is configured to perform heterogeneous network characterization learning on user nodes/game virtual prop nodes and map them into learning vectors of a fixed length. Similar to characterization learning on homogeneous networks, the learning objective is to make the learned vectors retain the structural information in the heterogeneous network as much as possible, so that nodes connected more closely in the heterogeneous network have learning vectors that are closer in the mapping space; in addition, the differences between node types and edge connecting line types need to be considered and treated differently in the optimization objective.
In this embodiment of the present application, the target game user is used to refer to a game user who needs to perform game virtual item recommendation, and may be any game user in the target game, and specifically, a person skilled in the art may perform configuration of the target game user according to actual needs.
Correspondingly, performing heterogeneous network characterization learning on a target user node of a target game user in the heterogeneous network through a pre-trained heterogeneous network characterization learning model, and characterizing the target user node as a learning vector which is recorded as a user learning vector; in addition, heterogeneous network representation learning is carried out on the prop node of each game virtual prop in the heterogeneous network through a heterogeneous network representation learning model, and the representation is also represented as a learning vector and recorded as a prop learning vector.
In an optional embodiment, before the step of characterizing the target user node of the target game user as the user learning vector through the heterogeneous network characterization learning model, the method further includes:
(1) acquiring a preset wandering path set for wandering sampling in a heterogeneous network, wherein the wandering path set comprises a plurality of different wandering paths;
(2) carrying out wandering sampling on a target user node in a heterogeneous network according to the wandering path set to obtain a plurality of wandering node sequences;
the method comprises the following steps of representing target user nodes of target game users as user learning vectors through a heterogeneous network representation learning model, and comprises the following steps:
(3) and inputting the plurality of walking node sequences into a heterogeneous network characterization learning model for characterization learning to obtain a user learning vector of the target user node.
The wandering path (also called meta path) is used for restricting the node type selected by each wandering step when random wandering is performed in the heterogeneous network. For example, a user node-prop node-user node is a walking path. Different walking paths point to different semantics, for example, "user node-user node" represents a game user with a friend relationship, "user node-item node-user node" represents a game user who operates the same game virtual item, and the like.
In this embodiment, the wandering paths are configured in advance; for example, a plurality of different wandering paths are configured in advance, and a wandering path set is formed by these wandering paths. The configured wandering paths are not particularly limited and may be configured by a person skilled in the art according to actual needs.
For example, the wandering path set configured in the embodiment of the present application includes four different wandering paths, which are:
a wandering path A: user node-user node, representing game users with friend relationship;
a wandering path B: user node-prop node-user node representing a game user who has operated the same game virtual prop;
a wandering path C: the item node-user node-item node represents a game virtual item operated by the same game user;
a wandering path D: property node-user node-property node, representing game virtual properties operated by game users with friend relationships.
Correspondingly, in order to characterize a target user node of a target game user as a user learning vector through a heterogeneous network characterization learning model, a preset wandering path set is firstly obtained, the wandering path set comprises a plurality of different pre-configured wandering paths, and then wandering sampling is carried out on the target user node according to the plurality of wandering paths in the wandering path set, so that a plurality of wandering node sequences are obtained.
From the above description, it can be understood that a sequence of wandering nodes may include the target user node and other user nodes, may also include the target user node and other user nodes and prop nodes, may also include the target user node and a plurality of prop nodes, and the like, where each sequence of wandering nodes includes structural information of the target user node in the heterogeneous network.
Correspondingly, a plurality of wandering node sequences carrying structural information of the target user node in the heterogeneous network are input into the heterogeneous network characterization learning model for characterization learning, and a user learning vector carrying the structural information of the target user node can be obtained.
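For illustration only, the following sketch shows one possible form of wandering sampling constrained by a wandering path; the uniform random choice among type-matching neighbors and all names are assumptions made for this example.

```python
import random

def walk_sample(adjacency, node_types, start_node, walk_path, num_walks=10):
    """Sample wandering node sequences whose node types follow walk_path.

    adjacency: {node: [neighbor, ...]}; node_types: {node: "user" or "prop"};
    walk_path: list of node types such as ["user", "prop", "user"]."""
    sequences = []
    for _ in range(num_walks):
        sequence, current = [start_node], start_node
        for next_type in walk_path[1:]:
            # Only step to neighbors whose type matches the next step of the path.
            candidates = [v for v in adjacency.get(current, [])
                          if node_types.get(v) == next_type]
            if not candidates:
                break
            current = random.choice(candidates)
            sequence.append(current)
        if len(sequence) == len(walk_path):
            sequences.append(sequence)
    return sequences
```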
In an optional embodiment, the step of inputting the plurality of walking node sequences into a heterogeneous network characterization learning model for characterization learning to obtain a user learning vector of the target user node includes:
(1) extracting a characteristic vector of each node in each walking node sequence, mapping the extracted characteristic vectors to the same characteristic space through a heterogeneous network representation learning model, and splicing the characteristic vectors mapped by all the nodes in each walking node sequence into spliced characteristic vectors;
(2) aggregating splicing feature vectors of a plurality of walking node sequences corresponding to the same walking path into a first aggregation feature vector through a heterogeneous network characterization learning model;
(3) aggregating a plurality of first aggregation characteristic vectors corresponding to a plurality of different walking paths into a second aggregation characteristic vector through a heterogeneous network representation learning model;
(4) and linearly converting the second aggregation characteristic vector into a user learning vector through the heterogeneous network characterization learning model.
For example, referring to fig. 3, when a plurality of walking node sequences are input into the heterogeneous network characterization learning model for characterization learning to obtain a user learning vector of a target user node, first, for each walking node sequence, a feature vector of each node (user node/prop node) is extracted, and feature vectors of different types of nodes are mapped to the same feature space through the feature transformation layer, for example, a layer of linear transformation may be used to map feature vectors of different types of nodes to the same feature space.
After the mapping of the feature vectors is completed, the mapped feature vectors of all the nodes in each wandering node sequence are further spliced into a spliced feature vector. For example, a linear encoder may be used for the splicing.
After the feature vectors mapped by all nodes in each wandering node sequence are spliced into a spliced feature vector, the spliced feature vectors of a plurality of wandering node sequences corresponding to the same wandering path are aggregated into a first aggregated feature vector through an internal aggregation layer. Wherein, an attention mechanism may be adopted to assign an attention weight to each wandering node sequence for constraining the aggregation of the stitching feature vectors of the plurality of wandering node sequences.
In colloquial terms, the attention mechanism focuses attention on important points while ignoring other, unimportant factors. It can be regarded as a resource allocation mechanism that reallocates resources which would otherwise be evenly allocated according to the importance of the objects of attention: important units receive more resources and unimportant ones receive fewer. In the field of artificial intelligence, the resources allocated by the attention mechanism are weights; in this embodiment, the weight assigned by the attention mechanism is referred to as an attention weight.
After the spliced feature vectors of the multiple wandering node sequences corresponding to the same wandering path are aggregated into a first aggregated feature vector, the multiple first aggregated feature vectors corresponding to multiple different wandering paths are aggregated into a second aggregated feature vector through an external aggregation layer by using an inter-path attention mechanism.
Finally, the second aggregated feature vector is linearly converted into a user learning vector via an output layer. For example, the second aggregated feature vector may be linearly converted into the user learning vector according to the following formula:

h_u = σ(W · h̃_u)

where h_u represents the user learning vector, σ represents the sigmoid function, · represents the dot product, h̃_u represents the second aggregated feature vector, and W represents the configured parameter matrix for the linear transformation.
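For illustration only, the following sketch strings the above steps together (feature-space mapping, splicing, intra-path aggregation, inter-path aggregation, and the sigmoid output layer); plain means stand in for the two attention-based aggregations described above, and all shapes and names are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def characterize_user(seqs_by_path, w_map, w_out):
    """seqs_by_path: {wandering path name: [sequence feature matrix of shape (L, F)]};
    all sequences are assumed to share the same length L and raw feature size F.
    w_map: (F, D) projection to the shared feature space; w_out: (L*D, D) output weights."""
    per_path = []
    for sequences in seqs_by_path.values():
        # Map each node's features to the shared space, then splice per sequence.
        stitched = [(seq @ w_map).reshape(-1) for seq in sequences]
        per_path.append(np.mean(stitched, axis=0))      # intra-path aggregation
    second_agg = np.mean(per_path, axis=0)              # inter-path aggregation
    return sigmoid(second_agg @ w_out)                  # h_u = sigma(W . h~_u)
```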
In an optional embodiment, the step of aggregating, by a heterogeneous network characterization learning model, splicing feature vectors of multiple sequences of walking nodes corresponding to the same walking path into a first aggregated feature vector includes:
(1) aiming at each wandering node sequence corresponding to the same wandering path, generating a corresponding first attention weight by using an attention mechanism according to the pre-distributed weight of the edge connecting line between adjacent nodes;
(2) according to the first attention weight of each walking node sequence corresponding to the same walking path, the splicing feature vectors of the multiple walking node sequences corresponding to the same walking path are aggregated into a first aggregation feature vector through a heterogeneous network representation learning model.
In this embodiment, the weight pre-assigned to the edge connection line between nodes in the heterogeneous network is added to guide the weight assignment in the attention mechanism.
For each wandering node sequence corresponding to the same wandering path, a corresponding first attention weight is generated using the attention mechanism according to the pre-assigned weight of the edge connecting line between adjacent nodes, which can be expressed as:

α_vu = exp(w_vu · e_vu) / Σ_{s ∈ N_v} exp(w_vs · e_vs)

where α_vu represents the first attention weight, w_vu represents the pre-assigned weight of the edge connecting line between two adjacent nodes u and v, e_vu represents the importance of that edge connecting line to node u (its value can be determined by those skilled in the art according to actual needs, for example using the calculation in a meta-path-based heterogeneous graph characterization learning algorithm), N_v represents the set of neighbor nodes of node v in the wandering node sequence, e_vs represents the importance of the edge connecting line between node v and a neighbor node s to that neighbor node, and w_vs represents the pre-assigned weight of the edge connecting line between node v and neighbor node s.
After the first attention weight of each walking node sequence is generated, weighted summation is carried out on the splicing feature vectors of the multiple walking node sequences according to the first attention weight of each walking node sequence, and the splicing feature vectors are aggregated into one vector, namely a first aggregation feature vector.
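For illustration only, the following sketch computes first attention weights from the pre-assigned edge weights and importances with a softmax, and then forms the first aggregated feature vector as a weighted sum; the softmax reading of the formula above and all names are assumptions made for this example.

```python
import numpy as np

def first_attention_weights(pre_assigned_weights, edge_importances):
    """One entry per wandering node sequence: the w and e terms for the edge
    between the adjacent nodes of that sequence."""
    scores = np.asarray(pre_assigned_weights) * np.asarray(edge_importances)
    exp_scores = np.exp(scores - scores.max())      # numerically stable softmax
    return exp_scores / exp_scores.sum()

def first_aggregated_vector(stitched_vectors, attention_weights):
    # Weighted sum of the spliced feature vectors into the first aggregated feature vector.
    stacked = np.asarray(stitched_vectors)          # (num_sequences, dim)
    return (attention_weights[:, None] * stacked).sum(axis=0)
```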
In an optional embodiment, further comprising:
(1) according to the interaction records among game users, weights are pre-distributed to first edge connecting lines among corresponding user nodes;
(2) and pre-distributing weights for a second edge connecting line between the user node and the prop node according to the historical operation of the user node on the prop node.
It will be appreciated that a game user's closeness to all of his or her friends is not the same; for example, some friends team up and send gifts every day, while others rarely interact after being added as friends. In addition, a game user's degree of preference differs among the game virtual items he or she has purchased; for example, a game user has a stronger purchase willingness and demand for a game virtual item that has been purchased repeatedly.
Therefore, in this embodiment, the first edge connecting lines between the corresponding user nodes may be pre-assigned weights according to the interaction records between game users. The specific weight assignment manner is not limited here and may be configured by those skilled in the art according to actual needs; for example, the weights may be assigned under the constraint that the more frequently game users interact, the larger the weight assigned to the first edge connecting line between the corresponding user nodes.
In addition, according to the historical operations of a game user on a game virtual prop, a weight is pre-assigned to the second edge connecting line between the corresponding user node and prop node. The specific weight assignment manner is likewise not limited here; for example, the weights may be assigned under the constraint that the more frequently the game user operates the game virtual prop, the larger the weight assigned to the second edge connecting line between the corresponding user node and prop node.
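For illustration only, the following sketch assigns edge weights proportional to interaction or operation counts, which honors the constraint that more frequent interaction yields a larger weight; the normalization choice is an assumption made for this example.

```python
def pre_assign_weights(interaction_counts):
    """interaction_counts: {edge: number of interactions (or operations)}.
    Normalizing by the maximum count keeps weights in (0, 1] while preserving
    the 'more interaction, larger weight' constraint."""
    max_count = max(interaction_counts.values())
    return {edge: count / max_count for edge, count in interaction_counts.items()}
```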
In addition, how to characterize each prop node as a prop learning vector through the heterogeneous network characterization learning model may specifically refer to corresponding implementation of characterizing a target user node of a target game user as a user learning vector through the heterogeneous network characterization learning model, and details are not repeated here.
In an optional embodiment, before the step of characterizing the target user node of the target game user as the user learning vector through the heterogeneous network characterization learning model, the method further includes:
(1) acquiring the total number of user nodes and prop nodes in a heterogeneous network;
(2) determining target dimensions of the time sequence learning vector, the user learning vector and the prop learning vector according to the total number, wherein the target dimensions are smaller than the total number;
the method comprises the following steps of representing target user nodes of target game users as user learning vectors through a heterogeneous network representation learning model, and comprises the following steps:
(3) and characterizing the target user node of the target game user as a user learning vector of a target dimension through a heterogeneous network characterization learning model.
In this embodiment, the dimensions of the time sequence learning vector, the user learning vector, and the prop learning vector are set to be the same, and the target dimension of these vectors, that is, the dimension of the time sequence learning vector, the user learning vector, and the prop learning vector that are expected to be generated, is determined from the total number of user nodes and prop nodes in the heterogeneous network.
Specifically, the total number of user nodes and prop nodes in the heterogeneous network is first obtained, and then the target dimension of the time sequence learning vector, the user learning vector, and the prop learning vector is determined according to this total number, the target dimension being smaller than the total number. For example, the process of obtaining the target dimension can be expressed as:
d=[1/n*N];
where d represents the target dimension, n can be an empirical value chosen by those of ordinary skill in the art according to actual needs, N represents the total number of user nodes and prop nodes, and [ ] represents rounding.
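For illustration only, a direct reading of the formula above might look as follows; the default value of n is a placeholder, not a value from the disclosure.

```python
def target_dimension(total_node_count, n=100):
    """d = [N / n]; n is an empirical value (100 is only a placeholder here)."""
    return max(1, round(total_node_count / n))
```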
In step 130, first preference information of the target game user for each game virtual item is obtained according to the user learning vector of the target user node and the item learning vector of each item node.
It should be noted that the user learning vector learned by the above heterogeneous network representation learning model carries structural information of the target user node in the heterogeneous network, and similarly, the prop learning vector learned by the above heterogeneous network representation learning model carries structural information of the prop node in the heterogeneous network. Therefore, by using the user learning vector of the target user node and the prop learning vector of each prop node, the preference information of the target game user to each game virtual prop in the structural dimension can be obtained and recorded as the first preference information.
In an optional implementation manner, the step of obtaining first preference information of the target game user for each game virtual item according to the user learning vector of the target user node and the item learning vector of each item node includes:
and calculating the similarity between the user learning vector of the target user node and the prop learning vector of each prop node, and setting the similarity as first preference information of the target game user to each game virtual prop.
The similarity between the user learning vector of the target user node and the prop learning vector of a prop node can be calculated as the inner product of the two vectors, which can be expressed as:

P_long = h_u^T · h_v

where h_u represents the user learning vector, h_v represents the prop learning vector, T represents transposition, and P_long represents the similarity between the user learning vector and the prop learning vector, which reflects the game user's long-term preference for the game virtual prop and is set as the first preference information.
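For illustration only, the following sketch computes this inner-product similarity against all prop learning vectors at once; the array layout is an assumption made for this example.

```python
import numpy as np

def first_preference(user_vector, prop_vectors):
    """P_long = h_u^T . h_v for every prop learning vector; each row of
    prop_vectors is one prop learning vector."""
    return np.asarray(prop_vectors) @ np.asarray(user_vector)
```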
In step 140, a historical operation sequence of the target game user is obtained, and the historical operation sequence is characterized as a time sequence learning vector through a time sequence model.
As described above, a user node and a prop node in the heterogeneous network are connected by a second edge connecting line, and the second edge connecting line represents the historical operation of the game user on the game virtual prop. Therefore, all the historical operations executed by the target game user can be acquired from all the second edge connecting lines connected to the target user node, and sorting all these historical operations in chronological order yields the historical operation sequence of the target game user. For example, the historical operation sequence S = (V_ui, t_i) of the target game user is acquired, where i takes a value in [0, N], N is a positive integer, and V_ui denotes the historical operation performed by the target game user at time t_i. Assuming that the target game user has performed 3 operations in total, then N = 2: V_u0 denotes the historical operation performed at time t_0, V_u1 denotes the historical operation performed at time t_1, and V_u2 denotes the historical operation performed at time t_2.
It should be noted that, in the embodiment of the present application, a time sequence model is also trained in advance, and the time sequence model is configured to perform time sequence characterization learning on time sequence data and map the time sequence data into a learning vector of a fixed length. The type of the time sequence model is not particularly limited and may be configured by those skilled in the art according to actual needs, including but not limited to recurrent neural network models such as a long-short term memory model, a bidirectional long-short term memory model, and a gated recurrent unit model.
In the embodiment of the application, after the historical operation sequence of the target game user is obtained, the time sequence representation learning is carried out on the historical operation sequence through the pre-trained time sequence model, and the historical operation sequence is represented as a learning vector and is recorded as a time sequence learning vector.
In an alternative embodiment, the time sequence model includes a bidirectional long-short term memory model, and the step of characterizing the historical operation sequence as a time sequence learning vector through the time sequence model includes:
(1) acquiring a prop learning vector of a target prop node corresponding to each historical operation in a historical operation sequence;
(2) representing the prop learning vector of each target prop node as a time sequence vector through a bidirectional long-short term memory model;
(3) acquiring the operation time ratio of each target prop node, and generating a corresponding second attention weight by using an attention mechanism according to the operation time ratio;
(4) and aggregating the time sequence vectors of the plurality of target prop nodes corresponding to the historical operation sequence into a time sequence learning vector according to the second attention weight corresponding to each target prop node.
In the present embodiment, a time series model is described as an example of a bidirectional long-short term memory model.
The bidirectional model shown in fig. 5 is built from unidirectional long-short term memory units. Taking a unidirectional unit as an example, as shown in fig. 4, the long-short term memory unit includes an input gate, an output gate, a forgetting gate, and a memory unit (or memory cell), where σ represents the sigmoid function, and g and h represent the activation functions at the input and output of the memory unit respectively, both taken as the hyperbolic tangent function tanh(). The iterative computation is performed according to the following formulas:
i_t = σ(W_ix · x_t + W_ih · h_{t-1} + W_ic · c_{t-1} + b_i);

f_t = σ(W_fx · x_t + W_fh · h_{t-1} + W_fc · c_{t-1} + b_f);

c_t = f_t · c_{t-1} + i_t · tanh(W_cx · x_t + W_ch · h_{t-1} + b_c);

o_t = σ(W_ox · x_t + W_oh · h_{t-1} + W_oc · c_t + b_o);

h_t = o_t · tanh(c_t);
where i_t represents the output value of the input gate in the long-short term memory unit, σ represents the sigmoid function, W_ix is the weight matrix from the input data at the current time to the input gate, x_t is the input data at the current time, W_ih is the weight matrix from the output value of the unit at the previous time to the input gate, h_{t-1} is the output value of the long-short term memory unit at the previous time, W_ic is the weight matrix from the memory cell to the input gate, c_{t-1} is the output value of the memory cell at the previous time, and b_i is the offset of the input gate; f_t represents the output value of the forgetting gate, W_fx is the weight matrix from the input data at the current time to the forgetting gate, W_fh is the weight matrix from the output value of the unit at the previous time to the forgetting gate, W_fc is the weight matrix from the memory cell to the forgetting gate, and b_f is the offset of the forgetting gate; c_t represents the output value of the memory cell, W_cx is the weight matrix from the input data at the current time to the memory cell, W_ch is the weight matrix from the output value of the unit at the previous time to the memory cell, and b_c is the offset of the memory cell; o_t represents the output value of the output gate, W_ox is the weight matrix from the input data at the current time to the output gate, W_oh is the weight matrix from the output value of the unit at the previous time to the output gate, W_oc is the weight matrix from the output value of the memory cell to the output gate, and b_o is the offset of the output gate; h_t represents the output value of the long-short term memory unit; and · denotes a dot product.
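The iteration above can be written directly as code. The following sketch implements one step of a single (unidirectional) long-short term memory unit according to these formulas; the dimensions and the randomly initialized weights are assumptions made purely for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One iteration of the long-short term memory unit described by the formulas above."""
    i_t = sigmoid(W["ix"] @ x_t + W["ih"] @ h_prev + W["ic"] @ c_prev + b["i"])    # input gate
    f_t = sigmoid(W["fx"] @ x_t + W["fh"] @ h_prev + W["fc"] @ c_prev + b["f"])    # forgetting gate
    c_t = f_t * c_prev + i_t * np.tanh(W["cx"] @ x_t + W["ch"] @ h_prev + b["c"])  # memory cell
    o_t = sigmoid(W["ox"] @ x_t + W["oh"] @ h_prev + W["oc"] @ c_t + b["o"])       # output gate
    h_t = o_t * np.tanh(c_t)                                                       # unit output
    return h_t, c_t

# Hypothetical sizes: prop learning vectors of dimension 8, hidden state of dimension 16.
d_in, d_h = 8, 16
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(d_h, d_in if k.endswith("x") else d_h)) for k in
     ["ix", "ih", "ic", "fx", "fh", "fc", "cx", "ch", "ox", "oh", "oc"]}
b = {k: np.zeros(d_h) for k in ["i", "f", "c", "o"]}

h_t, c_t = lstm_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), W, b)
```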
Accordingly, in this embodiment, the prop learning vector of the target prop node corresponding to each historical operation in the historical operation sequence is obtained; these prop learning vectors correspond to different times, and the prop learning vector at each time is used as the input x_t of the bidirectional long-short term memory model.
Then, the prop learning vector of each target prop node is represented as a forward timing vector and a reverse timing vector respectively through the bidirectional long-short term memory model, and the forward timing vector and the reverse timing vector are spliced to obtain the target timing vector. For example, referring to fig. 5, x_1 to x_t denote the prop learning vectors corresponding to the different times up to time t, fh_1 to fh_t denote the outputs of the forward long-short term memory units at those times in the bidirectional model, and bh_1 to bh_t denote the outputs of the backward long-short term memory units at those times; accordingly, the forward timing vector is fh = (fh_1, fh_2, ..., fh_t) and the reverse timing vector is bh = (bh_1, bh_2, ..., bh_t).
Then, the operation time ratio of each target prop node is obtained, and a corresponding second attention weight is generated by using an attention mechanism according to the operation time ratio. The second attention weight at time t is computed from the target timing vector and the operation time ratio w_t with pre-trained parameters η, Q and b_a, where w_t represents the operation time ratio of the game virtual item corresponding to the historical operation at time t (namely the ratio of the operation time length of that game virtual item to the total time length corresponding to the historical operation sequence), and the target timing vector is the splice (fh_t, bh_t) of the forward and reverse timing vectors.
After the second attention weight of each target prop node is generated, the target timing vectors of the plurality of target prop nodes corresponding to the historical operation sequence are weighted and summed according to their respective second attention weights and thereby aggregated into one vector, namely the time sequence learning vector.
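A compact sketch of steps (1) to (4) is given below using PyTorch: a bidirectional LSTM followed by a softmax attention over the operation time ratios. The layer sizes and the exact scoring form of the attention are illustrative assumptions, not the trained model of the embodiments:

```python
import torch
import torch.nn as nn

d_prop, d_hidden = 8, 16
bilstm = nn.LSTM(input_size=d_prop, hidden_size=d_hidden, bidirectional=True, batch_first=True)

# (1) prop learning vectors of the target prop nodes, one per historical operation (here 3 operations).
prop_vectors = torch.randn(1, 3, d_prop)                  # batch of one sequence

# (2) forward and reverse outputs are produced and spliced by the bidirectional LSTM,
#     giving one target timing vector per historical operation.
timing_vectors, _ = bilstm(prop_vectors)                  # shape (1, 3, 2 * d_hidden)

# (3) operation time ratios w_t (operation duration / total duration of the sequence),
#     turned into second attention weights by a softmax-style attention (illustrative form).
time_ratios = torch.tensor([[0.5, 0.3, 0.2]])
scores = nn.Linear(2 * d_hidden, 1)(timing_vectors).squeeze(-1)   # learned scoring, shape (1, 3)
attention = torch.softmax(scores * time_ratios, dim=-1)

# (4) weighted sum of the target timing vectors -> one time sequence learning vector h_s.
h_s = (attention.unsqueeze(-1) * timing_vectors).sum(dim=1)       # shape (1, 2 * d_hidden)
```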
In step 150, second preference information of the target game user for each game virtual item is obtained according to the time sequence learning vector of the historical operation sequence and the item learning vector of each item node.
It should be noted that the timing learning vector learned by the timing model above carries the operation information of the target user node in timing. Therefore, by using the time sequence learning vector of the historical operation sequence of the target game user and the item learning vector of each item node, the preference information of the target game user to each game virtual item in the time sequence dimension can be obtained and recorded as second preference information.
In an optional implementation manner, the step of obtaining second preference information of the target game user for each game virtual item according to the time sequence learning vector of the historical operation sequence and the item learning vector of each item node includes:
and calculating the matching degree of the time sequence learning vector of the historical operation sequence and the prop learning vector of each prop node, and setting the matching degree as second preference information of the target game user to each game virtual prop.
For example, a multi-layer Perceptron (MLP) may be used to calculate the matching degree between the temporal learning vector and the prop learning vector, which may be expressed as:
P_short = MLP(h_s, h_v);
where h_s represents the time sequence learning vector, h_v represents the prop learning vector, and P_short represents the matching degree between the time sequence learning vector and the prop learning vector, which reflects the short-term preference of the game user for the game virtual prop at the current moment and is set as the second preference information.
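A minimal multi-layer perceptron scorer of this kind might look as follows; the layer sizes are assumptions for illustration, as the embodiment does not fix the MLP architecture:

```python
import torch
import torch.nn as nn

d_seq, d_prop = 32, 8

# A small multi-layer perceptron that scores a (time sequence vector, prop vector) pair.
mlp = nn.Sequential(
    nn.Linear(d_seq + d_prop, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

h_s = torch.randn(1, d_seq)       # time sequence learning vector of the historical operation sequence
h_v = torch.randn(1, d_prop)      # prop learning vector of one prop node

# P_short = MLP(h_s, h_v): short-term (timing) preference, i.e. the second preference information.
P_short = mlp(torch.cat([h_s, h_v], dim=-1))
```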
In step 160, the game virtual item is recommended to the target game user according to the first preference information and the second preference information.
As described above, the first preference information reflects the preference of the target game user in the structural dimension, and the second preference information reflects the preference of the target game user in the timing dimension, so that the target game virtual item satisfying the preset recommendation condition can be determined by integrating the first preference information and the second preference information, and the target game virtual item can be recommended to the target game user.
For example, referring to fig. 7, an exemplary diagram of an item recommendation interface is provided. As shown in fig. 7, the item recommendation interface includes a plurality of game virtual items determined for the target game user, namely item A, item B, item C, item D, item E, and item F. Optionally, when the target game user is online, the user terminal of the target game user may receive the target game virtual item information for the target game user from the server, generate an item recommendation interface according to the target game virtual item information, and display the item recommendation interface to the target game user. As shown in fig. 7, the item recommendation interface further includes a purchase interface corresponding to each target game virtual item, where the purchase interface shows the selling price information of the corresponding game virtual item (for example, the purchase interface corresponding to item A shows that the selling price of item A is 8 yuan; the other items are similar and are not described again), and the target game user can purchase a desired game virtual item through the purchase interface according to actual needs.
In an optional implementation manner, the step of recommending the game virtual item to the target game user according to the first preference information and the second preference information includes:
(1) fusing first preference information and second preference information of the target game user on each game virtual item to obtain target preference information of the target game user on each game virtual item;
(2) sequencing the target preference information of each game virtual item according to a target game user to obtain a sequencing result;
(3) and determining the target game virtual item to be recommended according to the sequencing result, and recommending the target game virtual item to the target game user.
For example, the average of the first preference information and the second preference information may be taken, or the first preference information and the second preference information may be weighted and summed.
Taking the fusion mode of weighted summation as an example, the weights of the two can be configured by those skilled in the art according to actual needs, for example, both are configured to be 0.5, and then the process of fusing the preference information can be expressed as:
P = 0.5 * P_long + 0.5 * P_short;
where P represents target preference information that fuses the long-term and short-term preferences of the target game user for the game virtual item.
And then, sequencing the target preference information of each game virtual item according to the target game user to obtain a sequencing result, determining the target game virtual item to be recommended according to the sequencing result, and recommending the target game virtual item to the target game user.
For example, the game virtual items may be sorted in descending order of target preference information to obtain a sorting result, and then, according to the sorting result, a preset number of top-ranked game virtual items are determined as the target game virtual items to be recommended to the target game user (a suitable empirical value for the preset number may be chosen by a person skilled in the art according to the target game; for example, for a network role-playing game the preset number may be configured as 20).
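The fusion and ranking can be sketched in plain Python as follows; the 0.5/0.5 weights and the top-N value reflect the example configuration above, and the prop names and scores are hypothetical:

```python
def recommend(p_long, p_short, top_n=20, w_long=0.5, w_short=0.5):
    """Fuse first and second preference information per prop and return the top-N prop ids."""
    fused = {prop: w_long * p_long[prop] + w_short * p_short[prop] for prop in p_long}
    ranked = sorted(fused, key=fused.get, reverse=True)     # sort props by target preference
    return ranked[:top_n]

# Hypothetical preference scores for three game virtual props.
p_long = {"prop_A": 0.9, "prop_B": 0.4, "prop_C": 0.7}
p_short = {"prop_A": 0.2, "prop_B": 0.8, "prop_C": 0.6}

print(recommend(p_long, p_short, top_n=2))   # ['prop_C', 'prop_B']
```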
In an optional embodiment, before the step of obtaining the heterogeneous network of the target game, the method further includes:
(1) the method comprises the steps that a sample heterogeneous network is obtained, wherein the sample heterogeneous network comprises sample user nodes of sample game users, sample prop nodes of sample game virtual props, third connecting lines among the sample user nodes and fourth connecting lines among the sample user nodes and the sample prop nodes, the third connecting lines represent social relations among the sample game users, and the fourth connecting lines represent historical operations of the sample game users on the sample game virtual props;
(2) acquiring a historical operation sequence of each sample game user;
(3) and performing joint training on the heterogeneous network characterization learning model and the time sequence model according to the sample heterogeneous network and the historical operation sequence of each sample game user.
In an optional embodiment, the step of jointly training the heterogeneous network characterization learning model and the time sequence model according to the sample heterogeneous network and the historical operation sequence of each sample game user includes:
(1) characterizing each sample user node as a sample user learning vector and each prop node as a sample prop learning vector through a heterogeneous network characterization learning model;
(2) representing the sample historical operation sequence of each sample game user as a sample time sequence learning vector through a time sequence model;
(3) obtaining a first loss value of a heterogeneous network representation learning model according to the sample user learning vector of each sample user node and the sample prop learning vector of each sample prop node;
(4) acquiring a second loss value of the time sequence model according to the sample time sequence learning vector of each sample historical operation sequence and the sample prop learning vector of each sample prop node;
(5) and fusing the first loss value and the second loss value to obtain a fusion loss value, and performing joint training on the heterogeneous network characterization learning model and the time sequence model by taking the minimized fusion loss value as constraint.
For how to characterize each sample user node as a sample user learning vector, how to characterize each prop node as a sample prop learning vector, and how to characterize the sample historical operation sequence of each sample game user as a sample time sequence learning vector, the corresponding implementation can be performed by referring to the corresponding manner in the above embodiments, and details are not repeated here.
The first loss value of the heterogeneous network characterization learning model is computed over the sample pairs using the sample user learning vectors, the sample prop learning vectors and the property sample labels, where u represents a sample game user, v represents a sample game virtual item, Y represents the sample set composed of all sample game users and all sample game virtual items, y_uv is the (pre-marked) property sample label, y_uv = 1 represents that sample game user u has operated sample game virtual item v and, conversely, y_uv = 0 represents that sample game user u has not operated sample game virtual item v, h_u represents the sample user learning vector of the sample user node of sample game user u, and h_v represents the sample prop learning vector of the sample prop node of sample game virtual item v.
The second loss value of the timing model is computed analogously from the sample time sequence learning vectors, the sample prop learning vectors and the time series sample labels, where s represents a sample historical operation sequence, y_sv is the (pre-labeled) time series sample label, y_sv = 1 represents that the sample game user operates sample game virtual item v at the moment following the last moment of the sample historical operation sequence and, conversely, y_sv = 0 represents that the sample game user does not operate sample game virtual item v at that moment, and h_s is the sample time sequence learning vector of the sample historical operation sequence s.
In the present embodiment, how to fuse the first loss value and the second loss value to obtain the fusion loss value may be configured by a person of ordinary skill in the art according to actual needs; for example, the sum of the first loss value and the second loss value may be calculated as the fusion loss value.
After the fusion loss value is obtained through fusion, the heterogeneous network characterization learning model and the time sequence model can be subjected to combined training by taking the minimized fusion loss value as constraint. For example, the Adam algorithm can be used to minimize the fusion loss value, so as to complete the joint training of the heterogeneous network characterization learning model and the timing model.
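A schematic joint training step under these constraints is sketched below with PyTorch. The two models are replaced by simple placeholders, the dummy batch and the binary cross-entropy form of the two losses are assumptions for illustration, and summation is used as one possible fusion:

```python
import torch

# Placeholder stand-ins for the two models to be trained jointly (sizes are illustrative).
hetero_model = torch.nn.Embedding(100, 8)               # node id -> learning vector
timing_model = torch.nn.LSTM(8, 8, batch_first=True)    # time sequence model

optimizer = torch.optim.Adam(
    list(hetero_model.parameters()) + list(timing_model.parameters()), lr=1e-3)

bce = torch.nn.BCEWithLogitsLoss()

# One hypothetical training step on dummy data standing in for a sample batch.
user_ids = torch.tensor([0, 1, 2])
prop_ids = torch.tensor([10, 11, 12])
labels = torch.tensor([1.0, 0.0, 1.0])                  # y_uv: operated or not
sequences = torch.randn(3, 4, 8)                        # prop vectors of each user's historical operations

h_u, h_v = hetero_model(user_ids), hetero_model(prop_ids)
loss_1 = bce((h_u * h_v).sum(dim=-1), labels)           # first loss value (structural, one possible form)

seq_out, _ = timing_model(sequences)
h_s = seq_out[:, -1, :]                                 # sample time sequence learning vector
loss_2 = bce((h_s * h_v).sum(dim=-1), labels)           # second loss value (temporal, one possible form)

fused_loss = loss_1 + loss_2                            # fuse by summation, as one possible configuration
optimizer.zero_grad()
fused_loss.backward()                                   # minimize the fusion loss value with Adam
optimizer.step()
```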
As can be seen from the above, in the embodiment of the present application, a heterogeneous network of a target game is obtained; representing target user nodes of a target game user in a heterogeneous network into user learning vectors by using a heterogeneous network representation learning model, and representing each prop node in the heterogeneous network into a prop learning vector; acquiring first preference information of a target game user to each game virtual prop according to the user learning vector and the prop learning vector; acquiring a historical operation sequence of a target game user, and representing the historical operation sequence as a time sequence learning vector through a time sequence model; acquiring second preference information of the target game user to each game virtual prop according to the time sequence learning vector and the prop learning vector; and recommending the game virtual item to the target game user according to the first preference information and the second preference information. Therefore, the real preference of the game user to the game virtual prop can be more accurately reflected by combining the first preference information based on the heterogeneous network structure and the second preference information based on the operation time sequence, so that the game virtual prop can be more accurately recommended to the game user.
The method described in connection with the above embodiments will be described in further detail below by way of example.
In this embodiment, a server is taken as an execution subject for explanation, please refer to fig. 8, and fig. 8 is another schematic flow chart of the virtual item recommendation method provided in this embodiment. The method flow can comprise the following steps:
in step 210, the server obtains a heterogeneous network of the target game.
It should be noted that the target game refers to the game for which virtual item recommendation needs to be performed. The target game may be a game whose operation service is provided by the server where the virtual item recommendation device is located, or a game whose operation service is provided by other servers, including but not limited to network first-person shooting games, network role-playing games, multiplayer online tactical competition games, and other types of games.
The heterogeneous network may be characterized by a structure graph, for example G = (V, E, W, T), where G denotes the structure graph of the heterogeneous network, V is the node set, E is the edge set, W is the weight matrix of the edges, and T is a mapping function from edges or nodes to their types. If there is an edge connecting line from node v to node v', the edge connecting line e_{v,v'} = (v, v') belongs to E, and W[v, v'] is the weight pre-assigned to the edge connecting line between node v and node v'. T(v) returns the type of node v, and T(e_{v,v'}) returns the type of the edge connecting line e_{v,v'}.
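Such a structure graph can be held in any graph library or even plain dictionaries. The following sketch uses networkx, storing the type mapping T as node and edge attributes and W as edge weights; the node names, edge types and weights are illustrative assumptions only:

```python
import networkx as nx

G = nx.Graph()

# Node set V with the type mapping T stored as a node attribute.
G.add_node("user_1", type="user")
G.add_node("user_2", type="user")
G.add_node("prop_A", type="prop")

# Edge set E with pre-assigned weights W and edge types T.
G.add_edge("user_1", "user_2", weight=1.0, type="friend")       # first edge connecting line (social relation)
G.add_edge("user_1", "prop_A", weight=0.6, type="purchased")    # second edge connecting line (historical operation)

print(G.nodes["user_1"]["type"])             # T(v) -> 'user'
print(G.edges["user_1", "prop_A"]["type"])   # T(e_{v,v'}) -> 'purchased'
```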
In the embodiment of the application, the server may generate the heterogeneous network of the target game according to the local related information, or may directly obtain the generated heterogeneous network of the target game from other servers.
The heterogeneous network of the target game comprises user nodes of game users, prop nodes of game virtual props, first side connecting lines among the user nodes and second side connecting lines among the user nodes and the prop nodes, wherein the first side connecting lines represent social relations among the game users, and the second side connecting lines represent historical operations of the game users on the game virtual props.
For example, a network role playing game is taken as an example, wherein social relationships among game users include but are not limited to friend relationships, teacher-apprentice relationships, family relationships, help relationships, and formation relationships; game virtual items include, but are not limited to, actionable characters, appearance items, and the like; the historical actions of the game user on the game virtual item include, but are not limited to, clicking, purchasing, and using (either by itself or giving away to other game users), and the like.
In step 220, the server obtains a preset wandering path set for performing wandering sampling in the heterogeneous network, and performs wandering sampling on a target user node in the heterogeneous network according to a plurality of wandering paths in the wandering path set to obtain a plurality of wandering node sequences.
The wandering path (also called meta path) is used for restricting the node type selected by each wandering step when random wandering is performed in the heterogeneous network. For example, a user node-prop node-user node is a walking path. Different walking paths point to different semantics, for example, "user node-user node" represents a game user with a friend relationship, "user node-item node-user node" represents a game user who operates the same game virtual item, and the like.
In this embodiment, the migration path is configured in advance, for example, a plurality of different migration paths are configured in advance in this embodiment, and a migration path set is formed by these migration paths, where the configured migration path is not particularly limited, and may be configured by a person skilled in the art according to actual needs.
For example, the wandering path set configured in the embodiment of the present application includes four different wandering paths, which are:
a wandering path A: user node-user node, representing game users with friend relationship;
a wandering path B: user node-prop node-user node representing a game user who has operated the same game virtual prop;
a wandering path C: the item node-user node-item node represents a game virtual item operated by the same game user;
a wandering path D: property node-user node-property node, representing game virtual properties operated by game users with friend relationships.
Correspondingly, in order to characterize a target user node of a target game user as a user learning vector through a heterogeneous network characterization learning model, a server firstly obtains a preset wandering path set, the wandering path set comprises a plurality of different pre-configured wandering paths, and then wandering sampling is carried out on the target user node according to the plurality of wandering paths in the wandering path set, so that a plurality of wandering node sequences are obtained.
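Walk sampling constrained by such a path can be sketched as a simple loop, reusing the networkx graph G from the sketch above; real implementations would typically sample many walks per path and per node, and the function name here is hypothetical:

```python
import random

def metapath_walk(G, start_node, metapath):
    """Sample one walk from start_node whose node types follow the given walking path."""
    walk = [start_node]
    for next_type in metapath[1:]:
        current = walk[-1]
        # candidate neighbours whose node type matches the next step of the walking path
        candidates = [n for n in G.neighbors(current) if G.nodes[n]["type"] == next_type]
        if not candidates:
            break                      # the walk cannot be extended along this path
        walk.append(random.choice(candidates))
    return walk

# Walking path B: user node - prop node - user node (game users who operated the same prop).
sequence = metapath_walk(G, "user_1", ["user", "prop", "user"])
print(sequence)
```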
In step 230, the server inputs the plurality of walking node sequences into a heterogeneous network characterization learning model for characterization learning, so as to obtain a user learning vector of the target user node.
From the above description, it can be understood that a sequence of wandering nodes may include the target user node and other user nodes, may also include the target user node and other user nodes and prop nodes, may also include the target user node and a plurality of prop nodes, and the like, where each sequence of wandering nodes includes structural information of the target user node in the heterogeneous network.
Correspondingly, the server inputs a plurality of wandering node sequences carrying structural information of the target user node in the heterogeneous network into the heterogeneous network characterization learning model for characterization learning, and a user learning vector carrying the structural information of the target user node can be obtained.
In step 240, the server characterizes each prop node as a prop learning vector through a heterogeneous network characterization learning model.
For how to represent each prop node as a prop learning vector through the heterogeneous network representation learning model, reference may be made to the implementation of representing the target user node of the target game user as a user learning vector described above, and details are not repeated here.
In step 250, the server calculates the similarity between the user learning vector of the target user node and the item learning vector of each item node, and accordingly sets the similarity as the first preference information of the target game user for each game virtual item.
It should be noted that the user learning vector learned by the above heterogeneous network representation learning model carries structural information of the target user node in the heterogeneous network, and similarly, the prop learning vector learned by the above heterogeneous network representation learning model carries structural information of the prop node in the heterogeneous network. Therefore, by using the user learning vector of the target user node and the prop learning vector of each prop node, the preference information of the target game user to each game virtual prop in the structural dimension can be obtained and recorded as the first preference information.
The server may calculate the similarity between the user learning vector of the target user node and the prop learning vector of each prop node, and set the similarity as the first preference information of the target game user for each game virtual prop. The similarity between the user learning vector of the target user node and the prop learning vector of a prop node can be calculated as the inner product of the two vectors, which can be expressed as:

P_long = h_u^T · h_v;

where h_u represents the user learning vector, h_v represents the prop learning vector, T represents transposition, and P_long represents the similarity between the user learning vector and the prop learning vector, which reflects the long-term preference of the game user for the game virtual prop and is set as the first preference information.
In step 260, the server obtains the historical operation sequence of the target game user and characterizes the historical operation sequence as a time sequence learning vector through a time sequence model.
As described above, the user node and the item node in the heterogeneous network are connected by a second edge connecting line, and the second edge connecting line represents a historical operation of the game user on the game virtual item. Therefore, all the historical operations executed by the target game user can be acquired from all the second edge connecting lines connected to the target user node, and sorting these historical operations in chronological order yields the historical operation sequence of the target game user. For example, the server obtains the historical operation sequence S = {(V_ui, t_i)} of the target game user, where i takes values in [0, N], N is a positive integer, and V_ui denotes the historical operation performed by the target game user at time t_i. Assuming the target game user has performed 3 operations in total, N = 2, V_u0 denotes the historical operation performed at time t_0, V_u1 the historical operation performed at time t_1, and V_u2 the historical operation performed at time t_2.
It should be noted that, in the embodiment of the present application, a timing model is also trained in advance; the timing model is used to perform temporal representation learning on time-series data and map the data into a learning vector of fixed length. The type of the timing model is not particularly limited and may be configured by those skilled in the art according to practical needs, including but not limited to recurrent neural network models such as the long-short term memory model, the bidirectional long-short term memory model, and the gated recurrent unit model.
In step 270, the server calculates a matching degree between the time sequence learning vector of the historical operation sequence and the item learning vector of each item node, and accordingly sets second preference information of the target game user for each game virtual item.
It should be noted that the timing learning vector learned by the timing model above carries the operation information of the target user node in timing. Therefore, by using the time sequence learning vector of the historical operation sequence of the target game user and the item learning vector of each item node, the preference information of the target game user to each game virtual item in the time sequence dimension can be obtained and recorded as second preference information.
The server can calculate the matching degree of the time sequence learning vector of the historical operation sequence and the item learning vector of each item node, and set the matching degree as second preference information of the target game user to each game virtual item.
For example, a multi-layer Perceptron (MLP) may be used to calculate the matching degree between the temporal learning vector and the prop learning vector, which may be expressed as:
P_short = MLP(h_s, h_v);
where h_s represents the time sequence learning vector, h_v represents the prop learning vector, and P_short represents the matching degree between the time sequence learning vector and the prop learning vector, which reflects the short-term preference of the game user for the game virtual prop at the current moment and is set as the second preference information.
In step 280, the server fuses the first preference information and the second preference information of the target game user for each game virtual item to obtain target preference information of the target game user for each game virtual item.
As described above, the first preference information reflects the preference of the target game user in the structural dimension, and the second preference information reflects the preference of the target game user in the timing dimension, so that the target game virtual item satisfying the preset recommendation condition can be determined by integrating the first preference information and the second preference information, and the target game virtual item can be recommended to the target game user.
For example, the server may take the average of the first preference information and the second preference information, or may perform a weighted summation of the two.
Taking the fusion mode of weighted summation as an example, the weights of the two can be configured by those skilled in the art according to actual needs, for example, both are configured to be 0.5, and then the process of fusing the preference information can be expressed as:
P = 0.5 * P_long + 0.5 * P_short;
where P represents target preference information that fuses the long-term and short-term preferences of the target game user for the game virtual item.
In step 290, the server sorts the target preference information of each game virtual item according to the target game user to obtain a sorting result, determines the target game virtual item to be recommended according to the sorting result, and recommends the target game virtual item to the target game user.
The server sorts the target preference information of each game virtual item according to the target game user to obtain a sorting result, determines the target game virtual item to be recommended according to the sorting result, and recommends the target game virtual item to the target game user.
For example, the game virtual items may be sorted in descending order of target preference information to obtain a sorting result, and then, according to the sorting result, a preset number of top-ranked game virtual items are determined as the target game virtual items to be recommended to the target game user (a suitable empirical value for the preset number may be chosen by a person skilled in the art according to the target game; for example, for a network role-playing game the preset number may be configured as 20).
In order to verify the performance of the virtual item recommendation method provided by this embodiment, the following offline experiment is performed.
The history of the map played by the game user (game virtual items) and the social relationship of the game user are extracted from a certain game. Wherein:
map history of game users: data for a week is extracted from the game database, maps played for less than 1 minute are filtered out, and maps that appear only once are filtered out. The symbiosis resulted in 371560 game users, 55705 maps, 1620824 users playing a positive sample of maps and retained a 1:1 map of exposed non-clicks as a negative sample.
Social relationships of the game users: a total of 69346 friend relationships covering all the game users.
And constructing a heterogeneous network according to the information, wherein the heterogeneous network comprises user nodes corresponding to game users and prop nodes corresponding to maps.
The following 4 wandering paths are adopted, including:
user node-user node;
user node-prop node-user node;
prop node-user node-prop node;
prop node-user node-prop node.
The game virtual item recommendation method provided by the application (named T-MPGNN: Time-aware MetaPath GNN) and three other comparison schemes were tested against each other on this heterogeneous network. The comparison schemes are:

IMF: an implicit matrix factorization recommendation algorithm based on the game user-prop interaction matrix.

MetaPath MF: a heterogeneous recommendation algorithm that combines heterogeneous representation learning with matrix factorization.

MAGNN: a metapath GNN algorithm that only considers heterogeneous structure information.
Evaluation index: from the above offline data set, 20% of the game user history was randomly sampled as test samples, each sample being a triple (u, v, y) representing whether game user u played map v, where y = 1 indicates played and y = 0 indicates not played. Given a test sample, each scheme predicts y from (u, v). AUC (area under the receiver operating characteristic curve) was used as the evaluation index to measure the effect of each scheme. AUC measures the probability that, when one positive sample (y = 1) and one negative sample (y = 0) are randomly drawn, the classifier scores the positive sample higher than the negative sample, so a larger AUC value means higher accuracy.
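Evaluating a scheme on such test triples reduces to collecting its predicted scores and the ground-truth labels and computing the AUC, for example with scikit-learn; the labels and scores below are illustrative only:

```python
from sklearn.metrics import roc_auc_score

# Hypothetical test triples (u, v, y) and the scores predicted by one scheme for each (u, v) pair.
labels = [1, 0, 1, 1, 0, 0]            # y = 1: the game user played the map, y = 0: did not
scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]

print(roc_auc_score(labels, scores))   # larger AUC means higher ranking accuracy
```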
The results of the comparative experiments are summarized as follows.
It can be seen that the prediction accuracy (AUC) of the proposed scheme T-MPGNN is the highest. More specifically, the AUC of T-MPGNN is relatively improved by 23.43% compared with IMF. Compared with the heterogeneous characterization algorithm MetaPath MF combined with matrix factorization, T-MPGNN obtained a relative AUC increase of 10%. Because MAGNN only considers the long-term preferences of the game user and ignores timing information, its AUC is 6% lower than that of T-MPGNN.
In addition, the technical scheme T-MPGNN is also used for the scene of on-line item recommendation. Referring to fig. 9, in the item recommendation of a game, compared to the default recommendation list, the recommendation list based on the isomorphic representation node2vec, the recommendation list based on the business rules, and the XGBoost classification model, the T-MPGNN brings an increase of 12% to 477% in purchase rate, a relative increase of 22% to 651% in purchase rate, and an increase of 2% to 30% in purchase value.
In order to better implement the virtual item recommendation method provided by the embodiment of the application, the embodiment of the application further provides a device based on the virtual item recommendation method. The meaning of the noun is the same as that in the virtual item recommendation method, and specific implementation details can refer to the description in the method embodiment.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a virtual item recommendation device according to an embodiment of the present application, where the virtual item recommendation device may include a network obtaining unit 310, a first characterizing unit 320, a first preference obtaining unit 330, a second characterizing unit 340, a second preference obtaining unit 350, an item recommending unit 360, and the like.
A network obtaining unit 310, configured to obtain a heterogeneous network of a target game, where the heterogeneous network includes user nodes of game users, property nodes of game virtual properties, first edge connection lines between the user nodes, and second edge connection lines between the user nodes and the property nodes, where the first edge connection lines represent social relationships between the game users, and the second edge connection lines represent historical operations of the game users on the game virtual properties;
a first characterization unit 320, configured to characterize a target user node of a target game user as a user learning vector and characterize each prop node as a prop learning vector through a heterogeneous network characterization learning model;
a first preference obtaining unit 330, configured to obtain first preference information of the target game user for each game virtual item according to the user learning vector of the target user node and the item learning vector of each item node;
the second characterization unit 340 is configured to obtain a historical operation sequence of the target game user, and characterize the historical operation sequence as a time sequence learning vector through a time sequence model;
a second preference obtaining unit 350, configured to obtain, according to the time sequence learning vector of the historical operation sequence and the item learning vector of each item node, second preference information of the target game user for each game virtual item;
and the item recommending unit 360 is used for recommending the game virtual item to the target game user according to the first preference information and the second preference information.
Optionally, in an embodiment, the first characterizing unit 320 is configured to:
acquiring a preset wandering path set for wandering sampling in a heterogeneous network, wherein the wandering path set comprises a plurality of different wandering paths;
carrying out wandering sampling on a target user node in a heterogeneous network according to the wandering path set to obtain a plurality of wandering node sequences;
the method comprises the following steps of representing target user nodes of target game users as user learning vectors through a heterogeneous network representation learning model, and comprises the following steps:
and inputting the plurality of walking node sequences into a heterogeneous network characterization learning model for characterization learning to obtain a user learning vector of the target user node.
Optionally, in an embodiment, the first characterizing unit 320 is configured to:
extracting a characteristic vector of each node aiming at each walking node sequence, and mapping the extracted characteristic vectors to the same characteristic space through a heterogeneous network representation learning model;
splicing the feature vectors after mapping of all the nodes in each wandering node sequence into spliced feature vectors through a heterogeneous network representation learning model;
aggregating splicing feature vectors of a plurality of walking node sequences corresponding to the same walking path into a first aggregation feature vector through a heterogeneous network characterization learning model;
aggregating a plurality of first aggregation characteristic vectors corresponding to a plurality of different walking paths into a second aggregation characteristic vector through a heterogeneous network representation learning model;
and linearly converting the second aggregation characteristic vector into a user learning vector through the heterogeneous network characterization learning model.
Optionally, in an embodiment, the first characterizing unit 320 is configured to:
aiming at each wandering node sequence corresponding to the same wandering path, generating a corresponding first attention weight by using an attention mechanism according to the pre-distributed weight of the edge connecting line between adjacent nodes;
according to the first attention weight of each walking node sequence corresponding to the same walking path, the splicing feature vectors of the multiple walking node sequences corresponding to the same walking path are aggregated into a first aggregation feature vector through a heterogeneous network representation learning model.
Optionally, in an embodiment, the virtual item recommendation apparatus further includes a weight assignment module, configured to:
according to the interaction records among the user nodes, weights are pre-distributed to first edge connecting lines among the user nodes;
and pre-distributing weights for a second edge connecting line between the user node and the prop node according to the historical operation of the user node on the prop node.
Optionally, in an embodiment, the timing model includes a bidirectional long-short term memory model, and the second characterization unit 340 is configured to:
acquiring a prop learning vector of a target prop node corresponding to each historical operation in a historical operation sequence;
representing the prop learning vector of each target prop node as a time sequence vector through a bidirectional long-short term memory model;
acquiring the operation time ratio of each target prop node, and generating a corresponding second attention weight by using an attention mechanism according to the operation time ratio;
and aggregating the time sequence vectors of the plurality of target prop nodes corresponding to the historical operation sequence into a time sequence learning vector according to the second attention weight corresponding to each target prop node.
Optionally, in an embodiment, the first preference obtaining unit 330 is configured to:
and calculating the similarity between the user learning vector of the target user node and the prop learning vector of each prop node, and setting the similarity as first preference information of the target game user to each game virtual prop.
Optionally, in an embodiment, the second preference obtaining unit 350 is configured to:
and calculating the matching degree of the time sequence learning vector of the historical operation sequence and the prop learning vector of each prop node, and setting the matching degree as second preference information of the target game user to each game virtual prop.
Optionally, in an embodiment, the item recommendation unit 360 is configured to:
fusing first preference information and second preference information of the target game user on each game virtual item to obtain target preference information of the target game user on each game virtual item;
sequencing the target preference information of each game virtual item according to a target game user to obtain a sequencing result;
and determining the target game virtual item to be recommended according to the sequencing result, and recommending the target game virtual item to the target game user.
Optionally, in an embodiment, the network obtaining unit 310 is configured to:
obtaining the social relationship of each game user in the target game to obtain a social relationship set;
acquiring historical operation of each game user on a game virtual item in a target game to obtain a historical operation set;
and constructing a heterogeneous network according to the social relationship set and the historical operation set.
Optionally, in an embodiment, the virtual item recommendation apparatus further includes a dimension allocation module, configured to:
acquiring the sum of the numbers of user nodes and prop nodes in the heterogeneous network;

determining the target dimension of the time sequence learning vector, the user learning vector and the prop learning vector according to the sum, wherein the target dimension is smaller than the sum;
a first characterization unit 320 configured to:
and characterizing the target user node of the target game user as a user learning vector of a target dimension through a heterogeneous network characterization learning model.
Optionally, in an embodiment, the virtual item recommendation apparatus further includes a joint training module, configured to:
the method comprises the steps that a sample heterogeneous network is obtained, wherein the sample heterogeneous network comprises sample user nodes of sample game users, sample prop nodes of sample game virtual props, third connecting lines among the sample user nodes and fourth connecting lines among the sample user nodes and the sample prop nodes, the third connecting lines represent social relations among the sample game users, and the fourth connecting lines represent historical operations of the sample game users on the sample game virtual props;
acquiring a historical operation sequence of each sample game user;
and performing joint training on the heterogeneous network characterization learning model and the time sequence model according to the sample heterogeneous network and the historical operation sequence of each sample game user.
Optionally, in an embodiment, the joint training module is configured to:
characterizing each sample user node as a sample user learning vector and each prop node as a sample prop learning vector through a heterogeneous network characterization learning model;
representing the sample historical operation sequence of each sample game user as a sample time sequence learning vector through a time sequence model;
obtaining a first loss value of a heterogeneous network representation learning model according to the sample user learning vector of each sample user node and the sample prop learning vector of each sample prop node;
acquiring a second loss value of the time sequence model according to the sample time sequence learning vector of each sample historical operation sequence and the sample prop learning vector of each sample prop node;
and fusing the first loss value and the second loss value to obtain a fusion loss value, and performing joint training on the heterogeneous network characterization learning model and the time sequence model by taking the minimized fusion loss value as constraint.
The specific implementation of each unit can refer to the previous embodiment, and is not described herein again.
As can be seen from the above, in the embodiment of the present application, the network obtaining unit 310 obtains the heterogeneous network of the target game; the first characterization unit 320 characterizes a target user node of a target game user in a heterogeneous network as a user learning vector and each prop node in the heterogeneous network as a prop learning vector by using a heterogeneous network characterization learning model; the first preference obtaining unit 330 obtains first preference information of the target game user for each game virtual item according to the user learning vector and the item learning vector; the second representation unit 340 obtains a historical operation sequence of the target game user, and represents the historical operation sequence as a time sequence learning vector through a time sequence model; the second preference obtaining unit 350 obtains second preference information of the target game user for each game virtual item according to the time sequence learning vector and the item learning vector; the item recommending unit 360 recommends the game virtual item to the target game user according to the first preference information and the second preference information. Therefore, the real preference of the game user to the game virtual prop can be more accurately reflected by combining the first preference information based on the heterogeneous network structure and the second preference information based on the operation time sequence, so that the game virtual prop can be more accurately recommended to the game user.
The embodiment of the present application further provides a server, as shown in fig. 11, which shows a schematic structural diagram of the server according to the embodiment of the present application, specifically:
the server may include components such as a processor 410 of one or more processing cores, memory 420 of one or more computer-readable storage media, a power supply 430, and an input unit 440. Those skilled in the art will appreciate that the server architecture shown in FIG. 11 is not meant to be limiting, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. Wherein:
the processor 410 is a control center of the server, connects various parts of the entire server using various interfaces and lines, performs various functions of the server and processes data by running or executing software programs and/or modules stored in the memory 420 and calling data stored in the memory 420, thereby performing overall monitoring of the server. Optionally, processor 410 may include one or more processing cores; optionally, the processor 410 may integrate an application processor and a modem processor, wherein the application processor mainly handles operating systems, user interfaces, application programs, and the like, and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 410.
The memory 420 may be used to store software programs and modules, and the processor 410 executes various functional applications and data processing by operating the software programs and modules stored in the memory 420. The memory 420 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to the use of the server, and the like. Further, the memory 420 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, memory 420 may also include a memory controller to provide processor 410 access to memory 420.
The server further includes a power supply 430 for supplying power to each component, and optionally, the power supply 430 may be logically connected to the processor 410 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. The power supply 430 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The server may further include an input unit 440, and the input unit 440 may be used to receive input numeric or character information and generate a keyboard, mouse, joystick, optical or trackball signal input in relation to user settings and function control.
Although not shown, the server may further include a display unit and the like, which will not be described in detail herein. Specifically, in this embodiment, the processor 410 in the server loads the executable file corresponding to the process of one or more application programs into the memory 420 according to the following instructions, and the processor 410 runs the application programs stored in the memory 420, so as to implement the various method steps provided by the foregoing embodiment, as follows:
the method comprises the steps that a heterogeneous network of a target game is obtained, wherein the heterogeneous network comprises user nodes of game users, prop nodes of game virtual props, first side connecting lines among the user nodes and second side connecting lines among the user nodes and the prop nodes, the first side connecting lines represent social relations among the game users, and the second side connecting lines represent historical operations of the game users on the game virtual props;
representing target user nodes of target game users as user learning vectors through a heterogeneous network representation learning model, and representing each prop node as a prop learning vector;
acquiring first preference information of a target game user to each game virtual item according to the user learning vector of the target user node and the item learning vector of each item node;
acquiring a historical operation sequence of a target game user, and representing the historical operation sequence as a time sequence learning vector through a time sequence model;
acquiring second preference information of the target game user on each game virtual item according to the time sequence learning vector of the historical operation sequence and the item learning vector of each item node;
and recommending the game virtual item to the target game user according to the first preference information and the second preference information.
In the above embodiments, the descriptions of the embodiments have respective emphasis, and a part which is not described in detail in a certain embodiment may refer to the above detailed description of the virtual item recommendation method, and is not described herein again.
As can be seen from the above, the server according to the embodiment of the present application may obtain the heterogeneous network of the target game; representing target user nodes of a target game user in a heterogeneous network into user learning vectors by using a heterogeneous network representation learning model, and representing each prop node in the heterogeneous network into a prop learning vector; acquiring first preference information of a target game user to each game virtual prop according to the user learning vector and the prop learning vector; acquiring a historical operation sequence of a target game user, and representing the historical operation sequence as a time sequence learning vector through a time sequence model; acquiring second preference information of the target game user to each game virtual prop according to the time sequence learning vector and the prop learning vector; and recommending the game virtual item to the target game user according to the first preference information and the second preference information. Therefore, the real preference of the game user to the game virtual prop can be more accurately reflected by combining the first preference information based on the heterogeneous network structure and the second preference information based on the operation time sequence, so that the game virtual prop can be more accurately recommended to the game user.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application provides a computer-readable storage medium, where a plurality of instructions are stored, and the instructions can be loaded by a processor to perform steps in any one of the virtual item recommendation methods provided in the embodiment of the present application. For example, the instructions may perform the steps of:
acquiring a heterogeneous network of a target game, wherein the heterogeneous network comprises user nodes of game users, prop nodes of game virtual props, first edge connections between the user nodes, and second edge connections between the user nodes and the prop nodes, the first edge connections representing social relations among the game users, and the second edge connections representing historical operations of the game users on the game virtual props;
representing a target user node of a target game user as a user learning vector through a heterogeneous network representation learning model, and representing each prop node as a prop learning vector;
acquiring first preference information of the target game user for each game virtual item according to the user learning vector of the target user node and the prop learning vector of each prop node;
acquiring a historical operation sequence of the target game user, and representing the historical operation sequence as a time sequence learning vector through a time sequence model;
acquiring second preference information of the target game user for each game virtual item according to the time sequence learning vector of the historical operation sequence and the prop learning vector of each prop node;
and recommending the game virtual item to the target game user according to the first preference information and the second preference information.
The above operations may be implemented with reference to the foregoing embodiments, and are not described in detail here.
The computer-readable storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.
Since the instructions stored in the computer-readable storage medium can execute the steps in any of the virtual item recommendation methods provided in the embodiments of the present application, the beneficial effects that can be achieved by any of these methods can also be achieved, which are detailed in the foregoing embodiments and are not repeated here.
The virtual item recommendation method, the virtual item recommendation device, and the computer-readable storage medium provided by the embodiments of the present application are described in detail above. Specific examples are used in this description to explain the principles and implementations of the present application, and the description of the embodiments is only intended to help understand the method and core idea of the present application. Meanwhile, those skilled in the art may make changes to the specific embodiments and the application scope according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (15)

1. A virtual item recommendation method is characterized by comprising the following steps:
the method comprises the steps of obtaining a heterogeneous network of a target game, wherein the heterogeneous network comprises user nodes of game users, prop nodes of game virtual props, first edge connections between the user nodes, and second edge connections between the user nodes and the prop nodes, the first edge connections representing social relations among the game users, and the second edge connections representing historical operations of the game users on the game virtual props;
characterizing a target user node of a target game user as a user learning vector through a heterogeneous network characterization learning model, and characterizing each prop node as a prop learning vector;
acquiring first preference information of the target game user for each game virtual item according to the user learning vector of the target user node and the prop learning vector of each prop node, wherein the first preference information is the long-term preference of the target game user for each game virtual item;
acquiring a historical operation sequence of the target game user, and characterizing the historical operation sequence as a time sequence learning vector through a time sequence model;
acquiring second preference information of the target game user for each game virtual item according to the time sequence learning vector of the historical operation sequence and the prop learning vector of each prop node, wherein the second preference information is the short-term preference of the target game user for each game virtual item at the current moment;
and recommending game virtual props to the target game user according to the first preference information and the second preference information.
2. The virtual item recommendation method according to claim 1, wherein before the step of characterizing the target user node of the target game user as the user learning vector through the heterogeneous network characterization learning model, the method further comprises:
acquiring a preset walking path set for walk sampling in the heterogeneous network, wherein the walking path set comprises a plurality of different walking paths;
performing walk sampling on the target user node in the heterogeneous network according to the walking path set to obtain a plurality of walking node sequences;
the step of characterizing the target user node of the target game user as a user learning vector through the heterogeneous network characterization learning model comprises the following steps:
and inputting the plurality of walking node sequences into the heterogeneous network characterization learning model for characterization learning to obtain the user learning vector of the target user node.
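As a hedged illustration of the walk sampling described in claim 2, the sketch below samples walking node sequences whose node types follow a preset walking path; the adjacency-dict graph, the node-type labels and the example paths are hypothetical and introduced only for this example.

import random

def walk_sample(graph, node_types, start, walking_path, num_walks=3, seed=0):
    """Sample walks from `start` whose node types follow `walking_path`."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        walk, current = [start], start
        for wanted in walking_path[1:]:
            candidates = [n for n in graph.get(current, []) if node_types[n] == wanted]
            if not candidates:
                break
            current = rng.choice(candidates)
            walk.append(current)
        walks.append(walk)
    return walks

graph = {"u1": ["u2", "p1"], "u2": ["u1", "p2"], "p1": ["u1"], "p2": ["u2"]}
node_types = {"u1": "user", "u2": "user", "p1": "prop", "p2": "prop"}
walking_path_set = {"UUP": ["user", "user", "prop"], "UPU": ["user", "prop", "user"]}
for name, path in walking_path_set.items():
    print(name, walk_sample(graph, node_types, "u1", path))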
3. The virtual item recommendation method according to claim 2, wherein the step of inputting the plurality of walking node sequences into the heterogeneous network characterization learning model for characterization learning to obtain the user learning vector of the target user node includes:
extracting a feature vector of each node in each walking node sequence, mapping the extracted feature vectors to the same feature space through the heterogeneous network characterization learning model, and splicing the mapped feature vectors of all nodes in each walking node sequence into a spliced feature vector;
aggregating the spliced feature vectors of a plurality of walking node sequences corresponding to the same walking path into a first aggregation feature vector through the heterogeneous network characterization learning model;
aggregating a plurality of first aggregation feature vectors corresponding to a plurality of different walking paths into a second aggregation feature vector through the heterogeneous network characterization learning model;
linearly converting the second aggregated feature vector into the user learning vector by the heterogeneous network characterization learning model.
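The following numpy sketch is one possible reading of the two-level aggregation in claim 3: per-type projections map node features into a common space, the mapped features of each walking node sequence are spliced, the spliced vectors are aggregated per walking path and then across paths, and a final linear layer yields the user learning vector. The mean-pooling aggregators and all dimensions are assumptions.

import numpy as np

rng = np.random.default_rng(0)
d_user, d_prop, d_common, d_out = 8, 6, 10, 4
W_user = rng.normal(size=(d_common, d_user))    # type-specific projections into
W_prop = rng.normal(size=(d_common, d_prop))    # one shared feature space
W_out = rng.normal(size=(d_out, 3 * d_common))  # final linear conversion

def project(node):
    kind, feat = node
    return (W_user if kind == "user" else W_prop) @ feat

def splice(walk):
    # concatenate the mapped feature vectors of all nodes in one walk
    return np.concatenate([project(n) for n in walk])

def aggregate(vectors):
    # simple mean aggregation (an assumption; the claim only requires aggregation)
    return np.mean(vectors, axis=0)

def random_walk():
    return [("user", rng.normal(size=d_user)),
            ("user", rng.normal(size=d_user)),
            ("prop", rng.normal(size=d_prop))]

walks_per_path = [[random_walk() for _ in range(2)] for _ in range(2)]  # 2 paths, 2 walks each
first_level = [aggregate([splice(w) for w in walks]) for walks in walks_per_path]
second_level = aggregate(first_level)
user_learning_vector = W_out @ second_level
print(user_learning_vector.shape)  # (4,)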
4. The virtual item recommendation method according to claim 3, wherein the step of aggregating, by the heterogeneous network characterization learning model, the spliced feature vectors of the plurality of walking node sequences corresponding to the same walking path into a first aggregation feature vector comprises:
for each walking node sequence corresponding to the same walking path, generating a corresponding first attention weight by using an attention mechanism according to the pre-assigned weights of the edge connections between adjacent nodes;
and according to the first attention weight of each walking node sequence corresponding to the same walking path, aggregating the spliced feature vectors of the plurality of walking node sequences corresponding to the same walking path into the first aggregation feature vector through the heterogeneous network characterization learning model.
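A minimal sketch of the attention-based aggregation in claim 4, assuming that each walking node sequence is scored by the sum of the pre-assigned edge weights along it and that the first attention weights are a softmax over these scores; the patent only states that an attention mechanism is used, so this particular form is an assumption.

import numpy as np

def attention_aggregate(spliced_vecs, edge_weights_per_walk):
    scores = np.array([sum(w) for w in edge_weights_per_walk])
    attn = np.exp(scores) / np.exp(scores).sum()     # first attention weights
    return np.tensordot(attn, np.stack(spliced_vecs), axes=1)

vecs = [np.ones(6), 2 * np.ones(6), 3 * np.ones(6)]   # spliced feature vectors
weights = [[0.2, 0.5], [0.9, 0.1], [0.4, 0.4]]        # pre-assigned edge weights
print(attention_aggregate(vecs, weights))             # first aggregation feature vector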
5. The virtual item recommendation method according to claim 4, further comprising:
pre-assigning weights to the first edge connections between corresponding user nodes according to interaction records among the game users;
and pre-assigning weights to the second edge connections between the user nodes and the prop nodes according to the historical operations of the user nodes on the prop nodes.
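An illustrative sketch of the weight pre-assignment in claim 5, normalising interaction counts and operation counts to obtain edge weights; the normalisation by the maximum count is an assumption, not a rule stated in the patent.

def assign_edge_weights(interactions, operations):
    """interactions: {(user_a, user_b): count}; operations: {(user, prop): count}."""
    max_i = max(interactions.values())
    max_o = max(operations.values())
    first_edges = {edge: count / max_i for edge, count in interactions.items()}
    second_edges = {edge: count / max_o for edge, count in operations.items()}
    return first_edges, second_edges

print(assign_edge_weights({("u1", "u2"): 4, ("u1", "u3"): 1},
                          {("u1", "p1"): 2, ("u2", "p2"): 6}))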
6. The virtual item recommendation method according to claim 1, wherein the time sequence model comprises a bidirectional long-short term memory model, and the step of characterizing the historical operation sequence as a time sequence learning vector through the time sequence model comprises:
acquiring a prop learning vector of a target prop node corresponding to each historical operation in the historical operation sequence;
characterizing the prop learning vector of each target prop node as a target time sequence vector through the bidirectional long-short term memory model;
acquiring an operating time ratio of each target prop node, and generating a corresponding second attention weight by using an attention mechanism according to the operating time ratio;
and aggregating the target time sequence vectors of the plurality of target prop nodes corresponding to the historical operation sequence into the time sequence learning vector according to the second attention weight corresponding to each target prop node.
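A hedged PyTorch sketch of the time sequence model in claim 6: a bidirectional LSTM turns the prop learning vectors of the operated props into target time sequence vectors, and attention weights derived from each prop's operating-time ratio aggregate them into the time sequence learning vector. The dimensions and the softmax form of the second attention weights are assumptions.

import torch
import torch.nn as nn

d_prop, d_hidden, seq_len = 8, 16, 5
bilstm = nn.LSTM(input_size=d_prop, hidden_size=d_hidden, bidirectional=True, batch_first=True)

prop_vectors = torch.randn(1, seq_len, d_prop)         # one historical operation sequence
time_ratio = torch.tensor([0.4, 0.1, 0.2, 0.2, 0.1])   # operating-time ratios per target prop

target_time_sequence_vectors, _ = bilstm(prop_vectors)  # (1, seq_len, 2 * d_hidden)
attn = torch.softmax(time_ratio, dim=0)                 # second attention weights
time_sequence_learning_vector = (attn.view(1, -1, 1) * target_time_sequence_vectors).sum(dim=1)
print(time_sequence_learning_vector.shape)              # torch.Size([1, 32])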
7. The virtual item recommendation method according to claim 1, wherein the step of obtaining the first preference information of the target game user for each game virtual item according to the user learning vector of the target user node and the item learning vector of each item node comprises:
and calculating the similarity between the user learning vector of the target user node and the prop learning vector of each prop node, and setting the similarity as first preference information of the target game user to each game virtual prop.
8. The virtual item recommendation method according to claim 1, wherein the step of obtaining second preference information of the target game user for each game virtual item according to the time sequence learning vector of the historical operation sequence and the item learning vector of each item node comprises:
and calculating the matching degree of the time sequence learning vector of the historical operation sequence and the prop learning vector of each prop node, and setting the matching degree as second preference information of the target game user to each game virtual prop.
9. The virtual item recommendation method according to claim 1, wherein the step of recommending game virtual props to the target game user according to the first preference information and the second preference information comprises:
fusing the first preference information and the second preference information of the target game user for each game virtual item to obtain target preference information of the target game user for each game virtual item;
sorting the target preference information of the target game user for each game virtual item to obtain a sorting result;
and determining a target game virtual item to be recommended according to the sorting result, and recommending the target game virtual item to the target game user.
10. The virtual item recommendation method according to claim 1, wherein the step of obtaining a heterogeneous network of the target game comprises:
obtaining the social relationship of each game user in the target game to obtain a social relationship set;
acquiring historical operation of each game user on the game virtual prop in the target game to obtain a historical operation set;
and constructing the heterogeneous network according to the social relationship set and the historical operation set.
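The construction of the heterogeneous network from the social relationship set and the historical operation set (claim 10) could look like the following networkx sketch; the attribute names "kind" and "relation" are assumptions chosen for this example.

import networkx as nx

def build_heterogeneous_network(social_relations, historical_operations):
    g = nx.Graph()
    for user_a, user_b in social_relations:         # first edge connections
        g.add_node(user_a, kind="user")
        g.add_node(user_b, kind="user")
        g.add_edge(user_a, user_b, relation="social")
    for user, prop in historical_operations:        # second edge connections
        g.add_node(user, kind="user")
        g.add_node(prop, kind="prop")
        g.add_edge(user, prop, relation="operation")
    return g

g = build_heterogeneous_network([("u1", "u2")], [("u1", "p1"), ("u2", "p2")])
print(list(g.nodes(data=True)))
print(list(g.edges(data=True)))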
11. The virtual item recommendation method according to any one of claims 1-10, wherein before the step of characterizing the target user node of the target game user as the user learning vector through the heterogeneous network characterization learning model, the method further comprises:
acquiring the sum of the number of user nodes and the number of prop nodes in the heterogeneous network;
determining a target dimension of the time sequence learning vector, the user learning vector and the prop learning vector according to the sum, wherein the target dimension is smaller than the sum;
the step of characterizing the target user node of the target game user as a user learning vector through the heterogeneous network characterization learning model comprises the following steps:
and characterizing the target user node of the target game user as a user learning vector of the target dimension through the heterogeneous network characterization learning model.
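A small sketch of the dimension selection in claim 11: the target dimension is kept below the sum of the numbers of user nodes and prop nodes. The log-based heuristic is purely an assumption; the patent only requires that the target dimension be smaller than the sum.

import math

def target_dimension(num_user_nodes, num_prop_nodes):
    total = num_user_nodes + num_prop_nodes
    return min(max(2, 4 * int(math.log2(total))), total - 1)

print(target_dimension(1000, 200))   # 40, well below the sum of 1200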
12. The virtual item recommendation method according to any one of claims 1-10, wherein before the step of obtaining a heterogeneous network of the target game, the method further comprises:
obtaining a sample heterogeneous network, wherein the sample heterogeneous network comprises sample user nodes of sample game users, sample prop nodes of sample game virtual props, third edge connections between the sample user nodes, and fourth edge connections between the sample user nodes and the sample prop nodes, the third edge connections representing social relations among the sample game users, and the fourth edge connections representing historical operations of the sample game users on the sample game virtual props;
acquiring a historical operation sequence of each sample game user;
and performing joint training on the heterogeneous network characterization learning model and the time sequence model according to the sample heterogeneous network and the historical operation sequence of each sample game user.
13. The virtual item recommendation method according to claim 12, wherein the step of jointly training the heterogeneous network characterization learning model and the time sequence model according to the sample heterogeneous network and the historical operation sequence of each sample game user comprises:
characterizing each sample user node as a sample user learning vector and each sample prop node as a sample prop learning vector by the heterogeneous network characterization learning model;
characterizing a sample historical operation sequence of each sample game user as a sample time sequence learning vector through the time sequence model;
obtaining a first loss value of the heterogeneous network characterization learning model according to the sample user learning vector of each sample user node and the sample prop learning vector of each sample prop node;
acquiring a second loss value of the time sequence model according to the sample time sequence learning vector of each sample historical operation sequence and the sample prop learning vector of each sample prop node;
and fusing the first loss value and the second loss value to obtain a fusion loss value, and performing joint training on the heterogeneous network characterization learning model and the time sequence model by taking the minimized fusion loss value as a constraint.
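A hedged PyTorch sketch of the joint training in claim 13: a graph-side loss and a sequence-side loss are computed against the same prop vectors, fused by summation, and minimised together. The stand-in encoders, the dot-product losses and the equal fusion weights are assumptions made for illustration.

import torch
import torch.nn as nn

d = 8
graph_encoder = nn.Linear(d, d)                      # stand-in for the characterization learning model
sequence_encoder = nn.GRU(d, d, batch_first=True)    # stand-in for the time sequence model
params = list(graph_encoder.parameters()) + list(sequence_encoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

user_feats = torch.randn(4, d)       # sample user nodes
prop_feats = torch.randn(4, d)       # sample prop nodes each user actually operated
sequences = torch.randn(4, 6, d)     # sample historical operation sequences

for step in range(5):
    user_vecs = graph_encoder(user_feats)
    _, seq_hidden = sequence_encoder(sequences)
    seq_vecs = seq_hidden.squeeze(0)
    # first loss: pull graph-side user vectors towards operated prop vectors
    first_loss = -torch.sigmoid((user_vecs * prop_feats).sum(-1)).log().mean()
    # second loss: pull sequence-side vectors towards the same prop vectors
    second_loss = -torch.sigmoid((seq_vecs * prop_feats).sum(-1)).log().mean()
    fused_loss = first_loss + second_loss            # fusion by simple summation
    optimizer.zero_grad()
    fused_loss.backward()
    optimizer.step()
print(float(fused_loss))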
14. A virtual item recommendation device, comprising:
the network acquisition unit is used for acquiring a heterogeneous network of a target game, wherein the heterogeneous network comprises user nodes of game users, prop nodes of game virtual props, first edge connections between the user nodes, and second edge connections between the user nodes and the prop nodes, the first edge connections representing social relations among the game users, and the second edge connections representing historical operations of the game users on the game virtual props;
the first characterization unit is used for characterizing a target user node of a target game user as a user learning vector through a heterogeneous network characterization learning model and characterizing each prop node as a prop learning vector;
a first preference obtaining unit, configured to obtain first preference information of the target game user for each game virtual item according to the user learning vector of the target user node and the prop learning vector of each prop node, where the first preference information is a long-term preference of the target game user for each game virtual item;
the second characterization unit is used for acquiring a historical operation sequence of the target game user and characterizing the historical operation sequence as a time sequence learning vector through a time sequence model;
a second preference obtaining unit, configured to obtain, according to the time sequence learning vector of the historical operation sequence and the prop learning vector of each prop node, second preference information of the target game user for each game virtual item, where the second preference information is a short-term preference of the target game user for each game virtual item at a current time;
and the item recommending unit is used for recommending the game virtual item to the target game user according to the first preference information and the second preference information.
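Purely as a structural illustration of the device in claim 14, the units could be organised as methods of a single class, as in the skeleton below; the class name and method signatures are hypothetical and the bodies are left unimplemented.

class VirtualItemRecommendationDevice:
    def __init__(self, graph_model, timing_model):
        self.graph_model = graph_model    # heterogeneous network characterization learning model
        self.timing_model = timing_model  # time sequence model

    def acquire_network(self, target_game):               # network acquisition unit
        raise NotImplementedError

    def characterize_nodes(self, network, target_user):   # first characterization unit
        raise NotImplementedError

    def first_preference(self, user_vec, prop_vecs):      # first preference obtaining unit
        raise NotImplementedError

    def characterize_sequence(self, history):             # second characterization unit
        raise NotImplementedError

    def second_preference(self, seq_vec, prop_vecs):      # second preference obtaining unit
        raise NotImplementedError

    def recommend(self, first_pref, second_pref):         # item recommending unit
        raise NotImplementedError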
15. A computer-readable storage medium storing instructions adapted to be loaded by a processor to perform the steps of the method for recommending virtual items according to any of claims 1-13.
CN202011131039.9A 2020-10-21 2020-10-21 Virtual item recommendation method and device and computer readable storage medium Active CN112221159B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011131039.9A CN112221159B (en) 2020-10-21 2020-10-21 Virtual item recommendation method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011131039.9A CN112221159B (en) 2020-10-21 2020-10-21 Virtual item recommendation method and device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN112221159A CN112221159A (en) 2021-01-15
CN112221159B (en) 2022-04-08

Family

ID=74117585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011131039.9A Active CN112221159B (en) 2020-10-21 2020-10-21 Virtual item recommendation method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112221159B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767054A (en) * 2021-01-29 2021-05-07 北京达佳互联信息技术有限公司 Data recommendation method, device, server and computer-readable storage medium
CN113144599B (en) * 2021-04-30 2023-05-16 腾讯科技(深圳)有限公司 Virtual character changing method, device, computer equipment and storage medium
CN113450181A (en) * 2021-05-27 2021-09-28 广州三七极耀网络科技有限公司 Virtual commodity recommendation method, device, equipment and storage medium
CN113509727B (en) * 2021-07-09 2024-06-04 网易(杭州)网络有限公司 Method and device for displaying props in game, electronic equipment and medium
CN116151354B (en) * 2023-04-10 2023-07-18 之江实验室 Learning method and device of network node, electronic device and storage medium
CN117575744B (en) * 2024-01-15 2024-03-26 成都帆点创想科技有限公司 Article recommendation method and system based on user association relation

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101321190A (en) * 2008-07-04 2008-12-10 清华大学 Recommend method and recommend system of heterogeneous network
JP2011248837A (en) * 2010-05-31 2011-12-08 Rakuten Inc Lottery device, lottery method, lottery program, and computer readable recording media recorded with lottery program
CN103955535A (en) * 2014-05-14 2014-07-30 南京大学镇江高新技术研究院 Individualized recommending method and system based on element path
CN106354862A (en) * 2016-09-06 2017-01-25 山东大学 Multidimensional individualized recommendation method in heterogeneous network
CN110503468A (en) * 2019-08-19 2019-11-26 珠海天燕科技有限公司 A kind of resource recommendation method and device based on user behavior
CN111061949A (en) * 2019-12-03 2020-04-24 深圳市其乐游戏科技有限公司 Prop recommendation method, recommendation device and computer-readable storage medium
CN111222053A (en) * 2019-11-27 2020-06-02 腾讯音乐娱乐科技(深圳)有限公司 Object recommendation method and device and related equipment
CN111310049A (en) * 2020-02-25 2020-06-19 腾讯科技(深圳)有限公司 Information interaction method and related equipment
CN111652673A (en) * 2020-05-09 2020-09-11 腾讯科技(深圳)有限公司 Intelligent recommendation method, device, server and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717098B (en) * 2019-09-20 2022-06-24 中国科学院自动化研究所 Meta-path-based context-aware user modeling method and sequence recommendation method
CN111569412B (en) * 2020-04-29 2023-08-15 炫彩互动网络科技有限公司 Cloud game resource scheduling method and device

Also Published As

Publication number Publication date
CN112221159A (en) 2021-01-15

Similar Documents

Publication Publication Date Title
CN112221159B (en) Virtual item recommendation method and device and computer readable storage medium
CN111931062A (en) Training method and related device of information recommendation model
CN107844784A (en) Face identification method, device, computer equipment and readable storage medium storing program for executing
CN110852419B (en) Action model based on deep learning and training method thereof
CN112052948B (en) Network model compression method and device, storage medium and electronic equipment
CN112116090A (en) Neural network structure searching method and device, computer equipment and storage medium
CN111494964B (en) Virtual article recommendation method, model training method, device and storage medium
CN110119477A (en) A kind of information-pushing method, device and storage medium
CN112417289A (en) Information intelligent recommendation method based on deep clustering
CN113609337A (en) Pre-training method, device, equipment and medium of graph neural network
CN112380453A (en) Article recommendation method and device, storage medium and equipment
Wang et al. Basketball shooting angle calculation and analysis by deeply-learned vision model
CN113761359A (en) Data packet recommendation method and device, electronic equipment and storage medium
CN115168720A (en) Content interaction prediction method and related equipment
CN115238169A (en) Mu course interpretable recommendation method, terminal device and storage medium
CN111046655A (en) Data processing method and device and computer readable storage medium
CN113537623A (en) Attention mechanism and multi-mode based dynamic service demand prediction method and system
Jin et al. Domain prompt tuning via meta relabeling for unsupervised adversarial adaptation
Gao Game-theoretic approaches for generative modeling
CN112084876A (en) Target object tracking method, system, device and medium
CN115168722A (en) Content interaction prediction method and related equipment
Hu et al. Learning multi-expert distribution calibration for long-tailed video classification
Lee et al. REINDEAR: REINforcement learning agent for Dynamic system control in Edge-Assisted Augmented Reality service
CN114462526A (en) Classification model training method and device, computer equipment and storage medium
CN115858911A (en) Information recommendation method and device, electronic equipment and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40037768)
GR01 Patent grant