Disclosure of Invention
To solve the problems in the prior art, the invention provides a personalized scenic spot recommendation method of a multitask graph neural network, which comprises the following steps: acquiring user data in real time, and preprocessing the acquired user data; inputting the preprocessed data into a trained recommendation model to obtain a recommendation result; the recommendation model is composed of a user graph neural network, a scenic spot graph neural network, a recommendation network and cross units;
the process of training the recommendation model includes:
s1: acquiring original data, and preprocessing the original data; the original data comprises user attribute data, scenic spot attribute data and user-scenic spot interaction data;
s2: extracting a user characteristic set and a scenic spot characteristic set of the preprocessed data; constructing a user knowledge graph according to the user feature set, and constructing a scenic spot knowledge graph according to the scenic spot feature set;
s3: inputting the triples in the user knowledge graph into the user graph neural network for training, and learning the vector expression of the user in the user knowledge graph; inputting the triples in the scenic spot knowledge graph into the scenic spot graph neural network for training, and learning the vector expression of the scenic spot in the scenic spot knowledge graph;
s4: inputting the user potential features extracted by the user graph neural network and the scenic spot potential features extracted by the scenic spot graph neural network into the recommendation network through the cross units, to obtain a fused user potential feature vector and a fused scenic spot potential feature vector; forming a prediction score of the user for the scenic spot according to the fused user potential feature vector and the fused scenic spot potential feature vector;
s5: during training of the recommendation model, performing multi-task training on the score prediction task of the recommendation network, the representation learning task of the user graph neural network on the user knowledge graph, and the representation learning task of the scenic spot graph neural network on the scenic spot knowledge graph;
s6: calculating the loss function of the model during multi-task training, wherein the loss function comprises the scenic spot graph neural network loss, the user graph neural network loss, the recommendation network loss and the regularization term loss;
s7: finishing the training of the model when the value of the loss function reaches its minimum.
Preferably, the process of preprocessing the user data includes: cleaning the user data by deleting invalid data and abnormal data; and standardizing the cleaned data by z-score.
Preferably, the extracted user feature set comprises a biological attribute feature and a social attribute feature; the extracted scenic spot feature set comprises scenic spot resource features and scenic spot leading function features.
Preferably, the structure of the user graph neural network comprises two parts, namely a neural network of L fully connected layers and a neural network of H fully connected layers; the neural network of L fully connected layers serves as a feature extraction layer and is used for extracting the potential feature vectors of the head entities and relations in the user knowledge graph; the neural network of H fully connected layers takes the potential feature vectors of the head entity and the relation from the feature extraction layer and performs high-order feature combination to form a predicted tail entity; wherein L and H are model hyper-parameters.
Preferably, the structure of the scenic spot graph neural network is the same as that of the user graph neural network; the neural network of L fully connected layers is used for extracting the potential feature vectors of the head entities and relations in the scenic spot knowledge graph; the neural network of H fully connected layers takes the potential feature vectors of the head entity and the relation from the feature extraction layer and performs high-order feature combination to form a predicted tail entity; wherein L and H are model hyper-parameters.
Preferably, the structure of the recommendation network comprises two parts, namely a neural network of L fully connected layers and a neural network of H fully connected layers; the neural network of L fully connected layers is used for extracting the potential features of the user and the scenic spot input to the recommendation network, wherein the user corresponds to a head entity input to the user graph neural network, and the scenic spot corresponds to a head entity input to the scenic spot graph neural network; the neural network of H fully connected layers takes the potential feature vectors of the user and the scenic spot from the feature extraction layer, performs high-order feature combination, and predicts the score of the user for the scenic spot.
Preferably, the cross units comprise a user cross unit and a scenic spot cross unit; the user cross unit connects the feature extraction layers of the user graph neural network and the recommendation network, and fuses the features extracted for the same user by the user graph neural network and by the recommendation network through feature crossing and feature compression, to obtain a fused user potential feature vector; the scenic spot cross unit connects the feature extraction layers of the scenic spot graph neural network and the recommendation network, and fuses the features extracted for the same scenic spot by the scenic spot graph neural network and by the recommendation network through feature crossing and feature compression, to obtain a fused scenic spot potential feature vector.
Further, feature crossing and feature compression are expressed as:
feature crossing: $C_l = v_l e_l^{T}$
Preferably, the process of obtaining the personalized score of the user for the scenic spot comprises the following steps:
Preferably, the loss function is expressed as:
The beneficial effects of the invention are as follows:
1) The method constructs knowledge graphs from the attribute data of users and scenic spots, learns the feature expressions of the scenic spots and users in the knowledge graphs through graph neural networks, and introduces the knowledge information describing the scenic spots and users in the knowledge graphs into the recommendation network, so that the relation between user features and scenic spot features is learned accurately and the information in the data is fully mined;
2) The invention designs two cross units as the links between the scenic spot graph neural network and the recommendation network and between the user graph neural network and the recommendation network, and learns the potential interaction features of scenic spots and users across the connected networks. The multi-task alternating training mode enhances the extensibility of the model, helps avoid overfitting, and can effectively improve recommendation performance.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the accompanying drawings. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a personalized scenic spot recommendation method of a multitask graph neural network, which mainly comprises the following steps: acquiring and preprocessing data; selecting features to establish knowledge graphs of users and scenic spots; learning vector representations of the entity nodes and relations in the knowledge graphs by using graph neural networks; establishing a deep neural network from the historical interaction data of users with scenic spots, to predict a given user's personalized score for a given scenic spot; designing two cross units to connect the three networks, so that information in the knowledge graphs of the given user and the given scenic spot is effectively integrated into the recommendation system; and finally, training the three networks by multi-task alternating training over the score prediction task of the recommendation network and the representation learning tasks of the user knowledge graph and the scenic spot knowledge graph, to complete model optimization and form the personalized scores of users for scenic spots.
A personalized scenic spot recommendation method of a multitask graph neural network comprises the following steps: acquiring user data in real time, and preprocessing the acquired user data; inputting the preprocessed data into a trained recommendation model to obtain a recommendation result; the recommendation model is composed of a user graph neural network, a scenic spot graph neural network and a recommendation network, and a cross unit.
As shown in fig. 1, the process of training the recommendation model includes:
s1: acquiring original data and preprocessing the original data; the original data comprises user attribute data, scenic spot attribute data and user-scenic spot interaction data;
s2: extracting a user characteristic set and a scenic spot characteristic set of the preprocessed data; constructing a user knowledge graph according to the user feature set, and constructing a scenic spot knowledge graph according to the scenic spot feature set;
s3: inputting the (head entity, relation, tail entity) triples in the user knowledge graph into the user graph neural network for training, and learning the vector expression of the user in the user knowledge graph; inputting the (head entity, relation, tail entity) triples in the scenic spot knowledge graph into the scenic spot graph neural network for training, and learning the vector expression of the scenic spot in the scenic spot knowledge graph;
s4: designing cross units to fuse the user potential features extracted by the user graph neural network and the scenic spot potential features extracted by the scenic spot graph neural network into the recommendation network, to obtain a fused user potential feature vector and a fused scenic spot potential feature vector; forming a prediction score of the user for the scenic spot from the fused user potential feature vector and the fused scenic spot potential feature vector;
s5: in each round of training, performing multi-task training on the score prediction task of the recommendation network, the representation learning task of the user graph neural network on the user knowledge graph, and the representation learning task of the scenic spot graph neural network on the scenic spot knowledge graph; specifically, when a given task is trained, the parameters of the other two networks are kept unchanged and only the parameters of the network for that task are updated; the three tasks are trained alternately in sequence so that the parameters of all three networks are updated, until the model converges;
s6: the loss function guiding the whole personalized scenic spot recommendation method of the multitask graph neural network is the sum of the scenic spot graph neural network loss, the user graph neural network loss, the recommendation network loss and the regularization term loss.
User attribute data, scenic spot attribute data, and user-scenic spot interaction data are acquired in various ways. The acquisition modes include, but are not limited to, collecting tourism website data, volunteer data, public transportation data, climate website data, map software data, social software data and the like by methods such as web crawlers, event tracking (data buried points) and questionnaires. The acquired data are preprocessed, which includes cleaning the user data and deleting invalid data and abnormal data. Because the data come from diverse sources, z-score standardization is applied to eliminate differences in dimension (units) between variables from different sources and differences in value range between variables from the same source. The standardization formula is:
$z = \frac{x - u}{\sigma}$
where x represents the raw data, u represents the mean of the raw data, σ represents the standard deviation of the raw data, and z represents the processed data, whose mean is 0 and standard deviation is 1.
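As an illustration of the z-score standardization described above, the following is a minimal Python sketch using NumPy. The function name `z_score`, the sample values, and the handling of a constant column (returning zeros when the standard deviation is 0) are assumptions made for this example and are not specified in the text.

```python
import numpy as np

def z_score(x):
    """Standardize a 1-D array: z = (x - u) / sigma."""
    x = np.asarray(x, dtype=float)
    u = x.mean()        # mean of the raw data
    sigma = x.std()     # population standard deviation of the raw data
    if sigma == 0:      # assumption: a constant column maps to all zeros
        return np.zeros_like(x)
    return (x - u) / sigma

# Hypothetical raw values for one numeric attribute (e.g. user age).
ages = [20.0, 30.0, 40.0, 50.0]
z = z_score(ages)
```

After this transformation the column has mean 0 and standard deviation 1, so attributes collected from different sources become comparable in scale.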
A user feature set and a scenic spot feature set are selected from the preprocessed data.
The scenic spot features comprise scenic spot resource features and scenic spot leading function features, and the user features comprise biological attribute features and social attribute features.
The scenic spot resource features comprise natural tourism resources, such as landscape resources, geographical location resources, climate resources, greening resources and biological species resources; humanistic resources, such as religious culture resources, historical culture resources, folk custom resources and cultural relic resources; and social resources, such as scenic spot transportation resources, modern science and technology resources, modern architecture resources and surrounding supporting facility resources.
The dominant function features of a scenic spot comprise sightseeing scenic spots, vacation scenic spots, scientific investigation scenic spots, amusement scenic spots, ecological scenic spots, science and technology scenic spots, adventure scenic spots and the like.
The biological attribute features of the user include age, gender, height, race, weight, language, and physical and mental health.
The social attribute features of the user comprise educational background, occupation, marital and family status, relatives, city of residence, income, social status, religious belief, ethnicity and the like.
According to the characteristics, the knowledge graph is constructed in the form of triples (head entities, relations and tail entities).
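The construction of triples from a feature set can be sketched as follows. This is only an illustrative Python sketch: the helper `build_triples`, the entity identifiers, and the attribute names and values are hypothetical, and the text does not prescribe how attribute values are encoded as tail entities.

```python
def build_triples(entity_id, features):
    """Turn one entity's attribute dict into (head, relation, tail) triples."""
    triples = []
    for relation, value in features.items():
        # A multi-valued attribute yields one triple per value.
        values = value if isinstance(value, (list, tuple)) else [value]
        for tail in values:
            triples.append((entity_id, relation, str(tail)))
    return triples

# Hypothetical user with biological and social attribute features.
user_triples = build_triples(
    "user_1",
    {"age": "20-30", "gender": "female", "language": ["zh", "en"]},
)

# Hypothetical scenic spot with resource and dominant function features.
spot_triples = build_triples(
    "spot_7",
    {"dominant_function": "sightseeing",
     "resource": ["climate", "historical culture"]},
)
```

Here each user or scenic spot is the head entity, the attribute name is the relation, and the attribute value is the tail entity.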
As shown in FIG. 2 and FIG. 3, the user graph neural network and the scenic spot graph neural network take a head entity and a relation in the knowledge graph as input. Each network is trained with the objective of minimizing the distance between the predicted tail entity and the real tail entity t, finally obtaining the vector expressions of the entities in the user knowledge graph and the scenic spot knowledge graph. Both the user graph neural network and the scenic spot graph neural network comprise lower L fully connected layers, which extract the potential feature vectors of the head entities and relations in the user knowledge graph and the scenic spot knowledge graph, respectively.
A fully connected layer has the form $\sigma(Wx + b)$, where W is the weight parameter of the layer, b is the bias term parameter, and $\sigma(\cdot)$ is a nonlinear activation function. Both networks further comprise upper H fully connected layers, which are used for obtaining the vector expressions of the predicted tail entities in the user knowledge graph and the scenic spot knowledge graph, respectively.
In these expressions, || is the vector concatenation operator, and $w_L$ and $e_L$ are the potential feature vectors of the user and the scenic spot in the user knowledge graph and the scenic spot knowledge graph, respectively, after passing through the lower L fully connected layers.
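The structure described above (lower L fully connected layers for the head entity and the relation, concatenation, then upper H fully connected layers producing a predicted tail entity) can be sketched in NumPy as follows. This is a simplified sketch under stated assumptions: the tanh activation, the squared Euclidean distance, the random initialization, the layer sizes, and all function names are choices made for illustration, and the cross units that connect the lower layers to the recommendation network are omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, L, H = 8, 2, 2  # vector length and layer counts (model hyper-parameters)

def make_layers(n, in_dim, out_dim):
    """Create n fully connected layers; the first maps in_dim -> out_dim."""
    dims = [in_dim] + [out_dim] * n
    return [(rng.normal(0, 0.1, (dims[i + 1], dims[i])), np.zeros(dims[i + 1]))
            for i in range(n)]

def forward(layers, x):
    """Apply sigma(Wx + b) layer by layer (tanh is an assumed activation)."""
    for W, b in layers:
        x = np.tanh(W @ x + b)
    return x

head_layers = make_layers(L, d, d)     # lower L layers for the head entity
rel_layers = make_layers(L, d, d)      # lower L layers for the relation
top_layers = make_layers(H, 2 * d, d)  # upper H layers on the concatenation

def predict_tail(head, relation):
    h_L = forward(head_layers, head)
    r_L = forward(rel_layers, relation)
    return forward(top_layers, np.concatenate([h_L, r_L]))  # h_L || r_L

# Hypothetical embeddings of one (head, relation, tail) triple.
head, relation, true_tail = rng.normal(size=(3, d))
pred_tail = predict_tail(head, relation)

# Training would minimize a distance between predicted and real tail;
# squared Euclidean distance is assumed here.
kg_loss = float(np.sum((pred_tail - true_tail) ** 2))
```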
As shown in fig. 4, the recommendation network includes lower L feature extraction layers, in which the cross units learn the interaction feature vectors between the scenic spots and users in the knowledge graphs and the corresponding scenic spots and users in the recommendation network, so that information in the knowledge graphs is merged into the recommendation network. The recommendation network also comprises upper H fully connected layers to learn high-order combination features of users and scenic spots, and finally outputs the predicted score through a nonlinear activation function.
Here, $u_L$ and $v_L$ are the feature vectors of the user and the scenic spot, respectively, obtained after the lower L layers of cross units; they carry the potential interaction features between the knowledge graphs and the recommendation network.
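The upper part of the recommendation network can be sketched as follows. The text states only that H fully connected layers combine the user and scenic spot features and that a nonlinear activation produces the score; the concatenation of $u_L$ and $v_L$, the tanh hidden activation, the sigmoid output, and every name in this Python sketch are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
d, H = 8, 2  # vector length and number of upper layers (hyper-parameters)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# H hidden layers on the combined vector, then one scalar output unit.
dims = [2 * d] + [d] * H
hidden = [(rng.normal(0, 0.1, (dims[i + 1], dims[i])), np.zeros(dims[i + 1]))
          for i in range(H)]
w_out, b_out = rng.normal(0, 0.1, d), 0.0

def predict_score(u_L, v_L):
    """Combine fused user and scenic spot vectors into a score in (0, 1)."""
    x = np.concatenate([u_L, v_L])  # assumed way of combining u_L and v_L
    for W, b in hidden:
        x = np.tanh(W @ x + b)
    return float(sigmoid(w_out @ x + b_out))

# Hypothetical fused vectors produced by the lower L layers.
u_L, v_L = rng.normal(size=(2, d))
score = predict_score(u_L, v_L)
```

Ranking the scenic spots by this score for a fixed user would give that user's recommendation list.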
As shown in fig. 5, the proposed model involves two cross units that connect the lower L feature extraction layers of the three networks. The cross unit connecting the scenic spot graph neural network and the recommendation network takes as input the potential feature vector of the scenic spot in the previous layer of the recommendation network and the potential feature vector of the same scenic spot in the previous layer of the scenic spot graph neural network. Through two steps, feature crossing and feature compression, it learns high-order potential interaction features of the scenic spot between the recommendation network and the graph neural network, so that information about the scenic spot in the knowledge graph is introduced into the recommendation system. Feature crossing satisfies:
$C_l = v_l e_l^{T}$
where $e_l$ is the l-th layer potential feature vector of the scenic spot in the scenic spot graph neural network, and $v_l$ is the l-th layer potential feature vector of the scenic spot in the recommendation network.
The feature cross matrix $C_l$ is the result of pairwise crossing between the potential features of the scenic spot in the recommendation network and the potential features of the scenic spot in the knowledge graph.
Feature compression yields $e_{l+1}$, the (l+1)-th layer potential feature vector of the scenic spot in the scenic spot graph neural network, and $v_{l+1}$, the (l+1)-th layer potential feature vector of the scenic spot in the recommendation network. The compression step has trainable model parameters, and d is the length of the potential feature vector. The cross unit connecting the user graph neural network and the recommendation network has the same structure as the cross unit connecting the scenic spot graph neural network and the recommendation network.
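A cross unit of this kind can be sketched in NumPy as follows. The feature crossing step follows the formula $C_l = v_l e_l^{T}$ given above. The compression step is not reproduced in this text, so the form used below (projecting $C_l$ and its transpose back to length-d vectors with weight vectors and biases) is an assumption borrowed from the commonly used cross-and-compress design; the parameter names are hypothetical.

```python
import numpy as np

def cross_unit(v, e, params):
    """One cross unit layer: feature crossing, then feature compression."""
    # Feature crossing: C = v e^T, a d x d matrix of pairwise products.
    C = np.outer(v, e)
    # Feature compression (assumed form): project C and C^T back to R^d.
    v_next = C @ params["w_vv"] + C.T @ params["w_ev"] + params["b_v"]
    e_next = C @ params["w_ve"] + C.T @ params["w_ee"] + params["b_e"]
    return v_next, e_next, C

d = 4
rng = np.random.default_rng(2)
names = ("w_vv", "w_ev", "w_ve", "w_ee", "b_v", "b_e")
params = {k: rng.normal(0, 0.1, d) for k in names}

# Hypothetical layer-l vectors of the same scenic spot in the two networks.
v_l, e_l = rng.normal(size=(2, d))
v_next, e_next, C_l = cross_unit(v_l, e_l, params)
```

Stacking L such units gives the fused vectors that feed the upper layers of the recommendation network and of the graph neural network; the user cross unit would be applied in the same way to the user vectors.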
As shown in fig. 6, the three networks are alternately trained layer by layer, and the loss function is defined as the sum of the losses of the three networks together with a regularization term.
In the loss expression, a cross-entropy function is used, $\lambda_1$, $\lambda_2$ and $\lambda_3$ are hyper-parameters of the model, and $W_\theta$ is the regularization term parameter. During alternating training, the parameters of two of the networks are kept unchanged while the parameters of the remaining network are updated. The three networks are trained by multi-task alternating training over the score prediction task of the recommendation network and the representation learning tasks of the user knowledge graph and the scenic spot knowledge graph, which completes model optimization. Once the model parameters are determined, the personalized score of a specific user for each scenic spot can be obtained, and thus the personalized scenic spot recommendation list of the given user can be obtained.
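The accumulation of losses and the alternating schedule can be sketched as follows. The exact loss expression is not reproduced in this text, so the assignment of $\lambda_1$, $\lambda_2$ and $\lambda_3$ to the user graph loss, the scenic spot graph loss and an L2 regularization term, the use of binary cross-entropy for the recommendation loss, the task order, and all names in this Python sketch are assumptions for illustration.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy between labels and predicted scores."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1 - y_true) * np.log(1 - y_pred)))

def total_loss(rec_loss, user_kg_loss, spot_kg_loss, weights,
               lam1, lam2, lam3):
    """Sum of the three network losses plus an L2 regularization term."""
    reg = sum(float(np.sum(W ** 2)) for W in weights)
    return rec_loss + lam1 * user_kg_loss + lam2 * spot_kg_loss + lam3 * reg

def alternating_schedule(n_rounds,
                         tasks=("recommendation", "user_kg", "spot_kg")):
    """Yield (round, task to update, tasks whose parameters stay frozen)."""
    for round_idx in range(n_rounds):
        for task in tasks:
            frozen = tuple(t for t in tasks if t != task)
            yield round_idx, task, frozen

# Hypothetical labels, predicted scores, and graph losses.
y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.7])
rec = cross_entropy(y_true, y_pred)
loss = total_loss(rec, 0.5, 0.4, [np.ones((2, 2))],
                  lam1=0.1, lam2=0.1, lam3=0.01)
steps = list(alternating_schedule(2))
```

In each step of the schedule only the network belonging to the current task would receive a parameter update, while the two frozen networks are left unchanged, which mirrors the alternating training described above.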
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing associated hardware, and the program may be stored in a computer-readable storage medium, which may include: ROM, RAM, magnetic or optical disks, and the like.
The above embodiments further describe the objects, technical solutions and advantages of the invention in detail. It should be understood that the above embodiments are only preferred embodiments of the present invention and should not be construed as limiting the present invention; any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.