CN113268629A

CN113268629A - Multi-label song list recommendation method based on a heterogeneous graph fusing node preference

Info

Publication number: CN113268629A
Application number: CN202110477214.8A
Authority: CN
Inventors: 王晨旭; 郭晨野; 杨煜; 索凯强
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2021-04-29
Filing date: 2021-04-29
Publication date: 2021-08-17
Anticipated expiration: 2041-04-29
Also published as: CN113268629B

Abstract

A multi-label song list recommendation method based on a heterogeneous graph fusing node preference comprises the following steps: constructing a song list heterogeneous graph from the heterogeneous data of the song list training set; performing neighbor sampling that fuses node preference on each song list through the song list heterogeneous graph to obtain song list information containing song neighbor features and song list information containing singer neighbor features; using word2vec to produce a continuous feature representation of each song list from the song list information containing song neighbor features and the song list information containing singer neighbor features; performing cluster analysis on the continuous feature representations with a spectral clustering algorithm to obtain a song list clustering result; and calculating the weight value of each navigation class of labels in each category from the clustering result, then completing label recommendation for the target song list using locality-sensitive hashing. The invention has a simple structure and high recommendation efficiency. Compared with the traditional collaborative filtering method, it recommends song list labels with higher accuracy and greater speed.

Description

Multi-label song list recommendation method based on a heterogeneous graph fusing node preference

Technical Field

The invention belongs to the field of music recommendation systems, and particularly relates to a multi-label song list recommendation method based on a heterogeneous graph fusing node preference.

Background

In recent years, the song list function offered by NetEase Cloud Music has broken away from the traditional mode of organizing songs by singer and album. A playback mode centered on the song list lets users discover and search for music through social sharing and personalized recommendation. Song list labels play an important role in improving the listening experience of online music users and in encouraging users to create personalized song lists. With the rapid development of big data, the characteristics of the songs in a song list can be inferred implicitly from a large number of song lists carrying expert labels, which makes song list label recommendation possible. Collaborative filtering is very common in music recommendation and is generally divided into three steps: data collection, similarity calculation, and output of the recommendation result. However, in the big data era, because song list data is high-dimensional and sparse and because collaborative filtering focuses excessively on interaction relationships, the traditional algorithm tends to recommend popular labels and takes a long time to produce recommendations, so it is difficult to apply in practice.

Disclosure of Invention

In order to overcome the problems in the prior art, the invention aims to provide a multi-label song list recommendation method based on a heterogeneous graph fusing node preference.

In order to achieve this purpose, the invention adopts the following technical scheme:

A multi-label song list recommendation method based on a heterogeneous graph fusing node preference comprises the following steps:

Step 1: constructing a song list heterogeneous graph from the heterogeneous data of the song list training set;

Step 2: performing neighbor sampling that fuses node preference on each song list through the song list heterogeneous graph, using a song-based meta-path and a singer-based meta-path, to obtain song list information containing song neighbor features and song list information containing singer neighbor features;

Step 3: using word2vec to produce a continuous feature representation of each song list from the song list information containing song neighbor features and the song list information containing singer neighbor features;

Step 4: performing cluster analysis on the continuous feature representations of the song lists with a spectral clustering algorithm to obtain a song list clustering result;

Step 5: calculating the weight value of each navigation class of labels in each category from the song list clustering result, then completing label recommendation for the target song list using locality-sensitive hashing.

Further, the specific process of step 1 is as follows:

Step 1.1: connect the song list node and the song node as follows: if the song list of the i-th song list L_i does not include song m_j, there is no edge between the song list node and the song node; otherwise, there is a song-song list edge between them.

Step 1.2: connect the song list node and the singer node as follows: if the singer list of the i-th song list L_i does not contain singer s_j, there is no edge between the song list node and the singer node; otherwise, there is a singer-song list edge between them.

Step 1.3: connect the song list node and the user node as follows: if the creator of the i-th song list L_i is not user u_j, there is no edge between the song list node and the user node; otherwise, there is a user-song list edge between them.

Further, the song-song list edge in step 1.1 is given by a formula defined in terms of the number of occurrences of song m_j, the total number of songs M_num, the song list heterogeneous graph G_L, and the song feature.

The singer-song list edge in step 1.2 is given by a formula defined in terms of the number of occurrences of singer s_j, the total number of singers S_num, and the singer feature.

The user-song list edge in step 1.3 is given by a formula defined in terms of the user feature.
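The graph construction of step 1 can be sketched in Python as follows. This is only an illustrative sketch: the input layout (`songs`, `singers`, `user` keys) and the function name are hypothetical, and because the edge formulas are not reproduced in this text, the song and singer edge weights are assumed here to be the occurrence count divided by the total count. Only the user edge weight of 1 is stated in the description.

```python
from collections import Counter, defaultdict


def build_song_list_graph(song_lists):
    """Build an undirected, weighted song list heterogeneous graph.

    song_lists maps a song list id to a dict with the keys "songs",
    "singers" (iterables of ids) and "user" (id of the creator).
    Returns an adjacency dict: node -> {neighbor: weight}, where a node
    is a (type, id) tuple.
    """
    song_counts = Counter(m for sl in song_lists.values() for m in set(sl["songs"]))
    singer_counts = Counter(s for sl in song_lists.values() for s in set(sl["singers"]))
    m_num = len(song_counts)    # total number of distinct songs (M_num)
    s_num = len(singer_counts)  # total number of distinct singers (S_num)

    graph = defaultdict(dict)

    def add_edge(a, b, weight):
        # Undirected graph: store the edge in both directions.
        graph[a][b] = weight
        graph[b][a] = weight

    for list_id, sl in song_lists.items():
        list_node = ("list", list_id)
        for m in set(sl["songs"]):
            # ASSUMPTION: preference weight = occurrences of the song / M_num.
            add_edge(list_node, ("song", m), song_counts[m] / m_num)
        for s in set(sl["singers"]):
            # ASSUMPTION: preference weight = occurrences of the singer / S_num.
            add_edge(list_node, ("singer", s), singer_counts[s] / s_num)
        # The description sets the user preference weight to 1.
        add_edge(list_node, ("user", sl["user"]), 1.0)
    return graph
```

A song that does not appear in a song list simply has no entry in that song list's adjacency dict, which mirrors the "no edge" case of steps 1.1 to 1.3.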

Further, the specific process of step 2 is as follows:

Step 2.1: if the number N_m of first-order neighbor song nodes owned by the song list node is greater than the set number N_ms of first-order song neighbors of a song list node, use the node preference of all first-order neighbor song nodes of the song list node as the relative selection weight, and randomly select N_ms first-order neighbor song nodes according to that weight as the alternative song neighbors.

Step 2.2: if the number N_m of first-order neighbor song nodes of the song list node is smaller than the set number N_ms, use the first-order neighbor song node preference of the song list node as the relative weight, randomly select (N_ms-N_m) first-order neighbor song nodes according to that weight, and combine them with the first-order neighbor song nodes of the song list node to form N_ms alternative song neighbors in total.

Step 2.3: from the alternative song neighbors, and according to the song-song list edges, calculate the first-order song list neighbor set of the target song list node and the song list node preference.

Step 2.4: according to the song list node preference weight, randomly select N_L first-order song list neighbors of the song list node to form the first-order song list neighbor set.

Step 2.5: using the song list neighbor set, calculate the node preference of the second-order song nodes of the song list node according to the following formula, then randomly select 2*N_L second-order neighbor songs according to that node preference.

Step 2.6: integrate the song list of the song list node by the following formula.

Step 2.7: repeat steps 2.1 to 2.6 for all song list nodes to sample the song list information containing song neighbor features.

Further, in step 2.3, the song list node preference is calculated from the intersection count of the song neighbors of the i-th song list L_i and the song neighbors of the j-th song list L_j.

Further, in step 2.6, the song list has a length of N_ms+2*N_L, where N_L represents the number of first-order song list neighbors.
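As an illustration of steps 2.1 and 2.2, the following Python sketch selects exactly N_ms alternative song neighbors for one song list node. The function names and the `neighbor_weights` input are hypothetical. The text does not say whether sampling is with or without replacement, so the sketch assumes sampling without replacement when there are too many neighbors and padding with preference-weighted repeats when there are too few; all weights are assumed to be positive.

```python
import random


def weighted_sample_without_replacement(items, weights, k, rng):
    """Draw k distinct items, each draw proportional to the remaining weights."""
    pool = list(zip(items, weights))
    chosen = []
    for _ in range(k):
        idx = rng.choices(range(len(pool)), weights=[w for _, w in pool], k=1)[0]
        chosen.append(pool.pop(idx)[0])
    return chosen


def sample_song_neighbors(neighbor_weights, n_ms, rng=None):
    """Return n_ms alternative song neighbors for one song list node.

    neighbor_weights maps each first-order neighbor song to its (positive)
    node preference, used as the relative selection weight.
    """
    rng = rng or random.Random()
    songs = list(neighbor_weights)
    if not songs:
        return []
    weights = [neighbor_weights[m] for m in songs]
    n_m = len(songs)
    if n_m > n_ms:
        # Step 2.1: more neighbors than needed, keep n_ms of them.
        return weighted_sample_without_replacement(songs, weights, n_ms, rng)
    # Step 2.2: too few neighbors, keep all of them and pad with
    # (n_ms - n_m) preference-weighted picks (ASSUMPTION: repeats allowed).
    extra = rng.choices(songs, weights=weights, k=n_ms - n_m) if n_m < n_ms else []
    return songs + extra
```

Fixing the output length at N_ms keeps every sampled song list the same size, which matches the stated total length of N_ms+2*N_L after the second-order songs are added.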

Further, the specific process of step 3 is as follows:

Step 3.1: initialize the word2vec language model Model_m for the song features of the song list and the word2vec language model Model_s for the singer features; the output vector length and the number of training iterations are input parameters of Model_m and Model_s.

Step 3.2: input the song list information containing song neighbor features into Model_m as sentences, and input the song list information containing singer neighbor features into Model_s as sentences.

Step 3.3: after Model_m and Model_s are trained, output the song feature vectorized representation set vec_m and the singer feature vectorized representation set vec_s of the song list node set; each song list node has one song feature vectorized representation and one singer feature vectorized representation.

Step 3.4: the vectorized representation of a song list node is the average of its song feature vectorized representation and its singer feature vectorized representation.

Further, the specific process of step 4 is as follows:

Step 4.1: initialize the clustering model Model_cluster according to the number of cluster categories n_cluster.

Step 4.2: input the continuous feature representations of the training set song lists into Model_cluster for cluster learning to obtain the clustering result of the song list training set.

Step 4.3: from the training set song list labels and the training set clustering result, calculate the navigation class weights of each category. The quantities involved are: the label combinations of the song lists L₁, L₂, ..., L_n; w_label, the weight of each navigation class in each category; c_i, the i-th cluster category, with five categories in total; and w_i, the navigation class weights under the i-th category, which consist of the weight values of the language, style, emotion, scene, and theme navigation classes. The weight value of the language navigation class is computed from the number of song lists of the language class in the i-th category and n_i, the total number of song lists in the i-th category.

Step 4.4: input the test set song list feature set into the clustering model Model_cluster and calculate the clustering result of the test set.

Further, the specific process of step 5 is as follows:

Step 5.1: group the song lists in the training set by the navigation class to which each training set label combination belongs, obtaining grouped song list sets.

Step 5.2: perform LSH/MinHash bucketing on the grouped song list sets to obtain the language hash bucket B_yz, theme hash bucket B_zt, scene hash bucket B_cj, style hash bucket B_fg, and emotion hash bucket B_qg. For each of these buckets, two kinds of data hash buckets are generated, one from the song sets of the song lists and one from the singer sets of the song lists: a navigation class song list-song hash bucket and a navigation class song list-singer hash bucket.

Step 5.3: take a target song list L_rec from the test set, look up its category among the test set song list categories, and obtain the weight value of each navigation class according to that category.

Step 5.4: map the song set and the singer set of the target song list respectively into the navigation class song list-song hash buckets and song list-singer hash buckets whose weight value w_i in that category is not zero, then search for the set Sim_ij of song lists similar to the target song list.

Step 5.5: according to the set Sim_ij of song lists similar to the target song list and the weight value w_ij, update the recommendation index of each label by the update formula, where the weight value w_ij takes one of two alternative values and t_i is the i-th label.

Step 5.6: finally, sort the label set R_Tag to be recommended by recommendation index from largest to smallest, and select the top N labels with the highest recommendation indexes as the final recommendation result.

Compared with the prior art, the invention has the following beneficial effects. Traditional collaborative filtering suffers from low recommendation efficiency and low recommendation accuracy and cannot take the relevance among song lists into account. To address these problems, the invention provides a multi-label song list recommendation method based on a heterogeneous graph fusing node preference, in which song list label recommendation consists of four main steps. First, a song list heterogeneous graph is built from the heterogeneous data of the song lists, and the neighbor information of each song list is extracted by a meta-path feature sampling method based on the node-preference heterogeneous graph, yielding song list information containing song neighbor features and song list information containing singer neighbor features. Second, word2vec is used to produce continuous feature representations of the song lists. Third, a spectral clustering algorithm computes the cluster category of each song list feature representation, and the weight value of each navigation class of labels in each category is calculated from the clustered categories. Finally, locality-sensitive hashing combines the song list set features containing song list neighbors with the category group features of the song list set to complete the final multi-label recommendation task, which improves the accuracy of the recommendation method. The invention has a simple structure and high recommendation efficiency. Compared with the traditional collaborative filtering method, it recommends song list labels with higher accuracy and greater speed.

Drawings

FIG. 1 is a diagram of a proposed model architecture.

Fig. 2 is an example of a song list heterogeneous graph.

Fig. 3 is a schematic diagram of the song neighbor feature sampling method for a song list node.

Fig. 4 is a diagram of the vectorized feature representation of a song list node.

Fig. 5 is a schematic diagram of the cluster analysis method for song list nodes.

FIG. 6 shows the navigation class proportions of each category in the song list clustering.

Fig. 7 is a hash bucket diagram of the training set song list-song sets.

Fig. 8 is a schematic diagram of the multi-label recommendation method based on the song list clustering result.

FIG. 9 shows the execution time and recommendation accuracy of the conventional method (TRAD) and the present invention (GRAC).

Fig. 10 shows the results of ablation experiments at two dataset scales with different optimization modes of the present invention (GRAC).

Detailed Description

The present invention will be described in detail below with reference to the accompanying drawings.

The invention discloses a multi-label song list recommendation method based on a heterogeneous graph fusing node preference. It uses crawled NetEase Cloud Music song list data on the scale of one hundred thousand song lists, divided at a ratio of 1:9 into test set song list heterogeneous data L_test and training set song list heterogeneous data L_train.

First, the training set song list heterogeneous data L_train is used to create a song list heterogeneous graph G_L, and the neighbor information of each song list is extracted by a meta-path feature sampling method based on the node-preference heterogeneous graph to obtain the song heterogeneous feature information and the singer heterogeneous feature information of the song list.

Second, the skip-gram algorithm of word2vec is used to produce continuous feature representations of the song lists.

Then a spectral clustering algorithm computes the cluster category of each song list feature representation, and the weight value w_label of each navigation class of labels in each category is calculated from the clustered categories of the training set song lists.

Finally, the song list set features containing song list neighbors are combined with the category group features of the song list set, and locality-sensitive hashing is used to complete the final multi-label song list recommendation task.

The method specifically comprises the following steps:

Step 1: construct the song list heterogeneous graph G_L from the heterogeneous data of the song list training set. The specific process is as follows:

The NetEase Cloud Music song list data set comprises a set L = {L₁, L₂, …, L_n} of n song lists. Each song list in the set contains three types of feature information, and each type is itself a set of corresponding features: song features, singer features, and user features. The features of a song list L_i in the set can be expressed as FV(L_i). The song list heterogeneous graph is described as follows:

The song list heterogeneous graph is an undirected graph, denoted G_L = {V, E}. It has four types of nodes, V = {V_L, V_m, V_s, V_u}, each of which is a set of nodes of that type: song list nodes, song nodes, singer nodes, and user nodes. It also has three types of edges, E = {E_lm, E_ls, E_lu}, each of which is a set of edges of that type: song list-song edges, song list-singer edges, and song list-user edges. Each edge also carries the preference information between its two nodes, which serves as the weight of the edge.

Step 1.1: given a song list node and a song node, connect them as follows: if the song list of the i-th song list L_i does not include song m_j, there is no edge between the song list node and the song node; otherwise there is an edge between them, namely a song-song list edge, defined in terms of the number of occurrences of song m_j and the total number of songs M_num.

Step 1.2: given a song list node and a singer node, connect them as follows: if the singer list of the i-th song list L_i does not contain singer s_j, there is no edge between the song list node and the singer node; otherwise there is an edge between them, defined in terms of the number of occurrences of singer s_j and the total number of singers S_num.

Step 1.3: given a song list node and a user node, connect them as follows: if the creator of the i-th song list L_i is not user u_j, there is no edge between the song list node and the user node; otherwise there is an edge between them. Because most users create only one to three song lists, the preference weight of user nodes in the heterogeneous graph is set to 1.

Step 2: through the song list heterogeneous graph G_L, use the song-based meta-path R_m and the singer-based meta-path R_s to perform neighbor sampling that fuses node preference on each song list, obtaining song list information containing song neighbor features and song list information containing singer neighbor features.

The specific process is as follows:

Step 2.1: given a song list node, if the number N_m of first-order neighbor song nodes it owns is greater than the set number N_ms of first-order song neighbors of a song list node, use the node preference of all first-order neighbor song nodes of the song list node as the relative selection weight and randomly select N_ms of them as the alternative song neighbors.

Step 2.2: if the number N_m of first-order neighbor song nodes of the song list node is smaller than the set number N_ms, use the first-order neighbor song node preference of the song list node as the relative weight to select additional song nodes, which together with the first-order song neighbor set form alternative song neighbors whose total number is still N_ms.

Step 2.3: from the alternative song neighbors, and according to the song-song list edges in the heterogeneous graph, calculate the first-order song list neighbor set of the target song list node and the song list node preference. The preference is calculated from the intersection count of the song neighbors of the i-th song list L_i and the song neighbors of the j-th song list L_j, namely the number of songs the two song lists have in common.

Step 2.4: according to the song list node preference weight, randomly select N_L first-order song list neighbors of the song list node to form the first-order song list neighbor set.

Step 2.5: using the song list neighbor set, calculate the first-order neighbor song node preference of each neighbor song list, namely the node preference of the second-order song nodes of the song list node. Finally, randomly select 2*N_L second-order neighbor songs according to that node preference.

Step 2.6: finally, integrate the song list of the song list node by the following formula; the resulting song list has a length of N_ms+2*N_L.

Step 2.7: apply steps 2.1 to 2.6 to all song list nodes to sample the song list information containing song neighbor features.
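Steps 2.3 and 2.4 can be illustrated with the short Python sketch below. The function names are hypothetical. The preference formula is not reproduced in this text, so the sketch assumes that the raw number of shared songs is used directly as the sampling weight and that neighbor song lists are drawn with replacement; either choice may differ from the patented formula.

```python
import random


def song_list_preferences(target_songs, candidate_lists):
    """Preference of each candidate song list = number of songs it shares
    with the target song list. Candidates with no shared song are omitted,
    since they are not reachable through a song-song list edge."""
    target = set(target_songs)
    prefs = {}
    for list_id, songs in candidate_lists.items():
        shared = len(target & set(songs))
        if shared > 0:
            prefs[list_id] = shared
    return prefs


def sample_neighbor_lists(prefs, n_l, rng=None):
    """Randomly pick n_l first-order song list neighbors, weighted by preference.

    ASSUMPTION: sampling is with replacement, so a strongly related
    song list can be selected more than once.
    """
    rng = rng or random.Random()
    if not prefs:
        return []
    ids = list(prefs)
    return rng.choices(ids, weights=[prefs[i] for i in ids], k=n_l)
```

For example, a target song list containing m1, m2, and m3 gives preference 2 to a candidate containing m2 and m3, and preference 1 to a candidate containing only m1, so the first candidate is twice as likely to be drawn.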

Step 3: use word2vec to produce continuous song list feature representations from the song list information containing neighbor features. The specific process is as follows:

Step 3.1: initialize the word2vec language model Model_m for the song features of the song list and the word2vec language model Model_s for the singer features; the output vector length and the number of iterations are input parameters of the two models Model_m and Model_s.

Each song list node has a song feature vectorized representation and a singer feature vectorized representation.

Step 3.4: the vectorized representation of a song list node is the average of its song feature vectorized representation and its singer feature vectorized representation.
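The averaging in step 3.4 can be written as a few lines of Python. This is a minimal sketch with a hypothetical function name; it assumes the two vectors have already been produced by the two trained word2vec models and have the same length, and it does not show the training itself.

```python
def song_list_vector(song_vec, singer_vec):
    """Element-wise average of the song feature vector and the singer
    feature vector of one song list node."""
    if len(song_vec) != len(singer_vec):
        raise ValueError("both feature vectors must have the same length")
    return [(a + b) / 2.0 for a, b in zip(song_vec, singer_vec)]
```

Averaging keeps the output dimension equal to the word2vec vector length, so the clustering step receives one fixed-size vector per song list.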

Step 4: use a spectral clustering algorithm to perform cluster analysis on the continuous feature representations of the song lists to obtain the song list clustering result. The specific process is as follows:

Step 4.3: from the training set song list labels and the training set clustering result, calculate the navigation class weights of each category. The quantities involved are: the label combinations of the song lists L₁, L₂, ..., L_n; w_label, the weight of each navigation class in each category; c_i, the i-th cluster category, with five categories in total; and w_i, the navigation class weights under the i-th category. The parameters in w_i are the weight values, under the i-th cluster category, of the language navigation class, the style navigation class, the emotion navigation class, the scene navigation class, and the theme navigation class. The weight is calculated in the same way for every navigation class; for the language navigation class it is computed from the number of song lists of the language class in the i-th category and n_i, the total number of song lists in the i-th category.

Step 4.4: input the test set song list feature set into the clustering model Model_cluster and calculate the clustering result of the test set.

Step 4.5: output the test set song list categories and the weight values w_label of the navigation classes in each category.
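The weight calculation of step 4.3 can be sketched as follows. The function name and input layout are hypothetical, and the sketch assumes that the weight of a navigation class in a category is the share of song lists in that category that carry at least one label of the class, which is how the description of the counts reads; the exact formula is not reproduced in this text.

```python
from collections import defaultdict

NAV_CLASSES = ("language", "style", "emotion", "scene", "theme")


def navigation_class_weights(cluster_of, nav_classes_of):
    """Compute w_i for every cluster category i.

    cluster_of maps a song list id to its cluster category.
    nav_classes_of maps a song list id to the set of navigation classes
    that its label combination belongs to.
    """
    totals = defaultdict(int)  # n_i: song lists per category
    counts = defaultdict(lambda: dict.fromkeys(NAV_CLASSES, 0))
    for list_id, category in cluster_of.items():
        totals[category] += 1
        for nav in nav_classes_of.get(list_id, ()):
            if nav in counts[category]:
                counts[category][nav] += 1
    # ASSUMPTION: weight = (song lists with the class) / (song lists in category).
    return {
        category: {nav: counts[category][nav] / totals[category] for nav in NAV_CLASSES}
        for category in totals
    }
```

A navigation class with weight zero in a category is skipped later when the target song list is mapped to hash buckets, so these weights also act as a filter on the search.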

Step 5: calculate the weight value of each navigation class of labels in each category from the song list clustering result, then complete label recommendation for the target song list using locality-sensitive hashing. The specific process is as follows:

Step 5.1: group the song lists in the training set by the navigation class to which each training set label combination belongs.

Step 5.3: take a target song list from the test set, find the category it belongs to, and obtain the weight value of each navigation class according to that category.

Step 5.4: map the song set and the singer set of the target song list respectively into the song list-song hash buckets and song list-singer hash buckets of that category, then search for the set Sim_ij of song lists similar to the target song list.

Step 5.5: according to the set Sim_ij of song lists similar to the target song list and the weight value w_ij, update the recommendation index of each label by the update formula, where the weight value w_ij takes one of two alternative values and t_i is the i-th label.

The labels are then sorted by recommendation index from largest to smallest, and the top N labels with the highest recommendation indexes are selected as the final recommendation result Rec_T, which lists the recommendation indexes of the first label, the second label, and so on up to the N-th label. N is a parameter that can be set according to actual needs.
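The bucketing, lookup, and ranking of step 5 can be sketched in Python with a small MinHash-based LSH. Everything named here is hypothetical: the number of hash functions, the band size, and the additive update of the recommendation index are assumptions, because the bucket formula and the update formula are not reproduced in this text. The sketch handles one navigation class and the song sets only; the singer sets would be handled the same way.

```python
import hashlib
from collections import defaultdict


def minhash_signature(items, num_hashes=16):
    """MinHash signature of a non-empty set of string ids."""
    return [
        min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{x}".encode(), digest_size=8).digest(), "big"
            )
            for x in items
        )
        for seed in range(num_hashes)
    ]


def band_keys(signature, rows_per_band=2):
    """Split a signature into bands; each band is one hash bucket key."""
    return [
        (start, tuple(signature[start:start + rows_per_band]))
        for start in range(0, len(signature), rows_per_band)
    ]


def build_buckets(song_sets):
    """Map every band key to the training song lists that fall into it."""
    buckets = defaultdict(set)
    for list_id, songs in song_sets.items():
        for key in band_keys(minhash_signature(songs)):
            buckets[key].add(list_id)
    return buckets


def recommend(target_songs, buckets, labels_of, weight, top_n):
    """Find similar song lists through the buckets and rank their labels."""
    similar = set()
    for key in band_keys(minhash_signature(target_songs)):
        similar |= buckets.get(key, set())
    index = defaultdict(float)
    for list_id in similar:
        for tag in labels_of[list_id]:
            # ASSUMPTION: each similar song list adds the navigation
            # class weight to the recommendation index of its labels.
            index[tag] += weight
    ranked = sorted(index.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]
```

Because only song lists that share a bucket with the target are compared, the lookup avoids the pairwise similarity computation over the whole training set that collaborative filtering would need.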

First, the crawled NetEase Cloud Music song list data is divided at a ratio of 1:9 into test set song lists L_test and training set song lists L_train. L_train is used to create the song list heterogeneous graph G_L, and the neighbor information of each song list is extracted by a meta-path feature sampling method based on the node-preference heterogeneous graph to obtain the song heterogeneous feature information and the singer heterogeneous feature information of the song list. The skip-gram algorithm of word2vec then produces continuous feature representations of the song lists. Next, a spectral clustering algorithm computes the cluster category of each song list feature representation, and the weight value w_label of each navigation class of labels in each category is calculated from the clustered categories of the training set song lists. Finally, the song list set features containing song list neighbors are combined with the category group features of the song list set, and locality-sensitive hashing completes the final multi-label recommendation task, achieving accurate recommendation of song list labels.

The invention provides a multi-label song list recommendation model based on a heterogeneous graph fusing node preference. Because song list data is sparse and the degree of association among song lists is uneven, the model exploits the associations among the multi-dimensional information of the song list data. It represents song list information with a continuous feature representation that contains neighbor features, built on the heterogeneous graph and word2vec; it analyzes the category clustering characteristics of the song lists with a spectral clustering algorithm and applies the song list category weights w_label; finally, it combines the song list set features containing the song list clusters with the category group features of the song list set and uses locality-sensitive hashing to complete the multi-label song list recommendation task. The recommendation accuracy of the proposed label recommendation model is greatly improved, and the low computational complexity of the model also improves recommendation efficiency to some extent. Compared with a recommendation model based on a deep learning algorithm, the proposed method is more compatible with the collaborative-filtering-based recommendation methods adopted by current online music platforms, so upgrading the system's recommendation algorithm carries lower cost and risk.

A node preference fused heteromorphic graph song list multi-label recommendation system comprises

The following are specific examples.

Example 1

TABLE 1 Experimental data set

Referring to Table 1, the invention first crawls 100,000 song lists from the NetEase Cloud Music platform. They carry 73 different labels (Tag) and 16449 different song list label combinations L_Tag. The song list data is divided at a ratio of 1:9 into test set song lists L_test and training set song lists L_train.

Referring to fig. 1, fig. 1 is the model architecture diagram of the multi-label song list recommendation method based on a heterogeneous graph fusing node preference provided by the invention. It mainly comprises heterogeneous graph construction with node-preference neighbor sampling, song list node feature representation and cluster analysis, and multi-label recommendation based on the song list clustering result.

Referring to fig. 2, which is an example of a song list heterogeneous graph, the invention takes the song information, singer information, and user information in the song list data as the heterogeneous feature information of a song list and constructs the song list heterogeneous graph from it. The NetEase Cloud Music song list data set comprises a set L = {L₁, L₂, …, L_n} of n song lists. Each song list in the set contains three types of feature information, and each type is itself a set of corresponding features: song features, singer features, and user features. The features of a song list L_i in the set can be expressed as FV(L_i). The heterogeneous graph is defined as follows: its node types are song list nodes, song nodes, singer nodes, and user nodes, and its edge types are song list-song edges, song list-singer edges, and song list-user edges.

The song list heterogeneous graph is constructed as follows:

Step 1: Given song list node

and song node

connect the two heterogeneous nodes according to the following formula: if the song list of L_i does not contain song m_j, there is no edge between the two nodes; otherwise there is an edge between them, denoted by

where

is the number of occurrences of song m_j and M_num is the total number of songs:

Step 2: Given song list node

and singer node

connect the two heterogeneous nodes according to the following formula: if the singer list of L_i does not contain singer s_j, there is no edge between the two nodes; otherwise there is an edge between them, denoted by

where

is the number of occurrences of singer s_j and S_num is the total number of singers:

Step 3: Given song list node

and user node

connect the two heterogeneous nodes according to the following formula: if the user of song list L_i is not u_j, there is no edge between the two nodes; otherwise there is a song list-user edge between them.
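
The three construction steps can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the edge-weight formulas are not reproduced in this text, so the sketch assumes that a song (or singer) edge weight is the number of song lists the song (or singer) occurs in divided by the total number of distinct songs M_num (or singers S_num), and that a user edge has weight 1.0. The function name `build_song_list_graph` and the field names `songs`, `singers` and `user` are hypothetical.

```python
from collections import Counter, defaultdict


def build_song_list_graph(song_lists):
    """Build weighted song list-song, song list-singer and song list-user edges.

    song_lists maps a song list id to a dict with the (hypothetical) keys
    "songs", "singers" and "user".
    """
    # Count in how many song lists each song and each singer occurs.
    song_count = Counter(m for sl in song_lists.values() for m in set(sl["songs"]))
    singer_count = Counter(s for sl in song_lists.values() for s in set(sl["singers"]))
    m_num, s_num = len(song_count), len(singer_count)

    edges = defaultdict(dict)  # edges[(kind, list_id)][node] = weight
    for list_id, sl in song_lists.items():
        for m in set(sl["songs"]):
            # ASSUMPTION: weight = occurrences of the song / total number of songs.
            edges[("song", list_id)][m] = song_count[m] / m_num
        for s in set(sl["singers"]):
            # ASSUMPTION: weight = occurrences of the singer / total number of singers.
            edges[("singer", list_id)][s] = singer_count[s] / s_num
        # ASSUMPTION: the user edge only records existence (weight 1.0).
        edges[("user", list_id)][sl["user"]] = 1.0
    return dict(edges)
```

A node pair with no entry in the returned mapping has no edge, which corresponds to the "does not contain" branch of each step.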

Fig. 3 is a schematic diagram of the heterogeneous-graph-based method for sampling the song neighbor features of song list nodes. Referring to Fig. 3, in the song list multi-label recommendation task, the more fully the heterogeneous information of a song list is used, the more accurately the association between song lists can be represented. The song list neighbor sampling method therefore takes the association between nodes into account through node preference, which makes the sampled neighbors more meaningful. In the NetEase Cloud song list heterogeneous graph G_L, sampling is performed along a song-based meta-path R_m and a singer-based meta-path R_s. The meta-paths are defined as follows:

Sampling along the song-based meta-path R_m and sampling along the singer-based meta-path R_s follow the same procedure, so the song list information containing singer neighbor features is obtained in the same way as the song list information containing song neighbor features. The method takes two input parameters: the number of first-order song neighbors N_ms of a song list node and the number of first-order song list neighbors N_L. The specific steps are as follows:

step 1: selecting a song list node

If the current song list node has a first-order neighbor song node

Number N_mGreater than N_msPreference of all first-order neighbor song nodes using the current song list node

As the selected relative weight value, N is randomly selected according to the weight_msOne first-order neighbor song node as alternative song neighbor

Step 2: if singing node

First order neighbor song node of

Number N_mLess than N_msThen its first order neighbor song node is preferred

As the relative weight value, randomly selecting (N) according to the weight_ms-N_m) The first-order neighbor song nodes and the current existing song node neighbor set form a total number of N_msAlternative song neighbors of

And step 3: by alternative song neighbors

According to song-singing edge in the heteromorphic graph

Calculating target song list node

First order song list neighbor set

The calculation formula of the singing order node preference is shown as follows,

is L_iSong neighbors and L_jThe number of intersections of the songs z, namely the number of the same songs of the two song lists;

and 4, step 4: according to singing list node preference

The weight randomly selects the singing list node

N of (A)_LNeighbor set of first order song list

And 5: neighbor set by song list

The node preference of the second-order song node of (2), as shown in the following formula, is finally passed through the node preference

Randomly selecting 2 x N_LA second-order neighbor song

Step 6: finally integrating song list nodes

Song list of

Is shown in the following formula, wherein

Has a length of N_ms+2*N_L。

And 7: all song list nodes are connected through the steps

A list of songs is sampled that contains the characteristics of the song's neighbors.
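
The sampling steps above can be sketched as follows. The sketch makes several assumptions that the text does not settle: first-order songs are drawn without repetition when enough neighbors exist; the top-up in Step 2 and the draws in Steps 4 and 5 are weighted draws with replacement; the song list preference is the raw number of shared songs; and the second-order song preference is the sum of edge preferences over the sampled neighbor song lists. All function names are hypothetical, and every song list is assumed to have at least one song with a positive preference.

```python
import random


def weighted_pick_distinct(weights, k, rng):
    """Draw up to k distinct keys, each draw proportional to the remaining weights."""
    pool = dict(weights)
    chosen = []
    while pool and len(chosen) < k:
        keys = list(pool)
        pick = rng.choices(keys, weights=[pool[x] for x in keys], k=1)[0]
        chosen.append(pick)
        del pool[pick]
    return chosen


def first_order_songs(prefs, n_ms, rng):
    """Steps 1-2: return exactly n_ms candidate song neighbors."""
    if len(prefs) >= n_ms:  # the N_m == N_ms case is treated like Step 1 here
        return weighted_pick_distinct(prefs, n_ms, rng)
    # Fewer neighbors than requested: keep them all, top up with weighted repeats.
    keys = list(prefs)
    extra = rng.choices(keys, weights=[prefs[x] for x in keys], k=n_ms - len(keys))
    return keys + extra


def sample_features(target, song_pref, n_ms, n_l, seed=0):
    """song_pref[list_id] maps each song of that song list to its edge preference."""
    rng = random.Random(seed)
    first = first_order_songs(song_pref[target], n_ms, rng)

    # Step 3: preference of another song list = number of shared songs.
    candidates = set(first)
    list_pref = {}
    for other, songs in song_pref.items():
        shared = len(candidates & set(songs))
        if other != target and shared:
            list_pref[other] = shared
    if not list_pref:
        return first  # no song list shares a song with the target

    # Step 4: draw N_L first-order song list neighbors by preference.
    neighbors = rng.choices(list(list_pref), weights=list(list_pref.values()), k=n_l)

    # Step 5: ASSUMPTION, second-order song preference = summed edge preference.
    second_pref = {}
    for other in neighbors:
        for song, w in song_pref[other].items():
            second_pref[song] = second_pref.get(song, 0.0) + w
    songs = list(second_pref)
    second = rng.choices(songs, weights=[second_pref[s] for s in songs], k=2 * n_l)

    # Step 6: the final list has length N_ms + 2 * N_L.
    return first + second
```

The same function can be reused for the singer-based meta-path by passing singer preferences instead of song preferences.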

Fig. 4 is a schematic diagram of the vectorized representation of song list node features. Referring to Fig. 4, to further characterize the song list data from the perspective of node features, the invention performs cluster analysis on the song list nodes containing neighbor information to obtain five groups of song lists. Because the song list features are discrete, they must first be given a continuous feature representation. The degrees of the nodes in the song list heterogeneous graph G_L follow a power-law distribution, and the node preferences conform to this distribution, so the song list features obtained by node-preference-based neighbor sampling also follow a power law. This distribution resembles the word frequency distribution of text in the word2vec model, so the embedded representation of the song list nodes is learned with the skip-gram model of word2vec. That is, the song list containing neighbors obtained by walk sampling from a song list node

and the singer list

are each treated as a sentence in the word2vec model, and the songs m_i and singers s_i in the lists are treated as words and fed into the skip-gram model to learn the vector representation of the song list nodes. The method takes two parameters, the output vector length size and the number of model iterations iter_num. The specific steps are as follows:

step 1: word2vec language Model for initializing song characteristics of song list_mWord2vec language Model for singer features_sThe length of the output vector and the iteration times of the model are input parameter values;

step 2: inputting the song feature set of the song list node as a sentence into the Model_mInputting the singer feature set of the singer node as a sentence into the Model_sPerforming the following steps;

and step 3: outputting a song feature vectorization representation set vec of a song single node set after the two models are respectively trained_mSinger feature vectorization representation set vec of singing order nodes_sAs shown in the following formula, wherein

For singing a song node

Is used for vectorizing the representation of the characteristics of the songs,

vectorized representation of singer's features:

and 4, step 4: singing list node

The vectorization expression of the method is shown in the following formula, and the characteristics of the song are taken to be vectorized and expressed

And singer feature vectorized representation

Average value of (d):
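
To illustrate how a sampled list becomes skip-gram training data, and how Step 4 combines the two embeddings, the following sketch generates (center, context) pairs from one "sentence" and averages two vectors element-wise. It does not train a model. The window size is an assumption, since the text only names the vector length size and the iteration count iter_num as parameters, and both function names are hypothetical.

```python
def skip_gram_pairs(sentence, window=2):
    """Return the (center, context) pairs a skip-gram model would be trained on."""
    pairs = []
    for i, center in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        pairs.extend((center, sentence[j]) for j in range(lo, hi) if j != i)
    return pairs


def average_vectors(vec_m, vec_s):
    """Step 4: element-wise mean of the song-based and singer-based vectors."""
    if len(vec_m) != len(vec_s):
        raise ValueError("both vectors must have the same length")
    return [(a + b) / 2 for a, b in zip(vec_m, vec_s)]


# A sampled song list acts as one sentence; its songs act as words.
pairs = skip_gram_pairs(["m1", "m2", "m3"], window=1)
```

With `window=1`, each song is paired only with its immediate neighbors in the sampled list, so songs that are frequently sampled together end up with similar embeddings.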

Table 2 NetEase Cloud Music song list tag categories

The song list multi-label recommendation task can be regarded as a multi-class classification task. As shown in Table 2, the tags of the NetEase Cloud song list data fall into five navigation classes, and each navigation class in turn contains multiple tag categories.

Fig. 5 is a schematic diagram of song list feature cluster analysis, and Fig. 6 shows the proportion of each navigation class within each cluster category. Referring to Figs. 5 and 6, cluster analysis is a data mining tool from machine learning that groups similar data into the same class and dissimilar data into different classes, so that implicit associations between classes can be discovered from the differences between them. The invention applies cluster analysis to the song list data, but does not stop once a song list has been assigned to a class. Instead, it calculates the weight value of each navigation class label within each cluster category for use in the subsequent recommendation. The reason is as follows. If the tag combination of a song list is ["lonely", "hurt", "thoughts"], its navigation class is ["emotion"], and assigning the song list to an "emotion" cluster is appropriate. If the tag combination is ["rock", "European and American", "excitement"], its navigation classes are ["style", "language", "emotion"]; the song list can be assigned to only one cluster, and no single assignment is appropriate. The invention therefore uses the clustering algorithm only to cluster the features of the song list training set, calculates the navigation class weights of each category, and then applies the same clustering model to the song list test set data. The clustering model takes the number of cluster categories n_cluster as its input parameter. The song list feature clustering steps are as follows:

step 1: initializing a clustering Model according to the clustering class number n _ cluster_cluster；；

Step 2: singing sheet training set data

Input into Model_clusterAnd (5) performing cluster learning.

And step 3: training set label based on song list

Clustering results with song list training set

Calculate the navigation class weight for each category, as shown in the following equation, w_labelRepresenting the weight of each navigational class in each category, c_iRepresents the ith cluster category, there are five categories in total, w_iIs the navigation class weight under the ith category, w_iThe parameters in the (i) th clustering category are respectively the weight values of the language navigation category, the style navigation category, the emotion navigation category, the scene navigation category and the theme navigation category under the ith clustering category, and the weight calculation mode of each navigation category is the same, wherein

The calculation method of (a) is shown in the following formula,

and 4, step 4: test set feature set of song list

And 5: output test set song list category

And weight values w of navigation classes in each class_label。
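
Step 3 can be illustrated with the sketch below, which takes already computed cluster assignments and derives one weight per navigation class per cluster. The weight formula is not reproduced in this text, so the sketch assumes a simple share: each song list counts once for every navigation class its tags touch, and the counts are normalized within the cluster. The names `navigation_weights`, `cluster_of`, `tags_of` and `nav_of_tag` are hypothetical.

```python
from collections import Counter, defaultdict

NAV_CLASSES = ("language", "style", "emotion", "scene", "theme")


def navigation_weights(cluster_of, tags_of, nav_of_tag):
    """Return weights[cluster][navigation class] for the training set.

    cluster_of: song list id -> cluster label from the clustering model
    tags_of:    song list id -> list of tags
    nav_of_tag: tag -> navigation class (as in Table 2)
    """
    counts = defaultdict(Counter)
    for list_id, cluster in cluster_of.items():
        # Each song list counts once per navigation class its tags touch.
        for nav in {nav_of_tag[t] for t in tags_of[list_id]}:
            counts[cluster][nav] += 1

    weights = {}
    for cluster, counter in counts.items():
        total = sum(counter.values())
        # ASSUMPTION: weight = share of the navigation class within the cluster.
        weights[cluster] = {nav: counter[nav] / total for nav in NAV_CLASSES}
    return weights
```

A navigation class that never occurs in a cluster receives weight 0.0, which lets the later recommendation step skip its hash buckets.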

Fig. 7 is a schematic diagram of hash bucket division of the song list-song sets in the training set, and Fig. 8 is a schematic diagram of the multi-label recommendation method based on the song list clustering result. Starting from the five navigation class tags of a song list, ["language", "scene", "theme", "style", "emotion"], the invention divides the training set song lists into five groups by navigation class, computes the song list hash buckets of each navigation class with the LSH/MinHash algorithm, and finally completes the label recommendation task in combination with the song list cluster analysis results. The method takes the song list training set data containing neighbor information

the test set data

the clustering results of the test set data

and the navigation class weights w_label as its four parameters. The algorithm steps are as follows:

step 1: training set according to song list

Grouping the singing lists in the training set according to the navigation category to which the label combination belongs;

step 2: using the grouping song list set to carry out LSH/MinHash barrel-dividing calculation to obtain B_yzLanguage type hash bucket, B_ztTopic hash bucket, B_cjScene type hash bucket, B_fgStyle class hash bucket and B_qgAnd the emotion type hash bucket, wherein two data hash buckets are respectively generated according to each navigation type of the song list song set and the song list singer set, and the formula is as follows: :

and step 3: from the test set

Take out a target song list L_recAnd clustering results in the test set

Find out its corresponding category

Obtaining the weight value of each navigation class according to the class

And 4, step 4: set of songs from a target song list

Harmony singer set

Respectively mapped to the affiliated categories

Middle weight

Navigation type song list-song hash bucket not equal to zero

And singing list-singer hash bucket

In the method, a song list set Sim similar to the target song list is retrieved from each hash bucket_ijIt is represented as follows:

and 5: according to the target songSingle retrieved similar song list set Sim_ijAnd a weight w_ijUpdating recommendation index for each tag

The update formula is as follows:

step 6: finally, the recommended label set R is taken_TagAccording to the recommended indexes of the labels

The invention has the following advantages:

Compared with the traditional collaborative filtering recommendation method, the algorithm achieves higher recommendation accuracy and faster recommendation, and the method is simple and effective.

Fig. 9 compares the heterogeneous-graph song list multi-label recommendation method fusing node preference of the invention (GRAC) with the traditional collaborative filtering recommendation method (TRAD) in terms of recommendation accuracy and recommendation efficiency. The recommendation method is evaluated by computing the F1 value from the precision Rec_p and the recall Rec_R and using the F1 value as the evaluation index of the recommendation. The definitions are as follows:

Precision:

$$Rec_p = \frac{R_T}{N_R}$$

where R_T is the number of correctly recommended tags and N_R is the number of recommended tags. The higher the precision, the larger the share of correctly recommended tags among all recommended tags.

Recall:

$$Rec_R = \frac{R_T}{N_T}$$

where R_T is the number of correctly recommended tags and N_T is the actual number of correct tags. The higher the recall, the larger the share of correctly recommended tags among all actually correct tags.

F1 value:

$$F1 = \frac{2 \cdot Rec_p \cdot Rec_R}{Rec_p + Rec_R}$$

The F1 value is the harmonic mean of precision and recall, so it takes the influence of both into account when evaluating model accuracy. The higher the F1 value, the more stable the overall accuracy of the recommendation model, and vice versa.
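
As a worked illustration of these three metrics, the sketch below evaluates one recommended tag set against the actual tags of a song list. The function name and the example tags are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here rather than something the text specifies.

```python
def precision_recall_f1(recommended, actual):
    """Compute precision, recall and F1 for one song list's tag recommendation."""
    recommended, actual = set(recommended), set(actual)
    r_t = len(recommended & actual)  # R_T: correctly recommended tags
    precision = r_t / len(recommended) if recommended else 0.0  # R_T / N_R
    recall = r_t / len(actual) if actual else 0.0               # R_T / N_T
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Four tags recommended, two of them correct, and both actual tags were found.
p, r, f1 = precision_recall_f1(
    ["rock", "lonely", "excitement", "night"], ["rock", "excitement"])
```

In this example R_T = 2, N_R = 4 and N_T = 2, so precision is 0.5, recall is 1.0 and F1 is 2/3.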

Table 3 Description of the experimental methods

Table 4 Description of the ablation experimental methods

Table 3 and Fig. 9 show the comparison between the invention and the traditional collaborative filtering method for song list multi-label recommendation on real song list data at different data set scales. The comparison shows that the invention improves the song list label recommendation task at each data scale tested, with better recommendation accuracy and better real-time performance. Table 4 and Fig. 10 verify the effectiveness of each optimization proposed by the invention at different data scales; the results show that each improvement contributes to the recommendation effect.

The invention studies in depth the application of heterogeneous graph information in recommendation scenarios and proposes a heterogeneous-graph multi-label recommendation method based on node preference. The method improves the expressive power of the target song list node through a song list heterogeneous information network and strengthens the association between the different types of song list data by means of information aggregation. The main work covers the following three aspects:
1) To make full use of the heterogeneous relations in the song list data, the method uses the relations among song lists, songs and singers to construct the song list heterogeneous graph. During graph construction, edge weights are established by calculating the preference between different nodes, so that the multi-dimensional preference information of each node pair is fully used. For the various feature types of the target song list, the method proposes a second-order neighbor random sampling method based on node preference, which constructs the node features of the target song list from the relations between the neighbor song lists and the target song list.
2) To improve the accuracy of calculating the similar song list set, the method performs cluster analysis on the song list data based on the neighbor feature representation and a spectral clustering algorithm, divides the song lists into five navigation classes, obtains the accuracy of the clustering result of each class, and applies the clustering result as a class weight in the downstream recommendation.
3) To optimize recommendation efficiency, after constructing the song list node features with the heterogeneous graph and clustering the nodes with the spectral clustering algorithm, the method further integrates a collaborative filtering algorithm based on locality-sensitive hashing to perform the final label recommendation task, which improves both the recommendation accuracy and the recommendation efficiency of the algorithm. Experimental results on real NetEase Cloud Music song list data show that, compared with the traditional collaborative filtering method, the invention greatly improves the recommendation effect and greatly reduces the average running time.

Claims

1. A heterogeneous-graph song list multi-label recommendation method fusing node preference, characterized by comprising the following steps:

2. The heterogeneous-graph song list multi-label recommendation method fusing node preference according to claim 1, wherein the specific process of step 1 is as follows:

Step 1.1: Connect, according to the following formula, the song list node

and the song node

If song list L_i does not contain song m_j, the song list node and the song node

have no edge between them; otherwise, the song list node

and the song node

have a song-song list edge between them

Step 1.2: Connect, according to the following formula, the song list node

and the singer node

If the singer list of song list L_i does not contain singer s_j, the song list node and the singer node

have no edge between them; otherwise, the song list node

and the singer node

have a singer-song list edge between them

Step 1.3: Connect, according to the following formula, the song list node

and the user node

If the user of the i-th song list L_i is not u_j, the song list node

and the user node

have no edge between them; otherwise, the song list node

and the user node

have a user-song list edge between them

3. The heterogeneous-graph song list multi-label recommendation method fusing node preference according to claim 2, wherein in step 1.1 the song-song list edge

is given by the following formula:

wherein

is a song feature;

in step 1.2, the singer-song list edge

is given by the following formula:

wherein

is a singer feature;

in step 1.3, the user-song list edge

is given by the following formula:

wherein

is a user feature.

4. The heterogeneous-graph song list multi-label recommendation method fusing node preference according to claim 2, wherein the specific process of step 2 is as follows:

Step 2.1: If song list node

all first-order neighbor song nodes of

Step 2.2: If song list node

first-order neighbor song node preference

together form a total of N_ms candidate song neighbors

Step 2.3: Using the candidate song neighbors

and the song-song list edges

compute, for the target song list node

its first-order song list neighbor set

and the song list node preference

Step 2.4: Using the song list node preferences

as weights, randomly select for song list node

N_L first-order song list neighbors, forming the first-order song list neighbor set

Step 2.5: Using the song list neighbor set

compute, according to the following formula, for song list node

the node preferences of its second-order song nodes

then, using the node preferences

as weights, randomly select 2 × N_L second-order neighbor songs

Step 2.6: Assemble, for song list node

the song list

Step 2.7: Repeat steps 2.1 to 2.6 for all song list nodes

to sample the song list information containing song neighbor features.

5. The heterogeneous-graph song list multi-label recommendation method fusing node preference according to claim 4, wherein in step 2.3 the song list node preference

is calculated by the following formula:

wherein

is the size of the intersection between the song neighbors of the i-th song list L_i and the song neighbors of the j-th song list L_j.

6. The method as claimed in claim 5, wherein in step 2.6, the song list is divided into different groups, and the different groups are selected according to the different groups

7. The heterogeneous-graph song list multi-label recommendation method fusing node preference according to claim 1, wherein the specific process of step 3 is as follows:

wherein

is, for song list node

the vectorized representation of its song features, and

is the vectorized representation of its singer features;

Step 3.4: For song list node

the vectorized representation

is expressed by the following formula:

8. The heterogeneous-graph song list multi-label recommendation method fusing node preference according to claim 1, wherein the specific process of step 4 is as follows:

Step 4.3: Based on the song list training set labels

and the song list training set clustering results

wherein

is the tag combination of song list L₁,

is the tag combination of song list L₂,

is the weight value of the language navigation class,

is the weight value of the style navigation class,

is the weight value of the emotion navigation class,

is the weight value of the scene navigation class,

is the weight value of the theme navigation class;

Step 4.4: Song list test set feature set

9. The heterogeneous-graph song list multi-label recommendation method fusing node preference according to claim 1, wherein the specific process of step 5 is as follows:

Step 5.1: From the song list training set