CN108334601B - Song recommendation method and device based on tag topic model and storage medium - Google Patents

Song recommendation method and device based on tag topic model and storage medium

Info

Publication number
CN108334601B
Authority
CN
China
Prior art keywords
song
theme
probability distribution
list
topic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810097213.9A
Other languages
Chinese (zh)
Other versions
CN108334601A (en
Inventor
黄安埠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Music Entertainment Technology Shenzhen Co Ltd
Original Assignee
Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Music Entertainment Technology Shenzhen Co Ltd filed Critical Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority to CN201810097213.9A priority Critical patent/CN108334601B/en
Publication of CN108334601A publication Critical patent/CN108334601A/en
Application granted granted Critical
Publication of CN108334601B publication Critical patent/CN108334601B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63Querying
    • G06F16/635Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a song recommending method, a song recommending device and a song recommending storage medium based on a tag topic model, wherein the method comprises the following steps: acquiring a song list set, wherein the song list set comprises a plurality of song lists, and each song list comprises theme information and a plurality of songs; constructing a tag set of the song list according to the theme information, wherein the tag set comprises at least one theme tag; assigning a topic tag within the set of tags to a song in the song list; acquiring the probability distribution of a new theme of the song; determining a target theme label distributed to the songs in the song list according to the new theme probability distribution; and generating a corresponding song recommendation list according to the target theme label distributed to the songs in the song list, and recommending the songs based on the song recommendation list. The scheme can improve the accuracy of song recommendation.

Description

Song recommendation method and device based on tag topic model and storage medium
Technical Field
The invention relates to the technical field of multimedia, in particular to a song recommendation method and device based on a tag topic model and a storage medium.
Background
With the rapid development of networks, daily life has become increasingly inseparable from them: listening to songs, watching videos, and reading news online are now everyday habits. Taking music as an example, the explosive growth of music data makes it harder and harder for users to pick out the music they like from a vast catalog, so proactively recommending music that users are likely to find interesting is a feasible and efficient solution.
The main recommendation schemes at present are collaborative filtering and LDA (Latent Dirichlet Allocation, a topic model). Collaborative filtering takes users' song-listening behavior as its input data source, so it is easily affected by play traffic, and the songs it pushes tend to be biased toward popular ones. The traditional LDA model can obtain a user's theme distribution and the song list of each theme, but during training it is very easily skewed by the corpus training data, which makes the recommendation results inaccurate.
Disclosure of Invention
The embodiment of the invention provides a song recommending method and device based on a tag topic model and a storage medium, which can greatly improve the accuracy of song recommendation.
The embodiment of the invention provides a song recommending method based on a tag topic model, which comprises the following steps:
acquiring a song list set, wherein the song list set comprises a plurality of song lists, and the song lists comprise subject information;
constructing a tag set of the song list according to the theme information, wherein the tag set comprises at least one theme tag;
assigning a topic tag within the set of tags to a song in the song list;
acquiring new theme probability distribution of the song, wherein the new theme probability distribution comprises the probability distribution of the song currently distributed to each theme label;
determining a target theme label distributed to the songs in the song list according to the new theme probability distribution;
and generating a corresponding song recommendation list according to the target theme label distributed to the songs in the song list, and recommending the songs based on the song recommendation list.
The embodiment of the invention also provides a song recommending device based on the label topic model, which comprises the following components:
a first obtaining unit configured to obtain a song list set, where the song list set includes a plurality of song lists, and the song lists include subject information;
the construction unit is used for constructing a label set of the song list according to the theme information, and the label set comprises at least one theme label;
an assigning unit, configured to assign the theme tags in the tag set to the songs in the song list;
a second obtaining unit, configured to obtain a new theme probability distribution of the song, where the new theme probability distribution includes a probability distribution that the song is currently allocated to each of the theme tags;
the determining unit is used for determining a target theme label distributed to the songs in the song list according to the new theme probability distribution;
and the recommending unit is used for generating a corresponding song recommending list according to the target theme label distributed to the songs in the song list and recommending the songs on the basis of the song recommending list.
In addition, the embodiment of the present invention further provides a storage medium, in which processor executable instructions are stored, and the processor provides the song recommendation method as described above by executing the instructions.
In the embodiment of the invention, a song list set is first obtained, where the song list set includes a plurality of song lists and each song list includes theme information. A tag set of the song list is then constructed according to the theme information, the tag set including at least one theme tag, and the theme tags in the tag set are assigned to the songs in the song list. Next, the new theme probability distribution of each song is obtained, and the target theme tag assigned to the songs in the song list is determined according to the new theme probability distribution. Finally, a corresponding song recommendation list is generated according to the target theme tags assigned to the songs in the song list, and songs are recommended based on the song recommendation list. By using the theme tags of the song lists, the embodiment of the invention converts the unsupervised LDA model into a supervised topic model for training and generates the final song recommendation list, thereby improving the accuracy of song recommendation.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic diagram of a scenario of a song recommendation method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a song recommendation method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the formation of a labelset provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a theme song distribution provided by an embodiment of the invention;
FIG. 5 is a probabilistic graphical model of a tag topic model provided by an embodiment of the invention;
FIG. 6 is a schematic diagram of a summary of training data provided by an embodiment of the invention;
FIG. 7 is a schematic illustration of a Gibbs sampling process provided by an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a song recommending apparatus according to an embodiment of the present invention;
FIG. 9 is another schematic structural diagram of a song recommending apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the figure is a schematic view of a scene of a song recommendation method provided in an embodiment of the present invention, where the scene may include a song recommendation apparatus, the song recommendation apparatus may be specifically integrated in a server and other network devices, and the server may be a server cluster formed by a plurality of servers, or a cloud computing service center.
As shown in FIG. 1, the scenario may include a server a, a server b, and a terminal c, where the terminal c may be a smartphone, a personal computer, or the like. For example, the server b first obtains a song list set from the server a, where the song list set includes a plurality of song lists and each song list includes theme information. The server b then constructs a tag set of the song list according to the theme information, the tag set including at least one theme tag, and assigns theme tags in the tag set to the songs in the song list. Next, it obtains the new theme probability distribution of each song, which includes the probability distribution of the song currently being assigned to each theme tag. Finally, it determines the target theme tag assigned to the songs in the song list according to the new theme probability distribution, generates a corresponding song recommendation list according to the target theme tags, and recommends songs based on the song recommendation list. Further, the server b may send the generated song recommendation list to the terminal c, so that it is recommended to the user for selection and listening. In addition, the scenario may include a plurality of clients.
The embodiment of the invention provides a song recommending method and device based on a tag topic model and a storage medium.
Wherein, the probability graph model of the label topic model is shown in fig. 5.
Embodiment 1
In the embodiment of the present invention, description will be made from the viewpoint of a song recommending apparatus, which may be specifically integrated in a server.
A song recommendation method based on a tag topic model comprises the following steps: acquiring a song list set, wherein the song list set comprises a plurality of song lists, and the song lists comprise subject information; constructing a tag set of the song list according to the theme information, wherein the tag set comprises at least one theme tag; assigning a topic tag within the set of tags to a song in the song list; acquiring new theme probability distribution of the song, wherein the new theme probability distribution comprises the probability distribution of the song currently distributed to each theme label; determining a target theme label distributed to the songs in the song list according to the new theme probability distribution; and generating a corresponding song recommendation list according to the target theme label distributed to the songs in the song list, and recommending the songs based on the song recommendation list.
Referring to fig. 2, fig. 2 is a flowchart illustrating a song recommendation method according to an embodiment of the present invention, where the method may include:
S101, acquiring a song list set, wherein the song list set comprises a plurality of song lists, and each song list comprises theme information.
It is understood that the song list set is obtained by the server through training on the music library; the server needs to select songs from the song list set and generate a personalized recommendation list, which is recommended to the user for listening, collecting, and the like.
Further, before acquiring the song list set, the method may further include: obtaining all songs in the music library and training on them to generate training data, the training data being the song list set.
Preferably, the song list set may consist of two parts, as shown in FIG. 6: high-quality song lists, about 900,000 in number, and artificial song lists, about 100 million in number, which are composed of music data that users have recently listened to and collected.
S102, constructing a label set of the song list according to the theme information, wherein the label set comprises at least one theme label.
In this embodiment, each song list corresponds to a tag set, and the tag set includes one or more theme tags. As shown in FIG. 3, theme tags such as "Cantonese", "classic", "vicissitudes", and "deep affection" are extracted from the song list to construct the tag set.
Each song in the song list corresponds to one topic label in the label set, and a most suitable topic label is selected for each song through training.
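The tag-set construction of S102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `song_lists` structure, the tag strings, and the helper names `build_tag_vocabulary` and `label_vector` are hypothetical and introduced only for this example. The sketch builds a global tag index and, for each song list, a 0-1 vector marking which theme tags belong to it.

```python
def build_tag_vocabulary(song_lists):
    """Map every distinct theme tag to a topic index (hypothetical helper)."""
    tags = sorted({tag for sl in song_lists for tag in sl["tags"]})
    return {tag: k for k, tag in enumerate(tags)}


def label_vector(song_list, tag_index):
    """Return a 0-1 vector of length K with a 1 at each tag of this song list."""
    vec = [0] * len(tag_index)
    for tag in song_list["tags"]:
        vec[tag_index[tag]] = 1
    return vec


# Toy data: the tags stand in for theme information extracted from each list.
song_lists = [
    {"tags": ["Cantonese", "classic"], "songs": ["s1", "s2", "s3"]},
    {"tags": ["classic", "rock"], "songs": ["s2", "s4"]},
]
tag_index = build_tag_vocabulary(song_lists)
labels = [label_vector(sl, tag_index) for sl in song_lists]
```

A vector of this kind can later act as the mask that keeps sampling inside the tag set of the song list.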
S103, allocating the theme tags in the tag set to the songs in the song list.
In a specific implementation process, based on Gibbs sampling, theme tags are drawn from the tag set and randomly assigned to each song in the song list corresponding to that tag set, where each song has one and only one theme tag.
The key to using the topic model for personalized recommendation is to regard a song list as a document, construct a song list-document model, and then construct the word frequency matrix of the documents according to the song list-document model.
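As a rough sketch of the word frequency matrix described above, each song list can be treated as a document and each song as a word. The function names and the toy song identifiers below are assumptions made for illustration; the source does not specify a data layout.

```python
def build_song_index(song_lists):
    """Assign a column index to every distinct song (hypothetical helper)."""
    songs = sorted({song for sl in song_lists for song in sl})
    return {song: t for t, song in enumerate(songs)}


def build_frequency_matrix(song_lists, song_index):
    """Row m counts how often each song occurs in song list (document) m."""
    doc = [[0] * len(song_index) for _ in song_lists]
    for m, sl in enumerate(song_lists):
        for song in sl:
            doc[m][song_index[song]] += 1
    return doc


# Toy corpus: two song lists, four distinct songs.
song_lists = [["s1", "s2", "s3"], ["s2", "s4", "s2"]]
song_index = build_song_index(song_lists)
doc = build_frequency_matrix(song_lists, song_index)
```

With real data on the scale the description mentions, a sparse representation would presumably be needed; the dense nested list is used here only to keep the example short.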
Optionally, before a theme tag is assigned to each song, each song list in the song list set may be regarded as a document, and a word frequency matrix doc is constructed and used as the input corpus; several statistics and the Dirichlet hyperparameters are then set.
Among these, the following statistics may be set:
$n_{m,z}$: the number of words in document $m$ assigned to topic $z$.
$n_{z,t}$: the number of times the word $t$ is assigned to topic $z$.
$z_{m,n}$: the topic assigned to the $n$-th word of document $m$ (so $z_{m,n} = k$ means that this word is assigned to topic $k$).
Then, the values of these statistics are initialized to 0.
Preferably, based on the topic model, two variables $\alpha$ and $\beta$ can be learned from the song list-document model, where $\alpha$ is the Dirichlet hyperparameter corresponding to the themes and $\beta$ is the Dirichlet hyperparameter corresponding to the songs.
When a theme $z$ is assigned to a song $t$ in the $m$-th song list, the matrices are initialized accordingly, that is:
$z_{m,t} = z$
$n_{m,z} = n_{m,z} + 1$
$n_{z,t} = n_{z,t} + 1$
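The initialization step can be sketched as below. This is an illustrative reading rather than a definitive implementation: the function name `initialize`, the nested-list layout, and the integer song and theme indices are assumptions. Each song receives a theme drawn at random from the tag set of its own song list, and the counters are incremented as in the three update rules above.

```python
import random


def initialize(docs, labels, num_topics, num_songs, rng):
    """Randomly assign each song a theme from its list's tag set and count."""
    n_mz = [[0] * num_topics for _ in docs]              # list-theme counts
    n_zt = [[0] * num_songs for _ in range(num_topics)]  # theme-song counts
    z = []                                               # theme of each song
    for m, songs in enumerate(docs):
        allowed = [k for k, flag in enumerate(labels[m]) if flag]
        z_m = []
        for t in songs:
            topic = rng.choice(allowed)
            z_m.append(topic)
            n_mz[m][topic] += 1
            n_zt[topic][t] += 1
        z.append(z_m)
    return z, n_mz, n_zt


# Toy data: docs hold song indices, labels are 0-1 tag vectors per song list.
docs = [[0, 1, 2], [1, 3]]
labels = [[1, 1, 0], [0, 1, 1]]
z, n_mz, n_zt = initialize(docs, labels, num_topics=3, num_songs=4,
                           rng=random.Random(0))
```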
S104, obtaining the new theme probability distribution of the song, wherein the new theme probability distribution comprises the probability distribution of the song currently distributed to each theme label.
In a specific implementation process, a new theme probability distribution of the song may be generated according to the theme tag distribution information of the remaining songs, and the specific steps are as follows:
and acquiring the theme probability distribution and the theme song probability distribution of the song list according to the theme label distribution information of the rest songs.
And generating new theme probability distribution of the song according to the theme probability distribution of the song list and the theme song probability distribution.
Specifically, the song list theme probability distribution $\theta$ and the theme song probability distribution $\varphi$ may be calculated according to the following formulas:
$$\theta_{m,z} = \frac{n_{m,z} + \alpha}{\sum_{z'} \left( n_{m,z'} + \alpha \right)}$$
$$\varphi_{z,t} = \frac{n_{z,t} + \beta}{\sum_{t'} \left( n_{z,t'} + \beta \right)}$$
where $n_{m,z}$ represents the number of songs in song list $m$ assigned to theme $z$, and $n_{z,t}$ represents the number of times song $t$ is assigned to theme $z$.
It can be understood that acquiring the song list theme probability distribution and the theme song probability distribution according to the theme tag assignment information of the remaining songs may include:
obtaining the current $n_{m,z}$ and $n_{z,t}$ and the preset $\alpha$ and $\beta$;
generating the song list theme probability distribution and the theme song probability distribution according to the current $n_{m,z}$ and $n_{z,t}$ and the preset $\alpha$ and $\beta$.
Preferably, once the song list theme probability distribution $\theta$ and the theme song probability distribution $\varphi$ have been generated, the new theme probability distribution of the song can be generated from $\theta$ and $\varphi$.
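The following sketch estimates the two distributions from the counters. The source formulas survive only as image placeholders, so it assumes the usual smoothed-count estimates from collapsed Gibbs sampling for LDA, in which each count has the Dirichlet hyperparameter added before a row is normalised. The helper name `smoothed_rows` and the toy counts are hypothetical.

```python
def smoothed_rows(counts, prior):
    """Add the Dirichlet prior to every count and normalise each row to 1."""
    result = []
    for row in counts:
        total = sum(row) + prior * len(row)
        result.append([(c + prior) / total for c in row])
    return result


# Toy counters: 2 song lists, 3 themes, 4 songs.
n_mz = [[2, 1, 0], [0, 1, 1]]                       # songs per theme, per list
n_zt = [[1, 0, 1, 0], [0, 2, 0, 0], [0, 0, 0, 1]]   # song counts per theme

theta = smoothed_rows(n_mz, prior=0.5)   # song list theme distribution
phi = smoothed_rows(n_zt, prior=0.1)     # theme song distribution
```

Because the prior is strictly positive, every theme keeps a non-zero probability even when its count is zero.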
Specifically, the new theme probability distribution of the song may be calculated according to the following formula:
$$p(z \mid d, w) \propto \theta_{m,z} \cdot \varphi_{z,t}$$
where $p(z \mid d, w)$ is a $K$-dimensional vector and $K$ is the total number of topics.
It should be noted that, before generating the new theme probability distribution of the song, the method may further include:
removing the theme tag previously assigned to the song and updating $n_{m,z}$ and $n_{z,t}$, namely:
$n_{m,z} = n_{m,z} - 1$
$n_{z,t} = n_{z,t} - 1$
S105, determining the target theme label distributed to the song in the song list according to the new theme probability distribution.
In a specific implementation process, a new theme can be sampled for the song according to the new theme probability distribution, and the sampling of the new theme is restricted to the tag set of the song list to which the song belongs.
Specifically, the new theme may be generated according to the following formula:
$z_{new} = \text{label} * \text{numpy.random.multinomial}(p(z \mid d, w))$
where label is a $K$-dimensional 0-1 vector, so the sampling result is confined to the original prior tags. After sampling, the new theme $z_{new}$ is obtained, and $z_{m,t}$, $n_{m,z}$, and $n_{z,t}$ are updated at the same time, namely:
$z_{m,t} = z_{new}$
$n_{m,z_{new}} = n_{m,z_{new}} + 1$
$n_{z_{new},t} = n_{z_{new},t} + 1$
The above process is one Gibbs sampling step; after the above operations have been performed on all songs in all the song lists, one iteration is completed. The specific process is shown in FIG. 7.
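One iteration of the procedure above might look like the sketch below. It rests on several assumptions: the conditional is taken as proportional to the smoothed list-theme count times the smoothed theme-song proportion, the restriction to the tag set is realised by offering only the allowed themes to the sampler, and the standard-library `random.choices` stands in for the `numpy.random.multinomial` call named in the text. The function name `gibbs_sweep` and the toy data are hypothetical.

```python
import random


def gibbs_sweep(docs, labels, z, n_mz, n_zt, alpha, beta, rng):
    """Resample the theme of every song once, in place (one iteration)."""
    num_songs = len(n_zt[0])
    for m, songs in enumerate(docs):
        allowed = [k for k, flag in enumerate(labels[m]) if flag]
        for n, t in enumerate(songs):
            # Remove the current assignment from the counters.
            old = z[m][n]
            n_mz[m][old] -= 1
            n_zt[old][t] -= 1
            # Unnormalised p(z | d, w), computed for the allowed themes only.
            weights = [
                (n_mz[m][k] + alpha)
                * (n_zt[k][t] + beta)
                / (sum(n_zt[k]) + beta * num_songs)
                for k in allowed
            ]
            new = rng.choices(allowed, weights=weights)[0]
            # Record the new assignment in the counters.
            z[m][n] = new
            n_mz[m][new] += 1
            n_zt[new][t] += 1


# Toy state: 2 song lists, 3 themes, 4 songs, counters consistent with z.
docs = [[0, 1, 2], [1, 3]]
labels = [[1, 1, 0], [0, 1, 1]]
z = [[0, 1, 0], [1, 2]]
n_mz = [[2, 1, 0], [0, 1, 1]]
n_zt = [[1, 0, 1, 0], [0, 2, 0, 0], [0, 0, 0, 1]]
gibbs_sweep(docs, labels, z, n_mz, n_zt, alpha=0.5, beta=0.1,
            rng=random.Random(0))
```

The normalising denominator of the list-theme term is the same for every theme of a given song list, so it is left out of the weights.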
The above process is repeated continuously. Perplexity is usually used to measure the topic model, and it is calculated as follows:
$$\text{perplexity} = e^{-\text{loglikelihood}/N}$$
where $N$ represents the number of songs contained in all the song lists, and loglikelihood is the log-likelihood, calculated by the formula:
$$\text{loglikelihood} = \sum_{m} \sum_{n} \log \Big( \sum_{z} \theta_{m,z} \, \varphi_{z,w_{m,n}} \Big)$$
where $w_{m,n}$ denotes the $n$-th song of song list $m$.
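A small sketch of the perplexity computation follows. It assumes the standard topic-model log-likelihood, in which the probability of each song is the sum over themes of the list-theme probability times the theme-song probability; the original formula is not recoverable from the text, so this form is an assumption, as are the function name and the toy distributions.

```python
import math


def perplexity(docs, theta, phi):
    """exp(-loglikelihood / N) over all songs in all song lists."""
    log_likelihood = 0.0
    num_songs = 0
    for m, songs in enumerate(docs):
        for t in songs:
            p = sum(theta[m][k] * phi[k][t] for k in range(len(phi)))
            log_likelihood += math.log(p)
            num_songs += 1
    return math.exp(-log_likelihood / num_songs)


# Toy check: with uniform distributions over 4 songs, every song has
# probability 0.25 under the model.
docs = [[0, 1, 2, 3]]
theta = [[0.5, 0.5]]
phi = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]
value = perplexity(docs, theta, phi)
```

A lower value indicates that the model assigns higher probability to the observed songs, which is why it can serve as a stopping signal for the iteration.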
In summary, the new theme solving process is as follows. During initialization, a theme is randomly assigned to each song in the song list, and $n_{m,z}$ and $n_{z,t}$ are then counted. In each round, $p(z \mid d, w)$ is calculated: the theme assignment of the current song is excluded, the probability distribution of the current song over the themes is estimated from the theme assignments of the other songs, and a new theme is sampled for the song according to this probability distribution. The theme of the next song is updated in the same way, until the song list theme probability distribution $\theta$ and the theme song probability distribution $\varphi$ are found to have converged, which means that the Markov chain has converged. Iteration then stops, and the parameters to be estimated, namely the song list theme probability distribution $\theta$ and the theme song probability distribution $\varphi$, are output. The theme of each song is obtained at the same time.
Therefore, through the above operations, a specific theme song distribution list can be obtained, and each theme tag may include one or more songs, and of course, there may be zero songs.
S106, generating a corresponding song recommendation list according to the target theme label distributed to the songs in the song list, and recommending the songs based on the song recommendation list.
In a specific implementation process, the steps may specifically include:
generating a user theme probability distribution and a theme song probability distribution based on the theme label finally distributed to the song;
and generating a song recommendation list according to the user theme probability distribution, the theme song probability distribution and a preset recommendation condition, and recommending songs based on the song recommendation list.
The preset recommendation condition may target niche songs and long-tail songs, where a long-tail song is a song with low demand and poor sales.
For example, the songs recommended by current recommendation schemes are generally popular ones, so songs that suit niche tastes are not recommended and are hard for users to find. With this scheme, niche and long-tail songs can be mined and discovered, and the recommended songs can suit users who like niche genres.
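The recommendation step is not specified in detail, so the sketch below is only one plausible reading. It assumes that a song is scored for a user by summing, over themes, the user theme probability times the theme song probability, and that the preset recommendation condition is modelled as a cap on play counts so that only niche or long-tail songs remain. The function name, the `max_plays` threshold, and the play counts are all hypothetical.

```python
def recommend(user_theta, phi, play_counts, heard, top_n, max_plays):
    """Rank unheard, low-play-count songs by sum_z theta[z] * phi[z][t]."""
    scored = []
    for t in range(len(play_counts)):
        # Skip songs already heard and songs too popular for the condition.
        if t in heard or play_counts[t] > max_plays:
            continue
        score = sum(user_theta[k] * phi[k][t] for k in range(len(phi)))
        scored.append((score, t))
    # Highest score first; ties broken by song index for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:top_n]]


user_theta = [0.7, 0.2, 0.1]          # user theme distribution (toy values)
phi = [
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.1, 0.4, 0.4],
    [0.25, 0.25, 0.25, 0.25],
]
play_counts = [100000, 50, 300, 20]   # hypothetical popularity data
recommended = recommend(user_theta, phi, play_counts, heard={1},
                        top_n=2, max_plays=1000)
```

In this toy run, song 0 is excluded as too popular and song 1 as already heard, leaving songs 2 and 3 ranked by score.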
As can be seen from the above, the song recommendation method according to the embodiment of the present invention first obtains a song list set, where the song list set includes a plurality of song lists and each song list includes theme information. It then constructs a tag set of the song list according to the theme information, the tag set including at least one theme tag, and assigns the theme tags in the tag set to the songs in the song list. Next, it obtains the new theme probability distribution of each song and determines the target theme tag assigned to the songs in the song list according to the new theme probability distribution. Finally, it generates a corresponding song recommendation list according to the target theme tags assigned to the songs in the song list and recommends songs based on the song recommendation list. By using the theme tags of the song lists, the embodiment of the invention converts the unsupervised LDA model into a supervised topic model for training and generates the final song recommendation list, thereby improving the accuracy of song recommendation.
Embodiment 2
The key to personalized recommendation with the topic model is to regard a song list as a document, with the songs in the song list being equivalent to words. Each song list usually has specific styles, such as different genres; these styles are the theme tags of the song list, and specific songs fall under each style (theme tag), as shown in FIG. 4.
To better explain the method described in the above embodiment, this embodiment takes the song list as a document by way of example.
S201, extracting training data.
The training data includes a plurality of documents, each document including a plurality of words and one or more topic tags. The topic tags of each document are extracted to form a tag set, and the tag set serves as the prior over tag topics; that is, when topic tags are assigned, each document is only assigned topic tags from its own tag set.
S202, constructing a word frequency matrix.
It is understood that after the word frequency matrix is constructed, the following statistics can be set and initialized to 0:
$n_{m,z}$: the number of words in document $m$ assigned to topic $z$.
$n_{z,t}$: the number of times the word $t$ is assigned to topic $z$.
$z_{m,n}$: the topic assigned to the $n$-th word of document $m$.
Then, a variable $\alpha$ and a variable $\beta$ are set, where $\alpha$ is the parameter of the prior Dirichlet distribution of document-topic and $\beta$ is the parameter of the prior Dirichlet distribution of topic-song.
S203, assigning topic tags.
Preferably, initialization is performed by assigning a topic tag to each word. Unlike LDA, when the tag topic model is initialized, topics are not assigned completely at random; instead, the topic tags of the document are used as prior data, and each word of the document is randomly assigned a tag from the document's tag set. When a topic tag $z$ is assigned to word $t$ in the $m$-th document, the matrices $z_{m,n}$, $n_{m,z}$, and $n_{z,t}$ are initialized as follows:
$z_{m,t} = z$
$n_{m,z} = n_{m,z} + 1$
$n_{z,t} = n_{z,t} + 1$
S204, iteration.
Specifically, in each iteration, topics are re-assigned by Gibbs sampling, and each word in each text is processed in turn. First, the topic assigned to the word is removed and $n_{m,z}$ and $n_{z,t}$ are modified so that:
$n_{m,z} = n_{m,z} - 1$
$n_{z,t} = n_{z,t} - 1$
The distribution of the new topic is then solved according to the following formula:
$$p(z \mid d, w) \propto \theta_{m,z} \cdot \varphi_{z,t}$$
where:
$$\theta_{m,z} = \frac{n_{m,z} + \alpha}{\sum_{z'} \left( n_{m,z'} + \alpha \right)}$$
$$\varphi_{z,t} = \frac{n_{z,t} + \beta}{\sum_{t'} \left( n_{z,t'} + \beta \right)}$$
$p(z \mid d, w)$ is a $K$-dimensional vector, where $K$ is the total number of topic tags. After the $p(z \mid d, w)$ vector is obtained, sampling can be performed according to this probability distribution. Unlike LDA, the tag topic model restricts sampling to the tag set of the corresponding document, namely:
$z_{new} = \text{label} * \text{numpy.random.multinomial}(p(z \mid d, w))$
Here, label is a $K$-dimensional 0-1 vector, so the sampling result is confined to the original tag set. After sampling, the new topic $z_{new}$ is obtained, and $z_{m,t}$, $n_{m,z}$, and $n_{z,t}$ are updated at the same time:
$z_{m,t} = z_{new}$
$n_{m,z_{new}} = n_{m,z_{new}} + 1$
$n_{z_{new},t} = n_{z_{new},t} + 1$
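One way to read the restriction to the document's tag set, offered as an interpretation rather than as the formula of the source, is that $p(z \mid d, w)$ is masked by the 0-1 vector label and renormalised, which yields a distribution supported only on the document's own tags. The symbol $\tilde{p}$ is notation introduced for this illustration.

```latex
\tilde{p}_k = \frac{\mathrm{label}_k \, p(z = k \mid d, w)}
                   {\sum_{j=1}^{K} \mathrm{label}_j \, p(z = j \mid d, w)},
\qquad k = 1, \dots, K
```

For example, with $K = 3$, $p = (0.5, 0.3, 0.2)$ and $\text{label} = (1, 0, 1)$, the masked distribution is $\tilde{p} = (5/7, 0, 2/7)$, so the second topic can never be sampled for this document.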
The above process is one Gibbs sampling step; after the above operations have been performed on all words in all documents, one iteration is completed, as represented in FIG. 7. The process of S204 is repeated. Perplexity is usually used to measure the topic model, and it is calculated as follows:
$$\text{perplexity} = e^{-\text{loglikelihood}/N}$$
where $N$ represents the number of words contained in all documents, and loglikelihood is the log-likelihood, calculated by the formula:
$$\text{loglikelihood} = \sum_{m} \sum_{n} \log \Big( \sum_{z} \theta_{m,z} \, \varphi_{z,w_{m,n}} \Big)$$
where $w_{m,n}$ denotes the $n$-th word of document $m$, and $\theta$ and $\varphi$ are given by the expressions above.
In summary, the sampling process is as follows. During initialization, a topic is randomly assigned to each word in a document, and $n_{m,z}$ and $n_{z,t}$ are then counted. In each round, $p(z \mid d, w)$ is calculated: the topic assignment of the current word is excluded, the probability distribution of the current word over the topics is estimated from the topic assignments of the other words, and a new topic is sampled for the word according to this probability distribution. The topic of the next word is updated in the same way, until the document topic probability distribution $\theta$ and the topic word probability distribution $\varphi$ are found to have converged, which means that the Markov chain has converged. Iteration then stops, and the parameters to be estimated, namely the document topic probability distribution $\theta$ and the topic word probability distribution $\varphi$, are output. The topic of each word is obtained at the same time.
The embodiment changes the unsupervised topic model into the supervised topic model for training by limiting the sampling of the topic model to the label set of the corresponding document. When the model is initialized, a more reasonable theme can be distributed to each word according to the prior label data, and errors can be gradually corrected in the training process, so that the finally obtained data is more accurate.
Embodiment 3
In order to better implement the above method, an embodiment of the present invention further provides a song recommendation apparatus based on a tag topic model, where the song recommendation apparatus may be specifically integrated in a server, such as a service server, and the like, where the meaning of a noun is the same as that in the song recommendation method described above, and specific implementation details may refer to the description in the method embodiment.
For example, as shown in fig. 8, the song recommending apparatus may include a first obtaining unit 301, a constructing unit 302, an assigning unit 303, a second obtaining unit 304, a determining unit 305, and a recommending unit 306, as follows:
(1) a first acquisition unit 301;
a first obtaining unit 301, configured to obtain a song list set, where the song list set includes a plurality of song lists, and the song lists include subject information.
The song list set is training data extracted from the server a. As shown in FIG. 6, the training data consists of two parts: high-quality song lists, about 900,000 in number, and artificial song lists, about 100 million in number, which are composed of music data that users have recently listened to and collected.
(2) A constructing unit 302;
a constructing unit 302, configured to construct a tag set of the song list according to the topic information, where the tag set includes at least one topic tag.
In this embodiment, each song list corresponds to a tag set, where the tag set includes one or more topic tags. As shown in fig. 3, topic tags such as "Cantonese", "classic", "vicissitudes" and "deep affection" are extracted from the song list to construct a tag set.
Each song in the song list corresponds to one topic tag in the tag set, and the most suitable topic tag is selected for each song through training.
(3) An assigning unit 303;
an assigning unit 303, configured to assign the theme tags in the tag set to the songs in the song list.
It is to be understood that the assigning unit 303 is specifically configured to assign a topic tag to each song in each song list. When topic tags are sampled from the multinomial distribution, the sampling is restricted to the prior tags of the song list, so that the unsupervised topic model becomes a supervised topic model.
(4) A second acquisition unit 304;
a second obtaining unit 304, configured to obtain a new theme probability distribution of the song, where the new theme probability distribution includes probability distributions that the song is currently allocated to the respective theme tags.
As shown in fig. 9, the second obtaining unit 304 may include:
a first generating subunit 3041, configured to obtain a theme probability distribution and a theme song probability distribution of the song list according to theme tag allocation information of remaining songs, where the remaining songs are songs in the song list other than the song.
Specifically, the first generating subunit 3041 may calculate the song list theme probability distribution θ and the theme song probability distribution φ according to the following formulas:
Figure RE-GDA0001679763150000142
Figure RE-GDA0001679763150000143
Wherein, n_{m,z} represents the number of songs in song list m assigned to theme z, and n_{z,t} represents the number of times song t is assigned to theme z.
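The formula images are not reproduced in this text. Assuming the usual smoothed-count estimates of an LDA model with symmetric Dirichlet hyperparameters α and β, which is consistent with the definitions of n_{m,z} and n_{z,t} above, the two distributions would take roughly the following form. The symbol φ for the theme song probability distribution and the summation indices z' and t' are assumptions introduced for this sketch.

```latex
\theta_{m,z} = \frac{n_{m,z} + \alpha}{\sum_{z'} \left( n_{m,z'} + \alpha \right)},
\qquad
\varphi_{z,t} = \frac{n_{z,t} + \beta}{\sum_{t'} \left( n_{z,t'} + \beta \right)}
```

Here z' would range over the theme tags in the tag set of song list m, and t' over all songs.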
A second generating subunit 3042, configured to generate a new theme probability distribution of the song according to the song list theme probability distribution and the theme song probability distribution.
Specifically, the second generating subunit 3042 may calculate the new topic probability distribution according to the following formula:
Figure RE-GDA0001679763150000144
Optionally, the second generating subunit 3042 may further be configured to:
obtain the current n_{m,z} and n_{z,t} and the preset α and β; and
generate the song list theme probability distribution and the theme song probability distribution according to the current n_{m,z} and n_{z,t} and the preset α and β.
For example, the first generating sub-unit 3041 may generate a song list topic probability distribution and a topic song probability distribution according to the current topic tag distribution information of the song, and the second generating sub-unit 3042 obtains the song list topic probability distribution and the topic song probability distribution from the first generating sub-unit 3041 to generate a new topic distribution probability of the song.
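How the two subunits could cooperate can be sketched as below. This is only an illustration under stated assumptions: the new theme probability of a song is taken to be proportional to the product of the song list theme probability and the theme song probability, both computed as smoothed counts; the function name, the dictionary layout and the sample counts are hypothetical, and the counts are assumed to already exclude the song being resampled.

```python
def new_theme_distribution(song, tags, n_mz, n_zt, num_songs, alpha, beta):
    """Return {tag: probability} for one song, restricted to the list's tag set.

    n_mz: {tag: number of songs in the current song list assigned to tag}
    n_zt: {tag: {song: number of times the song was assigned to tag}}
    """
    theta_denominator = sum(n_mz.get(z, 0) + alpha for z in tags)
    weights = {}
    for z in tags:
        song_counts = n_zt.get(z, {})
        # Song list theme probability (role of the first generating subunit).
        theta = (n_mz.get(z, 0) + alpha) / theta_denominator
        # Theme song probability (role of the first generating subunit).
        phi = (song_counts.get(song, 0) + beta) / (
            sum(song_counts.values()) + num_songs * beta
        )
        # Combination step (role of the second generating subunit).
        weights[z] = theta * phi
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}


dist = new_theme_distribution(
    song="a",
    tags=["classic", "rock"],
    n_mz={"classic": 3, "rock": 1},
    n_zt={"classic": {"a": 2, "b": 1}, "rock": {"a": 0, "b": 1}},
    num_songs=2,
    alpha=1.0,
    beta=1.0,
)
```

With these sample counts the song leans strongly towards "classic", because that tag dominates both the song list and the song's own assignment history.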
(5) A determination unit 305;
a determining unit 305, configured to determine a target topic tag to which the song in the song list is assigned according to the new topic probability distribution.
It is to be understood that the determining unit 305 iterates over each song in each song list based on Gibbs sampling: in each round it calculates the current new theme probability distribution and then samples a new theme tag according to that distribution. The iteration continues until the song list theme probability distribution θ and the theme song probability distribution φ have converged, that is, until the Markov chain has converged. The iteration then stops, and the estimated parameters θ and φ are output. The theme tag of each song is obtained at the same time.
Specifically, as shown in fig. 9, the determining unit 305 may include:
a cycle subunit 3061, configured to select a corresponding topic tag from the tag set according to the new topic probability distribution, and return to perform the step of assigning the selected topic tag to a corresponding song in the song list until a preset condition is met.
A determination subunit 3062, configured to use the theme tag ultimately assigned to the song in the song list as the target theme tag of the song.
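The select-and-reassign cycle can be sketched as a small label-restricted Gibbs sampler. This is a sketch under stated assumptions rather than the patented implementation: the preset condition is assumed to be a fixed number of sweeps, the initial assignment is a uniform draw from each list's tag set, and the function name, data layout and default hyperparameters are hypothetical.

```python
import random
from collections import defaultdict


def assign_theme_tags(song_lists, tag_sets, alpha=0.1, beta=0.01, sweeps=20, seed=0):
    """Label-restricted Gibbs sampling; returns one theme tag per song per list."""
    rng = random.Random(seed)
    num_songs = len({t for songs in song_lists for t in songs})
    n_mz = defaultdict(int)  # (list index, tag) -> songs in the list given that tag
    n_zt = defaultdict(int)  # (tag, song) -> times the song was given that tag
    n_z = defaultdict(int)   # tag -> total assignments to that tag

    def bump(m, z, t, delta):
        n_mz[m, z] += delta
        n_zt[z, t] += delta
        n_z[z] += delta

    # Initial assignment: each song gets a random tag from its own list's tag set.
    assignment = []
    for m, songs in enumerate(song_lists):
        row = [rng.choice(tag_sets[m]) for _ in songs]
        for t, z in zip(songs, row):
            bump(m, z, t, +1)
        assignment.append(row)

    for _ in range(sweeps):  # assumed preset condition: a fixed number of sweeps
        for m, songs in enumerate(song_lists):
            tags = tag_sets[m]
            for i, t in enumerate(songs):
                bump(m, assignment[m][i], t, -1)  # remove the previous tag
                # Unnormalized theta * phi; theta's denominator is the same for
                # every tag of this list, so it cancels when sampling.
                weights = [
                    (n_mz[m, z] + alpha) * (n_zt[z, t] + beta)
                    / (n_z[z] + num_songs * beta)
                    for z in tags
                ]
                z_new = rng.choices(tags, weights=weights, k=1)[0]
                assignment[m][i] = z_new
                bump(m, z_new, t, +1)  # record the newly selected tag
    return assignment


song_lists = [["a", "b", "c"], ["b", "d"]]
tag_sets = [["classic", "Cantonese"], ["rock"]]
final_tags = assign_theme_tags(song_lists, tag_sets)
```

The tags left in `final_tags` after the last sweep play the role of the target theme tags. A convergence check on θ and φ could replace the fixed sweep count.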
(6) A recommending unit 306;
a recommending unit 306, configured to generate a corresponding song recommendation list according to the target topic tag assigned to the song in the song list, and recommend songs based on the song recommendation list.
As shown in fig. 9, the recommending unit 306 may include:
a third generating subunit 3061, configured to generate a user topic probability distribution and a topic song probability distribution based on the topic tags finally assigned to the songs;
and a recommending subunit 3062, configured to generate a song recommendation list according to the user topic probability distribution, the topic song probability distribution, and a preset recommendation condition, and recommend songs based on the song recommendation list.
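One way the recommending subunit might combine the two distributions is sketched below. The scoring rule (the sum over themes of user theme probability times theme song probability), the use of a top-k cut-off and an exclusion set as the preset recommendation condition, and all names are assumptions made for illustration.

```python
def recommend(user_theme_probs, theme_song_probs, top_k=3, exclude=()):
    """Rank songs by sum over themes of P(theme | user) * P(song | theme)."""
    scores = {}
    for theme, p_theme in user_theme_probs.items():
        for song, p_song in theme_song_probs.get(theme, {}).items():
            if song in exclude:
                continue  # e.g. songs the user has already collected
            scores[song] = scores.get(song, 0.0) + p_theme * p_song
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [song for song, _ in ranked[:top_k]]


user_theme_probs = {"classic": 0.75, "rock": 0.25}
theme_song_probs = {
    "classic": {"a": 0.5, "b": 0.5},
    "rock": {"b": 0.25, "c": 0.75},
}
playlist = recommend(user_theme_probs, theme_song_probs, top_k=2)
```

In this example song "b" ranks first because it receives weight from both themes, followed by "a".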
Fig. 9 may also be referred to as another schematic structural diagram of the song recommendation apparatus. Specifically, the apparatus may further include:
an updating unit 307, configured to update n_{m,z} and n_{z,t} based on the current theme tag assignment information of the song; and
a clearing unit 308, configured to remove the theme tag previously assigned to the song and update n_{m,z} and n_{z,t}.
In a specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and the specific implementation of the above units may refer to the foregoing method embodiments, which are not described herein again.
As can be seen from the above, the song recommending device according to the embodiment of the present invention first obtains a song list set, where the song list set includes a plurality of song lists and each song list includes theme information. It constructs a tag set of the song list according to the theme information, the tag set including at least one theme tag, and assigns the theme tags in the tag set to the songs in the song list. It then obtains a new theme probability distribution of the song, determines the target theme tag assigned to the song in the song list according to the new theme probability distribution, and finally generates a corresponding song recommendation list according to the target theme tags and recommends songs based on that list. In the embodiment of the invention, the theme tags of the song lists are used to convert the unsupervised LDA model into a supervised topic model for training, and the final song recommendation list is generated from it, which improves the accuracy of song recommendation.
Example four,
Correspondingly, an embodiment of the present invention further provides a server, as shown in fig. 10, which is a schematic structural diagram of the server provided in the embodiment of the present invention, specifically:
The server may include components such as a processor 401 having one or more processing cores, a memory 402 comprising one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the server structure shown in fig. 10 does not limit the server, which may include more or fewer components than shown, combine some components, or use a different arrangement of components. Wherein:
the processor 401 is a control center of the server, connects various parts of the entire server using various interfaces and lines, performs various functions of the server and processes data by running or executing software programs and/or sub-units stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the server. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and sub-units, and the processor 401 executes various functional applications and data processing by running the software programs and sub-units stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to the use of the server, and the like. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 access to the memory 402.
The server further includes a power supply 403 for supplying power to each component, and preferably, the power supply 403 may be logically connected to the processor 401 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The server may also include an input unit 404, the input unit 404 being operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the server may further include a display unit and the like, which will not be described in detail herein. Specifically, in this embodiment, the processor 401 in the server loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application program stored in the memory 402, thereby implementing various functions as follows:
acquiring a song list set, wherein the song list set comprises a plurality of song lists, and the song lists comprise theme information;
constructing a tag set of the song list according to the theme information, wherein the tag set comprises at least one theme tag;
assigning a topic tag within the set of tags to a song in the song list;
acquiring new theme probability distribution of the song, wherein the new theme probability distribution comprises the probability distribution of the song currently distributed to each theme label;
determining a target theme label distributed to the songs in the song list according to the new theme probability distribution;
and generating a corresponding song recommendation list according to the target theme label distributed to the songs in the song list, and recommending the songs based on the song recommendation list.
The server can achieve the beneficial effects of any of the song recommending devices provided by the embodiments of the present invention, which are detailed in the foregoing embodiments and are not described herein again.
The server of the embodiment of the invention first obtains a song list set, where the song list set includes a plurality of song lists and each song list includes theme information. It constructs a tag set of the song list according to the theme information, the tag set including at least one theme tag, and assigns the theme tags in the tag set to the songs in the song list. It then obtains a new theme probability distribution of the song, determines the target theme tag assigned to the song in the song list according to the new theme probability distribution, and finally generates a corresponding song recommendation list according to the target theme tags and recommends songs based on that list. In the embodiment of the invention, the theme tags of the song lists are used to convert the unsupervised LDA model into a supervised topic model for training, and the final song recommendation list is generated from it, which improves the accuracy of song recommendation.
In the above embodiments, the descriptions of the embodiments have respective emphasis, and parts that are not described in detail in a certain embodiment may refer to the above detailed description of the song recommendation method, and are not described herein again.
It should be noted that, as a person skilled in the art will understand, all or part of the process of implementing the song recommendation method described in the embodiment of the present invention may be completed by a computer program controlling related hardware. The computer program may be stored in a computer-readable storage medium, such as a memory of a terminal, and executed by at least one processor in the terminal, and its execution may include the process of the embodiment of the song recommendation method. The storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
For the song recommending apparatus according to the embodiment of the present invention, each functional module may be integrated into one processing chip, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium, such as a read-only memory, a magnetic or optical disk, or the like.
The method and the device for recommending songs based on the tag topic model provided by the embodiment of the invention are described in detail, a specific example is applied in the text to explain the principle and the implementation of the invention, and the description of the embodiment is only used for helping to understand the method and the core idea of the invention; meanwhile, for those skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (11)

1. A song recommendation method based on a tag topic model is characterized by comprising the following steps:
acquiring a song list set, wherein the song list set comprises a plurality of song lists, and each song list comprises theme information and a plurality of songs;
constructing a tag set of the song list according to the theme information, wherein the tag set comprises at least one theme tag;
assigning a topic tag within the set of tags to a song in the song list;
acquiring new theme probability distribution of the song, wherein the new theme probability distribution comprises the probability distribution of the song currently distributed to each theme label;
determining a target theme label distributed to the songs in the song list according to the new theme probability distribution;
and generating a corresponding song recommendation list according to the target theme label distributed to the songs in the song list, and recommending the songs based on the song recommendation list.
2. The song recommendation method of claim 1, wherein determining the target topic label to which the song in the song list is assigned according to the new topic probability distribution comprises:
selecting corresponding theme tags from the tag set according to the new theme probability distribution, and returning to execute the step of distributing the selected theme tags to corresponding songs in the song list until a preset condition is met;
and taking the theme label finally allocated to the song in the song list as the target theme label of the song.
3. The song recommendation method of claim 1, wherein said obtaining a new theme probability distribution for said song comprises:
obtaining the current n_{m,z} and n_{z,t} and the preset α and β;
calculating the probability distribution of the theme of the song list and the probability distribution of the theme song according to the following formulas:
Figure FDA0002691774890000011
Figure FDA0002691774890000021
wherein n_{m,z} represents the number of songs in song list m assigned to theme tag z, n_{z,t} represents the number of times song t is assigned to theme tag z, α represents the Dirichlet hyperparameter corresponding to the themes, β represents the Dirichlet hyperparameter corresponding to the songs, θ is the song list theme probability distribution, and φ is the theme song probability distribution;
and generating new theme probability distribution of the song according to the theme probability distribution of the song list and the theme song probability distribution.
4. The song recommendation method of claim 3, further comprising, before said obtaining a new subject probability distribution for the song:
removing the theme tag previously assigned to the song and updating n_{m,z} and n_{z,t}.
5. The song recommendation method according to claim 3 or 4, further comprising, after assigning the thematic tags within the tag set to songs in the song list:
updating n_{m,z} and n_{z,t} based on the current theme tag assignment information of the song.
6. A song recommendation apparatus based on a tag topic model, comprising:
a first obtaining unit configured to obtain a song list set, where the song list set includes a plurality of song lists, and the song lists include subject information;
the construction unit is used for constructing a label set of the song list according to the theme information, and the label set comprises at least one theme label;
an assigning unit, configured to assign the theme tags in the tag set to the songs in the song list;
a second obtaining unit, configured to obtain a new theme probability distribution of the song, where the new theme probability distribution includes a probability distribution that the song is currently allocated to each of the theme tags;
the determining unit is used for determining a target theme label distributed to the songs in the song list according to the new theme probability distribution;
and the recommending unit is used for generating a corresponding song recommending list according to the target theme label distributed to the songs in the song list and recommending the songs on the basis of the song recommending list.
7. The song recommendation device according to claim 6, wherein the determination unit comprises:
a cycle subunit, configured to select a corresponding topic tag from the tag set according to the new topic probability distribution, and return to execute the step of allocating the selected topic tag to a corresponding song in the song list until a preset condition is met;
and the determining subunit is used for taking the theme label finally allocated to the song in the song list as the target theme label of the song.
8. The song recommendation device according to claim 6, wherein the second acquisition unit includes:
a first generation subunit, configured to: obtain the current n_{m,z} and n_{z,t} and the preset α and β; and calculate the song list theme probability distribution and the theme song probability distribution according to the following formulas:
Figure FDA0002691774890000031
Figure FDA0002691774890000032
wherein n_{m,z} represents the number of songs in song list m assigned to theme tag z, n_{z,t} represents the number of times song t is assigned to theme tag z, α represents the Dirichlet hyperparameter corresponding to the themes, β represents the Dirichlet hyperparameter corresponding to the songs, θ is the song list theme probability distribution, and φ is the theme song probability distribution;
and the second generation subunit is used for generating the new theme probability distribution of the song according to the theme probability distribution of the song list and the theme song probability distribution.
9. The song recommendation device of claim 8, further comprising:
a clearing unit, configured to remove the theme tag previously assigned to the song and update n_{m,z} and n_{z,t}.
10. The song recommendation apparatus according to claim 8 or 9, further comprising:
an updating unit, configured to update n_{m,z} and n_{z,t} based on the current theme tag assignment information of the song.
11. A storage medium having stored therein processor-executable instructions, wherein a processor performs the song recommendation method of any one of claims 1-5 by executing the instructions.
CN201810097213.9A 2018-01-31 2018-01-31 Song recommendation method and device based on tag topic model and storage medium Active CN108334601B (en)

Publications (2)

Publication Number Publication Date
CN108334601A CN108334601A (en) 2018-07-27
CN108334601B true CN108334601B (en) 2021-03-16

Family

ID=62927598

Country Status (1)

Country Link
CN (1) CN108334601B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102654859A (en) * 2011-03-01 2012-09-05 北京彩云在线技术开发有限公司 Method and system for recommending songs
CN105718566A (en) * 2016-01-20 2016-06-29 中山大学 Intelligent music recommendation system
CN106339507A (en) * 2016-10-31 2017-01-18 腾讯科技(深圳)有限公司 Method and device for pushing streaming media message
CN107368584A (en) * 2017-07-21 2017-11-21 山东大学 A kind of individualized video recommends method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100332567A1 (en) * 2009-06-26 2010-12-30 Ramin Samadani Media Playlist Generation
CN103970802B (en) * 2013-02-05 2018-12-14 北京音之邦文化科技有限公司 A kind of method and device of song recommendations
US9411897B2 (en) * 2013-02-06 2016-08-09 Facebook, Inc. Pattern labeling
CN106649686B (en) * 2016-12-16 2018-05-04 天翼爱音乐文化科技有限公司 User interest grouping method and system based on the potential feature of multilayer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于移动上下文的音乐推荐系统";曹磊;《中国优秀硕士学位论文全文数据库 信息科技辑》;20170215(第2017年02期);I138-4524 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant