KR102222935B1

KR102222935B1 - Playlist Recommendation Method Using Collaborative Autoencoders

Info

Publication number: KR102222935B1
Application number: KR1020190038715A
Authority: KR
Inventors: 이종욱; 양호진
Original assignee: 성균관대학교산학협력단
Priority date: 2019-04-02
Filing date: 2019-04-02
Publication date: 2021-03-04
Also published as: KR102222935B9; KR20200116813A

Abstract

A collaborative filtering-based music playlist recommendation method using an autoencoder according to the present invention includes calculating a first recommendation score, calculating a second recommendation score, and generating a recommended playlist based on the first recommendation score and the second recommendation score. Calculating the first recommendation score includes receiving, on the basis of a content-aware autoencoder, a playlist vector containing track information matched to the order in which the sound sources of the playlist are played and a singer vector containing singer information matched to the order in which the sound sources are played. Calculating the second recommendation score includes receiving a playlist title on the basis of a character-level convolutional neural network. Generating the recommended playlist includes combining the first recommendation score and the second recommendation score according to a collaborative filtering-based linear combination scheme.

Description

Playlist Recommendation Method Using Collaborative Autoencoders

The present invention relates to a collaborative filtering-based music playlist recommendation method using an autoencoder, and more particularly to a method of recommending a playlist based on recommendation scores generated by two sub-models.

Music recommendation approaches fall broadly into content-based recommendation and collaborative filtering-based recommendation. Content-based recommendation uses the intrinsic attributes of music, such as the sound source including its melody and lyrics, to recommend music similar to what the user already listens to. It can recommend songs so that a playlist holds sound sources with a similar mood, but for that very reason it fails to satisfy users who want to consume diverse kinds of music. Collaborative filtering-based recommendation analyzes users' music consumption patterns and introduces music preferred by other users whose tastes resemble the user's.

Users prefer collaborative filtering-based recommendation to content-based recommendation, but its accuracy drops when a song appears in few playlists or when a playlist itself contains few songs. Moreover, when a user has only set up a playlist and has not yet saved any songs, collaborative filtering-based recommendation is practically impossible.

An object of the present invention is to provide a collaborative filtering-based recommendation method that can recommend related music even for little-known music that appears in few playlists.

Another object of the present invention is to provide a collaborative filtering-based recommendation method that improves recommendation accuracy even when a playlist contains few songs.

A further object of the present invention is to provide a collaborative filtering-based recommendation method that can recommend music even when the user has set only a playlist title and has not saved any music.

The present invention uses a model trained on inputs from which part of the information is omitted during encoding, and can therefore improve recommendation accuracy both for music that appears in few playlists and for playlists that contain few songs.

In addition, because the present invention calculates a recommendation score from the playlist title, it can recommend music even when the user has set only a playlist title and has not saved any music.

FIG. 1 is a diagram showing an overview of a collaborative filtering-based music playlist recommendation system using an autoencoder according to the present invention.
FIG. 2 is a diagram explaining an overview of the operation of the collaborative filtering-based music playlist recommendation system using an autoencoder according to the present invention.
FIG. 3 is a flowchart showing a collaborative filtering-based music playlist recommendation method using an autoencoder according to the present invention.

The technology described below may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail. This is not intended to limit the technology described below to particular embodiments, and it should be understood to cover all modifications, equivalents, and substitutes falling within the spirit and technical scope of the technology described below.

Terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms, which serve only to distinguish one component from another. For example, without departing from the scope of rights of the technology described below, a first component may be named a second component, and similarly a second component may be named a first component. The term "and/or" covers a combination of a plurality of related listed items or any one of them.

As used in this specification, a singular expression should be understood to include the plural unless the context clearly indicates otherwise, and terms such as "includes" indicate that the stated features, numbers, steps, operations, components, parts, or combinations thereof exist, without excluding the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Before the drawings are described in detail, it should be made clear that the components in this specification are divided merely according to the main function each one is responsible for. That is, two or more of the components described below may be combined into one, or one component may be split into two or more according to more finely divided functions. Each component described below may, in addition to its own main function, perform some or all of the functions of other components, and some of the main functions of a component may be carried out entirely by another component.

Likewise, in performing a method or an operating method, the steps making up the method may occur in an order different from the stated order unless the context clearly specifies a particular order. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in the reverse order.

FIG. 1 is a diagram showing a deep collaborative filtering-based playlist recommendation system according to the present invention.

Referring to FIG. 1, the deep collaborative filtering-based playlist recommendation system according to the present invention includes a first sub-model 100 and a second sub-model 200 that generate a recommended playlist from playlist information D_PL.

The playlist information D_PL includes track information TL, singer information a, and content c. The track information TL refers to the title of each song. The singer information a refers to information about the singers of the music, for example the singer names. The content c refers to the "Album", that is, the title of the album on which the song appears.

The first sub-model 100 is implemented as an autoencoder, specifically a content-aware autoencoder. It includes an encoder 110 and a decoder 120. The encoder 110 compresses the input, and the decoder 120 reconstructs the compressed input. The first sub-model 100 receives a playlist vector ap and a singer vector p as input and calculates a first recommendation score from the reconstructed values obtained while reconstructing the input.

The playlist vector ap represents the song titles matched to the order in which the sound sources in the playlist information D_PL are played. The singer vector p represents the singer names matched to that same order.

The first sub-model 100 adopts unsupervised learning: it learns a latent vector compressed to a low dimension in the course of reconstructing its own input.

The second sub-model 200 is implemented as a character-level convolutional neural network (CNN). It calculates a second recommendation score from a title vector Tp, performing convolution operations on the title vector Tp to produce a second recommendation score between 0 and 1. The second sub-model 200 is implemented as an independent model separate from the first sub-model 100.

FIG. 2 is a diagram showing an overview of the operation of the playlist recommendation system according to the present invention. FIG. 3 is a flowchart showing the deep collaborative filtering-based playlist recommendation method according to the present invention.

The playlist recommendation method according to the present invention is described below with reference to FIGS. 1 to 3.

In a first step S1, the first sub-model 100, based on the content-aware autoencoder, generates the first recommendation score from the playlist vector ap and the singer vector p derived from the playlist information D_PL. To this end, the first sub-model 100 receives a combined vector Vc in which the playlist vector ap and the singer vector p are concatenated, and generates the first recommendation score by encoding and decoding the combined vector Vc.

When the total number of songs covered by the playlist information D_PL is M, the playlist vector ap has size M. Each element of the playlist vector ap corresponds to one item of track information TL. An element is 0 if the playlist does not contain that track and 1/n if it does, where n is the number of tracks in the playlist. For example, if the playlist information D_PL contains six tracks, each of those tracks has the value 1/6. If the encoder of the autoencoder is viewed as a set of latent vectors, one per track, this has the same effect as averaging the latent vectors of the tracks in the playlist.
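The construction of the playlist vector and the averaging effect described above can be illustrated with a minimal sketch. The function name `build_playlist_vector`, the catalogue size of 10, the latent dimension of 4, and the random matrix `W` are all assumptions made for illustration and do not come from the specification.

```python
import numpy as np

def build_playlist_vector(track_ids, num_tracks):
    """Length-M vector: 1/n at each track in the playlist, 0 elsewhere."""
    vec = np.zeros(num_tracks, dtype=np.float64)
    idx = np.array(sorted(set(track_ids)), dtype=np.intp)
    if idx.size > 0:
        vec[idx] = 1.0 / idx.size
    return vec

# Hypothetical catalogue of M = 10 tracks; the playlist holds 6 of them.
tracks_in_playlist = [0, 2, 3, 5, 7, 9]
playlist_vec = build_playlist_vector(tracks_in_playlist, num_tracks=10)

# Averaging effect: if column j of W is the latent vector of track j,
# W @ playlist_vec is the mean of the latent vectors of the included tracks.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 10))              # assumed latent dimension d = 4
mean_of_latents = W[:, tracks_in_playlist].mean(axis=1)
projected = W @ playlist_vec
```

Here `projected` and `mean_of_latents` agree up to floating-point error, which is the sense in which the 1/n weighting averages the per-track latent vectors.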

The encoder 110 receives the playlist vector and compresses it into a low-dimensional latent vector.

The decoder 120 reconstructs the latent vector compressed by the encoder 110 to generate a reconstruction vector. Like the input vector, the reconstruction vector has size M, and each element is treated as the recommendation score for the corresponding song. Each element lies between 0 and 1 inclusive, and a value closer to 1 indicates a higher recommendation score.
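A minimal sketch of this encode-then-decode pass follows. The specification states only that the scores lie between 0 and 1; the single-layer encoder and decoder, the sigmoid activation, the sizes M = 10 and d = 4, and the random untrained weights are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(x, W, b):
    """Compress the size-M input into a d-dimensional latent vector."""
    return sigmoid(W @ x + b)

def decode(z, W_dec, b_dec):
    """Expand the latent vector back to size M; sigmoid keeps scores in (0, 1)."""
    return sigmoid(W_dec @ z + b_dec)

rng = np.random.default_rng(0)
M, d = 10, 4                                   # assumed sizes
W, b = rng.normal(scale=0.1, size=(d, M)), np.zeros(d)
W_dec, b_dec = rng.normal(scale=0.1, size=(M, d)), np.zeros(M)

x = np.zeros(M)
x[[0, 2, 3]] = 1.0 / 3.0                       # playlist with three tracks
latent = encode(x, W, b)
scores = decode(latent, W_dec, b_dec)          # one score per song in the catalogue
```

With untrained weights the scores carry no meaning; the sketch only shows the shapes and the value range of the reconstruction vector.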

To generate a recommendation score from the singer vector p, the encoder 110 also receives the singer vector p. Like the playlist vector ap, the singer vector p is built so that an element is 1 if the singer appears in the playlist and 0 otherwise. For example, when a "singer a - sound source" pair is matched in the playlist, singer a is considered to appear once that sound source is played, and the corresponding element of the singer vector p is set to 1. The singer vector p is concatenated with the playlist vector ap to form the combined vector Vc. In other words, the first sub-model 100, which is the autoencoder, generates the first recommendation score from the combined vector Vc rather than from the playlist vector ap or the singer vector p alone.

Although the playlist vector ap and the singer vector p are described separately above, the first sub-model 100 of the present invention concatenates them to generate the combined vector Vc, which can be written as [p; a_p].
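The singer vector and the concatenation into the combined vector can be sketched as follows. The helper names and the catalogue sizes (5 tracks, 3 singers) are hypothetical, and the playlist part is placed first only to mirror the [p; a_p] notation.

```python
import numpy as np

def build_singer_vector(singer_ids, num_singers):
    """Binary vector: 1 where a singer appears in the playlist, 0 elsewhere."""
    vec = np.zeros(num_singers, dtype=np.float64)
    idx = np.array(sorted(set(singer_ids)), dtype=np.intp)
    vec[idx] = 1.0
    return vec

def build_combined_vector(playlist_vec, singer_vec):
    """Concatenate the playlist part and the singer part into one input."""
    return np.concatenate([playlist_vec, singer_vec])

# Hypothetical catalogue: 5 tracks and 3 singers.
playlist_vec = np.array([0.5, 0.0, 0.5, 0.0, 0.0])        # two tracks, 1/2 each
singer_vec = build_singer_vector([2, 2], num_singers=3)   # both tracks by singer 2
combined = build_combined_vector(playlist_vec, singer_vec)
```

A singer who performs several tracks in the playlist still contributes a single 1, since the vector only records whether the singer appears.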

The encoder 110 then receives the combined vector [p; a_p]. [Equation 1] below expresses reducing, by means of a cross-entropy loss, the difference between the combined vector [p; a_p] and the reconstruction vector output by the decoder 120.

[Equation 1]

In [Equation 1], p denotes the playlist vector and a_p denotes the singer vector.

Θ can be defined as Θ = {W ∈ R^{d×(n+k)}, W′ ∈ R^{(n+k)×d}, b ∈ R^d, b′ ∈ R^{n+k}}. Here W denotes a weight of the artificial neural network and b denotes a bias; W′ denotes the weight of the decoder 120 and b′ denotes the bias of the decoder 120.
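The image of [Equation 1] is not reproduced in this text. One form consistent with the surrounding definitions is sketched below. It is an assumption, not the specification's own formula. It supposes that the encoder f and decoder g are single affine layers followed by a sigmoid σ, that the loss is an element-wise binary cross entropy, and that n and k are the sizes of the playlist part and the singer part. The symbols x, x̂, and L are introduced only for illustration.

```latex
\begin{aligned}
x &= [p;\, a_p] \in \mathbb{R}^{n+k}, \\
\hat{x} &= g(f(x)) = \sigma\bigl(W' \sigma(W x + b) + b'\bigr), \\
\mathcal{L}(x, \hat{x}) &= -\sum_{i=1}^{n+k} \bigl[\, x_i \log \hat{x}_i + (1 - x_i) \log (1 - \hat{x}_i) \,\bigr], \\
\Theta^{*} &= \arg\min_{\Theta} \sum_{x} \mathcal{L}\bigl(x,\; g(f(x))\bigr).
\end{aligned}
```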

The first sub-model 100 trains the artificial neural network so as to reduce the cross entropy with respect to the reconstruction vector. During training, part of the information in the combined vector Vc given as input is removed. That is, given an incomplete combined vector Vc in which part of the playlist vector ap or the singer vector p is omitted, training aims to restore the complete playlist order and singer information.

Training of the first sub-model 100 on the partially removed combined vector Vc involves the following two steps.

The first step is the hide-and-seek step, in which either the playlist vector ap or the singer vector p is chosen at random and removed.

For example, the first sub-model 100 may delete the information of the singer vector p and then train, as in [Equation 2] below.

[Equation 2]

In [Equation 2], f denotes the encoder 110 and g denotes the decoder 120. In [\tilde{p}; 0], \tilde{p} denotes an incomplete vector in which part of the information of the playlist vector is omitted (denoising). Because the part corresponding to the singer vector is removed in [Equation 2], the singer vector component of the combined vector fed to the encoder 110 is 0. In short, the first sub-model 100 trains so as to reduce the difference between the combined vector and the reconstruction obtained from [\tilde{p}; 0].

Alternatively, the first sub-model 100 may delete the information of the playlist vector ap and then train, as in [Equation 3] below.

[Equation 3]

That is, [Equation 3] expresses the case in which the playlist component of the combined vector fed to the encoder 110 is 0.
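The images of [Equation 2] and [Equation 3] are not reproduced in this text. Under the description above, they might take the following form. This is a hedged reading, not the specification's own formulas. L stands for the cross-entropy loss, and \tilde{a}_p is assumed by symmetry to denote the singer vector with part of its information omitted.

```latex
\begin{aligned}
\text{[Equation 2]:}\quad & \min_{\Theta}\; \mathcal{L}\bigl([p;\, a_p],\; g(f([\tilde{p};\, 0]))\bigr) \\
\text{[Equation 3]:}\quad & \min_{\Theta}\; \mathcal{L}\bigl([p;\, a_p],\; g(f([0;\, \tilde{a}_p]))\bigr)
\end{aligned}
```

In both cases the target is the complete combined vector, while the input has one part zeroed out and the other part partially masked.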

FIG. 3 shows the case in which the singer vector p is removed, as in [Equation 2].

When the information of either the playlist vector ap or the singer vector p is omitted, the first sub-model 100 aims to reconstruct both vectors accurately. In this way it learns not only the patterns among playlists but also the patterns between playlists and singers, which allows playlist recommendations closer to the user's taste.
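The hide-and-seek step can be sketched as a function that builds one training pair. The function name, the equal 50/50 choice between the two parts, and the returned label are assumptions made for illustration; the text says only that one of the two vectors is chosen at random.

```python
import numpy as np

def hide_and_seek(playlist_vec, singer_vec, rng):
    """Zero out one of the two parts at random.

    Returns the corrupted input, the uncorrupted target, and which part was hidden.
    """
    target = np.concatenate([playlist_vec, singer_vec])
    if rng.random() < 0.5:
        # Hide the singer part: input becomes [p; 0].
        corrupted = np.concatenate([playlist_vec, np.zeros_like(singer_vec)])
        hidden = "singer"
    else:
        # Hide the playlist part: input becomes [0; a_p].
        corrupted = np.concatenate([np.zeros_like(playlist_vec), singer_vec])
        hidden = "playlist"
    return corrupted, target, hidden

rng = np.random.default_rng(0)
p = np.array([0.5, 0.5, 0.0])      # hypothetical playlist part
a_p = np.array([1.0, 0.0])         # hypothetical singer part
corrupted, target, hidden = hide_and_seek(p, a_p, rng)
```

The model would be trained to map `corrupted` back to `target`, so it has to infer singers from tracks in some steps and tracks from singers in others.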

The second step is the additional removal step, in which part of the information of the vector left intact by the hide-and-seek step is removed once more.

When the values to remove are chosen in the additional removal step, a choice is made between masking at random regardless of the order stored in the playlist and masking the tail of the playlist, that is, the most recently stored information. For users who prefer shuffle playback within a playlist, entries are masked at random regardless of the stored order. To implement a system that recommends the songs to play next when a user plays a playlist sequentially, that is, a system that recommends playlist order, the tail corresponding to the later-stored information is masked.

How the values are masked and how much is masked in the additional removal step may be determined by the system to be implemented. For example, to build a system that works well even when few songs are available, as much information as possible is removed, so that the first sub-model 100 predicts accurately even from little input. Conversely, to build a system that works well once a reasonable number of songs is available, relatively little information is removed. To build a system whose performance does not depend on the amount of input, the masking ratio is drawn at random at every training step.
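The two masking strategies and the per-step random ratio can be sketched as below. The function name `additional_removal`, the `mode` strings, and the rounding rule for the number of hidden entries are assumptions made for illustration.

```python
import numpy as np

def additional_removal(ordered_ids, ratio, mode, rng):
    """Return the ids that stay visible after hiding a fraction `ratio` of them.

    mode="random": hide entries regardless of their stored order.
    mode="tail":   hide the most recently stored entries.
    """
    n_hide = min(int(round(ratio * len(ordered_ids))), len(ordered_ids))
    if mode == "random":
        hide_idx = set(rng.choice(len(ordered_ids), size=n_hide, replace=False).tolist())
        return [t for i, t in enumerate(ordered_ids) if i not in hide_idx]
    if mode == "tail":
        return list(ordered_ids[: len(ordered_ids) - n_hide])
    raise ValueError(f"unknown mode: {mode}")

rng = np.random.default_rng(0)
playlist = [10, 11, 12, 13, 14, 15]            # hypothetical track ids in stored order

kept_tail = additional_removal(playlist, ratio=0.5, mode="tail", rng=rng)
kept_random = additional_removal(playlist, ratio=0.5, mode="random", rng=rng)

# Input-size-agnostic variant: draw a fresh masking ratio at each training step.
step_ratio = rng.uniform(0.0, 1.0)
kept_step = additional_removal(playlist, ratio=step_ratio, mode="random", rng=rng)
```

The kept ids would then be turned into the masked vector fed to the encoder, while the full playlist remains the reconstruction target.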

In a second step S2, the second sub-model 200, based on the character-level convolutional neural network, calculates the second recommendation score from the title vector Tp. The title vector Tp contains information on the order of the letters in the playlist title.

The title vector Tp is generated from the number and order of the letters in the playlist title. When the nine letters "C, h, r, i, s, t, m, a, s" are input in sequence, the title vector Tp may be generated as a 9-by-k matrix, where k refers to the dimension of the filter for the convolution operation. FIG. 2, for example, shows a title vector Tp expressed as a 9-by-4 matrix.
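One way to obtain such a 9-by-k matrix is to look up a k-dimensional row for each letter. The sketch below assumes a small lowercase alphabet, a randomly initialized lookup table named `embedding`, and k = 4 as in the figure. The specification does not state how the rows are produced, so the lookup itself is an assumption.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "   # assumed character set

def build_title_matrix(title, embedding):
    """Map each known character of the title to its k-dimensional row, in order."""
    idx = [ALPHABET.index(ch) for ch in title.lower() if ch in ALPHABET]
    return embedding[idx]

rng = np.random.default_rng(0)
k = 4
embedding = rng.normal(size=(len(ALPHABET), k))      # would be learned in practice
title_matrix = build_title_matrix("Christmas", embedding)   # one row per letter
```

Rows follow the spelling order, so the two occurrences of "s" in "Christmas" produce identical rows at positions 4 and 8.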

The second sub-model 200 performs convolution operations with a k-dimensional filter to capture patterns in letter combinations and obtain a feature vector. Under convolution, "christmas" and "christa" end up represented by feature vectors with similar values, which makes the model robust to typos entered by the user.

The second sub-model 200 then takes the largest feature value from the feature vector through a max-pooling operation.
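The convolution over letter positions followed by max pooling can be sketched with plain NumPy. The filter layout (number of filters, window width, k), the absence of a nonlinearity, and the toy inputs are assumptions for illustration. The later layers that turn the pooled features into a score between 0 and 1 are not described in the text and are omitted here.

```python
import numpy as np

def conv1d_max_pool(title_matrix, filters):
    """Slide each filter over the letter positions and keep its largest response.

    title_matrix: (length, k) matrix, one row per letter.
    filters:      (num_filters, width, k) array; requires length >= width.
    Returns one max-pooled feature per filter.
    """
    length, _ = title_matrix.shape
    _, width, _ = filters.shape
    # All windows of `width` consecutive letters: (positions, width, k).
    windows = np.stack([title_matrix[i:i + width] for i in range(length - width + 1)])
    # Response of every filter at every position: (num_filters, positions).
    feature_maps = np.einsum("pwk,fwk->fp", windows, filters)
    return feature_maps.max(axis=1)

# Toy example with k = 1: three "letters" and one width-2 filter of ones.
toy_title = np.array([[1.0], [2.0], [3.0]])
toy_filters = np.ones((1, 2, 1))
pooled = conv1d_max_pool(toy_title, toy_filters)   # window sums are 3 and 5
```

Because each filter responds to short runs of letters and pooling discards the position, two titles that share most of their letter runs yield similar pooled features, which is consistent with the robustness to typos described above.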

As described above, the second sub-model 200 calculates the second recommendation score from the playlist title with a character-level convolutional neural network rather than a word-level CNN. A word-level CNN analyzes patterns in the arrangement of words in a sentence and represents the input sentence as a vector. Because it operates on words, it may not run properly when a typo prevents a word from being recognized.

By contrast, because the present invention generates the second recommendation score with a character-level convolutional neural network, it operates more reliably even when the exact word is not entered because of a typo or the like.

In a third step S3, the first recommendation score and the second recommendation score are combined according to a collaborative filtering-based linear combination scheme to obtain a combined recommendation score, and songs matching the playlist are recommended according to the combined recommendation score. The first recommendation score is the output reconstructed by the first sub-model 100, and the second recommendation score is the output reconstructed by the second sub-model 200. Both may be regarded as reconstruction vectors of a decoder.

The combined model 300 sets a first weight Witem for the first recommendation score and a second weight Wtitle for the second recommendation score, and generates the recommended playlist on that basis.
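The weighted combination and the final selection can be sketched as follows. The function names, the exclusion of tracks already in the playlist, and the top-N cutoff are assumptions added for illustration; the text specifies only that the two scores are combined linearly with the weights Witem and Wtitle.

```python
import numpy as np

def combine_scores(item_scores, title_scores, w_item, w_title):
    """Linear combination of the two per-song score vectors."""
    return w_item * item_scores + w_title * title_scores

def recommend(combined, already_in_playlist, top_n):
    """Indices of the highest-scoring songs not already in the playlist."""
    ranked = np.argsort(-combined, kind="stable")
    return [int(i) for i in ranked if int(i) not in already_in_playlist][:top_n]

# Hypothetical scores for a catalogue of four songs.
item_scores = np.array([0.9, 0.1, 0.5, 0.2])    # from the first sub-model
title_scores = np.array([0.1, 0.8, 0.7, 0.2])   # from the second sub-model

combined = combine_scores(item_scores, title_scores, w_item=0.5, w_title=0.5)
recommended = recommend(combined, already_in_playlist={0}, top_n=2)
```

With equal weights the combined scores are 0.5, 0.45, 0.6, and 0.2, so song 2 ranks first and song 1 second once song 0 is excluded.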

The first and second weights Witem and Wtitle may be set equal. This makes the system simple to implement, but the accuracy of the recommended playlist drops when the accuracies of the first sub-model 100 and the second sub-model 200 differ.

To increase the accuracy of the recommended playlist, the weights of the first sub-model 100 and the second sub-model 200 may be allocated dynamically.

As in [Equation 4] below, the importance of the playlist title may be fixed at 1, and the weight of the first recommendation score may be set in proportion to the amount of track information TL and singer information a.

[Equation 4]

In [Equation 4], N([p; a_p]) denotes the number of data items in the combined vector Vc, and I(Tp) denotes the number of playlist titles. For example, when the playlist information D_PL provides a total of 20 items of track information TL and singer information a, the first weight Witem may be set to 20/21 and the second weight Wtitle to 1/21.
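The image of [Equation 4] is not reproduced in this text, but the worked example (20 items and one title giving 20/21 and 1/21) is consistent with Witem = N / (N + I) and Wtitle = I / (N + I). The sketch below assumes that form; the function name and the use of exact fractions are choices made for illustration.

```python
from fractions import Fraction

def dynamic_weights(num_items, num_titles=1):
    """Weights proportional to the amount of item data and to the title count.

    Assumes W_item = N / (N + I) and W_title = I / (N + I).
    """
    total = num_items + num_titles
    if total == 0:
        raise ValueError("at least one item or one title is required")
    return Fraction(num_items, total), Fraction(num_titles, total)

w_item, w_title = dynamic_weights(num_items=20)          # example from the text
w_item_cold, w_title_cold = dynamic_weights(num_items=0) # title only, no songs yet
```

Under this reading, a playlist with a title but no saved songs relies entirely on the title-based score, and the item-based score dominates as tracks and singers accumulate.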

From the above description, those skilled in the art will appreciate that various changes and modifications are possible without departing from the technical idea of this specification. Accordingly, the technical scope of this specification should not be limited to the detailed description but should be determined by the claims.

100: content-aware autoencoder
110: encoder 120: decoder
200: character-level convolutional neural network model

Claims

A method of operating a deep collaborative filtering-based playlist recommendation system including a first sub-model, a second sub-model, and a combined model, the method comprising:
calculating, using the first sub-model and on the basis of a content-aware autoencoder, a first recommendation score from a playlist vector including track information matched to the order in which sound sources of a playlist are played and a singer vector including singer information matched to the order in which the sound sources are played;
calculating, using the second sub-model and on the basis of a character-level convolutional neural network, a second recommendation score from a playlist title; and
calculating, using the combined model, a combined recommendation score by combining the first recommendation score and the second recommendation score according to a collaborative filtering-based linear combination scheme, and generating a recommended playlist according to the combined recommendation score.

The method of claim 1,
wherein the playlist vector
has a size of M when the number of sound sources is M, and includes elements that indicate, as binary data, whether each sound source has been played.

The method of claim 1,
wherein calculating the first recommendation score comprises
compressing the playlist vector and the singer vector to generate a low-dimensional latent vector.

The method of claim 3,
wherein generating the latent vector comprises
deleting the information of at least one of the playlist vector and the singer vector.

The method of claim 4,
wherein generating the latent vector further comprises
additionally removing at least part of the information of whichever of the playlist vector and the singer vector was not deleted.

The method of claim 5,
wherein calculating the first recommendation score further comprises
generating a reconstruction vector by reconstructing the low-dimensional latent vector.

The method of claim 6,
wherein each element of the reconstruction vector is generated with a value of 0 or more and 1 or less, and the likelihood of being adopted as the first recommendation score increases as the value approaches 1.

The method of claim 1,
wherein calculating the second recommendation score comprises
performing a convolution operation on a title vector that includes information in which the letters of the playlist title are arranged in order.

The method of claim 8,
wherein the rows and columns of the title vector correspond, respectively, to the number of letters in the playlist title and the dimension of a filter used in the convolution operation.

The method of claim 9,
wherein generating the recommended playlist comprises
assigning a first weight to the first recommendation score and a second weight to the second recommendation score,
the first weight and the second weight being assigned using a dynamic allocation scheme.