KR102391644B1

KR102391644B1 - Method and Apparatus for VOD Content Recommendation

Info

Publication number: KR102391644B1
Application number: KR1020200116431A
Authority: KR
Inventors: 장시영
Original assignee: 주식회사 엘지유플러스
Priority date: 2020-09-10
Filing date: 2020-09-10
Publication date: 2022-04-27
Also published as: KR20220033943A

Abstract

A video on demand (VOD) content recommendation method is provided. The method may include: setting a content purchase probability matrix P, where the elements of P represent the probabilities that users will purchase and watch contents; setting a customer feature information matrix X, which defines per-user features, and a content feature information matrix Y, which defines per-content features, such that P is expressed as the product of X and Y; determining the values of the elements of X and the values of the elements of Y using a loss function L defined by P, a content viewing status matrix O indicating whether the users have watched the contents, and a popularity matrix C indicating the popularity of the contents; and calculating the values of the elements of P using the values of the elements of X and the values of the elements of Y.

Description

Method and Apparatus for VOD Content Recommendation

The disclosed technology relates to information processing for video on demand (VOD) content.

In recent years, as use of the Internet has become part of everyday life, the era of home networks has arrived. One concrete example of a home network in practice is the Internet Protocol Television (IPTV) service. IPTV is an interactive TV service delivered over the Internet: a set-top box connected to the Internet connects to a content providing server operated by a content service provider, and content such as VOD content is downloaded or streamed for viewing. Unlike ordinary cable broadcasting, IPTV lets viewers select the programs they want and watch them at a convenient time. It therefore offers viewers variety and convenience, while for the provider it is a revenue model that can generate, per subscriber, more than the flat monthly fee collected by ordinary cable broadcasters. For IPTV service providers, marketing a wide range of content to viewers in various ways to encourage purchases is thus of great interest.

An object of the disclosed technology is to provide a VOD content recommendation method and apparatus that recommend content to each individual user by reflecting both that user's personal taste and the general popularity of the content.

The problems to be solved by the disclosed technology are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

In one aspect, a video on demand (VOD) content recommendation method is provided. The method may include: setting a content purchase probability matrix P, where the elements of P represent the probabilities that users will purchase and watch contents; setting a customer feature information matrix X, which defines per-user features, and a content feature information matrix Y, which defines per-content features, such that P is expressed as the product of X and Y; determining the values of the elements of X and the values of the elements of Y using a loss function L defined by P, a content viewing status matrix O indicating whether the users have watched the contents, and a popularity matrix C indicating the popularity of the contents; and calculating the values of the elements of P using the values of the elements of X and the values of the elements of Y.

In one embodiment, the number of rows of P equals the number of users, the number of columns of P equals the number of contents, the elements of P are denoted p_i,j, and element p_i,j represents the probability that the i-th user will purchase and watch the j-th content, where i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

In one embodiment, the number of rows of X equals the number of users, the number of columns of X equals the number of features, and the elements of X are denoted x_i,j.

In one embodiment, the number of rows of Y equals the number of features, the number of columns of Y equals the number of contents, and the elements of Y are denoted y_i,j.

In one embodiment, the number of rows of O equals the number of users, the number of columns of O equals the number of contents, the elements of O are denoted o_i,j, and element o_i,j indicates whether the i-th user has watched the j-th content, where i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

In one embodiment, the number of rows of C equals the number of users, the number of columns of C equals the number of contents, the elements of C are denoted c_i,j, and element c_i,j represents the popularity of the j-th content with respect to the i-th user, where i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

In one embodiment, the popularity c_i,j of the j-th content with respect to the i-th user is defined by an equation (not reproduced in this text) in which K and two further coefficients, whose symbols are likewise not reproduced here, are constants, t_ij is the ratio of the time the i-th user spent watching the j-th content to the total running time of the j-th content, and R_j is the popularity rank of the j-th content.

In one embodiment, the loss function L is a function of x_i,j and y_i,j.

In one embodiment, the loss function L is defined by an equation (not reproduced in this text) in which i is the user index, j is the content index, S is the full set of combinations of user index and content index, ‖X‖² is the square of the norm of X, ‖Y‖² is the square of the norm of Y, and the remaining coefficient, whose symbol is not reproduced here, is a constant.

In one embodiment, determining the values of the elements of X and the values of the elements of Y using the loss function L defined by P, O and C includes updating the values of the elements of X and the values of the elements of Y a plurality of times so that the loss function L approaches its minimum value.

In one embodiment, updating the values of the elements of X and the values of the elements of Y a plurality of times so that the loss function L approaches its minimum value includes applying a gradient descent algorithm to the loss function L to update those values a plurality of times.

In one embodiment, calculating the values of the elements of P using the values of the elements of X and the values of the elements of Y includes multiplying X by Y to obtain the values of the elements of P.

In one embodiment, the method further includes determining, based on the values of the elements of P, to recommend at least one content to at least one user.

In one embodiment, determining to recommend at least one content to at least one user based on the values of the elements of P includes determining to recommend the j-th content to the i-th user when the value of element p_i,j of P is greater than or equal to a predetermined threshold.

In another aspect, an apparatus for VOD content recommendation is provided. The apparatus includes a database unit that stores data on a content viewing status matrix O indicating whether users have watched contents and a popularity matrix C indicating the popularity of the contents, and a processing engine communicatively coupled to the database unit. The processing engine may be configured to perform the operations of: (i) setting a content purchase probability matrix P, where the elements of P represent the probabilities that the users will purchase and watch the contents; (ii) setting a customer feature information matrix X, which defines per-user features, and a content feature information matrix Y, which defines per-content features, such that P is expressed as the product of X and Y; (iii) determining the values of the elements of X and the values of the elements of Y using a loss function L defined by P, O and C; and (iv) calculating the values of the elements of P using the values of the elements of X and the values of the elements of Y.

In one embodiment, the number of rows of X equals the number of users, the number of columns of X equals the number of features, and the elements of X are denoted x_i,j.

In one embodiment, the popularity c_i,j is defined according to an equation that is not reproduced in this text.

In one embodiment, the loss function L is a function of x_i,j and y_i,j.

In one embodiment, the loss function L is defined by an equation (not reproduced in this text) in which ‖X‖² is the square of the norm of X, ‖Y‖² is the square of the norm of Y, and the remaining coefficient, whose symbol is not reproduced here, is a constant.

In one embodiment, the processing engine is further configured to update the values of the elements of X and the values of the elements of Y a plurality of times so that the loss function L approaches its minimum value.

In one embodiment, the processing engine is further configured to apply gradient descent to the loss function L to update the values of the elements of X and the values of the elements of Y a plurality of times.

In one embodiment, the processing engine is further configured to multiply X by Y to obtain the values of the elements of P.

In one embodiment, the processing engine is further configured to determine, based on the values of the elements of P, to recommend at least one content to at least one user.

In one embodiment, the processing engine is further configured to determine to recommend the j-th content to the i-th user when the value of element p_i,j of P is greater than or equal to a predetermined threshold.

In yet another aspect, a computer-readable recording medium on which a program is recorded is provided. The program includes instructions that, when executed by a computer, can perform the method described above.

According to embodiments of the disclosed technology, content can be recommended to each individual user in a way that reflects both that user's personal taste and the general popularity of the content.

FIG. 1 is a block diagram illustrating an embodiment of a VOD content recommendation apparatus.
FIG. 2 illustrates an embodiment of the content viewing status matrix O.
FIG. 3 illustrates how the content purchase probability matrix P, the customer feature information matrix X and the content feature information matrix Y are set.
FIG. 4 illustrates an embodiment of the content purchase probability matrix P with calculated element values.
FIG. 5 is a flowchart illustrating an embodiment of a method of recommending VOD content.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in a variety of different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the invention pertains are fully informed of its scope; the invention is defined only by the scope of the claims.

The terms used in this specification serve only to describe particular embodiments and are not intended to limit the present invention. For example, a component expressed in the singular should be understood to include a plurality of components unless the context clearly indicates the singular only. In this specification, terms such as 'include' or 'have' merely specify that the stated features, numbers, steps, operations, components, parts, or combinations thereof are present; they do not exclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. In the embodiments described here, a 'module' or 'unit' may mean a functional part that performs at least one function or operation.

In addition, unless otherwise defined, all terms used here, including technical and scientific terms, have the same meaning as commonly understood by those of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless this specification explicitly so defines them.

Embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Where a detailed description of a well-known function or configuration could unnecessarily obscure the gist of the invention, that description is omitted.

FIG. 1 is a block diagram illustrating an embodiment of a VOD content recommendation apparatus.

The content recommendation apparatus 100 of FIG. 1 may be an example of a system implemented as computer programs running on one or more server computers that are installed at one or more locations and operated by an IPTV service provider. As shown in FIG. 1, the apparatus 100 may include a database unit 110 and a processing engine 120 communicatively coupled to the database unit 110. The database unit 110 may store user data. The user data may include data on a content viewing status matrix O, which indicates whether users have watched contents, and a popularity matrix C, which indicates the popularity of the contents. The number of rows of matrix O equals, for example, the number (n) of users subscribed to the VOD service provided by the IPTV service provider, and the number of columns of matrix O equals the total number (m) of contents that the IPTV service provider offers to users on request. The elements of matrix O are denoted o_i,j, and element o_i,j indicates whether the i-th user has watched the j-th content. Here, i is a natural number from 1 to n, and j is a natural number from 1 to m. Referring to FIG. 2, which shows an embodiment of matrix O, o_3,1 is 1 and o_1,3 is 0, which indicates that the third user has watched at least part of the first content and that the first user has not watched any part of the third content. The popularity matrix C has n rows and m columns, its elements are denoted c_i,j, and element c_i,j represents the popularity of the j-th content with respect to the i-th user. The popularity c_i,j is defined according to Equation 1 below.

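[Equation 1 is not reproduced in this text.]

Here, K and two further coefficients, whose symbols are not reproduced in this text, are constants, t_ij is the ratio of the time the i-th user spent watching the j-th content to the total running time of the j-th content, and R_j is the popularity rank of the j-th content.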

As can be seen from Equation 1, the popularity c_i,j of the j-th content with respect to the i-th user increases as the i-th user's viewing time grows relative to the total running time of the j-th content, and it increases as the j-th content ranks higher in popularity, that is, as R_j gets smaller. From a programming point of view, the data for matrix O and matrix C can each be stored by declaring a two-dimensional array and storing the values of the matrix elements in the array elements, although the way of storing the data for matrix O and matrix C is not limited to this.
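The two-dimensional array storage mentioned above can be sketched as follows. This is only an illustration in Python using nested lists: the helper names `element` and `viewers_of` are hypothetical, and of the matrix entries only o_3,1 = 1 and o_1,3 = 0 come from the description of FIG. 2; every other value is made up.

```python
# Hypothetical viewing status matrix O for n = 3 users and m = 3 contents.
# Only o_3,1 = 1 and o_1,3 = 0 are taken from the text; the rest is invented.
O = [
    [1, 1, 0],  # user 1
    [0, 1, 0],  # user 2
    [1, 0, 0],  # user 3
]


def element(matrix, i, j):
    """Return the element at 1-based position (i, j), as indexed in the text."""
    return matrix[i - 1][j - 1]


def viewers_of(matrix, j):
    """Return the 1-based indices of users who watched the j-th content."""
    return [i for i in range(1, len(matrix) + 1) if element(matrix, i, j) == 1]
```

The popularity matrix C could be held in the same way, with floating-point entries instead of 0 and 1.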

The database unit 110 may further store the software or firmware needed to implement the processing engine 120. The database unit 110 may be implemented with any one of the following storage media: flash memory, a hard disk, a MultiMedia Card (MMC), card-type memory (for example, an SD (Secure Digital) card or an XD (eXtreme Digital) card), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), magnetic memory, a magnetic disk, or an optical disk. Those skilled in the art will recognize, however, that the implementation of the database unit 110 is not limited to these.

The processing engine 120 may be configured to set the content purchase probability matrix P, the customer feature information matrix X and the content feature information matrix Y. FIG. 3 illustrates how these matrices are set. As shown in FIG. 3, the number of rows of matrix P equals the number of users (n), and the number of columns of matrix P equals the number of contents (m). The elements of matrix P are denoted p_i,j, and element p_i,j represents the probability that the i-th user will purchase and watch the j-th content. In the present disclosure, the content purchase probabilities, that is, the values of the elements p_i,j of matrix P, are predicted from the user data, and content is recommended to each individual user on that basis. The processing engine 120 may be configured to set matrix X and matrix Y such that matrix P is expressed as the product of the customer feature information matrix X, which defines per-user features, and the content feature information matrix Y, which defines per-content features. In one embodiment, matrix P may be decomposed into matrix X and matrix Y by a matrix factorization technique. The reason for decomposing P into X and Y is to quantify per-user features and per-content features separately, so that the purchase probabilities p_i,j are determined in such a way that users with similar features show similar preferences across contents, and contents with similar features are preferred to a similar degree by the same user. The number of rows of matrix X equals the number of users (n), the number of columns of matrix X equals the number of features (k), and the elements of matrix X are denoted x_i,j. The number of rows of matrix Y equals the number of features (k), the number of columns of matrix Y equals the number of contents (m), and the elements of matrix Y are denoted y_i,j. A larger number of features (k) makes the predicted values of the elements p_i,j more accurate but raises the processing load on the processing engine 120 accordingly, so k can be chosen as a compromise between accuracy and processing load. From a programming point of view, each of matrix P, matrix X and matrix Y can be set by declaring a two-dimensional array, although the way of setting them is not limited to this.
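The shape relationship described above (an n x k matrix X times a k x m matrix Y giving an n x m matrix P) can be illustrated with a small sketch. The function name `matmul` and all numeric values are hypothetical and chosen only so that the dimensions are easy to follow; the patent does not specify an implementation.

```python
def matmul(X, Y):
    """Multiply an n x k matrix X by a k x m matrix Y given as nested lists."""
    k = len(Y)
    if any(len(row) != k for row in X):
        raise ValueError("the number of columns of X must equal the number of rows of Y")
    m = len(Y[0])
    return [
        [sum(X[i][f] * Y[f][j] for f in range(k)) for j in range(m)]
        for i in range(len(X))
    ]


# Hypothetical factors: n = 3 users, k = 2 features, m = 3 contents.
X = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]   # one row of k feature values per user
Y = [[0.8, 0.2, 0.4], [0.1, 0.9, 0.4]]     # one column of k feature values per content
P = matmul(X, Y)                            # n x m matrix of predicted values
```

Each entry of P is the dot product of one user's row of X with one content's column of Y, which is why users with similar rows of X receive similar predictions.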

The processing engine 120 may be further configured to set a loss function L defined by matrix P, matrix O and matrix C. In one embodiment, the loss function L may be set according to Equation 2 below.

[Equation 2 is not reproduced in this text.]

Here, i is the user index, j is the content index, S is the full set of combinations of user index and content index, ‖X‖² is the square of the norm of matrix X, ‖Y‖² is the square of the norm of matrix Y, and the remaining coefficient, whose symbol is not reproduced in this text, is a constant.
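Because the equation image is missing from this text, the exact form of the loss function cannot be confirmed. One form that is consistent with the surrounding description (an error term between o and p weighted by c, summed over S, plus a constant multiplied by the squared norms of X and Y) is sketched below. The squared-error form, the symbol λ for the constant, and the index f over features are assumptions, not taken from the source.

```latex
L = \sum_{(i,j) \in S} c_{i,j}\,\bigl(o_{i,j} - p_{i,j}\bigr)^{2}
    + \lambda \bigl( \lVert X \rVert^{2} + \lVert Y \rVert^{2} \bigr),
\qquad
p_{i,j} = \sum_{f=1}^{k} x_{i,f}\, y_{f,j}
```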

The error term in Equation 2 (its expression is not reproduced in this text) is multiplied by c_ij so that the greater the popularity of the j-th content with respect to the i-th user, the more heavily that error term is weighted in the loss function L. Equation 2 also contains an overfitting-prevention term (its expression is likewise not reproduced in this text). It is included in the loss function L so that the model does not fit only the training data with excessive accuracy, which would degrade performance in actual use. Since the loss function L is a function of p_i,j, and p_i,j is a function of x_i,j and y_i,j, the loss function L can also be regarded as a function of x_i,j and y_i,j.
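As a rough illustration of how such a loss might be evaluated, the sketch below computes a popularity-weighted squared error plus a norm penalty. The squared-error form, the parameter name `lam` for the constant coefficient, the function name `weighted_loss`, and the use of the sum of squared entries as the squared norm are all assumptions, since the equation itself is not available in this text.

```python
def weighted_loss(X, Y, O, C, lam):
    """Popularity-weighted squared error plus a norm penalty (assumed form).

    X is n x k, Y is k x m, O and C are n x m, lam is the penalty coefficient.
    """
    n, k, m = len(X), len(Y), len(Y[0])
    error = 0.0
    for i in range(n):
        for j in range(m):
            p_ij = sum(X[i][f] * Y[f][j] for f in range(k))  # p = X times Y
            error += C[i][j] * (O[i][j] - p_ij) ** 2          # weighted by c_ij
    # Squared norm taken here as the sum of squared entries (an assumption).
    penalty = sum(v * v for row in X for v in row) + sum(v * v for row in Y for v in row)
    return error + lam * penalty
```

With this form, an entry with a large c_ij contributes more to the loss for the same prediction error, which matches the stated purpose of the weighting.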

The processing engine 120 may be further configured to determine the values of the elements of matrix X and the values of the elements of matrix Y using the loss function L. The processing engine 120 may determine these values by learning, such that the loss function L approaches its minimum value, and may be configured to update them a plurality of times during learning. In one embodiment, the processing engine 120 is configured to apply a gradient descent algorithm to the loss function L to update the values of the elements of matrix X and the values of the elements of matrix Y a plurality of times. In one embodiment, the processing engine 120 may be configured to set the initial values of the elements of matrix X and the initial values of the elements of matrix Y in a manner suitable for applying gradient descent. When the values are updated by applying gradient descent to the loss function L, repeating the update more times brings the loss function L correspondingly closer to its minimum value, so the values of the elements of matrix X and matrix Y become more accurate. However, repeating the update too many times may place a load on the processing engine 120. The load can therefore be reduced by limiting the number of iterations to a predetermined count or by imposing a timeout on the processing time.
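The bounded update loop described above can be sketched as follows. This is only an illustration under stated assumptions: Equation 2 is not reproduced in this part of the text, so the loss used here (a popularity-weighted squared error plus an L2 penalty with coefficient `lam`) is an assumed stand-in, and the names `fit`, `loss`, `matmul`, `lr`, `max_iters` and `timeout_s`, as well as the small random initialization, are hypothetical.

```python
import random
import time


def matmul(X, Y):
    """Multiply an n x k matrix X by a k x m matrix Y (lists of lists)."""
    k, m = len(Y), len(Y[0])
    return [[sum(row[f] * Y[f][j] for f in range(k)) for j in range(m)] for row in X]


def loss(X, Y, O, C, lam):
    """Assumed loss (not the patent's Equation 2): weighted squared error + L2 penalty."""
    P = matmul(X, Y)
    err = sum(C[i][j] * (O[i][j] - P[i][j]) ** 2
              for i in range(len(O)) for j in range(len(O[0])))
    reg = sum(v * v for row in X for v in row) + sum(v * v for row in Y for v in row)
    return err + lam * reg


def fit(O, C, k=2, lam=0.01, lr=0.05, max_iters=500, timeout_s=1.0, seed=0):
    """Update X and Y by gradient descent, bounded by an iteration cap and a timeout."""
    rng = random.Random(seed)
    n, m = len(O), len(O[0])
    # Small random initial values (a hypothetical choice of initialization).
    X = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n)]
    Y = [[rng.uniform(0.0, 0.1) for _ in range(m)] for _ in range(k)]
    deadline = time.monotonic() + timeout_s
    iters = 0
    while iters < max_iters and time.monotonic() < deadline:
        P = matmul(X, Y)
        # Weighted residuals r_ij = c_ij * (p_ij - o_ij).
        R = [[C[i][j] * (P[i][j] - O[i][j]) for j in range(m)] for i in range(n)]
        # Gradients of the assumed loss with respect to X and Y.
        gX = [[2 * sum(R[i][j] * Y[f][j] for j in range(m)) + 2 * lam * X[i][f]
               for f in range(k)] for i in range(n)]
        gY = [[2 * sum(R[i][j] * X[i][f] for i in range(n)) + 2 * lam * Y[f][j]
               for j in range(m)] for f in range(k)]
        X = [[X[i][f] - lr * gX[i][f] for f in range(k)] for i in range(n)]
        Y = [[Y[f][j] - lr * gY[f][j] for j in range(m)] for f in range(k)]
        iters += 1
    return X, Y, iters


# Hypothetical toy data: 3 users, 3 contents, uniform popularity weights.
O_demo = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
C_demo = [[1.0] * 3 for _ in range(3)]
X_demo, Y_demo, used = fit(O_demo, C_demo, max_iters=200, timeout_s=10.0)
print("iterations used:", used)
```

The loop stops at whichever limit is reached first, so a smaller `max_iters` or `timeout_s` trades accuracy for a lighter load, as the paragraph describes.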

The processing engine 120 may calculate the values of the elements of matrix P using the values of the elements of matrix X and the values of the elements of matrix Y. To this end, the processing engine 120 may multiply matrix X by matrix Y. In one embodiment, the processing engine 120 may be further configured to correct the calculated values of the elements of matrix P: a calculated element value that is negative is corrected to 0, and a calculated element value that exceeds 1 is corrected to 1. The processing engine 120 may be further configured to determine, based on the values of the elements of matrix P, to recommend at least one content to at least one user. Specifically, the processing engine 120 may determine to recommend the j-th content to the i-th user when the value of the element p_i,j of matrix P is greater than or equal to a predetermined threshold. For example, referring to FIG. 4, which shows an embodiment of matrix P with calculated element values, and assuming a predetermined threshold of 0.5, the elements p_1,1, p_1,2, p_2,2 and p_3,1 are 0.67, 0.73, 0.95 and 0.09, respectively. The processing engine 120 may therefore determine to recommend the first and second contents to the first user, to recommend the second content to the second user, and not to recommend the first content to the third user.
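The correction and threshold rule can be illustrated with a short sketch. It uses only the four element values quoted from FIG. 4 (the remaining entries of P are not given in the text, so they are left out), and the function names `clip01` and `recommend` are hypothetical.

```python
def clip01(value):
    """Correct a computed value into [0, 1]: negatives become 0, values above 1 become 1."""
    return min(1.0, max(0.0, value))


def recommend(P, threshold=0.5):
    """Return {user: [content, ...]} for every entry of P at or above the threshold.

    P maps 1-based (user, content) index pairs to computed element values.
    """
    picks = {}
    for (i, j), value in sorted(P.items()):
        if clip01(value) >= threshold:
            picks.setdefault(i, []).append(j)
    return picks


# The four entries quoted from FIG. 4, keyed by (i, j).
P_fig4 = {(1, 1): 0.67, (1, 2): 0.73, (2, 2): 0.95, (3, 1): 0.09}
print(recommend(P_fig4))  # {1: [1, 2], 2: [2]}
```

The third user does not appear in the result because p_3,1 = 0.09 is below the assumed threshold of 0.5, which matches the decision described for FIG. 4.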

The processing engine 120 may be implemented as a hardware platform based on at least one of Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field-Programmable Gate Arrays (FPGAs), processors, controllers, micro-controllers, and microprocessors. The processing engine 120 may also be implemented as a firmware/software module executable on such a hardware platform. In this case, the software module may be implemented by a software application written in an appropriate programming language.

FIG. 5 is a diagram illustrating an embodiment of a flowchart for explaining a method of recommending VOD content.

The content recommendation method starts with the step of setting the content purchase probability matrix P (S505). As described above, the elements of matrix P are denoted by p_i,j, and the element p_i,j represents the probability that the i-th user will watch the j-th content. In step S510, matrix X and matrix Y are set so that matrix P is expressed as the product of the customer characteristic information matrix X, which defines characteristics for each user, and the content characteristic information matrix Y, which defines characteristics for each content. As described above, the number of rows of matrix X equals the number of users (n), the number of columns of matrix X equals the number of features (k), the number of rows of matrix Y equals the number of features (k), and the number of columns of matrix Y equals the number of contents (m). In step S515, the values of the elements of matrix X and the values of the elements of matrix Y are determined using a loss function L defined by the content purchase probability matrix P, the content viewing matrix O, and the popularity matrix C. As described above, the number of rows of matrix O equals the number of users (n), the number of columns of matrix O equals the number of contents (m), and the element o_i,j of matrix O indicates whether the i-th user has watched the j-th content. Likewise, the number of rows of matrix C equals the number of users (n), the number of columns of matrix C equals the number of contents (m), the elements of matrix C are denoted by c_i,j, and the element c_i,j represents the popularity of the j-th content with respect to the i-th user. The popularity c_i,j is defined according to Equation 1 above. In one embodiment, the loss function L is determined according to Equation 2 above. The loss function L may be a function of the elements of matrix X and the elements of matrix Y. In this step, the values of the elements of matrix X and matrix Y are updated a plurality of times so that the loss function L approaches its minimum value, and the update may be performed by applying gradient descent to the loss function L. In step S520, the values of the elements of matrix P are calculated using the values of the elements of matrix X and matrix Y. In this step, the values of the elements of matrix P may be calculated by multiplying matrix X by matrix Y. The calculated values may also be corrected in this step: a calculated element value that is negative is changed to 0, and a calculated element value that exceeds 1 is changed to 1. In step S525, it is determined, based on the values of the elements of matrix P, to recommend at least one content to at least one user. In this step, it may be determined to recommend the j-th content to the i-th user when the value of the element p_i,j of matrix P is greater than or equal to a predetermined threshold.
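As a rough sketch of steps S520 and S525 together with the dimension constraints of step S510, the fragment below takes already-determined matrices X (n x k) and Y (k x m), checks that their shapes are compatible, computes P, applies the correction, and lists the recommendations. The function name `run_steps` and the example values of X and Y are hypothetical and are not taken from the patent.

```python
def run_steps(X, Y, threshold=0.5):
    """Compute corrected P = X * Y and the list of (user, content) recommendations."""
    n, k = len(X), len(X[0])
    # S510 constraint: the columns of X and the rows of Y both equal the feature count k.
    if len(Y) != k:
        raise ValueError("columns of X must equal rows of Y")
    m = len(Y[0])
    # S520: multiply X by Y, then correct each value into the range [0, 1].
    P = [[sum(X[i][f] * Y[f][j] for f in range(k)) for j in range(m)] for i in range(n)]
    P = [[min(1.0, max(0.0, v)) for v in row] for row in P]
    # S525: recommend content j to user i when p_i,j reaches the threshold (1-based indices).
    recs = [(i + 1, j + 1) for i in range(n) for j in range(m) if P[i][j] >= threshold]
    return P, recs


# Hypothetical values: n = 3 users, k = 2 features, m = 2 contents.
X_demo = [[0.5, 0.5], [-0.5, 0.0], [0.0, 1.5]]
Y_demo = [[0.8, 0.2], [0.4, 1.0]]
P_demo, recs_demo = run_steps(X_demo, Y_demo)
print(recs_demo)  # [(1, 1), (1, 2), (3, 1), (3, 2)]
```

In this example the second user's raw values are negative and are corrected to 0, while the raw value 1.5 for the third user and second content is corrected to 1.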

In the above description, a statement that a component is connected or coupled to another component should be understood to mean not only that the component is directly connected or coupled to the other component, but also that the two may be connected or coupled through one or more other components interposed between them. Other terms describing the relationship between components (for example, "between" and "among") should be interpreted in a similar manner.

In the embodiments disclosed herein, the arrangement of the illustrated components may vary depending on the environment or requirements in which the invention is implemented. For example, some components may be omitted, or several components may be integrated and implemented as one. The arrangement order and connections of some components may also be changed.

Various embodiments of the disclosed technology have been illustrated and described above, but the disclosed technology is not limited to these specific embodiments. The embodiments may be modified in various ways by those of ordinary skill in the art to which the invention pertains without departing from the gist of the disclosed technology as claimed in the appended claims, and such modifications should not be understood separately from the technical spirit or scope of the disclosed technology. Accordingly, the technical scope of the disclosed technology should be defined only by the appended claims.

100: VOD content recommendation apparatus
110: Database unit
120: Processing engine
O: Content viewing matrix
C: Popularity matrix
P: Content purchase probability matrix
X: Customer characteristic information matrix
Y: Content characteristic information matrix

Claims

1. A video on demand (VOD) content recommendation method, comprising:
setting a content purchase probability matrix P, wherein P includes elements, and the elements of P represent probabilities that users will purchase and watch contents;
setting X and Y so that P is expressed as a product of a customer characteristic information matrix X defining characteristics for each user and a content characteristic information matrix Y defining characteristics for each content;
determining values of the elements of X and values of the elements of Y using a loss function L defined by P, a content viewing matrix O indicating whether the users have watched the contents, and a popularity matrix C indicating the popularity of the contents; and
calculating values of the elements of P using the values of the elements of X and the values of the elements of Y.

2. The method of claim 1,
wherein the number of rows of P equals the number of users, the number of columns of P equals the number of contents, the elements of P are denoted by p_i,j, the element p_i,j of P represents the probability that the i-th user will purchase and watch the j-th content, i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

3. The method of claim 2,
wherein the number of rows of X equals the number of users, the number of columns of X equals the number of features, and the elements of X are denoted by x_i,j.

4. The method of claim 3,
wherein the number of rows of Y equals the number of features, the number of columns of Y equals the number of contents, and the elements of Y are denoted by y_i,j.

5. The method of claim 4,
wherein the number of rows of O equals the number of users, the number of columns of O equals the number of contents, the elements of O are denoted by o_i,j, the element o_i,j of O indicates whether the i-th user has watched the j-th content, i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

6. The method of claim 5,
wherein the number of rows of C equals the number of users, the number of columns of C equals the number of contents, the elements of C are denoted by c_i,j, the element c_i,j of C represents the popularity of the j-th content with respect to the i-th user, i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

7. The method of claim 6,
wherein the popularity c_i,j of the j-th content with respect to the i-th user is defined according to the following formula:

[formula not reproduced here; see Equation 1 of the description]

8. The method of claim 6,
wherein the loss function L is a function of x_i,j and y_i,j.

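9. The method of claim 8,
wherein the loss function L is determined according to the following formula:

[formula not reproduced here; see Equation 2 of the description]

where ‖X‖² represents the square of the norm of X, ‖Y‖² represents the square of the norm of Y, and [symbol not reproduced] is a constant.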

10. The method of claim 9,
wherein determining the values of the elements of X and the values of the elements of Y using the loss function L comprises updating the values of the elements of X and the values of the elements of Y a plurality of times so that the loss function L approaches a minimum value.

11. The method of claim 10,
wherein updating the values of the elements of X and the values of the elements of Y a plurality of times so that the loss function L approaches the minimum value comprises applying a gradient descent algorithm to the loss function L to update the values of the elements of X and the values of the elements of Y a plurality of times.

12. The method of claim 2,
wherein calculating the values of the elements of P using the values of the elements of X and the values of the elements of Y comprises multiplying X by Y to calculate the values of the elements of P.

13. The method of claim 2,
further comprising determining to recommend at least one content to at least one user based on the values of the elements of P.

14. The method of claim 13,
wherein determining to recommend at least one content to at least one user based on the values of the elements of P comprises determining to recommend the j-th content to the i-th user when the value of the element p_i,j of P is greater than or equal to a predetermined threshold.

15. An apparatus for VOD content recommendation, comprising:
a database unit for storing data about a content viewing matrix O indicating whether users have watched contents and a popularity matrix C indicating the popularity of the contents; and
a processing engine communicatively coupled to the database unit,
wherein the processing engine is configured to perform:
(i) an operation of setting a content purchase probability matrix P, wherein P includes elements, and the elements of P represent probabilities that the users will purchase and watch the contents;
(ii) an operation of setting X and Y so that P is expressed as a product of a customer characteristic information matrix X defining characteristics for each user and a content characteristic information matrix Y defining characteristics for each content;
(iii) an operation of determining values of the elements of X and values of the elements of Y using a loss function L defined by P, O and C; and
(iv) an operation of calculating values of the elements of P using the values of the elements of X and the values of the elements of Y.

16. The apparatus of claim 15,
wherein the number of rows of P equals the number of users, the number of columns of P equals the number of contents, the elements of P are denoted by p_i,j, the element p_i,j of P represents the probability that the i-th user will purchase and watch the j-th content, i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

17. The apparatus of claim 16,
wherein the number of rows of X equals the number of users, the number of columns of X equals the number of features, and the elements of X are denoted by x_i,j.

18. The apparatus of claim 17,
wherein the number of rows of Y equals the number of features, the number of columns of Y equals the number of contents, and the elements of Y are denoted by y_i,j.

19. The apparatus of claim 18,
wherein the number of rows of O equals the number of users, the number of columns of O equals the number of contents, the elements of O are denoted by o_i,j, the element o_i,j of O indicates whether the i-th user has watched the j-th content, i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

20. The apparatus of claim 19,
wherein the number of rows of C equals the number of users, the number of columns of C equals the number of contents, the elements of C are denoted by c_i,j, the element c_i,j of C represents the popularity of the j-th content with respect to the i-th user, i is at least 1 and at most the number of users, and j is at least 1 and at most the number of contents.

21. The apparatus of claim 20,
wherein the popularity c_i,j of the j-th content with respect to the i-th user is defined according to the following formula:

[formula not reproduced here; see Equation 1 of the description]

22. The apparatus of claim 20,
wherein the loss function L is a function of x_i,j and y_i,j.

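23. The apparatus of claim 22,
wherein the loss function L is determined according to the following formula:

[formula not reproduced here; see Equation 2 of the description]

where ‖X‖² represents the square of the norm of X, ‖Y‖² represents the square of the norm of Y, and [symbol not reproduced] is a constant.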

24. The apparatus of claim 23,
wherein the processing engine is further configured to perform an operation of updating the values of the elements of X and the values of the elements of Y a plurality of times so that the loss function L approaches a minimum value.

25. The apparatus of claim 24,
wherein the processing engine is further configured to perform an operation of updating the values of the elements of X and the values of the elements of Y a plurality of times by applying a gradient descent algorithm to the loss function L.

26. The apparatus of claim 16,
wherein the processing engine is further configured to perform an operation of multiplying X by Y to calculate the values of the elements of P.

27. The apparatus of claim 16,
wherein the processing engine is further configured to perform an operation of determining to recommend at least one content to at least one user based on the values of the elements of P.

28. The apparatus of claim 27,
wherein the processing engine is further configured to perform an operation of determining to recommend the j-th content to the i-th user when the value of the element p_i,j of P is greater than or equal to a predetermined threshold.

29. A computer-readable recording medium having recorded thereon a program including instructions that, when executed by a computer, cause the computer to perform the method according to any one of claims 1 to 14.