KR20150101341A

KR20150101341A - Apparatus and method for recommending movie based on distributed fuzzy association rules mining

Info

Publication number: KR20150101341A
Application number: KR1020140022934A
Authority: KR
Inventors: 김민성
Original assignee: 에스케이플래닛 주식회사
Priority date: 2014-02-26
Filing date: 2014-02-26
Publication date: 2015-09-03
Also published as: KR102167593B1; WO2015129983A1

Abstract

Disclosed are an apparatus and a method for recommending movies based on distributed fuzzy association rule mining. The invention acquires first rating data that includes movie ratings, converts the first rating data into second rating data using a fuzzy membership function, and generates an associated movie list by applying the second rating data to fuzzy association rule mining. Movies relevant to a recommendation target user can then be recommended by presenting the movies in the associated movie list in descending order of relevance.

Description

APPARATUS AND METHOD FOR RECOMMENDING MOVIE BASED ON DISTRIBUTED FUZZY ASSOCIATION RULES MINING

The present invention relates to a movie recommendation apparatus and method based on distributed fuzzy association rule mining, which convert users' movie rating information into linguistic information and recommend movies through associations among the linguistic information. More particularly, it relates to an apparatus and method that generate an associated movie list by applying movie rating data to fuzzy association rule mining and use the generated list to recommend movies to a recommendation target user.

Conventional movie recommendation algorithms apply association rule mining to users' movie purchase histories and recommend movies in the form 'people who watched this movie also watched'. Alternatively, they compute the similarity between movies and recommend 'movies similar to this movie'.

However, when the rating log is large and there are many users, association rules are difficult to compute with conventional methods, so examples of fuzzy association rule mining applied to the movie recommendation domain are hard to find.

Accordingly, there is a pressing need for a movie recommendation technique based on distributed fuzzy association rule mining that converts users' rating information into linguistic information, obtains associations among users' movie ratings through distributed fuzzy association rule mining on that linguistic information, and uses those associations to recommend suitable movies to a user.

Korean Patent Laid-Open Publication No. 10-2013-0009360A, published July 30, 2013 (Title: Method and system for providing movie recommendation service)

An object of the present invention is to recommend movies suited to a user's preferences by replacing the ratings that users give to movies with linguistic evaluation information and recommending movies associated with that evaluation information.

Another object of the present invention is to provide users with a more reliable movie recommendation function by efficiently processing large volumes of movie rating information with a data processing scheme suited to a distributed framework.

To achieve the above objects, a movie recommendation apparatus according to the present invention includes: a data acquisition unit that acquires first rating data including ratings for movies; an association list generation unit that converts the acquired first rating data into second rating data and generates an associated movie list by applying the second rating data to fuzzy association rule mining; and a movie recommendation unit that recommends movies to a recommendation target user using the associated movie list.

Here, the association list generation unit may obtain fuzzy membership values by substituting a rating into one or more fuzzy membership functions, including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and may convert the first rating data into the second rating data by replacing the rating with the linguistic label that corresponds to the obtained fuzzy membership values.

Here, the association list generation unit may generate at least one of a fuzzy confidence and a fuzzy correlation using fuzzy association rule mining, and may generate the associated movie list based on at least one of the generated fuzzy confidence and fuzzy correlation.

Here, the association list generation unit may include an association combination generation unit that generates per-movie association combinations by combining the second rating data according to fuzzy association rules, and a fuzzy support calculation unit that generates a per-movie rating history by organizing the second rating data by movie and calculates a per-movie fuzzy support using that rating history.

Here, the fuzzy support calculation unit may calculate the per-movie fuzzy support using a reference value obtained by normalizing the fuzzy membership values.

Here, the association list generation unit may calculate an association combination fuzzy support for each per-movie association combination by combining at least two of the fuzzy membership functions, and may calculate the fuzzy confidence using at least one of the per-movie fuzzy support and the calculated association combination fuzzy support.

Here, the association list generation unit may calculate the fuzzy correlation using at least one of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support.

Here, the movie recommendation unit may rank the generated associated movie lists according to a preset importance and recommend movies in order, starting from the highest-ranked associated movie list.

Here, the associated movie list may include at least one of a movie's title, genre, director, country, production year, and image.

A movie recommendation method according to the present invention includes: acquiring input data including ratings for movies; converting the acquired input data into second rating data and generating an associated movie list by applying the second rating data to fuzzy association rule mining; and recommending movies to a recommendation target user using the generated associated movie list.

Here, generating the associated movie list may include obtaining fuzzy membership values by substituting a rating into one or more fuzzy membership functions, including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and the rating may be replaced with the linguistic label that corresponds to the obtained fuzzy membership values to produce the second rating data.

Here, generating the associated movie list may include generating at least one of a fuzzy confidence and a fuzzy correlation using association rule mining, and generating the associated movie list based on at least one of the generated fuzzy confidence and fuzzy correlation.

Here, generating the associated movie list may include generating at least one of a fuzzy confidence and a fuzzy correlation using fuzzy association rule mining, and generating the associated movie list based on one of the generated fuzzy confidence and fuzzy correlation.

Here, generating the associated movie list may include calculating an association combination fuzzy support for each per-movie association combination by combining at least two of the fuzzy membership functions, and calculating the fuzzy confidence using at least one of the per-movie fuzzy support and the calculated association combination fuzzy support.

Here, generating the associated movie list may include calculating the fuzzy correlation using at least one of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support.

Here, recommending movies may include ranking the generated associated movie lists according to a preset importance, and movies may be recommended in order, starting from the highest-ranked associated movie list.

According to the present invention, rating information about movies is acquired from many users and a list of associated movies is extracted from it, so that movies matching the preferences of a recommendation target user can be recommended based on that user's ratings.

In addition, because the present invention derives associations after replacing movie ratings with linguistic information, it can generate and provide a variety of recommended movie lists based on the different directions of association between linguistic labels.

FIG. 1 is a block diagram of a movie recommendation apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram of the association list generation unit of the movie recommendation apparatus of FIG. 1.
FIG. 3 is a flowchart of a movie recommendation method according to an embodiment of the present invention.
FIG. 4 is a flowchart of a process of generating an associated movie list according to an embodiment of the present invention.
FIG. 5 shows an example of users' first rating data for movies.
FIG. 6 shows fuzzy membership functions for generating second rating data according to the present invention.
FIG. 7 shows the first rating data of FIG. 5 expressed as fuzzy membership values and second rating data using the fuzzy membership function of FIG. 6(a).

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Detailed descriptions of well-known functions or configurations that could obscure the subject matter of the present invention are omitted from the following description and the drawings. Note also that the same components are denoted by the same reference numerals wherever possible throughout the drawings.

The terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, on the principle that an inventor may appropriately define terms in order to describe the invention in the best way. The embodiments described in this specification and the configurations shown in the drawings are therefore only the most preferred embodiments of the present invention and do not represent all of its technical idea, so it should be understood that various equivalents and modifications could replace them at the time of filing. In addition, terms such as first and second are used to describe various components and serve only to distinguish one component from another; they are not used to limit the components.

FIG. 1 is a block diagram of a movie recommendation apparatus according to an embodiment of the present invention.

Referring to FIG. 1, a movie recommendation apparatus 100 according to an embodiment of the present invention may include a data acquisition unit 110, an association list generation unit 120, and a movie recommendation unit 130.

The data acquisition unit 110 may acquire first rating data including users' ratings for movies.

The first rating data may include information obtainable from an input log: a user ID that identifies the user, a movie ID that identifies the movie, and a rating in numeric form. First rating data can also be obtained from any other form of input log from which the user ID, movie ID, and rating can be extracted. For example, to process the first rating data as a single transaction, it may be represented in the form 'User1(m1, r_1_1), (m3, r_1_3), ... , (m100, r_1_100)'. Here, User1 is a user ID; m1, m3, and m100 are movie IDs; and r_1_1, r_1_3, and r_1_100 are ratings, where r_n_m denotes the rating given by the n-th user to the m-th movie.
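The transaction form described above can be sketched as follows. This is a minimal illustration, assuming the input log has already been parsed into (user ID, movie ID, rating) rows; the function name `build_transactions` and the sample rows are hypothetical and not taken from the specification.

```python
from collections import defaultdict


def build_transactions(log_rows):
    """Group (user_id, movie_id, rating) rows into one transaction per user.

    The result mirrors the form 'User1(m1, r_1_1), (m3, r_1_3), ...':
    each user ID maps to a list of (movie_id, rating) pairs.
    """
    transactions = defaultdict(list)
    for user_id, movie_id, rating in log_rows:
        transactions[user_id].append((movie_id, rating))
    return dict(transactions)


# Hypothetical log rows used only for illustration.
log_rows = [
    ("User1", "m1", 8),
    ("User1", "m3", 5),
    ("User2", "m1", 3),
]
transactions = build_transactions(log_rows)
```

Keeping one entry per user makes each user's ratings available together, which is what the later combination step needs.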

The association list generation unit 120 may convert the acquired first rating data into second rating data and generate an associated movie list by applying the second rating data to fuzzy association rule mining.

Fuzzy association rule mining applies fuzzy theory to association rule mining: instead of the binary logic under which each object either belongs or does not belong to a set, the degree to which each object belongs to the set is expressed by a membership function. Thus, as in association rule mining, the associations between items registered in a market or store can be calculated from the items consumed according to users' logs.

Such fuzzy association rules are usually computed on a single machine, but large-scale recommendation requires this logic to be processed in a distributed manner. The present invention may therefore use Hadoop MapReduce to compute fuzzy association rule mining more efficiently on a distributed framework.

In MapReduce, the problem can be solved by defining <key, value> pairs for each mapper and reducer stage. A <key, value> pair is the basic unit in which data is processed, and the key and the value can each be defined as an arbitrary structure or class to handle complex data.
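The <key, value> idea can be illustrated with a small in-memory stand-in for a MapReduce job. This sketch does not use Hadoop's API; `run_map_reduce`, `RatingValue`, and the choice of the movie ID as the key are assumptions made only to show how a structured value travels from mapper to reducer.

```python
from collections import defaultdict
from typing import NamedTuple


class RatingValue(NamedTuple):
    """Structured value carried under each key (hypothetical layout)."""
    user_id: str
    rating: int


def mapper(record):
    # Emit <movie_id, RatingValue> so that ratings are grouped by movie.
    user_id, movie_id, rating = record
    yield movie_id, RatingValue(user_id, rating)


def reducer(movie_id, values):
    # Collect every rating of one movie into a sorted list.
    return sorted(values)


def run_map_reduce(records, mapper, reducer):
    """Simulate map, shuffle-by-key, and reduce in a single process."""
    shuffled = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            shuffled[key].append(value)
    return {key: reducer(key, values) for key, values in shuffled.items()}


records = [("User1", "m1", 8), ("User2", "m1", 3), ("User1", "m3", 5)]
by_movie = run_map_reduce(records, mapper, reducer)
```

On a real cluster the shuffle is performed by the framework; only the mapper and reducer logic would carry over.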

Here, fuzzy membership values may be obtained by substituting a rating into one or more fuzzy membership functions, including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and the rating may be replaced with the linguistic label that corresponds to the obtained fuzzy membership values to produce the second rating data.

When a rating in the first rating data is substituted into the fuzzy membership functions, a fuzzy membership value between 0 and 1 is obtained for each linguistic label, and the rating is replaced with the linguistic label that has the largest fuzzy membership value to produce the second rating data. For example, if a rating of 8 in the first rating data yields a fuzzy membership value of 0.3 for the label 'normal' and 0.7 for the label 'good', the rating of 8 is replaced with 'good' to generate the second rating data.
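The conversion from a numeric rating to a linguistic label can be sketched with triangular membership functions. The 0 to 10 rating scale, the three labels, and the breakpoints in `LABELS` are assumptions chosen for illustration, so the membership values differ from the 0.3 and 0.7 of the example in the text.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical label definitions on an assumed 0-10 rating scale.
LABELS = {
    "dislike": (-5.0, 0.0, 5.0),
    "normal": (0.0, 5.0, 10.0),
    "good": (5.0, 10.0, 15.0),
}


def fuzzify(rating):
    """Return the fuzzy membership value of a rating for every label."""
    return {label: triangular(rating, *abc) for label, abc in LABELS.items()}


def to_label(rating):
    """Replace a rating with the label that has the largest membership."""
    memberships = fuzzify(rating)
    return max(memberships, key=memberships.get)


memberships_for_8 = fuzzify(8)   # 'normal' about 0.4, 'good' about 0.6
label_for_8 = to_label(8)        # 'good'
```

Trapezoidal or Gaussian functions could be swapped in for `triangular` without changing `to_label`.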

Each fuzzy membership function corresponds to a linguistic label such as 'dislike', 'normal', or 'good', and the range and number of the fuzzy membership functions can be specified separately.

The association list generation unit 120 may also generate at least one of a fuzzy confidence and a fuzzy correlation using fuzzy association rule mining, and may generate the associated movie list based on at least one of the generated fuzzy confidence and fuzzy correlation.

The association list generation unit 120 may also generate per-movie association combinations by combining the second rating data according to fuzzy association rules, generate a per-movie rating history by organizing the second rating data by movie, and calculate a per-movie fuzzy support using that rating history.

A per-movie association combination may be generated with the association rule length preset according to the fuzzy association rules. For example, if the association rule length is 2, a movie combination for movies m1 and m3 is generated as 'm1, m3, user1, r_1_1, r_1_3', and the per-user ratings for that combination are gathered into data such as 'm1, m3, (user1, r_1_1, r_1_3), (user7, r_7_1, r_7_3), ... , (userN, r_N_1, r_N_3)'. Here, r_n_m denotes the rating given by the n-th user to the m-th movie.
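The combination step for an association rule length of 2 can be sketched as below. This is an illustration under assumptions: the function name `build_pair_records`, the dictionary layout, and the sample labels are hypothetical, and sorting the movie IDs is only one way to give each pair a canonical order.

```python
from collections import defaultdict
from itertools import combinations


def build_pair_records(transactions, rule_length=2):
    """Collect, for every movie combination, the users who rated all of its movies.

    transactions maps user_id -> list of (movie_id, rating_or_label).
    The result maps (movie_a, movie_b) -> [(user_id, rating_a, rating_b), ...],
    mirroring 'm1, m3, (user1, r_1_1, r_1_3), (user7, r_7_1, r_7_3), ...'.
    """
    pairs = defaultdict(list)
    for user_id, rated in transactions.items():
        # Sort so that each combination always appears in the same movie order.
        for combo in combinations(sorted(rated), rule_length):
            movies = tuple(movie for movie, _ in combo)
            ratings = tuple(rating for _, rating in combo)
            pairs[movies].append((user_id, *ratings))
    return dict(pairs)


# Hypothetical second rating data (ratings already replaced by labels).
transactions = {
    "user1": [("m1", "good"), ("m3", "normal")],
    "user7": [("m3", "good"), ("m1", "dislike")],
    "user9": [("m1", "good")],
}
pair_records = build_pair_records(transactions)
```

Users who rated only one movie, such as user9 here, contribute no pair.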

The per-movie rating history may be built, for example, by collecting second rating data of the form 'User1(m1, r_1_1), (m3, r_1_3), ... , (m100, r_1_100)' by movie into the form 'm1, (user1, r_1_1), (user2, r_2_1), ... , (userN, r_N_1)'.

The per-movie fuzzy support may be calculated using a reference value obtained by normalizing the fuzzy membership values. For example, each fuzzy membership value may be normalized using Equation 1 below.

[Equation 1]

The terms of Equation 1 denote, respectively, the membership value that a value has under the l-th membership function, the normalized base value, and the i-th record value in the transaction DB.

Using the reference value obtained by normalizing the fuzzy membership values in this way, the per-movie fuzzy support can be calculated as follows.

The fuzzy support may be calculated using Equation 2 below.

[Equation 2]

The terms of Equation 2 denote, respectively, the membership value that a value has under the l-th membership function, the normalized base value, the i-th record value in the transaction DB, and the fuzzy support calculated by substituting the calculated reference value.
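Because the formulas themselves are not reproduced in this text, the following is only a sketch under stated assumptions, not the specification's Equations 1 and 2. It assumes that normalization divides each membership value by the sum of a record's membership values over all labels, and that the fuzzy support of a (movie, label) pair is the sum of the normalized values over the movie's rating history divided by the total number of transactions. The names `normalize` and `fuzzy_support` are hypothetical.

```python
def normalize(memberships):
    """Scale one record's membership values so that they sum to 1 (assumed rule)."""
    total = sum(memberships.values())
    if total == 0:
        return {label: 0.0 for label in memberships}
    return {label: value / total for label, value in memberships.items()}


def fuzzy_support(history, label, n_transactions):
    """Assumed fuzzy support of (movie, label).

    history holds one membership dict per user who rated the movie;
    n_transactions is the total number of user transactions in the DB.
    """
    total = sum(normalize(record)[label] for record in history)
    return total / n_transactions


# Hypothetical rating history of one movie: two of four users rated it.
history_m1 = [
    {"dislike": 0.0, "normal": 0.4, "good": 0.6},
    {"dislike": 0.0, "normal": 0.5, "good": 1.0},
]
support_m1_good = fuzzy_support(history_m1, "good", n_transactions=4)
```

Under these assumptions a movie rated 'good' by few users receives a low support even if those users' membership values are high, because the denominator counts every transaction.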

In addition, an association combination fuzzy support may be calculated for each per-movie association combination by combining at least two of the fuzzy membership functions, and the fuzzy confidence may be calculated using at least one of the per-movie fuzzy support and the calculated association combination fuzzy support.

The association combination fuzzy support may be expressed in the form 'm1, m2, MF_1, MF_2, FS(m1, MF_1, m2, MF_2)', where m denotes a movie, MF a fuzzy membership function, and FS a fuzzy support. For example, the fuzzy confidence may be calculated using Equation 3 below.

[Equation 3]

The terms of Equation 3 denote, respectively, the membership value that a value has under the l-th membership function, the normalized base value, the i-th record value in the transaction DB, the fuzzy supports calculated by substituting the reference value, and the fuzzy confidence.
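As a hedged illustration of how a fuzzy confidence can follow from the two kinds of support, the sketch below uses the conventional association rule definition, confidence(X -> Y) = support(X and Y) / support(X). It is not confirmed to match Equation 3. Combining the two membership values of a pair with `min` is also an assumption, and the function names and sample values are hypothetical.

```python
def pair_fuzzy_support(pair_memberships, n_transactions):
    """Assumed association combination fuzzy support FS(m1, MF_1, m2, MF_2).

    pair_memberships holds (mu_1, mu_2) for each user who rated both movies,
    where mu_k is that user's membership value for movie k under MF_k.
    The two values are combined with min, a common fuzzy AND.
    """
    return sum(min(mu_1, mu_2) for mu_1, mu_2 in pair_memberships) / n_transactions


def fuzzy_confidence(support_pair, support_antecedent):
    """Conventional confidence of the rule (m1, MF_1) -> (m2, MF_2)."""
    if support_antecedent == 0:
        return 0.0
    return support_pair / support_antecedent


# Hypothetical values: two of four users rated both movies.
support_pair = pair_fuzzy_support([(0.6, 0.8), (1.0, 0.4)], n_transactions=4)
confidence = fuzzy_confidence(support_pair, support_antecedent=0.4)
```

With these inputs the pair support is 0.25 and the confidence is about 0.625, meaning that roughly five eighths of the support for the first movie's label carries over to the pair.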

In addition, the fuzzy correlation may be calculated using at least one of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support. For example, the fuzzy correlation may be calculated using Equation 4 below.

[Equation 4]

The terms of Equation 4 denote, respectively, the membership value that a value has under the l-th membership function, the normalized base value, the i-th record value in the transaction DB, the fuzzy supports calculated by substituting the reference value, the fuzzy confidence, and the fuzzy correlation.


The movie recommendation unit 130 may recommend movies to the recommendation target user using the associated movie list. For example, the movies in the associated movie list may be shown to the recommendation target user in descending order of relevance, so that the movies best suited to the user are recommended first.

Here, the generated associated movie lists may be ranked according to a preset importance, and movies may be recommended in order, starting from the highest-ranked associated movie list. For example, using the linguistic labels in the second rating data, relations between movies of the form 'good -> good' or 'dislike -> good' can be used to draw in more users who liked a movie recommended by the service, whereas relations of the form 'good -> dislike' or 'dislike -> dislike' are better suited to arousing curiosity than to directly inducing purchases. Relations involving 'normal' are hard to tie to a direct recommendation function and may be unnecessary information from the standpoint of the recommendation service. Accordingly, 'good -> good' and 'dislike -> good' relations may be ranked as high importance, and 'good -> dislike', 'dislike -> dislike', and relations involving 'normal' as relatively low importance.
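The importance-based ranking can be sketched as a two-tier sort. The text only separates high-importance from lower-importance relations, so the tier numbers, the `HIGH_IMPORTANCE` set, and the function names below are assumptions. Ties within a tier keep their input order because Python's sort is stable.

```python
# Relations named as high importance in the text: (source label, target label).
HIGH_IMPORTANCE = {("good", "good"), ("dislike", "good")}


def importance_tier(relation):
    """Return 0 for high-importance relations and 1 for all others."""
    return 0 if relation in HIGH_IMPORTANCE else 1


def rank_relations(associated_lists):
    """Order the relations of the associated movie lists, high importance first.

    associated_lists maps a (source label, target label) relation to the
    movies associated through that relation.
    """
    return sorted(associated_lists, key=importance_tier)


# Hypothetical associated movie lists for one seed movie.
associated_lists = {
    ("good", "dislike"): ["B"],
    ("good", "good"): ["B", "C"],
    ("normal", "good"): ["D"],
    ("dislike", "good"): ["E"],
}
ranked = rank_relations(associated_lists)
```

A finer ordering inside each tier, for example by fuzzy confidence, could be added as a second sort key.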

In addition, if the same movie appears in more than one associated movie list, the duplicate may be deleted from the lower-ranked list. For example, if movie B appears in the list of movies that 'users who liked movie A also liked' and also in the list of movies that 'users who liked movie A disliked', the ranks of the lists are checked and movie B is deleted from the lower-ranked list, that is, the list of movies that 'users who liked movie A disliked'.
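The duplicate removal described above can be sketched as a single pass over the lists in rank order. The function name and the list names are hypothetical; the only behavior taken from the text is that a movie is kept in the highest-ranked list that contains it and removed from lower-ranked ones.

```python
def drop_lower_ranked_duplicates(ranked_lists):
    """Remove from each list the movies already present in a higher-ranked list.

    ranked_lists is a sequence of (list_name, movies), highest rank first.
    """
    seen = set()
    result = []
    for name, movies in ranked_lists:
        kept = [movie for movie in movies if movie not in seen]
        seen.update(kept)
        result.append((name, kept))
    return result


# Hypothetical lists for seed movie A, already ordered by rank.
ranked_lists = [
    ("liked A -> liked", ["B", "C"]),
    ("liked A -> disliked", ["B", "D"]),
]
deduplicated = drop_lower_ranked_duplicates(ranked_lists)
```

Here movie B survives only in the higher-ranked 'liked A -> liked' list.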

With such a movie recommendation apparatus, movies suited to a user's preferences can be recommended by using the ratings that users leave for movies to recommend movies associated with the user.

FIG. 2 is a block diagram of the association list generation unit of the movie recommendation apparatus of FIG. 1.

Referring to FIG. 2, the association list generation unit 120 of the movie recommendation apparatus of FIG. 1 includes an association combination generation unit 210 and a fuzzy support calculation unit 220.

The association combination generation unit 210 may generate per-movie association combinations by combining the second rating data according to fuzzy association rules.

A per-movie association combination may be generated with the association rule length preset according to the fuzzy association rules. For example, if the association rule length is 2, a movie combination for movies m1 and m3 is generated as 'm1, m3, user1, r_1_1, r_1_3', and the per-user ratings for that combination are gathered into data such as 'm1, m3, (user1, r_1_1, r_1_3), (user7, r_7_1, r_7_3), ... , (userN, r_N_1, r_N_3)'. Here, r_n_m denotes the rating given by the n-th user to the m-th movie.

The fuzzy support calculation unit 220 may generate a per-movie rating history by organizing the second rating data by movie and calculate a per-movie fuzzy support using that rating history.

Here, the per-movie rating history may be built, for example, by gathering second rating data of the form 'User1(m1, r_1_1), (m3, r_1_3), ... , (m100, r_1_100)' by movie into the form 'm1, (user1, r_1_1), (user2, r_2_1), ... , (userN, r_N_1)'.
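As a rough sketch of this regrouping, the per-user transactions can be inverted into a per-movie history. The function name `rating_history_by_movie` and the sample data are illustrative assumptions, not part of the specification.

```python
from collections import defaultdict

# Hypothetical per-user transactions: user -> list of (movie, rating placeholder)
transactions = {
    "user1": [("m1", "r_1_1"), ("m3", "r_1_3")],
    "user2": [("m1", "r_2_1")],
}

def rating_history_by_movie(transactions):
    """Invert user -> [(movie, rating)] into movie -> [(user, rating)]."""
    history = defaultdict(list)
    for user, rated in transactions.items():
        for movie, rating in rated:
            history[movie].append((user, rating))
    return dict(history)

history = rating_history_by_movie(transactions)
print(history["m1"])  # [('user1', 'r_1_1'), ('user2', 'r_2_1')]
```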

Here, the per-movie fuzzy support may be calculated using a reference value obtained by normalizing the fuzzy membership values. For example, each fuzzy membership value may be normalized using Equation 1 described above.

In addition, the fuzzy support may be calculated using Equation 2 described above.

FIG. 3 is a flowchart illustrating a movie recommendation method according to an embodiment of the present invention.

Referring to FIG. 3, the movie recommendation method according to an embodiment of the present invention acquires first rating data including users' ratings for movies (S310).

Here, the first rating data may include information obtainable from an input log, namely a user ID identifying the user and a movie ID identifying the movie, together with a rating in numeric form. First rating data can also be obtained from other forms of input log, as long as the user ID, movie ID, and rating can be extracted from them. For example, so that the first rating data can be processed as a single transaction, it may be represented in the form 'User1(m1, r_1_1), (m3, r_1_3), ... , (m100, r_1_100)'. Here, User1 is a user ID; m1, m3, and m100 are movie IDs; and r_1_1, r_1_3, and r_1_100 are ratings, where the rating given by the n-th user to the m-th movie is written as r_n_m.
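The following sketch shows one way such transactions might be assembled from a log. The comma-separated `user_id,movie_id,rating` layout is purely an assumption for illustration; the specification does not fix a log format, and `load_transactions` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Hypothetical input log, one "user_id,movie_id,rating" record per line.
LOG = """user1,m1,2
user1,m4,8
user7,m1,9
"""

def load_transactions(stream):
    """Group log records into one transaction (list of ratings) per user."""
    transactions = defaultdict(list)
    for user_id, movie_id, rating in csv.reader(stream):
        transactions[user_id].append((movie_id, float(rating)))
    return dict(transactions)

transactions = load_transactions(io.StringIO(LOG))
print(transactions["user1"])  # [('m1', 2.0), ('m4', 8.0)]
```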

In addition, the movie recommendation method according to an embodiment of the present invention converts the acquired first rating data into second rating data and applies the converted second rating data to fuzzy association rule mining to generate an associated movie list (S320).

In addition, the fuzzy support may be calculated using Equation 2 described above.

Here, the association-combination fuzzy support may be expressed in the form 'm1, m2, MF_1, MF_2, FS(m1, MF_1, m2, MF_2)', where m denotes a movie, MF a fuzzy membership function, and FS the fuzzy support. For example, the fuzzy confidence may be calculated using Equation 3 described above.

In addition, a fuzzy correlation may be calculated using at least one of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support. For example, the fuzzy correlation may be calculated using Equation 4 described above.

In addition, the movie recommendation method according to an embodiment of the present invention recommends movies to the recommendation target user using the associated movie list (S330). For example, the movies in the associated movie list may be presented to the recommendation target user in descending order of relevance, so that the movies best suited to the user are recommended first.

Here, the generated associated movie lists may be ranked according to a preset importance, and movies may be recommended in order starting from the highest-ranked associated movie list. For example, using the language labels included in the second rating data, a 'like -> like' or 'dislike -> like' relation between movies can serve to draw in more users who are pleased with the movies recommended by the service, whereas 'like -> dislike' and 'dislike -> dislike' relations may be better suited to arousing curiosity than to directly inducing a purchase. In addition, relations involving 'neutral' are hard to tie to any direct function of the recommendation service and may therefore be unnecessary information from a service standpoint. Accordingly, 'like -> like' and 'dislike -> like' relations may be given a high importance rank, while 'like -> dislike', 'dislike -> dislike', and relations involving 'neutral' may be given a relatively low importance rank.
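A minimal sketch of this importance-based ordering is given below, using 'like', 'neutral', and 'dislike' for the three language labels (좋다, 보통, 싫다). The two-level importance table follows the grouping in the paragraph above; the rule record layout, the `score` field, and the use of descending score as a tie-breaker within an importance level are assumptions added for illustration.

```python
# Importance per label relation (lower number = higher rank).
# Relations not listed here, i.e. those involving 'neutral', fall into the low group.
HIGH, LOW = 0, 1
IMPORTANCE = {
    ("like", "like"): HIGH,
    ("dislike", "like"): HIGH,
    ("like", "dislike"): LOW,
    ("dislike", "dislike"): LOW,
}

def sort_key(rule):
    """Order by importance group first, then by descending score (assumed tie-breaker)."""
    relation = (rule["from_label"], rule["to_label"])
    return (IMPORTANCE.get(relation, LOW), -rule["score"])

# Hypothetical rules pointing from one source movie to candidate movies.
rules = [
    {"movie": "m2", "from_label": "like", "to_label": "dislike", "score": 0.9},
    {"movie": "m3", "from_label": "like", "to_label": "like", "score": 0.6},
    {"movie": "m4", "from_label": "neutral", "to_label": "like", "score": 0.8},
    {"movie": "m5", "from_label": "dislike", "to_label": "like", "score": 0.7},
]

ranked = sorted(rules, key=sort_key)
print([r["movie"] for r in ranked])  # ['m5', 'm3', 'm2', 'm4']
```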

Such a movie recommendation method makes it possible to provide users of a movie recommendation service with a reliable recommendation service based on large-volume user logs.

FIG. 4 is a flowchart illustrating a process of generating an associated movie list according to an embodiment of the present invention.

Referring to FIG. 4, the process of generating an associated movie list according to an embodiment of the present invention obtains second rating data by replacing the ratings included in the first rating data with language labels (S410).

Here, a fuzzy membership value may be obtained by substituting the rating into at least one fuzzy membership function among those including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and the first rating data may be converted into second rating data by replacing the rating with the language label corresponding to the obtained fuzzy membership value.
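For reference, the three membership function families named above are commonly written as follows. These are the standard textbook forms, not formulas taken from the specification; the parameters a, b, c, d, m, and σ are illustrative, and the shapes actually used in FIG. 6 may be parameterized differently.

```latex
\[
\mu_{\mathrm{tri}}(x; a, b, c) = \max\!\left(\min\!\left(\frac{x-a}{b-a},\ \frac{c-x}{c-b}\right),\ 0\right), \qquad a < b < c
\]
\[
\mu_{\mathrm{trap}}(x; a, b, c, d) = \max\!\left(\min\!\left(\frac{x-a}{b-a},\ 1,\ \frac{d-x}{d-c}\right),\ 0\right), \qquad a < b \le c < d
\]
\[
\mu_{\mathrm{gauss}}(x; m, \sigma) = \exp\!\left(-\frac{(x-m)^{2}}{2\sigma^{2}}\right), \qquad \sigma > 0
\]
```

In each case x is the numeric rating and the result lies in [0, 1], giving the degree to which the rating belongs to a given language label.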

In addition, the process of generating an associated movie list according to an embodiment of the present invention may generate per-movie association combinations by combining the converted second rating data according to the fuzzy association rule (S420).

In addition, the process of generating an associated movie list according to an embodiment of the present invention may generate a per-movie rating history in which the second rating data is organized by movie (S430), and may calculate a per-movie fuzzy support using the generated per-movie rating history (S440).

In addition, the process of generating an associated movie list according to an embodiment of the present invention may calculate an association-combination fuzzy support for each per-movie association combination by combining at least two of the fuzzy membership functions, using the per-movie association combinations generated in step S420 and the per-movie fuzzy support calculated in step S440 (S450).

In addition, the process of generating an associated movie list according to an embodiment of the present invention may calculate a fuzzy confidence using at least one of the per-movie fuzzy support and the calculated association-combination fuzzy support (S460).
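The relationship between the support values of steps S440 and S450 and the confidence of step S460 can be illustrated with the formulation commonly used in fuzzy association rule mining. This sketch is an assumption: Equations 1 to 3 of the specification are not reproduced in this passage and may differ, for example in how membership values are normalized. The sketch takes support as the mean membership degree over users, uses the minimum as the fuzzy AND, and treats a user who did not rate a movie as having degree 0.

```python
def fuzzy_support(degrees):
    """Mean membership degree over all users (textbook fuzzy support)."""
    return sum(degrees) / len(degrees)

def fuzzy_confidence(deg_a, deg_b):
    """conf(A -> B) = FS(A, B) / FS(A), with min() as the fuzzy AND."""
    joint = [min(a, b) for a, b in zip(deg_a, deg_b)]
    fs_a = fuzzy_support(deg_a)
    return fuzzy_support(joint) / fs_a if fs_a else 0.0

# Hypothetical membership degrees of four users for (m1, 'like') and (m3, 'like').
m1_like = [1.0, 0.5, 0.0, 0.5]
m3_like = [0.5, 0.5, 1.0, 0.0]

print(fuzzy_support(m1_like))              # 0.5
print(fuzzy_confidence(m1_like, m3_like))  # 0.5
```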

In addition, the process of generating an associated movie list according to an embodiment of the present invention may calculate a fuzzy correlation using at least one of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support (S470).

In addition, the process of generating an associated movie list according to an embodiment of the present invention may generate the associated movie list based on one of the calculated fuzzy confidence and fuzzy correlation (S480).
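Step S480 can be sketched as collecting, for each source movie, the movies it points to and ordering them by the chosen measure. The rule tuples, the function name `build_associated_lists`, and the `min_score` cutoff are hypothetical; the specification does not state that a threshold is applied.

```python
from collections import defaultdict

def build_associated_lists(rules, min_score=0.0):
    """Group (source, target, score) rules by source movie and sort each
    group by descending score (fuzzy confidence or fuzzy correlation)."""
    lists = defaultdict(list)
    for source, target, score in rules:
        if score >= min_score:  # assumed cutoff, not from the specification
            lists[source].append((target, score))
    return {
        movie: sorted(items, key=lambda item: item[1], reverse=True)
        for movie, items in lists.items()
    }

# Hypothetical scored rules.
rules = [("m1", "m3", 0.5), ("m1", "m4", 0.8), ("m3", "m1", 0.25), ("m1", "m7", 0.1)]

lists = build_associated_lists(rules, min_score=0.2)
print(lists["m1"])  # [('m4', 0.8), ('m3', 0.5)]
```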

FIG. 5 is a diagram showing an example of users' first rating data for movies.

FIG. 6 is a diagram showing fuzzy membership functions for generating second rating data according to the present invention.

FIG. 7 is a diagram showing the first rating data of FIG. 5 expressed as fuzzy membership values and second rating data using the fuzzy membership function of FIG. 6(a).

Referring to FIGS. 5, 6, and 7, when users have rated movies as shown in FIG. 5, fuzzy membership values and second rating data can be generated as shown in FIG. 7 using the fuzzy membership functions shown in FIG. 6.

For example, FIG. 5 shows that user 1 gave movie 1 a rating of 2. When the fuzzy membership function of FIG. 6(a) is used, FIG. 7 shows the fuzzy membership values of user 1 for movie 1 as dislike: 1.0, neutral: 0.0, like: 0.0. In this case, therefore, the rating of 2 in the first rating data is replaced with the language label 'dislike' to convert it into second rating data.

Likewise, user 1 gave movie 4 a rating of 8. FIG. 7 shows the fuzzy membership values of user 1 for movie 4 as dislike: 0.0, neutral: 0.33, like: 0.67. In this case, therefore, the rating of 8 in the first rating data is replaced with the language label 'like' to convert it into second rating data.
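The two conversions above can be reproduced with a small sketch that uses 'like', 'neutral', and 'dislike' for the three language labels (좋다, 보통, 싫다). The breakpoints below are hypothetical: FIG. 6(a) is not reproduced here, so they were chosen only so that ratings of 2 and 8 yield the membership values quoted in the text. Picking the label with the highest membership value is likewise an assumption that happens to agree with both examples.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], linear on (a, b) and (c, d), else 0."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

# Hypothetical breakpoints on a 1-10 rating scale (not taken from FIG. 6).
MEMBERSHIP = {
    "dislike": lambda r: trapezoid(r, 0, 1, 2, 5),
    "neutral": lambda r: trapezoid(r, 2, 5, 6, 9),
    "like": lambda r: trapezoid(r, 6, 9, 10, 11),
}

def fuzzify(rating):
    """Return the membership value per label and the label with the largest value."""
    degrees = {label: round(f(rating), 2) for label, f in MEMBERSHIP.items()}
    return degrees, max(degrees, key=degrees.get)

print(fuzzify(2))  # ({'dislike': 1.0, 'neutral': 0.0, 'like': 0.0}, 'dislike')
print(fuzzify(8))  # ({'dislike': 0.0, 'neutral': 0.33, 'like': 0.67}, 'like')
```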

The movie recommendation method according to the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and any type of hardware device specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. Such hardware devices may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

As described above, the movie recommendation apparatus and method based on distributed fuzzy association rule mining according to the present invention are not limited to the configurations and methods of the embodiments described above; rather, all or some of the embodiments may be selectively combined so that various modifications can be made.

According to the present invention, first rating data including ratings for movies is converted into second rating data, the converted second rating data is applied to fuzzy association rule mining to generate an associated movie list, and movies are recommended through the generated associated movie list, so that movies matching the preferences of the recommendation target user can be recommended effectively. Furthermore, because a data processing scheme suited to a distributed framework is used in order to apply this recommendation function to a large number of users, the invention can also be applied to large-scale rating log data and can thus provide a more reliable service.

100: movie recommendation apparatus 110: data acquisition unit
120: association list generation unit 130: movie recommendation unit
210: association combination generation unit 220: fuzzy support calculation unit

Claims

A movie recommendation apparatus comprising:
a data acquisition unit that acquires first rating data including ratings for movies;
an association list generation unit that converts the acquired first rating data into second rating data and applies the converted second rating data to fuzzy association rule mining to generate an associated movie list; and
a movie recommendation unit that recommends a movie to a recommendation target user using the associated movie list.

The apparatus according to claim 1,
wherein the association list generation unit
obtains a fuzzy membership value by substituting the rating into at least one fuzzy membership function among those including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and
converts the first rating data into the second rating data by replacing the rating with the language label corresponding to the obtained fuzzy membership value.

The apparatus according to claim 1,
wherein the association list generation unit
generates at least one of a fuzzy confidence and a fuzzy correlation using the fuzzy association rule mining, and
generates the associated movie list based on at least one of the generated fuzzy confidence and fuzzy correlation.

The apparatus according to claim 3,
wherein the association list generation unit comprises:
an association combination generation unit that generates per-movie association combinations by combining the converted second rating data according to a fuzzy association rule; and
a fuzzy support calculation unit that generates a per-movie rating history in which the second rating data is organized by movie and calculates a per-movie fuzzy support using the generated per-movie rating history.

The apparatus according to claim 4,
wherein the fuzzy support calculation unit
calculates the per-movie fuzzy support using a reference value obtained by normalizing the fuzzy membership values.

The apparatus according to claim 4,
wherein the association list generation unit
calculates an association-combination fuzzy support for the per-movie association combinations by combining at least two of the fuzzy membership functions, and
calculates the fuzzy confidence using at least one of the per-movie fuzzy support and the calculated association-combination fuzzy support.

The apparatus according to claim 4,
wherein the association list generation unit
calculates the fuzzy correlation using at least one of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support.

The apparatus according to claim 1,
wherein the movie recommendation unit
ranks the generated associated movie lists according to a preset importance and recommends movies in order starting from the highest-ranked associated movie list.

The apparatus according to claim 1,
wherein the associated movie list
includes a movie title, a genre, a director, a country, a year of production, and an image.

A movie recommendation method comprising:
acquiring first rating data including ratings for movies;
converting the acquired first rating data into second rating data and applying the converted second rating data to fuzzy association rule mining to generate an associated movie list; and
recommending a movie to a recommendation target user using the associated movie list.

The method of claim 10,
wherein the step of generating the associated movie list comprises:
obtaining a fuzzy membership value by substituting the rating into at least one fuzzy membership function among those including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function; and
converting the first rating data into the second rating data by replacing the rating with the language label corresponding to the obtained fuzzy membership value.

The method of claim 10,
wherein the step of generating the associated movie list comprises:
generating at least one of a fuzzy confidence and a fuzzy correlation using the fuzzy association rule mining; and
generating the associated movie list based on at least one of the generated fuzzy confidence and fuzzy correlation.

The method of claim 12,
wherein the step of generating the associated movie list comprises:
generating per-movie association combinations by combining the converted second rating data according to the fuzzy association rule; and
generating a per-movie rating history in which the second rating data is organized by movie, and calculating a per-movie fuzzy support using the generated per-movie rating history.

The method of claim 13,
wherein the step of generating the associated movie list comprises:
calculating an association-combination fuzzy support for the per-movie association combinations by combining at least two of the fuzzy membership functions; and
calculating the fuzzy confidence using at least one of the per-movie fuzzy support and the calculated association-combination fuzzy support.

The method of claim 13,
wherein the step of generating the associated movie list comprises
calculating the fuzzy correlation using at least one of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support.

The method of claim 10,
wherein the step of recommending the movie comprises:
ranking the generated associated movie lists according to a preset importance; and
recommending movies in order starting from the highest-ranked associated movie list.

A computer-readable recording medium on which a program for executing the method according to any one of claims 10 to 16 is recorded.