KR102126796B1

KR102126796B1 - Apparatus and method for determining news preference based on residence time using deep learning

Info

Publication number: KR102126796B1
Application number: KR1020180152506A
Authority: KR
Inventors: 강명주; 곽지훈; 노형민; 김현욱; 서현
Original assignee: 서울대학교 산학협력단
Priority date: 2018-11-30
Filing date: 2018-11-30
Publication date: 2020-06-26
Also published as: KR20200066447A

Abstract

The present invention relates to a technology that uses deep learning to determine a news subscriber's preference for the news the subscriber has read. The apparatus may include a residence time information collection unit, a normal distribution function matching unit, a residence time distribution estimation unit, a customized score calculation unit, and a personal preference determination unit. It collects residence time information for news articles on web pages subscribed to by a news subscriber, uses that information to calculate a customized score in which the residence time is converted into a relative score for each subscriber, and provides news preferences on that basis, which has the effect of providing more accurate individual preferences.

Description

Apparatus and method for determining news preference based on residence time using deep learning

The present invention relates to a technology that uses deep learning to determine a news subscriber's preference for the news the subscriber has read. More specifically, its object is to provide an apparatus and method for determining news preference based on residence time using deep learning, which collect residence time information for news articles on web pages subscribed to by each news subscriber, convert the collected residence times into relative scores, determine the personal preference for each news article, and provide individual news preferences.

With the explosive growth of online news, users' demand for finding news is increasing proportionally.

In particular, information filtering, which finds or filters the information each individual user wants from news generated online, is an important technology at a time of information overload.

A conventional news recommendation system builds a user profile from the documents a user has read online, analyzes the terms in that profile to be used for document recommendation, and recommends to the user documents that are highly similar to the analyzed terms.

When a conventional news recommendation system calculates each subscriber's news preference from the residence time on news articles, a problem arises: reading speed differs from subscriber to subscriber, so calculating preference from absolute time causes errors in the preference determination.

To solve this problem, an object of the present invention is to collect residence time information for news articles on web pages subscribed to by each news subscriber, convert the collected residence times into relative scores, determine the personal preference for each news article, and provide individual news preferences.

According to an embodiment of the present invention, an apparatus for determining news preference based on residence time using deep learning uses a deep learning model composed of a neural network operated by a processor, and may include: a residence time information collection unit that collects residence time information for news articles on a plurality of web pages subscribed to by a news subscriber; a normal distribution function matching unit that associates a normal distribution function with each piece of the collected residence time information for each news article; a residence time distribution estimation unit that linearly combines the normal distribution functions associated with the plurality of pieces of residence time information to estimate the residence time distribution of the news subscriber; a customized score calculation unit that accumulates the estimated residence time distribution to calculate a cumulative distribution function and uses the calculated cumulative distribution function to calculate a customized score in which the residence time is converted into a relative score for each subscriber; and a personal preference determination unit that generates, based on the calculated customized score, a preference graph for a plurality of news articles for each news subscriber and determines the personal preference for the corresponding news.

According to an embodiment of the present invention, in the residence time distribution estimation unit, the estimated residence time distribution may follow a distribution that matches the actual result more closely as the number of news articles subscribed to by the news subscriber increases.

According to an embodiment of the present invention, the customized score calculation unit may calculate the customized score by applying a quantile transformation to the calculated cumulative distribution function.

According to an embodiment of the present invention, the personal preference determination unit may determine that the higher the calculated customized score of a news article is, the more the news subscriber prefers that article.

According to an embodiment of the present invention, the determined personal preference may be used to provide the news subscriber with news similar to the news articles that showed a high personal preference.

According to an embodiment of the present invention, a method using a deep learning model composed of a neural network operated by a processor may include: collecting residence time information for news articles on a plurality of web pages subscribed to by a news subscriber; associating a normal distribution function with each piece of the collected residence time information for each news article; linearly combining the normal distribution functions associated with the plurality of pieces of residence time information to estimate the residence time distribution of the news subscriber; accumulating the estimated residence time distribution to calculate a cumulative distribution function, and using the calculated cumulative distribution function to calculate a customized score in which the residence time is converted into a relative score for each subscriber; and generating, based on the calculated customized score, a preference graph for a plurality of news articles for each news subscriber to determine the personal preference for the corresponding news.

According to an embodiment of the present invention, in the step of estimating the residence time distribution, the estimated residence time distribution may follow a distribution that matches the actual result more closely as the number of news articles subscribed to by the news subscriber increases.

According to an embodiment of the present invention, in the step of calculating the customized score, the customized score may be calculated by applying a quantile transformation to the calculated cumulative distribution function.

According to an embodiment of the present invention, in the step of determining the personal preference, it may be determined that the higher the calculated customized score of a news article is, the more the news subscriber prefers that article.

According to the present invention, deep learning is used to collect residence time information for news articles on web pages subscribed to by a news subscriber, and this information is used to calculate a customized score in which the residence time is converted into a relative score for each subscriber. Providing news preferences on this basis has the effect of providing more accurate individual preferences.

FIG. 1 is a configuration diagram of an apparatus for determining news preference based on residence time using deep learning according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the data flow in the news preference determination apparatus according to an embodiment of the present invention.
FIG. 3 is a graph of the residence time distribution estimated for each article by smoothing a histogram according to an embodiment of the present invention.
FIG. 4 is a graph of the cumulative distribution function generated by accumulating the estimated residence time distribution according to an embodiment of the present invention.
FIG. 5 is a flowchart of a method for determining news preference based on residence time using deep learning according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can easily practice them. However, the present invention may be implemented in many different forms and is not limited to the embodiments described herein.

In the drawings, parts unrelated to the description are omitted in order to describe the present invention clearly, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Hereinafter, an apparatus and method for determining news preference based on residence time using deep learning according to an embodiment of the present invention are described with reference to the drawings.

FIG. 1 is a configuration diagram of an apparatus (1000) for determining news preference based on residence time using deep learning according to an embodiment of the present invention.

Referring to FIG. 1, the apparatus (1000) for determining news preference based on residence time using deep learning may include a residence time information collection unit (100), a normal distribution function matching unit (200), a residence time distribution estimation unit (300), a customized score calculation unit (400), and a personal preference determination unit (500).

According to an embodiment of the present invention, the apparatus (1000) may perform the operation of each component using a deep learning model composed of a neural network operated by a processor.

The residence time information collection unit (100) may collect residence time information for news articles on a plurality of web pages subscribed to by a news subscriber.

According to an embodiment of the present invention, the time a single news subscriber stays on a web page in order to read a news article may be measured and defined as the residence time information.

According to this embodiment, the residence time may be measured for each news article among the plurality of news articles subscribed to by the news subscriber, and the residence time information may be generated from the residence times measured in this way.

The time information contained in the residence time information generated this way is absolute time; it takes no account of each subscriber's reading speed or reading habits. Calculating preference directly from it therefore loses accuracy because of individual differences.

Therefore, as in the present invention, there is a need to calculate a relative time for each individual based on the measured residence time.
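The need for a per-subscriber relative measure can be illustrated with a small sketch. The two reading histories below are hypothetical, and the empirical percentile used here is only a simple stand-in for the relative score; it is not the calculation described in this specification.

```python
from bisect import bisect_right


def empirical_percentile(history, t):
    """Return the fraction of past residence times that are <= t."""
    ordered = sorted(history)
    return bisect_right(ordered, t) / len(ordered)


# Hypothetical residence times in seconds for two subscribers.
fast_reader = [10, 15, 20, 25, 30]
slow_reader = [60, 90, 120, 150, 180]

# The same absolute stay of 60 seconds sits at opposite ends of the
# two subscribers' own histories.
fast_rank = empirical_percentile(fast_reader, 60)
slow_rank = empirical_percentile(slow_reader, 60)
print(fast_rank, slow_rank)
```

A 60-second stay is the longest the fast reader has ever recorded but the shortest for the slow reader, which is why an absolute threshold would misjudge at least one of them.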

The normal distribution function matching unit (200) may associate a normal distribution function with each piece of the collected residence time information for each news article.

According to an embodiment of the present invention, assume that news subscriber A has read news articles 1, 2, 3, and so on. Reading a specific article out of a vast number of news articles is a low-probability event, so each such event can carry significant meaning.

According to an embodiment of the present invention, assume that if news subscriber A is shown another news article similar in taste to news article 1, the subscriber will stay on it for about as long as on news article 1. Such similar residence times can then be presumed to follow a normal distribution, and a normal distribution function can be associated with each piece of the collected residence time information.

The residence time distribution estimation unit (300) may linearly combine the normal distribution functions associated with the plurality of pieces of residence time information to estimate the residence time distribution of the news subscriber.

According to an embodiment of the present invention, one normal distribution function is associated with each event in which news subscriber A reads news (a pair of news article and residence time), and the associated normal distribution functions are linearly combined to estimate the residence time distribution of news subscriber A.

According to the above embodiment, given the residence time for article N, the number of events of news subscriber A, and the standard deviation of the residence times of news subscriber A, the mean and standard deviation of each normal distribution function are as follows.

According to an embodiment of the present invention, Equation 1 below may be used with the variables described above to estimate the residence time distribution (pdf) of the news subscriber.

[Equation 1]

According to an embodiment of the present invention, the mean of the estimated residence time distribution may be assumed to equal the mean of the news subscriber's recorded residence times, and the variance of the residence time distribution (pdf) is given by Equation 2 below.

[Equation 2]
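As a hedged illustration of the kind of expressions Equations 1 and 2 describe, assume an equal-weight linear combination of $n$ normal distributions, one per reading event, centered at the recorded residence times $t_i$ and sharing a kernel standard deviation $h$. The symbols $t_i$, $n$, $h$, and $\bar{t}$ are introduced here for illustration and are not taken from the specification, and the specification's own choice of standard deviation may differ.

```latex
\begin{align}
p(t) &= \frac{1}{n}\sum_{i=1}^{n} \frac{1}{h\sqrt{2\pi}}
        \exp\!\left(-\frac{(t - t_i)^2}{2h^2}\right), \\
\mathbb{E}[T] &= \frac{1}{n}\sum_{i=1}^{n} t_i = \bar{t}, \\
\operatorname{Var}[T] &= h^2 + \frac{1}{n}\sum_{i=1}^{n} \left(t_i - \bar{t}\right)^2 .
\end{align}
```

Under these assumptions the mean of the mixture equals the mean of the recorded residence times, which agrees with the assumption stated above, and the variance is the recorded spread plus the kernel variance $h^2$.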

According to an embodiment of the present invention, as the number of news articles subscribed to by the news subscriber increases, the estimated residence time distribution may follow a distribution that matches the actual result more closely.

That is, as the number of news articles read by the news subscriber increases, the residence time distribution (pdf) can match the actual result more accurately.
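The estimation step can be sketched as follows. This is a minimal sketch, not the specification's implementation: it places one normal distribution on each recorded residence time and averages them with equal weights. The function names are hypothetical, and the shared standard deviation uses the common rule of thumb 1.06 · σ · n^(-1/5), which is an assumption made here in place of the specification's own formula.

```python
import math
from statistics import stdev


def gaussian_pdf(t, mean, std):
    """Density of a normal distribution with the given mean and std at t."""
    z = (t - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def rule_of_thumb_bandwidth(residence_times):
    """Assumed kernel std: 1.06 * sample std * n^(-1/5) (needs n >= 2)."""
    n = len(residence_times)
    return 1.06 * stdev(residence_times) * n ** (-0.2)


def residence_time_pdf(t, residence_times, bandwidth):
    """Equal-weight linear combination of one normal per reading event."""
    n = len(residence_times)
    return sum(gaussian_pdf(t, t_i, bandwidth) for t_i in residence_times) / n


# Hypothetical residence times in seconds for one subscriber.
times = [30, 45, 60, 90, 120]
h = rule_of_thumb_bandwidth(times)
density_at_60 = residence_time_pdf(60, times, h)
print(h, density_at_60)
```

Because the rule-of-thumb standard deviation shrinks as the number of events grows, each added reading event sharpens the estimate, which is one way to read the statement that more articles bring the estimate closer to the actual distribution.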

The customized score calculation unit (400) may accumulate the estimated residence time distribution to calculate a cumulative distribution function, and may use the calculated cumulative distribution function to calculate a customized score in which the residence time is converted into a relative score for each subscriber.

Here, the customized score is not a score on a scale with a fixed maximum; it may be a value whose relative magnitude indicates preference. It is not limited to this, however, and any value capable of expressing relative magnitude may be used without restriction.

According to an embodiment of the present invention, the cumulative distribution function (cdf) may be calculated using Equation 3 below.

[Equation 3]
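For an equal-weight combination of normal distributions, accumulating the density has a closed form: the cumulative distribution function is the average of the component normal cdfs. The sketch below assumes that form and a shared kernel standard deviation `bandwidth`; both are illustrative assumptions, and the function names are hypothetical.

```python
import math


def gaussian_cdf(t, mean, std):
    """Cumulative distribution function of a normal distribution at t."""
    return 0.5 * (1.0 + math.erf((t - mean) / (std * math.sqrt(2.0))))


def residence_time_cdf(t, residence_times, bandwidth):
    """Average of the per-event normal cdfs, i.e. the integral of the
    equal-weight mixture density from minus infinity up to t."""
    n = len(residence_times)
    return sum(gaussian_cdf(t, t_i, bandwidth) for t_i in residence_times) / n


# Hypothetical example: two reading events at 40 s and 80 s, kernel std 15 s.
example_times = [40, 80]
relative_position = residence_time_cdf(60, example_times, 15.0)
print(relative_position)
```

The returned value lies between 0 and 1 and increases with the residence time, so it already expresses where a given stay falls within that subscriber's own reading behavior.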

According to an embodiment of the present invention, the customized score may be calculated by applying a quantile transformation to the calculated cumulative distribution function.

According to an embodiment of the present invention, an ideal score distribution may be assumed and a quantile transformation then applied to calculate the customized score; in one embodiment, the quantile transformation of Equation 4 below may be performed to calculate the customized score.

[Equation 4]
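A quantile transformation maps a cumulative probability onto the scale of a chosen target distribution by applying that distribution's inverse cdf. The sketch below assumes, purely for illustration, that the ideal score distribution is normal with mean 50 and standard deviation 10; the specification does not state which ideal distribution is used, and `quantile_transform` is a hypothetical name.

```python
from statistics import NormalDist

# Assumed ideal score distribution (illustrative choice, not from the source).
IDEAL_SCORES = NormalDist(mu=50.0, sigma=10.0)


def quantile_transform(p, ideal=IDEAL_SCORES, eps=1e-6):
    """Map a cumulative probability p in [0, 1] to a score on the ideal scale.

    p is clamped away from 0 and 1 because the inverse cdf of a normal
    distribution is unbounded at those endpoints.
    """
    p = min(max(p, eps), 1.0 - eps)
    return ideal.inv_cdf(p)


median_score = quantile_transform(0.5)
high_score = quantile_transform(0.9)
low_score = quantile_transform(0.1)
print(low_score, median_score, high_score)
```

Because the inverse cdf is monotonic, the ordering of articles by cumulative probability is preserved; only the scale of the score changes.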

The personal preference determination unit (500) may generate, based on the calculated customized score, a preference graph for a plurality of news articles for each news subscriber and determine the personal preference for the corresponding news.

According to an embodiment of the present invention, the preference graph for a plurality of news articles for each news subscriber may be drawn with the residence time on the X axis and the customized score on the Y axis so that the customized score of each article can be seen at a glance, and the individual's preference for each news article can be determined from it.

According to an embodiment of the present invention, it may be determined that the higher the calculated customized score of a news article is, the more the news subscriber prefers that article.

According to an embodiment of the present invention, the determined personal preference may be used to provide the news subscriber with news similar to the news articles that showed a high personal preference.

According to an embodiment of the present invention, articles whose customized score is equal to or greater than a preset value may be selected as the individual's preferred articles, and articles similar to them may be provided to the news subscriber.
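Selecting preferred articles against a preset value can be sketched as a simple filter. The article identifiers, scores, and threshold below are hypothetical; how similar articles are then found is outside the scope of this sketch.

```python
def select_preferred(scores, threshold):
    """Return article ids whose customized score is >= threshold,
    ordered from the highest score to the lowest."""
    chosen = [article for article, score in scores.items() if score >= threshold]
    return sorted(chosen, key=lambda article: scores[article], reverse=True)


# Hypothetical customized scores for one subscriber.
customized_scores = {"article-1": 42.0, "article-2": 61.5, "article-3": 55.0}
preferred = select_preferred(customized_scores, threshold=55.0)
print(preferred)
```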

FIG. 2 is a diagram illustrating the data flow in the news preference determination apparatus according to an embodiment of the present invention.

FIG. 2 shows the data flow in the news preference determination apparatus. The cumulative distribution function (cdf) may be calculated from the residence time information associated with normal distribution functions through two main stages of arithmetic operations.

According to an embodiment of the present invention, a Parzen window (Equation 1) may be used as the first-stage arithmetic operation to estimate the residence time distribution.

The Parzen window method is also referred to as kernel density estimation. The window serves as a mask to be convolved with a given function, and the mask values may differ depending on the type of filter, such as Gaussian or uniform.
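The effect of the filter type can be seen in a generic kernel density estimator that accepts the kernel as a parameter. This is a textbook sketch rather than the specification's implementation; the function names and the window width `h` are illustrative.

```python
import math


def gaussian_kernel(u):
    """Standard normal density; weights fall off smoothly with distance."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def uniform_kernel(u):
    """Box window; every sample within one window width counts equally."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def parzen_density(t, samples, h, kernel):
    """Parzen window estimate: average of kernels centered on each sample."""
    return sum(kernel((t - s) / h) for s in samples) / (len(samples) * h)


# Hypothetical residence times in seconds and window width.
samples = [30, 45, 60, 90, 120]
smooth = parzen_density(60, samples, 20.0, gaussian_kernel)
boxy = parzen_density(60, samples, 20.0, uniform_kernel)
print(smooth, boxy)
```

With the uniform kernel only the samples within one window width of the query point contribute, so the estimate is piecewise constant, while the Gaussian kernel yields a smooth curve.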

According to an embodiment of the present invention, Equation 3 may be used as the second-stage arithmetic operation to calculate the cumulative distribution function (cdf).

The customized score may then be calculated using the cumulative distribution function (cdf) obtained in this way.

FIG. 3 is a graph of the residence time distribution estimated for each article by smoothing a histogram according to an embodiment of the present invention.

According to an embodiment of the present invention, the residence time distribution estimated using the Parzen window (Equation 1) may be represented as shown in FIG. 3.

FIG. 4 is a graph of the cumulative distribution function generated by accumulating the estimated residence time distribution according to an embodiment of the present invention.

According to an embodiment of the present invention, the second-stage arithmetic operation is performed using Equation 3 to calculate the cumulative distribution function (cdf), from which the customized score is calculated and may be represented as in the graph of FIG. 4.

Lines (1) to (5) in FIG. 4 may represent the customized scores for news articles 1 to 5, respectively; among these, a news article whose customized score exceeds a preset value may be determined to be an article the subscriber prefers.

FIG. 5 is a flowchart of a method for determining news preference based on residence time using deep learning according to an embodiment of the present invention.

Residence time information for news articles is collected (510).

According to an embodiment of the present invention, residence time information may be collected for news articles on a plurality of web pages subscribed to by a news subscriber.

A normal distribution function is associated with the residence time information (520).

According to an embodiment of the present invention, a normal distribution function may be associated with each piece of the collected residence time information for each news article.

According to an embodiment of the present invention, assume that if news subscriber A is shown another news article similar in taste to news article 1, the subscriber will stay on it for about as long as on news article 1. Such similar residence times can then be presumed to follow a normal distribution, and a normal distribution function can be associated with each piece of the collected residence time information.

The residence time distribution of the news subscriber is estimated (530).

According to an embodiment of the present invention, the normal distribution functions associated with the plurality of pieces of residence time information may be linearly combined to estimate the residence time distribution of the news subscriber.

According to an embodiment of the present invention, Equation 1 may be used with the variables described above to estimate the residence time distribution (pdf) of the news subscriber.

According to an embodiment of the present invention, the mean of the estimated residence time distribution may be assumed to equal the mean of the news subscriber's recorded residence times, and the variance of the residence time distribution (pdf) is given by Equation 2.

The customized score is calculated (540).

According to an embodiment of the present invention, the estimated residence time distribution may be accumulated to calculate a cumulative distribution function, and the calculated cumulative distribution function may be used to calculate a customized score in which the residence time is converted into a relative score for each subscriber.

According to an embodiment of the present invention, the cumulative distribution function (cdf) may be calculated using Equation 3.

The personal preference for the corresponding news is determined (550).

According to an embodiment of the present invention, a preference graph for a plurality of news articles for each news subscriber may be generated based on the calculated customized score to determine the personal preference for the corresponding news.

According to an embodiment of the present invention, the preference graph for a plurality of news articles for each news subscriber may be drawn with the residence time on the X axis and the customized score on the Y axis so that the customized score of each article can be seen at a glance, and the individual's preference for each news article can be determined from it.

According to an embodiment of the present invention, it may be determined that the higher the calculated customized score of a news article is, the more the news subscriber prefers that article.

According to an embodiment of the present invention, articles whose customized score is equal to or greater than a preset value may be selected as the individual's preferred articles, and articles similar to them may be provided to the news subscriber.
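Steps 510 to 550 can be strung together in one short sketch. It rests on several assumptions that do not come from the specification: the kernel standard deviation uses the rule of thumb 1.06 · σ · n^(-1/5), the ideal score distribution is normal with mean 50 and standard deviation 10, and all function names and data are hypothetical. The neural network mentioned in the specification is not modeled here.

```python
from statistics import NormalDist, stdev


def customized_scores(log, ideal=NormalDist(50.0, 10.0), eps=1e-6):
    """Convert one subscriber's residence times into relative scores.

    log maps article id -> residence time in seconds (step 510). It needs
    at least two entries that are not all equal so the kernel std is positive.
    """
    times = list(log.values())
    n = len(times)
    h = 1.06 * stdev(times) * n ** (-0.2)        # assumed kernel std
    kernels = [NormalDist(t_i, h) for t_i in times]  # step 520: one normal per event

    def cdf(t):
        # Steps 530-540: equal-weight mixture, accumulated in closed form.
        return sum(k.cdf(t) for k in kernels) / n

    scores = {}
    for article, t in log.items():
        p = min(max(cdf(t), eps), 1.0 - eps)     # keep inv_cdf finite
        scores[article] = ideal.inv_cdf(p)       # quantile transformation
    return scores


def preferred_articles(scores, threshold):
    """Step 550: articles at or above the preset score, best first."""
    chosen = [a for a, s in scores.items() if s >= threshold]
    return sorted(chosen, key=scores.get, reverse=True)


# Hypothetical reading log for one subscriber.
reading_log = {"a": 20, "b": 40, "c": 60, "d": 200}
scores = customized_scores(reading_log)
favorites = preferred_articles(scores, threshold=50.0)
print(scores, favorites)
```

Because every quantity is computed from the subscriber's own log, the same 60-second stay can yield different scores for different subscribers, which is the behavior the method aims for.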

The embodiments of the present invention are not implemented only through the apparatus and/or method described above. Although the embodiments have been described in detail, the scope of rights of the present invention is not limited thereto, and various modifications and improvements made by those skilled in the art using the basic concept of the present invention defined in the following claims also fall within the scope of rights of the present invention.

100: residence time information collection unit 200: normal distribution function matching unit
300: residence time distribution estimation unit 400: customized score calculation unit
500: personal preference determination unit 1000: news preference determination apparatus

Claims

A news preference determination device based on residence time using deep learning, the device using a deep learning model composed of neural networks operated by a processor and comprising:
a residence time information collection unit that collects residence time information for news articles on a plurality of web pages subscribed to by a news subscriber;
a normal distribution function matching unit that associates a normal distribution function with each piece of the collected residence time information for each news article;
a residence time distribution estimation unit that estimates the distribution of the news subscriber's residence time by linearly combining the normal distribution functions associated with the plurality of pieces of residence time information;
a customized score calculation unit that calculates a cumulative distribution function by accumulating the estimated residence time distribution, and uses the calculated cumulative distribution function to calculate a customized score in which the residence time is converted into a relative score for each subscriber; and
a personal preference determination unit that generates, based on the calculated customized scores, a preference graph for a plurality of news articles for each news subscriber and determines the personal preference for each news article.

The device of claim 1, wherein the residence time distribution estimation unit
produces an estimated residence time distribution that, as the number of news articles subscribed to by the news subscriber increases, follows a distribution matching the actual result.

The device of claim 1, wherein the customized score calculation unit
converts the calculated cumulative distribution function into quartiles to calculate the customized score.

The device of claim 1, wherein the personal preference determination unit
determines that the higher the calculated customized score of a news article, the more the news subscriber prefers that news article.

The device of claim 1,
wherein news articles similar to a news article showing high personal preference are provided to the news subscriber by using the determined personal preference.

A news preference determination method based on residence time using deep learning, the method using a deep learning model composed of neural networks operated by a processor and comprising:
collecting residence time information for news articles on a plurality of web pages subscribed to by a news subscriber;
associating a normal distribution function with each piece of the collected residence time information for each news article;
estimating the distribution of the news subscriber's residence time by linearly combining the normal distribution functions associated with the plurality of pieces of residence time information;
calculating a cumulative distribution function by accumulating the estimated residence time distribution, and using the calculated cumulative distribution function to calculate a customized score in which the residence time is converted into a relative score for each subscriber; and
generating, based on the calculated customized scores, a preference graph for a plurality of news articles for each news subscriber and determining the personal preference for each news article.

The method of claim 6, wherein, in the step of estimating the distribution of residence time,
the estimated residence time distribution follows a distribution matching the actual result as the number of news articles subscribed to by the news subscriber increases.

The method of claim 6, wherein, in the step of calculating the customized score,
the calculated cumulative distribution function is converted into quartiles to calculate the customized score.

The method of claim 6, wherein, in the step of determining the personal preference,
the higher the calculated customized score of a news article, the more the news subscriber is determined to prefer that news article.

The method of claim 6,
wherein news articles similar to a news article showing high personal preference are provided to the news subscriber by using the determined personal preference.