KR20180049852A

KR20180049852A - News recommendation server and news recommendation method using the same

Info

Publication number: KR20180049852A
Application number: KR1020160146002A
Authority: KR
Inventors: 유성준; 구영현; 철호 박; 학림 윤; 지연 강
Original assignee: 세종대학교산학협력단
Priority date: 2016-11-03
Filing date: 2016-11-03
Publication date: 2018-05-14
Also published as: KR101907865B1

Abstract

A news recommendation server and a news recommendation method using the same are disclosed. According to an exemplary embodiment of the present invention, the news recommendation server includes a data collection unit for collecting web browsing history information from a user terminal, a field ratio calculation unit for calculating a ratio of news fields preferred by a corresponding user based on the collected web browsing history information, and a news recommending unit for recommending news to the user terminal by adjusting at least one of an arrangement order of recommended news fields and the amount of recommended news for each field according to the calculated ratio of the news fields. Accordingly, the present invention can actively recommend the news according to a preference change of the user.

Description

News recommendation server and news recommendation method using the same

Embodiments of the present invention relate to news recommendation technology.

As smart devices have become widespread, the focus of web services has been shifting from PCs (personal computers) to smart devices, and news services are no exception. In addition, as news recommendation systems have advanced, users increasingly want news in their preferred fields to be recommended automatically. Users further want the amount of recommended news to be adjusted dynamically according to their degree of preference, but existing mobile news recommendation systems cannot dynamically adjust the amount of news in a preferred field according to the user's preference.

Korean Patent Laid-Open Publication No. 10-2016-0104067 (2016.09.02)

Embodiments of the present invention are intended to provide a technique for recommending news by adjusting the order of news fields and the quantity of news in each field according to the user's preference.

A news recommendation server according to an embodiment of the present invention includes: a data collection unit that collects web browsing history information from a user terminal; a field ratio calculation unit that calculates, based on the collected web browsing history information, the ratio of news fields preferred by the corresponding user; and a news recommending unit that recommends news to the user terminal based on the calculated ratio of news fields.

The news recommending unit may adjust at least one of the arrangement order of the recommended news fields and the quantity of news recommended for each field according to the calculated ratio of news fields.

The field ratio calculation unit may extract text from the web page of each news item accessed by the user terminal based on the web browsing history information, perform a first selection of analysis target words from the extracted text based on the part of speech of each word, and determine the field of each accessed news item based on the analysis target words from the first selection.

The field ratio calculation unit may assign a weight to each of the analysis target words from the first selection, perform a second selection of a preset number of words with the highest weight values from among them, and determine the field of each accessed news item according to the degree to which each text contains the words from the second selection.

The news recommendation server may further include a news favorability calculation unit that calculates the user's favorability toward each news item provided to each user terminal.

The news favorability calculation unit may calculate the user's favorability value for a news item based on the time the user spent reading the news item and the sentence length of the news item.

The data collection unit may collect log data on access to the news recommendation server from the user terminal. The news favorability calculation unit may extract, from among the web pages in the web browsing history information, the web page whose access time is the same as the time at which a given news item was provided according to the log data, and may calculate the time spent reading the news item based on the access time of the extracted web page and the access time of the web page accessed next after the extracted web page.

The news favorability calculation unit may calculate the speed at which the user read the news item based on the time the user spent reading it and the sentence length of the news item, and may set the user's favorability value for the news item higher the closer the calculated reading speed is to a preset standard speed.
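The rule above (favorability rises as reading speed approaches a preset standard speed) can be illustrated with a minimal sketch. The text at this point does not give a concrete formula, so the scoring function below, the unit of characters per second, and the names `reading_speed` and `favorability` are all assumptions chosen for illustration only.

```python
def reading_speed(text_length, reading_seconds):
    """Reading speed as text length divided by reading time (assumed unit: characters per second)."""
    if reading_seconds <= 0:
        raise ValueError("reading time must be positive")
    return text_length / reading_seconds


def favorability(speed, standard_speed):
    """Hypothetical score in (0, 1]: equals 1.0 at the standard speed and
    decreases as the reading speed moves away from it in either direction."""
    return 1.0 / (1.0 + abs(speed - standard_speed) / standard_speed)


STANDARD_SPEED = 10.0  # assumed preset standard speed

careful_read = favorability(reading_speed(600, 60), STANDARD_SPEED)   # read at the standard speed
quick_skim = favorability(reading_speed(600, 10), STANDARD_SPEED)     # read far too fast
print(careful_read, quick_skim)
```

Any function that peaks at the standard speed would satisfy the stated rule equally well; this one was picked only because it is short and bounded.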

The news recommending unit may search for users whose news subscription tendencies are similar to those of the corresponding user, and may recommend news from among the news items viewed by the retrieved similar users according to the ratio of news fields preferred by the corresponding user.

The news recommending unit may extract, from among the news items viewed by the retrieved similar users, news items in descending order of those users' news favorability, and may adjust, for the extracted news items, at least one of the arrangement order of the recommended news fields and the quantity of news recommended for each field according to the ratio of news fields preferred by the corresponding user.

A news recommendation method according to an embodiment of the present invention is performed in a computing device having one or more processors and a memory storing one or more programs executed by the one or more processors, and includes: collecting web browsing history information from a user terminal; calculating, based on the collected web browsing history information, the ratio of news fields preferred by the corresponding user; and recommending news to the user terminal based on the calculated ratio of news fields.

The recommending of the news may include adjusting at least one of the arrangement order of the recommended news fields and the quantity of news recommended for each field according to the calculated ratio of news fields.

The calculating of the ratio of news fields may include: extracting text from the web page of each news item accessed by the user terminal based on the web browsing history information; performing a first selection of analysis target words from the extracted text based on the part of speech of each word; assigning a weight to each of the analysis target words from the first selection; performing a second selection of a preset number of words with the highest weight values from among them; and determining the field of each accessed news item according to the degree to which each text contains the words from the second selection.

The news recommendation method may further include calculating the user's favorability toward each news item provided to each user terminal.

The calculating of the favorability may include calculating the user's favorability value for a news item based on the time the user spent reading the news item and the sentence length of the news item.

The calculating of the favorability may include: collecting log data on access to the news recommendation server from the user terminal; extracting, from among the web pages in the web browsing history information, the web page whose access time is the same as the time at which a given news item was provided according to the log data; and calculating the time spent reading the news item based on the access time of the extracted web page and the access time of the web page accessed next after the extracted web page.

The calculating of the favorability may include: calculating the speed at which the user read the news item based on the time the user spent reading it and the sentence length of the news item; and setting the user's favorability value for the news item higher the closer the calculated reading speed is to a preset standard speed.

The recommending of the news may include: searching for users whose news subscription tendencies are similar to those of the corresponding user; and recommending news from among the news items viewed by the retrieved similar users according to the ratio of news fields preferred by the corresponding user.

The recommending of the news may include: extracting, from among the news items viewed by the retrieved similar users, news items in descending order of those users' news favorability; and adjusting, for the extracted news items, at least one of the arrangement order of the recommended news fields and the quantity of news recommended for each field according to the ratio of news fields preferred by the corresponding user.

According to embodiments of the present invention, the ratio of news fields preferred by each user is calculated using that user's web browsing history, and news is recommended accordingly, so that news can be recommended actively as the user's preferences change.

FIG. 1 is a block diagram illustrating a news recommendation system according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating the configuration of a news recommendation server according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating one example of recommending news according to the ratio of news fields preferred by a user in a news recommendation server according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating another example of recommending news according to the ratio of news fields preferred by a user in a news recommendation server according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a news recommendation method according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating a news recommendation method according to another embodiment of the present invention.
FIG. 7 is a block diagram illustrating a computing environment including a computing device suitable for use in the exemplary embodiments.

Hereinafter, specific embodiments of the present invention will be described with reference to the drawings. The following detailed description is provided to aid a comprehensive understanding of the methods, apparatuses, and/or systems described herein. However, this is merely an example, and the present invention is not limited thereto.

In describing the embodiments of the present invention, a detailed description of known technology related to the present invention is omitted where it is judged that it could unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of a user or operator. Their definitions should therefore be based on the content of this specification as a whole. The terms used in the detailed description are intended only to describe embodiments of the present invention and should in no way be limiting. Unless clearly used otherwise, singular expressions include the plural. In this description, expressions such as "include" or "have" are intended to indicate certain features, numbers, steps, operations, elements, or parts or combinations thereof, and should not be construed to exclude the presence or possibility of one or more other features, numbers, steps, operations, elements, or parts or combinations thereof beyond those described.

In the following description, terms such as "transfer", "communication", "transmission", and "reception" of a signal or information, and other terms of similar meaning, cover not only the direct delivery of a signal or information from one component to another but also delivery via other components. In particular, "transferring" or "transmitting" a signal or information to a component indicates the final destination of the signal or information and does not imply a direct destination. The same applies to "reception" of a signal or information. Also, in this specification, two or more pieces of data or information being "related" means that when one piece of data (or information) is acquired, at least part of the other data (or information) can be acquired on that basis.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms may be used to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component.

FIG. 1 is a block diagram illustrating a news recommendation system according to an embodiment of the present invention.

Referring to FIG. 1, the news recommendation system 100 may include a user terminal 102 and a news recommendation server 104. The user terminal 102 is communicably connected to the news recommendation server 104 through a communication network 150. In some embodiments, the communication network 150 may include the Internet, one or more local area networks, wide area networks, cellular networks, mobile networks, other types of networks, or a combination of such networks.

The user terminal 102 is a communication device capable of web browsing and may include, for example, a mobile terminal, a smart device, a desktop, a laptop, or a wearable device. A web browser is installed on the user terminal 102.

The user terminal 102 may transmit web browsing history information to the news recommendation server 104. The web browsing history information may include, for example, the ID index of a web page accessed by the user terminal 102, the title of the web page, the address of the web page (that is, its URL), whether the web page is bookmarked, and the time at which the web page was visited. The user terminal 102 may transmit web browsing history information to the news recommendation server 104 each time it accesses a web page.
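As an illustration of the history fields listed above, one record could be represented and serialized as in the sketch below. The class name `BrowsingHistoryRecord`, the field names, the JSON encoding, and the example URL are assumptions made for this example; the document does not specify a wire format.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
import json


@dataclass
class BrowsingHistoryRecord:
    """One web browsing history entry (field names are hypothetical)."""
    page_id: int          # ID index of the visited web page
    title: str            # title of the web page
    url: str              # address of the web page
    bookmarked: bool      # whether the page is bookmarked
    visited_at: datetime  # time at which the page was visited

    def to_json(self) -> str:
        # Convert the visit time to an ISO 8601 string so the record is JSON-serializable.
        payload = asdict(self)
        payload["visited_at"] = self.visited_at.isoformat()
        return json.dumps(payload, ensure_ascii=False)


record = BrowsingHistoryRecord(
    page_id=1,
    title="Example headline",
    url="https://news.example.com/articles/1",
    bookmarked=False,
    visited_at=datetime(2016, 11, 3, 9, 30, 0),
)
print(record.to_json())
```

A terminal following the description would send one such record to the server each time a page is accessed.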

The user terminal 102 may access a web server (not shown) that provides news (which may be a server different from the news recommendation server 104) and be provided with news. The user terminal 102 may also access the news recommendation server 104 and receive recommendations of news in the user's preferred fields from the news recommendation server 104.

The news recommendation server 104 may receive web browsing history information from each user terminal 102. The news recommendation server 104 may calculate the ratio of news fields preferred by the corresponding user based on the web browsing history information received from each user terminal 102. That is, the news recommendation server 104 may analyze the web browsing history information received from each user terminal 102 over a given period and calculate the ratio of the news fields preferred by the user among the preset news fields.

When the user terminal 102 connects, the news recommendation server 104 may recommend news in the user's preferred fields (that is, provide it to the user terminal 102). In doing so, the news recommendation server 104 may adjust at least one of the arrangement order of the recommended news fields and the quantity of news recommended for each field according to the ratio of news fields preferred by the user.
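A minimal sketch of adjusting both the field order and the per-field quantity from the preference ratios follows. The description does not say how fractional shares are rounded, so the largest-remainder rounding used here, the function name `allocate_slots`, and the assumption that the ratios sum to 1 are all illustrative choices.

```python
import math


def allocate_slots(field_ratios, total_slots):
    """Return (field, count) pairs ordered by descending ratio.

    Counts are proportional to the ratios; leftover slots after rounding down
    go to the fields with the largest fractional remainders (an assumed rule).
    """
    ordered = sorted(field_ratios.items(), key=lambda kv: kv[1], reverse=True)
    shares = [(field, ratio * total_slots) for field, ratio in ordered]
    counts = {field: math.floor(share) for field, share in shares}
    leftover = total_slots - sum(counts.values())
    by_remainder = sorted(shares, key=lambda fs: fs[1] - math.floor(fs[1]), reverse=True)
    for field, _ in by_remainder[:leftover]:
        counts[field] += 1
    return [(field, counts[field]) for field, _ in ordered]


layout = allocate_slots({"sports": 0.5, "economy": 0.25, "life": 0.25}, total_slots=10)
print(layout)
```

The returned list gives the arrangement order (most preferred field first) together with the number of news items to show for each field.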

The news recommendation server 104 may calculate users' favorability for each news item that it provides. The news recommendation server 104 may calculate the favorability of a news item based on the speed at which a user reads it. The news recommendation server 104 may generate a recommended news set by sorting the news items in each news field in descending order of favorability. The recommended news set may be limited to news items whose favorability is equal to or greater than a preset value.
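Building the recommended news set described above (a threshold on favorability, then a per-field ordering from highest to lowest) might be sketched as follows. The dictionary keys `id`, `field`, and `favorability`, the function name, and the sample values are hypothetical.

```python
from collections import defaultdict


def build_recommended_set(news_items, min_favorability):
    """Group news items by field, keeping only items at or above the threshold,
    with each field's items sorted by descending favorability."""
    by_field = defaultdict(list)
    for item in news_items:
        if item["favorability"] >= min_favorability:
            by_field[item["field"]].append(item)
    return {
        field: sorted(items, key=lambda it: it["favorability"], reverse=True)
        for field, items in by_field.items()
    }


news_items = [
    {"id": "n1", "field": "sports", "favorability": 0.9},
    {"id": "n2", "field": "sports", "favorability": 0.4},
    {"id": "n3", "field": "economy", "favorability": 0.7},
    {"id": "n4", "field": "sports", "favorability": 0.95},
]
recommended = build_recommended_set(news_items, min_favorability=0.5)
print({field: [it["id"] for it in items] for field, items in recommended.items()})
```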

The news recommendation server 104 may recommend news, from among the news items viewed by users whose news subscription tendencies are similar to those of the corresponding user, according to the ratio of news fields preferred by the corresponding user. The news recommendation server 104 may extract, from among the news items viewed by those similar users, the news items for which those users' news favorability is high, and then arrange the order of the news fields and adjust the quantity of news recommended for each field according to the ratio of news fields preferred by the corresponding user.

FIG. 2 is a block diagram illustrating the configuration of a news recommendation server according to an embodiment of the present invention.

Referring to FIG. 2, the news recommendation server 104 may include a data collection unit 111, a field ratio calculation unit 113, a news favorability calculation unit 115, and a news recommending unit 117.

The data collection unit 111 may collect web browsing history information from each of the user terminals 102. The data collection unit 111 may store the collected web browsing history information in a database (not shown). Here, the database (not shown) may be a data storage medium provided in the news recommendation server 104 or a data storage medium provided outside the news recommendation server 104.

The data collection unit 111 may also collect, for each user terminal 102, log data generated when the news recommendation server 104 is accessed. For example, when the user terminal 102 accesses the news recommendation server 104 and is provided with news, the data collection unit 111 may collect the corresponding log data (the ID of the news item, its title, its URL, the time at which it was provided, and so on). The data collection unit 111 may store the collected log data in a database (not shown).

Here, the web browsing history information is a record of all web pages viewed in the course of the user's web browsing, whereas the log data may be a record of the web pages that the user viewed as provided by the news recommendation server 104.

The field ratio calculation unit 113 may calculate, for each user, the ratio of preferred news fields based on the web browsing history information collected from the user terminals 102. In an exemplary embodiment, news may be classified into eight fields: sports, entertainment, science, global, politics, economy, life, and other. Here, "other" may apply when classification is impossible because the web page consists only of images or requires login.

Specifically, the field ratio calculation unit 113 may visit each web page that the user terminal 102 accessed, as recorded in its web browsing history for a given period, and extract the body (that is, the text) of the web page. If a web page accessed by the user terminal 102 belongs to a service that requires login or consists only of images, so that body extraction is difficult, the field ratio calculation unit 113 may extract the title of the web page as the text.

The field ratio calculation unit 113 may use a morphological analyzer (part-of-speech tagger) to identify the parts of speech of the words in the extracted web page text. The field ratio calculation unit 113 may measure per-word frequencies over the words in the text, excluding words whose part of speech is verb, adjective, or adverb and words whose string length is 1 (the remaining words may hereinafter be referred to as "analysis target words"). That is, verbs, adjectives, and adverbs carry relatively little information compared with nouns, and words with a string length of 1 are in most cases meaningless, so such words may be excluded.
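The first selection step can be sketched as a filter over words that have already been tagged. The sketch assumes the tagger output is available as `(word, part_of_speech)` pairs and that the tag names are the plain strings shown; both are illustrative assumptions, since real taggers use their own tag sets.

```python
from collections import Counter

# Parts of speech excluded from analysis (tag names are hypothetical).
EXCLUDED_POS = {"verb", "adjective", "adverb"}


def select_analysis_words(tagged_words):
    """Keep words that are not verbs, adjectives, or adverbs and are longer than one character."""
    return [word for word, pos in tagged_words if pos not in EXCLUDED_POS and len(word) > 1]


def word_frequencies(tagged_words):
    """Count how often each analysis target word occurs in one text."""
    return Counter(select_analysis_words(tagged_words))


tagged = [
    ("football", "noun"), ("is", "verb"), ("a", "determiner"),
    ("popular", "adjective"), ("sport", "noun"), ("football", "noun"),
    ("quickly", "adverb"),
]
print(word_frequencies(tagged))
```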

The field ratio calculation unit 113 may apply weights to the analysis target words and extract a preset number (for example, 1,000) of words in descending order of weight value (hereinafter, these may be referred to as field calculation basis words) to form a term vector (or vector space). In an exemplary embodiment, the field ratio calculation unit 113 may apply a TF-IDF (Term Frequency - Inverse Document Frequency) weight to each analysis target word. Here, the TF-IDF weight may be expressed as the product of the term frequency and the inverse document frequency. The term frequency may be the number of times a particular word appears in a text divided by the total number of words in that text. The inverse document frequency may be the logarithm of the total number of texts (N) divided by the number of texts containing the particular word (n), that is, log(N/n).
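The TF-IDF definitions above translate directly into a short sketch. Two points are assumptions rather than statements from the description: the logarithm base (natural log is used here) and the way a single weight per word is obtained across texts for the top-k selection (the maximum TF-IDF value over all texts is used here). The function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf(texts):
    """Return one {word: weight} dict per text, with weight = TF * log(N / n)."""
    n_texts = len(texts)
    # n: number of texts that contain each word at least once.
    doc_freq = Counter(word for words in texts for word in set(words))
    weights = []
    for words in texts:
        counts = Counter(words)
        total = len(words)
        weights.append({
            word: (count / total) * math.log(n_texts / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


def top_terms(texts, k):
    """Select the k words with the highest weight (assumed: max over texts; ties broken alphabetically)."""
    best = {}
    for per_text in tf_idf(texts):
        for word, value in per_text.items():
            best[word] = max(best.get(word, 0.0), value)
    return sorted(best, key=lambda word: (-best[word], word))[:k]


texts = [
    ["money", "football", "football"],
    ["money", "interest"],
    ["family", "football"],
]
print(top_terms(texts, 3))
```

With k set to the preset number (for example, 1,000), the returned list plays the role of the term vector of field calculation basis words.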

The field ratio calculation unit 113 may construct a matrix of texts and news fields using the term vector composed of the preset number of words with high weight values (the field calculation basis words) and then classify the news field of each text. That is, the field ratio calculation unit 113 may construct the matrix of texts and news fields using the term vector as shown in Table 1.

Table 1

| Text   | N_money | V_interest rate | N_family | ... | N_football | News field |
| Text 1 | 1       | 0               | 0        | ... | 0          | Sports     |
| Text 2 | 1       | 1               | 0        | ... | 0          | Economy    |
| Text 3 | 0       | 0               | 1        | ... | 1          | Life       |
| ...    | ...     | ...             | ...      | ... | ...        | ...        |
| Text j | 1       | 0               | 0        | ... | 1          | Sports     |

The field ratio calculation unit 113 may check, in the matrix, the degree to which the text of each news item contains the field calculation basis words (for example, assigning 1 if the word is contained and 0 if it is not) and apply the result to a machine learning technique to determine which field each news item belongs to. That is, by using a machine learning technique to analyze, for news items whose fields have already been classified, which field calculation basis words were contained and to what degree when they were classified into a given field, the field of each news item can be determined using the matrix. Algorithms that may be used include Naive Bayes, K-Nearest Neighbor, Decision Tree, Artificial Neural Network, and Support Vector Machine.
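Of the algorithms listed, Naive Bayes is the simplest to show end to end. The sketch below builds 0/1 rows from a small vocabulary and trains a Bernoulli Naive Bayes classifier with add-one smoothing. The vocabulary, training rows, and labels are made-up illustration data (not the values of Table 1), and the smoothing choice and function names are assumptions.

```python
import math
from collections import defaultdict


def to_binary_row(words, vocabulary):
    """1 if the vocabulary term occurs in the text, otherwise 0."""
    present = set(words)
    return [1 if term in present else 0 for term in vocabulary]


def train_bernoulli_nb(rows, labels):
    """Estimate a prior and smoothed per-term presence probabilities for each field."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        prior = len(members) / len(rows)
        # Add-one (Laplace) smoothing keeps every probability strictly between 0 and 1.
        probs = [(sum(column) + 1) / (len(members) + 2) for column in zip(*members)]
        model[label] = (prior, probs)
    return model


def predict(model, row):
    """Return the field with the highest log posterior for a 0/1 row."""
    def log_score(prior, probs):
        return math.log(prior) + sum(
            math.log(p if x else 1 - p) for x, p in zip(row, probs)
        )
    return max(model, key=lambda label: log_score(*model[label]))


vocabulary = ["money", "interest", "family", "football"]
train_rows = [
    [0, 0, 0, 1], [1, 0, 0, 1],   # sports
    [1, 1, 0, 0], [0, 1, 0, 0],   # economy
    [0, 0, 1, 0], [0, 0, 1, 1],   # life
]
train_labels = ["sports", "sports", "economy", "economy", "life", "life"]
model = train_bernoulli_nb(train_rows, train_labels)
print(predict(model, to_binary_row(["football", "money"], vocabulary)))
```

In practice a library implementation of any of the listed classifiers could replace this hand-written version; the point is only that already-labeled rows train the model and unlabeled rows are then assigned a field.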

The field ratio calculation unit 113 may classify the text of each web page accessed by the user terminal 102 during a given period into one of the following fields: sports, entertainment, science, global, politics, economy, life, and other. The field ratio calculation unit 113 may then calculate, over the texts of the web pages accessed by the user terminal 102 during that period, the sum of the frequencies of all fields except the other field (hereinafter referred to as the sum of all field frequencies). The field ratio calculation unit 113 may calculate the ratio of each field by dividing the frequency of each field (sports, entertainment, science, global, politics, economy, life) by the sum of all field frequencies. The field ratio calculation unit 113 may periodically analyze the web browsing history information collected from each user terminal 102 and update each user's ratio of preferred news fields.
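As a brief sketch of this ratio calculation, assuming each accessed page has already been labeled with one field name (the function name `field_ratios` and the lowercase labels are illustrative choices, not taken from the embodiment):

```python
from collections import Counter

# The seven fields that enter the ratio; pages labeled "other" are excluded.
FIELDS = ["sports", "entertainment", "science", "global",
          "politics", "economy", "life"]

def field_ratios(page_fields):
    """Divide each field's frequency by the sum of all field frequencies."""
    counts = Counter(f for f in page_fields if f in FIELDS)
    total = sum(counts.values())
    if total == 0:
        # No classified pages in the period: every ratio is zero.
        return {f: 0.0 for f in FIELDS}
    return {f: counts[f] / total for f in FIELDS}

if __name__ == "__main__":
    pages = ["sports", "sports", "economy", "other", "science"]
    print(field_ratios(pages))
```

With the sample list, the one page labeled "other" is dropped, so the four remaining pages give sports 0.5 and economy and science 0.25 each.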

When the news recommendation server 104 provides news to the user terminal 102, the news favorability calculation unit 115 may calculate the user's favorability toward each news item. The news favorability calculation unit 115 may calculate the user's favorability toward a news item based on the speed at which the user reads it.

Specifically, the news favorability calculation unit 115 may compare the web browsing history information collected by the data collection unit 111 with the log data and, from among the web page addresses (URLs) in the web browsing history information, extract the web page whose access time is the same as the time at which a given news item was provided according to the log data. Here, the address of the extracted web page is identical to the web page address of the news item provided by the news recommendation server 104 to the user terminal 102 (that is, the URL of that news item in the log data).

The web page URLs of the web browsing history information and those of the log data are matched on the basis of access time because the web browsing history information collected from the user terminal 102 includes not only the record of access to the news recommendation server 104 but also the entire web page access record produced by the user's web browsing.
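A minimal sketch of this matching step is given below. It assumes that both the browsing history and the server log are available as lists of (access time in seconds, URL) pairs recorded at the same time granularity; the function name `match_served_news`, the sample data, and the example URLs are hypothetical.

```python
def match_served_news(history, served_log):
    """Return the indices of history entries that correspond to news served
    by the recommendation server, matched on access time and URL."""
    served = {(time, url) for time, url in served_log}
    return [i for i, (time, url) in enumerate(history) if (time, url) in served]

# Hypothetical browsing history: every page the user accessed, in time order.
HISTORY = [
    (100, "https://other.example/a"),
    (160, "https://news.example/521"),
    (400, "https://other.example/b"),
]
# Hypothetical server log: news items the server provided to this terminal.
SERVED_LOG = [(160, "https://news.example/521")]

if __name__ == "__main__":
    print(match_served_news(HISTORY, SERVED_LOG))
```

In practice the two clocks may not agree exactly, so a tolerance window around the access time may be needed; that refinement is outside what the text specifies.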

The news favorability calculation unit 115 may check, in the web browsing history information, the access time of the web page accessed next after the extracted web page, and thereby calculate the time the user spent reading the extracted web page (that is, the web page of the news item provided by the news recommendation server 104 to the user terminal 102).

Specifically, the news favorability calculation unit 115 may take, from the user's web browsing history information, the difference between the access time T(i) of the news item's web page and the access time T(i+1) of the web page accessed next (that is, T(i+1) - T(i)) as the time spent reading the news item. If the reading time exceeds a preset time (for example, 10 minutes), the news favorability calculation unit 115 may classify that reading as noise. The reason is that users who view news on the user terminal 102 (for example, a smart device) more often return to the home screen (main screen) with the home button than close the web page or the web browser. In that case the web page is not closed but keeps waiting in the background, so the measured reading time can come out long even though the user is not actually reading the news. For this reason, a reading whose time exceeds the preset time (for example, 10 minutes) may be classified as noise.
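The reading-time rule can be sketched as follows, assuming the access times are given in seconds and sorted in ascending order. The function name `reading_time` and the use of `None` to mark a discarded reading are illustrative assumptions; the 10-minute threshold is the example value from the text.

```python
NOISE_THRESHOLD_SEC = 10 * 60  # example threshold from the text: 10 minutes

def reading_time(access_times, i):
    """Return T(i+1) - T(i) in seconds for the page at index i.

    Returns None when there is no next page to measure against, or when
    the elapsed time exceeds the threshold and the reading counts as noise.
    """
    if i + 1 >= len(access_times):
        return None
    elapsed = access_times[i + 1] - access_times[i]
    if elapsed > NOISE_THRESHOLD_SEC:
        return None
    return elapsed

if __name__ == "__main__":
    times = [0, 90, 1000, 1030]
    print([reading_time(times, i) for i in range(len(times))])
```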

The news favorability calculation unit 115 may calculate the speed at which the news item was read based on the calculated reading time and the sentence length of the news item. If the sentence length of the news item is less than a preset length (for example, less than 50 characters), the news favorability calculation unit 115 may classify that reading as noise.

The news favorability calculation unit 115 may calculate the user's favorability toward a news item by comparing the speed at which the user read it with a preset standard speed (for example, 20 to 30 characters per second). Specifically, the closer the user's reading speed is to the preset standard speed, the higher the favorability value the news favorability calculation unit 115 may assign to the news item. In an exemplary embodiment, the favorability value may range from 0 to 5. The news favorability calculation unit 115 may obtain the degree to which the user's reading speed approaches the preset standard speed using cosine similarity or a similar measure.
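The speed and favorability calculation might look like the sketch below. The text names cosine similarity as one possible proximity measure but does not fix a formula; this sketch swaps in a simple linear falloff with relative distance from the 20 to 30 characters per second band, which is purely an assumption. The function names and the rounding to two decimals are also illustrative.

```python
MIN_LENGTH_CHARS = 50                     # example minimum length from the text
STANDARD_LOW, STANDARD_HIGH = 20.0, 30.0  # example standard speed, chars/sec
MAX_SCORE = 5.0                           # favorability ranges from 0 to 5

def reading_speed(length_chars, seconds):
    """Characters per second, or None when the reading counts as noise."""
    if length_chars < MIN_LENGTH_CHARS or seconds <= 0:
        return None
    return length_chars / seconds

def favorability(speed):
    """Score 0 to 5: highest inside the standard band, lower farther away.

    The linear falloff is an assumed stand-in for the proximity measure.
    """
    if STANDARD_LOW <= speed <= STANDARD_HIGH:
        return MAX_SCORE
    if speed < STANDARD_LOW:
        gap = (STANDARD_LOW - speed) / STANDARD_LOW
    else:
        gap = (speed - STANDARD_HIGH) / STANDARD_HIGH
    return round(MAX_SCORE * max(0.0, 1.0 - gap), 2)

if __name__ == "__main__":
    speed = reading_speed(1500, 60)
    print(speed, favorability(speed))
```

A 1,500-character article read in 60 seconds gives 25 characters per second, which falls inside the band and receives the maximum score.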

The news favorability calculation unit 115 may store the user's favorability toward each news item provided to the user terminal 102 in a database (not shown). That is, as shown in Table 2, the news favorability calculation unit 115 may store the favorability toward each news item for each user in a database.

|   | User ID | News ID | Favorability |
|---|---|---|---|
| 1 | 1 | 521 | 4.7 |
| 2 | 1 | 620 | 3.9 |
| 3 | 1 | 258 | 3.4 |
| 4 | 2 | 521 | 3 |
| ... | ... | ... | ... |
| n | n | 521 | 5 |

The news recommendation unit 117 may recommend news in the user's preferred fields to a user terminal 102 that logs in to and accesses the news recommendation server 104 (that is, provide the news to the user terminal 102). The news recommendation unit 117 may adjust at least one of the arrangement order of the recommended news fields and the quantity of recommended news per field according to the ratio of the user's preferred news fields.

For example, when the ratio of the user's preferred news fields is sports:global:politics:economy:life:science:entertainment = 1:4:4:4:2:7:2, the news recommendation unit 117 may, as shown in FIG. 3, recommend news to the user in the order of science, global, politics, economy, life, entertainment, and sports. The news recommendation unit 117 may also determine the quantity of recommended news according to that ratio (that is, 1:4:4:4:2:7:2): 1 sports item, 4 global items, 4 politics items, 4 economy items, 2 life items, 7 science items, and 2 entertainment items.

Likewise, when the ratio of the user's preferred news fields changes to sports:global:politics:economy:life:science:entertainment = 6:3:3:3:1:5:1, the news recommendation unit 117 may, as shown in FIG. 4, recommend news to the user in the order of sports, science, global, politics, economy, life, and entertainment. The news recommendation unit 117 may also determine the quantity of recommended news according to that ratio (that is, 6:3:3:3:1:5:1): 6 sports items, 3 global items, 3 politics items, 3 economy items, 1 life item, 5 science items, and 1 entertainment item.

The news recommendation unit 117 may also analyze the log data of the user terminal 102 to search for other users who viewed news similar to the news the user viewed, and then recommend news from among the items those other users viewed. Specifically, the news recommendation unit 117 may extract, from the news items viewed by the retrieved users (that is, users whose news subscription tendencies are similar to the user's), the items the user has not yet viewed, and then recommend them according to the ratio of the user's preferred news fields. In doing so, the news recommendation unit 117 may recommend, among the news items in the user's preferred fields, items in descending order of the news favorability of the users with similar news subscription tendencies.
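A minimal sketch of this neighbor-based step follows, assuming favorability records like those in Table 2 are held as a mapping from user ID to {news ID: favorability}. The text does not specify the similarity measure; cosine similarity over the favorability values is used here as an assumption, and the function names and sample data are invented for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse {news_id: favorability} vectors."""
    dot = sum(a[n] * b[n] for n in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def similar_users(target, ratings, k=2):
    """Up to k other users with positive similarity, most similar first."""
    scored = [(cosine(ratings[target], r), u)
              for u, r in ratings.items() if u != target]
    scored = [(s, u) for s, u in scored if s > 0]
    scored.sort(reverse=True)
    return [u for _, u in scored[:k]]

def unseen_candidates(target, ratings, k=2):
    """News the neighbors viewed but the target has not, best-liked first."""
    seen = set(ratings[target])
    best = {}
    for user in similar_users(target, ratings, k):
        for news_id, score in ratings[user].items():
            if news_id not in seen:
                best[news_id] = max(score, best.get(news_id, 0.0))
    return sorted(best, key=lambda n: -best[n])

# Invented sample data in the shape of Table 2.
RATINGS = {
    1: {521: 4.7, 620: 3.9, 258: 3.4},
    2: {521: 3.0, 700: 4.5},
    3: {620: 4.0, 258: 3.0, 801: 2.0, 700: 1.0},
    4: {999: 5.0},
}

if __name__ == "__main__":
    print(similar_users(1, RATINGS), unseen_candidates(1, RATINGS))
```

User 4 shares no news with user 1 and is therefore never treated as a neighbor.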

FIG. 5 is a flowchart illustrating a news recommendation method according to an embodiment of the present invention. The method shown in FIG. 5 may be performed, for example, by the news recommendation server 104 described above. Although the flowchart presents the method as a plurality of steps, at least some of the steps may be performed in a different order, combined with other steps and performed together, omitted, divided into sub-steps, or performed with one or more additional steps that are not shown.

Referring to FIG. 5, the news recommendation server 104 collects web browsing history information from each user terminal 102 (S101).

Next, the news recommendation server 104 calculates the ratio of preferred news fields for each user based on the web browsing history information collected from each user terminal 102 (S103).

Next, the news recommendation server 104 provides news while adjusting at least one of the order of the recommended news fields and the quantity of recommended news per field according to the ratio of each user's preferred news fields (S105).

FIG. 6 is a flowchart illustrating a news recommendation method according to another embodiment of the present invention. The method shown in FIG. 6 may be performed, for example, by the news recommendation server 104 described above. Although the flowchart presents the method as a plurality of steps, at least some of the steps may be performed in a different order, combined with other steps and performed together, omitted, divided into sub-steps, or performed with one or more additional steps that are not shown.

Referring to FIG. 6, the news recommendation server 104 collects web browsing history information from each user terminal 102 and collects log data for each user terminal 102 (S201).

Next, the news recommendation server 104 calculates the ratio of preferred news fields for each user based on the web browsing history information collected from each user terminal 102 (S203).

Next, the news recommendation server 104 calculates the user's favorability (that is, the news favorability) toward each news item provided to each user terminal 102 (S205).

Next, when a given user terminal 102 connects (S207), the news recommendation server 104 searches for users whose news subscription tendencies are similar to that user's (S209). For example, the news recommendation server 104 may search for such users using the Nearest Neighborhood algorithm.

Next, the news recommendation server 104 recommends news, from among the items viewed by the users with similar news subscription tendencies, according to the ratio of the user's preferred news fields (S211). For example, from among those items the news recommendation server 104 may arrange the order of the news fields and adjust the quantity of recommended news per field according to the ratio of the user's preferred news fields. The news recommendation server 104 may also first extract, from among those items, the news items for which the similar users' news favorability is high, and then arrange the order of the news fields and adjust the quantity of recommended news per field according to the ratio of the user's preferred news fields. In this case, news items may be extracted within each field in descending order of news favorability and recommended up to the quantity set for that field.
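Step S211 can be sketched as a per-field selection. The sketch assumes the candidate items from similar users arrive as (news ID, field, favorability) tuples and that the field ratio supplies the per-field quantity directly; the function name `recommend` and the sample data are hypothetical.

```python
def recommend(candidates, ratio):
    """Pick news per field, fields by descending ratio, items by favorability.

    candidates: list of (news_id, field, favorability) from similar users.
    ratio: list of (field, quantity) for the target user.
    """
    result = []
    for field, quantity in sorted(ratio, key=lambda item: -item[1]):
        in_field = sorted((c for c in candidates if c[1] == field),
                          key=lambda c: -c[2])
        result.extend(news_id for news_id, _, _ in in_field[:quantity])
    return result

# Invented sample candidates and ratio.
CANDIDATES = [
    (1, "science", 4.0),
    (2, "science", 4.8),
    (3, "sports", 3.0),
    (4, "science", 2.0),
    (5, "sports", 4.9),
]
RATIO = [("sports", 1), ("science", 2)]

if __name__ == "__main__":
    print(recommend(CANDIDATES, RATIO))
```

If a field has fewer candidates than its quantity, this sketch simply returns the ones available; how the shortfall should be handled is not stated in the text.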

FIG. 7 is a block diagram illustrating a computing environment 10 that includes a computing device suitable for use in the exemplary embodiments. In the illustrated embodiment, each component may have functions and capabilities different from those described below, and additional components beyond those described below may be included.

The illustrated computing environment 10 includes a computing device 12. In one embodiment, the computing device 12 may be a user terminal (for example, the user terminal 102). The computing device 12 may also be a server that recommends news (for example, the news recommendation server 104).

The computing device 12 includes at least one processor 14, a computer-readable storage medium 16, and a communication bus 18. The processor 14 may cause the computing device 12 to operate according to the exemplary embodiments described above. For example, the processor 14 may execute one or more programs stored in the computer-readable storage medium 16. The one or more programs may include one or more computer-executable instructions, which, when executed by the processor 14, may be configured to cause the computing device 12 to perform operations according to the exemplary embodiments.

The computer-readable storage medium 16 is configured to store computer-executable instructions or program code, program data, and/or other suitable forms of information. The program 20 stored in the computer-readable storage medium 16 includes a set of instructions executable by the processor 14. In one embodiment, the computer-readable storage medium 16 may be memory (volatile memory such as random access memory, non-volatile memory, or a suitable combination thereof), one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, any other form of storage medium that can be accessed by the computing device 12 and can store the desired information, or a suitable combination thereof.

The communication bus 18 interconnects the various other components of the computing device 12, including the processor 14 and the computer-readable storage medium 16.

The computing device 12 may also include one or more input/output interfaces 22, which provide an interface for one or more input/output devices 24, and one or more network communication interfaces 26. The input/output interface 22 and the network communication interface 26 are connected to the communication bus 18. The input/output device 24 may be connected to other components of the computing device 12 through the input/output interface 22. Exemplary input/output devices 24 may include input devices such as a pointing device (a mouse, a trackpad, or the like), a keyboard, a touch input device (a touchpad, a touchscreen, or the like), a voice or sound input device, various kinds of sensor devices, and/or an imaging device, and/or output devices such as a display device, a printer, a speaker, and/or a network card. An exemplary input/output device 24 may be included inside the computing device 12 as one of its components, or may be connected to the computing device 12 as a separate device distinct from it.

Although representative embodiments of the present invention have been described in detail above, those of ordinary skill in the art to which the present invention pertains will understand that various modifications can be made to the above-described embodiments without departing from the scope of the present invention. Therefore, the scope of rights of the present invention should not be limited to the described embodiments, but should be determined by the claims set forth below as well as their equivalents.

100: News recommendation system
102: User terminal
104: News recommendation server
111: Data collection unit
113: Field ratio calculation unit
115: News favorability calculation unit
117: News recommendation unit

Claims

A news recommendation server comprising:
a data collection unit for collecting web browsing history information from a user terminal;
a field ratio calculation unit for calculating a ratio of news fields preferred by the user based on the collected web browsing history information; and
a news recommendation unit for recommending news to the user terminal by adjusting at least one of an arrangement order of recommended news fields and a quantity of recommended news per field according to the calculated ratio of news fields.

The news recommendation server of claim 1,
wherein the field ratio calculation unit
extracts text from the web page of each news item accessed by the user terminal based on the web browsing history information, performs a primary selection of analysis target words from the extracted text based on part of speech, and determines the field of each accessed news item based on the primarily selected analysis target words.

The news recommendation server of claim 2,
wherein the field ratio calculation unit
assigns a weight to each of the primarily selected analysis target words, performs a secondary selection of a preset number of words having high weight values from among the primarily selected analysis target words, and determines the field of each accessed news item according to the degree to which the secondarily selected words are included in each text.

The news recommendation server of claim 1,
wherein the news recommendation server
further comprises a news favorability calculation unit for calculating the user's favorability toward each news item provided to each user terminal.

The news recommendation server of claim 4,
wherein the news favorability calculation unit
calculates the user's favorability value for a news item based on the time the user spent reading the news item and the sentence length of the news item.

The news recommendation server of claim 5,
wherein the data collection unit collects log data of the user terminal's access to the news recommendation server, and
wherein the news favorability calculation unit extracts, from among the web pages of the web browsing history information, the web page whose access time is the same as the time at which a given news item was provided according to the log data, and calculates the time spent reading the news item based on the access time of the extracted web page and the access time of the web page accessed next after the extracted web page.

The news recommendation server of claim 5,
wherein the news favorability calculation unit
calculates the speed at which the news item was read based on the time the user spent reading the news item and the sentence length of the news item, and sets the user's favorability value for the news item higher the closer the calculated reading speed is to a preset standard speed.

The news recommendation server of claim 4,
wherein the news recommendation unit
searches for users whose news subscription tendencies are similar to that of the user, and recommends news, from among the news items viewed by the retrieved users, according to the ratio of news fields preferred by the user.

The news recommendation server of claim 8,
wherein the news recommendation unit
extracts, from among the news items viewed by the users with similar news subscription tendencies, news items in descending order of those users' news favorability, and adjusts, for the extracted news items, at least one of the arrangement order of recommended news fields and the quantity of recommended news per field according to the ratio of news fields preferred by the user.

A news recommendation method performed in a computing device having one or more processors and a memory storing one or more programs executed by the one or more processors, the method comprising:
collecting web browsing history information from a user terminal;
calculating a ratio of news fields preferred by the user based on the collected web browsing history information; and
recommending news to the user terminal by adjusting at least one of an arrangement order of recommended news fields and a quantity of recommended news per field according to the calculated ratio of news fields.

The news recommendation method of claim 10,
wherein the calculating of the ratio of news fields comprises:
extracting text from the web page of each news item accessed by the user terminal based on the web browsing history information;
performing a primary selection of analysis target words from the extracted text based on part of speech;
assigning a weight to each of the primarily selected analysis target words;
performing a secondary selection of a preset number of words having high weight values from among the primarily selected analysis target words; and
determining the field of each accessed news item according to the degree to which the secondarily selected words are included in each text.

The news recommendation method of claim 10,
wherein the news recommendation method
further comprises calculating the user's favorability toward each news item provided to each user terminal.

The news recommendation method of claim 12,
wherein the calculating of the favorability comprises
calculating the user's favorability value for a news item based on the time the user spent reading the news item and the sentence length of the news item.

The news recommendation method of claim 13,
wherein the calculating of the favorability comprises:
collecting log data of the user terminal's access to the news recommendation server;
extracting, from among the web pages of the web browsing history information, the web page whose access time is the same as the time at which a given news item was provided according to the log data; and
calculating the time spent reading the news item based on the access time of the extracted web page and the access time of the web page accessed next after the extracted web page.

The news recommendation method of claim 13,
wherein the calculating of the favorability comprises:
calculating the speed at which the news item was read based on the time the user spent reading the news item and the sentence length of the news item; and
setting the user's favorability value for the news item higher the closer the calculated reading speed is to a preset standard speed.

The news recommendation method of claim 12,
wherein the recommending comprises:
searching for users whose news subscription tendencies are similar to that of the user; and
recommending news, from among the news items viewed by the users with similar news subscription tendencies, according to the ratio of news fields preferred by the user.

The news recommendation method of claim 16,
wherein the recommending comprises:
extracting, from among the news items viewed by the users with similar news subscription tendencies, news items in descending order of those users' news favorability; and
adjusting, for the extracted news items, at least one of the arrangement order of recommended news fields and the quantity of recommended news per field according to the ratio of news fields preferred by the user.