KR100993845B1

KR100993845B1 - System For Recommending Personalized Meaning-Based Web-Document And Its Method

Info

Publication number: KR100993845B1
Application number: KR1020070140872A
Authority: KR
Inventors: 최중민; 강진범
Original assignee: 한양대학교 산학협력단
Priority date: 2007-12-28
Filing date: 2007-12-28
Publication date: 2010-11-12
Also published as: KR20090072680A

Abstract

The present invention relates to a personalized semantic-based web document recommendation system and method. The method comprises: (a) collecting, by the system, at least one web document using a web robot; (b) generating a generalized ontology for the collected web documents by defining each collected web document as an instance and defining an identifier that identifies the web document as the instance name of that instance; (c) monitoring the web documents visited by a specific user; (d) inferring from the generalized ontology, using an ontology inference engine, a concept contained in a web document visited by the specific user (hereinafter the "user preference concept"); (e) generating, as at least one instance, at least one concept that is most similar to the user preference concept among the concepts contained in the generalized ontology; (f) receiving a web document recommendation request from the specific user; and (g) upon receiving the recommendation request, recommending at least one web document from the generalized ontology to the specific user on the basis of the user preference concept, a first similarity, and a second similarity.


Description

System for Recommending Personalized Meaning-Based Web-Document And Its Method

The present invention relates to the field of web document recommendation. More specifically, it relates to a personalized semantic-based web document recommendation system and method that, in order to select the documents a user requests, generate a generalized ontology defining semantic relationships from documents collected by a web robot; generate a preference information ontology reflecting the user's preferred concepts and degree of preference by inferring how the documents the user has visited relate to the concepts of the generalized ontology; and use the generalized ontology and the preference information ontology to recommend web documents that match the user's intent.

In web search, users spend a great deal of time finding the information they want. Various web document recommendation systems are being developed to make web search more efficient.

Conventional web document recommendation systems recommend documents a user is likely to prefer by measuring similarity with text mining techniques based on word frequency or probabilistic methods. As a result, a document is not recommended unless the same words appear in it, even when it contains semantically related content.

In particular, a web document recommendation system that does not consider semantic relationships can make only one-dimensional recommendations. For example, when a user is interested in "baseball", only documents containing the word "baseball" can be recommended. If semantic relationships are taken into account, information on semantically related concepts such as "catcher" and "pitcher" can be recommended as well.

In addition, as research on the Semantic Web has become more active, Semantic Web-based search engines known as semantic portals have been developed, but it remains difficult to link documents to an ontology that expresses conceptual relationships. This runs counter to what users want from web document recommendation, namely to have the semantic information they need identified accurately and to be recommended a variety of information through conceptual associations.

The present invention has been proposed to solve the above problems in line with recent trends and demands. Its object is to provide a personalized semantic-based web document recommendation system and method that, in order to select the documents a user requests, generate a generalized ontology defining semantic relationships from documents collected by a web robot; generate a preference information ontology reflecting the user's preferred concepts and degree of preference by inferring how the documents the user has visited relate to the concepts of the generalized ontology; and use the generalized ontology and the preference information ontology to recommend web documents that match the user's intent.

As one aspect of the present invention for achieving the above object, the personalized semantic-based web document recommendation method according to the present invention is implemented by a system that is connected to a specific user through a network and recommends web documents to the specific user through the network. The method comprises: (a) collecting, by the system, at least one web document using a web robot; (b) generating, by the system, a generalized ontology for the collected web documents by defining each collected web document as an instance and defining an identifier that identifies the web document as the instance name of that instance; (c) monitoring, by the system, the web documents visited by the specific user; (d) inferring, by the system, from the generalized ontology and using an ontology inference engine, a concept contained in a web document visited by the specific user (hereinafter the "user preference concept"); (e) generating, by the system, as at least one instance, at least one concept that is most similar to the user preference concept among the concepts contained in the generalized ontology; (f) receiving, by the system, a web document recommendation request from the specific user; and (g) upon receiving the recommendation request, recommending, by the system, at least one web document from the generalized ontology to the specific user on the basis of the user preference concept, a first similarity, and a second similarity.
Step (d) consists of indexing all instances of the generalized ontology in a vector model and inferring, from the indexed instances, the concept most similar to the concept contained in the web document visited by the specific user as the user preference concept.
Step (f) is performed by the system receiving a query that the specific user submits to the system, or by the system recognizing that the specific user selects a specific web document that the user visits.
The first similarity is the similarity (S1) between the query and the instances of the generalized ontology, or the similarity (S2) between the specific web document and the instances of the generalized ontology. The second similarity is the similarity (S3) between the concept to which the query belongs and the concepts to which the instances of the generalized ontology belong, or the similarity (S4) between the concepts contained in the specific web document and the concepts to which the instances of the generalized ontology belong.

The instance may include information on at least one of an annotation, an image, a title, and a URL describing the corresponding web document.

Step (e) may further include generating preference information indicating the specific user's degree of preference for the generated instance, and the user's degree of preference may be determined in relation to how frequently the user visits web documents associated with the generated instance.

S1 and S2 may be calculated as the cosine similarity between the query and the instances of the generalized ontology and the cosine similarity between the specific web document and the instances of the generalized ontology, respectively.

S3 and S4 may be calculated from a first set, consisting of the at least one concept inferred for the query or for the specific web document, and a second set, consisting of the concepts to which the instances of the generalized ontology belong, as the proportion of concepts in the intersection of the two sets relative to the concepts in their union.
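The intersection-over-union ratio described for S3 and S4 is what is commonly called the Jaccard coefficient. A minimal sketch follows; the function name `concept_similarity` and the sample concept sets are illustrative assumptions, not part of the specification.

```python
def concept_similarity(inferred: set[str], instance_concepts: set[str]) -> float:
    """Ratio of shared concepts to all concepts in either set (S3 or S4)."""
    union = inferred | instance_concepts
    if not union:
        # Assumption: two empty concept sets are treated as having no similarity.
        return 0.0
    return len(inferred & instance_concepts) / len(union)


# Hypothetical example: one shared concept out of three distinct concepts.
query_concepts = {"Machine Learning", "Bayesian Network"}
candidate_concepts = {"Machine Learning", "Database"}
score = concept_similarity(query_concepts, candidate_concepts)
```

Here one of three distinct concepts is shared, so the score is 1/3.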


As another aspect of the present invention for achieving the above object, the personalized semantic-based web document recommendation system according to the present invention comprises: a web document collection unit that collects at least one web document using a web robot and generates a generalized ontology for the collected web documents; a preference information generation unit that monitors the web documents a user visits, infers from the generalized ontology a concept related to a web document the user has visited (hereinafter the "user preference concept"), and generates a preference information ontology for the user in which the inferred user preference concept is associated with the generalized ontology; and a document recommendation unit that, in response to a document recommendation request received from the user, recommends at least one web document from the generalized ontology to the user in consideration of the user preference concept and document similarity.

With the personalized semantic-based web document recommendation system and method according to the present invention, a generalized ontology is generated by defining semantic relationships from documents collected by a web robot, a preference information ontology is generated by identifying the user's preferred concepts and degree of preference from the documents the user has visited, and the two ontologies are used to recommend personalized web documents that accurately reflect the user's intent.

The above objects, features, and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. Throughout the specification, the same reference numerals denote the same components. Detailed descriptions of well-known functions or configurations related to the present invention are omitted where they could unnecessarily obscure the subject matter of the invention.

FIG. 1 is a block diagram of a personalized semantic-based web document recommendation system according to an embodiment of the present invention. FIG. 2 is a flowchart of a personalized semantic-based web document recommendation method according to an embodiment of the present invention. The system and method are described in detail below with reference to FIG. 1, FIG. 2, and other drawings as needed.

As shown in FIG. 1, the personalized semantic-based web document recommendation system 100 may include a web document collection unit 20, a preference information generation unit 40, and a document recommendation unit 50. The functions of these three units are briefly as follows.

The web document collection unit 20 collects at least one web document using a web robot and generates a generalized ontology 24 for the collected web documents.

The preference information generation unit 40 infers from the generalized ontology 24 the concept contained in a web document visited by a logged-in user, and generates a preference information ontology 45 for the user with the inferred concept as the user's user preference concept.

The document recommendation unit 50 recommends at least one web document to the user, in response to a document recommendation request received from the user, using the generalized ontology 24 and the preference information ontology 45.

The specific operation of the personalized semantic-based web document recommendation system 100 and the personalized semantic-based web document recommendation method according to an embodiment of the present invention are described in detail below.

The web document collection unit 20 collects at least one web document using a web robot [S101]. Starting from an initially given seed URL 10, the web robot can collect all documents connected by hyperlinks. The web document collection unit 20 may use a queue 21 for this purpose, that is, a waiting line of URLs still to be processed. The web document collection unit 20 stores the initially given seed URL 10 in the queue 21, collects the document corresponding to the seed URL 10 stored in the queue 21 [S101], obtains all URLs contained in that document [S103], and adds them to the queue 21 [S105]. The web document collection unit 20 can then collect the documents corresponding to the URLs stored in the queue 21 [S101]. In this way, the web document collection unit 20 may repeat steps S101 to S105 a predetermined number of times.
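The collection loop of steps S101 to S105 can be sketched as a queue-driven, breadth-first traversal. This is a minimal sketch under stated assumptions: `crawl`, `fetch_links`, `max_documents`, and the in-memory `TOY_WEB` are hypothetical names introduced for illustration, the `seen` set (which avoids re-queuing a URL) is an addition not mentioned in the text, and the "predetermined number of times" is interpreted here as a cap on collected documents.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(seed_url: str,
          fetch_links: Callable[[str], Iterable[str]],
          max_documents: int) -> list[str]:
    """Collect URLs breadth-first from a seed URL using a FIFO queue."""
    queue = deque([seed_url])        # queue (21), initialised with the seed URL (10)
    seen = {seed_url}                # assumption: skip URLs already queued
    collected: list[str] = []
    while queue and len(collected) < max_documents:
        url = queue.popleft()
        collected.append(url)        # S101: collect the document at this URL
        for link in fetch_links(url):    # S103: obtain the URLs in the document
            if link not in seen:
                seen.add(link)
                queue.append(link)   # S105: add them to the queue
    return collected


# A toy link graph stands in for real HTTP fetching and HTML parsing.
TOY_WEB = {"seed": ["a", "b"], "a": ["b", "c"], "b": [], "c": ["seed"]}
pages = crawl("seed", lambda u: TOY_WEB.get(u, []), max_documents=3)
```

With the toy graph above, the first three documents collected are "seed", "a", and "b", in that order.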

The web document collection unit 20 generates a generalized ontology 24 for the collected web documents [S107]. The generalized ontology 24 identifies the topics of the collected web documents and defines the conceptual connections among them.

FIG. 3 shows an example of generating the generalized ontology. In general, an ontology for web documents consists of information on the concepts and instances contained in the documents and on the relationships between concepts. The web document collection unit 20 may generate the generalized ontology 24 by defining each collected web document as a separate instance. For each instance, an identifier 60 that identifies the corresponding web document is defined as the instance name, and the instance may include information on at least one of an annotation 61, an image 62, a title 63, and a URL 64 describing that web document.
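The per-document instance described above can be pictured as a small record keyed by its identifier. The following is only an illustrative data-structure sketch: the class `DocumentInstance`, the helper `build_instances`, and the sample documents are assumptions, and a real implementation would more likely store these as individuals in an ontology language such as OWL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DocumentInstance:
    instance_name: str                 # identifier (60), used as the instance name
    annotation: Optional[str] = None   # annotation (61)
    image: Optional[str] = None        # image (62)
    title: Optional[str] = None        # title (63)
    url: Optional[str] = None          # URL (64)


def build_instances(documents: list[dict]) -> dict[str, DocumentInstance]:
    """Define one instance per collected document, keyed by its identifier."""
    return {
        doc["id"]: DocumentInstance(
            instance_name=doc["id"],
            annotation=doc.get("annotation"),
            image=doc.get("image"),
            title=doc.get("title"),
            url=doc.get("url"),
        )
        for doc in documents
    }


# Hypothetical collected documents.
instances = build_instances([
    {"id": "doc-001", "title": "Bayesian Network", "url": "http://example.com/bn"},
    {"id": "doc-002", "annotation": "relational database overview"},
])
```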

The preference information generation unit 40 monitors the web documents 41 visited by a user logged in to the system 100 (the specific user who requests web document recommendation, hereinafter simply the "user") [S109], and infers from the generalized ontology 24, using the ontology inference engine 46, the concept contained in a web document 41 visited by the user (hereinafter the "user preference concept") [S111]. Here, the preference information generation unit 40 may index all instances of the generalized ontology 24 in a vector model and infer the concept most similar to the concept contained in the visited web document 41 as the user preference concept.

FIG. 4 shows an example of step S111. In step S111, to predict what kind of documents the user is interested in, the system infers which of the concepts belonging to the generalized ontology 24 the visited web document 41 contains. As shown in FIG. 4, the preference information generation unit 40 indexes the annotations 61 of all instances in the generalized ontology 24 into a vector model 70 and, using the ontology inference engine 46, infers over the indexed instances the concept most similar to the concept contained in the visited web document 41 as the concept of that document. Referring to FIG. 4, concept inference in step S111 may be performed with the Concepts function 71, which takes a query q as input, measures its cosine similarity with each document d indexed in the vector model, and returns the set of concepts contained in the document with the highest similarity.
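The behavior described for the Concepts function can be sketched as a nearest-document lookup. This is a simplified sketch, not the function of FIG. 4: it uses raw term counts instead of TF-IDF weights, and the names `cosine`, `concepts`, `index`, and `concept_map`, as well as the sample annotations, are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def concepts(query: str,
             index: dict[str, Counter],
             concept_map: dict[str, set[str]]) -> set[str]:
    """Return the concepts of the indexed document most similar to the query."""
    q = Counter(query.lower().split())
    best_name, best_score = None, 0.0
    for name, vector in index.items():
        score = cosine(q, vector)
        if score > best_score:
            best_name, best_score = name, score
    # Assumption: no concepts are inferred when nothing overlaps with the query.
    return set(concept_map[best_name]) if best_name is not None else set()


# Hypothetical annotations (61) of two instances and the concepts they belong to.
annotations = {
    "doc1": "bayesian network inference for learning",
    "doc2": "relational database query optimization",
}
index = {name: Counter(text.split()) for name, text in annotations.items()}
concept_map = {"doc1": {"Machine Learning"}, "doc2": {"Database"}}
inferred = concepts("bayesian learning", index, concept_map)
```

The query shares terms only with doc1, so the inferred concept set is that of doc1.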

A vector model represents documents as vectors in order to determine how the documents relate to one another. To represent a document as a vector, every word appearing in the documents can be weighted by TF (Term Frequency)-IDF (Inverse Document Frequency). For example, if document A contains "a b c c d" and document B contains "a b c c e", both documents can be represented in five dimensions. Considering TF alone, document A is (a, b, c, d, e)=(1, 1, 2, 1, 0) and document B is (a, b, c, d, e)=(1, 1, 2, 0, 1). IDF takes document frequency into account and lowers the weight of words that appear in too many documents. For example, the word a appears in both document A and document B, so it is not a good word for telling the documents apart. The weight of the word a is therefore lowered, while the words d and e receive higher weights. Representing all documents as vectors in this way is called indexing.
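The TF and IDF weighting in this example can be worked through in a few lines. The sketch below assumes document A is "a b c c d", so that it matches the TF vector (1, 1, 2, 1, 0) given in the text, and it uses log(N/df) as the IDF formula, which is one common variant; the text does not say which variant is intended.

```python
import math
from collections import Counter

# Toy corpus from the example; the contents of A are assumed to be "a b c c d".
docs = {"A": "a b c c d".split(), "B": "a b c c e".split()}
vocab = sorted({term for terms in docs.values() for term in terms})  # a..e


def tf_vector(terms: list[str]) -> list[int]:
    """Raw term frequency of each vocabulary word in one document."""
    counts = Counter(terms)
    return [counts[term] for term in vocab]


def idf(term: str) -> float:
    """log(N / df): zero for a word that occurs in every document."""
    df = sum(term in terms for terms in docs.values())
    return math.log(len(docs) / df)


tf = {name: tf_vector(terms) for name, terms in docs.items()}
tfidf = {
    name: [freq * idf(term) for freq, term in zip(vector, vocab)]
    for name, vector in tf.items()
}
```

Under this variant the shared words a, b, and c receive weight 0 in both documents, and only d (in A) and e (in B) keep a positive weight of log 2, which illustrates why d and e are the discriminating words.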

Cosine similarity may be used to measure the similarity between documents represented in the vector model. As indicated by reference numeral 71 in FIG. 4, this measure normalizes the two document vectors to unit vectors and takes the cosine of the angle between them. It is the measure commonly used in vector models for information retrieval.
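As a worked illustration of normalizing two document vectors and taking the cosine of the angle between them, the sketch below applies the measure to the TF-only vectors (1, 1, 2, 1, 0) and (1, 1, 2, 0, 1) from the preceding example. The function names are illustrative assumptions.

```python
import math


def normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in vector]


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Dot product of the two unit vectors, i.e. the cosine of their angle."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


doc_a = [1, 1, 2, 1, 0]   # TF vector of document A
doc_b = [1, 1, 2, 0, 1]   # TF vector of document B
similarity = cosine_similarity(doc_a, doc_b)   # 6/7, about 0.857
```

The dot product of the raw vectors is 6 and each vector has length √7, so the similarity is 6/7.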

The preference information generation unit 40 then generates a preference information ontology 45 for the user, in which the inferred user preference concept is associated with the generalized ontology 24 [S113]. FIG. 5 shows an example of generating the preference information ontology 45.

In step S113, at least one concept that is most similar to the user preference concept among the concepts contained in the generalized ontology 24 may be generated as at least one instance of the preference information ontology 45, and an association may be established between that most similar concept and the generated instance. For example, as shown in FIG. 5, at least one concept 80 most similar to the user preference concept among the concepts in the generalized ontology 24 is generated as an instance 81 of the preference information ontology 45, and the concept and the instance 81 are associated using the "referTo" ObjectProperty.

FIG. 5 shows the relationship between the generalized ontology 24 and the preference information ontology 45. For example, when the user is interested in the "Machine Learning" concept of the generalized ontology 24, a "Machine Learning" instance 81 is created in the preference information ontology 45, as shown in FIG. 5, and a "referTo" relationship is formed to record the source of this preference. In the example of FIG. 5, the user came to prefer the "Machine Learning" concept because the user previously visited a document matching the "Bayesian Network" instance, and this is why the "referTo" relationship is formed.

In step S113, preference information indicating the user's degree of preference for the generated instance 81 may also be generated. The degree of preference indicates how strongly the user prefers the generated instance 81 and may be determined in relation to how frequently the user visits web documents associated with that instance. Referring to FIG. 5, the degree of preference may be expressed through a "hasWeight" relationship. In other words, in the preference information ontology 45 the concepts the user prefers are represented as instances, the "referTo" relationship shows which similar web documents the user visited previously, and the "hasWeight" relationship shows how strongly the user prefers a particular concept or instance.

FIG. 6 shows an example of how the user's degree of preference for the generated instance 81 is measured. As shown in FIG. 6, the preference information generation unit 40 may measure or revise the user's degree of preference using a dictionary (PreferenceDic), whose keys are the instances of the preference information ontology 45 and whose values are the weights of the "hasWeight" DatatypeProperty attached to those instances, together with the concepts inferred for the document the user is currently visiting. The dictionary thus holds the user's preference information per concept.

The preference information generation unit 40 determines whether the key set of the dictionary (PreferenceDic.KeySet) contains all the concepts inferred from the document the user has visited. If it does not, the value of each concept c that is missing from the dictionary (NewConcepts) is initialized to 0 (PreferenceDic[c]=0). This initialization makes explicit that the previous preference for those concepts is 0. The weight for the currently visited document is determined by the "Occur" function, which may return 1 if a concept of the document the user is currently visiting matches the concept in PreferenceDic and 0 otherwise.

For every concept key in the dictionary, including the newly added ones, the preference information generation unit 40 may update the preference (82) by considering the previous preference and whether the concept matches a concept of the document the user has visited (83). For example, if the user visits a web document of the "network" concept for the first time, the concept is not yet defined in the preference information ontology 45, so PreferenceDic[network]=1. If the user then visits a web document of the "database" concept for the first time, PreferenceDic[network]=0.5 and PreferenceDic[database]=1 (σ=0.5). The value of σ sets how quickly previous preference values decay. If the user visits "network" again, PreferenceDic[network]=1.26 and PreferenceDic[database]=0.5. The preference for documents that are not visited frequently thus shrinks gradually toward 0. As a result, a high preference is produced for concepts the user has been interested in recently or is interested in frequently.
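The update formula itself appears only in FIG. 6 and is not reproduced in the text. As a minimal sketch, assume the rule is new preference = σ × previous preference + Occur, with new concepts first initialized to 0. The function name `update_preferences` is hypothetical. This assumed rule reproduces the values 1, 0.5, and 1 from the example and gives 1.25 where the text states 1.26, so the actual formula may differ in detail.

```python
def update_preferences(preference_dic: dict[str, float],
                       visited_concepts: set[str],
                       sigma: float = 0.5) -> dict[str, float]:
    """Decay every stored preference by sigma, then add 1 for visited concepts."""
    # Initialise concepts not yet in the dictionary (NewConcepts) to 0.
    for concept in visited_concepts:
        preference_dic.setdefault(concept, 0.0)
    for concept in preference_dic:
        # Occur: 1 if the concept matches the visited document, else 0.
        occur = 1.0 if concept in visited_concepts else 0.0
        # Assumed rule; the real formula is the one shown in FIG. 6.
        preference_dic[concept] = sigma * preference_dic[concept] + occur
    return preference_dic


prefs: dict[str, float] = {}
update_preferences(prefs, {"network"})     # network = 1
update_preferences(prefs, {"database"})    # network = 0.5, database = 1
update_preferences(prefs, {"network"})     # network = 1.25, database = 0.5
```

Repeated visits keep a concept's weight high, while a concept that is no longer visited is halved at every step and approaches 0, which matches the qualitative behavior described above.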

The document recommendation unit 50 receives a document recommendation request from the user [S115]. The request may be received either by receiving a query term entered by the user or by recognizing that the user has selected a web document. An example of the latter is recognizing that the user has selected a specific web document that the user is visiting.

Using the ontology inference engine 46, the document recommendation unit 50 may infer, from the preference information ontology 45, concepts and documents similar to the specific query term or the specific web document [S117]. The ontology inference engine 46 can infer hierarchical (superordinate and subordinate) relationships between concepts and can infer new concepts for instances. Examples of the ontology inference engine 46 include FaCT++, Pellet, KAON2, and RacerPro.

As an example of hierarchical inference, if the concept "pitcher" is defined as "'a person who throws a ball' and 'baseball'", the ontology inference engine 46 infers that "pitcher" is a sub-concept of "a person who throws a ball" and also a sub-concept of "baseball".

As an example of inferring a new concept for an instance, suppose "Lee Seung-yeop" is an instance of the "baseball" concept. The ontology inference engine 46 infers the conceptual relationships and concludes that "Lee Seung-yeop" is an instance of "batter", a sub-concept of the "baseball" concept.
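The two kinds of inference in the examples above can be illustrated with a toy sketch. It is not how FaCT++, Pellet, or the other engines named above work internally; it only handles concepts defined as a conjunction of atomic concepts. The definition of "batter" and the extra fact `ball_hitter` are hypothetical, introduced so that the instance example has something to infer from.

```python
# Toy reasoning over concepts defined as conjunctions (AND) of atomic concepts.
DEFINITIONS = {
    "pitcher": {"ball_thrower", "baseball"},
    "batter": {"ball_hitter", "baseball"},  # hypothetical definition
}


def super_concepts(concept):
    """A defined concept is a sub-concept of each conjunct in its definition."""
    return DEFINITIONS.get(concept, set())


def infer_concepts(asserted):
    """Return defined concepts whose conjuncts are all asserted for the instance."""
    return {name for name, conjuncts in DEFINITIONS.items() if conjuncts <= asserted}


# Facts asserted for the instance; "ball_hitter" is a hypothetical extra fact.
lee_seung_yeop = {"baseball", "ball_hitter"}

print(sorted(super_concepts("pitcher")))  # ['ball_thrower', 'baseball']
print(infer_concepts(lee_seung_yeop))     # {'batter'}
```

A real description-logic reasoner supports far richer constructs (roles, negation, cardinality), so this sketch should be read only as an illustration of the two inference results, not of the reasoning procedure.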

Using the ontology inference engine 46, the document recommendation unit 50 may infer the hierarchical relationships among the concepts inferred from the generalized ontology 24 and, based on the updated hierarchy, infer the concept of a document similar to the query term entered by the user as the concept of that query term.

In response to the document recommendation request, the document recommendation unit 50 recommends at least one web document from the generalized ontology 24 to the user, using the user preference concept and document similarity [S119]. The document recommendation unit 50 may recommend documents suited to the user's intention by considering the vector-model similarity, the weight of the user preference concept, and the concept weight of the query term entered by the user.

FIG. 7 shows an example of a formula for performing step S119. As shown in FIG. 7, the document recommendation unit 50 may, for the indexed instances of the generalized ontology 24, calculate a score (84) that takes into account the vector-model similarity (85), the similarity between the concept to which the query term belongs and the concepts contained in a document (86), and the weight of the user preference concept (82), and then sort the instances by score in descending or ascending order to recommend at least one web document.
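The scoring and sorting step can be sketched as below. The formula of FIG. 7 is not reproduced in this text, so the combination rule used here (a plain sum of the three terms) is purely an assumption, as are the `Candidate` type, the `recommend` function, and the example URLs and values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    vector_sim: float   # vector-model similarity (85)
    concept_sim: float  # concept similarity (86)
    preference: float   # weight of the user preference concept (82)


def score(c: Candidate) -> float:
    # Assumption: a plain sum; FIG. 7 may combine the terms differently.
    return c.vector_sim + c.concept_sim + c.preference


def recommend(candidates, top_k=2):
    """Return the top_k candidates sorted by score in descending order."""
    return sorted(candidates, key=score, reverse=True)[:top_k]


docs = [
    Candidate("http://example.org/a", 0.25, 0.5, 0.5),
    Candidate("http://example.org/b", 0.75, 1.0, 1.25),
    Candidate("http://example.org/c", 0.5, 0.0, 0.0),
]
for d in recommend(docs):
    print(d.url, score(d))
```

Whatever the exact formula, the structure is the same: compute one score per indexed instance, sort, and return the leading entries.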

The vector-model similarity (85) is the similarity (S1) between the specific query term and the instances of the generalized ontology 24, or the similarity (S2) between the specific web document and the instances of the generalized ontology 24. S1 and S2 may be computed, respectively, as the cosine similarity between the specific query term and the instances of the generalized ontology 24 and the cosine similarity between the specific web document and those instances. In other words, the vector-model similarity (85) may be obtained by computing, in the indexed vector model, the cosine similarity between the query term entered by the user and each indexed document.
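A minimal sketch of the cosine similarity computation follows. It assumes raw term-frequency vectors built from whitespace-separated tokens; the text does not specify the term weighting used in the indexed vector model, and the helper names and sample documents are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector from whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = term_vector("network protocol")
documents = {
    "doc_network": "network protocol routing network",
    "doc_database": "database index query",
}
# Rank the indexed documents by similarity to the query, most similar first.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, term_vector(documents[name])),
    reverse=True,
)
print(ranked)
```

The same function serves for S2 by passing the term vector of the selected web document in place of the query vector.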

The similarity (86) between the concept to which the specific query term belongs and the concepts contained in a document is the similarity (S3) between the concept to which the specific query term belongs and the concepts to which the instances of the generalized ontology 24 belong, or the similarity (S4) between the concepts contained in the specific web document and the concepts to which the instances of the generalized ontology 24 belong. S3 and S4 may be measured as follows. Let the first set consist of the at least one concept inferred for the specific query term or the specific web document, and let the second set consist of the concepts to which the instances of the generalized ontology 24 belong. The similarity is the number of concepts in the intersection of the first set and the second set divided by the number of concepts in their union. For example, if the first set is inferred as {'baseball', 'catcher'} and the second set is {'baseball', 'pitcher'}, the union is {'baseball', 'catcher', 'pitcher'} and the intersection is {'baseball'}, so the similarity (86) is 1/3. If instead both the first set and the second set are {'baseball', 'catcher'}, the similarity (86) is 2/2. The similarity (86) thus indicates how much the two sets of concepts overlap.
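The intersection-over-union measure described above is commonly known as the Jaccard index. A minimal sketch that reproduces the two worked examples is given here; the function name `concept_similarity` is hypothetical, and returning 0 for two empty sets is an assumption, since the text does not cover that case.

```python
from fractions import Fraction


def concept_similarity(first, second):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = first | second
    if not union:
        return Fraction(0)  # assumption: two empty sets have similarity 0
    return Fraction(len(first & second), len(union))


print(concept_similarity({"baseball", "catcher"}, {"baseball", "pitcher"}))  # 1/3
print(concept_similarity({"baseball", "catcher"}, {"baseball", "catcher"}))  # 1
```

Exact fractions are used so that the results can be compared directly with the 1/3 and 2/2 values in the text; in practice a float would serve equally well.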

FIG. 8 shows an example of documents recommended in step S119. In the interface 90, which collects the documents visited by the user and provides recommendation information, "network" has been entered as the recommendation keyword. The concepts of the recommended documents are shown on the right side of the screen (92), and the documents are listed on the left side in order of priority. When the user requests document recommendations for "network", personalized semantic-based documents can be recommended by giving higher weight to the concepts the user has preferred so far, as shown in FIG. 8.

The personalized semantic-based web document recommendation method according to the present invention described above may be provided as a program for execution on a computer, recorded on a computer-readable recording medium.

The personalized semantic-based web document recommendation method according to the present invention may be executed through software. When executed as software, the constituent means of the present invention are code segments that perform the necessary tasks. The program or code segments may be stored on a processor-readable medium, or transmitted as a computer data signal combined with a carrier wave over a transmission medium or communication network.

A computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples include ROM, RAM, CD-ROM, DVD±ROM, DVD-RAM, magnetic tape, floppy disks, hard disks, and optical data storage devices. The computer-readable recording medium may also be distributed over network-connected computer devices so that computer-readable code is stored and executed in a distributed fashion.

Those of ordinary skill in the art to which the present invention pertains may make various substitutions, modifications, and changes to the present invention described above without departing from its technical spirit, so the invention is not limited by the foregoing embodiments and the accompanying drawings.

FIG. 1 is a block diagram of a personalized semantic-based web document recommendation system according to an embodiment of the present invention.

FIG. 2 is a flowchart of a personalized semantic-based web document recommendation method according to an embodiment of the present invention.

FIG. 3 shows an example of generating a generalized ontology.

FIG. 4 shows an example of step S111.

FIG. 5 shows an example of generating the preference information ontology 45.

FIG. 6 shows an example of a method of measuring the user's preference for the generated instance 81.

FIG. 7 shows an example of a formula for performing step S119.

FIG. 8 shows an example of documents recommended in step S119.

<Explanation of reference numerals for main parts of the drawings>

100: Web document recommendation system
10: Seed URL
20: Web document collection unit
21: Queue
24: Generalized ontology
40: Preference information generation unit
45: Preference information ontology
46: Inference engine
50: Document recommendation unit

Claims

A personalized semantic-based web document recommendation method for recommending a web document to a specific user by using a system that is connected to the specific user through a network and recommends web documents to the specific user through the network, the method comprising:

(a) the system collecting at least one web document using a web robot;

(b) the system generating a generalized ontology for the collected at least one web document by defining each of the collected at least one web document as an instance and defining an identifier capable of identifying the corresponding web document as the instance name of the instance;

(c) the system monitoring the web document visited by the specific user;

(d) the system inferring a concept (hereinafter referred to as “user preference concept”) included in the web document visited by the specific user from the generalized ontology using an ontology inference engine;

(e) the system generating at least one instance of at least one concept most similar to the user preference concept among the concepts included in the generalized ontology;

(f) the system receiving a web document recommendation request from the specific user; and

(g) when the system receives the document recommendation request, recommending at least one web document from the generalized ontology to the specific user based on the user preference concept, a first similarity, and a second similarity,

wherein step (d) comprises indexing all instances of the generalized ontology into a vector model and inferring, as the user preference concept, the concept most similar to the concept included in the web document visited by the specific user from among all the indexed instances of the generalized ontology,

wherein step (f) is performed by the system receiving a query term submitted to the system by the specific user, or by the system recognizing that the specific user has selected a specific web document visited by the specific user, and

wherein the first similarity is the similarity (S1) between the query term and the instances of the generalized ontology, or the similarity (S2) between the specific web document and the instances of the generalized ontology, and the second similarity is the similarity (S3) between the concept to which the query term belongs and the concepts to which the instances of the generalized ontology belong, or the similarity (S4) between the concepts included in the specific web document and the concepts to which the instances of the generalized ontology belong.

(Deleted)

The method of claim 1, wherein the instance comprises information on at least one of an annotation, an image, a title, and a URL describing the web document.

(Deleted)

The method of claim 1, wherein step (e) further comprises generating preference information indicating the preference of the specific user for the generated instance, and wherein the preference of the user is determined in relation to the frequency with which the user visits web documents associated with the generated instance.

(Deleted)

The method of claim 1, wherein S1 and S2 are calculated, respectively, as the cosine similarity between the query term and the instances of the generalized ontology and the cosine similarity between the specific web document and the instances of the generalized ontology.

The method of claim 1, wherein S3 and S4 are calculated, for a first set consisting of at least one concept inferred for the query term or for the specific web document and a second set consisting of the concepts to which the instances of the generalized ontology belong, as the proportion of the concepts in the intersection of the two sets relative to the concepts in the union of the two sets.

A computer-readable recording medium on which is recorded a computer program for executing, on a computer, the method of any one of claims 1, 3, 6, 9, or 10.

A personalized semantic-based web document recommendation system comprising:

a web document collection unit that collects at least one web document using a web robot and generates a generalized ontology for the at least one web document by defining each of the collected at least one web document as an instance and defining an identifier capable of identifying the corresponding web document as the instance name of the instance;

a preference information generation unit that monitors a web document visited by a specific user, infers a concept included in the web document visited by the specific user (hereinafter referred to as the "user preference concept") from the generalized ontology using an ontology inference engine, and generates, as at least one instance, at least one concept most similar to the user preference concept among the concepts included in the generalized ontology; and

a document recommendation unit that receives a document recommendation request from the specific user and, in response to the received request, recommends at least one web document from the generalized ontology to the specific user based on the user preference concept, a first similarity, and a second similarity,

wherein the preference information generation unit indexes all instances of the generalized ontology into a vector model and infers, as the user preference concept, the concept most similar to the concept included in the web document visited by the specific user from among all the indexed instances of the generalized ontology,

wherein the document recommendation request is received by receiving a query term submitted by the specific user or by recognizing that the specific user has selected a specific web document visited by the specific user, and

wherein the first similarity is the similarity (S1) between the query term and the instances of the generalized ontology, or the similarity (S2) between the specific web document and the instances of the generalized ontology, and the second similarity is the similarity (S3) between the concept to which the query term belongs and the concepts to which the instances of the generalized ontology belong, or the similarity (S4) between the concepts included in the specific web document and the concepts to which the instances of the generalized ontology belong.

(Deleted)

The system of claim 12, wherein the instance comprises information on at least one of an annotation, an image, a title, and a URL describing the web document.

(Deleted)

The system of claim 12, wherein the preference information generation unit generates preference information indicating the user's preference for the generated instance, and wherein the user's preference is determined in relation to the frequency with which the user visits web documents associated with the generated instance.

(Deleted)

The system of claim 12, wherein S1 and S2 are calculated, respectively, as the cosine similarity between the query term and the instances of the generalized ontology and the cosine similarity between the specific web document and the instances of the generalized ontology.

The system of claim 12, wherein S3 and S4 are calculated, for a first set consisting of at least one concept inferred for the query term or for the specific web document and a second set consisting of the concepts to which the instances of the generalized ontology belong, as the proportion of the concepts in the intersection of the two sets relative to the concepts in the union of the two sets.