KR20090130953A

KR20090130953A - System and method for providing related site information using link information, and advertising system using the same

Info

Publication number: KR20090130953A
Application number: KR1020080056690A
Authority: KR
Inventors: 김형남
Original assignee: 주식회사 쉐어링크
Priority date: 2008-06-17
Filing date: 2008-06-17
Publication date: 2009-12-28

Abstract

PURPOSE: A system and method for providing related site information using link information, and an advertising system using the same, are provided so that a user can easily reach suitable information by receiving, at the same time, information about the website being visited and information about related websites. CONSTITUTION: A clustering server system (101) collects and analyzes the link information included in the pages of an information-providing website requested from a user computer. The server system measures the similarity between websites based on the analysis of the link information, and provides related website information associated with the website information requested from the user computer.

Description

System and method for providing related site information using link information and advertising system using the same {SYSTEM AND METHOD FOR RELATED SITES INFORMATION PROVIDE USING LINK INFORMATION AND ADVERTISING SYSTEM THEREOF}

The present invention relates to a method for providing website information. More particularly, it relates to a system and method for providing related site information using link information, and to an advertising system using the same, which collect and analyze the links contained in the pages of a website, gather on that basis information on similar websites whose content is closest to that of the website, and manage the priority of the similar-website information in order to promote business activity.

In general, many search engine services, such as Naver and Google, make it possible to search for information accessible over the Internet. These services let a user search for display pages, such as web pages, that may be of interest. When a user submits a search request containing search terms, the search engine service identifies web pages that may be related to those terms.

To identify related web pages quickly, a search engine service may maintain a mapping of keywords to web pages. This mapping is created by "crawling" the web (the World Wide Web) to identify the keywords of each web page. To crawl the web, the search engine service may start from a list of root web pages and identify all web pages reachable through them.

The keywords of any particular web page may be identified using various well-known information retrieval techniques, such as identifying the words of a headline, the words supplied in the page's metadata, highlighted words, and so on. The search engine service may generate a relevance score indicating how closely the information on a web page relates to the search request, based on the closeness of each match, web page popularity (for example, Google's PageRank), and the like. The search engine service then displays links to these web pages to the user in an order based on their rankings.

A search engine service may return many web pages as a search result, but because the pages are presented in rank order, it can be difficult for a user to actually find the pages of particular interest. Since the pages presented first may concern popular topics, a user interested in a lesser-known topic may need to scan many pages of search results to find a relevant page. To make relevant pages easier to find, the web pages of a search result could be presented in a hierarchical structure based on some classification or categorization of the pages.

A user may prefer to be shown a list of classifications first, so that the classification of interest can be selected. For example, the user may first be shown an indication that the web pages of the search result have been classified as sports-related and law-related. The user can then select the law-related classification to view the law-related web pages. By contrast, because sports web pages are more popular than law web pages, if the most popular pages were presented first, the user would have to scan many pages to find the law-related ones.

Manually classifying the vast number of available web pages is impractical, and although automated classification techniques have been used to classify text-based content, these techniques are not generally applicable to the classification of web pages.

Accordingly, the development of information providing systems that deliver accurate information to users quickly and at the desired time is being studied from many angles. So far, however, such services are limited to forms such as the vast result lists of search engines or related information derived from log analysis of a specific site, and they still fall well short of efficiently providing suitable information when a user searches for information on the Internet. There is therefore a growing need for a service that analyzes the associations among the various kinds of information on the Internet and automatically provides related information.

The present invention was devised to solve these problems. An object of the present invention is to provide a system and method for providing related site information using link information, and an advertising system using the same, which extract and store related website information based on the website information requested by a user and present it in reconstructed form on the requested website, so that Internet users can be given the information they want accurately and quickly.

Another object of the present invention is to provide a method for providing related site information using link information, and an advertising system using the same, which increase the accuracy of related information: when a user visits an individual website on the Internet to obtain information, the link structure contained in the pages of the visited website is collected and analyzed, the linked websites are classified, and similarity is measured by clustering based on the connection information between websites.

Yet another object of the present invention is to provide a system and method for providing related site information using link information, and an advertising system using the same, which allow operators of websites to pursue stable business operation and service expansion by automatically providing related information and thereby maximizing business activity and operational efficiency.

To achieve the above objects, a system for providing related site information using link information according to a first aspect of the present invention comprises: a link information collection module that, at the request of a user computer, collects the link information included in a content or advertisement page provided by an Internet information-providing website; a link information clustering module that analyzes the link structure of the site link information collected by the link information collection module, identifies the link relationships among a plurality of information-providing sites, measures the similarity between websites based on the number of link steps, and classifies and clusters the related website information; an information storage module that stores, in a database, the inter-site similarity information, per-site weight information, and the like extracted by the link information clustering module; and an information match module that reconstructs the requested website information corresponding to the request from the user computer together with the related website information extracted from the information storage module according to the set weight information, and transmits the result to the user terminal.

Specifically, the similarity measurement performed by the link information clustering module is based on the number of nodes interconnecting the web pages of the requested website selected by the user and the web pages of the linked related sites.

To achieve the above objects, a method for providing related site information using link information according to a second aspect of the present invention comprises the steps of: a) collecting, based on the web page information of the requested website selected by the user, information on the connection state between that page and the pages of other websites; b) tracing the number of nodes between the web page of the requested website and the web pages of the related websites linked to it; c) assigning a weight corresponding to the number of nodes for each related website; d) calculating, according to the weights, the similarity of each related website to the web page of the requested website; and e) performing clustering based on the calculated similarity, and reconstructing the web page by adding the clustering result to the web page of the requested website.

Meanwhile, an advertising system using the method for providing related site information using link information according to a third aspect of the present invention is an online advertising system based on the provision of related site information, and comprises an ISP system that calculates a clustering result for a web page requested from a user terminal and changes the exposure order of the related web pages or advertisement information determined from the clustering result according to a similarity priority registration made from an advertiser terminal, wherein the similarity priority registration of the advertiser terminal requires a rank within a predetermined range according to the clustering result.

With the system and method for providing related site information using link information, and the advertising system using the same, according to the present invention, a user searching for information (content, advertisements, etc.) on the Internet receives, through the website clustering server system using link information, both the information of the visited website and related website information at the same time. Whatever website the user accesses, the time otherwise spent searching individually for the desired information is reduced, and the most suitable information can be reached easily.

In addition, a provider of information (content, advertisements, etc.) through an individual website on the Internet can, through the website clustering server system using link information and independently of the structure of each service site, offer users not only its own information but also related website information, thereby increasing user satisfaction with the service and maximizing the efficiency of site operation.

Furthermore, establishing such a complementary relationship diversifies Internet information services and channels, which can further stimulate Internet business.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram showing a system for providing related site information using link information according to the present invention. As shown, the site information providing system comprises a plurality of user computers (103) capable of accessing websites; an information-providing website (105) that provides various information online, including content and advertisements; and a clustering server system (101) that collects and analyzes the link information included in the pages of the information-providing website (105) requested from a user computer (103), clusters on the basis of that analysis to measure the similarity between websites, and then provides related website information associated with the website information requested from the user computer (103).

When a user computer (103) accesses the information-providing website (105) to search for or browse the desired website information, the site information providing system transmits the requested website information, together with the related website information collected in connection with it, to the user computer (103) in packets.

The related website information is provided together with the website information requested by the user. It may be presented in a partitioned or divided portion of the requested website, or, if necessary, displayed as a pop-up over the requested website.

The user computer (103) accesses the information-providing website (105) under a service contract with an Internet service provider (ISP) and thereby receives various information of interest to the user (content, advertisements, etc.). At the request of the user computer (103), the site information providing system provides the requested website information over the ISP's Internet network and, using the clustering server system (101), provides related website information along with it. That is, in addition to the requested website information selected by the user, the user computer (103) can receive, through one part of the requested website's screen, related website information that the user did not select directly.

The clustering server system (101) will now be described in detail with reference to the accompanying drawings.

FIG. 2 is a block diagram illustrating the main functions of the clustering server system according to the present invention. As shown, the clustering server system comprises: a link information collection module (207) that, at the request of the user computer (103), collects the link information included in the information (content, advertisement, etc.) pages provided by the Internet information-providing website (105); a link information clustering module (205) that analyzes the link structure of the site link information collected by the link information collection module (207), identifies the link relationships among a plurality of information-providing sites, measures the similarity between websites based on the number of link steps, and classifies and clusters the related website information; an information storage module (203) that stores, in a database, the inter-site similarity information, per-site weight information, and the like extracted by the link information clustering module (205); and an information match module (201) that reconstructs the requested website information corresponding to the request from the user computer together with the related website information extracted from the information storage module (203) according to the set weight information, and transmits the result to the user terminal (103).

Here, the link information includes both the sites linked directly from the site in question and the link information contained in those linked sites.

The similarity measurement performed by the link information clustering module (205) serves to find websites that are similar or closely related to the requested website selected by the user, and it measures similarity based on the number of nodes interconnecting the web pages of the requested website and the web pages of the related sites. As shown in FIG. 3, analyzing the link steps with site A as the reference, sites A and C are connected in one step between pages a1 and c4, pages a2 and c2, and pages a3 and c3, so that a total of three pages are connected in one step.

Sites A and D are connected in one step between pages a3 and d1 and between pages a2 and d4, and in two steps between pages a2 and d3 by way of page c2. Finally, sites A and B are connected in one step between pages a2 and b2, and in two steps between page a1 and pages b1 and b2 by way of page c4. Measuring similarity on the basis of these link steps gives the following.

Relative to site A, sites B, C, and D each have three connected pages. However, site C has three pages connected directly in one step; site D has two pages connected in one step and one in two steps; and site B has one page connected in one step and two in two steps. Site C can therefore be judged the most similar, followed by site D, with site B the least similar. Clustering on the measured similarity to classify the sites related to site A, sites C and D can be classified as related sites.
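The comparison above can be sketched in Python as follows. This is a minimal illustration, assuming that sites are ordered first by the number of pages linked in one step and then by the number linked in two steps; the names `link_steps` and `rank_by_link_steps` are hypothetical and do not appear in the disclosure. The counts are those given for the FIG. 3 example.

```python
# Counts from the FIG. 3 example, relative to site A:
# site -> (pages linked in one step, pages linked in two steps)
link_steps = {
    "B": (1, 2),
    "C": (3, 0),
    "D": (2, 1),
}


def rank_by_link_steps(steps):
    """Order sites from most to least similar.

    Assumption: more one-step links means higher similarity, and
    two-step links only break ties between equal one-step counts.
    """
    return sorted(steps, key=lambda site: steps[site], reverse=True)


ranking = rank_by_link_steps(link_steps)
print(ranking)  # ['C', 'D', 'B']
```

The result matches the ordering stated in the text: site C first, then site D, then site B.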

Next, the procedure for analyzing inter-site link information will be described in detail with reference to the accompanying drawings.

FIG. 4 is a flowchart illustrating the analysis of inter-site link information according to the present invention. As shown, in step S401 the link information collection module (207) collects, based on the web page information of the requested website currently selected by the user, the connection state between that page and the pages of other websites, that is, related websites. This inter-page connection information is obtained from each web page by extracting the information on pages of other sites linked from it.
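One possible way to extract the pages of other sites linked from a given page is sketched below using only the Python standard library. The disclosure does not specify how links are parsed; the class `LinkCollector`, the function `external_links`, and the example URLs are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def external_links(html, page_url):
    """Return links that point to a host other than the page's own host."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    own_host = urlparse(page_url).netloc
    return [
        url for url in collector.links
        if urlparse(url).netloc not in ("", own_host)
    ]


sample = '<a href="/about">About</a> <a href="http://site-c.example/c4">C4</a>'
print(external_links(sample, "http://site-a.example/a1"))
# ['http://site-c.example/c4']
```

Links within the same host are dropped here because step S401 is concerned with connections to other sites; a real collector would also need to fetch the pages and follow the links of the linked sites.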

In step S403, the link information clustering module (205) traces the number of nodes between the web page of the requested website selected by the user and the web pages of the related websites linked to it. As shown in FIG. 3, the number of nodes is the number of links in the chain connecting web pages of the requested website to web pages of another site. For example, web page a3 of site A connects to site C through the illustrated web page c1, so one node exists; web page d4 of site D connects to site B through web page a2 (site A) and web page b2 (site B), so there are two nodes.
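The node count can be read as the number of link hops from the requested page to the first page of another site, which a breadth-first search computes directly. The sketch below rests on that reading; the graph `page_links` is a hypothetical set of edges chosen to resemble the FIG. 3 discussion, and `site_of` and `node_counts` are illustrative names.

```python
from collections import deque

# Hypothetical page-level link graph (page -> pages it links to).
page_links = {
    "a3": ["c1", "d1"],
    "c1": ["d4"],
    "d4": ["a2"],
    "a2": ["b2"],
    "d1": [],
    "b2": [],
}


def site_of(page):
    """Assumption: the first letter of a page ID names its site."""
    return page[0].upper()


def node_counts(start_page, links):
    """Return, for each other site, the fewest hops needed to reach it."""
    hops = {start_page: 0}
    best = {}
    queue = deque([start_page])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, []):
            if nxt in hops:
                continue  # already reached by a path that is no longer
            hops[nxt] = hops[page] + 1
            site = site_of(nxt)
            if site != site_of(start_page) and site not in best:
                best[site] = hops[nxt]
            queue.append(nxt)
    return best


counts = node_counts("a3", page_links)
print(counts)  # {'C': 1, 'D': 1, 'B': 4}
```

Because breadth-first search visits pages in order of increasing hop count, the first page found for each site gives that site's minimum node count.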

Having determined how the web page of the requested website selected by the user is connected to other (related) sites, the link information clustering module (205) assigns, in step S405, a weight corresponding to the number of nodes. The weights may be set by an administrator. The weight varies inversely with the number of nodes: when the pages of two sites are connected through a single node, the highest weight is assigned, and the weight decreases as the number of nodes increases.

Then, in step S407, the link information clustering module (205) calculates the similarity of each site based on the weights for the node counts. Suppose, for example, that a weight of 10 is assigned for one node, 8 for two nodes, 5 for three nodes, and 3 for four nodes. As shown in FIG. 3, one node is used to connect site A and site C, so site C receives a weight of 10 relative to site A. That is, web page a3 of site A reaches site C through web page c1 of site C, so one node is used and a weight of 10 is assigned.

By contrast, for web page a3 of site A to reach site B, it must pass through web pages c1, d4, a2, and b2, so four nodes are used, and the weight of site B relative to site A is 3. In the same way, web page a3 of site A reaches site D through d1, so one node is needed and the corresponding weight of 10 is assigned. Consequently, sites D and C are judged to be highly similar to site A, and site B to have low similarity.
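The weighting in this example amounts to a lookup from node count to weight. A minimal sketch follows, using the example values 10, 8, 5, and 3 from the text; the names `NODE_WEIGHTS` and `weight_for` are hypothetical, and returning 0 for node counts beyond the table is an assumption, since the text does not say how longer chains are treated.

```python
# Example weights from the description: node count -> weight.
NODE_WEIGHTS = {1: 10, 2: 8, 3: 5, 4: 3}


def weight_for(node_count, table=NODE_WEIGHTS):
    """Return the weight for a node count.

    Assumption: node counts not listed in the table get weight 0.
    """
    return table.get(node_count, 0)


# Node counts from page a3 of site A, as stated in the text.
nodes_from_a3 = {"B": 4, "C": 1, "D": 1}
weights = {site: weight_for(n) for site, n in nodes_from_a3.items()}
print(weights)  # {'B': 3, 'C': 10, 'D': 10}
```

Keeping the table as data rather than code reflects the statement that the weights may be set by an administrator.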

Sites D and C, judged highly similar to web page a3 of site A, were evaluated on the basis of the shortest node path. Site D, however, can be reached by two or more paths, for example through d1 or through d4 and c1. The number of paths available between sites can of course also be given a weight and contribute to the similarity judgment. For example, a multiplier of 1 may be applied when there is one path, 1.2 when there are two paths, and 1.4 when there are three.

Applying this, sites D and C both received the same initial weight of 10, but with the second weighting site D, which has two paths, obtains a weight of 12, whereas site C, which has one path, keeps its weight of 10. Site D can therefore be set as the site most similar to web page a3 of site A.
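The two-stage weighting can be written as a product of the node weight and a path-count multiplier. The sketch below uses the multipliers 1, 1.2, and 1.4 from the text; `PATH_MULTIPLIERS` and `similarity` are illustrative names, and capping the multiplier at the largest listed value for more than three paths is an assumption not stated in the source.

```python
# Example multipliers from the description: path count -> multiplier.
PATH_MULTIPLIERS = {1: 1.0, 2: 1.2, 3: 1.4}


def similarity(node_weight, path_count):
    """Scale the node weight by the multiplier for the number of paths."""
    if path_count < 1:
        raise ValueError("a reachable site has at least one path")
    # Assumption: path counts above the table reuse the largest multiplier.
    multiplier = PATH_MULTIPLIERS.get(path_count, max(PATH_MULTIPLIERS.values()))
    return node_weight * multiplier


# (node weight, number of paths) for each site, as described in the text.
inputs = {"C": (10, 1), "D": (10, 2)}
scores = {site: similarity(w, p) for site, (w, p) in inputs.items()}
most_similar = max(scores, key=scores.get)
print(most_similar)  # D
```

With these inputs site D scores 10 × 1.2 = 12 and site C scores 10 × 1 = 10, which reproduces the ordering given in the text.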

Once the similarity has been determined, the link information clustering module (205) performs clustering based on the result in step S409. Any one of the k-means algorithm based on vector modeling, the Lingo algorithm, a fuzzy genetic algorithm, or the cosine, Euclidean, Jaccard, or Dice coefficients may be applied for the clustering, and these may be used selectively as needed to produce the most suitable grouping of sites. Clustering is a well-known technique, and a detailed description is omitted so as not to obscure the gist of the present invention.
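For reference, three of the coefficients named above have simple set-based forms. The sketch below assumes each site is represented by the set of sites it links to, which is one possible representation and not something the disclosure specifies; the function names and the sample sets are illustrative.

```python
import math


def jaccard(a, b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a, b):
    """Cosine similarity of binary membership vectors: |A ∩ B| / sqrt(|A||B|)."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


# Hypothetical outbound-link sets for two sites.
links_x = {"B", "C", "D"}
links_y = {"B", "D", "E"}

print(jaccard(links_x, links_y))  # 0.5
print(round(dice(links_x, links_y), 3))  # 0.667
print(round(cosine(links_x, links_y), 3))  # 0.667
```

Any of these values could serve as the pairwise similarity fed into a clustering algorithm such as k-means; which measure suits a given set of sites is left open by the text.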

In step S411, the information storage module (203) stores the site information gathered through the clustering procedure, and the information match module (201) reconstructs the web page of the requested website together with the web pages of the clustered related websites and provides the result to the user terminal. Here, in response to a request from the user computer, the information match module (201) presents the information provided by the Internet information-providing website (search results, link-structure content, advertisements, etc.) and, together with it, reconstructs the related website information so that the information the user wants is provided at once, which can raise satisfaction with the service.
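The reconstruction step is described only in functional terms. One simple way to picture it, offered here as an assumption rather than as the disclosed mechanism, is to append a block of related-site links to the requested page's HTML; `add_related_sites`, the `related-sites` class name, and the example URL are all hypothetical.

```python
from html import escape


def add_related_sites(page_html, related):
    """Insert a list of related-site links just before </body>.

    `related` is a list of (title, url) pairs, already ordered by similarity.
    If the page has no </body> tag, the block is appended at the end.
    """
    items = "".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in related
    )
    block = f'<div class="related-sites"><ul>{items}</ul></div>'
    marker = "</body>"
    if marker in page_html:
        return page_html.replace(marker, block + marker, 1)
    return page_html + block


page = "<html><body><p>Requested page</p></body></html>"
result = add_related_sites(page, [("Site D", "http://site-d.example/d1")])
print(result)
```

Titles and URLs are escaped before insertion so that text from third-party sites cannot inject markup into the requested page.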

Meanwhile, in the present invention, the related sites clustered on the basis of link information between websites are presented automatically in descending order of similarity to the information providing site. However, among those clustered related sites, a website that wants preferential exposure on the information providing site may apply to the provider of the link-information-based related site service for top registration, and can thereby be exposed ahead of its similarity ranking. For an information providing website with many users, this can be conceived as an advertising business model in which related websites raise their advertising effect on users by applying for preferential exposure.
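The ordering rule described above can be sketched as follows. This is an illustrative assumption rather than the patent's implementation: `similarity` is a hypothetical mapping from related site to similarity score, and `priority_registered` is a hypothetical list of sites that applied for top registration.

```python
def order_related_sites(similarity, priority_registered):
    """Return related sites with priority-registered sites ahead of the rest."""
    # Only sites that actually belong to the cluster can be promoted.
    promoted = [s for s in priority_registered if s in similarity]
    # Remaining sites keep the default order: descending similarity.
    rest = sorted(
        (s for s in similarity if s not in promoted),
        key=lambda s: similarity[s],
        reverse=True,
    )
    return promoted + rest

similarity = {"a.example": 0.9, "b.example": 0.7, "c.example": 0.4}
ordered = order_related_sites(similarity, ["c.example"])
```

With no registrations the function falls back to the pure similarity order, which matches the default behaviour the paragraph describes.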

FIG. 5 is a block diagram of an advertisement providing system using the clustering system according to the present invention. As shown, the system includes an ISP system 501 that calculates a clustering result for the requested web page requested from the user terminal 103 and changes or modifies the exposure ranking of the related web pages or advertisement information determined from the clustering result according to the similarity priority registration of an advertiser terminal 505.

Here, the similarity priority registration of the advertiser terminal 505 presupposes that the advertisement information of that advertiser terminal 505 already holds some exposure ranking; naturally, entirely unrelated advertisement information cannot apply for priority registration. Accordingly, advertisement information for which similarity priority registration is requested preferably holds a ranking within a predetermined range according to the clustering result. For example, if the predetermined range is set to the top five ranks and the clustering result for the advertisement information of the advertiser terminal 505 requesting priority registration is sixth, that information cannot be registered as first priority. This is to maintain the reliability of a service that is provided on the basis of similarity to the web page or advertisement information.
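The eligibility condition in the example above (a cutoff at the top five ranks) can be sketched as below. The function names, the score mapping and the default cutoff of 5 are assumptions made for illustration only.

```python
def similarity_rank(site, similarity):
    """1-based rank of a site by descending similarity, or None if not clustered."""
    if site not in similarity:
        return None
    ordered = sorted(similarity, key=similarity.get, reverse=True)
    return ordered.index(site) + 1

def can_register_priority(site, similarity, max_rank=5):
    """A site may request priority registration only if it ranks within max_rank."""
    rank = similarity_rank(site, similarity)
    return rank is not None and rank <= max_rank

# Hypothetical clustering result with six related sites.
scores = {"s1": 0.9, "s2": 0.8, "s3": 0.7, "s4": 0.6, "s5": 0.5, "s6": 0.4}
```

Here `can_register_priority("s6", scores)` is false because the site ranks sixth, mirroring the example in the text, and a site absent from the cluster is rejected outright.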

Meanwhile, when two or more pieces of advertisement information request similarity priority registration with the ISP system 501, the ISP system 501 may run an online auction among them. That is, exposing all advertisement information that holds a similarity priority within the predetermined range, simply because registration was requested, could undermine the reliability of the service as described above, so an online auction is held to select one of them. More than one piece of advertisement information may be selected by the online auction as needed, without departing from the gist of the present invention.
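The text does not specify the auction rule, so the following sketch assumes the simplest one: the highest bids win, with ties broken by advertisement identifier. The `bids` mapping and the `slots` parameter are hypothetical names introduced for illustration.

```python
def run_auction(bids, slots=1):
    """Pick winning advertisements from {ad_id: bid_amount}.

    Assumption: highest bid wins; ties are broken by ad_id so the
    result is deterministic. `slots` allows more than one winner.
    """
    ranked = sorted(bids.items(), key=lambda item: (-item[1], item[0]))
    return [ad_id for ad_id, _ in ranked[:slots]]

# Hypothetical bids from advertisers that all passed the ranking check.
bids = {"ad1": 100, "ad2": 250, "ad3": 180}
winner = run_auction(bids)
top_two = run_auction(bids, slots=2)
```

Only advertisements that already satisfy the ranking condition would be entered into `bids`; the `slots` parameter covers the case where several advertisements are selected.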

In addition to collecting related websites or advertisement information by clustering, the ISP system 501 may also collect information by content matching. To this end, the ISP system 501 has a content match service system 507, which is described in detail in the "Online Content Match System" application filed by the same applicant on June 10, 2008. The online content match system does not compute the inter-site nodes used for similarity measurement in the present invention; instead, it extracts related sites or advertisement information by packet analysis and provides them to the user terminal 103. As a result, related information obtained by packet analysis can be provided at the same time as related web pages having similarity.

FIGS. 6 to 9 show the results of a user's requested web page according to embodiments of the present invention. First, as shown in FIG. 6, a bar-shaped related information area is configured at the top of the web browser, separately from the content page layout of the website, to provide related site information to site visitors. Because it runs in the web browser separately from the website's page layout, the information can be provided without changing the layout of the content page.

In addition, as shown in FIG. 7, related site information may be provided as a slide-out panel on the left side of the web browser. As with the top bar, it is provided in the browser separately from the website's page layout, so the user can close the information area at will, which makes it more convenient to obtain information from the website. Further, as shown in FIG. 8, related site information may be presented as 'shortcuts' at the top of the website content page; by offering the related site information closest to the site's content and inducing the user to click it, an effect of providing additional content can be expected.

Meanwhile, as shown in FIG. 9, related information may be placed at the bottom of the website content page, so that additional related site information is provided without interfering with acquisition of the original information. This makes possible a service in which users can quickly obtain related information, which naturally improves the convenience of and satisfaction with the service.

As described above, the method for providing related site information using link information according to the present invention, and the advertising system using it, provide information on the website a user visits together with related website information, thereby making the use of online information more efficient. On this basis, Internet business can be invigorated through diversification of Internet information services and channels, so the invention has high industrial value.

FIG. 1 is a block diagram of a site information providing system according to the present invention.

FIG. 2 is a block diagram illustrating the main functions of the clustering server system of FIG. 1.

FIG. 3 is a diagram illustrating a method for measuring similarity between sites according to the present invention.

FIG. 4 is a flowchart illustrating the main operations according to the present invention.

FIG. 5 is a block diagram of an advertising system using the method for providing related site information using link information according to the present invention.

FIGS. 6 to 9 show web pages according to embodiments of the present invention.

<Explanation of reference numerals for the main parts of the drawings>

101: clustering server system
103: user terminal
105: information providing website
201: information matching module
203: information storage module
205: link information clustering module
207: link information collection module
501: ISP system
505: advertiser terminal
507: content match service system

Claims

A system for automatically providing related site information online, comprising:

a clustering server system that collects and analyzes the link information included in pages of an information providing website requested from a user computer, measures the similarity between websites by clustering based on the analysis of the link information, and provides related website information relevant to the requested website information requested from the user computer.

The system of claim 1,

wherein, when the user computer accesses an information providing website and searches for or browses desired requested website information, the requested website information and the related website information collected in connection with it are transmitted to and received by the user computer in units.

The system of claim 1,

wherein the related website information is provided together with the requested website information requested by the user, in a space obtained by partitioning or dividing part of the space of the requested website.

The system of claim 1,

wherein the clustering server system includes: a link information collection module for collecting link information included in a content or advertisement information page provided by an Internet information providing website at the request of the user computer;

a link information clustering module for analyzing the link structure of the site link information collected by the link information collection module, identifying the link relationships among a plurality of information providing sites, measuring the similarity between websites based on the link steps, and classifying or grouping the related website information;

an information storage module for storing, in a database, the similarity information, site weight information, and the like extracted by the link information clustering module; and

an information matching module for reconstructing the requested website information corresponding to the request information of the user computer together with the related website information extracted based on the weight information set in the information storage module, and transmitting them to the user terminal.

The system of claim 4,

wherein the similarity measurement performed in the link information clustering module is based on the number of nodes interconnecting the requested website selected by the user and a website linked to it as a related site.

A method for automatically providing related site information online, the method comprising:

a) collecting connection status information between pages of other websites based on web page information of a requested website selected by a user;

b) tracking the number of nodes between the web page of the requested website and the web pages of related websites linked to it;

c) assigning each related website a weight corresponding to its number of nodes;

d) calculating, for each related website, a degree of similarity to the web page of the requested website according to the weight; and

e) performing clustering based on the result of the similarity calculation, and reconstructing the web page by adding the clustering result to the web page of the requested website.
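Steps b) through d) can be illustrated with a minimal sketch. It assumes a hypothetical adjacency mapping `links` from each site to the sites it links to, counts nodes as breadth-first hop distance, and uses the reciprocal of that distance as the weight; the claim does not fix a weighting formula, so the reciprocal is an assumption chosen only because closer sites then score higher.

```python
from collections import deque

def hop_counts(links, start):
    """Breadth-first search: fewest link hops from `start` to each reachable site."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        site = queue.popleft()
        for nxt in links.get(site, []):
            if nxt not in dist:
                dist[nxt] = dist[site] + 1
                queue.append(nxt)
    return dist

def link_similarity(links, start):
    """Assumed weighting: similarity = 1 / hops, so nearer sites rank higher."""
    return {s: 1.0 / d for s, d in hop_counts(links, start).items() if d > 0}

# Hypothetical link graph: A links to B and C, both of which link to D.
links = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
sims = link_similarity(links, "A")
```

The resulting scores could then be passed to the clustering of step e); a different monotonically decreasing weight would serve equally well in this sketch.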

The method of claim 6,

wherein step c) further assigns a weight corresponding to the number of accessible paths between the web page of the requested website and the related website.

The method of claim 6,

wherein the clustering applies any one of the k-means algorithm, the Lingo algorithm, a fuzzy genetic algorithm, and the Cosine, Euclidean, Jaccard, or Dice coefficients, based on vector modeling.

The method of any one of claims 6 to 8,

wherein the web page is at least one of a search result, link-structure content, and advertisement information.

The method of claim 6,

wherein the web page reconstruction of step e) provides a bar-shaped space on the web page of the requested website and displays the information on the related websites in that space.

The method of claim 6,

wherein the web page reconstruction of step e) slides the information on the related websites in from one side of the web page of the requested website.

The method of claim 6,

wherein the web page reconstruction of step e) displays the information on the related websites on the web page of the requested website in the form of a 'shortcut'.

An advertising system using related site information online, comprising:

an ISP system that calculates a clustering result for a requested web page requested from a user terminal and changes the exposure ranking of the related web pages or advertisement information determined from the clustering result according to a similarity priority registration by an advertiser terminal.

The system of claim 13,

wherein the advertisement information for which the advertiser terminal requests similarity priority registration holds a ranking within a predetermined range according to the clustering result.

The system of claim 13,

wherein, when two or more pieces of advertisement information request similarity priority registration with the ISP system, the ISP system runs an online auction among them.

The system of claim 13,

wherein the ISP system collects related websites or advertisement information by clustering and, in parallel, collects information by content matching, which extracts related sites or advertisement information using packet analysis.