KR20080048128A

KR20080048128A - Web page search order decision method and the system

Info

Publication number: KR20080048128A
Application number: KR1020060118063A
Authority: KR
Inventors: 주세홍
Original assignee: 주식회사 한랭크닷컴
Priority date: 2006-11-28
Filing date: 2006-11-28
Publication date: 2008-06-02

Abstract

A method and system for determining web page search rankings are provided. A sample population of Internet users is extracted, the importance of each result for a search keyword is determined from the website and web page URLs (Uniform Resource Locators) collected by agent programs installed on the terminals of users in that population, and that importance is automatically reflected in the ranking of search results, so that search results are more accurate and practical. A plurality of agent programs (110a-110n), each installed on a user terminal, transmit URL data of visited websites and web pages (180) to a URL database server (140) over the Internet (120). The URL database server stores the received URL data and transmits it to a crawling server (150) and a ranking database server (170). The crawling server removes duplicate URLs from the received URL data, connects to the visited websites and web pages to crawl them, and transfers the crawled web pages to an indexing server (160). The indexing server indexes the crawled web pages into a database. The ranking database server calculates and stores the ranking of the web pages by analyzing the degree of duplication of each URL in the received URL data.

Description

Web page search order decision method and the system

FIG. 1 is a schematic diagram showing the system configuration and network connections for carrying out a web page search ranking method according to an embodiment of the present invention.

FIG. 2 is a flowchart of one embodiment of the process, within the web page search ranking method according to the present invention, of calculating and storing web page rankings based on the web surfing traffic of Internet users.

FIG. 3 is a flowchart of one embodiment of the process, within the web page search ranking method according to the present invention, by which search results are derived when an Internet user searches for information.

<Description of reference numerals for main parts of the drawings>

110: Agent program
120: Internet network
130: Search system
140: URL DB server
150: Crawling server
160: Indexing server
170: Ranking DB server
180: Web page (web site)

The present invention relates to an Internet information retrieval method, and more particularly to a method and system for determining web page search rankings in which URL information of web pages actually visited by Internet users is automatically reflected in the priority of search results when a user enters a search term, so that more accurate and practical search results are obtained.

In today's rapidly changing information society, advances in information and communication technology have made the amount of information available to Internet users almost too vast to handle. To cope with this, Internet information is retrieved with technologies such as Internet search engines. In conventional Internet information retrieval, a search engine program (robot) collects (crawls) and stores WWW web pages, classifies them into a database, and presents search results according to the search engine's own methodology. Such retrieval is broadly classified into keyword-type, directory-type, and meta-type Internet information retrieval.

In general, conventional keyword-type information retrieval builds a database of specific phrases from web pages distributed across the Internet and searches it for the information entered by the user. Commercial services using this approach include AltaVista, Lycos, Naver, and Simmani (simmany).

Directory-type Internet information retrieval classifies web pages distributed across the Internet into a number of predefined directories and searches them for the information entered by the user. Commercial services using this approach include Yahoo and Galaxy.

Meta-type Internet information retrieval refers to an information retrieval system that queries several other information retrieval systems at once, based on the information entered by the user. Commercial services using this approach include SavvySearch (savvysearch), MetaSearch (matasearch), and 미스다찾니 (mochanni).

However, the conventional Internet information retrieval methods described above assign priority according to criteria such as per-site meta information, sponsored links, or word order, regardless of how often Internet users actually visit each page. To identify and extract the information they really need, users therefore had to examine a large volume of results, which cost considerable additional time and effort.

To address this problem, methods that list web sites in order of usage frequency, or rank them by evaluating their content, have recently become widespread. Representative examples are the panel research method, which ranks web sites by applying criteria such as access frequency, connection duration, and average access time obtained through log file analysis known as web mining, and the log file analysis method, which analyzes logs recording the activity of Internet users within a particular web site.

However, ranking methods such as panel research and log file analysis suffer from recommendation bombing (spam) caused by concentrated voting from particular groups and from duplicate voting across different knowledge domains, so the resulting rankings lack reliability and accuracy.

The present invention was devised to solve these problems. Its object is to provide a method and system for determining web page search rankings that extract a statistical sample population from among Internet users, collect the URLs of visited web sites and web pages through an Agent program installed on the terminals of the sampled users, determine from those URLs the importance of each result for a search keyword, and automatically reflect that importance in the priority of search results when a user enters a search term, so that more accurate and practical search results are obtained.
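The text does not say how the statistical sample population is drawn. As a minimal sketch, assuming simple random sampling without replacement, the selection of users who would be asked to install the Agent program could look like this. The function name `draw_sample`, the user identifiers, and the fixed seed are all hypothetical.

```python
import random
from typing import List, Sequence


def draw_sample(user_ids: Sequence[str], sample_size: int, seed: int = 0) -> List[str]:
    """Pick `sample_size` distinct users uniformly at random.

    Simple random sampling is an assumption; the source only says that a
    statistical sample population is extracted.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(list(user_ids), sample_size)


# Hypothetical population of 100 users, from which 10 are sampled.
users = [f"user-{i}" for i in range(100)]
sample = draw_sample(users, 10)
```

A real deployment would more likely stratify the sample (for example by age or region), which this sketch does not attempt.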

To achieve this object, a web page search ranking method according to the present invention comprises: installing an Agent program after obtaining consent from an Internet user; extracting URL data of the web sites actually visited by users; removing duplicates from the received URL information; retrieving web pages based on the deduplicated URL information; storing the retrieved web pages in a DB and indexing them; re-sorting the web pages by number of visits, based on the full URL information of the pages visited by users; and, when a user searches for information, outputting search results in descending order of visitor count among the web pages matching the search keyword.

To achieve this object, a web page search ranking system according to the present invention comprises: a plurality of Agent programs installed on user terminals, which transmit URL data of the web sites and web pages visited by users to a URL DB server via the Internet; the URL DB server, which stores the URL data received from the Agent programs and transmits the stored URL data to a Crawling server and a ranking DB server; the Crawling server, which removes duplicate URLs from the URL data received from the URL DB server, accesses the web sites and web pages visited by users via the Internet, crawls those web pages, and transmits the crawled data to an Indexing server; the Indexing server, which indexes the web pages received from the Crawling server into a database; and the ranking DB server, which calculates and stores web page rankings by analyzing the degree of duplication of each URL based on the URL data received from the URL DB server.

Hereinafter, a most preferred embodiment of the present invention is described in detail with reference to the accompanying drawings. Note that the following description covers only the parts necessary to understand the operation of the present invention; description of other parts is omitted so as not to obscure the gist of the invention.

FIG. 1 is a schematic diagram showing the system configuration and network connections for carrying out a web page search ranking method according to an embodiment of the present invention.

Referring to FIG. 1, a plurality of Agent programs (110a to 110n) installed on users' client PCs connect over the Internet to the search system (130) to which the present invention is applied and transmit the users' web surfing URL data. The search system (130) contains a URL DB server (140), a Crawling server (150), an Indexing server (160), and a ranking DB server (170), and uses the actual web surfing URL data received from the Agent programs (110a to 110n) to provide a more accurate and practical search result service that reflects user trends.

According to the embodiment, the plurality of Agent programs (110a to 110n) installed on users' client PCs transmit URL data of the web sites and web pages visited by the users to the URL DB server (140) via the Internet.
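The source does not specify the format of the URL data an Agent program sends. As a minimal sketch, assuming a JSON payload, one visit could be encoded as follows. The `VisitRecord` type, its field names, and the example values are hypothetical, and the network transport is deliberately left out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VisitRecord:
    agent_id: str    # hypothetical identifier of the installed Agent program
    url: str         # URL of the web page the user visited
    visited_at: str  # hypothetical ISO 8601 timestamp of the visit


def encode_visit(record: VisitRecord) -> str:
    """Serialize one visit as a JSON string for the URL DB server."""
    return json.dumps(asdict(record), sort_keys=True)


def decode_visit(payload: str) -> VisitRecord:
    """Rebuild the record on the receiving side."""
    return VisitRecord(**json.loads(payload))


record = VisitRecord("agent-110a", "http://example.com/page", "2006-11-28T09:00:00Z")
payload = encode_visit(record)
```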

The URL DB server (140) stores the URL data received from the Agent programs (110a to 110n) and transmits the stored URL data to the Crawling server (150) and the ranking DB server (170).
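The store-and-forward role of the URL DB server can be sketched as a small in-memory class. This is only an illustration under stated assumptions: the class name `UrlDbServer`, the callback-based forwarding, and the two lists standing in for the Crawling server and ranking DB server inputs are all hypothetical.

```python
from typing import Callable, List


class UrlDbServer:
    """Stores every reported URL and forwards it to each registered consumer."""

    def __init__(self, consumers: List[Callable[[str], None]]) -> None:
        self.stored: List[str] = []
        self._consumers = consumers

    def receive(self, url: str) -> None:
        # Keep every visit, duplicates included: the ranking side needs them.
        self.stored.append(url)
        for consume in self._consumers:
            consume(url)


# Stand-ins for the inputs of the Crawling server and the ranking DB server.
crawl_queue: List[str] = []
rank_input: List[str] = []

server = UrlDbServer([crawl_queue.append, rank_input.append])
for visited in ["http://a.example/", "http://b.example/", "http://a.example/"]:
    server.receive(visited)
```

Duplicates are kept at this stage on purpose: the Crawling server removes them before fetching, while the ranking DB server counts them.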

The Crawling server (150) removes duplicate URLs from the URL data received from the URL DB server (140), accesses the web sites and web pages visited by users via the Internet, crawls those web pages, and transmits the crawled data to the Indexing server (160).
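The deduplicate-then-crawl behaviour described above can be sketched as follows. It assumes that two URLs are duplicates only when their strings match exactly (no URL normalization), and it takes the page fetcher as a parameter so the example runs without network access. The names `deduplicate`, `crawl`, and `fake_fetch` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def deduplicate(urls: Iterable[str]) -> List[str]:
    """Drop repeated URLs, keeping the first occurrence of each."""
    return list(dict.fromkeys(urls))


def crawl(urls: Iterable[str], fetch: Callable[[str], str]) -> Dict[str, str]:
    """Fetch each distinct URL once and map it to the page content."""
    return {url: fetch(url) for url in deduplicate(urls)}


fetched: List[str] = []


def fake_fetch(url: str) -> str:
    # Stand-in for an HTTP request; records which URLs were requested.
    fetched.append(url)
    return f"<html>content of {url}</html>"


visited = ["http://a.example/", "http://b.example/", "http://a.example/"]
pages = crawl(visited, fake_fetch)
```

Each distinct page is requested only once, however many users visited it.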

The Indexing server (160) indexes the web pages received from the Crawling server (150) and stores them in a database.
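The source does not describe the index structure. A common choice, assumed here, is an inverted index that maps each term to the set of URLs containing it. The tokenizer (lowercasing and splitting on whitespace) and the function name `build_index` are hypothetical simplifications.

```python
from collections import defaultdict
from typing import Dict, Set


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercase term to the URLs of the pages that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return dict(index)


# Hypothetical crawled pages: URL -> extracted text.
pages = {
    "http://a.example/": "Patent search ranking",
    "http://b.example/": "web search engine",
}
index = build_index(pages)
```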

The ranking DB server (170) calculates and stores web page rankings by analyzing the degree of duplication of each URL, based on the URL data received from the URL DB server (140). That is, when an Internet user connects to a web server to which the present invention is applied and enters a search keyword, the web server extracts the web pages containing that keyword from the Indexing server (160), and the ranking DB server (170) calculates the user-visit duplication degree of each extracted page. When the search results are shown to the user, web sites or web pages are listed in descending order of duplication degree.
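Read literally, the duplication degree of a URL is the number of times it appears in the reported URL data. A minimal sketch of that count and the resulting ranking follows. Breaking ties alphabetically by URL is an assumption added only to make the order deterministic, and the function name `rank_pages` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_pages(visited_urls: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (url, visit count) pairs, most visited first."""
    counts = Counter(visited_urls)
    # Sort by descending count, then by URL to break ties (an assumption).
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


visit_log = [
    "http://a.example/", "http://b.example/", "http://a.example/",
    "http://c.example/", "http://a.example/", "http://b.example/",
]
ranking = rank_pages(visit_log)
```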

FIG. 2 is a flowchart of one embodiment of the process, within the web page search ranking method according to the present invention, of calculating and storing web page rankings based on the web surfing traffic of Internet users.

Referring to FIG. 2, when a user first connects over the Internet to the search system (130) to which the present invention is applied (S201), the search system (130) obtains the user's consent and installs an Agent program (110a to 110n) on the user's client PC (S202).

Thereafter, each time the user surfs the web, the Agent program (110a to 110n) transmits the URL data of the visited web pages to the URL DB server (140) (S203). The URL DB server (140) stores the URL data received from the Agent programs (110a to 110n) and at the same time transmits the stored URL data to the Crawling server (150) and the ranking DB server (170) (S204).

Next, the Crawling server (150) removes duplicates from the URL data received from the URL DB server (140), crawls the corresponding web pages, and transmits the crawled web pages to the Indexing server (160) (S205).

The Indexing server (160) then stores the crawled web pages in a database (S206), and the ranking DB server (170) calculates the degree of duplication of the URL data received from the URL DB server (140) to calculate and store the web page rankings (S207).

FIG. 3 is a flowchart of one embodiment of the process, within the web page search ranking method according to the present invention, by which search results are derived when an Internet user searches for information.

Referring to FIG. 3, when a user connects over the Internet to the search system (130) to which the present invention is applied and enters a search keyword (S301), the search system (130) extracts, from all the web pages stored in the Indexing server (160), the web pages that contain the search keyword (S302). Considering only the extracted web pages, it then orders them by visitor count, that is, by URL duplication degree, from highest to lowest (S303).

Next, the search system (130) outputs to the user's client PC the web pages containing the entered search keyword, in order starting from the page with the highest URL duplication degree, so that the user obtains more accurate search results (S304).
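Steps S301 to S304 can be sketched as a filter followed by a sort. The sketch assumes a case-insensitive substring match for "contains the search keyword" and breaks ties alphabetically by URL; neither detail is stated in the source, and the function name `search` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def search(keyword: str, pages: Dict[str, str], visit_log: List[str]) -> List[str]:
    """Return URLs of pages containing `keyword`, most visited first."""
    counts = Counter(visit_log)  # URL duplication degree
    needle = keyword.lower()
    # S302: keep only the pages whose text contains the keyword.
    matches = [url for url, text in pages.items() if needle in text.lower()]
    # S303: order the matching pages by descending visit count.
    return sorted(matches, key=lambda url: (-counts[url], url))


# Hypothetical indexed pages and reported visits.
pages = {
    "http://a.example/": "cheap flights to Seoul",
    "http://b.example/": "Seoul travel guide",
    "http://c.example/": "cooking recipes",
}
visit_log = [
    "http://b.example/", "http://a.example/", "http://b.example/",
    "http://c.example/", "http://c.example/", "http://c.example/",
]
results = search("seoul", pages, visit_log)
```

Here `http://c.example/` has the most visits overall but is excluded because it does not contain the keyword, which reflects the "extracted web pages only" condition in S303.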

The present invention can also be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples include ROM, RAM, CD-ROM, DVD-ROM, magnetic tape, hard disk, floppy disk, flash memory, and optical data storage devices. The computer-readable recording medium can also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

Although specific embodiments have been described in the detailed description above, various modifications are of course possible without departing from the scope of the present invention. The scope of the present invention should therefore not be limited to the described embodiments, but should be defined by the claims below and their equivalents.

As described above, the present invention sorts the web pages matching a search keyword according to the number of times Internet users have actually visited them, so that the ordering reflects a practical and accurate priority. The user can thus find the needed information first among the many results returned, which minimizes wasted time and effort.

In addition, according to the present invention, the Internet usage patterns of Internet users can be analyzed by gender, age, and region, which makes a quantitative evaluation of a particular web site possible.
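The source does not explain how gender, age, or region would be attached to the URL data. Assuming each visit arrives already labelled with a segment, a per-site breakdown could be sketched as below. The `Visit` tuple layout, the segment labels, and the function name `site_visits_by_segment` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (segment label, visited site); how the label is obtained is not specified.
Visit = Tuple[str, str]


def site_visits_by_segment(visits: Iterable[Visit], site: str) -> Dict[str, int]:
    """Count visits to `site` separately for each user segment."""
    counts = Counter(segment for segment, visited in visits if visited == site)
    return dict(counts)


visits = [
    ("20s", "news.example"),
    ("20s", "news.example"),
    ("30s", "news.example"),
    ("30s", "shop.example"),
]
profile = site_visits_by_segment(visits, "news.example")
```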

Claims

A web page search method performed in a web page search system that provides web page search results corresponding to a search keyword entered from a user terminal connected via the Internet, the method comprising:

installing an Agent program after obtaining consent from an Internet user;

extracting URL data of web sites actually visited by users;

removing duplicates from the received URL information;

retrieving web pages based on the deduplicated URL information;

storing the retrieved web pages in a DB and indexing them;

re-sorting the web pages by number of visits, based on the full URL information of the pages visited by the users; and

when a user searches for information, outputting search results in descending order of visitor count among the web pages matching the search keyword.

A web page search ranking system that provides web page search results corresponding to a search keyword entered from a user terminal connected via the Internet, the system comprising:

a plurality of Agent programs (110a to 110n) installed on user terminals, which transmit URL data of the web sites and web pages visited by users to a URL DB server (140) via the Internet;

the URL DB server (140), which stores the URL data received from the Agent programs (110a to 110n) and transmits the stored URL data to a Crawling server (150) and a ranking DB server (170);

the Crawling server (150), which removes duplicate URLs from the URL data received from the URL DB server (140), accesses the web sites and web pages visited by users via the Internet, crawls those web pages, and transmits the crawled data to an Indexing server (160);

the Indexing server (160), which indexes the web pages received from the Crawling server (150) into a database; and

the ranking DB server (170), which calculates and stores web page rankings by analyzing the degree of duplication of each URL based on the URL data received from the URL DB server (140).

A computer-readable recording medium having recorded thereon a program for executing:

a function of installing an Agent program after obtaining consent from an Internet user;

a function of extracting URL data of web sites actually visited by users;

a function of removing duplicates from the received URL information;

a function of retrieving web pages based on the deduplicated URL information;

a function of storing the retrieved web pages in a DB and indexing them;

a function of re-sorting the web pages by number of visits, based on the full URL information of the pages visited by users; and

a function of outputting, when a user searches for information, search results in descending order of visitor count among the web pages matching the search keyword.