KR101051422B1

KR101051422B1 - Query-type-specific autocompletion system and method with guaranteed search results, and recording medium storing the program

Info

Publication number: KR101051422B1
Application number: KR1020080105464A
Authority: KR
Inventors: 정한민; 이미경; 김평; 이승우; 박동인; 성원경
Original assignee: 한국과학기술정보연구원
Priority date: 2008-10-01
Filing date: 2008-10-27
Publication date: 2011-07-25
Also published as: KR20100037512A

Abstract

Disclosed are an autocompletion system and method, organized by query type, in which search results are guaranteed, together with a recording medium storing the corresponding program. In a search system that accepts queries, only queries that are guaranteed to return search results are offered as autocompletions. In particular, for an input query the system presents the list and types of queries for which search results exist, and it responds in real time to the addition and deletion of document information. This raises the reliability of search results, prevents failed searches, speeds up searching, and presents information grouped by type in an improved form.

Autocomplete, search, document information, index term, frequency, query, database, server, topic word

Description

Query-type-specific autocompletion system and method with guaranteed search results, and recording medium storing the program

The present invention relates to a query search system. More specifically, it relates to an autocompletion system and method, organized by query type and with guaranteed search results, that presents for an input query the list and types of queries for which search results exist and that responds in real time to the addition and deletion of document information, and to a recording medium storing the corresponding program.

Modern society lives in an age of information overload: advances in data communication over computers, the Internet, and communication networks make large volumes of information, from previously published material to the latest releases, easy to access on either a paid or a free basis.

Such information includes text, sound, pictures, video, and multimedia; in general, the content of information is recorded as text.

As the volume of information has grown, devices for recording and managing it and techniques for finding the needed information quickly and accurately have become necessary. Such a technique is generally called a database (DB), and a computer dedicated to the database is called a server. Anyone can connect to the server remotely over a data communication network such as the Internet and easily search for and use the information they need.

The database described above can hold text, sound, pictures, video, and multimedia information, but it records mainly text in order to store and manage as much information as possible within a given storage capacity.

Text information comprises vocabulary that humans can understand and vocabulary, including programs, that computers can understand.

To retrieve the desired content quickly from the large volume of information in a database, people and computers need a standardized vocabulary that both can understand when exchanging information. Such a standardized vocabulary is called a collection of concepts, or an ontology. Semantic Web technology uses ontologies to selectively retrieve the desired information from the web, where large volumes of information are shared.

Information retrieval is generally needed to acquire specialized knowledge or to make accurate management judgments and decisions, and quickly mining the desired or required technical information from a large accumulated body of technical information has become a technical field in its own right.

A word or term entered to retrieve information from the database server is called a query (QUERY), an entity (ENTITY), or a recommendation; the term "query" is used below wherever possible.

Autocomplete display treats a partially entered query as if it were complete and displays the queries found for it.

For queries such as names, addresses, and titles that are entered repeatedly in a web browser or other data retrieval software, the autocomplete method displays a list of queries that were previously entered and searched, so that the user can enter a query quickly by selecting one from the list.

Autocomplete implemented with Asynchronous JavaScript and XML (AJAX), one of the Web 2.0 technologies, is widely applied in many fields, including portal sites and other websites, digital libraries, Enterprise Web 2.0, and specialized application programs.

Because user preference demonstrates its benefit to the user experience, autocomplete is expected to be used even more widely in future search interfaces.

However, the autocomplete methods offered so far list queries, drawn from stored log information or a built-in dictionary, that match the user's input. If the query the user wants does not appear near the top of the list, the user must scan the entire autocomplete list in order.

In addition, unlike portal sites that hold vast amounts of content, a private enterprise or a specific application field has relatively little content. Presenting an autocomplete list based on how often search terms are entered, even when a successful search cannot be guaranteed, lowers the reliability of the search function.

FIG. 1 is a functional block diagram of a system for retrieving information from a general database system.

The concept of retrieving needed information from a database server using text is described with reference to FIG. 1. A database (DB) server records and manages a large volume of varied text information.

To retrieve needed information from the text recorded in the database server, a query is entered at a computer terminal.

The computer terminal has a search program, which analyzes the entered query (entity) and searches the database server for index information corresponding to the query.

Multiple pieces of index information containing the query are retrieved and presented as a list; when one item is selected from the list, the corresponding information is retrieved and output to the computer terminal.

The retrieved information is used to gain deeper and broader knowledge, or as reference material for decisions and judgments by executives or managers.

The amount of information recorded and managed in database servers grows enormously as knowledge and science advance, so analyzing an input query and retrieving the desired information takes a long time.

FIG. 2 illustrates, by way of example, queries displayed by autocomplete as a query is entered for a data search.

In FIG. 2, the query “대” has been entered and the list of queries found by autocomplete is shown. The list is divided into forward (prefix) matches and backward (suffix) matches. FIG. 2 uses Korean input, but other languages such as English are equally possible.

The forward matches found by autocomplete for the input include 대한항공 (Korean Air), 대법원 (Supreme Court), 대성 (Daesung), and 대구은행 (Daegu Bank); the backward matches include 이용대 (Lee Yong-dae), 소녀시대 (Girls' Generation), and 단국대 (Dankook University).

Among the autocompleted queries, 대한항공 (Korean Air), 대구은행 (Daegu Bank), and 대한통운 (Korea Express) are classified as the institution type, 대한민국 (Republic of Korea) as the country type, 대성 (Daesung) and 이용대 (Lee Yong-dae) as the person type, and 소녀시대 (Girls' Generation) as the group type.
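The type classification described for FIG. 2 can be pictured as grouping the suggestion list by an entity type label. The following is a minimal Python sketch under stated assumptions: the `ENTITY_TYPES` mapping and the `group_by_type` helper are hypothetical names introduced here, and the labels simply mirror the examples in the text rather than any data structure the patent specifies.

```python
from collections import defaultdict

# Hypothetical lookup table; the type labels follow the examples in the text.
ENTITY_TYPES = {
    "대한항공": "institution",
    "대구은행": "institution",
    "대한통운": "institution",
    "대한민국": "country",
    "대성": "person",
    "이용대": "person",
    "소녀시대": "group",
}


def group_by_type(suggestions):
    """Group autocomplete suggestions by type, keeping their original order."""
    grouped = defaultdict(list)
    for term in suggestions:
        # Terms without a known type fall into an "unknown" bucket.
        grouped[ENTITY_TYPES.get(term, "unknown")].append(term)
    return dict(grouped)


if __name__ == "__main__":
    print(group_by_type(["대한항공", "대성", "소녀시대", "대구은행"]))
```

Grouping before display lets the interface show one block per type instead of a single flat list.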

When a user types a query on one of the many portal sites in Korea today, for example Naver (www.naver.com), the site looks up autocompleted queries containing the text entered so far and presents them as a list.

A query is completed as input continues, and even when only part of it has been entered the user can select the desired query from the autocomplete list, which shortens query entry time and is convenient.

For a company or a specific application field that, unlike a portal site, does not hold vast content, the relative scarcity of content means that search results for an entered query (that is, a successful presentation of results) cannot be guaranteed. Presenting an autocomplete list based merely on input frequency lowers the reliability of the feature.

FIG. 3 illustrates a prior-art example in which a query is entered and the search fails.

In FIG. 3, for example, a user has connected to the price comparison site 'BestBuyer' (www.bb.co.kr) and entered the query '아피나' (Apina). For this input, '아피나' and '아피나 식탁' (Apina table) are found and displayed as autocompletions.

The search of index information for these queries, however, is shown as failed: searching for either suggested query, '아피나' or '아피나 식탁', returns no results. One possible cause is that the corresponding information was deleted because of, for example, poor product sales, exhausted stock, or an expired validity period.

Whatever the cause, from the user's point of view the failed search lowers the reliability of the system.

As one prior-art example, A. Bangalore, A. Browne, and G. Divita, “UMLSKS SUGGEST: An Auto-complete Feature for the UMLSKS Interface Using AJAX”, In Proceedings of AMIA, 1106, describes an autocomplete feature for the UMLSKS interface that presents only queries that have produced successful search results.

This prior art sets a flag (FLAG) when a query search succeeds and uses the flag to decide whether the query is presented in the autocomplete list.

However, when content containing a query whose search previously failed is added later, that query is not presented in the autocomplete list until someone enters it and attempts the search again.
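The limitation of the flag-based approach can be seen in a toy model. The sketch below is an illustration of the behavior described in the text only, not code from UMLSKS SUGGEST; the class name `FlagBasedSuggest` and its methods are hypothetical, and substring matching over a list of strings stands in for a real search engine.

```python
class FlagBasedSuggest:
    """Toy model: a query becomes suggestable only after one successful search."""

    def __init__(self, documents):
        self.documents = list(documents)
        self.succeeded = set()  # the "flags": queries that have returned results

    def search(self, query):
        hits = [doc for doc in self.documents if query in doc]
        if hits:
            self.succeeded.add(query)  # flag is set only on success
        return hits

    def suggest(self, typed):
        return sorted(q for q in self.succeeded if q.startswith(typed))


if __name__ == "__main__":
    engine = FlagBasedSuggest(["semantic web portal"])
    engine.search("ontology")                 # fails, so no flag is set
    engine.documents.append("ontology engineering")
    print(engine.suggest("onto"))             # still empty: nobody has retried
    engine.search("ontology")                 # a retry succeeds and sets the flag
    print(engine.suggest("onto"))
```

In this model the suggestion list lags behind the content until a user happens to repeat the search, which is the gap the invention targets by deriving suggestions from indexed documents.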

M. Takasi et al., “Auto Complete Method for Web Application Form Based on Term Hierarchy”, In Proceedings of the Annual Conference on JSAI (in Japanese), 1106, also concerns autocomplete. It supports format conversion so that query lists can be used interchangeably across different applications, but it does not provide an autocomplete list with guaranteed search results; its scope is therefore quite different and it is not relevant here.

Kwang-Jo Lee, Jin-Woo Song, Jung-Seok Han, and Sung-Bong Yang, “Location-Based Query Recommendation System for Mobile Terminals”, Korean Institute of Information Scientists and Engineers (한국정보과학회) Fall Conference, 1107, uses a remote recommendation server to overcome the limited query storage space on mobile terminals and takes the user's location into account. It likewise does not provide an autocomplete list with guaranteed search results, so its scope is quite different and it is not relevant here.

Further prior art is W. Sung, H. Jung, P. Kim, I. Kang, S. Lee, M. Lee, D. Park, and S. Hahn, “A Semantic Portal for Researchers Using OntoFrame”, In Proceedings of the 6th International Semantic Web Conference, 1107.

OntoFrame in this prior art is a Semantic Web service framework built to provide analysis services for academic research information, based on Semantic Web standards such as XML, RDF (Resource Description Framework), OWL (Web Ontology Language), and SPARQL (SPARQL Protocol and RDF Query Language).

In this prior art, existing databases are collected with reference to a modeled ontology and converted into RDF triples, which the inference engine OntoReasoner uses as knowledge.

Further prior art is Hanmin Jung, In-Su Kang, and Won-Kyung Sung, “Assigning Topics and Fields to Scientific and Technical Literature Using a Thesaurus and a Field Classification Scheme”, Korean Society for Language and Information (한국언어정보학회) Summer Conference, 1106.

In this prior art, a URI server handles the collection and conversion of information and performs document classification by extracting query terms from a source text and assigning them to that text.

The OntoFrame service of the Semantic Web service framework provides integrated search centered on the query (entity), similar to the vertical search of the portal site Naver.

That is, it identifies the type of a given query and generates search results suited to that type. For example, when a user enters the person name “Christian Becker”, it presents similar researchers (Similar Researchers) and researchers in a citation relationship (Researchers in Citation); when the topic word “Semantic Web” is entered, it presents topic trends (Topic Trends), similar topic words (See Also), experts by topic (Researchers by Topic), papers by topic (Papers by Topic), and the researcher network (Researcher Network).

In particular, search results for a query draw on the topic-word extraction performed in the URI server. The inference engine propagates the extraction results to persons, institutions, and so on, and this is what makes topic trends, experts by topic, and papers by topic possible. A correct Semantic Web service therefore cannot be composed unless the relevant document information has been added.

The prior-art OntoFrame service also provided autocomplete: among the queries in the query dictionary used for query extraction, those matching the user's input were presented as the autocomplete list.

However, dictionary queries that were never actually extracted from documents were also presented in the autocomplete list, which led to empty or poor search results.

In addition, the large number of queries matching the input string imposed a heavy autocomplete load.

A technique is therefore needed that solves these problems by restricting the autocomplete list to the extracted queries.

There is also a need for a technique that performs query type recognition before autocompletions are presented, enabling autocompletion by query type.

There is also a need for a technique that offers as autocompletions only those queries for which a search succeeds.

The present invention was devised to solve the above problems and needs of the prior art. Its object is to provide a query-type-specific autocompletion system and method with guaranteed search results, and a recording medium storing the program, in which an input query is looked up and presented as an autocompletion only when a result is guaranteed.

A further object is to provide such a system, method, and recording medium in which searching is accurate and fast, because a query is presented as an autocompletion only when its search results are guaranteed.

A further object is to provide such a system, method, and recording medium that reflect in real time the changes in query occurrence frequency caused by adding and deleting document information, and that present information grouped by type in an improved form.

To achieve these objects, the present invention comprises: a document index server that receives registered document information, extracts index term information and frequency information from it, records them in an index database, and generates autocomplete list information from the extracted index terms; an autocomplete database that records the autocomplete list information generated by the document index server in association with the frequency information; and an autocomplete server that searches the autocomplete database, extracts the autocomplete list information containing the index term, converts it into queries and provides them to the user interface, converts the query selected there into an index term, and retrieves the document information containing that index term and provides it to the user interface.

Preferably, the configuration further includes a document collection unit that connects to the document index server and registers the collected document information, and an index database that records the index term information provided by the document index server and supplies it in response to searches by the autocomplete server.

The document collection unit collects one or more kinds of content document information, including web page, form, image, video, text, and multimedia document information.

The document index server comprises: a document registration unit that receives and registers new document information; a document indexing unit that extracts index terms from the document information registered by the document registration unit and stores them in the index database; and a database generation unit that retrieves, from the index terms stored in the index database, the index term information to be offered in the autocomplete list, records it in the autocomplete database, and updates and manages the frequency information.

The document indexing unit extracts additional information, including the index term information, by one or both of the following: extracting index terms from the document information registered in the document registration unit, and extracting index term information designated through text processing.

The document indexing unit extracts index terms from the document information registered in the document registration unit by either morphological analysis or an N-gram method, and stores them in the index database.
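Of the two extraction methods named here, the N-gram method is simple enough to sketch without a language-specific analyzer. The following Python sketch assumes character-level N-grams computed per whitespace-separated token; the function name `ngram_index_terms`, the default `n=2`, and the handling of short tokens are choices made for illustration, since the text does not fix them.

```python
def ngram_index_terms(text, n=2):
    """Extract character n-grams from each whitespace-separated token."""
    terms = []
    for token in text.split():
        if len(token) < n:
            # Assumption: tokens shorter than n are kept whole rather than dropped.
            terms.append(token)
            continue
        # Slide a window of width n across the token.
        terms.extend(token[i:i + n] for i in range(len(token) - n + 1))
    return terms


if __name__ == "__main__":
    print(ngram_index_terms("대한항공 식탁"))
```

Morphological analysis would instead return dictionary words or stems, which usually gives a smaller and cleaner index at the cost of needing a language-specific analyzer.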

The document indexing unit records the additional information, including the extracted index terms, in the index database in association with the corresponding document information.

The document index server further includes a document editing unit that modifies and deletes document information registered in the document registration unit.

The document indexing unit removes unnecessary index terms, namely those listed in a stopword dictionary, from the index terms extracted from the registered document information.
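Stopword removal as described here amounts to filtering the extracted terms against a fixed set. A minimal Python sketch follows; the contents of `STOPWORD_DICTIONARY` and the case-insensitive comparison are assumptions, as the patent does not list the dictionary or say how terms are normalized.

```python
# Hypothetical stopword dictionary; the patent does not list its contents.
STOPWORD_DICTIONARY = {"the", "of", "and", "a", "in"}


def remove_stopwords(index_terms, stopwords=STOPWORD_DICTIONARY):
    """Return the index terms in their original order, minus stopword entries."""
    # Assumption: comparison is case-insensitive.
    return [term for term in index_terms if term.lower() not in stopwords]


if __name__ == "__main__":
    print(remove_stopwords(["The", "price", "of", "Apina", "table"]))
```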

The database generation unit accumulates the frequency information of the individual documents for each autocomplete list candidate across the autocomplete database and records the totals.

The database generation unit excludes an index term from the accumulated autocomplete list candidates when its frequency value is 0.
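The cumulative frequency bookkeeping and the zero-frequency exclusion can be sketched with a counter that is updated on every document addition and deletion. In this Python sketch the class name `AutocompleteDB` and its methods are hypothetical, and frequency is taken to mean the total number of occurrences of a term across registered documents, which is one reading of the text.

```python
from collections import Counter


class AutocompleteDB:
    """Keeps a cumulative frequency per index term across all documents."""

    def __init__(self):
        self.freq = Counter()

    def add_document(self, index_terms):
        # Accumulate this document's term occurrences into the totals.
        self.freq.update(index_terms)

    def remove_document(self, index_terms):
        self.freq.subtract(index_terms)
        for term in set(index_terms):
            if self.freq[term] <= 0:
                # Frequency 0: the term is no longer an autocomplete candidate.
                del self.freq[term]

    def candidates(self):
        return set(self.freq)


if __name__ == "__main__":
    db = AutocompleteDB()
    db.add_document(["apina", "table"])
    db.add_document(["apina"])
    db.remove_document(["apina", "table"])
    print(sorted(db.candidates()))
```

Because the totals change at the moment a document is added or removed, a term disappears from the candidates as soon as its last supporting document is deleted, which is the real-time behavior the invention aims for.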

The autocomplete server comprises: a query input unit that receives the query to be searched through the user interface and converts it into an index term; a database search unit that looks up the index term from the query input unit in the autocomplete database; an index term determination unit that checks the frequency information of the index terms stored in the autocomplete database and determines which of them are provided as autocomplete list information; a presentation unit that converts the autocomplete list information provided by the index term determination unit into queries and provides them to the user interface; a selection unit that provides the entered query and the autocomplete list queries from the presentation unit to the user interface, receives the query selected together with an event signal, and converts it into an index term; and a service linkage unit that retrieves and provides document information according to the index term information from the selection unit and a search event signal.

The query input unit accepts the query in units of a phoneme, a syllable, an eojeol (a space-delimited phrase), or a word.

Each time part of the query is entered, the query input unit calls the autocomplete database via AJAX to look up index terms.

The query input unit receives the query information through a user interface (UI).

The database search unit looks up index terms by forward (prefix) matching and by backward (suffix) matching and compiles the autocomplete list from both.
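Forward and backward matching correspond to prefix and suffix tests on the stored index terms. The Python sketch below uses the FIG. 2 examples; the function name `match_terms` is hypothetical, and listing a term that both starts and ends with the input only under forward matches is a choice made here to avoid duplicates. A linear scan is used for clarity, whereas a production system would more likely use a trie or a database index.

```python
def match_terms(typed, index_terms):
    """Split index terms into forward (prefix) and backward (suffix) matches."""
    forward = [t for t in index_terms if t.startswith(typed)]
    # A term matching both ways is reported once, as a forward match.
    backward = [
        t for t in index_terms
        if t.endswith(typed) and not t.startswith(typed)
    ]
    return forward, backward


if __name__ == "__main__":
    terms = ["대한항공", "대법원", "이용대", "소녀시대", "단국대", "네이버"]
    print(match_terms("대", terms))
```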

The index term determination unit decides to include in the autocomplete list those index terms whose frequency is 1 or greater.

The presentation unit adjusts the ranking within the autocomplete list using one or more of the query's input statistics, the query's frequency, and the query's alphabetical (가나다) order.
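The frequency threshold and the ranking criteria can be combined into a single selection step. The Python sketch below is one possible arrangement under stated assumptions: the function name `rank_suggestions` is hypothetical, and the priority order (input statistics first, then document frequency, then alphabetical order) is chosen for illustration, since the text allows any one or more of the three criteria.

```python
def rank_suggestions(frequencies, input_counts):
    """Keep terms with frequency >= 1 and order them for presentation.

    frequencies:  index term -> cumulative frequency in the autocomplete DB
    input_counts: index term -> how often users have entered it (may be sparse)
    """
    eligible = [term for term, freq in frequencies.items() if freq >= 1]
    # Higher input count first, then higher frequency, then code point order,
    # which matches 가나다 order for precomposed Hangul syllables.
    return sorted(
        eligible,
        key=lambda term: (-input_counts.get(term, 0), -frequencies[term], term),
    )


if __name__ == "__main__":
    freqs = {"대한항공": 5, "대법원": 5, "대성": 0, "대구은행": 2}
    print(rank_suggestions(freqs, {"대법원": 3}))
```

Terms with frequency 0, such as 대성 in this example, never reach the ranking step, so every suggestion shown is backed by at least one indexed document.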

The service linkage unit retrieves document information by passing the index term information in an API call.

To achieve these objects, the present invention also provides a method comprising: a step of collecting and registering document information, extracting index term information and frequency information from the registered document information and storing them in the index database, generating autocomplete list information for the index terms and storing it in the autocomplete database, and modifying the registered document information; a step of converting a query entered through the user interface into an index term and including in the autocomplete list information the index terms found in the autocomplete database with a frequency of 1 or more; and a step of converting the index terms of the autocomplete list into queries and presenting them in the user interface, converting the query selected in the user interface from outside into an index term, and outputting the document information retrieved with that index term.
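The three steps of the method can be tied together in a compact in-memory sketch. Everything named in the Python code below (`GuaranteedAutocomplete`, `to_index_term`, whitespace tokenization, lowercase normalization, prefix-only matching) is an assumption made for illustration; the patent leaves the query-to-index-term conversion and the storage layer unspecified.

```python
from collections import Counter, defaultdict


def to_index_term(query):
    # Assumed conversion between a typed query and an index term.
    return query.strip().lower()


class GuaranteedAutocomplete:
    def __init__(self):
        self.postings = defaultdict(set)  # index DB: term -> document ids
        self.freq = Counter()             # autocomplete DB: term -> frequency
        self.docs = {}                    # registered documents: id -> terms

    def register(self, doc_id, text):
        terms = [to_index_term(word) for word in text.split()]
        self.docs[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)
        self.freq.update(terms)

    def delete(self, doc_id):
        terms = self.docs.pop(doc_id)
        self.freq.subtract(terms)
        for term in set(terms):
            self.postings[term].discard(doc_id)
            if self.freq[term] <= 0:
                del self.freq[term]  # frequency 0: no longer suggested

    def suggest(self, typed):
        key = to_index_term(typed)
        return sorted(
            term for term, count in self.freq.items()
            if count >= 1 and term.startswith(key)
        )

    def search(self, selected):
        return sorted(self.postings.get(to_index_term(selected), set()))


if __name__ == "__main__":
    ac = GuaranteedAutocomplete()
    ac.register(1, "Apina table")
    ac.register(2, "Apina chair")
    ac.delete(1)
    print(ac.suggest("ta"), ac.suggest("ap"), ac.search("apina"))
```

After document 1 is deleted, "table" is no longer suggested, while "apina" remains and its search still returns document 2, so every suggestion leads to a non-empty result.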

Preferably, the modification process comprises: collecting document information by a document collection unit; registering the collected document information by a document registration unit; extracting index words from the registered document information by a document indexing unit and storing them in the index database; extracting, by a database generation unit, the index words to be offered in the autocomplete list from the index word information stored in the index database and storing them in the autocomplete database; and modifying or deleting the registered document information by a document editing unit.

In addition, the index words are extracted by one method selected from morphological analysis and N-gram indexing.

In addition, the index words stored in the index database include additional information, unnecessary index words are removed using a stopword dictionary, and the autocomplete database stores the updated frequency information for each piece of document information.

In addition, the query is entered in one unit selected from a phoneme, a syllable, an eojeol (a space-delimited word phrase), and a word.

In addition, the entered query is converted into an index word, and the autocomplete database unit is called and searched in an AJAX manner.

To achieve this object, the present invention also provides a recording medium on which is recorded the program source of a query-type-specific autocomplete method with guaranteed search results, the method comprising: a process of collecting and registering document information, extracting index word information and frequency information from the registered document information and storing them in an index database, generating autocomplete list information for the index words and storing it in an autocomplete database, and modifying the registered document information; a process of converting a query entered through a user interface into an index word and including in the autocomplete list information those index words retrieved from the autocomplete database with a frequency of 1 or more; and a process of converting the index words of the autocomplete list into queries and presenting them through the user interface, converting a query selected and entered through the user interface into an index word, and outputting the document information retrieved with the converted index word.

Preferably, the autocomplete list process for the query includes a process of adjusting the ranking using at least one of the query's input statistics, the frequency information in the autocomplete database, and the query's alphabetical (ganada) order.

In addition, the query input process includes a process of calling and searching the autocomplete database unit in an AJAX manner each time input is made in one unit selected from a phoneme, a syllable, an eojeol, and a word.

To achieve this object, the present invention also provides a system comprising: a server system that receives registered document information, extracts index words and frequency information from it to build an autocomplete database, converts a query entered through a user interface into an index word, provides through the user interface an autocomplete list of the index words in the autocomplete database that contain that index word, converts a query selected through the user interface into an index word, and provides to the user interface the document information containing the converted index word; a public communication network that connects to the server system and transmits and receives queries and retrieved document information over a communication path selected from a wired communication path and a wireless communication path; and a terminal device, consisting of a computer, that connects to the public communication network, sends a query entered through the user interface to the server system, displays on the user interface the autocomplete list of queries provided by the server system, sends one selected query together with an event signal to the server system, and displays the document information retrieved and provided by the server system.

Preferably, the server system comprises: a document index server that receives registered document information, extracts index words and frequency information, and builds the autocomplete database; an autocomplete server that receives a query through the user interface, converts it into an index word, extracts the list of index words containing that index word, converts the list into queries and provides it to the user interface, and converts the selected query into an index word to provide the retrieved document information; an autocomplete database that records the autocomplete list information generated by the document index server in association with the frequency information; a document collection unit that connects to the document index server and registers collected document information; and an index database that records the index word information provided by the document index server and serves it to searches by the autocomplete server.

In addition, the public communication network comprises a wireless communication network that connects the server system and the terminal device over a wireless communication path and carries data signals, and a wired communication network that connects the server system and the terminal device over a wired communication path and carries data signals.

Accordingly, because the present invention offers an autocomplete suggestion only when searching the entered query is guaranteed to return a result, it has the practical benefit of increasing the reliability of search results.

In addition, because suggestions are offered only when a search result is guaranteed, the present invention has the industrial benefit of preventing failed searches and speeding up searching.

Furthermore, the present invention reflects in real time the changes in query occurrence frequency caused by adding and deleting document information, and groups information by type for clearer presentation, which makes it convenient to use.

The terms and words used in this specification and the claims are not to be construed as limited to their ordinary or dictionary meanings. On the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way, they must be interpreted with the meanings and concepts consistent with the technical idea of the present invention.

An embodiment of the present invention is described in detail below with reference to the drawings.

Embodiment

In the accompanying drawings, FIG. 4 is a functional block diagram of a query-type-specific autocomplete system with guaranteed search results according to an example of the present invention; FIG. 5 is a detailed functional block diagram of the server system of that system; FIG. 6 is a flowchart of the query-type-specific autocomplete method with guaranteed search results; and FIG. 7 is a diagram illustrating how frequencies are updated when document information is added and deleted.

Referring to FIG. 4, the query-type-specific autocomplete system with guaranteed search results of the present invention comprises a server system (100), a public communication network (110), and a terminal device (120).

The server system (100) receives registered document information, extracts index words, and builds an autocomplete database. It also receives a query, converts it into an index word, retrieves from the autocomplete database the index words that contain that index word, and converts them into queries for presentation.

More specifically, it receives registered document information, extracts index words and frequency information from it, and builds the autocomplete database. It converts a query entered through the user interface into an index word and provides, through the user interface, an autocomplete list of the index words in the autocomplete database that contain that index word. It then converts a query selected through the user interface into an index word and provides the document information containing the converted index word to the user interface.

The public communication network (110) connects the server system (100) and the terminal device (300) over a communication path selected from a wired communication path and a wireless communication path, and carries data signals between them.

The terminal device (300) is a computer terminal that accepts a search query through a user interface (UI) and sends it to the server system (100) over the public communication network (110), displays the information provided by the server system (100) on the user interface, accepts through the user interface the selection of one item from the displayed query list together with a search event signal, and displays the retrieved document information for the user to review.

Referring to FIG. 5, the server system (100) comprises a document collection unit (110), a document index server (120), an autocomplete database (140), an autocomplete server (130), and an index database (150).

The document collection unit (110) registers collected document information with the document index server (120). The collected document information is content, including web page, form, image, video, text, and multimedia document information.

The document index server (120) receives registered document information, extracts index words, and builds the database of autocomplete list information. It comprises a document registration unit (121), a document indexing unit (123), a database generation unit (124), and a document editing unit (122). The document registration unit (121) registers each new document provided by the document collection unit (110) by recording it in a separate document information storage database, which is not shown in the drawings.

The document indexing unit (123) extracts additional information from the registered document information in two ways: by extracting index words from the input document information, and by extracting designated information through text processing. It performs the extraction using one or both of these. The extracted additional information is recorded and stored in the index database in association with the document information.

For the former, index word extraction, the document indexing unit (123) extracts the additional information, including the index words, from the document information registered by the document registration unit (121) using an indexing method selected from morphological analysis and N-gram indexing, and stores it in the index database.

That is, index words are extracted from the registered document information by either morphological analysis or N-gram indexing, whichever is selected.

A morpheme is a word, or part of a word, that carries meaning and cannot be divided into smaller meaningful units.

The morphological analysis here is also referred to as the longest-match method: when a word or phrase can be segmented in more than one way, the segmentation that covers the longest run of characters is adopted.
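The longest-match rule can be illustrated with a small greedy segmenter. This is only a sketch under stated assumptions: the function name `longest_match_segment` and the toy dictionary are hypothetical, and a real morphological analyzer does considerably more than dictionary lookup.

```python
def longest_match_segment(text: str, dictionary: set[str]) -> list[str]:
    """Greedy longest-match segmentation (illustrative, not the patent's analyzer)."""
    # Longest dictionary entry bounds how far ahead we need to look.
    max_len = max([1, *(len(word) for word in dictionary)])
    tokens: list[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until one matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionary: both "과학" and "과학기술" are possible at position 2,
# and the longer entry wins.
toy_dictionary = {"한국", "과학", "기술", "과학기술"}
segments = longest_match_segment("한국과학기술", toy_dictionary)
```

With this toy dictionary the segmenter prefers "과학기술" over "과학" followed by "기술", which is the behavior the longest-match rule describes.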

N-gram indexing uses runs of N adjacent syllables. For example, with N = 2, '한국과학기술' ("Korean science and technology") yields the syllable pairs '한국', '국과', '과학', '학기', and '기술', each of which is used as a query term. Because a meaningless N-gram used as a query could retrieve irrelevant document information, a weight is assigned to each N-gram to prevent this.
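The N-gram example can be reproduced in a few lines. The sketch below assumes the text is in precomposed Hangul, so that each syllable is a single character; the function name `char_ngrams` is hypothetical, and the weighting step mentioned in the text is not modeled because the source does not specify it.

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return every run of n adjacent characters in text, left to right."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A string of length L has L - n + 1 windows of width n (none if L < n).
    return [text[i:i + n] for i in range(len(text) - n + 1)]


bigrams = char_ngrams("한국과학기술", n=2)
print(bigrams)  # ['한국', '국과', '과학', '학기', '기술']
```

The pairs '국과' and '학기' span word boundaries, which shows the kind of N-gram that the weighting is meant to demote.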

The document indexing unit (123) also removes, from the index words extracted from the registered document information, unnecessary index words that appear in the stopword dictionary.

A stopword (不用語) is a word that is not used as a search term in Internet searches, for example articles, prepositions, postpositional particles, and conjunctions, which carry no meaning as search index words.
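Stopword removal amounts to a set-membership filter. In the sketch below the stopword list and the function name `remove_stopwords` are assumptions made for illustration; the patent does not give the contents of its stopword dictionary.

```python
# Hypothetical stopword dictionary; a real one would be language-specific and larger.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "in", "to"})


def remove_stopwords(terms: list[str], stopwords: frozenset[str] = STOPWORDS) -> list[str]:
    """Drop candidate index words that appear in the stopword dictionary."""
    # Comparison is case-insensitive; the original spelling of kept terms is preserved.
    return [term for term in terms if term.lower() not in stopwords]


kept = remove_stopwords(["The", "reliability", "of", "search", "results"])
```

Filtering before the terms reach the index database keeps function words from ever becoming autocomplete candidates.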

The database generation unit (124) retrieves, from the index words stored in the index database (150), the index words of each type offered for autocompletion and records them in the autocomplete database (140). It also recalculates and manages the document frequency information, recording the frequency with which each index word occurs across the document information; an index word whose frequency is 0 is excluded from the autocomplete list.

The occurrence frequency (hereinafter, "frequency") indicates whether a given index word is contained in a piece of document information: a value of 1 is assigned if the term appears in that document and 0 if it does not.
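Summing this 0-or-1 indicator over all registered documents gives a document frequency per index word. The sketch below assumes that reading; the function name `document_frequency` and the sample documents are hypothetical.

```python
from collections import Counter


def document_frequency(docs: dict[str, list[str]]) -> Counter:
    """Count, for each index word, how many documents contain it."""
    df: Counter = Counter()
    for terms in docs.values():
        # set() collapses repeats, so one document contributes 1 (present) or 0 (absent).
        df.update(set(terms))
    return df


sample_docs = {
    "doc1": ["semantic", "web", "semantic"],
    "doc2": ["semantic", "search"],
}
df = document_frequency(sample_docs)
```

Here "semantic" appears twice in doc1 but still counts once for that document, so its frequency is the number of documents containing it rather than the number of occurrences.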

The autocomplete database (140) records the autocomplete list information for the index words generated by the document index server (120), together with the frequency information.

The autocomplete server (130) searches the autocomplete database (140), extracts the index words that contain the entered index word, and converts them into queries for presentation. It comprises a query input unit (131), a database search unit (132), an index word determination unit (133), a presentation unit (134), a selection unit (135), and a service interworking unit (136). The query input unit (131) receives a search query through the user interface (UI) and converts it into an index word. Each time the query is extended by one unit, selected from a phoneme, a syllable, an eojeol, and a word, it calls and searches the autocomplete database in an AJAX manner.

In the AJAX (Asynchronous JavaScript and XML) application model, the client requests only the data it needs from the web server and processes the received data on the client side.

Conventionally, a web server builds and serves a web page for each search or request, and builds and serves a new web page whenever new content is requested.

In that case the original page and the new page often have much of their content in common. Because the duplicated HTML is transmitted again, a great deal of bandwidth is wasted; this costs time and money and makes real-time interactive services difficult.

With AJAX, part of the data processing formerly done on the web server is done on the client or connected terminal. This reduces the amount of data exchanged between the web server and the client, and therefore the bandwidth used, and reduces the web server's total processing load, which improves responsiveness and makes interactive data exchange possible.

AJAX has drawbacks: some browsers do not support it, the capabilities of the HTTP client are limited, it raises security concerns, and its script-based code is not easy to debug. It is nevertheless widely used for its advantages: the display can be updated quickly while the web page stays largely in place; delegating part of the data processing to the client or terminal reduces server load and shortens processing time; asynchronous data communication is possible; and the smaller data volume reduces bandwidth use and communication time.

The database search unit (132) searches the autocomplete database for the index word produced by the query input unit, performing both a prefix (forward) match and a suffix (backward) match and compiling the results into a list.
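Prefix and suffix matching can be sketched with plain string tests. The example assumes the autocomplete database can be treated as an in-memory collection of index words; the function name `match_index_words` and the sample terms are hypothetical, and a production system would more likely use an index structure than a linear scan.

```python
def match_index_words(key: str, index_words: list[str]) -> dict[str, list[str]]:
    """Return index words that start with key and those that end with key."""
    if not key:
        # An empty key would match everything, so return no suggestions.
        return {"prefix": [], "suffix": []}
    return {
        "prefix": sorted(w for w in index_words if w.startswith(key)),
        "suffix": sorted(w for w in index_words if w.endswith(key)),
    }


matches = match_index_words("search", ["search", "research", "searching", "sea"])
```

An exact match such as "search" satisfies both tests, so it appears in both lists; how duplicates are merged for display is left open here because the source does not say.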

The index word determination unit (133) checks the frequency information of the index words stored in the autocomplete database (140) and selects those whose frequency is 1 or more to be provided as the autocomplete list information.

The presentation unit (134) converts the autocomplete list information provided by the index word determination unit (133) into query information and provides it to the user interface. It adjusts the rank, or order, in which items are displayed in the list using at least one of query input statistics, frequency information, and alphabetical (ganada) order.
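The filtering by frequency and the ordering by the three criteria can be combined in one step. The sketch below is an assumption about how the criteria might be composed: it sorts by input count, then by document frequency, then alphabetically, whereas the text only says that at least one of them is used. The `Candidate` type and the function name `rank_candidates` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    term: str          # index word, shown to the user as a query
    doc_freq: int      # number of registered documents containing the term
    input_count: int   # how often users have entered this query (input statistics)


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Keep terms with a guaranteed result (doc_freq >= 1) and order them for display."""
    guaranteed = [c for c in candidates if c.doc_freq >= 1]
    # Negated counts sort descending; the term itself breaks ties in ascending order.
    return sorted(guaranteed, key=lambda c: (-c.input_count, -c.doc_freq, c.term))


ranked = rank_candidates([
    Candidate("semantic web", doc_freq=3, input_count=10),
    Candidate("semantic search", doc_freq=5, input_count=10),
    Candidate("semantics", doc_freq=0, input_count=99),
    Candidate("semantic role", doc_freq=2, input_count=1),
])
```

The candidate "semantics" is dropped despite its high input count because no registered document contains it, which is the guarantee the invention is built around.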

The selection unit (135) presents the entered query and the autocomplete list provided by the presentation unit (134) through the user interface, receives the selected query, and converts it into an index word. The service interworking unit (136) searches for the index word supplied by the selection unit (135) in response to a search event signal and provides the document information retrieved from the index database (150); the search is performed through an API call.

An API (Application Program Interface) is a specific method, predefined by a computer operating system or another application program, through which processing requests can be made to that operating system or application.

As an interface to an operating system or program, an API differs from a graphical user interface or command-line interface, which the user interacts with directly.

An API is the language or message format an application program uses to communicate with a system program such as an operating system or database management system. It is implemented by calling functions within a program that provide a link to a specific subroutine for execution.

In other words, an API consists of a number of program modules or routines that must already exist or be linked in order to perform the work requested by a function call.

In general, a server is a computer that communicates over a network, performs computation, and includes components that carry out various functions; each of these components operates by means of the server's processor, memory, input/output means, and so on.

The server system (100) of the present invention configured as described above comprises: a document index server (120) that receives and indexes registered document information and builds the autocomplete DB (140); an autocomplete server (130) that converts a query entered by the user into an index word, retrieves from the autocomplete DB (140) the index words that contain it, and provides them as a list converted into queries; a document collection unit (110) that collects document information; an autocomplete database (140) that records index words together with their type information; and an index database (150) that records index word information.

The document index server (120) comprises a document registration unit (121) that receives registered document information, a document indexing unit (123) that extracts and indexes the index words in the document information, a DB generation unit (124), and a document editing unit (122).

The document registration unit (121) receives document information through the document collection unit (110), which includes a document register, a knowledge management system, a document collector, and the like. The registered document information covers all kinds of content, such as web page, text, form, image, and video document information.

The document indexing unit (123) extracts index words (queries) using an indexing method selected from morphological analysis, N-gram indexing, and the like, and stores them in the index database (150).

In extracting index words, the document indexing unit (123) performs additional-information tasks such as extracting particular index words from the registered document information or extracting particular information through text processing.

At this point the additional information extracted from the document registration unit (121) is added to the index database (150), and unnecessary index words are removed beforehand using a stopword dictionary or the like.

The database generation unit (124) extracts, from the index word information stored in the index database (150), the type-specific index word information offered for autocompletion, records it together with the index word information in the autocomplete database (140), and recalculates its document frequency.

The frequency information records how often the corresponding items occur in the documents; if the value is 0, the index word is excluded from the candidates for the autocomplete list.

The document editing unit (122) modifies or deletes document information that has already been registered. When document information is modified or deleted, the index words in that document and their frequency values must change, so the operation propagates to the database generation unit (124).

The present invention searches for and extracts index words from the document information entered and recorded in the system, and manages them by marking the matching entries in the index word dictionary maintained by the system.

The terms "query" and "index word" differ slightly but are generally used interchangeably. A query is the form seen at the user interface, entered by the user or displayed for the user to select; a query entered into the system is converted into an index word, and an index word is converted into a query when it is presented or output.

As an example, when a particular paper is entered into the system, the top five index words are detected in the paper information and extracted as its representative index words.

The index word dictionary of the system has a frequency field, and for each of the top five index words thus extracted and assigned, the frequency with which it has been selected as an index word is incremented.

With the present invention as described above, when new document information is added, index word extraction is performed and the frequency information of the index words in the index word dictionary is automatically accumulated and changed, so the autocomplete list can be updated in real time.

As one example, when particular document information is deleted, the frequency of each corresponding index word in the frequency field is decreased by 1. In this way, additions and deletions of particular document information can be handled in real time, as described in detail in the description of FIG. 7 below.

The index database of the search engine provided in the system includes an index word dictionary, a person dictionary, and the like. The person dictionary consists of person names received directly from a URI server through a web service. That is, rather than holding a list of index words obtained from a corpus or the like, it adds the authors (persons) of the serviced document information, such as paper information, in real time.

However, as with the index word dictionary, frequency values are maintained in real time, so the autocomplete list is provided, and additions and deletions of document information are handled, in the same way.

FIG. 7 is a diagram illustrating how frequencies are updated when document information is added and deleted.

Referring to FIG. 7 in detail, when document information No. 1 is registered, five index words are extracted and added to the index database 150 and the autocomplete database 140. If these index words are typed information for autocompletion, they were all extracted from a single piece of document information, so the frequency of each is set to 1.

When document information No. 2 in FIG. 7 is registered, five index words are extracted. "OWL" and "Semantic Annotation" were already extracted from document information No. 1, so their frequency becomes 2. The remaining three index words are extracted for the first time, so their frequency is 1.

When document information No. 1 in FIG. 7 is deleted, the frequency of each of the five index words from document information No. 1 is decreased by 1.
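
The FIG. 7 walk-through can be reproduced with a small Python sketch. It is an illustration, not the patented implementation: the class `FrequencyIndex` and its method names are hypothetical, and only "OWL" and "Semantic Annotation" come from the text, while the other index words are made-up placeholders.

```python
class FrequencyIndex:
    """Index word dictionary with a per-word frequency field."""

    def __init__(self):
        self.frequency = {}   # index word -> number of registered documents
        self.documents = {}   # document id -> set of its index words

    def add_document(self, doc_id, index_words):
        if doc_id in self.documents:
            raise ValueError(f"document {doc_id} is already registered")
        words = set(index_words)
        self.documents[doc_id] = words
        for word in words:
            self.frequency[word] = self.frequency.get(word, 0) + 1

    def delete_document(self, doc_id):
        # Each index word of the deleted document loses one count.
        for word in self.documents.pop(doc_id):
            self.frequency[word] -= 1

    def candidates(self):
        """Words still backed by at least one document (frequency >= 1)."""
        return {w for w, f in self.frequency.items() if f >= 1}


index = FrequencyIndex()
# "term A" to "term F" are placeholders, not index words from FIG. 7.
index.add_document(1, ["OWL", "Semantic Annotation", "term A", "term B", "term C"])
index.add_document(2, ["OWL", "Semantic Annotation", "term D", "term E", "term F"])
after_add = dict(index.frequency)   # snapshot after both registrations
index.delete_document(1)            # every word of document 1 drops by 1
```

After the deletion, the words unique to document 1 have frequency 0 and no longer appear in `index.candidates()`, while "OWL" and "Semantic Annotation" remain with frequency 1.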

The prior art introduced a way to present in the autocomplete list only those index words that had produced search results. It used a simple approach: when a user query succeeded, a flag indicating whether a search on the corresponding index word succeeds or fails was attached to that index word.

In that prior art, even if document information containing an index word that had caused a search failure was added later, the word was not presented in the autocomplete list until a user attempted a search by typing that query word directly.

The technical idea of the present invention is to overcome this problem by adjusting the frequency information of the autocomplete candidate query words immediately when document information is registered or edited, thereby providing an autocomplete list whose search results are guaranteed in real time, with no time lag.

OntoFrame, a prior-art system, provides entity-centric integrated search, where the entities are a subset of the query words.

When the input query word matches a specific query type such as a person or a topic, a query word page is composed. Otherwise, a general search result page is composed.

The type of a query word is checked by calling the search engine: the input query word is converted into an index word, and the index word dictionary and the person dictionary in the search engine are consulted for that index word at the same time. Index words and person names whose frequency value is 1 or more are searched, and type information, such as whether the input query word is of a region type or a person type, is retrieved along with them.

The following is a search engine program according to an example of the present invention.

API: SearchResultList getAutoComplete(String SearchTerm)

Example call: getAutoComplete("sem")

Example result:

[Sem Borst, Person]

[Semantic Annotation, Topic]

[Semantic Web, Topic]

[Semih Ergintav, Person]

[Semyon M. Meerkov, Person]
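
The behaviour of this call can be imitated with a short Python sketch. The sketch is not the patent's search engine: the function name `get_auto_complete`, the two in-memory dictionaries, all frequency values, and the zero-frequency entry "Semaphore" are assumptions made for illustration. Only the five entries and their types come from the sample result above.

```python
# Hypothetical in-memory dictionaries: entry -> frequency.
# The frequencies, and the entry "Semaphore", are invented.
INDEX_DICTIONARY = {"Semantic Annotation": 2, "Semantic Web": 1, "Semaphore": 0}
PERSON_DICTIONARY = {"Sem Borst": 1, "Semih Ergintav": 3, "Semyon M. Meerkov": 1}

def get_auto_complete(search_term):
    """Return sorted (entry, type) pairs that start with search_term.

    Both dictionaries are consulted, and entries whose frequency is 0
    are dropped because no registered document backs them.
    """
    prefix = search_term.lower()
    results = []
    for type_name, dictionary in (("Topic", INDEX_DICTIONARY),
                                  ("Person", PERSON_DICTIONARY)):
        for entry, frequency in dictionary.items():
            if frequency >= 1 and entry.lower().startswith(prefix):
                results.append((entry, type_name))
    return sorted(results)
```

With this data, `get_auto_complete("sem")` yields the five typed entries of the sample result and omits "Semaphore", whose frequency is 0.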

In the autocomplete interface according to the present invention, the type of each index word can be recognized from the search result values and indicated by an icon, a color, a tree classification, or the like.

The autocomplete server 130 includes: a query input unit 131 that receives a user query word and converts it into an index word; a database search unit 132 that searches the autocomplete database 140, which contains the index words, for matching index words; an index word determination unit 133 that checks the frequency information recorded in the autocomplete database 140 and decides whether each word is offered for autocompletion; a presentation unit 134 that converts the autocomplete list into query words and provides it to the search interface through a user interface (UI); a selection unit 135 that provides a user interface (UI) for selecting a particular query word from the presented autocomplete list and converts the selected query word into an index word; and a service interlocking unit 136 that, in response to an event signal from a search button or keyboard operation, provides the document information retrieved for the selected index word by the search service.

The query input unit 131 receives a query word through a search box provided as a user interface and converts it into an index word. Each time one more unit of the index word is entered, in units of phoneme, syllable, word phrase (eojeol), or word, the database search unit 132 is called using AJAX (Asynchronous JavaScript and XML).

The database search unit 132 searches the autocomplete database 140 for the input search word and checks whether any entries containing that search word exist.

As one example, the search uses forward matching, in which the beginning matches, as when "대평중학교" (Daepyeong Middle School) is found for "대평" (Daepyeong), and backward matching, in which the end matches, as when "심대평" (Sim Dae-pyeong) is found for "대평".
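
Forward and backward matching correspond to prefix and suffix tests on strings. The Python sketch below illustrates this with the example above. The function names are hypothetical, and "서울" is an extra non-matching entry added only for contrast.

```python
def forward_match(query, entries):
    """Entries whose beginning matches the query (prefix match)."""
    return [e for e in entries if e.startswith(query)]

def backward_match(query, entries):
    """Entries whose end matches the query (suffix match)."""
    return [e for e in entries if e.endswith(query)]

def search(query, entries):
    """Union of both matches, in order and without duplicates."""
    combined = forward_match(query, entries) + backward_match(query, entries)
    return list(dict.fromkeys(combined))

entries = ["대평중학교", "심대평", "서울"]
```

Here `search("대평", entries)` returns "대평중학교" from the forward match and "심대평" from the backward match.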

From the entries that the database search unit 132 found to contain or match the query word, the index word determination unit 133 decides to present in the autocomplete list those whose frequency value is 1 or more.

A frequency value of 1 or more means that document information containing the index word (query word) exists in the search system.

The presentation unit 134 converts the index words obtained from the index word determination unit 133 into query words and presents them as an autocomplete list. As in common autocompletion schemes, the rank or order in which entries are placed and displayed in the autocomplete list is adjusted using one or more of user query input statistics, frequency information, and alphabetical (가나다) order. Alternatively, the ranking is adjusted using the frequency value that the query word has in the autocomplete database 140 or its alphabetical order.
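
One possible way to combine two of these signals is sketched below in Python. The combination rule (higher frequency first, ties broken by string order) is an assumption, since the text lists the signals without saying how they are weighted. The name `rank_candidates` is hypothetical, and plain string comparison stands in for alphabetical order.

```python
def rank_candidates(candidates):
    """Order (query_word, frequency) pairs for display.

    Assumption: higher frequency first, ties broken by string order.
    """
    ordered = sorted(candidates, key=lambda c: (-c[1], c[0]))
    return [word for word, _ in ordered]
```

For example, with frequencies 1, 2, and 1 for "Semantic Web", "Semantic Annotation", and "Sem Borst", the list is shown as "Semantic Annotation", "Sem Borst", "Semantic Web".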

Any newly developed method of adjusting or determining the rank (order) may also be applied.

When a particular query word is designated and selected, through the user interface (UI), from the autocomplete list presented as query words on the terminal device 300, the selection unit 135 converts the selected query word into an index word and inputs it. A query word is designated and selected from the presented list using the up and down keys of the keyboard, a mouse, or the like provided on the terminal device 300.

The selected query word (index word) information is passed to the service interlocking unit 136 together with the corresponding event signal information.

The service interlocking unit 136 receives the selected query word and, in response to an event signal from a search button or a keyboard operation such as the Enter key, retrieves index information from the index database 150 through an API call and handles the service of providing the corresponding document information.

FIG. 6 is a flowchart of an autocomplete method for each query type with guaranteed search results according to an example of the present invention.

Referring to FIG. 6, the autocomplete method for each query type with guaranteed search results according to an example of the present invention comprises a modifying process, a determining process, and an outputting process.

The modifying process collects and registers document information, extracts index words and stores them in the index database, generates the index words to be offered for autocompletion and stores them in the autocomplete database, and modifies or deletes registered document information. It comprises: collecting document information by the document collection unit (S100); recording and storing the collected document information, by the document registration unit, in a separate document information database not shown in the drawings; extracting additional information including index words from the registered document information by the document indexing unit and storing it in the index database (S110); generating, by the database generation unit, the index words to be offered for autocompletion from the information stored in the index database and storing them in the autocomplete database (S120); and modifying or deleting the registered document information by the document editing unit (S130).

Index word extraction methods include a morphological-analysis indexing method and an N-gram indexing method, and one of the two is selected and used.
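
As a rough illustration of N-gram indexing, the Python sketch below splits a string into overlapping character n-grams. The text names the method but gives no parameters, so the character-level granularity, the default `n=2`, and the name `char_ngrams` are assumptions.

```python
def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text, in order.

    A text shorter than n yields an empty list.
    """
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

For "대평중학교" with n = 2 this gives "대평", "평중", "중학", "학교", each of which could be stored as an index word.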

When neither index word extraction method is used, a text processing method may be used instead.

The determining process searches the autocomplete database 140 for the index word input through the user interface and determines those entries with a frequency value of 1 or more to be search index words. Each time the input index word is extended by one unit selected from phoneme, syllable, word phrase (eojeol), and word, the database search unit 132 is called using AJAX (S140 to S160).

The outputting process converts the determined index words into query words, presents them as an autocomplete list, searches for the information of the selected index word, and outputs the corresponding document information (S170 to S190). The rank (order) of the autocomplete list of query words is adjusted by one or more of the query input statistics of users, the frequency information in the autocomplete database, and alphabetical (가나다) order.

The method of the present invention configured as above is carried out entirely as set out in the foregoing description of the system.

The method according to the present invention can be implemented as computer-readable code on a computer-readable recording medium. A computer-readable recording medium is a recording device that stores data readable by a computer system. Examples include ROM, RAM, cache, hard disks, optical disks, floppy disks, and magnetic tape. It also includes implementation in the form of a carrier wave, for example transmission over the Internet. The computer-readable recording medium can also be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner.

Although the present invention has been described in detail only with respect to the specific examples set out above, it will be apparent to those skilled in the art that various changes and modifications are possible within the technical scope of the invention, and such changes and modifications naturally fall within the appended claims.

FIG. 1 is a functional block diagram of a system that retrieves information from a general database system.

FIG. 2 illustrates, as one example, query words entered to search for data being displayed by autocompletion.

FIG. 3 illustrates, as one example of the prior art, a state in which a query word was entered and the search failed.

FIG. 4 is a functional block diagram of an autocomplete system for each query type with guaranteed search results according to an example of the present invention.

FIG. 5 is a detailed functional block diagram of the server system of the autocomplete system for each query type with guaranteed search results according to an example of the present invention.

FIG. 6 is a flowchart of an autocomplete method for each query type with guaranteed search results according to an example of the present invention.

FIG. 7 is a diagram illustrating, according to an example of the present invention, how frequencies are updated when document information is added and deleted.

** Explanation of reference numerals for the main parts of the drawings **

100 : server system    200 : public communication network

300 : terminal device    110 : document collection unit

120 : document index server    121 : document registration unit

123 : document indexing unit    124 : database generation unit

122 : document editing unit    140 : autocomplete database

130 : autocomplete server    131 : query input unit

132 : database search unit    133 : index word determination unit

134 : presentation unit    135 : selection unit

136 : service interlocking unit    150 : index database

Claims

An autocomplete system for each query type with guaranteed search results, comprising:

a document index server that registers document information, extracts and records index word information and frequency information from the registered document information, and generates autocomplete list information from the extracted index word information;

an autocomplete database that records the autocomplete list information generated by the document index server in association with the frequency information; and

an autocomplete server that searches the autocomplete database to extract the autocomplete list information containing an index word, converts it into query words and provides them to a user interface, converts a query word selected and input there into an index word, and searches for and provides to the user interface the document information containing that index word.

The system of claim 1, further comprising:

a document collection unit that registers collected document information with the document index server; and

an index database that records the index word information provided from the document index server and provides the index word information to the autocomplete server.

The system of claim 2, wherein the document collection unit

collects one or more kinds of content document information, including web page document information, form document information, image document information, video document information, text document information, and multimedia document information.

The system of claim 2, wherein the document index server comprises:

a document registration unit that registers the document information collected by the document collection unit;

a document indexing unit that extracts the index words from the document information of the document registration unit and stores them in the index database; and

a database generation unit that retrieves, from the index words stored in the index database, the index word information to be provided as an autocomplete list, records it in the autocomplete database, and updates and manages the frequency information.

The system of claim 4, wherein the document indexing unit

extracts additional information including the index word information by one or more of: extracting the index words from the document information registered in the document registration unit, and extracting index word information designated by text processing.

The system of claim 4, wherein the document indexing unit

extracts the index words from the document information registered in the document registration unit using one method selected from a morphological-analysis method and an N-gram method, and stores them in the index database.

The system of claim 6, wherein the document indexing unit

records and stores the additional information including the extracted index words in the index database in association with the document information.

The system of claim 4, wherein the document index server

further comprises a document editing unit that modifies and deletes the document information registered in the document registration unit.

The system of claim 4, wherein the document indexing unit

removes unnecessary index words included in a stopword dictionary from the index words extracted from the document information registered in the document registration unit.

The system of claim 4, wherein the database generation unit

cumulatively calculates the frequency information over the document information and records it in the autocomplete database for each autocomplete list candidate.

The system of claim 10, wherein the database generation unit

excludes an index word from the autocomplete list candidates when the accumulated value of its frequency information is 0.

The system of claim 1, wherein the autocomplete server comprises:

a query input unit that receives, through the user interface, the query word to be searched and converts it into the index word;

a database search unit that searches the autocomplete database for the index word from the query input unit;

an index word determination unit that checks the frequency information of the index words stored in the autocomplete database and determines which of them to provide as the autocomplete list information;

a presentation unit that converts the autocomplete list information provided by the index word determination unit into query words and provides them to the user interface;

a selection unit that presents the input query word and the query words of the autocomplete list from the presentation unit on the user interface, receives the selected query word information together with an event signal, and converts it into the index word; and

a service interlocking unit that searches for and provides the document information according to the index word information and the search event signal received from the selection unit.

The system of claim 12, wherein the query input unit

receives the query word in units of any one selected from phoneme, syllable, word phrase (eojeol), and word.

The system of claim 13, wherein the query input unit

calls the autocomplete database using AJAX each time a unit of the query word is input.

The system of claim 12, wherein the query input unit

receives the query word information through the user interface (UI).

The system of claim 12, wherein the database search unit

searches for the index word by forward matching and by backward matching, respectively.

The system of claim 12, wherein the index word determination unit

determines to provide in the autocomplete list those index words whose frequency information is 1 or more.

The system of claim 12, wherein the presentation unit

adjusts the ranking within the autocomplete list using one or more selected from input statistics of the query word, frequency information of the query word, and alphabetical (가나다) order of the query word.

The system of claim 12, wherein the service interlocking unit

retrieves the document information for the index word information through an API call.

An autocomplete method for each query type with guaranteed search results, comprising:

(a) collecting and registering document information, extracting index word information and frequency information from the registered document information and storing them in an index database, generating autocomplete list information from the index word information and storing it in an autocomplete database, and modifying the registered document information;

(b) converting a query word input through a user interface into an index word, searching the autocomplete database for it, and including in the autocomplete list information the index words found with a frequency of 1 or more; and

(c) converting the index words of the autocomplete list into query words and presenting them on the user interface, converting a query word selected and input from outside through the user interface into an index word, and outputting the document information retrieved for the converted index word.

The method of claim 20, wherein step (a) comprises:

collecting the document information by a document collection unit;

registering the collected document information by a document registration unit;

extracting the index words from the registered document information by a document indexing unit and storing them in the index database;

extracting, by a database generation unit, the index words to be provided in the autocomplete list from the index word information stored in the index database and storing them in the autocomplete database; and

modifying or deleting the registered document information by a document editing unit.

The method of claim 21, wherein

the index words are extracted using one of a morphological-analysis indexing method and an N-gram indexing method.

The method of claim 21,

wherein in step (a)

the index words stored in the index database include additional information, unnecessary index words are removed using a stopword dictionary, and the frequency information updated for each piece of document information is stored in the autocomplete database.

The method of claim 20, wherein the query word in step (b)

is input in any one unit selected from phoneme, syllable, word phrase (eojeol), and word.

The method of claim 24, wherein the input query word

is converted into an index word and searched for by calling the autocomplete database using AJAX.

A recording medium recording a program source of an autocomplete method for each query type with guaranteed search results, the method comprising:

a document information collection and registration process of collecting and registering document information, extracting index word information and frequency information from the registered document information and storing them in an index database, generating autocomplete list information from the index word information and storing it in an autocomplete database, and modifying the registered document information;

an index word extraction and storage process of converting a query word input through a user interface into an index word, searching the autocomplete database for it, and including in the autocomplete list information the index words found with a frequency of 1 or more; and

a search process of converting the index words of the autocomplete list into query words and presenting them on the user interface, converting a query word selected from outside through the user interface into an index word, and outputting the document information retrieved for the converted index word.

The recording medium of claim 26, wherein the index word extraction and storage process adjusts the ranking by one or more selected from input statistics of the query word, frequency information of the query word in the autocomplete database, and alphabetical (가나다) order of the query word.

The recording medium of claim 26, wherein the search process calls and searches the autocomplete database using AJAX each time input is made in any one unit selected from phoneme, syllable, word phrase (eojeol), and word.

A server system that registers document information, extracts index words and frequency information from the registered document information to build an autocomplete database, converts a query word input from outside through a user interface into an index word, provides through the user interface an autocomplete list of the index words in the autocomplete database that include the converted index word, converts a query word selected and input through the user interface into an index word, and provides the document information containing the converted index word to the user interface;

A public communication network connected to the server system and transmitting and receiving data of the query word and the retrieved document information through a communication path selected from a wired communication path and a wireless communication path; and

A terminal device that receives the query word to be searched through the user interface, accesses the public communication network, transmits the query word to the server system, displays on the user interface the autocomplete list for the query word provided by the server system, inputs the selected query word to the server system together with an event signal, and displays the document information provided by the server system; an autocomplete system for each query type with guaranteed search results, comprising the above.

The autocomplete system of claim 29, wherein the server system comprises:

A document index server that registers the document information, extracts the index word and the frequency information, and builds the autocomplete database;

An autocomplete server that receives the query word input from outside through the user interface, converts it into the index word, extracts index word list information including the index word, converts the listed index words back into query words and provides them to the user interface, and converts the selected and input query word into the index word to provide the document information;

The autocomplete database, which records the autocomplete list information generated by the document index server in association with the frequency information;

A document collecting unit that accesses the document index server and registers the collected document information; and

An index database that records the index word information provided from the document index server and provides the index word information in response to searches by the autocomplete server.

The autocomplete system of claim 29, wherein the public communication network comprises:

A wireless communication network that connects the server system and the terminal device over the wireless communication path and transmits the data signal; and

A wired communication network that connects the server system and the terminal device over the wired communication path and transmits the data signal.