KR20100029581A

KR20100029581A - Recommended search terms providing system and method for each user and computer readable medium processing the method

Info

Publication number: KR20100029581A
Application number: KR1020080088416A
Authority: KR
Inventors: 임성욱
Original assignee: 에스케이커뮤니케이션즈 주식회사 (SK Communications Co., Ltd.)
Priority date: 2008-09-08
Filing date: 2008-09-08
Publication date: 2010-03-17
Also published as: KR101453382B1

Abstract

The present invention relates to a per-user search term recommendation system and method, and a computer-readable recording medium implementing the method, that simplify search term entry by providing an autocomplete list that takes the user's personal tendencies into account as the user types. Collaborative filtering based on each individual's search term input information is applied to select predicted search terms that match the user's tendencies, and the autocomplete list presented to the user is built from those selected terms. Because terms matching the user's tendencies are selected and offered automatically, including terms the user has never entered, search terms can be recommended more accurately for each user.

Description

RECOMMENDED SEARCH TERMS PROVIDING SYSTEM AND METHOD FOR EACH USER AND COMPUTER READABLE MEDIUM PROCESSING THE METHOD

The present invention relates to a per-user search term recommendation system and method, and more particularly to a per-user search term recommendation system and method, and a computer-readable recording medium implementing the method, that make search term entry easier by providing an autocomplete list that takes the user's personal tendencies into account when the user enters a search term.

With the rapid development of the Internet, a vast amount of information is offered to users, and finding the desired information has become progressively harder. Even when the desired information is found, the time it takes to find it (that is, the information retrieval time) has increased. Searching for information on the Internet has become an everyday activity, and the time spent on it takes up a share of daily life that cannot be ignored, so various studies aimed at making search more convenient and more efficient are under way and their results are being put into practice.

In particular, as search portals, shopping malls, and other sites that present search as a flagship feature have grown in influence, they are competing ever more intensely to reduce user inconvenience and improve functionality, so as to raise user satisfaction with their services and hold on to market share. One example is the search term autocomplete service, which infers the term the user intends to search for from what has been typed so far and offers a list of completed search terms, reducing the time spent typing and showing the exact form of the term in advance. For example, when a user types "네이" into the search box, a list of completed search terms such as "네이트" (Nate) and "네이트온" (NateOn) is offered automatically, and the user can submit whichever one they select as the search term.

Besides the convenience of reducing manual input, the autocomplete service serves as a useful search term guide when the user lacks exact information about what to search for, knows only part of the term, or when a more effective search term exists. It has therefore recently been adopted in the search boxes of most search portals.

However, most current autocomplete services compute how often all users use each search term and offer the autocomplete list in descending order of search frequency. Such a uniform list is convenient when the term the user wants appears in it, but it is of no use when the list does not contain that term.

Accordingly, ways of introducing personalization into the provision of a predicted list of completed search terms based on the user's input have been studied.

For example, Korean Patent Registration No. 10-0754768, "System and method for providing customized recommendation words for each user, and computer-readable recording medium storing a program for executing the method", discloses a technique that logs the user's search term input on the terminal and extracts and provides per-user recommendation words based on that terminal log. That is, the technique offers autocompletion based on search terms the user has already entered: it indexes those terms and, in response to the user's input, produces a list of recommended terms by prefix matching, suffix matching, and similar lookups. Because the recommendations are limited to terms the user has entered, this approach may succeed at personalization, but it cannot recommend a term the user is searching for for the first time. When the user enters an unfamiliar or new search term, it frequently suggests terms less effectively than a conventional autocomplete service. It merely shortens the time needed to type terms the user already uses often, and so fails to achieve the basic purpose of guiding the user toward the desired term among a large, unspecified set of search terms.

Embodiments of the present invention are newly proposed to solve the problems of the conventional uniform autocomplete service and of the limited personalized autocomplete service. One object of these embodiments is to provide a per-user search term recommendation system and method, and a computer-readable recording medium implementing the method, that maximize user convenience by recommending search terms the user has never entered while still reflecting the user's personal characteristics, so that the term the user wants appears in the autocomplete list with higher probability.

Another object of the embodiments is to provide a per-user search term recommendation system and method, and a computer-readable recording medium implementing the method, that recommend search terms more accurately for each user. Collaborative filtering based on each individual's search term input information is applied to select predicted search terms matching the user's tendencies, and the autocomplete list is provided from those selected terms, so that terms matching the user's tendencies, including terms the user has never entered, are selected and offered automatically.

Yet another object of the embodiments is to provide a per-user search term recommendation system and method, and a computer-readable recording medium implementing the method, that recommend search terms more accurately for each user by enabling personalization that varies over time. In addition to collaborative filtering, the weights of each user's predicted search terms are varied on the basis of time, for example so that the area of interest changes with time or so that recently searched content is weighted more heavily.

Yet another object of the embodiments is to provide a per-user search term recommendation system and method, and a computer-readable recording medium implementing the method, that improve the accuracy of the autocomplete list by adding a function that filters out abnormal behavior and meaningless words in search term input.

To achieve the above objects, a per-user search term recommendation system according to one embodiment of the present invention comprises: a search term collaborative filtering unit that receives per-user search term information and calculates a preference for each search term, calculates search term similarity from users' search term usage information, and generates per-user predicted search terms; a per-user predicted search term database that stores the per-user predicted search terms generated by the collaborative filtering unit; and a per-user predicted search term selection unit that, in response to a user's search term input, provides a list of recommended search terms for that user from the per-user predicted search term database.

The collaborative filtering unit further comprises a search term filtering unit that, as per-user search term information is received, filters out meaningless search terms and words that carry no meaning as search terms.

The collaborative filtering unit further comprises a per-user search term preference calculation unit that, when a search term is received, calculates the preference by applying a weight to the term based on its reception time.

The per-user search term preference calculation unit determines the user's attributes from the user's personal and regional information, assigns a per-attribute weight to the received search term information, and then calculates the preference.

The collaborative filtering unit comprises a search term similarity calculation unit that, from users' search term usage information, takes two or more different search terms used consecutively by the same user within a preset time interval, determines each selectable combination of two of those terms to be a similar search term pair, and calculates search term similarity by applying an attribute weight proportional to the similarity of the categories to which the pair belongs.

The search term similarity calculation unit calculates search term similarity by further applying a time weight inversely proportional to the interval between the same user's uses of the two terms in the pair.

The search term similarity calculation unit calculates search term similarity by applying a search term association degree preset for the pair as an association weight.
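The similarity calculation described in the three preceding paragraphs can be sketched as follows. This is only an illustration under stated assumptions: the function name `pair_similarity`, the log layout, the `1 / (1 + gap)` form of the time weight (chosen so that a zero gap does not divide by zero), and the choice to multiply the three weights together are all hypothetical, since the text does not specify how the weights are combined.

```python
from collections import defaultdict
from itertools import combinations


def pair_similarity(log, window, category_sim, association=None):
    """Score search term pairs from a usage log.

    log          -- list of (user, term, timestamp_in_seconds) tuples
    window       -- preset time interval; pairs further apart are ignored
    category_sim -- callable (term_a, term_b) -> category similarity weight
    association  -- optional {frozenset({a, b}): preset association weight}
    """
    association = association or {}
    by_user = defaultdict(list)
    for user, term, ts in log:
        by_user[user].append((ts, term))

    scores = defaultdict(float)
    for events in by_user.values():
        events.sort()  # chronological order per user
        for (t1, a), (t2, b) in combinations(events, 2):
            gap = t2 - t1
            if a == b or gap > window:
                continue  # only different terms used close together form a pair
            pair = frozenset((a, b))
            time_w = 1.0 / (1.0 + gap)            # shrinks as the gap grows (assumed form)
            attr_w = category_sim(a, b)           # attribute weight from category similarity
            assoc_w = association.get(pair, 1.0)  # preset association weight, default neutral
            scores[pair] += time_w * attr_w * assoc_w
    return dict(scores)
```

Pairs are keyed by `frozenset` so that (a, b) and (b, a) accumulate into the same score regardless of the order in which a user searched for them.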

The system further comprises a user identification unit that identifies the user through at least one of the user's ID, a network identifier, and a cookie, and provides the user identification information to the collaborative filtering unit and the per-user predicted search term selection unit.

A per-user search term recommendation method according to another embodiment of the present invention comprises: a search term collaborative filtering step of receiving per-user search term information and calculating a preference for each search term, calculating search term similarity from users' search term usage information, and generating per-user predicted search terms; and an autocomplete list providing step of providing, in response to a user's search term input, a list of recommended search terms for that user from the per-user predicted search term database.

In addition, a recording medium according to yet another embodiment of the present invention stores a program that performs the per-user search term recommendation method described above.

Such a recording medium includes every kind of recording medium on which programs and data are stored so as to be readable by a computer system. Examples include ROM (Read Only Memory), RAM (Random Access Memory), CD (Compact Disk), DVD (Digital Video Disk)-ROM, magnetic tape, floppy disks, and optical data storage devices, as well as implementations in the form of a carrier wave (for example, transmission over the Internet). The recording medium may also be distributed across computer systems connected by a network, so that computer-readable code is stored and executed in a distributed manner.

The per-user search term recommendation system and method according to embodiments of the present invention, and the computer-readable recording medium implementing the method, maximize user convenience: they can recommend search terms the user has never entered while still reflecting the user's personal characteristics, so that the term the user wants appears in the autocomplete list with higher probability.

The per-user search term recommendation system and method according to embodiments of the present invention, and the computer-readable recording medium implementing the method, recommend search terms more accurately for each user: collaborative filtering based on each individual's search term input information is applied to select predicted search terms matching the user's tendencies, and the autocomplete list is provided from those selected terms, so that terms matching the user's tendencies, including terms the user has never entered, are selected and offered automatically.

The per-user search term recommendation system and method according to embodiments of the present invention, and the computer-readable recording medium implementing the method, enable personalization that varies over time: in addition to collaborative filtering, the weights of each user's predicted search terms are varied on the basis of time, for example so that the area of interest changes with time or so that recently searched content is weighted more heavily.

The per-user search term recommendation system and method according to embodiments of the present invention, and the computer-readable recording medium implementing the method, improve the accuracy of the autocomplete list by adding a function that filters out abnormal behavior and meaningless words in search term input.

The per-user search term recommendation system and method according to embodiments of the present invention, and the computer-readable recording medium implementing the method, calculate per-user preference using the user's attributes so that the tendencies of users belonging to minority groups are taken into account, and can therefore recommend search terms accurately to users in minority groups as well.

The present invention as described above is explained in detail below with reference to the accompanying drawings and embodiments.

FIG. 1 shows an example of a per-user search term recommendation system according to an embodiment of the present invention. As shown, the system comprises: a web browser (10) through which the user enters search terms and receives, via the search box, an autocomplete list based on the characters typed in real time; a web server (20) that communicates with the web browser (10) to provide a search service; a search term autocomplete server (30) that provides an autocomplete list specialized for the individual user according to the search term information or real-time input character information supplied through the web server (20); and a database (40) comprising a search term usage information (log) database (41) of users, on which the autocomplete server (30) bases its per-user recommendations, a per-user predicted search term database (42) generated for each user from per-user preference and search term similarity, and a search term association database (43) in which the degrees of association between search terms are preset.

The search term autocomplete server (30) uses users' search term usage information to select search terms highly similar to a term received from a particular user, and generates predicted search terms for that user by means of an improved collaborative filtering scheme that calculates search term preference according to that user's search environment. It then filters the predicted search terms using the initial consonants (초성), characters, or words of the search term being typed in the web browser (10), orders them from highest preference down, and provides them to the user as the autocomplete list.
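The final filtering step, matching the typed characters or initial consonants against a user's predicted terms and ordering the hits by preference, might look like the sketch below. The function names, the `{term: preference}` layout, and the preference values in the usage example are hypothetical. The initial-consonant extraction relies on the standard arithmetic layout of precomposed Hangul syllables in Unicode (starting at U+AC00, 588 code points per initial consonant).

```python
# The 19 Hangul initial consonants, in Unicode syllable-block order.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"


def initials(text):
    """Replace each precomposed Hangul syllable with its initial consonant."""
    out = []
    for ch in text:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:  # 11172 precomposed syllables: U+AC00..U+D7A3
            out.append(CHOSEONG[code // 588])  # 588 = 21 vowels * 28 finals
        else:
            out.append(ch)  # leave non-Hangul characters untouched
    return "".join(out)


def autocomplete(typed, predicted, limit=10):
    """Return predicted terms matching `typed`, highest preference first.

    predicted -- {term: preference} for one user (hypothetical layout)
    """
    def matches(term):
        return term.startswith(typed) or initials(term).startswith(typed)

    hits = [term for term in predicted if matches(term)]
    hits.sort(key=lambda term: (-predicted[term], term))  # ties broken alphabetically
    return hits[:limit]
```

With made-up preferences `{"네이트": 0.9, "네이트온": 0.7, "날씨": 0.8}`, typing "네이" yields "네이트" then "네이트온", while typing only the initial consonant "ㄴ" matches all three terms.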

That is, the autocomplete server (30) builds the autocomplete list neither from a uniform set of the latest search terms nor only from terms the user has entered. Instead, it uses the improved collaborative filtering scheme to provide the list in a way that supports both personalization and search term guidance.

Here, collaborative filtering refers to a method that uses the similarity between users or between data items when deciding which data to recommend to users. Collaborative filtering is broadly divided into "user-to-user collaborative filtering", which determines recommendations from the similarity between users; "item-to-item collaborative filtering", which determines recommendations from the similarity between data items; and collaborative filtering that combines the two. User-to-user collaborative filtering uses information supplied by users (for example, personal details and preferences) to group users who show similar patterns, and then determines the recommendations for a particular user from the data usage information of the other users in the same group. In other words, cross-recommendation is performed among the users in a group. For example, in a recommendation system using user-to-user collaborative filtering, if "User 1" and "User 2" belong to "Group 1" and "User 1" has used "Data 2", the system includes "Data 2" in the recommendations for "User 2".
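The user-to-user example above can be reproduced with a few lines of code. This is a minimal sketch, assuming groups have already been formed; the function name and the dictionary layouts are hypothetical.

```python
def user_to_user_recommend(groups, usage, target):
    """Recommend to `target` whatever other members of its groups have used.

    groups -- {group_name: set of user names}
    usage  -- {user name: set of data items that user has used}
    """
    recommendations = set()
    for members in groups.values():
        if target in members:
            for other in members - {target}:
                recommendations |= usage.get(other, set())
    # Do not recommend items the target has already used.
    return recommendations - usage.get(target, set())
```

Calling it with `groups = {"Group 1": {"User 1", "User 2"}}` and `usage = {"User 1": {"Data 2"}}` for target "User 2" returns `{"Data 2"}`, matching the example in the text.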

Item-to-item collaborative filtering, on the other hand, determines the recommendations for a particular user from pairs of data items that the same user has used together. For example, in a recommendation system using item-to-item collaborative filtering, if "User 1" has used "Data 1" and "Data 2" together, then when "User 2" uses "Data 1" the system includes "Data 2" in the recommendations for "User 2".
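The item-to-item example can be sketched in the same spirit. The helper names and the representation of "used together" as one set of items per user session are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def co_usage(sessions):
    """Build {item: set of items used together with it}.

    sessions -- iterable of sets, each holding items one user used together
    """
    related = defaultdict(set)
    for items in sessions:
        for a, b in combinations(sorted(items), 2):
            related[a].add(b)
            related[b].add(a)
    return related


def item_to_item_recommend(related, used):
    """Recommend items that were co-used with anything in `used`."""
    recommendations = set()
    for item in used:
        recommendations |= related.get(item, set())
    return recommendations - set(used)
```

If User 1's session is `{"Data 1", "Data 2"}`, then a User 2 who has used only "Data 1" is recommended `{"Data 2"}`, as in the text.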

However, this general collaborative filtering approach gives no consideration to minority user groups, so accurate recommendations for users in such groups are difficult. It also takes in inaccurate information, which can lead to inaccurate results, and it is insensitive to changes in a user's preferences over time, which can likewise lead to inaccurate results.

Accordingly, this embodiment provides a personalized autocomplete service using a collaborative filtering scheme that takes changes in tendencies over time into account for users in both minority and majority user groups, and from which erroneous information has been removed beforehand.

By applying this improved collaborative filtering scheme, the autocomplete server (30) generates per-user predicted search terms from search term preferences that reflect the user's tendencies and from similar search terms based on the similarity between terms. As a result, even terms the user has never entered are offered as autocomplete candidates if they are highly similar to preferred terms the user entered earlier.
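One way to picture how preference and similarity combine into predicted search terms is the sketch below. The scoring rule (each similar term receives the preference of the originating term multiplied by their similarity) is an assumption, as are the function name and data layouts; the text states only that both quantities are used.

```python
def predict_terms(user_prefs, similarity, top_n=5):
    """Rank a user's own terms together with similar, possibly unseen terms.

    user_prefs -- {term: preference} for terms the user has entered
    similarity -- {term: {other_term: similarity score}}
    """
    scores = dict(user_prefs)  # the user's own terms keep their preference
    for term, pref in user_prefs.items():
        for other, sim in similarity.get(term, {}).items():
            # A similar term inherits part of the preference (assumed rule).
            scores[other] = scores.get(other, 0.0) + pref * sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

With made-up inputs `user_prefs = {"샐러드": 1.0}` and `similarity = {"샐러드": {"오리엔탈 드레싱": 0.8, "오렌지": 0.5}}`, the terms "오리엔탈 드레싱" and "오렌지" enter the ranking even though this user never typed them.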

For example, if a user who usually looks up vegetable-based cooking types "오" into the search box, an autocomplete list such as "오렌지 (orange), 오얏 (plum), 오리엔탈 드레싱 (oriental dressing)..." may be provided, whereas if a user who mainly searches for Internet shopping or news types "오", a list such as "옥션 (Auction), 오마이뉴스 (OhmyNews), 오늘의 날씨 (today's weather)..." may be provided. Different autocomplete terms are thus recommended to different users, and they keep changing with each user's search term input.

FIG. 2 shows an embodiment with a more detailed configuration of the autocomplete server (30). As shown, the server comprises: a search term filtering unit (31) that obtains the per-user search term information supplied by the web server (20) and filters out search terms that would cause inaccurate results; a per-user search term preference calculation unit (32) that calculates per-user search term preference using the filtered search term, the time at which it was received, and environment information about the user (gender, age, interests, region, and so on); a search term similarity calculation unit (33) that, based on per-user search term usage behavior recorded in the search term database (41) storing users' search term usage information (logs), selects search terms highly similar to the received term; a per-user predicted search term generation unit (34) that uses the preference information from the preference calculation unit (32) and the similarity information from the similarity calculation unit (33) to select the search terms to recommend to the user, generates predicted search terms as data that include the preference and similarity information, and stores them per user in the per-user predicted search term database (42); and a per-user predicted search term selection unit (35) that, according to the characters the user types into the search box in real time (initial consonants, individual letters, words, and so on) as relayed by the web server (20), retrieves the predicted search terms for that user from the per-user predicted search term database (42), processes them into autocomplete list information, and provides it to the web browser (10) through the web server (20).

Because the autocomplete server (30) operates on a per-user basis, it must determine which user is using the search service. In the illustrated case, a user identification unit (36) is configured in the autocomplete server (30). It determines information about the user from the user's input (user-entered identification such as the user ID, terminal-stored information such as cookies, network identification such as the IP address or the MAC address of the network card, and so on) and provides it to the search term filtering unit (31) and the per-user predicted search term selection unit (35). Note, however, that the user identification unit (36) may instead be configured in the web server (20), in which case the autocomplete server (30) obtains the user identification information through the web server (20).
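A fallback chain over the identifiers listed above could be sketched as follows. The priority order (user ID, then cookie, then IP address) and the request dictionary keys are assumptions; the text says only that at least one of these identifiers is used.

```python
def identify_user(request):
    """Return (kind, value) for the first available identifier, else None.

    request -- dict with optional 'user_id', 'cookie', and 'ip' keys
               (hypothetical field names)
    """
    for kind in ("user_id", "cookie", "ip"):  # assumed priority order
        value = request.get(kind)
        if value:
            return (kind, value)
    return None
```

Returning the kind along with the value lets later stages treat a logged-in user differently from one known only by network address.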

The configuration of the autocomplete server (30) is now examined in more detail.

First, the search term filtering unit (31) can improve the accuracy of collaborative filtering by filtering out, as per-user search term information is received, meaningless search terms and words that carry no meaning as search terms. For example, it removes unnecessary noise by filtering out meaningless search terms, such as terms in which English is typed as Hangul or Hangul is typed as English letters, terms containing typos, and mistyped terms that return no search results, as well as words within a search term that are grammatical particles or that are used so frequently that the user's intent is hard to discern. Beyond these cases, situations such as the same search term being entered too many times within a limited period, or search terms arriving in a manner no human could produce, may also be judged to be errors and filtered out.
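Two of the filtering rules above, dropping stopword-like terms and rejecting the same term when it is repeated too often within a limited period, are simple enough to sketch. The class name, thresholds, and stopword set are hypothetical; detecting wrong-keyboard-layout input or typos would need additional logic that is not shown here.

```python
from collections import defaultdict, deque


class QueryFilter:
    """Reject empty or stopword terms and over-frequent repeats (illustrative)."""

    def __init__(self, stopwords, max_repeats=3, window=60.0):
        self.stopwords = set(stopwords)
        self.max_repeats = max_repeats  # allowed repeats per window (assumed value)
        self.window = window            # window length in seconds (assumed value)
        self.history = defaultdict(deque)

    def accept(self, user, term, ts):
        """Return True if the term should be passed on to collaborative filtering."""
        term = term.strip()
        if not term or term in self.stopwords:
            return False
        recent = self.history[(user, term)]
        # Forget submissions that have fallen out of the time window.
        while recent and ts - recent[0] > self.window:
            recent.popleft()
        recent.append(ts)
        return len(recent) <= self.max_repeats
```

Rejected submissions still count toward the repeat limit, so a burst of identical queries stays blocked until the window has passed.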

A collaborative filtering unit can be formed by optionally combining the search word filtering unit (31) and the per-user predicted search word generation unit (34) with the per-user search word preference calculation unit (32) and the search word similarity calculation unit (33). The preference calculation unit (32) and the similarity calculation unit (33), which perform the actual collaborative filtering, may be configured as follows.

First, the per-user search word preference calculation unit (32) applies a time weight based on the reception time of each search word received through the search word filtering unit (31), and calculates preference so that search words searched recently and frequently receive a higher preference. If the user has connected by entering an ID, previously surveyed or accumulated information on that user's tendencies is available and can be applied as a user tendency weight when calculating preference. If the user has not connected by entering an ID, whatever information is available (access region, time, and so on) can be applied as the tendency weight instead. This tendency weight is not a simple numerical value; it means that preference results are kept separately for particular attributes. Tendencies are identified for similar combinations of user attributes such as gender, age, region, and inclination, so that the tendencies of users who belong to minority groups can also be reflected.

The time weight can be applied in several ways. One approach gives a high weight to recently searched words and lowers that weight as time passes, so that recent search words are preferred. Another applies a weight according to the difference between the time of day at which a search word was previously received and the time of day at which it was most recently received, so that search words typically used at similar times of day are preferred. A third applies a weight according to the usage frequency of identical or similar search words, or the interval between their uses, so that frequently searched words are preferred.

The search word similarity calculation unit (33) examines the users' search word usage information stored in the search word database (41). From two or more different search words used consecutively by the same user within a preset time interval, it treats each selectable combination of two search words as a similar search word pair, and it calculates search word similarity by applying an attribute weight proportional to the similarity of the categories to which the pair belongs.

The search word similarity calculation unit may additionally apply a time weight inversely proportional to the interval between the same user's uses of the two search words in the pair. In this case, it is preferable to determine two-word combinations as similar search word pairs only when the number of times the two or more different search words have been used consecutively by the same user within the preset time interval is at least a preset count.
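The pair-based similarity calculation can be sketched as follows. This is one possible reading, not the claimed implementation: the log layout, the `category_sim` callback, the clamping of very short gaps to one second, and the aggregation of the minimum count across all users are assumptions made for illustration.

```python
from collections import defaultdict


def pair_similarity(logs, max_gap, min_count=1, category_sim=lambda a, b: 1.0):
    """Score similar search word pairs from per-user usage logs.

    logs: {user_id: [(timestamp_seconds, query), ...]}
    max_gap: the preset time interval, in seconds
    min_count: pairs observed fewer times than this are dropped
    category_sim: hypothetical callback returning the category similarity
                  of two search words (the attribute weight)
    """
    counts = defaultdict(int)
    scores = defaultdict(float)
    for entries in logs.values():
        entries = sorted(entries)
        for i, (t1, q1) in enumerate(entries):
            for t2, q2 in entries[i + 1:]:
                gap = t2 - t1
                if gap > max_gap:
                    break  # later entries are even further away
                if q1 == q2:
                    continue  # only pairs of different search words count
                pair = tuple(sorted((q1, q2)))
                counts[pair] += 1
                # Time weight inversely proportional to the usage interval
                # (clamped to avoid division by zero, an assumption).
                scores[pair] += category_sim(*pair) / max(gap, 1.0)
    return {p: s for p, s in scores.items() if counts[p] >= min_count}
```

With this scoring, a pair used together by more users, and at shorter intervals, accumulates a higher similarity.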

For example, when similarity is calculated for 'search word 1' entered by a user, the similarity can be made higher the more users have used 'search word 1' and 'search word 2' together, the more recently they did so, and the shorter the interval between the two uses.

In addition, similarity between similar search words can also be obtained from a search word association database (43), in which similarities between search words have been computed in advance, but it is preferable to use this in combination with the collaborative filtering approach. That is, for a given similar search word pair, the association degree obtained from the search word association database (43) can be applied to the pair as an association weight when calculating search word similarity.

Here, the search word similarity calculation unit (33) determines the similar content pairs (that is, the similar search word pairs) by clustering, and the clustering may use any algorithm commonly used in collaborative filtering, such as Pearson correlation, the popularity difference method, or cosine similarity.
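Of the measures named above, cosine similarity is the simplest to illustrate. The sketch below assumes, purely for illustration, that each search word is represented by a sparse vector of per-user usage counts; the description does not fix the vector representation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors given as {user_id: usage_count}."""
    shared = u.keys() & v.keys()
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_u * norm_v)
```

Two search words used by exactly the same users in the same proportions score 1, and two search words with no users in common score 0.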

Because calculating similarity from such similar content pairs can yield too many pairs, a pair may be filtered out when the number of users who used it is smaller than a preset minority-user count. If the number of users who used the pair falls within a preset minority-user range and the attributes of those minority users reach at least a certain number, the attributes of those users and the information on the pair may instead be passed to the per-user search word preference calculation unit (32) and managed as preference information for a specific user group.

The user preferences for search words obtained by the per-user search word preference calculation unit (32) and the information on similar search words obtained as similar search word pairs by the search word similarity calculation unit (33) are combined, together with the preference, similarity, and time/attribute/tendency weight information, into per-user predicted search words having a two-dimensional or three-dimensional data format, and these are stored in the per-user predicted search word database (42).

Because the weights of the information stored in the per-user predicted search word database (42) vary over time and users' search word usage accumulates, the information should preferably be kept up to date. The per-user predicted search words may therefore be updated periodically, or whenever a user connects or searches.

The per-user predicted search word selection unit (35) provides a list of recommended search words for the user from the per-user predicted search word database in response to the user's search word input. It selects search words with high preference from the database, selects search words with high similarity to those words, and filters the result by the characters being typed into the search box in real time (initial consonant, single letter, word, final character, and so on) to generate and deliver an autocomplete search word list, so that the predicted autocomplete list appears in the search box.
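The selection step, before the typed characters are applied, can be sketched as follows. The scoring rule used here (preference multiplied by similarity, keeping the best score per candidate) is an assumption; the description only says that high-preference search words and search words highly similar to them are selected.

```python
def recommend(preferences, similar, top_n=5):
    """Rank candidate search words for one user.

    preferences: {query: preference} for search words the user has entered
    similar: {query: {other_query: similarity}} from the similarity calculation
    """
    scores = {}
    for query, pref in preferences.items():
        # The user's own search words are candidates with their preference.
        scores[query] = max(scores.get(query, 0.0), pref)
        # Similar search words inherit a share of that preference (assumed rule).
        for other, sim in similar.get(query, {}).items():
            scores[other] = max(scores.get(other, 0.0), pref * sim)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda q: (-scores[q], q))[:top_n]
```

Because candidates come from the similarity table as well as the user's own history, the ranked list can contain search words the user has never typed.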

An example of a method of recommending per-user search words using the configuration described above is explained with reference to the flowcharts of FIGS. 3 and 4.

FIG. 3 is a flowchart showing an example of the process for generating per-user predicted search words. As illustrated, when a user enters a search word in the search box and submits the search query, the web server or the search word autocomplete server that receives it identifies the user. The autocomplete server, having received the identified user information and the search word, checks the search word for anomalies in the search word filtering unit. The collaborative filtering unit then calculates search word preference after applying weights for time or tendency, and calculates search word similarity from the users' search word usage logs. Time or attribute weights may also be applied when calculating search word similarity.

Per-user predicted search words are generated from the calculated user preference for the search word and the calculated similarity of the search word pairs that include the search word, and are stored in the per-user predicted search word database.

In this way, personalized predicted search word information is accumulated for each user, and it includes search words that the user has not searched for but that are closely related to what the user has searched for.

FIG. 4 is a flowchart showing an example of providing a per-user autocomplete search word list using the per-user predicted search words obtained above. As illustrated, when a user types a search word into the search box through a web browser, the web server or the search word autocomplete server identifies the user. The web server acquires the search word being typed into the search box in real time and passes it to the autocomplete server.

The autocomplete server selects the predicted search words to recommend from the per-user predicted search word database by filtering on the spelling (initial consonant, single letter, word, and so on) of the search word supplied in real time. It then provides the filtered predicted search words, ordered from highest user preference and search word similarity, in a predetermined autocomplete list format.

FIG. 5 illustrates an example of the autocomplete service described above, for the case in which a particular user has entered '외환' (foreign exchange), '옥션' (auction), and '사과' (apple) as search words. These search words can be regarded as words the user is interested in. Expressed simply, the search word information for user i is Wi = {외환, 옥션, 사과}.

Here, the search words with high similarity selected from the search word pairs for each of these search words (that is, the high-similarity search words obtained through collaborative filtering, Wr) can be expressed as follows.

Wr(외환) = {은행 (bank), 돈 (money), 환율 (exchange rate), 통화 (currency), 입금 (deposit)}

Wr(옥션) = {경매 (auction sale), 입금 (deposit), 시장 (market), 은행 (bank)}

Wr(사과) = {시장 (market), 판매 (sale), 배송 (shipping)}

Because a search word need not consist of a single word, the following case is also possible.

Wr(외환, 옥션) = {은행 (bank), 입금 (deposit)}

Accordingly, when the user types "o" (the initial consonant "ㅇ", with which 은행, 입금, 외환, and 옥션 all begin) into the search box, autocomplete search words such as '은행' (bank) and '입금' (deposit) can appear in the autocomplete list along with '외환' and '옥션'. In other words, recommended search words that reflect the user's interests appear even though the user has never entered them, and such search words are more likely to be the ones the user wants. This improves input convenience, and it can also serve to guide the user effectively toward search words that match the user's interests when a very large number of search words exist.
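The example above can be reproduced with a short sketch. It assumes that the typed character written as "o" denotes the Hangul initial consonant ㅇ, and it uses the standard Unicode arithmetic for precomposed Hangul syllables to recover the initial consonant of each candidate; the function names and the matching rule are illustrative assumptions, not part of the description.

```python
# The 19 Hangul initial consonants, in Unicode syllable-block order.
CHOSUNG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"


def initial_consonant(ch):
    """Return the initial consonant of a precomposed Hangul syllable, else ch itself."""
    code = ord(ch) - 0xAC00
    if 0 <= code < 11172:          # 19 initials * 21 vowels * 28 finals
        return CHOSUNG[code // 588]  # 588 = 21 * 28 syllables per initial
    return ch


def matches(typed, candidate):
    """True if each typed character equals the candidate's character or its initial consonant."""
    if len(typed) > len(candidate):
        return False
    return all(t == c or t == initial_consonant(c) for t, c in zip(typed, candidate))


# Data from the example: Wi and Wr for each of the user's search words.
user_terms = ["외환", "옥션", "사과"]
related = {
    "외환": ["은행", "돈", "환율", "통화", "입금"],
    "옥션": ["경매", "입금", "시장", "은행"],
    "사과": ["시장", "판매", "배송"],
}


def autocomplete(typed):
    """List the user's search words and their similar search words matching the typed text."""
    seen, out = set(), []
    for term in user_terms:
        for cand in [term] + related[term]:
            if cand not in seen and matches(typed, cand):
                seen.add(cand)
                out.append(cand)
    return out
```

Calling `autocomplete("ㅇ")` returns 외환, 은행, 입금, and 옥션, which matches the list described in the example; 은행 and 입금 appear even though the user never entered them.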

Meanwhile, the weight based on the time at which each search word was received may be set to 10 at the current time and reduced by 0.3 for each day that passes, on the assumption that interest declines over time. Accordingly, '외환' (foreign exchange) or '옥션' (auction) and their similar search words may be offered ahead of '사과' (apple) and its similar search words. The decay of the time weight over time may take various forms, such as an S-shaped or stepped curve.
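The numbers given above can be read as a linear daily decrement, sketched below. The function name is hypothetical, and clamping the weight at zero is an assumption, since the description does not say what happens once the weight would turn negative.

```python
def time_weight(days_old, initial=10.0, daily_drop=0.3):
    """Time weight of a search word received days_old days ago.

    Starts at `initial` and drops by `daily_drop` per day; clamped at zero
    (the clamp is an assumption, not stated in the description).
    """
    return max(0.0, initial - daily_drop * days_old)
```

Under this reading, a search word entered today has weight 10 and one entered five days ago has weight 8.5, so the more recent search word and its similar search words rank first. An S-shaped or stepped decay would replace only the body of this function.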

Preferred embodiments of the present invention have been illustrated and described above. The search word recommendation described above is applicable not only to the Internet environment but also to text messaging on mobile communication terminals, the mobile Internet environment, and elsewhere, so its scope of application is very broad.

However, the present invention is not limited to the embodiments described above, and anyone with ordinary skill in the art to which the invention pertains may make various modifications without departing from the gist of the invention as set out in the appended claims.

FIG. 1 is a block diagram showing the configuration of a per-user search word recommendation system according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of a search word autocomplete server according to an embodiment of the present invention.

FIGS. 3 and 4 are flowcharts showing the operation of the per-user search word recommendation system according to an embodiment of the present invention.

FIG. 5 shows example search words used to explain an embodiment of the present invention.

** Description of reference numerals for the main parts of the drawings **

10: web browser
20: web server
30: search word autocomplete server
31: search word filtering unit
32: per-user search word preference calculation unit
33: search word similarity calculation unit
34: per-user predicted search word generation unit
35: per-user predicted search word selection unit

Claims

A per-user search word recommendation system comprising:

a collaborative filtering unit configured to receive per-user search word information, calculate a preference for each search word, and calculate search word similarity from the users' search word usage information to generate per-user predicted search words;

a per-user predicted search word database that stores the per-user predicted search words generated by the collaborative filtering unit; and

a per-user predicted search word selection unit that provides a list of recommended search words for the corresponding user from the per-user predicted search word database in response to the user's search word input.

The system of claim 1, wherein the collaborative filtering unit further includes a search word filtering unit configured to filter out meaningless search words, or words that are meaningless as search terms, while receiving the per-user search word information.

The system of claim 1, wherein the collaborative filtering unit further includes a per-user search word preference calculation unit that calculates the preference by applying a weight to each search word based on the time at which the search word was received.

The system of claim 3, wherein the collaborative filtering unit varies the time-based weights of the search words in the per-user predicted search word database as time passes.

The system of claim 3, wherein the per-user search word preference calculation unit determines the user's attributes from the user's personal and regional information, applies a weight to the received search word information for each user attribute, and then calculates the preference.

The system of claim 1, wherein the collaborative filtering unit further includes a search word similarity calculation unit that determines, from the users' search word usage information, each selectable combination of two search words among two or more different search words used consecutively by the same user within a preset time interval to be a similar search word pair, and calculates search word similarity by applying an attribute weight proportional to the similarity of the categories to which the search word pair belongs.

The system of claim 6, wherein the search word similarity calculation unit calculates the search word similarity by further applying a time weight inversely proportional to the interval between uses of the search word pair by the same user.

The system of claim 6, wherein the search word similarity calculation unit calculates the search word similarity by applying a previously determined search word association degree to the search word pair as an association weight.

The system of claim 1, wherein the collaborative filtering unit updates the per-user predicted search word data periodically or upon a user's access.

The system of claim 1, wherein the per-user predicted search word selection unit selects search words with high preference from the per-user predicted search word database and selects search words with high similarity to those search words.

The system of claim 1, further comprising a user determination unit that identifies the user through at least one of an ID, a network identifier, and a cookie, and provides user identification information to the collaborative filtering unit and the per-user predicted search word selection unit.

A per-user search word recommendation method comprising:

a collaborative filtering step of receiving per-user search word information, calculating a preference for each search word, and calculating search word similarity from the users' search word usage information to generate per-user predicted search words; and

an autocomplete list providing step of providing a list of recommended search words for the user from the per-user predicted search word database in response to the user's search word input.

The method of claim 12, wherein the collaborative filtering step further comprises filtering out meaningless search words, or words that are meaningless as search terms, while receiving the per-user search word information.

The method of claim 12, wherein the collaborative filtering step further includes a preference calculation step of applying a weight to each search word based on the time at which the search word was received, and calculating the preference for the search word based on that time weight.

15. The method of claim 14, wherein the collaborative filtering step further comprises varying the time-based weights of the per-user predicted search words as time passes.

The method of claim 14, wherein the preference calculation step further includes determining the user's attributes from the user's personal and regional information, applying a weight to the received search word information for each user attribute, and then calculating the preference.

The method of claim 12, wherein the collaborative filtering step further includes a similarity calculation step of determining, from the users' search word usage information, each selectable combination of two search words among two or more different search words used consecutively by the same user within a preset time interval to be a similar search word pair, and calculating search word similarity by applying an attribute weight proportional to the similarity of the categories to which the search word pair belongs.

The method of claim 17, wherein the similarity calculation step further comprises calculating the search word similarity by further applying a time weight inversely proportional to the interval between uses of the search word pair by the same user.

The method of claim 17, wherein the similarity calculation step comprises calculating the search word similarity by applying the association degree of the search word pair, obtained from a separate search word association database, as an association weight.

The method of claim 12, wherein the collaborative filtering step further comprises updating the generated per-user predicted search word data periodically or upon a user's access.

The method of claim 12, wherein the autocomplete list providing step comprises generating a recommended search word list by selecting, from the per-user predicted search word database, search words with high preference and search words with high similarity to those search words.

The method of claim 12, wherein the autocomplete list providing step further comprises filtering the recommended search word list by the characters entered in real time as the user inputs a search word, and providing the result as an autocomplete search word list.

The method of claim 12, further comprising a user determination step of identifying the user through at least one of a user ID, a network identifier, and a cookie before the collaborative filtering step and the autocomplete list providing step.

A computer-readable recording medium containing a program capable of performing the method of claim 12.