KR102473155B1

KR102473155B1 - Method for providing interactive information service and apparatus therefor

Info

Publication number: KR102473155B1
Application number: KR1020160006090A
Authority: KR
Inventors: 박남현; 김광섭
Original assignee: 주식회사 케이티
Priority date: 2016-01-18
Filing date: 2016-01-18
Publication date: 2022-11-30
Also published as: KR20220162681A; KR102654884B1; KR20170086353A

Abstract

The present invention relates to an interactive information providing service method, and an apparatus therefor, that provides retrieved information to a user based on the user's selected responses to interactive questions. An information providing apparatus according to the present invention includes: a tree generator that generates a question tree based on attribute information defined for each question; a question providing unit that, upon receiving a request for the interactive information providing service from a user terminal, sequentially provides the hierarchical questions formed in the question tree to the user terminal, selecting the sub-question of the next level based on the response information to each question and transmitting it to the user terminal; and an information providing unit that, upon receiving response information to the final question from the user terminal, extracts information corresponding to the response information to the final question and provides it to the user terminal.

Description

Interactive information providing service method and apparatus therefor

The present invention relates to an information providing service method and, more particularly, to an interactive information providing service method, and an apparatus therefor, that provides retrieved information to a user based on the user's selected responses to interactive questions.

Today, with the development of communication devices and mobile communication networks, a vast amount of data exists on the Internet, and users obtain the information they want through search functions.

However, the results of such a manual search method vary with each user's search ability: a user skilled at searching may obtain the desired information immediately, whereas a less skilled user may need a long time to obtain it.

Meanwhile, technology that searches for information within a specific category and provides it to the user has emerged. The patent document below discloses a search method and system using a category identifier.

However, even this category-based search method provides accurate information only when the user enters an appropriate keyword, and returns inaccurate information when the keyword is inappropriate. In addition, when inaccurate information is retrieved, the conventional category-based search method burdens the user with repeatedly changing keywords and searching again.

Korean Registered Patent No. 10-0882437

The present invention has been proposed to solve these conventional problems, and its object is to provide an interactive information providing service method, and an apparatus therefor, that is convenient to use and can provide accurate search information to the user.

Other objects and advantages of the present invention can be understood from the following description and will become more apparent from the embodiments of the present invention. It will also be readily appreciated that the objects and advantages of the present invention can be realized by the means set forth in the claims and combinations thereof.

To achieve the above object, an information providing apparatus according to a first aspect of the present invention includes: a tree generator that generates a question tree based on attribute information defined for each question; a question providing unit that, upon receiving a request for an interactive information providing service from a user terminal, sequentially provides the hierarchical questions formed in the question tree to the user terminal, selecting the sub-question of the next level based on the response information to each question and transmitting it to the user terminal; and an information providing unit that, upon receiving response information to the final question from the user terminal, extracts information corresponding to the response information to the final question and provides it to the user terminal.

To achieve the above object, a method of performing an interactive information providing service in an information providing apparatus according to a second aspect of the present invention includes: generating a question tree based on attribute information defined for each question; receiving a request for the interactive information providing service from a user terminal; sequentially providing the hierarchical questions formed in the question tree to the user terminal, selecting the sub-question of the next level based on the response information to each question and transmitting it to the user terminal; and, upon receiving response information to the final question from the user terminal, extracting information corresponding to the response information to the final question and providing it to the user terminal.

By extracting data based on the user's response information to a plurality of multiple-choice questions and providing it to the user, the present invention has the advantage of making information retrieval more convenient.

In addition, the present invention generates a question tree based on the attribute information of the questions, sequentially provides highly related questions to the user based on this question tree, and provides information to the user based on the user's selected responses to the questions, so that the information the user requires can be retrieved accurately.

The following drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the detailed description, serve to aid further understanding of the technical idea of the present invention; the present invention should therefore not be construed as limited to the matters shown in the drawings.
FIG. 1 is a diagram illustrating a communication system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the configuration of an information providing apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating questions that include attribute information according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a method of generating a question tree in an information providing apparatus according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating, by way of example, a process in which questions are clustered into two groups.
FIG. 6 is a diagram illustrating, by way of example, a process in which a question tree is generated.
FIG. 7 is a flowchart illustrating a method of providing an interactive information providing service to a user based on a question tree according to an embodiment of the present invention.

The above objects, features, and advantages will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, so that those of ordinary skill in the art to which the present invention pertains can readily practice the technical idea of the present invention. In describing the present invention, a detailed description of known technology related to the present invention is omitted where it is judged that it would unnecessarily obscure the gist of the present invention. Hereinafter, a preferred embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating a communication system according to an embodiment of the present invention.

As shown in FIG. 1, a communication system according to an embodiment of the present invention includes a user terminal 100 and an information providing apparatus 200, which communicate with each other through a network 300. The network 300 includes a mobile communication network and a wired Internet network; because this is well-known, conventional technology, a detailed description is omitted.

The user terminal 100 is a communication device carried by the user; it connects to the information providing apparatus 200 and receives the interactive information service. Specifically, the user terminal 100 connects to the information providing apparatus 200, sequentially receives multiple-choice questions, responds to each of them, and receives from the information providing apparatus 200 the data group that matches the responses. A multiple-choice question here is an objective-type question, such as a question with a plurality of options or a binary question for which either "yes" or "no" can be selected. A data group is a set of one or more pieces of data, for example an API data group in which APIs (application programming interfaces) are collected by purpose, a department data group divided by department, or a travel destination data group divided by destination. The user terminal 100 may be a mobile communication terminal such as a smartphone, or a personal computer such as a desktop computer.

The information providing apparatus 200 stores, classified by data type, questions for which attribute information is defined and to which encoding information is assigned, together with the data groups corresponding to those questions. The information providing apparatus 200 also uses a genetic algorithm to separate the questions into optimal clusters and generates a question tree for each data type based on the separated clusters.

The genetic algorithm is a computational model developed from the process of natural evolution and is one of the techniques used to derive an optimal solution. It derives an optimal solution to a given problem through selection, crossover, and mutation operations. In the embodiment of the present invention, the information providing apparatus 200 sets each question as a node and, using the encoding information assigned to each question, repeatedly performs the selection operation, crossover operation, mutation operation, and fitness evaluation, thereby separating the questions into a predetermined number of optimal clusters. Here, the encoding information is identification information of a fixed number of digits assigned to each question; it corresponds to the genetic information (e.g., chromosome information) of the genetic algorithm and is generated in advance and assigned to each question. A decimal number with a fixed number of digits is recorded in the encoding information. Alternatively, data in other formats, such as binary, octal, or hexadecimal, may be recorded in the encoding information.

The information providing apparatus 200 provides a question to the user terminal 100, continues to provide further related questions based on the response information selected at the user terminal 100, and then extracts the information corresponding to the response information to the final question and provides it to the user terminal 100. The information providing apparatus 200 may provide the user terminal 100 with, for example, an API data group, a department data group divided by department, or a travel destination data group divided by destination.

FIG. 2 is a diagram illustrating the configuration of an information providing apparatus according to an embodiment of the present invention.

As shown in FIG. 2, the information providing apparatus 200 according to an embodiment of the present invention includes a question providing unit 210, a storage unit 220, a tree generator 230, and an information providing unit 240. These components may be implemented in hardware, in software, or in a combination of hardware and software.

Based on the question tree formed by the tree generator 230, the question providing unit 210 provides the user terminal 100 with questions, each of which includes the question text and a plurality of options. The question providing unit 210 provides the question corresponding to the root node to the user terminal 100 as the first question. Then, based on the selected response information from the user terminal 100, the question providing unit 210 extracts, from among the child nodes of the current question node, the question of the child node that corresponds to the selected response and provides it to the user terminal 100.

The storage unit 220 is a storage medium such as a memory or a disk device. It stores a plurality of multiple-choice questions for each data type, together with the information (i.e., data groups) corresponding to the questions. The storage unit 220 also stores the question tree for each data type. Each question has attribute information defined for it and encoding information assigned to it, and includes a plurality of options. The attribute information records whether the question is related to each data group and is used to calculate the fitness of a cluster separation.

FIG. 3 is a diagram illustrating questions that include attribute information according to an embodiment of the present invention.

Referring to the example in FIG. 3, each question includes attribute information in the form of multidimensional coordinates. For example, the question "Is it related to land transportation?" has the attribute information (1,1,1,1,1,1,1,1,1,0,0,0,0), and the question "Is it related to sea transportation?" has the attribute information (0,0,0,0,0,0,0,0,0,1,0,0,0). Each piece of attribute information is expressed as a multidimensional coordinate value whose components are either '0' or '1'. Here, '1' is a flag indicating that the question is related to the corresponding data group, and '0' is a flag indicating that it is not. According to the attribute information of the question "Is it related to sea transportation?" in FIG. 3, that question is related to the "ship operation" data group and is not related to any other data group.
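As an illustration of how such attribute information could be read, the following minimal sketch maps a flag vector to the data groups it marks as related. The two vectors are taken from the text above; the function name `related_groups` and every group label other than "ship operation" are hypothetical placeholders, and the position of "ship operation" in the list is inferred from the position of the single '1' in the sea transportation vector.

```python
# Hypothetical group labels: the text names only "ship operation" explicitly.
DATA_GROUPS = (
    [f"group_{i}" for i in range(1, 10)]      # placeholders for groups 1..9
    + ["ship operation"]                      # 10th position, inferred from the '1' flag
    + [f"group_{i}" for i in range(11, 14)]   # placeholders for groups 11..13
)

# Attribute information quoted in the description of FIG. 3.
LAND = (1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0)
SEA = (0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0)


def related_groups(attribute_vector, groups=DATA_GROUPS):
    """Return the names of the data groups whose flag is 1."""
    if len(attribute_vector) != len(groups):
        raise ValueError("attribute vector and group list must have equal length")
    return [name for name, flag in zip(groups, attribute_vector) if flag == 1]


print(related_groups(SEA))
print(len(related_groups(LAND)))
```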

The tree generator 230 uses the genetic algorithm to separate the questions into a predetermined number of optimal clusters, generates a question tree for each data type based on the separated clusters, and stores it in the storage unit 220. That is, using the genetic algorithm, the tree generator 230 generates a question tree that has a hierarchical structure reflecting how the questions are related, and stores it in the storage unit 220.

Specifically, the tree generator 230 initially separates the plurality of questions into a predetermined number of optimal clusters based on factors such as the attribute information and the encoding information of the questions. To do so, the tree generator 230 separates the questions into a predetermined number of arbitrary clusters, calculates the internal similarity and the external similarity of the clusters using Equations 1 and 2 described below, and calculates the final similarity as the product of the internal similarity and the external similarity. It then repeats the selection operation, crossover operation, mutation operation, and fitness evaluation (i.e., evaluation of whether the final similarity has converged) until the final similarity converges, thereby initially separating the questions into optimal clusters.

The selection operation selects, from among the questions, those that will be subject to crossover; uniform proportional roulette wheel selection, tournament selection, rank-based selection, and the like may be used. The crossover operation crosses the selected questions to generate a new object (i.e., encoding information), and the new object inherits genes (i.e., part of the encoding information) from its parents. Cycle crossover, order crossover, arithmetic crossover, heuristic crossover, and the like may be used for the crossover operation. The mutation operation derives a mutant object by changing part of the encoding information of the newly generated object to data (i.e., digits) that was not inherited from the parents (i.e., the questions chosen in the selection operation). The replacement data may be generated by a random number generator.
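The three operations can be sketched on decimal digit-string encodings as follows. This is only a minimal illustration under stated assumptions: tournament selection is one of the techniques named above, but one-point crossover is swapped in here for brevity and is not among the crossover variants the text lists. The function names, the per-individual `fitness` callable, and the sample encodings are hypothetical.

```python
import random


def tournament_select(population, fitness, k=2, rng=random):
    """Tournament selection: draw k individuals at random, keep the fittest."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


def one_point_crossover(parent_a, parent_b, rng=random):
    """Assumed stand-in crossover: prefix of parent_a plus suffix of parent_b.

    Both parents must be equal-length strings of at least two digits.
    """
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(child, parent_a, parent_b, rng=random):
    """Replace one digit with a random digit that neither parent has at that position."""
    pos = rng.randrange(len(child))
    choices = [d for d in "0123456789" if d not in (parent_a[pos], parent_b[pos])]
    return child[:pos] + rng.choice(choices) + child[pos + 1:]


rng = random.Random(0)                     # seeded only to make the sketch repeatable
encodings = ["1234", "5678", "9012"]       # hypothetical encoding information
a = tournament_select(encodings, fitness=int, rng=rng)   # int() is a placeholder fitness
b = tournament_select(encodings, fitness=int, rng=rng)
child = one_point_crossover(a, b, rng=rng)
mutant = mutate(child, a, b, rng=rng)
print(a, b, child, mutant)
```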

The object (i.e., encoding information) for which the mutation operation has been completed then replaces the encoding information of a particular question. Here, the tree generator 230 may identify the question whose encoding information is most similar to the mutated object and change that question's encoding information to the mutated encoding information.

When one cycle of the selection, crossover, and mutation operations is complete, the encoding information of a particular question has been updated. The tree generator 230 then substitutes the attribute information of the questions belonging to the currently separated clusters into Equations 1 and 2 to calculate the final similarity of those clusters and determines whether this similarity is converging. If it is, the tree generator 230 determines the currently separated clusters to be the optimal clusters and stops cluster separation.

Once the optimal clusters have been initially separated, the tree generator 230 creates, as the root node, the question whose attribute information lies close to the center value of the first cluster. The tree generator 230 also creates a particular question from the initially separated second cluster as the second child node of the root node. Here, the tree generator 230 may create, as the second child node of the root node, the question whose attribute information lies closest to the center value of the second cluster.
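Choosing the question nearest to a cluster's center value can be sketched as below. The text does not say which distance measure is used, so Euclidean distance is an assumption here, as is taking the center value to be the component-wise mean; the function names and the three sample questions are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of the attribute vectors (assumed meaning of 'center value')."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def representative(cluster):
    """Return the question whose attribute vector is nearest to the cluster centroid.

    cluster: dict mapping question text -> attribute vector.
    Euclidean distance is an assumption; the source only says 'distance'.
    """
    center = centroid(list(cluster.values()))
    return min(cluster, key=lambda question: math.dist(cluster[question], center))


# Hypothetical cluster with three-dimensional attribute information.
cluster_a = {
    "q1": (1, 1, 0),
    "q2": (1, 0, 0),
    "q3": (1, 1, 1),
}
print(representative(cluster_a))
```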

The tree generator 230 also separates each of the first and second clusters into a predetermined number of sub-clusters through the selection, crossover, and mutation operations of the genetic algorithm. As before, the tree generator 230 separates each optimally separated cluster into arbitrary sub-clusters, calculates the internal similarity and the external similarity of the sub-clusters using Equations 1 and 2, calculates the final similarity as the product of the two, and repeats cluster separation through the genetic algorithm until the final similarity converges, thereby separating each cluster into optimal sub-clusters.

Among the sub-clusters of the first cluster, the tree generator 230 creates the representative question (i.e., the question whose attribute information lies closest to the center value of the sub-cluster) of the sub-cluster in which the relevance flag (i.e., '1') has the highest frequency as the first child node (e.g., the right child node) of the root node. The tree generator 230 also creates the representative question (i.e., the question whose attribute information lies closest to the center value of the sub-cluster) of the sub-cluster of the first cluster in which the relevance flag (i.e., '0') has the lowest frequency as a child node of the first child node. Furthermore, the tree generator 230 creates the representative question of the sub-cluster of the second cluster in which the relevance flag (i.e., '1') has the highest frequency as a child node of the second child node (e.g., the right child node), and creates the representative question of the sub-cluster of the first cluster in which the relevance flag (i.e., '0') has a low frequency as another child node of the second child node.

The tree generator 230 repeats this separation into sub-clusters, and continues to create child nodes for parent nodes, until every question has been created as a node in the question tree.

The number of child nodes corresponds to the number of options in the question. That is, for a question with binary options such as "yes" and "no", at most two child nodes can be created, and for a question with three options, at most three child nodes can be created.

When response information to the last question is received from the user terminal 100, the information providing unit 240 extracts from the storage unit 220 the information (i.e., data group) corresponding to the last question and its response information, and provides that information to the user terminal 100.
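The interplay described for the question providing unit 210 and the information providing unit 240 (start at the root, follow the child node that matches each answer, and return a data group once the last question is answered) can be sketched as follows. This is an illustrative sketch only: the `QuestionNode` class, the `run_dialog` function, the way answers are obtained, and the "land transport (hypothetical)" data group are assumptions rather than structures defined in the text.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionNode:
    text: str
    children: dict = field(default_factory=dict)     # answer -> next QuestionNode
    data_groups: dict = field(default_factory=dict)  # answer -> data group for final answers


def run_dialog(root, answer_fn):
    """Walk the question tree from the root and return the matching data group.

    answer_fn(question_text) stands in for the round trip to the user terminal.
    """
    node = root
    while True:
        answer = answer_fn(node.text)
        if answer in node.children:
            node = node.children[answer]          # descend to the matching child question
        else:
            return node.data_groups.get(answer)   # last question: look up the data group


# Tiny hypothetical tree built from the two example questions of FIG. 3.
sea = QuestionNode(
    "Is it related to sea transportation?",
    data_groups={"yes": ["ship operation"], "no": []},
)
root = QuestionNode(
    "Is it related to land transportation?",
    children={"no": sea},
    data_groups={"yes": ["land transport (hypothetical)"]},
)

scripted = iter(["no", "yes"])
print(run_dialog(root, lambda question: next(scripted)))
```

Keeping the answer source behind a callable makes it easy to replace the scripted answers with a real request/response exchange without touching the traversal.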

FIG. 4 is a flowchart illustrating a method of generating a question tree in an information providing apparatus according to an embodiment of the present invention.

FIG. 5 is a diagram illustrating, by way of example, a process in which questions are clustered into two groups.

FIG. 6 is a diagram illustrating, by way of example, a process in which a question tree is generated.

In the following description with reference to FIGS. 4 and 6, it is assumed that the questions are binary (either-or) questions and that a binary tree is formed accordingly. It is further assumed that a question highly related to the parent node is created as the left child node (i.e., the child node corresponding to an affirmative response) and that a question with low relevance is created as the right child node (i.e., the child node corresponding to a negative response).

Referring to FIGS. 4 to 6, the tree generator 230 initially separates the plurality of questions into two optimal clusters through the selection operation, crossover operation, mutation operation, and fitness evaluation of the genetic algorithm (S401). The tree generator 230 performs the crossover and mutation operations on the questions using the encoding information assigned to each question. Using the genetic algorithm, the tree generator 230 separates the questions into two arbitrary clusters, calculates the internal similarity and the external similarity of the arbitrarily separated clusters using Equations 1 and 2 below, and repeats cluster separation until the final similarity, which is the product of the internal similarity and the external similarity, converges, thereby initially separating the questions into two optimal clusters. In other words, the tree generator 230 separates the clusters by performing the selection, crossover, and mutation operations with the encoding information assigned to each question as the genetic information, and uses the final similarity as the fitness evaluation index.

Here, Fit_intra denotes the internal similarity, k is the number of clusters, C_k is the center value of the k-th cluster, and X_i is the attribute information (i.e., multi-dimensional coordinate values) of the i-th question belonging to the same cluster.

Here, Fit_inter denotes the external similarity, k is the number of clusters, C_k is the center value of the k-th cluster, X_i is the attribute information (i.e., multi-dimensional coordinate values) of the i-th question belonging to each of the other clusters, and d is the maximum value of k.
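The genetic-algorithm bipartition described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the chromosome is assumed to be a 0/1 cluster-assignment vector (the text only says that each question's encoding information serves as the genetic information), truncation selection and single-point crossover are assumed, and convergence is approximated by stopping once the best fitness has not improved for `patience` generations. Equations 1 and 2 are not reproduced in this text, so the fitness is passed in as a function; `stand_in_fitness` is a hypothetical placeholder and is not the product of Fit_intra and Fit_inter.

```python
import random
from typing import Callable, List

Assignment = List[int]  # assignment[i] is the cluster (0 or 1) of question i


def bipartition(
    n_questions: int,
    fitness: Callable[[Assignment], float],
    pop_size: int = 20,
    max_generations: int = 200,
    patience: int = 20,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Assignment:
    """Split n_questions (>= 2) into two non-empty clusters with a simple GA."""
    rng = random.Random(seed)

    def is_valid(a: Assignment) -> bool:
        return 0 < sum(a) < n_questions  # both clusters must be non-empty

    def random_assignment() -> Assignment:
        while True:
            a = [rng.randint(0, 1) for _ in range(n_questions)]
            if is_valid(a):
                return a

    population = [random_assignment() for _ in range(pop_size)]
    best = max(population, key=fitness)
    stale = 0
    for _ in range(max_generations):
        # Selection (assumed): keep the fitter half of the population.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children: List[Assignment] = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n_questions)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            if is_valid(child):
                children.append(child)
        population = parents + children
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best, stale = candidate, 0
        else:
            stale += 1
            if stale >= patience:  # treated here as "the fitness has converged"
                break
    return best


# Hypothetical one-dimensional attribute values for seven questions.
points = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0]


def stand_in_fitness(assignment: Assignment) -> float:
    """Placeholder fitness: negative total distance to each cluster's center."""
    total = 0.0
    for label in (0, 1):
        members = [points[i] for i, a in enumerate(assignment) if a == label]
        center = sum(members) / len(members)
        total += sum(abs(x - center) for x in members)
    return -total


result = bipartition(len(points), stand_in_fitness)
```

Passing the fitness in as a callable keeps the search loop independent of the exact similarity formulas, so the actual equations could be substituted without touching the selection, crossover, or mutation steps.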

Next, once the questions have been initially separated into the optimal pair of clusters, the tree generator 230 selects one of the initially separated clusters A and B as the root target cluster. Here, the tree generator 230 may select as the root target cluster the cluster in which the association flag '1' occurs most frequently in the attribute information (i.e., the cluster with the largest total number of 1s) or, conversely, the cluster in which the association flag '0' occurs least frequently. In the description with reference to FIGS. 4 to 6, cluster A is assumed to be selected as the root target cluster.
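As an illustration of the root target cluster selection, the sketch below counts the '1' flags in each cluster and picks the cluster with the larger count. The attribute vectors, the function names, and the tie-breaking rule (ties go to the first cluster) are assumptions made for the example only.

```python
from typing import Dict, List, Tuple

# Hypothetical attribute vectors: one 0/1 association flag per attribute.
Attributes = Dict[int, List[int]]


def count_ones(cluster: List[int], attrs: Attributes) -> int:
    """Total number of '1' association flags over all questions in a cluster."""
    return sum(sum(attrs[q]) for q in cluster)


def select_root_cluster(
    cluster_a: List[int], cluster_b: List[int], attrs: Attributes
) -> Tuple[List[int], List[int]]:
    """Return (root_target, other); a tie is resolved in favor of cluster_a."""
    if count_ones(cluster_a, attrs) >= count_ones(cluster_b, attrs):
        return cluster_a, cluster_b
    return cluster_b, cluster_a


attrs = {
    1: [1, 1, 0], 2: [1, 1, 1], 3: [1, 0, 1], 4: [1, 0, 0],
    5: [0, 0, 1], 6: [0, 0, 0], 7: [0, 1, 0],
}
root_target, other = select_root_cluster([1, 2, 3, 4], [5, 6, 7], attrs)
```

With these made-up flags, cluster A (questions 1 to 4) holds eight 1s and cluster B (questions 5 to 7) holds two, so cluster A becomes the root target cluster.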

FIG. 5 is a diagram in which the multi-dimensional questions are simplified into two dimensions; the farther to the left a question lies along the X axis, the higher the frequency of the association flag '1'. As an example with reference to FIG. 5, for a total of seven questions, the tree generator 230 forms cluster A from questions 1 to 4 and cluster B from questions 5 to 7, thereby separating the questions into two clusters, and may select cluster A as the root target cluster.

Subsequently, the tree generator 230 identifies the center value of the selected root target cluster (i.e., cluster A) and generates, as the root node, the question whose attribute information (i.e., multi-dimensional coordinate information) is closest to this center value (S403). Referring to FIG. 5(a) and FIG. 6(b), question 3, which is closest to the center value of cluster A, the root target cluster, is generated as the root node.

Next, the tree generator 230 identifies the center value of cluster B, which was not selected as the root target cluster, and generates the question whose attribute information is closest to this center value as the first right child node of the root node (S405). Here, a right child node is a node with low relevance to its parent node and is the question node corresponding to a negative response. Referring to FIG. 5(a) and FIG. 6(a), question 7, which is closest to the center value of cluster B, is generated as the right child node of the root node.
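The choice of a representative question (the question closest to a cluster's center value) can be sketched as below. The two-dimensional coordinates are invented so that the result matches the example of questions 3 and 7, and the use of the arithmetic mean as the center value and of Euclidean distance are assumptions; the text does not specify either.

```python
import math
from typing import Dict, List, Sequence


def centroid(points: List[Sequence[float]]) -> List[float]:
    """Arithmetic mean of the points, taken per dimension (assumed center value)."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def representative(cluster: List[int], coords: Dict[int, Sequence[float]]) -> int:
    """Return the question whose coordinates lie closest to the cluster center."""
    center = centroid([coords[q] for q in cluster])
    return min(cluster, key=lambda q: math.dist(coords[q], center))


# Hypothetical 2-D attribute coordinates for the seven example questions.
coords = {
    1: (1.0, 1.0), 2: (2.0, 3.0), 3: (3.0, 2.0), 4: (5.0, 2.0),
    5: (7.0, 4.0), 6: (9.0, 1.0), 7: (8.0, 2.0),
}

root_question = representative([1, 2, 3, 4], coords)      # cluster A
right_child_question = representative([5, 6, 7], coords)  # cluster B
```

Under these assumed coordinates the center of cluster A is (2.75, 2.0), which makes question 3 the root node, and the center of cluster B is (8.0, 2.33...), which makes question 7 the first right child node.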

Next, the tree generator 230 uses the genetic algorithm (i.e., the selection, crossover, and mutation operations) to separate each of the initially separated clusters into two optimal sub-clusters (S407). In doing so, the tree generator 230 separates each initially separated cluster into two sub-clusters and calculates the internal similarity and the external similarity of the sub-clusters using Equations 1 and 2 above. The tree generator 230 then calculates the final similarity, obtained by multiplying the internal similarity by the external similarity, and repeats the cluster separation through the genetic algorithm until the final similarity converges, thereby separating each cluster into optimal sub-clusters.

Subsequently, the tree generator 230 proceeds to generate the questions included in the separated sub-clusters as child nodes (S409). That is, among the sub-clusters of the root target cluster (i.e., clusters A-1 and A-2), the tree generator 230 identifies the sub-cluster with the highest frequency of the association flag (i.e., '1') and generates the representative question of that sub-cluster (i.e., the question whose attribute information is closest to the center value of the sub-cluster) as the first left child node of the root node. In addition, among the sub-clusters of the root target cluster, the representative question of the sub-cluster with the low frequency of the association flag (i.e., '0') is generated as the right child node of the first left child node. Here, the tree generator 230 selects child nodes only from questions that have not already been selected as nodes.

Next, among the sub-clusters of cluster B (i.e., clusters B-1 and B-2), the tree generator 230 generates the representative question of the sub-cluster with the highest frequency of the association flag (i.e., '1') as the left child node of the first right child node (i.e., the right child node of the root node), and generates the representative question of the sub-cluster with the low frequency of the association flag (i.e., '0') as the right child node of the first right child node.

Referring to FIG. 5(b) and FIG. 6(b), cluster A is separated into clusters A-1 and A-2, and cluster B is separated into clusters B-1 and B-2. Of clusters A-1 and A-2, cluster A-1 has the higher association flag frequency, and question 2, which is closest to its center value, is generated as the left child node of the root node (i.e., question 3). Question 4, which is close to the center value of cluster A-2, the sub-cluster with the lower association flag frequency, is generated as the right child node of question 2. Likewise, of clusters B-1 and B-2, cluster B-1 has the higher association flag frequency, and question 5, which is closest to its center value, is generated as the left child node of the right child node of the root node (i.e., question 7). Question 6, which is close to the center value of cluster B-2, the sub-cluster with the lower association flag frequency, is generated as the right child node of the right child node of the root node (i.e., question 7).
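The intermediate tree described for FIG. 6(b) can be written down directly as a small data structure. The `QuestionNode` class and the `preorder` helper are hypothetical names used only for this illustration; the node positions follow the text (question 3 as root, 2 and 7 as its left and right children, 4 as the right child of 2, and 5 and 6 as the left and right children of 7).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QuestionNode:
    question_id: int
    left: Optional["QuestionNode"] = None   # child for an affirmative response
    right: Optional["QuestionNode"] = None  # child for a negative response


root = QuestionNode(3)                   # representative of cluster A (S403)
root.right = QuestionNode(7)             # representative of cluster B (S405)
root.left = QuestionNode(2)              # representative of sub-cluster A-1
root.left.right = QuestionNode(4)        # representative of sub-cluster A-2
root.right.left = QuestionNode(5)        # representative of sub-cluster B-1
root.right.right = QuestionNode(6)       # representative of sub-cluster B-2


def preorder(node: Optional[QuestionNode]) -> List[int]:
    """List question ids in root, left subtree, right subtree order."""
    if node is None:
        return []
    return [node.question_id] + preorder(node.left) + preorder(node.right)


order = preorder(root)
```

At this stage six of the seven questions are placed; question 1 is still unassigned, which is why the termination check sends the process back for another round of sub-clustering.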

Next, the tree generator 230 determines whether the termination condition is satisfied, that is, whether as many nodes as there are questions have been formed in the tree (S411). If the termination condition is not satisfied, the tree generator 230 repeats the process from step S407 to generate further child nodes. Referring to FIG. 5(c) and FIG. 6(c), the tree generator 230 repeats the process from step S407, separates cluster A-1 into two sub-clusters (i.e., A-1-1 and A-1-2), and forms the question close to the center value of A-1-1, which has the higher association flag frequency of the two, as the right child node of question 2.

The question tree is formed through this iterative process. The tree generator 230 receives a data type (e.g., API programming, company, etc.) from an administrator and stores the input data type together with the formed question tree in the storage unit 220, so that the storage unit 220 holds a question tree for each data type.

Although the question tree has been described as a binary tree in the above embodiment, a question tree whose nodes have three or more child nodes may also be formed. For example, for a plurality of questions each having three selection options, the tree generator 230 may generate up to three child nodes. In this case, the tree generator 230 may generate the question most relevant to the parent node as the leftmost (or rightmost) child node, the question second most relevant to the parent node as the middle child node, and the question least relevant to the parent node as the rightmost (or leftmost) child node.

FIG. 7 is a flowchart illustrating a method of providing an interactive information providing service to a user based on a question tree, according to an embodiment of the present invention.

Referring to FIG. 7, the question providing unit 210 receives a request for the interactive service from the user terminal 100. The question providing unit 210 then provides a list of data types to the user terminal 100 and receives from the user terminal 100 the specific data type selected by the user (S701).

Subsequently, the question providing unit 210 looks up the question tree corresponding to the specific data type in the storage unit 220 (S703) and transmits the question at the top level of this question tree (i.e., the root) to the user terminal 100 (S705). The question includes its selection options.

Subsequently, the question providing unit 210 receives selection response information for the question from the user terminal 100 (S707), identifies the child node of the next level (i.e., level 2) corresponding to the selection response information, and provides the question corresponding to this child node to the user terminal 100 (S709). When the question is a binary-choice question and the selection response information is an affirmative response, the question providing unit 210 may provide the question corresponding to the left (or right) child node of the root node to the user terminal 100. As another example, when the question is a binary-choice question and the selection response information is a negative response, the question providing unit 210 may provide the question corresponding to the right (or left) child node of the root node to the user terminal 100. As yet another example, when the question is a multiple-choice question and the selection response information is response 1, the question providing unit 210 may provide the question corresponding to the leftmost child node to the user terminal 100, and when the selection response information is response 2, it may provide the question corresponding to the second child node from the left.

Next, the question providing unit 210 receives from the user terminal 100 the selection response information for the question corresponding to the child node (S711). The question providing unit 210 then determines whether the questions have ended (S713) and, if so, instructs the information providing unit 240 to provide data.

The information providing unit 240 then extracts from the storage unit 220 the information (i.e., data group) corresponding to the last question and the selection response information, and provides this information (i.e., data group) to the user terminal 100 (S715).

On the other hand, if it is determined in step S713 that the questions have not ended, the question providing unit 210 returns to step S709, in which it identifies the child node of the next level corresponding to the selection response information and provides the question corresponding to this child node to the user terminal 100.
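The question-and-answer loop of steps S705 to S715 can be sketched for the binary case as follows. This is an illustrative sketch only: the `QuestionNode` class, the `run_dialogue` function, and the `data_groups` lookup keyed by (last question, response) are hypothetical, and the end-of-questions test in S713 is assumed to mean that no child node exists for the chosen response.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class QuestionNode:
    question_id: int
    left: Optional["QuestionNode"] = None   # next question after an affirmative response
    right: Optional["QuestionNode"] = None  # next question after a negative response


def run_dialogue(
    root: QuestionNode,
    answers: Iterable[bool],
    data_groups: Dict[Tuple[int, bool], str],
) -> Tuple[List[int], Optional[str]]:
    """Walk the tree from the root; return the questions asked and the data group."""
    node = root
    asked: List[int] = []
    for answer in answers:                  # each item stands in for S707 / S711
        asked.append(node.question_id)      # question sent to the terminal (S705 / S709)
        child = node.left if answer else node.right
        if child is None:                   # assumed end-of-questions test (S713)
            return asked, data_groups.get((node.question_id, answer))  # S715
        node = child
    return asked, None                      # the answers ran out before the tree ended


root = QuestionNode(
    3,
    left=QuestionNode(2, right=QuestionNode(4)),
    right=QuestionNode(7, left=QuestionNode(5), right=QuestionNode(6)),
)
data_groups = {(4, True): "data group X", (5, False): "data group Y"}

asked, info = run_dialogue(root, [True, False, True], data_groups)
```

Here the responses yes, no, yes lead from question 3 to question 2 to question 4, where no further child exists, so the data group stored for (question 4, yes) is returned.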

As described above, the present invention improves the convenience of information retrieval by extracting information based on the user's responses to a plurality of multiple-choice questions and providing it to the user. In addition, because the present invention generates a question tree based on the attribute information of the questions, sequentially presents related questions to the user based on this question tree, and provides information to the user based on the user's selected responses, the information the user is looking for can be retrieved accurately.

While this specification contains many features, such features should not be construed as limiting the scope of the invention or of the claims. Features described in separate embodiments in this specification may be implemented in combination in a single embodiment. Conversely, various features described in a single embodiment may be implemented separately in multiple embodiments or in any suitable combination.

Although operations are depicted in the drawings in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order, or that all of the depicted operations be performed, in order to achieve the desired results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of the various system components in the embodiments described above should not be understood as requiring such separation in all embodiments. The program components and systems described above can generally be integrated into a single software product or packaged into multiple software products.

The method of the present invention described above may be implemented as a program and stored in a computer-readable form on a recording medium (CD-ROM, RAM, ROM, floppy disk, hard disk, magneto-optical disk, etc.). Since this can be readily carried out by a person of ordinary skill in the art to which the present invention pertains, it is not described in further detail.

Since a person of ordinary skill in the art to which the present invention pertains can make various substitutions, modifications, and changes to the present invention described above without departing from its technical spirit, the invention is not limited by the foregoing embodiments or the accompanying drawings.

100: user terminal 200: information providing device
210: question providing unit 220: storage unit
230: tree generator 240: information providing unit
300: network

Claims

An information providing device comprising:
a tree generator that generates a question tree based on attribute information defined for each question;
a question providing unit that, when an interactive information providing service is requested from a user terminal, sequentially provides the hierarchical questions formed in the question tree to the user terminal, selecting a sub-question of the next level based on response information to a question and transmitting the selected sub-question to the user terminal; and
an information providing unit that, when response information to the final question is received from the user terminal, extracts information corresponding to the response information of the final question and provides the extracted information to the user terminal,
wherein the tree generator divides a plurality of questions into a plurality of clusters, generates a first question included in a first cluster among the plurality of clusters as a node, and generates a second question included in a second cluster among the plurality of clusters as a first child node of the node.

The information providing device according to claim 1,
wherein the tree generator initially separates the questions into a certain number of clusters, selects a root target cluster from among the separated clusters, generates a representative question included in the selected root target cluster as a root node, generates a representative question of the cluster not selected as the root target cluster as a lower node of the root node, and repeats the process of re-separating each separated cluster into the certain number of sub-clusters while generating the representative questions of the separated sub-clusters as lower nodes.

The information providing device according to claim 1,
wherein the tree generator generates a third question included in the first cluster among the plurality of clusters as a second child node of the node.

The information providing device according to claim 3,
wherein the tree generator divides the first cluster into a plurality of sub-clusters, identifies a first sub-cluster having the highest frequency of association flags among the plurality of sub-clusters, and generates the third question included in the first sub-cluster as the second child node.

The information providing device according to claim 4,
wherein the tree generator identifies a second sub-cluster having the lowest frequency of association flags among the plurality of sub-clusters, and generates a fourth question included in the second sub-cluster as a child node of the second child node.

The information providing device according to claim 1,
wherein the tree generator calculates, using the following equation, a final similarity obtained by multiplying an internal similarity by an external similarity for the separated clusters, and repeats cluster separation until the final similarity converges.
(mathematical expression)

Here, Fit_intra denotes the internal similarity, k is the number of clusters, C_k is the center value of the k-th cluster, and X_i is the attribute information of the i-th question belonging to the same cluster.

Here, Fit_inter denotes the external similarity, k is the number of clusters, C_k is the center value of the k-th cluster, X_i is the attribute information of the i-th question belonging to each of the other clusters, and d is the maximum value of k.

The information providing device according to claim 2,
wherein the representative question is the question whose attribute information is closest to the center value of the cluster that includes the representative question.

The information providing device according to claim 3,
wherein the question providing unit identifies the child node formed at the position corresponding to the response information and transmits the question corresponding to that child node to the user terminal.

The information providing device according to claim 3,
wherein the tree generator separates the clusters in proportion to the number of items of selection information of the question.

The information providing device according to claim 1,
wherein the plurality of clusters are initially separated clusters, or sub-clusters obtained by separating one of the initially separated clusters.

A method for performing an interactive information providing service in an information providing device, the method comprising:
generating a question tree based on attribute information defined for each question;
receiving a request for an interactive information providing service from a user terminal;
sequentially providing the hierarchical questions formed in the question tree to the user terminal, selecting a sub-question of the next level based on response information to a question, and transmitting the selected sub-question to the user terminal; and
when response information to the final question is received from the user terminal, extracting information corresponding to the response information of the final question and providing the extracted information to the user terminal,
wherein generating the question tree comprises dividing a plurality of questions into a plurality of clusters, generating a first question included in a first cluster among the plurality of clusters as a node, and generating a second question included in a second cluster among the plurality of clusters as a first child node of the node.

The interactive information providing service method according to claim 11,
wherein generating the question tree comprises:
initially separating the questions into a certain number of clusters and selecting a root target cluster from among the separated clusters;
generating a representative question included in the selected root target cluster as a root node, and generating a representative question of the cluster not selected as the root target cluster as a lower node of the root node; and
repeating the process of re-separating each of the separated clusters into the certain number of sub-clusters and generating the representative questions of the separated sub-clusters as lower nodes.

The interactive information providing service method according to claim 11,
comprising calculating, using the equation below, a final similarity obtained by multiplying an internal similarity by an external similarity for the separated clusters, and repeating cluster separation until the final similarity converges.
(mathematical expression)

Here, Fit_inter denotes the external similarity, k is the number of clusters, C_k is the center value of the k-th cluster, X_i is the attribute information of the i-th question belonging to the same and different clusters, and d is the maximum value of k.

The interactive information providing service method according to claim 11,
wherein transmitting to the user terminal comprises identifying the child node formed at the position corresponding to the response information and transmitting the question corresponding to that child node to the user terminal.