KR102643045B1

KR102643045B1 - Chatbot service providing system for analyzing by time and method thereof

Info

Publication number: KR102643045B1
Application number: KR1020210022396A
Authority: KR
Inventors: 나윤후; 이석준
Original assignee: (주)아와소프트; 광운대학교 산학협력단
Priority date: 2021-02-19
Filing date: 2021-02-19
Publication date: 2024-03-04
Also published as: KR20220118681A

Abstract

A chatbot service providing system capable of analysis by time point, and a method thereof, are provided. The method is performed by a chatbot management server and may include: receiving conversation data including a query from a user terminal through a chatbot service; analyzing the conversation data including the query to generate analysis data and classify a conversation model; selecting a chatbot terminal corresponding to the classified conversation model; and generating, by the chatbot terminal, conversation data including an answer corresponding to the query by searching and extracting data stored in learning data.

Description

Chatbot service provision system and method that can be analyzed by time point {CHATBOT SERVICE PROVIDING SYSTEM FOR ANALYZING BY TIME AND METHOD THEREOF}

The present invention relates to a chatbot service providing system capable of analysis by time point and a method thereof, and more particularly to a system and method that learn data by time point while taking ambiguous data into account, thereby assigning a concept of time to the data so that information can be provided to the user by time point.

In general, a chatbot is a technology that uses natural language processing to let a computer and a person interact as if two people were conversing, so that information can be obtained through conversation rather than through an interface such as a web page or an application.

For example, a chatbot lets a machine handle services such as reporting the weather, booking movie tickets, or making restaurant reservations as if the user were talking to a person.

Chatbots were initially offered at the level of providing predetermined answers according to pre-entered algorithms, but as natural language analysis and processing technology has advanced together with big data processing technology, they now provide optimal answers that take various variables into account.

That is, chatbot services have advanced rapidly together with big data analysis, machine learning, and natural language processing, and now provide conversation services by inferring the correct answer from an analysis of the user's conversation and predicting the next question. More recently, their use has expanded beyond simple conversation services to handling various operations such as shopping and payment.

The matters described above as background art are intended only to improve understanding of the background of the present invention, and should not be taken as an acknowledgment that they correspond to prior art already known to those of ordinary skill in the art.

Republic of Korea Patent Publication No. 10-2018-0098455

The problem to be solved by the present invention is to provide a chatbot service providing system capable of analysis by time point and a method thereof.

The problems to be solved by the present invention are not limited to the problem mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

A method of providing a chatbot service capable of analysis by time point according to an embodiment of the present invention for solving the above problem is performed by a chatbot management server and may include: receiving conversation data including a query from a user terminal through a chatbot service; analyzing the conversation data including the query to generate analysis data and classify a conversation model; selecting a chatbot terminal corresponding to the classified conversation model; and generating, by the chatbot terminal, conversation data including an answer corresponding to the query by searching and extracting data stored in learning data.
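The four steps above can be sketched roughly as follows. This is a minimal illustration only, not the claimed implementation: the keyword table `MODEL_KEYWORDS`, the helper names `analyze` and `answer`, the terminal identifiers, and the dictionary-based learning data are all assumptions introduced here, since the source does not specify how classification or retrieval is carried out.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    tokens: list
    conversation_model: str  # "situation", "field", or "individual"


# Hypothetical keyword table; the source does not say how models are classified.
MODEL_KEYWORDS = {
    "field": {"invoice", "contract", "tax"},
    "individual": {"my", "me", "i"},
}


def analyze(query):
    """Generate analysis data from the query and classify the conversation model."""
    tokens = query.lower().split()
    for model, keywords in MODEL_KEYWORDS.items():
        if keywords & set(tokens):
            return Analysis(tokens, model)
    return Analysis(tokens, "situation")  # assumed default model


def answer(query, terminals, learning_data):
    """Select a terminal for the classified model and look up an answer."""
    analysis = analyze(query)
    terminal = terminals[analysis.conversation_model]
    # Search the learning data for entries keyed by any token of the query.
    hits = [text for key, text in learning_data.items() if key in analysis.tokens]
    reply = hits[0] if hits else "No matching answer found."
    return terminal, reply
```

In this sketch the classification result alone decides which terminal answers, which mirrors the order of the steps in the method: classify first, then select, then search.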

In one embodiment of the present invention, the step of classifying the conversation model may classify the conversation model by extracting and analyzing morphemes, verb phrases, and content words included in the conversation data including the query.

In one embodiment of the present invention, the step of classifying the conversation model may include automatically performing a typo check on the extracted morphemes, verb phrases, and content words. When the typo check detects a typo, it is determined against reference data whether the detected typo is an actual typo, and if it is an actual typo, it may be corrected according to the reference data.

In one embodiment of the present invention, the step of classifying the conversation model may include automatically performing a typo check on the extracted morphemes, verb phrases, and content words, and, when the typo check detects a typo, determining against reference data whether the detected typo is an actual typo and, if it is not an actual typo, determining through typo analysis whether it is a misanalysis.

In one embodiment of the present invention, if the typo analysis determines that the detected typo is a misanalysis, the morphemes, verb phrases, and content words may be extracted again to generate the analysis data.
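The typo-handling branch described above (check, correct an actual typo against reference data, or re-extract on misanalysis) might look like the following sketch. The vocabulary, the correction table, and the rule that a flagged token containing non-letter characters counts as a misanalysis are hypothetical stand-ins chosen for illustration; the source does not define these criteria.

```python
import re

# Hypothetical reference data: known words and known corrections.
VOCABULARY = {"check", "my", "balance", "account"}
REFERENCE_CORRECTIONS = {"balence": "balance", "acount": "account"}


def extract(text):
    """First-pass extraction: split on whitespace only."""
    return text.lower().split()


def re_extract(text):
    """Second-pass extraction: keep only runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def check_tokens(text):
    tokens = extract(text)
    for token in tokens:
        flagged = token not in VOCABULARY
        is_actual_typo = token in REFERENCE_CORRECTIONS
        # Assumed rule: a flagged token that is not a known typo and contains
        # non-letter characters is treated as an extraction error (misanalysis).
        if flagged and not is_actual_typo and not token.isalpha():
            tokens = re_extract(text)
            break
    # Actual typos are corrected from the reference data; other tokens pass through.
    return [REFERENCE_CORRECTIONS.get(token, token) for token in tokens]
```

For instance, `check_tokens("check my balence")` corrects the known typo, while `check_tokens("check my account!")` triggers re-extraction because `account!` is flagged but is not listed as an actual typo.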

In one embodiment of the present invention, the method may include generating the learning data through iterative learning based on big data.

In one embodiment of the present invention, the step of generating the learning data may include: collecting the big data; converting the collected big data into classifiable data; extracting and removing ambiguous data from the classified big data; and iteratively learning time-differentiated data using the big data from which the ambiguous data has been removed.
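As one possible reading of the step that extracts and removes ambiguous data, the sketch below treats a term as ambiguous when it appears under more than one label in the classified data. This definition of ambiguity, the `(term, label)` record shape, and the function name `remove_ambiguous` are assumptions made for illustration; the source does not state how ambiguous data is detected.

```python
from collections import defaultdict


def remove_ambiguous(records):
    """Drop every record whose term is associated with more than one label.

    `records` is a list of (term, label) pairs; this shape is hypothetical.
    """
    labels_by_term = defaultdict(set)
    for term, label in records:
        labels_by_term[term].add(label)
    return [(term, label) for term, label in records
            if len(labels_by_term[term]) == 1]
```

With the records `[("bank", "finance"), ("bank", "river"), ("loan", "finance")]`, both `bank` entries would be dropped and only `("loan", "finance")` would remain for the subsequent learning step.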

In one embodiment of the present invention, the step of converting the big data into classifiable data may classify the big data into machine-processed data (Machine Generated Data), semi-automatically processed data (Semi Generated Data), user-processed data (Human Generated Data), enterprise-specialized data (Enterprise Specialized Data), enterprise-general data (Enterprise Normalized Data), and enterprise-specialized vocabulary data (Enterprise Specialized Vocabulary Data).

In one embodiment of the present invention, the step of iteratively learning the time-differentiated data may iteratively learn the data by type, the types including real-time (Realtime), daily (Daily), monthly (Monthly), quarterly (Quarterly), and yearly (Yearly).
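One way to picture the time-differentiated types is as grouping keys derived from each record's timestamp, so that the same data can be learned at real-time, daily, monthly, quarterly, and yearly granularity. The sketch below is an assumption-laden illustration: the key formats, the `(timestamp, text)` record shape, and the function names are not taken from the source.

```python
from collections import defaultdict
from datetime import datetime


def period_key(ts, granularity):
    """Map a timestamp to the bucket it belongs to for the given granularity."""
    if granularity == "realtime":
        return ts.isoformat()  # every timestamp is its own bucket
    if granularity == "daily":
        return ts.strftime("%Y-%m-%d")
    if granularity == "monthly":
        return ts.strftime("%Y-%m")
    if granularity == "quarterly":
        return f"{ts.year}-Q{(ts.month - 1) // 3 + 1}"
    if granularity == "yearly":
        return str(ts.year)
    raise ValueError(f"unknown granularity: {granularity}")


def group_by_period(records, granularity):
    """Group (timestamp, text) records into per-period training batches."""
    buckets = defaultdict(list)
    for ts, text in records:
        buckets[period_key(ts, granularity)].append(text)
    return dict(buckets)
```

Each resulting bucket could then be fed to a separate training pass, which is one plausible way to let a later query be answered "as of" a given period.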

In one embodiment of the present invention, the method may further include preprocessing the big data before the step of converting the big data into classifiable data.

In one embodiment of the present invention, the step of preprocessing the big data may include: performing tokenization using a whitespace removal filter and a special character removal filter so that the words included in the big data are divided into the smallest meaningful word units; after the tokenization is completed, performing refinement by removing, as noise data, words that appear with low frequency or words that are repeated many times in the big data, so that the meaning of the remaining words stands out; and normalizing the big data after the refinement is completed.
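The three preprocessing stages (tokenization, refinement, normalization) can be sketched as below, in the order the text gives them. The thresholds `min_count` and `max_ratio`, the use of lowercasing as the normalization step, and the function names are illustrative assumptions; the source names the stages but not their parameters.

```python
import re
from collections import Counter


def tokenize(text):
    """Apply a special-character removal filter, then split on whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", text)  # special character removal filter
    return cleaned.split()                   # whitespace removal filter


def refine(tokens, min_count=2, max_ratio=0.5):
    """Remove noise: words that are too rare or that dominate the text."""
    counts = Counter(tokens)
    total = len(tokens)
    return [t for t in tokens
            if counts[t] >= min_count and counts[t] / total <= max_ratio]


def normalize(tokens):
    """Assumed normalization: fold every token to lowercase."""
    return [t.lower() for t in tokens]


def preprocess(text):
    return normalize(refine(tokenize(text)))
```

Because refinement runs before normalization here, differently cased spellings of a word are counted separately during noise removal; swapping the two stages would merge them, which may or may not be desirable for a given corpus.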

In one embodiment of the present invention, the method may further include updating the learning data in real time in response to the conversation data including the answer.

In one embodiment of the present invention, the method may further include inspecting the learning data.

In addition, a method of providing a chatbot service capable of analysis by time point according to another embodiment of the present invention for solving the above problem may include: receiving, by a chatbot terminal, conversation data including user information and a query from a user terminal through a chatbot service; generating, by the chatbot terminal, analysis data by analyzing the conversation data including the query based on the user information; and generating, by the chatbot terminal, conversation data including an answer by searching learning data using the analysis data based on the user information.

In one embodiment of the present invention, the method may include generating, by the chatbot management server, the learning data through iterative learning based on big data.

In one embodiment of the present invention, the step of generating the learning data may include: collecting, by the chatbot management server, big data; preprocessing, by the chatbot management server, the big data to convert it into classifiable data; classifying, by the chatbot management server, the big data by task, by individual, and by company into machine-processed data (Machine Generated Data), semi-automatically processed data (Semi Generated Data), user-processed data (Human Generated Data), enterprise-specialized data (Enterprise Specialized Data), enterprise-general data (Enterprise Normalized Data), and enterprise-specialized vocabulary data (Enterprise Specialized Vocabulary Data); and generating, by the chatbot management server, the learning data by iteratively learning the classified data by type, the types including real-time (Realtime), daily (Daily), monthly (Monthly), quarterly (Quarterly), and yearly (Yearly).

In addition, a chatbot service providing system capable of analysis by time point according to another embodiment of the present invention for solving the above problem includes a chatbot management server that provides a real-time chatbot service to a user terminal through a chatbot terminal selected by situation, by field, or by individual. The chatbot management server may analyze conversation data including a query received from the user terminal to generate analysis data and classify a conversation model, extract data corresponding to the conversation model from learning data generated in consideration of ambiguous data, generate conversation data including an answer, and transmit it to the user terminal through the chatbot terminal.

In one embodiment of the present invention, the chatbot management server may preprocess big data collected in consideration of the ambiguous data to convert it into classifiable data; classify the preprocessed data by task, by individual, and by company into machine-processed data (Machine Generated Data), semi-automatically processed data (Semi Generated Data), user-processed data (Human Generated Data), enterprise-specialized data (Enterprise Specialized Data), enterprise-general data (Enterprise Normalized Data), and enterprise-specialized vocabulary data (Enterprise Specialized Vocabulary Data); and generate the learning data by iteratively learning the classified data on a real-time (Realtime), daily (Daily), monthly (Monthly), quarterly (Quarterly), and yearly (Yearly) basis.

In addition, a chatbot service providing system capable of analysis by time point according to another embodiment of the present invention for solving the above problem includes a chatbot management server that provides a real-time chatbot service to a user terminal. The chatbot management server may analyze conversation data including user information and a query received from the user terminal to generate analysis data and classify a conversation model, extract data corresponding to the conversation model and the user information from learning data generated in consideration of ambiguous data, generate conversation data including an answer, and transmit it to the user terminal. The chatbot management server may generate the analysis data by extracting morphemes, verb phrases, and content words included in the conversation data including the query and analyzing them by situation, by field, and by individual.

A program according to one embodiment of the present invention is combined with a computer, which is hardware, and is stored in a computer-readable recording medium so as to perform the method of providing a chatbot service capable of analysis by time point.

Other specific details of the present invention are included in the detailed description and drawings.

According to the present invention, a chatbot service learns data by time point in consideration of ambiguous data and thereby assigns a concept of time to the data, so that information can be provided to the user by time point while minimizing the cost incurred in the consultation process.

According to the present invention, the query is clearly identified in consideration of ambiguous data, and information analyzed by time point is provided to the user in a conversational manner, so that errors and search time caused by ambiguous data are minimized while convenience and reliability are improved with respect for the diversity of users.

The effects of the present invention are not limited to the effects mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

FIG. 1 is a diagram illustrating a chatbot service providing system capable of analysis by time point according to an embodiment of the present invention.
FIG. 2 is a detailed diagram illustrating the chatbot management server shown in FIG. 1.
FIG. 3 is a diagram illustrating a method of providing a chatbot service capable of analysis by time point according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the step of generating conversation data shown in FIG. 3.
FIG. 5 is a diagram illustrating the step of generating learning data shown in FIG. 3.

The advantages and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The present embodiments are provided only to make the disclosure of the present invention complete and to fully inform those skilled in the art to which the present invention pertains of the scope of the invention, and the present invention is defined only by the scope of the claims.

The terminology used herein is for describing the embodiments and is not intended to limit the present invention. In this specification, singular forms also include plural forms unless the context specifically states otherwise. As used in the specification, "comprises" and/or "comprising" does not exclude the presence or addition of one or more elements other than those mentioned. Like reference numerals refer to like elements throughout the specification, and "and/or" includes each of the mentioned elements and every combination of one or more of them. Although "first", "second", and the like are used to describe various elements, these elements are not limited by these terms. These terms are used only to distinguish one element from another. Therefore, a first element mentioned below may also be a second element within the technical spirit of the present invention.

Unless otherwise defined, all terms (including technical and scientific terms) used in this specification have the meanings commonly understood by those skilled in the art to which the present invention pertains. In addition, terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive manner unless clearly and specifically defined.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating a chatbot service providing system capable of analysis by time point according to an embodiment of the present invention, and FIG. 2 is a detailed diagram illustrating the chatbot management server shown in FIG. 1.

As shown in FIGS. 1 and 2, a chatbot service providing system 1 capable of analysis by time point according to an embodiment of the present invention may include a user terminal 10, a chatbot terminal 20, and a chatbot management server 30.

In this embodiment, the first to n-th user terminals 10-1, 10-2, …10-n (hereinafter referred to as the user terminal 10) and the first to n-th chatbot terminals 20-1, 20-2, …20-n (hereinafter referred to as the chatbot terminal 20) are each disclosed as a plurality of terminals, but the invention is not limited thereto. For example, the plurality of chatbot terminals 20 may be implemented in a form in which they are integrated with one another in an actual physical environment.

Here, the user terminal 10, the chatbot terminal 20, and the chatbot management server 30 may be synchronized in real time over a wireless communication network to transmit and receive data. The wireless communication network may support various communication methods, for example Wireless LAN (WLAN), DLNA (Digital Living Network Alliance), WiBro (Wireless Broadband), WiMAX (Worldwide Interoperability for Microwave Access), GSM (Global System for Mobile communication), CDMA (Code Division Multiple Access), CDMA2000 (Code Division Multiple Access 2000), EV-DO (Evolution-Data Optimized), WCDMA (Wideband CDMA), HSDPA (High Speed Downlink Packet Access), HSUPA (High Speed Uplink Packet Access), IEEE 802.16, LTE (Long Term Evolution), LTE-A (Long Term Evolution-Advanced), WMBS (Wireless Mobile Broadband Service), BLE (Bluetooth Low Energy), Zigbee, RF (Radio Frequency), and LoRa (Long Range). The network is not limited to these, and various other widely known wireless or mobile communication methods may also be applied.

The user terminal 10 may run the chatbot service provided by the chatbot management server 30 by communicating with the chatbot terminal 20.

Specifically, the user terminal 10 may transmit and receive conversation data to and from the chatbot terminal 20 in real time through the chatbot service.

For example, when the user terminal 10 transmits conversation data including a query to the chatbot terminal 20, it may receive conversation data including an answer corresponding to the query; conversely, when it receives conversation data including a query from the chatbot terminal 20, it may transmit conversation data including an answer corresponding to the query to the chatbot terminal 20. That is, the user terminal 10 may be operated by a user, or by a user belonging to a company, who can exchange questions and answers with the chatbot terminal 20 by task, by individual, or by company.

To this end, the user terminal 10 may generate conversation data including a query or an answer using a separate input tool. For example, the user terminal 10 may generate such conversation data using at least one of a microphone, a touch screen, a mouse, a keyboard, or a camera, but is not limited thereto. The conversation data may consist of text or voice, but is not limited thereto and may also consist of video, images, or the like.

The user terminal 10 may receive conversation data visually and audibly, either separately or simultaneously. For example, it may include a display that outputs symbols, letters, numbers, and the like on a screen according to the operating state, a lamp that outputs a color change or blinking, or a speaker that outputs audio.

The user terminal 10 may register as a member with the chatbot management server 30 in advance using user information in order to use the chatbot service with the chatbot terminal 20, but is not limited thereto. Here, the user information may include name, gender, age, contact information, occupation, and the like, but is not limited thereto, and information about the user's current situation may also be entered.

Depending on the embodiment, the user terminal 10 may generate a feedback signal regarding the chatbot service during or after use of the chatbot service and transmit it to the chatbot terminal 20. Alternatively, the user terminal 10 may transmit the feedback signal before using the chatbot service. The feedback signal may include information such as satisfaction with, accuracy of, and reliability of the answer data of the chatbot service by task, by individual, and by company, but is not limited thereto.

Depending on the embodiment, the user terminal 10 may receive a feedback control signal corresponding to the feedback signal from the chatbot terminal 20 before, during, or after use of the chatbot service. The feedback control signal may include information on improvements or updates to the chatbot service made in response to the feedback signal. The feedback control signal may also include event information. The event information may include advertisements or information about discounts or promotions for the chatbot service, but is not limited thereto.

Depending on the embodiment, the user terminal 10 may transmit and receive conversation data to and from the chatbot management server 30 in real time through the chatbot service.

이와 같은, 사용자 단말기(10)는 챗봇 단말기(20) 및 챗봇관리서버(30)와의 통신을 지원하는 각종 휴대 가능한 전자통신기기를 포함할 수 있다. 예를 들어, 사용자 단말기(10)는 별도의 스마트 기기로써, 스마트폰(Smart phone), PDA(Personal Digital Assistant), 테블릿(Tablet), 웨어러블 디바이스(Wearable Device), 워치형 단말기(Smartwatch), 글래스형 단말기(Smart Glass), HMD(Head Mounted Display)등 포함), 각종 IoT(Internet of Things) 단말과 같은 다양한 휴대 단말을 포함할 수 있지만 이와 달리 휴대 가능하지 않는 데스크 탑 컴퓨터(desktop computer) 및 워크스테이션 컴퓨터 등의 전자통신기기를 포함할 수 있다.As such, the user terminal 10 may include various portable electronic communication devices that support communication with the chatbot terminal 20 and the chatbot management server 30. For example, the user terminal 10 is a separate smart device, such as a smart phone, a personal digital assistant (PDA), a tablet, a wearable device, a smartwatch, It may include various portable terminals such as glass-type terminals (including Smart Glass, HMD (Head Mounted Display), etc.) and various IoT (Internet of Things) terminals, but in contrast, non-portable desktop computers and It may include electronic communication devices such as workstation computers.

In addition, in the present disclosure the user terminal 10 may operate using an application program (application), and the application program may be downloaded from an external server or the chatbot management server 30 through wireless communication.

The chatbot terminal 20 is a terminal that provides the chatbot service to the user terminal 10. When it receives conversation data containing a query from the user terminal 10, it may transmit conversation data containing an answer corresponding to the query to the user terminal 10.

In this embodiment, the chatbot terminals 20 may be classified by situation, field, and individual, and may provide the corresponding chatbot service to the user terminal 10.

For example, when the conversation data with the user terminal 10 is situation-specific data, a first chatbot terminal 20-1 capable of situation-based search may conduct the chat service; when it is field-specific data, a second chatbot terminal 20-2 capable of field-based search may conduct the chat service; and when it is individual-specific data, an n-th chatbot terminal 20-n capable of individual-based search may conduct the chat service. However, the arrangement is not limited thereto, and each chatbot terminal 20 may provide the chatbot service by situation, field, and individual.
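The routing described above can be sketched as a simple lookup. This is only an illustrative sketch: the names `ConversationModel`, `TERMINALS`, and `select_terminal` are hypothetical and do not appear in the specification, and the terminal labels are taken from the reference numerals used in the example.

```python
from enum import Enum


class ConversationModel(Enum):
    """Conversation model classes named in the embodiment."""
    SITUATION = "situation"
    FIELD = "field"
    INDIVIDUAL = "individual"


# Hypothetical mapping from conversation model to chatbot terminal label.
TERMINALS = {
    ConversationModel.SITUATION: "20-1",
    ConversationModel.FIELD: "20-2",
    ConversationModel.INDIVIDUAL: "20-n",
}


def select_terminal(model: ConversationModel) -> str:
    """Return the label of the chatbot terminal assigned to the given model."""
    return TERMINALS[model]
```

A one-to-one table is the simplest reading of the example; the embodiment also allows each chatbot terminal to serve all three classes, in which case the table would map every class to the same terminal.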

The chatbot terminal 20 may include various portable electronic communication devices that support communication with the user terminal 10 and the chatbot management server 30. For example, it may be a separate smart device such as a smartphone, a PDA (Personal Digital Assistant), a tablet, a wearable device (including, for example, a smartwatch, smart glasses, or an HMD (Head Mounted Display)), or any of various IoT (Internet of Things) terminals, but is not limited thereto.

Referring to FIG. 2, the chatbot management server 30 may include a data transmission/reception unit 32, a database unit 34, a monitoring unit 36, and a management control unit 38.

The data transmission/reception unit 32 may exchange conversation data containing queries and answers with the user terminal 10 or the chatbot terminal 20.

In addition, the data transmission/reception unit 32 may receive user information from the user terminal 10 or the chatbot terminal 20.

Depending on the embodiment, the data transmission/reception unit 32 may receive a feedback signal from the user terminal 10 or the chatbot terminal 20 and transmit a feedback control signal corresponding to the feedback signal to the user terminal 10 or the chatbot terminal 20.

The database unit 34 may store data exchanged with the user terminal 10 or the chatbot terminal 20 over a wireless communication network.

The database unit 34 may store data supporting the various functions of the chatbot management server 30. The database unit 34 may store a plurality of application programs (applications) running on the chatbot management server 30, as well as data and instructions for the operation of the chatbot management server 30. At least some of these application programs may be downloaded from an external server through wireless communication.

The monitoring unit 36 may monitor, on a screen, the operating state of the user terminal 10 resulting from user operation, the operating state of the chatbot terminal 20, the operating state of the chatbot management server 30, and the data exchanged between the user terminal 10 or the chatbot terminal 20 and the chatbot management server 30. That is, by checking the usage state of the user terminal 10 or the chatbot terminal 20 in real time, the service can be made more convenient to use and the user's trust can be increased.

The management control unit 38 may run the chatbot service with the user terminal 10 or the chatbot terminal 20, analyze the conversation data received from the user terminal 10 to generate analysis data, and classify a conversation model. It may then extract data corresponding to the conversation model from learning data generated in consideration of ambiguous data, generate conversation data, and transmit the conversation data to the user terminal 10 or the chatbot terminal 20. Although the management control unit 38 is described here as generating the analysis data from all conversation data received from the user terminal 10 or the chatbot terminal 20, it is not limited thereto and may also generate the analysis data using user information.

Specifically, the management control unit 38 may extract the morphemes, verb phrases, and content words contained in the conversation data, analyze them by situation, field, and individual to generate analysis data, and classify the conversation model.

More specifically, the management control unit 38 may check the extracted morphemes, verb phrases, and content words for typos against reference data and, when a typo is found, correct it according to the reference data to generate the analysis data. Here, the reference data may include, but is not limited to, the spelling rules and standard language defined by the National Institute of Korean Language.

Alternatively, the management control unit 38 may check the extracted morphemes, verb phrases, and content words for typos against the reference data and, when no typo has occurred, determine through typo analysis whether a misanalysis has occurred. If a misanalysis has occurred, it may re-extract the morphemes, verb phrases, and content words to generate the analysis data. If no misanalysis has occurred, it may generate the analysis data from the morphemes, verb phrases, and content words already extracted. Because typos are resolved before the morphemes, verb phrases, and content words are extracted and analyzed into analysis data, the intent of the query contained in the conversation data can be identified clearly, and an accurate, easily understood answer can be delivered to the user.

Depending on the embodiment, the management control unit 38 may analyze the conversation data by situation, field, and individual on the basis of user information to generate analysis data and classify the conversation model.

In this case, the management control unit 38 may preprocess the conversation data before analyzing it to generate the analysis data.

Specifically, the management control unit 38 may preprocess the conversation data by performing tokenization, cleaning, and normalization in that order so that the conversation data is recognized accurately. Here, if the conversation data is voice, an image, or video, it may be converted into text.

Specifically, the management control unit 38 may perform tokenization, which filters conversation data consisting of text. That is, the management control unit 38 may perform tokenization using a whitespace removal filter and a special-character removal filter so that the words contained in the text are separated into the smallest meaningful units.

The management control unit 38 may perform cleaning, which removes noise data from the tokenized conversation data. That is, the management control unit 38 may remove, as noise data, words that appear infrequently or words that are repeated many times, so that the meaning of the remaining words stands out.

Depending on the embodiment, the management control unit 38 may quantify the noise data and decide whether to delete it according to the user information of the user terminal 10.

The management control unit 38 may normalize the text data once cleaning is complete.
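The three preprocessing stages described above (tokenization, cleaning, normalization, in that order) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the frequency thresholds `min_count` and `max_count`, and the use of lowercasing as the normalization step are all hypothetical, since the specification does not fix any of them.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    # Special-character removal filter: replace anything that is neither a
    # word character nor whitespace with a space.
    stripped = re.sub(r"[^\w\s]", " ", text)
    # Whitespace removal filter: split() discards runs of whitespace.
    return stripped.split()


def clean(tokens: List[str], min_count: int = 1, max_count: int = 3) -> List[str]:
    # Treat words that are too rare or repeated too often as noise.
    # Both thresholds are assumptions, not values from the specification.
    counts = Counter(tokens)
    return [t for t in tokens if min_count <= counts[t] <= max_count]


def normalize(tokens: List[str]) -> List[str]:
    # Assumed normalization: case folding only.
    return [t.lower() for t in tokens]


def preprocess(text: str, min_count: int = 1, max_count: int = 3) -> List[str]:
    # Order follows the description: tokenization, cleaning, normalization.
    return normalize(clean(tokenize(text), min_count, max_count))
```

Because cleaning runs before normalization, word frequencies are counted on the un-normalized tokens; a real implementation for Korean text would more likely rely on a morphological analyzer than on whitespace splitting.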

In addition, the management control unit 38 may search the data stored in the learning data, extract data corresponding to the conversation model, and generate conversation data.

In addition, the management control unit 38 may generate the learning data by repeatedly learning data based on big data using deep learning or machine learning techniques. Here, the big data may include data in various forms such as text, voice, video, and images.

Specifically, the management control unit 38 may classify the collected data by task, individual, and company. To do so, the management control unit 38 may preprocess the collected data and convert it into classifiable data.

For example, referring to Table 1, the management control unit 38 may classify the preprocessed data by task, individual, and company into types including Machine Generated Data, Semi Generated Data, Human Generated Data, Enterprise Specialized Data, Enterprise Normalized Data, and Enterprise Specialized Vocabulary Data.

Table 1
Category | Description
Machine Generated Data | Real-time data generated automatically by machines
Semi Generated Data | Data from arithmetic formulas and artificially set target values
Human Generated Data | Data that is written, judged, and defined by people
Enterprise Specialized Data | Data limited to a specific company
Enterprise Normalized Data | Data that is universal regardless of company
Enterprise Specialized Vocabulary Data | Vocabulary data used only within a specific company

In other words, the management control unit 38 may classify as machine generated data the real-time data that machines generate automatically, for example the serial numbers of parts included in a product serial number and the automatic evaluation value of each shipment.

The management control unit 38 may classify as semi generated data the data derived from additional arithmetic formulas and from target values set by the user, for example items related to KPIs (Key Performance Indicators), employees who met the monthly sales target, departments that met the monthly production target, the provisional settlement of this month's expense claims, deposits relative to company expenses, and inventory status.

The management control unit 38 may classify as human generated data the data that users write, judge, and define themselves, for example production plans, daily work reports, meeting attendee lists, and operating profit.

The management control unit 38 may classify as enterprise specialized data the data that is limited to a specific company, for example automobile production volume and automobile consumption volume.

The management control unit 38 may classify as enterprise normalized data the data that is universal regardless of company, for example business trip expenses and overtime meal allowances.

The management control unit 38 may classify as enterprise specialized vocabulary data the vocabulary that is used only within a specific company, for example the sweetener trehalose.
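The six data types of Table 1 and the examples given for each can be written down as an enumeration with a lookup table. This is a hypothetical sketch: `DataType`, `EXAMPLES`, and `classify` are illustrative names, and a lookup by exact item name merely stands in for whatever classifier an actual implementation would use.

```python
from enum import Enum
from typing import Optional


class DataType(Enum):
    MACHINE_GENERATED = "Machine Generated Data"
    SEMI_GENERATED = "Semi Generated Data"
    HUMAN_GENERATED = "Human Generated Data"
    ENTERPRISE_SPECIALIZED = "Enterprise Specialized Data"
    ENTERPRISE_NORMALIZED = "Enterprise Normalized Data"
    ENTERPRISE_SPECIALIZED_VOCABULARY = "Enterprise Specialized Vocabulary Data"


# One example item per type, taken from the description above.
EXAMPLES = {
    "part serial number": DataType.MACHINE_GENERATED,
    "kpi": DataType.SEMI_GENERATED,
    "production plan": DataType.HUMAN_GENERATED,
    "automobile production volume": DataType.ENTERPRISE_SPECIALIZED,
    "business trip expenses": DataType.ENTERPRISE_NORMALIZED,
    "trehalose": DataType.ENTERPRISE_SPECIALIZED_VOCABULARY,
}


def classify(item: str) -> Optional[DataType]:
    """Return the data type of a known item, or None if it is not listed."""
    return EXAMPLES.get(item.strip().lower())
```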

In addition, the management control unit 38 may extract ambiguous data from the classified data on the basis of user information and remove it.

Specifically, the management control unit 38 may remove ambiguous data on the basis of user information, taking into account homonyms, polysemous words, figures of speech, vocabulary, and sentences.

For example, from the query "What is our company's operating profit this year?", the management control unit 38 may extract "this year" and "operating profit" as ambiguous data on the basis of user information. On an accounting basis, "this year" follows the previous year's figures until March of the following year, whereas on the basis of user information "this year" means the operating profit from January 1 to the present. The ambiguous data may therefore be removed on the basis of user information. By removing ambiguous data in this way, an answer matching the intent of the user's question can be generated and delivered more clearly, which can improve the user's trust.
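The two readings of "this year" in the example above can be made concrete with a small date calculation. The sketch below is an assumption-laden illustration: `resolve_this_year` and the `basis` labels are hypothetical, and the accounting reading is interpreted as "through March, use the full previous calendar year", which is one plausible reading of the passage rather than a rule stated in the specification.

```python
from datetime import date
from typing import Tuple


def resolve_this_year(today: date, basis: str) -> Tuple[date, date]:
    """Return the (start, end) dates that "this year" refers to."""
    if basis == "calendar":
        # User-oriented reading: January 1 up to the present.
        return date(today.year, 1, 1), today
    if basis == "accounting":
        # Assumed accounting reading: through March, the previous year applies.
        if today.month <= 3:
            return date(today.year - 1, 1, 1), date(today.year - 1, 12, 31)
        return date(today.year, 1, 1), today
    raise ValueError(f"unknown basis: {basis}")
```

For a query made on 2021-02-19, the calendar reading covers 2021-01-01 to 2021-02-19, while the assumed accounting reading covers all of 2020, which is why the term must be disambiguated before an answer is generated.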

In addition, the management control unit 38 may divide the data from which ambiguous data has been removed by time interval and repeatedly learn the data for each interval to generate learning data by task, individual, and company.

For example, as shown in Table 2, the data may be learned repeatedly by type, including Realtime, Daily, Monthly, Quarterly, and Yearly, to generate learning data by task, individual, and company.

Table 2
Category | Description
Realtime | Data learned in real time
Daily | Data learned every day
Monthly | Data learned every month
Quarterly | Data learned every quarter
Yearly | Data learned every year

In other words, the management control unit 38 may generate learning data by repeatedly learning, in real time, the data that must be learned in real time, for example whether products manufactured during the past hour are defective.

The management control unit 38 may generate learning data by repeatedly learning the data that must be learned every day, for example daily newspaper articles that mention the company, newspaper articles about competitors, and economic news.

The management control unit 38 may generate learning data by repeatedly learning the data that must be learned every month, for example monthly reports or monthly KPI measurements by task, individual, and company.

The management control unit 38 may generate learning data by repeatedly learning the data that is learned every quarter, for example company regulations that are provisionally revised each quarter or quarterly sales settlements.

The management control unit 38 may generate learning data by repeatedly learning the data that is learned every year, for example annual financial statements, annual settlement reports, and annual research result reports.
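One way to decide when each type in Table 2 needs to be learned again is to compare the calendar bucket of the last learning run with that of the current time. The following is a sketch only: `Period`, `bucket`, and `is_due` are hypothetical names, and the use of calendar days, months, quarters, and years as bucket boundaries is an assumption the specification does not state.

```python
from datetime import datetime
from enum import Enum
from typing import Tuple


class Period(Enum):
    REALTIME = "realtime"
    DAILY = "daily"
    MONTHLY = "monthly"
    QUARTERLY = "quarterly"
    YEARLY = "yearly"


def bucket(period: Period, ts: datetime) -> Tuple[int, ...]:
    """Map a timestamp to the calendar bucket it falls in for the period."""
    if period is Period.DAILY:
        return (ts.year, ts.month, ts.day)
    if period is Period.MONTHLY:
        return (ts.year, ts.month)
    if period is Period.QUARTERLY:
        return (ts.year, (ts.month - 1) // 3 + 1)
    if period is Period.YEARLY:
        return (ts.year,)
    raise ValueError("REALTIME data has no bucket; it is always due")


def is_due(period: Period, last_learned: datetime, now: datetime) -> bool:
    """Return True when data of this period should be learned again."""
    if period is Period.REALTIME:
        return True
    return bucket(period, now) > bucket(period, last_learned)
```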

In addition, the management control unit 38 may update the learning data in real time in consideration of the conversation data containing answers.

Depending on the embodiment, the management control unit 38 may generate the learning data through a review process involving an external search engine, experts, and the like, thereby further improving answer similarity.

The management control unit 38 may also generate, in response to the feedback signal received from the user terminal 10, a feedback control signal capable of improving the chatbot service.

Depending on the embodiment, the management control unit 38 may generate a feedback control signal containing event information to increase use of the chatbot service by the user terminal 10 or the chatbot terminal 20.

The chatbot management server 30 may be implemented by hardware circuits (for example, CMOS-based logic circuits), firmware, software, or a combination thereof. For example, it may be implemented using transistors, logic gates, and electronic circuits in the form of various electrical structures.

The operation of the chatbot service providing system capable of analysis by time point according to an embodiment of the present invention, structured as described above, is as follows. FIG. 3 is a diagram illustrating a method of providing a chatbot service capable of analysis by time point according to an embodiment of the present invention, FIG. 4 is a diagram illustrating the step of generating conversation data shown in FIG. 3, and FIG. 5 is a diagram illustrating the step of generating learning data shown in FIG. 3.

First, the chatbot management server 30 may run the chatbot service with the user terminal 10 that requests the chatbot service through the chatbot terminal 20, but is not limited thereto.

Here, the chatbot management server 30 may receive user information entered at the user terminal 10. The user information may include a name, gender, age, contact information, occupation, and the like, but is not limited thereto, and information about the user's current situation may also be entered.

In addition, to run the chatbot service, the user terminal 10 may first complete membership registration by entering user information, but is not limited thereto.

Next, the chatbot management server 30 may receive conversation data containing queries and answers from the user terminal 10 (S10).

Next, the chatbot management server 30 may analyze the conversation data received from the user terminal 10 to generate analysis data and classify a conversation model (S12).

Specifically, referring to FIG. 4, the chatbot management server 30 may preprocess the conversation data received from the user terminal 10 (S100).

Next, the chatbot management server 30 may extract and analyze the morphemes, verb phrases, and content words contained in the preprocessed conversation data (S110). At this time, the chatbot management server 30 may analyze the conversation data by situation, field, and individual on the basis of user information to generate analysis data and classify the conversation model.

Next, the chatbot management server 30 may automatically check the extracted morphemes, verb phrases, and content words for typos (S120).

Specifically, when a typo is detected by the typo check (S130), the chatbot management server 30 may determine against the reference data whether the detected typo is an actual typo.

If the detected typo is an actual typo (S140), the chatbot management server 30 may correct the typo according to the reference data and generate the analysis data (S150, S160). Here, the reference data may include, but is not limited to, the spelling rules and standard language defined by the National Institute of Korean Language.

On the other hand, if the detected typo is not an actual typo (S140), the chatbot management server 30 may determine through typo analysis whether a misanalysis has occurred.

If the typo analysis shows that a misanalysis has occurred (S170), the chatbot management server 30 may re-extract the morphemes, verb phrases, and content words to generate the analysis data.

If the typo analysis shows that no misanalysis has occurred (S170), the chatbot management server 30 may generate the analysis data from the morphemes, verb phrases, and content words already extracted.

In addition, if no typo is detected by the typo check (S130), the chatbot management server 30 may generate the analysis data from the extracted morphemes, verb phrases, and content words.
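The branching in steps S120 to S170 can be sketched as a single function. All names here are hypothetical: `known_words` stands in for the reference data used to flag suspects, `corrections` for a table of actual typos and their standard forms, and `is_misanalysis` and `re_extract` are caller-supplied callbacks, since the specification does not say how misanalysis is detected or how re-extraction is performed.

```python
from typing import Callable, Dict, List, Set


def build_analysis_data(
    tokens: List[str],
    known_words: Set[str],
    corrections: Dict[str, str],
    is_misanalysis: Callable[[List[str]], bool],
    re_extract: Callable[[List[str]], List[str]],
) -> List[str]:
    # S120: flag every token that the reference data does not recognize.
    suspects = [t for t in tokens if t not in known_words]
    # S130: nothing flagged, so use the extracted tokens as they are.
    if not suspects:
        return list(tokens)
    # S140: a flagged token is an actual typo if a correction is known for it.
    if any(t in corrections for t in suspects):
        # S150, S160: correct the typos and generate the analysis data.
        return [corrections.get(t, t) for t in tokens]
    # S170: flagged but not an actual typo, so check for misanalysis.
    if is_misanalysis(suspects):
        return re_extract(tokens)
    return list(tokens)
```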

Next, the chatbot management server 30 may select, on the basis of user information, the chatbot terminal 20 corresponding to the classified conversation model (S14).

That is, the chatbot terminals 20 may be classified by situation, field, and individual, and may provide the corresponding chatbot service to the user terminal 10.

Next, the chatbot terminal 20 may search the data stored in the learning data, extract data corresponding to the conversation model, and generate conversation data (S16, S18).

Here, the learning data may be generated by the chatbot management server 30 repeatedly learning data based on big data using deep learning or machine learning techniques.

For example, as shown in FIG. 5, the chatbot management server 30 may collect big data (S200).

Specifically, the chatbot management server 30 may collect various data by task, individual, and company based on big data. Here, the big data may include data in various forms such as text, voice, video, and images.

Next, the chatbot management server 30 may preprocess the collected data and convert it into classifiable data (S210).

For example, the chatbot management server 30 may preprocess the conversation data by performing tokenization, cleaning, and normalization in that order so that the conversation data is recognized accurately. Here, if the conversation data is voice, an image, or video, it may be converted into text.

For example, the chatbot management server 30 may perform tokenization using a whitespace removal filter and a special-character removal filter so that the words contained in the text are separated into the smallest meaningful units.

In addition, after tokenization is complete, the chatbot management server 30 may perform cleaning by removing, as noise data, words that appear infrequently in the conversation data or words that are repeated many times, so that the meaning of the remaining words stands out. At this time, the chatbot management server 30 may quantify the noise data and decide whether to delete it according to the user information of the user terminal 10.

The chatbot management server 30 may then normalize the text data once cleaning is complete.

Next, the chatbot management server 30 may classify the collected data by task, individual, and company (S220).

Specifically, the chatbot management server 30 may classify the collected data by task, individual, and company into types including Machine Generated Data, Semi Generated Data, Human Generated Data, Enterprise Specialized Data, Enterprise Normalized Data, and Enterprise Specialized Vocabulary Data.

Next, the chatbot management server 30 may extract ambiguous data from the classified data on the basis of user information and remove it (S230).

Specifically, the chatbot management server 30 may remove ambiguous data on the basis of user information, taking into account homonyms, polysemous words, figures of speech, vocabulary, and sentences.

For example, in the query "How much is our company's operating profit this year?", the chatbot management server 30 may extract 'this year' and 'operating profit' as ambiguous data based on user information. That is, on an accounting basis 'this year' follows the previous year's basis until March of the following year, whereas 'this year' based on user information means the operating profit from January 1 up to the present, so the ambiguous data may be deleted based on user information. By removing ambiguous data in this way, an answer that matches the intent of the user's question can be generated and delivered more clearly, which can improve the user's trust.
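The 'this year' example can be made concrete with a small date-range resolver. This is only a sketch under stated assumptions: `resolve_this_year` and the `basis` values are hypothetical, and the accounting rule is interpreted here as "through March, 'this year' still refers to the whole previous calendar year".

```python
from datetime import date


def resolve_this_year(today, basis):
    """Return the (start, end) range that 'this year' denotes.

    basis == "calendar":   January 1 of the current year up to today.
    basis == "accounting": assumed reading of the example; through March
                           the previous year is still the reference year.
    """
    if basis == "calendar":
        return date(today.year, 1, 1), today
    if basis == "accounting":
        if today.month <= 3:
            previous = today.year - 1
            return date(previous, 1, 1), date(previous, 12, 31)
        return date(today.year, 1, 1), today
    raise ValueError(f"unknown basis: {basis}")


today = date(2021, 2, 19)
calendar_range = resolve_this_year(today, "calendar")
accounting_range = resolve_this_year(today, "accounting")
```

The two ranges differ for the same query date, which is the ambiguity that the user information is meant to settle.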

Next, the chatbot management server 30 may divide the data from which ambiguous data has been removed by time period and repeatedly learn the data for each time period (S240).

Specifically, the chatbot management server 30 may repeatedly learn the data by type, including Realtime, Daily, Monthly, Quarterly, and Yearly, to generate learning data for each task, individual, and company.
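Dividing data by time period can be illustrated with a function that maps a timestamp to a bucket key for each of the five granularities. The name `bucket_key` and the key formats are assumptions chosen for illustration; the specification does not define a key format.

```python
from datetime import datetime


def bucket_key(ts, granularity):
    """Map a timestamp to the key of the time bucket it falls in."""
    if granularity == "realtime":
        return ts.isoformat()
    if granularity == "daily":
        return ts.strftime("%Y-%m-%d")
    if granularity == "monthly":
        return ts.strftime("%Y-%m")
    if granularity == "quarterly":
        return f"{ts.year}-Q{(ts.month - 1) // 3 + 1}"
    if granularity == "yearly":
        return str(ts.year)
    raise ValueError(f"unknown granularity: {granularity}")


ts = datetime(2021, 2, 19, 10, 30)
keys = {g: bucket_key(ts, g)
        for g in ("realtime", "daily", "monthly", "quarterly", "yearly")}
```

Records sharing a key at a given granularity would then be learned together, once per task, individual, and company.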

Next, the chatbot management server 30 may generate learning data through an inspection process (S250, S260).

If the generated learning data does not pass the inspection process (S250), the chatbot management server 30 may return to at least one of the data classification step (S220), the ambiguous data extraction and removal step (S230), and the step of learning data by time point (S240), and repeat that step to generate the learning data. This can increase the accuracy of the answers and further improve the user's trust.
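The inspect-and-retry flow can be sketched as a bounded loop. This is an assumption-laden illustration: `build_learning_data`, its callback parameters, and `max_rounds` are hypothetical, and for simplicity the sketch re-runs all three steps on each round, whereas the description allows returning to only one of them.

```python
def build_learning_data(raw, classify, remove_ambiguous, learn, inspect,
                        max_rounds=3):
    """Repeat classification (S220), ambiguity removal (S230), and learning
    (S240) until inspection (S250) passes. Each step receives the attempt
    index so that it can adjust its behavior between rounds."""
    for attempt in range(max_rounds):
        classified = classify(raw, attempt)
        cleaned = remove_ambiguous(classified, attempt)
        learned = learn(cleaned, attempt)
        if inspect(learned):
            return learned, attempt
    raise RuntimeError("learning data failed inspection")


# Toy callbacks: removal gets stricter on each attempt, and inspection
# passes once at most two items remain.
result, attempt = build_learning_data(
    [3, 1, 2],
    classify=lambda data, n: sorted(data),
    remove_ambiguous=lambda data, n: data[n:],
    learn=lambda data, n: data,
    inspect=lambda learned: len(learned) <= 2,
)
```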

Depending on the embodiment, the user terminal 10 may determine the similarity of the answer data received from the chatbot management server 30 and generate a feedback signal, and the chatbot management server 30 may generate, in response to the feedback signal received from the user terminal 10, a control signal for improving the chatbot service.

Depending on the embodiment, the user terminal 10 may receive, before, during, or after use of the chatbot service, a control signal that is generated in response to the feedback signal and contains improved or updated information on the chatbot service.

Next, the chatbot management server 30 may update the learning data in real time using the conversation data including the answer.

Finally, the chatbot management server 30 may terminate the chatbot service when the user terminal 10 requests termination of the chatbot service.

The steps of the method or algorithm described in connection with the embodiments of the present invention may be implemented directly in hardware, as a software module executed by hardware, or as a combination of the two. The software module may reside in RAM (Random Access Memory), ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable recording medium well known in the technical field to which the present invention pertains.

Embodiments of the present invention have been described above with reference to the accompanying drawings, but those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be implemented in other specific forms without changing its technical idea or essential features. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive.

1: Chatbot service provision system that can be analyzed by time point
10: User terminal
20: Chatbot terminal
30: Chatbot management server

Claims

In a method of providing a chatbot service that can be analyzed by time point performed by a chatbot management server,
Generating learning data through repeated learning based on big data;
Receiving conversation data including a query from a user terminal through a chatbot service;
Classifying a conversation model by analyzing conversation data including the query based on the learning data to generate analysis data;
Selecting a chatbot terminal corresponding to the classified conversation model; and
generating, by the chatbot terminal, conversation data including an answer corresponding to the query by searching and extracting data stored in the learning data,
The step of generating the learning data is,
collecting the big data;
generating preprocessed data by preprocessing the big data and converting it into classifiable data;
classifying the preprocessed data by type, including Machine Generated Data, Semi Generated Data, Human Generated Data, Enterprise Specialized Data, Enterprise Normalized Data, and Enterprise Specialized Vocabulary Data, to generate classification data classified by task, individual, and company;
extracting and removing ambiguous data from the classified big data; and
dividing the big data from which the ambiguous data has been removed by time period on a Realtime, Daily, Monthly, Quarterly, and Yearly basis and repeatedly learning it to generate the learning data for each task, individual, and company,
The step of generating the classification data is,
classifying the preprocessed data by task, individual, and company based on the machine-processed data, which includes real-time data automatically generated by a machine; the semi-automatically processed data, which includes arithmetic calculation formulas and artificially set target value data; the user-processed data, which includes data artificially created, judged, and defined; the company-specific data, which includes data limited to a specific company; the company-universal data, which includes universal data regardless of company; and the company-specific vocabulary data, which includes vocabulary data used only in a specific company; and
A method of providing a chatbot service that can be analyzed at each point in time, including the step of inspecting the data classified by task, individual, and company for duplication and similarity.

According to claim 1,
The step of classifying the conversation model comprises,
classifying the conversation model by extracting and analyzing morphemes, verb phrases, and content words included in the conversation data including the query. A method of providing a chatbot service that can be analyzed by time point.

According to claim 2,
The step of classifying the conversation model comprises,
automatically performing typo correction on the extracted morphemes, verb phrases, and content words,
wherein, when a typo is found through the typo inspection, it is determined against standard data whether the found typo is an actual typo, and when it is an actual typo, the typo is corrected according to the standard data. A method of providing a chatbot service that can be analyzed at each point in time.

According to claim 2,
The step of classifying the conversation model comprises,
automatically performing typo correction on the extracted morphemes, verb phrases, and content words; and
when a typo is found through the typo inspection, determining against the standard data whether the found typo is an actual typo, and when it is not an actual typo, determining through typo analysis whether it is a misanalysis. A method of providing a chatbot service that can be analyzed at each point in time.

According to claim 4,
A method of providing a chatbot service that can be analyzed at each point in time, in which, when the typo is determined through the typo analysis to be a misanalysis, the morphemes, verb phrases, and content words are re-extracted to generate the analysis data.

delete

According to claim 1,
The step of preprocessing the big data comprises,
performing a tokenization operation using a space removal filter and a special character removal filter so that the words included in the big data are divided into the smallest meaningful word units;
after the tokenization operation is completed, performing a refinement operation that removes noise data, namely words included in the big data that appear infrequently or are repeated many times, so that the meaning of the remaining words stands out; and
normalizing the big data on which the refinement operation has been completed. A method of providing a chatbot service that can be analyzed at each point in time.

According to claim 1,
A method of providing a chatbot service that can be analyzed at each point in time, further comprising updating the learning data in real time in response to the conversation data including the answer.

delete

generating, by a chatbot management server, learning data through repeated learning based on big data;
receiving, by a chatbot terminal, conversation data including user information and a query from a user terminal through a chatbot service;
generating, by the chatbot terminal, analysis data by analyzing the conversation data including the query based on the user information; and
generating, by the chatbot terminal, conversation data including an answer by searching the learning data using the analysis data based on the user information,
The step of generating the learning data is,
The chatbot management server collecting big data;
The chatbot management server preprocesses the big data and converts it into classifiable data to generate preprocessed data;
The chatbot management server classifying the preprocessed data by type, including Machine Generated Data, Semi Generated Data, Human Generated Data, Enterprise Specialized Data, Enterprise Normalized Data, and Enterprise Specialized Vocabulary Data, to generate classification data classified by task, individual, and company;
Extracting and removing ambiguous data from the classified big data by the chatbot management server; and
The chatbot management server dividing the big data from which the ambiguous data has been removed by time period on a Realtime, Daily, Monthly, Quarterly, and Yearly basis and repeatedly learning it to generate the learning data for each task, individual, and company,
The step of generating the classification data is,
classifying the preprocessed data by task, individual, and company based on the machine-processed data, which includes real-time data automatically generated by a machine; the semi-automatically processed data, which includes arithmetic calculation formulas and artificially set target value data; the user-processed data, which includes data artificially created, judged, and defined; the company-specific data, which includes data limited to a specific company; the company-universal data, which includes universal data regardless of company; and the company-specific vocabulary data, which includes vocabulary data used only in a specific company; and
A method of providing a chatbot service that can be analyzed at each point in time, including the step of inspecting the data classified by task, individual, and company for duplication and similarity.

delete

A chatbot service provision system that can be analyzed at each point in time, comprising a chatbot management server that provides a real-time chatbot service to a user terminal through a chatbot terminal selected by situation, field, and individual,
wherein the chatbot management server analyzes conversation data including a query received from the user terminal to generate analysis data, classifies a conversation model, extracts data corresponding to the conversation model from learning data generated in consideration of ambiguous data, generates conversation data including an answer, and transmits it to the user terminal through the chatbot terminal,
wherein the chatbot management server preprocesses big data collected in consideration of the ambiguous data and converts it into classifiable data; classifies the preprocessed data by type into Machine Generated Data, Semi Generated Data, Human Generated Data, Enterprise Specialized Data, Enterprise Normalized Data, and Enterprise Specialized Vocabulary Data to generate classification data classified by task, individual, and company; extracts and removes the ambiguous data from the big data; and divides the big data from which the ambiguous data has been removed by time period on a Realtime, Daily, Monthly, Quarterly, and Yearly basis and repeatedly learns it to generate the learning data for each task, individual, and company, and
wherein the classification data is
generated by classifying the preprocessed data by task, individual, and company based on the machine-processed data, which includes real-time data automatically generated by a machine; the semi-automatically processed data, which includes arithmetic calculation formulas and artificially set target value data; the user-processed data, which includes data artificially created, judged, and defined; the company-specific data, which includes data limited to a specific company; the company-universal data, which includes universal data regardless of company; and the company-specific vocabulary data, which includes vocabulary data used only in a specific company, and then inspecting the data classified by task, individual, and company for duplication and similarity.

delete

A chatbot service provision system that can be analyzed at each point in time, comprising a chatbot management server that provides a real-time chatbot service to a user terminal,
wherein the chatbot management server analyzes conversation data including user information and a query received from the user terminal to generate analysis data, classifies a conversation model, extracts data corresponding to the conversation model and the user information, generates conversation data including an answer, and transmits it to the user terminal,
wherein the chatbot management server
generates the analysis data by extracting and analyzing, by situation, field, and individual, the morphemes, verb phrases, and content words included in the conversation data including the query,
wherein the chatbot management server
preprocesses big data collected in consideration of ambiguous data and converts it into classifiable data; classifies the preprocessed data by type into Machine Generated Data, Semi Generated Data, Human Generated Data, Enterprise Specialized Data, Enterprise Normalized Data, and Enterprise Specialized Vocabulary Data to generate classification data classified by task, individual, and company; extracts and removes the ambiguous data from the big data; and divides the big data from which the ambiguous data has been removed by time period on a Realtime, Daily, Monthly, Quarterly, and Yearly basis and repeatedly learns it to generate learning data for each task, individual, and company, and
wherein the classification data is
generated by classifying the preprocessed data by task, individual, and company based on the machine-processed data, which includes real-time data automatically generated by a machine; the semi-automatically processed data, which includes arithmetic calculation formulas and artificially set target value data; the user-processed data, which includes data artificially created, judged, and defined; the company-specific data, which includes data limited to a specific company; the company-universal data, which includes universal data regardless of company; and the company-specific vocabulary data, which includes vocabulary data used only in a specific company, and then inspecting the data classified by task, individual, and company for duplication and similarity.

delete

A computer program stored in a computer-readable recording medium, combined with a hardware computer, to perform the method of any one of claims 1 to 5, 11 to 12, and 14.