KR20220121182A

KR20220121182A - Method and system for predicting cognitive impairment based on document classification model and fluency tagging

Info

Publication number: KR20220121182A
Application number: KR1020220014969A
Authority: KR
Inventors: 이헌복
Original assignee: 주식회사 바이칼에이아이
Priority date: 2021-02-24
Filing date: 2022-02-04
Publication date: 2022-08-31

Abstract

Provided is a method for predicting cognitive impairment based on a document classification model technique and fluency tagging. The method comprises: a step of receiving utterance data of a subject for a document to be evaluated; a step of transcribing the utterance data of the subject; a step of tagging a predetermined non-fluency feature in the transcribed utterance data; a step of tokenizing the tagged non-fluency feature and each word; a step of performing word- and document-based embedding on the tokenized utterance data; and a step of outputting a prediction result value corresponding to a parameter affecting cognitive impairment by inputting the embedded utterance data to a pre-trained prediction model.

Description

Method and system for predicting cognitive impairment based on document classification model technique and fluency tagging

The present invention relates to a method and system for predicting cognitive impairment based on a document classification model technique and fluency tagging.

Cognitive impairment refers to an abnormality in the intellectual abilities of the brain, such as memory, language ability, visuospatial analysis and construction ability, attention and concentration, judgment, and reasoning. Its causes are highly diverse and number in the dozens, ranging from physical damage to the brain to senile dementia, infectious diseases, brain tumors, and depression. Once cognitive impairment occurs, it has a significant negative impact on daily and social life.

Simple means of predicting cognitive impairment before it is diagnosed include the Mini-Mental State Examination (MMSE) and the Boston Naming Test (BNT). These make it possible to predict the presence of an impairment by a simple method before time and money are spent on diagnosis with specialized medical equipment, thereby lowering the threshold for diagnosing cognitive impairment.

If such predictions are carried out frequently and easily and lead to early diagnosis, they play an important role in detecting more serious subsequent diseases such as dementia at an early stage and in allowing appropriate measures, such as treatment, to be taken promptly.

Laid-open Patent Publication No. 10-2019-0021896 (2019.03.06)

The problem to be solved by the present invention is to provide a method and system for predicting cognitive impairment based on a document classification model technique and fluency tagging, which model the non-fluency features of spontaneous utterances using a document classification model technique and, on that basis, derive predicted values for parameters of the utterances that can affect cognitive impairment.

However, the problems to be solved by the present invention are not limited to the problem described above, and other problems may exist.

A method for predicting cognitive impairment based on a document classification model technique and fluency tagging according to a first aspect of the present invention for solving the above-described problem comprises the steps of: receiving utterance data of a subject for a document to be evaluated; transcribing the utterance data of the subject; tagging a predetermined non-fluency feature in the transcribed utterance data; tokenizing the tagged non-fluency feature and each word; performing word- and document-based embedding on the tokenized utterance data; and inputting the embedded utterance data into a pre-trained prediction model and outputting a prediction result value corresponding to a parameter affecting cognitive impairment.

In some embodiments of the present invention, the step of tagging a predetermined non-fluency feature in the transcribed utterance data may tag a non-fluency feature including at least one of: a hesitation feature, in which the subject is silent for a predetermined time; an interjection feature, in which mid vowels, words, or phrases unrelated to conveying meaning are included in the utterance; a revision feature, in which the content being conveyed, the grammatical form, or the pronunciation of a word is changed while speaking; an unfinished feature, in which a word or utterance ends without being completed; a phrase repetition feature, in which a plurality of complete words are repeated; a word repetition feature, in which an entire word is repeated; a syllable repetition feature, in which a syllable is repeated; a sound repetition feature, in which only one phoneme or part of a diphthong is repeated; a prolongation feature, in which a phoneme or one element of a vowel is sustained for a long time; and a block feature, in which the sound is interrupted by a time gap that occurs when starting a phoneme or producing a plosive.

In some embodiments of the present invention, the step of tokenizing the tagged non-fluency feature and each word may include: deleting the content portion, out of the tag portion and the content portion corresponding to the tagged non-fluency feature; and tokenizing each word and the non-fluency feature from which the content portion has been deleted.

In some embodiments of the present invention, the step of performing word- and document-based embedding on the tokenized utterance data may include: performing word embedding, which vectorizes every word constituting the tokenized utterance data; and performing document embedding, which adds a vector representing feature information of the document to the word embedding result.

In some embodiments of the present invention, the step of performing word- and document-based embedding on the tokenized utterance data may perform the embedding when the tokens constituting the tokenized utterance data include a new token that was not learned by the pre-trained prediction model.

Some embodiments of the present invention may further include updating the prediction model based on the result of the embedding performed on the new token.

Some embodiments of the present invention may further include the steps of: setting training data for training the prediction model to be input to an input terminal of the prediction model; setting a preset prediction result value corresponding to a parameter affecting cognitive impairment to be output from an output terminal of the prediction model; and training the prediction model for which the input terminal and the output terminal have been set. The training data may be constructed by transcribing predetermined utterance data, tagging a predetermined non-fluency feature in the transcribed utterance data, and then tokenizing and embedding the tagged utterance data.

In some embodiments of the present invention, the training data may be tokenized after the content portion, out of the tag portion and the content portion corresponding to the tagged non-fluency feature, has been deleted.

In some embodiments of the present invention, the training data may be vectorized by performing word embedding, which vectorizes every word constituting the tokenized training data, and document embedding, which adds to the word embedding result a vector representing feature information of the document constituting the training data.

In addition, a system for predicting cognitive impairment based on a document classification model technique and fluency tagging according to a second aspect of the present invention includes: a communication module that receives utterance data of a subject for a document to be evaluated; a memory storing a program for outputting, based on the utterance data and through a pre-trained prediction model, a prediction result value corresponding to a parameter affecting cognitive impairment; and a processor that executes the program stored in the memory. By executing the program stored in the memory, the processor transcribes the utterance data of the subject, tags a predetermined non-fluency feature in the transcribed utterance data, tokenizes the tagged non-fluency feature and each word, performs word- and document-based embedding on the tokenized utterance data, and then inputs the embedded utterance data into the prediction model to output the prediction result value.

A computer program according to another aspect of the present invention for solving the above-described problem is combined with a computer, which is hardware, to execute the method for predicting cognitive impairment based on the document classification model technique and fluency tagging, and is stored in a computer-readable recording medium.

Other specific details of the present invention are included in the detailed description and drawings.

According to the present invention described above, by training and applying a prediction model through the tagging, tokenization, and embedding of non-fluency features, prediction result values for parameters that can affect cognitive impairment can be derived accurately.

The effects of the present invention are not limited to the effects mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a flowchart illustrating a process of training a prediction model according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating how the training data is constructed.
FIG. 3 is a diagram illustrating the tags defined for non-fluency features.
FIG. 4 is a flowchart of a method for predicting cognitive impairment according to an embodiment of the present invention.
FIG. 5 is a block diagram of a cognitive impairment prediction system according to an embodiment of the present invention.

Advantages and features of the present invention, and methods of achieving them, will become apparent with reference to the embodiments described below in detail in conjunction with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of its scope, and the present invention is defined only by the scope of the claims.

The terminology used herein is for the purpose of describing the embodiments and is not intended to limit the present invention. In this specification, the singular also includes the plural unless specifically stated otherwise. As used herein, "comprises" and/or "comprising" does not exclude the presence or addition of one or more components other than those stated. Like reference numerals refer to like elements throughout the specification, and "and/or" includes each of the recited elements and every combination of one or more of them. Although "first", "second", and the like are used to describe various elements, these elements are of course not limited by these terms. These terms are used only to distinguish one element from another. Therefore, a first element mentioned below may also be a second element within the technical spirit of the present invention.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the meaning commonly understood by those of ordinary skill in the art to which the present invention pertains. In addition, terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless clearly and specifically defined.

The present invention relates to a method and system (100) for predicting cognitive impairment based on a document classification model technique and fluency tagging.

To further simplify and automate means of predicting cognitive impairment simply and quickly, models that analyze the speech and language of spontaneous utterances and make predictions in real time or near real time have been studied continuously.

As one such means of prediction, an embodiment of the present invention models the transcribed content of spontaneous utterances and the non-fluency features of that content using a document classification model technique from natural language processing (NLP), and presents a prediction model built on it. The present invention does not diagnose cognitive impairment. Rather, it derives a predicted value for at least one parameter that may affect cognitive impairment.

Hereinafter, a method for predicting cognitive impairment based on a document classification model technique and fluency tagging according to an embodiment of the present invention (hereinafter, the cognitive impairment prediction method) will be described with reference to FIGS. 1 to 4.

FIG. 1 is a flowchart illustrating a process of training a prediction model according to an embodiment of the present invention. FIG. 2 is a flowchart illustrating how the training data is constructed. FIG. 3 is a diagram illustrating the tags defined for non-fluency features. FIG. 4 is a flowchart of a method for predicting cognitive impairment according to an embodiment of the present invention.

Meanwhile, each of the steps shown in FIGS. 1, 2, and 4 may be understood to be performed by the cognitive impairment prediction system 100 of FIG. 5, described later, but is not necessarily limited thereto.

First, the process of training the prediction model applied to the present invention is described with reference to FIGS. 1 to 3, and then the process of predicting cognitive impairment based on the trained prediction model is described with reference to FIG. 4.

First, the training data prepared for training the prediction model is set to be input to the input terminal of the prediction model (S110).

To this end, training data composed of predetermined utterance data is prepared (S210).

In one embodiment, the training data in the present invention may be predetermined utterance data. The utterance data is obtained through a one-to-one conversation between a subject and an interlocutor. It is a document in which the utterances of the subject during that conversation are transcribed automatically or manually, and tags indicating non-fluency features are then added automatically or manually. Here, the interlocutor may be a person or may be an artificial-intelligence-based chatbot.

The input training data may consist of a single sentence, but it generally consists of a paragraph made up of a plurality of sentences. In the present invention, such a unit is referred to as a document for convenience. In addition, one document per subject is treated as one unit of training data, although in some cases it may be divided into a plurality of documents.

The conversation between the subject and the interlocutor is preferably one in which consistent word use can be expected, such as an existing MMSE procedure or answers to a series of questions prepared in advance, but is not necessarily limited thereto.

In addition, to enable semantic analysis of the language, it may be preferable to record the training data with one-word sentences and meaningless filler responses removed. However, the present invention is not necessarily limited thereto, and the training data may of course be constructed differently depending on the purpose of the cognitive impairment prediction.

In addition, for transcription and tagging, an embodiment of the present invention may apply an automatic transcription and tagging method that uses a predetermined speech recognition technology, or may apply a manual transcription and tagging method. Even when the data is generated automatically for efficiency, refining it manually to improve accuracy is recommended.

Meanwhile, in an embodiment of the present invention, once the utterance data has been transcribed to construct the training data (S220), a predetermined non-fluency feature in the transcribed utterance data is tagged (S230).

For example, non-fluency (disfluency) features may be divided into the types shown in FIG. 3, each of which is assigned a predetermined tag, as follows.
① The hesitation feature is silence for a predetermined time (for example, one second or longer) in the utterance data, and is assigned the {H} tag.
② The interjection feature is the inclusion in the utterance of mid vowels, words, or phrases unrelated to conveying meaning, and is assigned the {I} tag.
③ The revision feature is changing the content being conveyed, the grammatical form, or the pronunciation of a word while speaking, and is assigned the {R} tag.
④ The unfinished feature is a word or utterance that ends without being completed. It is usually followed by a revision feature, and is assigned the {U} tag.
⑤ The phrase repetition feature is the repetition of a plurality of complete words, and is assigned the {RP} tag.
⑥ The word repetition feature is the repetition of an entire word, including the repetition of a monosyllabic word, and is assigned the {RW} tag.
⑦ The syllable repetition feature is the repetition of a syllable only, intermediate between sound repetition and word repetition, and is assigned the {RL} tag.
⑧ The sound repetition feature is the repetition of only one phoneme or part of a diphthong, and is assigned the {RS} tag.
⑨ The prolongation feature is a phoneme or one element of a vowel being sustained for an inappropriately long time, and is assigned the {P} tag.
⑩ The block feature is the sound being interrupted by an inappropriate time gap that occurs when starting a phoneme or producing a plosive, and is assigned the {B} tag. The block tag is used only when the block is clearly distinguishable. Otherwise, the hesitation tag is used.
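The ten tags above can be held in a small lookup table. The following is a minimal Python sketch, not the patent's implementation: the tag codes and feature names come from the description of FIG. 3, while the names `DISFLUENCY_TAGS`, `tag_token`, and `choose_pause_tag` are hypothetical and introduced only for illustration.

```python
# Tag codes and feature names as listed in the description of FIG. 3.
DISFLUENCY_TAGS = {
    "H": "hesitation",
    "I": "interjection",
    "R": "revision",
    "U": "unfinished",
    "RP": "phrase repetition",
    "RW": "word repetition",
    "RL": "syllable repetition",
    "RS": "sound repetition",
    "P": "prolongation",
    "B": "block",
}


def tag_token(code: str) -> str:
    """Return the brace-delimited form of a tag code, e.g. 'RL' -> '{RL}'."""
    if code not in DISFLUENCY_TAGS:
        raise KeyError(f"unknown non-fluency tag: {code}")
    return "{" + code + "}"


def choose_pause_tag(block_is_distinct: bool) -> str:
    """Fallback rule from the text: {B} only for a clearly distinct block, else {H}."""
    return tag_token("B" if block_is_distinct else "H")


print(tag_token("RL"), choose_pause_tag(False))
```

How a block is judged to be "clearly distinguishable" is not specified in the text, so the sketch takes that decision as a boolean input.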

After the transcribed utterance data has been tagged, the tagged non-fluency features and each word are tokenized (S240).

As described above, a document is composed of sentences, and a sentence is composed of a plurality of words, so a document is ultimately a sequence of words. The word, which is the smallest unit constituting a document, may be referred to as a token. The separate term "token" is used because this unit may not coincide with the linguistic unit of a word and may include non-verbal expressions.

In general, one word can be regarded as one token, and in the present invention one tag corresponding to a non-fluency feature is likewise treated as one specific token.

An example of tagging non-fluency features is as follows.

- Input: 나는 그러니깐 어어어제 학교에 갔… (roughly, "I, you know, ye-ye-yesterday went to school", with the final word cut off)

- Tagging: 나는 {그러니깐:I} {어어:RL} 어제 학교에 갔{U}

- Tokenization: 나+는+{I}+{RL}+어제+학교+에+가+았+{U}

Here, an embodiment of the present invention may delete the content portion, out of the tag portion and the content portion corresponding to a tagged non-fluency feature, and tokenize each word and the non-fluency feature from which the content portion has been deleted.

In the example above, the portions tagged {I} and {RL} are unrelated to the meaning of the sentence, so their content portions are deleted and only the tag portions are converted into tokens. In addition, the final '갔' becomes '가+았' because morpheme segmentation is applied during tokenization.
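The content-deletion and tokenization step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `{content:TAG}` notation is taken from the tagging example, whereas `TAG_PATTERN`, `strip_tag_content`, `tokenize`, and the `TOY_MORPHEMES` lookup are hypothetical. The lookup table merely stands in for a real morphological analyzer and covers only the words of this one example.

```python
import re

# Matches "{content:TAG}" as well as a bare "{TAG}"; only the tag code is captured.
TAG_PATTERN = re.compile(r"\{(?:[^{}:]*:)?([A-Z]+)\}")

# Hypothetical stand-in for a morphological analyzer; covers only this example.
TOY_MORPHEMES = {"나는": ["나", "는"], "학교에": ["학교", "에"], "갔": ["가", "았"]}


def strip_tag_content(tagged: str) -> str:
    """Replace each '{content:TAG}' with ' {TAG} ', deleting the content portion."""
    return TAG_PATTERN.sub(lambda m: " {" + m.group(1) + "} ", tagged)


def tokenize(tagged: str) -> list[str]:
    """Split into tokens: each tag is one token, each word is split into morphemes."""
    tokens = []
    for piece in strip_tag_content(tagged).split():
        if piece.startswith("{"):
            tokens.append(piece)  # a non-fluency tag is kept as a single token
        else:
            tokens.extend(TOY_MORPHEMES.get(piece, [piece]))
    return tokens


tokens = tokenize("나는 {그러니깐:I} {어어:RL} 어제 학교에 갔{U}")
print("+".join(tokens))  # prints: 나+는+{I}+{RL}+어제+학교+에+가+았+{U}
```

Treating the tag as an ordinary token means the downstream embedding sees where each non-fluency occurred without seeing the filler content itself.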

Meanwhile, in an embodiment of the present invention, various methods may be used to split sentences for tokenization, and an appropriate method may be selected according to the type and size of the input, the purpose of the model, and the like.

When tokenization of the non-fluency features and each word of the training data is complete, word- and document-based embedding is performed on the tokenized utterance data (S250).

Here, an embodiment of the present invention may perform word embedding, which vectorizes every word constituting the tokenized utterance data, and document embedding, which adds a vector representing feature information of the document to the word embedding result.

In natural language processing, word embedding refers to any method of representing a sentence or document composed of words, or the words themselves, in a form that a computer can process. The most useful case is converting every word (token) of every document into a vector with appropriate values.

A word in word embedding is the basic unit constituting a sentence, and is the same concept as the token described above. Hereinafter, "word" and "token" are regarded as having the same meaning.

To perform the embedding, common methods such as Word2Vec and Doc2Vec, in which the vectors of the words constituting each document represent their semantic positional relationships to one another well, may be used. An embodiment of the present invention applies Doc2Vec by default: in addition to converting every word constituting the sentences and documents into a vector as in Word2Vec, a vector representing the document is added to the words through Doc2Vec. That is, after the vector of each token is computed in the same manner as in Word2Vec, information about the entire document is assigned to the vector representing the document.

An example of Word2Vec is as follows.

Document: 저 하늘은 푸른 하늘 ("that sky is a blue sky") → 저(w1) + 하늘(w2) + 은(w3) + 푸른(w4) + 하늘(w2)

Here, w1, w2, w3, and w4, which correspond to the words (tokens), each denote an arbitrary vector.

Given the input w1, w2, w3, X, w2, Word2Vec trains by adjusting the value (vector) of each word in the direction that increases the probability that X is w4. This calculation is repeated for every word of every document until every word has an appropriate vector value. Through this process, each word comes to occupy a position in the vector space formed by the words that reflects its relationships to the other words in each document, and thereby expresses meaning.
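The "hide X and predict it from its neighbours" setup can be sketched in Python as below. This is an illustrative sketch only. It assumes the CBOW-style variant of Word2Vec (the text does not name a variant), a softmax over the vocabulary, and randomly initialized vectors. The function names `cbow_examples` and `target_probability` and the separate `out_vectors` table are hypothetical.

```python
import math
import random


def cbow_examples(tokens, window=2):
    """Yield (context, target) pairs: each word is hidden and predicted from its neighbours."""
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        yield context, target


def target_probability(context, target, vectors, out_vectors):
    """Softmax probability of `target` given the mean of the context word vectors."""
    dim = len(next(iter(vectors.values())))
    hidden = [sum(vectors[w][k] for w in context) / len(context) for k in range(dim)]
    scores = {w: sum(h * o for h, o in zip(hidden, out)) for w, out in out_vectors.items()}
    norm = sum(math.exp(s) for s in scores.values())
    return math.exp(scores[target]) / norm


tokens = ["저", "하늘", "은", "푸른", "하늘"]  # w1, w2, w3, w4, w2
examples = list(cbow_examples(tokens, window=4))

# Randomly initialized vectors; training would adjust them to raise p(target | context).
rng = random.Random(0)
vectors = {w: [rng.uniform(-0.5, 0.5) for _ in range(4)] for w in set(tokens)}
out_vectors = {w: [rng.uniform(-0.5, 0.5) for _ in range(4)] for w in set(tokens)}

context, target = examples[3]  # context w1, w2, w3, w2 with X = w4 hidden
print(context, target, target_probability(context, target, vectors, out_vectors))
```

Training then consists of nudging the vectors so that this probability rises for every (context, target) pair in every document.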

An example of Doc2Vec is as follows.

Document: 저 하늘은 푸른 하늘 ("that sky is a blue sky") → D1 = 저 (w1, "that") + 하늘 (w2, "sky") + 은 (w3, topic particle) + 푸른 (w4, "blue") + 하늘 (w2, "sky") + p1

Here, w1, w2, w3, w4, and p1 each denote an arbitrary vector.

Doc2Vec adds one vector for the document and computes all vectors through the same process as Word2Vec. That is, when the input is given as w1, w2, w3, X, w2, p1, training proceeds in the direction that increases the probability that X is w4, just as in Word2Vec, and in this process the vector p1 representing the characteristics of the document is additionally determined.
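The following Python sketch is one possible toy reading of this procedure, not the implementation of the invention. It assumes that the context word vectors and the document vector are averaged and passed to a softmax over the vocabulary; the class `TinyPVDM`, its parameters, and the averaging itself are hypothetical choices made only to keep the example self-contained.

```python
import math
import random


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyPVDM:
    """Toy model: averaged context word vectors plus one document vector predict the masked word."""

    def __init__(self, docs, dim=8, seed=0):
        rng = random.Random(seed)
        self.docs, self.dim = docs, dim
        self.vocab = sorted({tok for doc in docs for tok in doc})
        self.index = {tok: i for i, tok in enumerate(self.vocab)}

        def rand_vec():
            return [rng.uniform(-0.5, 0.5) for _ in range(dim)]

        self.W = [rand_vec() for _ in self.vocab]  # word vectors (w1, w2, ...)
        self.P = [rand_vec() for _ in docs]        # one document vector per document (p1, ...)
        self.U = [rand_vec() for _ in self.vocab]  # output weights, one row per candidate word

    def examples(self):
        for d, doc in enumerate(self.docs):
            for i, target in enumerate(doc):
                yield d, doc[:i] + doc[i + 1:], target

    def _forward(self, d, context):
        vecs = [self.W[self.index[t]] for t in context] + [self.P[d]]
        h = [sum(v[k] for v in vecs) / len(vecs) for k in range(self.dim)]
        probs = softmax([sum(u[k] * h[k] for k in range(self.dim)) for u in self.U])
        return h, probs

    def mean_nll(self):
        """Average negative log-probability of the masked word over all positions."""
        losses = [-math.log(self._forward(d, ctx)[1][self.index[t]])
                  for d, ctx, t in self.examples()]
        return sum(losses) / len(losses)

    def train(self, epochs=300, lr=0.1):
        for _ in range(epochs):
            for d, ctx, target in self.examples():
                h, probs = self._forward(d, ctx)
                # Gradient of the cross-entropy loss with respect to the scores.
                err = [p - (j == self.index[target]) for j, p in enumerate(probs)]
                grad_h = [sum(err[j] * self.U[j][k] for j in range(len(self.vocab)))
                          for k in range(self.dim)]
                for j, u in enumerate(self.U):
                    for k in range(self.dim):
                        u[k] -= lr * err[j] * h[k]
                # Word vectors and the document vector receive the same kind of update.
                share = lr / (len(ctx) + 1)
                for vec in [self.W[self.index[t]] for t in ctx] + [self.P[d]]:
                    for k in range(self.dim):
                        vec[k] -= share * grad_h[k]


docs = [["저", "하늘", "은", "푸른", "하늘"]]
model = TinyPVDM(docs)
before = model.mean_nll()
model.train()
after = model.mean_nll()
document_vector = model.P[0]  # plays the role of p1
```

In this sketch the document vector is trained exactly like a word vector, which mirrors the statement that p1 is determined as a by-product of the same training process.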

Referring again to FIG. 1, the training data prepared in this way is set to be input to the input terminal of the prediction model (S110), and preset prediction result values corresponding to parameters affecting cognitive impairment are set to be output at the output terminal of the prediction model (S120).

That is, the prediction model is trained with the embeddings of all documents in the training data as the input values X and with the preset prediction result values corresponding to parameters affecting cognitive impairment, such as the presence or absence of cognitive impairment, its severity, and the MMSE score, each set as Y. Depending on how the prediction result value is set, the prediction model may be a binary or multinomial classification model, or a linear regression model. The prediction model may be generated with an existing machine learning method such as an SVM or a DNN, which can be selected as appropriate for the scale characteristics of the input data.
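To make the X/Y setup concrete, the sketch below trains a binary classifier on document embeddings. Plain logistic regression is used here only as a small stand-in for the SVM or DNN mentioned in the text, and the two-dimensional "embeddings" and labels are made-up toy values, not data from the invention.

```python
import math


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit a binary logistic regression by batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, label in zip(X, y):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-score))
            for k, xk in enumerate(x):
                grad_w[k] += (p - label) * xk
            grad_b += p - label
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# X: toy document embeddings (hypothetical values); Y: 1 = impairment label, 0 = no impairment.
X = [[1.0, 0.2], [0.9, 0.1], [-1.0, -0.3], [-0.8, 0.0]]
Y = [1, 1, 0, 0]

w, b = train_logistic(X, Y)
predictions = [predict(w, b, x) for x in X]
```

If Y were a severity grade, a multi-class classifier would take the place of this model, and if Y were an MMSE score, a regression model would.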

The prediction result value of the prediction model has the same form as the input data and, likewise, may be provided as original data through the transcription and tagging processes.

Once the input terminal and the output terminal have been set in this way, the prediction model is trained (S130).

If the previously built model is sufficiently large, the tokens of the prediction result value are already included in the embedding result, so a highly accurate prediction result value can be obtained almost as is. Even when the model is considerably large, however, if a new token is present, it is also possible to redo only the embedding, perform a quick prediction, and then update the prediction model based on the changed embedding result.

Hereinafter, the process of predicting cognitive impairment from a subject's utterance data based on the trained prediction model is described with reference to FIG. 4. Content that overlaps with the description of the prediction model training process in FIGS. 1 to 3 is omitted.

First, a subject's utterance data for a document to be evaluated is received (S310).

Next, the subject's utterance data is transcribed (S320), and tagging is performed on predetermined non-fluency features in the transcribed utterance data (S330).

Here, as described with reference to FIG. 3, the tagging applied in an embodiment of the present invention may be performed on at least one non-fluency feature among a hesitation behavior feature, an insertion behavior feature, a revision behavior feature, an incomplete behavior feature, a word repetition feature, a syllable repetition feature, a sound repetition feature, a prolongation behavior feature, and a block behavior feature.

Next, tokenization is performed on the tagged non-fluency features and each word (S340), and word- and document-based embedding is performed on the tokenized utterance data (S350).

In one embodiment, of the tag portion and the content portion corresponding to a tagged non-fluency feature, the content portion is deleted, and tokenization is then performed on each word and on the non-fluency feature from which the content portion has been deleted.
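A minimal sketch of this content-deleting tokenization is shown below. The inline markup `[TAG:content]`, the tag names `HES` and `WREP`, and the whitespace-based word splitting are all assumptions for illustration; the actual tag format is the one described with reference to FIG. 3, and a real system would likely use a proper Korean tokenizer.

```python
import re

# Hypothetical markup: "[TAG:content]" marks a span tagged as a non-fluency feature.
TAGGED_SPAN = re.compile(r"\[([A-Z]+):[^\]]*\]")


def tokenize(tagged_transcript):
    """Drop the content portion of each tagged span, keep the tag, then split into tokens."""
    tags_only = TAGGED_SPAN.sub(r" [\1] ", tagged_transcript)
    return tags_only.split()


tokens = tokenize("저 [HES:음...] 하늘은 [WREP:푸른 푸른] 하늘")
# tokens == ['저', '[HES]', '하늘은', '[WREP]', '하늘']
```

With the content portion removed, each non-fluency feature contributes a single token per occurrence, so the embedding captures what kind of non-fluency occurred and where, rather than the specific sounds produced.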

On the utterance data tokenized in this way, word embedding that vectorizes every word constituting the data is performed, and document embedding that adds a vector representing the characteristic information of the document to the word embedding result is then performed.

The embedding process may be performed when the tokens constituting the tokenized utterance data include a new token that was not learned by the pre-trained prediction model. The prediction model may then be updated based on the result of the embedding performed with the new token.

That is, when the trained model is considerably large and a new token is present, it is also possible to redo only the embedding, perform a quick prediction, and then update the model with the changed embedding.
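One way to read this branching is sketched below. The helper names `find_new_tokens` and `plan_steps`, the step labels, and the example vocabulary are hypothetical and only illustrate the order of operations suggested by the text.

```python
def find_new_tokens(tokens, known_vocabulary):
    """Return tokens absent from the trained vocabulary, in order of first appearance."""
    new_tokens = []
    for token in tokens:
        if token not in known_vocabulary and token not in new_tokens:
            new_tokens.append(token)
    return new_tokens


def plan_steps(tokens, known_vocabulary):
    """Decide which steps to run for one utterance (labels are illustrative only)."""
    if find_new_tokens(tokens, known_vocabulary):
        # New tokens: redo the embedding, predict quickly, update the model afterwards.
        return ["re-embed", "predict", "update model"]
    return ["predict"]


known = {"저", "하늘", "은", "푸른", "[HES]"}
steps_seen = plan_steps(["저", "[HES]", "하늘"], known)
steps_new = plan_steps(["저", "[BLK]", "하늘", "[BLK]"], known)
```

Deferring the model update until after the prediction keeps the response fast while still letting the new tokens influence later predictions.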

Next, the embedded utterance data is input to the pre-trained prediction model, and a prediction result value corresponding to a parameter affecting cognitive impairment is output (S360).

When the prediction model has been trained on a sufficiently large amount of data or there are few new tokens, the prediction result value may first be provided by the existing trained model, and an accurate prediction value may then be derived by updating the embedding and the model afterwards.

In the above description, steps S110 to S360 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present invention. In addition, some steps may be omitted as necessary, and the order of the steps may be changed. The contents of FIGS. 1 to 4, even where otherwise omitted, also apply to the cognitive impairment prediction system 100 of FIG. 5.

FIG. 5 is a block diagram of a cognitive impairment prediction system 100 according to an embodiment of the present invention.

The cognitive impairment prediction system 100 according to an embodiment of the present invention includes a communication module 110, a memory 120, and a processor 130.

The communication module 110 receives a subject's utterance data for a document to be evaluated. The communication module 110 may include both a wired communication module and a wireless communication module. The wired communication module may be implemented as a power line communication device, a telephone line communication device, cable home (MoCA), Ethernet, IEEE1294, an integrated wired home network, or an RS-485 control device. The wireless communication module may be implemented with wireless LAN (WLAN), Bluetooth, HDR WPAN, UWB, ZigBee, Impulse Radio, 60GHz WPAN, Binary-CDMA, wireless USB technology, wireless HDMI technology, and the like.

The memory 120 stores a program for outputting, through a prediction model trained in advance, a prediction result value corresponding to a parameter affecting cognitive impairment based on the utterance data. Here, the memory 120 collectively refers to non-volatile storage devices, which retain stored information even when power is not supplied, and volatile storage devices.

For example, the memory 120 may include NAND flash memory such as a compact flash (CF) card, a secure digital (SD) card, a memory stick, a solid-state drive (SSD), and a micro SD card; magnetic computer storage devices such as a hard disk drive (HDD); and optical disc drives such as CD-ROM and DVD-ROM.

In addition, the program stored in the memory 120 may be implemented in software or in hardware such as a field programmable gate array (FPGA) or an application specific integrated circuit (ASIC), and may perform predetermined roles.

The processor 130 executes the program stored in the memory 120. Specifically, the processor 130 transcribes the subject's utterance data, performs tagging on predetermined non-fluency features in the transcribed utterance data, performs tokenization on the tagged non-fluency features and each word, performs word- and document-based embedding on the tokenized utterance data, and then inputs the embedded utterance data to the prediction model to output a prediction result value.
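The sequence of operations executed by the processor can be sketched as a simple composition of stages. In the Python sketch below, every stage is passed in as a callable; the function name, the `[INS]` tag, and the stub stages in the usage example are placeholders, and the returned number is a dummy value rather than a clinically meaningful score.

```python
def predict_cognitive_impairment(utterance, transcribe, tag, tokenize, embed, model):
    """Run the inference stages in order; each stage is supplied by the caller."""
    transcript = transcribe(utterance)  # S320: transcription
    tagged = tag(transcript)            # S330: non-fluency tagging
    tokens = tokenize(tagged)           # S340: tokenization
    vector = embed(tokens)              # S350: word- and document-based embedding
    return model(vector)                # S360: prediction result value


# Stub stages for illustration only; real stages would wrap speech recognition,
# a tagger, a tokenizer, a Doc2Vec-style embedder, and the trained prediction model.
result = predict_cognitive_impairment(
    utterance=b"raw audio bytes",
    transcribe=lambda audio: "저 음 하늘은 푸른 하늘",
    tag=lambda text: text.replace("음", "[INS]"),
    tokenize=str.split,
    embed=lambda tokens: [len(tokens), tokens.count("[INS]")],
    model=lambda vec: vec[1] / vec[0],  # dummy score: share of tagged tokens
)
```

Passing the stages in as parameters keeps the pipeline independent of any particular transcription engine or model type, which matches the statement that steps may be split, merged, or reordered depending on the implementation.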

The embodiment of the present invention described above may be implemented as a program (or application) to be executed in combination with a computer, which is hardware, and may be stored in a medium.

The above-described program may include code written in a computer language, such as C, C++, JAVA, Ruby, or machine language, that the processor (CPU) of the computer can read through a device interface of the computer, so that the computer reads the program and executes the methods implemented as the program. Such code may include functional code related to functions that define the functionality required to execute the methods, and may include control code related to the execution procedure required for the processor of the computer to execute those functions in a predetermined order. Such code may further include memory-reference code indicating at which location (address) in the internal or external memory of the computer the additional information or media required for the processor to execute the functions should be referenced. In addition, when the processor of the computer needs to communicate with another remote computer or server in order to execute the functions, the code may further include communication-related code specifying how to communicate with the remote computer or server using the communication module of the computer, and what information or media to transmit and receive during communication.

The storage medium refers not to a medium that stores data for a short moment, such as a register, a cache, or a memory, but to a medium that stores data semi-permanently and can be read by a device. Specific examples of the storage medium include, but are not limited to, ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage devices. That is, the program may be stored in various recording media on various servers that the computer can access, or in various recording media on the user's computer. The medium may also be distributed over computer systems connected by a network, so that computer-readable code is stored in a distributed manner.

The foregoing description of the present invention is provided for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing the technical spirit or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present invention is defined by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: cognitive impairment prediction system
110: communication module
120: memory
130: processor

Claims

A method for predicting cognitive impairment based on a document classification model technique and fluency tagging, the method being performed by a computer and comprising:
receiving utterance data of a subject for a document to be evaluated;
transcribing the utterance data of the subject;
performing tagging on a predetermined non-fluency feature in the transcribed utterance data;
performing tokenization on the tagged non-fluency feature and each word;
performing word- and document-based embedding on the tokenized utterance data; and
inputting the embedded utterance data into a pre-trained prediction model and outputting a prediction result value corresponding to a parameter affecting cognitive impairment.

The method of claim 1,
wherein performing tagging on the predetermined non-fluency feature in the transcribed utterance data comprises
performing tagging on a non-fluency feature in the transcribed utterance data that includes at least one of: a hesitation behavior feature lasting a predetermined period of time; an insertion behavior feature in which a vowel, word, or phrase unrelated to the conveyed meaning is included in the utterance; a revision behavior feature in which the content conveyed, the grammatical form, or the pronunciation of a word is changed and spoken again; an incomplete behavior feature in which a word or utterance is left unfinished; a phrase repetition feature in which a plurality of completed words is repeated; a word repetition feature in which a whole word is repeated; a syllable repetition feature in which a syllable is repeated; a sound repetition feature in which a single phoneme or only part of a diphthong is repeated; a prolongation behavior feature in which a phoneme or an element of a vowel is sustained; and a block behavior feature in which the sound is interrupted by a time lag occurring when phonation is initiated or a plosive is uttered.

The method of claim 1,
wherein performing tokenization on the tagged non-fluency feature and each word comprises:
deleting the content portion from among a tag portion and a content portion corresponding to the tagged non-fluency feature; and
performing tokenization on each word and on the non-fluency feature from which the content portion has been deleted.

4. The method of claim 3,
wherein performing word- and document-based embedding on the tokenized utterance data comprises:
performing word embedding that vectorizes every word constituting the tokenized utterance data; and
performing document embedding that adds a vector representing characteristic information of the document to the word embedding result.

5. The method of claim 4,
wherein the word- and document-based embedding on the tokenized utterance data is performed when the tokens constituting the tokenized utterance data include a new token that has not been learned by the pre-trained prediction model.

6. The method of claim 5,
further comprising updating the prediction model based on the result of the embedding performed with the new token.

The method of claim 1, further comprising:
setting training data for training the prediction model to be input to an input terminal of the prediction model;
setting a preset prediction result value corresponding to a parameter affecting cognitive impairment to be output at an output terminal of the prediction model; and
training the prediction model for which the input terminal and the output terminal have been set,
wherein the training data is obtained by transcribing predetermined utterance data, performing tagging on a predetermined non-fluency feature in the transcribed utterance data, and then tokenizing and embedding the tagged utterance data.

The method of claim 5,
wherein the training data is tokenized after the content portion, from among a tag portion and a content portion corresponding to the tagged non-fluency feature, has been deleted.

The method of claim 6,
wherein the training data is vectorized by performing word embedding that vectorizes every word constituting the tokenized training data, and document embedding that adds a vector representing characteristic information of a document constituting the training data to the result of the word embedding.

A system for predicting cognitive impairment based on a document classification model technique and fluency tagging, the system comprising:
a communication module configured to receive utterance data of a subject for a document to be evaluated;
a memory storing a program for outputting, through a prediction model trained in advance, a prediction result value corresponding to a parameter affecting cognitive impairment based on the utterance data; and
a processor configured to execute the program stored in the memory,
wherein the processor, by executing the program stored in the memory, transcribes the utterance data of the subject, performs tagging on a predetermined non-fluency feature in the transcribed utterance data, performs tokenization on the tagged non-fluency feature and each word, performs word- and document-based embedding on the tokenized utterance data, and then inputs the embedded utterance data into the prediction model to output a prediction result value.