KR20200143940A

KR20200143940A - Method and system for assessing language ability based on artificial intelligence

Info

Publication number: KR20200143940A
Application number: KR1020190071643A
Authority: KR
Inventors: 김유섭; 이윤경; 김지수; 오병두
Original assignee: 한림대학교 산학협력단
Priority date: 2019-06-17
Filing date: 2019-06-17
Publication date: 2020-12-28
Also published as: KR102314572B1

Abstract

According to an embodiment of the present invention, a system for evaluating language ability is provided that includes a user device and a server. The user device provides the server with conversation data that includes speech data of a subject whose language ability is to be measured. The server runs an artificial intelligence algorithm on the conversation data received from the user device and evaluates the subject's level of language ability according to the result. An abnormality may be determined by comparing the predicted age of the subject calculated by the artificial intelligence algorithm with the subject's previously registered actual age.

Description

Method and system for assessing language ability based on artificial intelligence

The present invention relates to an artificial intelligence-based method and system for evaluating language ability.

Speech-language pathology is the field that studies communication-related disorders, including the sensory and motor impairments that affect communication. The symptoms and forms of communication disorders are difficult to define, but they range from communication disorders caused by a disability to difficulties that typically developing children have in learning their native language. This applies not only to children but also to adults in similar circumstances.

According to a study forecasting the demand for speech therapists in Korea, about 60,000 speech and hearing clinicians are expected to be needed in 2020, but the supply falls far short of that demand. Accordingly, many studies in Korea have addressed how to secure professional speech therapy personnel and how to ensure their expertise.

Against this background, the most actively researched area is children's language ability, because language ability increases explosively in the preschool age group. A conventional method for detecting developmental disorders in preschool and school-age groups is spontaneous speech analysis. Spontaneous speech analysis collects the utterances produced during counseling or play-based therapy, statistically analyzes the resulting measurement indicators, and uses them to predict the subject's language age, to determine whether a developmental disorder is present, and to support various studies. This conventional language ability analysis is costly, because the subject's utterances must be recorded, transcribed, and then analyzed.

1) Korean Registered Patent Publication No. 10-1684424 (Device, system and method for evaluating the social language use ability of children with autism spectrum disorder)

The present invention was devised to automatically evaluate the language ability of a specific subject using a convolutional neural network model.

In particular, the present invention was devised to automatically classify the level of the sentences spoken by a subject whose language ability is to be evaluated, to calculate the subject's language age, and to determine whether a disorder is present by comparing the subject's actual age with the calculated language age.

A language ability evaluation system according to an embodiment of the present invention includes a user device and a server. The user device provides the server with conversation data that includes speech data of a subject whose language ability is to be measured. The server runs an artificial intelligence algorithm on the subject's conversation data received from the user device and evaluates the subject's level of language ability according to the result. The server may determine whether an abnormality is present by comparing the predicted age of the subject calculated by the artificial intelligence algorithm with the subject's previously registered actual age.

An embodiment of the present invention can evaluate, effectively and simply, not only subjects with language difficulties but also subjects with excellent language ability.

An embodiment of the present invention can easily extract a subject's speech data.

An embodiment of the present invention allows the subject to converse with an automatically matched counselor without meeting the counselor in person, which reduces the inconvenience of visiting an institution.

FIG. 1 is a system diagram illustrating a system for implementing a language ability evaluation method according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of a server according to an embodiment of the present invention.
FIG. 3 is a diagram schematically illustrating the sequence of a language ability evaluation operation based on an artificial intelligence algorithm according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the form in which a subject's speech data is stored after being extracted according to an embodiment of the present invention.
FIG. 5 is a diagram schematically showing the configuration of an artificial intelligence algorithm according to an embodiment of the present invention.
FIG. 6 is a graph showing the individual prediction probability distribution within an age group, which is the basis for the statistical analysis operation according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating the sequence of a language ability evaluation operation according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that a person of ordinary skill in the art to which the invention belongs is fully informed of its scope. The present invention is defined only by the scope of the claims.

The terms used in this specification are for describing the embodiments and are not intended to limit the present invention. In this specification, the singular form also includes the plural form unless the text specifically states otherwise. As used in the specification, "comprises" and/or "comprising" do not exclude the presence or addition of one or more elements other than those mentioned. Throughout the specification, the same reference numerals refer to the same elements, and "and/or" includes each of the mentioned elements and every combination of one or more of them. Although "first", "second", and the like are used to describe various elements, these elements are not limited by those terms. The terms are used only to distinguish one element from another. Therefore, a first element mentioned below may also be a second element within the technical idea of the present invention.

Unless otherwise defined, all terms used in this specification (including technical and scientific terms) have the meanings commonly understood by a person of ordinary skill in the art to which the present invention belongs. In addition, terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are explicitly and specifically defined.

The term "unit" or "module" used in the specification means a software component or a hardware component such as an FPGA or ASIC, and a "unit" or "module" performs certain roles. However, "unit" or "module" is not limited to software or hardware. A "unit" or "module" may be configured to reside in an addressable storage medium or may be configured to execute on one or more processors. Thus, as an example, a "unit" or "module" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functions provided within components and "units" or "modules" may be combined into a smaller number of components and "units" or "modules", or further separated into additional components and "units" or "modules".

The spatially relative terms "below", "beneath", "lower", "above", "upper", and the like may be used to easily describe the relationship between one component and other components as shown in the drawings. Spatially relative terms should be understood as covering different orientations of the components during use or operation in addition to the orientation shown in the drawings. For example, if a component shown in a drawing is turned over, a component described as "below" or "beneath" another component may be placed "above" that component. Accordingly, the exemplary term "below" may cover both the downward and upward directions. Components may also be oriented in other directions, and spatially relative terms may therefore be interpreted according to the orientation.

In this specification, a computer means any kind of hardware device that includes at least one processor and, depending on the embodiment, may be understood to also encompass the software configuration running on that hardware device. For example, a computer may be understood to include, without limitation, a smartphone, a tablet PC, a desktop, a laptop, and the user clients and applications running on each of these devices.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Each step described in this specification is described as being performed by a computer, but the entity performing each step is not limited thereto, and depending on the embodiment at least some of the steps may be performed on different devices.

FIG. 1 is a system diagram illustrating a system for implementing a language ability evaluation method according to an embodiment of the present invention.

As illustrated in FIG. 1, the system may include a server 100 and a user device 200.

The user device 200 may include at least one of a smartphone, a tablet personal computer (tablet PC), a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook computer, a workstation, a server, a personal digital assistant (PDA), and a portable multimedia player (PMP).

The user device 200 is a component that collects conversation data (e.g., a recorded voice file) between a subject who wishes to have his or her level of language ability evaluated and an examiner, and transmits the collected conversation data to the server 100. The examiner may be, for example, a test assistant who leads the conversation by asking the subject questions so that the subject speaks.

According to various embodiments, the user device 200 may be provided separately as a subject device and an examiner device. A call made between the subject device and the examiner device through the server 100 for the purpose of collecting conversation data may be recorded automatically, and the recorded conversation data may be transmitted to the server 100.

The server 100 may receive conversation data from the user device 200, extract the subject's speech data from the received conversation data, and analyze the subject's language ability from the extracted speech data. In this case, the server 100 may extract the speech data after a transcription operation that converts the conversation data, which is in the form of a voice file, into text.

In addition to the conversation data, the server 100 may further collect from the user device 200 various information required to analyze the subject's language ability (e.g., the subject's age and disease information).

Specifically, the server 100 may input the subject's speech data into a pre-trained artificial intelligence algorithm (a CNN) and analyze it to predict the age corresponding to that speech data. The server 100 may then determine the subject's linguistic characteristics (e.g., a language disorder or superior language ability) by comparing the subject's actual age with the predicted age produced by the artificial intelligence algorithm.
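By way of illustration only, the comparison between the actual age and the predicted age might be sketched as follows. The function name `assess_language_age`, the `AgeAssessment` record, and the `tolerance_years` margin are hypothetical and do not appear in the specification, which does not state how large a gap counts as an abnormality.

```python
from dataclasses import dataclass


@dataclass
class AgeAssessment:
    actual_age: float
    predicted_age: float
    finding: str


def assess_language_age(actual_age: float, predicted_age: float,
                        tolerance_years: float = 1.0) -> AgeAssessment:
    """Compare the registered actual age with the model's predicted language age.

    tolerance_years is an assumed margin, not a value from the specification.
    """
    gap = predicted_age - actual_age
    if gap < -tolerance_years:
        # Predicted language age is well below the actual age.
        finding = "possible language delay"
    elif gap > tolerance_years:
        # Predicted language age is well above the actual age.
        finding = "advanced language ability"
    else:
        finding = "within expected range"
    return AgeAssessment(actual_age, predicted_age, finding)
```

For example, under these assumptions, `assess_language_age(6.0, 4.0).finding` evaluates to "possible language delay".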

The server 100 according to an embodiment of the present invention may be configured as a single server or, as needed, as a plurality of servers, and the term may encompass a cloud server.

FIG. 2 is a block diagram showing the configuration of a server according to an embodiment of the present invention.

The server 100 according to an embodiment of the present invention may obtain from the user device 200 the various information required to analyze the subject's language ability, and may evaluate the subject's language ability by processing and analyzing the obtained information.

The server 100 according to an embodiment of the present invention may include a communication unit 110, a storage unit 120, and a control unit 130.

First, the communication unit 110 may use a network to transmit and receive data between the user device and the server, and the type of network is not particularly limited. The network may be, for example, an Internet Protocol (IP) network that provides a service for transmitting and receiving large amounts of data over IP, or an All-IP network that integrates different IP networks. The network may also be one of, or a combination of at least one of, a wired network, a wireless broadband (WiBro) network, a mobile communication network including WCDMA, a mobile communication network including a high speed downlink packet access (HSDPA) network and a long term evolution (LTE) network, a mobile communication network including LTE Advanced (LTE-A) and 5G (fifth generation), a satellite communication network, and a Wi-Fi network.

The communication unit 110 according to an embodiment of the present invention may receive, through communication with the user device 200, a conversation file of the users recorded on the user device 200. In this case, the communication unit 110 may receive from the user device 200 a collection request for collecting conversation data, that is, a voice file of the conversation between the subject and the examiner.

In this case, the user device 200 may be, for example, a device that is managed by a speech therapy institution and can record a conversation between the subject and the examiner on the spot. However, the user device 200 is not limited thereto and may be a device such as the personal smartphone of the subject or the examiner. When the user device 200 is a personal device, it may be classified as a subject device 200a or an examiner device 200b.

If the subject wishes to have his or her language ability diagnosed remotely without meeting an examiner in person, the subject may access the server 100 through the subject device 200a, which is the subject's personal device, and be assigned an examiner to converse with. The subject may then converse with the assigned examiner, the content of the conversation may be provided automatically to the server 100, and the subject's language ability may be measured on the basis of that content.

In this case, the communication unit 110 may receive a conversation request signal from the subject device 200a and, in response, transmit a subject matching signal to a suitable examiner device 200b. According to an embodiment, the communication unit 110 of the server 100 may transmit a subject matching signal that causes a specific notification bell to ring on the examiner device 200b, so that the examiner can notice the bell and start a call with the subject in much the same way as with an ordinary call connection. The conversation that follows is recorded automatically, and the recorded conversation may be transmitted automatically to the server 100 through the communication unit 110.
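By way of illustration only, the request, matching, and recording hand-off described above might be sketched as follows. The class `MatchingServer` and its methods are hypothetical, and choosing the first available examiner is an assumption, since the specification does not state how a suitable examiner device is selected.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Optional


class MatchingServer:
    """Hypothetical stand-in for the matching role of the server 100."""

    def __init__(self, examiner_ids: Iterable[str]) -> None:
        self.available: Deque[str] = deque(examiner_ids)  # idle examiner devices
        self.sessions: Dict[str, str] = {}                # subject id -> examiner id
        self.recordings: Dict[str, bytes] = {}            # subject id -> recorded call

    def request_conversation(self, subject_id: str) -> Optional[str]:
        """Handle a conversation request and return the matched examiner, if any."""
        if not self.available:
            return None
        # Assumption: the first idle examiner is treated as the suitable one.
        examiner_id = self.available.popleft()
        self.sessions[subject_id] = examiner_id
        return examiner_id

    def finish_conversation(self, subject_id: str, recording: bytes) -> None:
        """Store the automatically recorded call and free the examiner."""
        examiner_id = self.sessions.pop(subject_id)
        self.recordings[subject_id] = recording
        self.available.append(examiner_id)
```

In this sketch, the value returned by `request_conversation` plays the role of the subject matching signal sent to the examiner device.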

According to various embodiments, the communication unit 110 may obtain data related to training the artificial intelligence algorithm from the web or from the server of an affiliated institution.

In addition, according to various embodiments, the communication unit 110 may transmit to the user device 200 the result of the subject's language ability analysis performed by the server 100, together with the corresponding feedback information.

The storage unit 120 may include, for example, an internal memory or an external memory. The internal memory may include, for example, at least one of a volatile memory (e.g., dynamic RAM (DRAM), static RAM (SRAM), or synchronous dynamic RAM (SDRAM)) and a non-volatile memory (e.g., one-time programmable ROM (OTPROM), programmable ROM (PROM), erasable and programmable ROM (EPROM), electrically erasable and programmable ROM (EEPROM), mask ROM, flash ROM, flash memory (e.g., NAND flash or NOR flash), a hard drive, or a solid state drive (SSD)).

The external memory may further include a flash drive, for example, compact flash (CF), secure digital (SD), micro secure digital (Micro-SD), mini secure digital (Mini-SD), extreme digital (xD), a multimedia card (MMC), or a memory stick. The external memory may be functionally and/or physically connected to the electronic device through various interfaces.

The storage unit 120 according to an embodiment of the present invention may store the voice conversation data received from the user device 200 and the text conversation data obtained by transcribing the voice conversation data. The storage unit 120 may also store the programs required to convert the voice conversation data into text conversation data and to separately extract the subject's speech data and the examiner's speech data from the text conversation data.

In addition, according to an embodiment of the present invention, the storage unit 120 may store programs for matching, in response to a subject's request, an examiner who will converse with the subject, for connecting a call between the subject device 200a that requested the conversation and the examiner device 200b of the matched examiner, and for obtaining the recorded data of the call.

The storage unit 120 may also store the artificial intelligence algorithm that assesses the subject's conversational ability on the basis of the obtained conversation data, as well as the programs that support the overall operation of analyzing the subject's language ability.

The control unit 130 may also be referred to as a processor, a controller, a microcontroller, a microprocessor, a microcomputer, or the like. The control unit 130 may be implemented in hardware, firmware, software, or a combination thereof.

When implemented in firmware or software, an embodiment of the present invention may take the form of a module, procedure, or function that performs the functions or operations described above. The software code may be stored in the storage unit 120 and executed by the control unit 130. The storage unit 120 may be located inside or outside the user terminal and the server, and may exchange data with the control unit 130 by various known means.

The control unit 130 according to an embodiment of the present invention may perform the overall language ability evaluation operation described with reference to the following drawings.

According to an embodiment, the control unit 130 may perform the language ability evaluation operation on the basis of the artificial intelligence algorithm calculation process shown in FIG. 3.

As shown in FIG. 3, the control unit 130 may perform an operation of obtaining conversation data between the subject and the examiner (data collection), an operation of processing and preprocessing the conversation data into data for analysis (processing), an operation of performing a convolution operation based on the artificial intelligence algorithm, a statistical analysis operation, and a language age prediction operation (determining whether a language disorder is present).

Before performing calculation and analysis with the artificial intelligence algorithm, the control unit 130 according to an embodiment of the present invention may train the algorithm on a data set that meets specific criteria set by an administrator (examiner).

The control unit 130 may input the speech data extracted from the subject's conversation data into the trained and stored artificial intelligence algorithm, and thereby evaluate the subject's language ability.

Specifically, the control unit 130 may obtain text conversation data from the voice conversation data between the user (subject) and the examiner received through the communication unit 110.

The control unit 130 may also extract, from the text conversation data, only the text data corresponding to the subject's voice frequency as the subject's speech data.

The control unit 130 may analyze the extracted speech data of the subject based on the artificial intelligence algorithm, and determine whether the level of the subject's speech data corresponds to school age (SA; School Age) or preschool age (PSA; Pre School Age).

The subject's speech data may consist of a plurality of sentences, and the control unit 130 may accordingly analyze all of the sentences constituting the speech data. Based on the result of analyzing all of these sentences, the control unit 130 may calculate the expected age group of the user from the proportion of sentences in the entire speech data that correspond to school age. In this manner, the control unit 130 may evaluate the language ability of a specific subject.
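The ratio-based estimate described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names are hypothetical, and the 0.5 cut-off is an assumption, since the text does not state how the proportion is mapped to an age group.

```python
def school_age_ratio(sentence_labels):
    """Return the fraction of sentences labeled 'SA' (0.0 for empty input)."""
    if not sentence_labels:
        return 0.0
    sa_count = sum(1 for label in sentence_labels if label == "SA")
    return sa_count / len(sentence_labels)


def expected_age_group(sentence_labels, threshold=0.5):
    """Map the SA ratio to a group; `threshold` is a hypothetical cut-off."""
    return "SA" if school_age_ratio(sentence_labels) >= threshold else "PSA"


labels = ["SA", "PSA", "SA", "SA"]  # per-sentence predictions for one subject
ratio = school_age_ratio(labels)
group = expected_age_group(labels)
```

A finer mapping (for example, several ratio bands for several age groups) would fit the same interface.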

Hereinafter, the operation of the control unit 130 will be described in more detail. According to an embodiment, the control unit 130 may include a speech data extraction unit 131, a learning support unit 132, and an evaluation support unit 133.

The speech data extraction unit 131 may extract the subject's speech data through three main operations. Specifically, the speech data extraction unit 131 may perform an operation of acquiring voice conversation data between the subject and the examiner, an operation of converting the acquired voice conversation data into text conversation data, and an operation of classifying the text conversation data into subject speech data and examiner speech data.

The operation in which the speech data extraction unit 131 acquires conversation data may be performed more specifically as follows. According to an embodiment, the speech data extraction unit 131 may directly obtain, from the user device 200, the content of a conversation conducted face to face between the subject and the examiner (e.g., recorded conversation data).

According to various embodiments, the speech data extraction unit 131 may also acquire the content of a conversation held while the subject and the examiner are not face to face (e.g., recorded call data). In this case, the speech data extraction unit 131 may match the subject device 200a with the examiner device 200b and establish a communication connection between them so that the subject and the examiner can converse.

Specifically, the speech data extraction unit 131 may receive, through the communication unit 110, a conversation request signal from either the subject device 200a or the examiner device 200b, and may mediate a conversation connection in response to the received signal. For example, upon receiving a conversation request signal from the subject device 200a, the speech data extraction unit 131 may select an examiner and an examiner device 200b to be matched with the subject device 200a, and transmit a matching signal to the selected examiner device 200b. When the matching signal is transmitted, the examiner device 200b may output a ringtone such as an incoming-call alert and, at the same time, display a notification screen asking whether to proceed with a conversation with the subject. When the examiner responds to the matching signal (e.g., by selecting a button indicating acceptance of the conversation), the speech data extraction unit 131 may connect a call between the subject device 200a that sent the conversation request signal and the examiner device 200b that sent the response signal, and automatically record and acquire the conversation during the call. This may mean that the speech data extraction unit 131 requests at least one of the subject device 200a and the examiner device 200b to record the conversation during the call and transmit the recorded data.

The same matching principle may be applied when the examiner is the one who first requests a conversation. When the examiner sends a conversation request signal, the speech data extraction unit 131 may select a corresponding subject and transmit a matching signal to that subject's device 200a.

According to various embodiments, the speech data extraction unit 131 may manage a list of subjects who have requested a language ability test through conversation with an examiner and a list of examiners who are ready to converse with a subject. When a particular party does not respond to a matching signal, the speech data extraction unit 131 may select a replacement from the subject list or the examiner list and perform re-matching.
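The matching and re-matching behavior described above can be sketched as a simple queue walk. This is only an illustration under stated assumptions: the identifiers, the `accepts` callback (standing in for the response to a matching signal), and the first-come ordering are all hypothetical and are not specified in the text.

```python
from collections import deque


def match_examiner(ready_examiners, accepts):
    """Try examiners in queue order and return the first one who accepts.

    ready_examiners: deque of examiner ids that are ready to converse.
    accepts: callable(examiner_id) -> bool, a stand-in for the examiner's
             response to the matching signal (hypothetical interface).
    Returns the matched examiner id, or None if nobody accepts.
    """
    while ready_examiners:
        candidate = ready_examiners.popleft()
        if accepts(candidate):
            return candidate
        # No response from this candidate: continue with the next one,
        # which corresponds to re-matching.
    return None


queue = deque(["examiner-1", "examiner-2", "examiner-3"])
matched = match_examiner(queue, lambda e: e == "examiner-2")
```

The same routine would apply symmetrically when an examiner initiates the request and a subject has to be selected from the subject list.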

In addition, the speech data extraction unit 131 may convert the voice conversation data acquired by the methods described above into text conversation data. The speech data extraction unit 131 may convert the acquired voice data into text using an STT (Sound To Text) program or the like.

The speech data extraction unit 131 may then classify the text conversation data into subject speech data and examiner speech data.

According to an embodiment, in order to separate subject speech data from examiner speech data, the speech data extraction unit 131 may distinguish the speakers' voices using frequency characteristic information and classify the conversation content by speaker accordingly. Specifically, the speech data extraction unit 131 may obtain in advance the voice frequency information of either the subject or the examiner and derive feature values from it (e.g., waveform characteristics, frequency values, etc.). The speech data extraction unit 131 may group the sentence-level utterances constituting the text conversation data into two (or more) speakers by voice type, determine whether each speaker is the subject or the examiner, and classify the conversation sentences into subject speech data and examiner speech data. For example, the classification may include labeling each sentence with its speaker (e.g., subject or examiner).

According to an embodiment, the speech data extraction unit 131 may divide the speakers into A and B through voice frequency analysis, and then examine the conversation content (e.g., determining which side asks the larger share of questions) to determine which of A and B is the subject.
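As a rough sketch of the question-share heuristic above: the code assumes that the examiner asks the larger share of questions and that a question can be recognized by a trailing question mark in the transcribed text. Both are simplifying assumptions made for illustration, and the function name is hypothetical.

```python
def identify_subject(utterances):
    """Return the speaker label ('A' or 'B') assumed to be the subject.

    utterances: list of (speaker, text) pairs, speaker in {'A', 'B'}.
    The speaker with the lower share of questions is taken to be the
    subject; on a tie, 'B' is returned.
    """
    counts = {"A": [0, 0], "B": [0, 0]}  # [questions, total utterances]
    for speaker, text in utterances:
        counts[speaker][1] += 1
        if text.strip().endswith("?"):
            counts[speaker][0] += 1

    def question_ratio(speaker):
        questions, total = counts[speaker]
        return questions / total if total else 0.0

    return "A" if question_ratio("A") < question_ratio("B") else "B"


dialog = [
    ("A", "What did you do today?"),
    ("B", "I went to the park."),
    ("A", "Who did you go with?"),
    ("B", "With my mom."),
]
subject = identify_subject(dialog)
```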

Alternatively, the speech data extraction unit 131 may identify the subject and the examiner by selecting the speaker whose voice matches a previously registered voice characteristic value. To this end, the speech data extraction unit 131 may ask at least one of the examiner and the subject to register their voice with the server 100 before the conversation request, and the registered voice may be analyzed and stored in the form of frequency characteristic values.

Once it has been determined whether each utterance in the text conversation data was spoken by the subject or by the examiner, the speech data extraction unit 131 may classify and store the utterances as subject speech data and examiner speech data, respectively.

According to various embodiments, when the total amount of speech in the sentences constituting the subject's speech data is below a reference value, the speech data extraction unit 131 may request conversation data from the subject or the examiner again. The speech data extraction unit 131 can thereby secure the total amount of data required to evaluate the subject's language ability.

The sentences corresponding to the subject speech data may be organized and recorded as shown in FIG. 4.

FIG. 4 is a diagram illustrating the form in which a subject's speech data is stored after extraction, according to an embodiment of the present invention.

As shown in FIG. 4, the speech data extraction unit 131 according to an embodiment of the present invention may store, as separate items, the conversation turn in which each sentence was produced during the consultation, the total number of sentences, personal information such as the subject's name and age, the place where the consultation took place, the extracted subject speech data, and the extracted examiner speech data. The speech data extraction unit 131 may additionally store whether the subject and the examiner conducted the consultation face to face.
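One possible in-memory shape for the stored items listed above is sketched below. The class and field names are hypothetical and are chosen only to mirror the items named in the text; FIG. 4 itself may organize them differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConversationRecord:
    """Hypothetical record mirroring the items described for FIG. 4."""
    subject_name: str
    subject_age: int
    location: str
    face_to_face: bool
    subject_utterances: List[str] = field(default_factory=list)
    examiner_utterances: List[str] = field(default_factory=list)
    # Conversation turn index in which each subject sentence was produced.
    subject_turns: List[int] = field(default_factory=list)

    @property
    def total_sentences(self) -> int:
        """Total number of sentences from both speakers."""
        return len(self.subject_utterances) + len(self.examiner_utterances)


record = ConversationRecord(
    subject_name="subject-001",
    subject_age=9,
    location="clinic",
    face_to_face=True,
    subject_utterances=["I went to the park.", "With my mom."],
    examiner_utterances=["What did you do today?"],
    subject_turns=[2, 4],
)
```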

The learning support unit 132 may support training of the artificial intelligence algorithm according to an embodiment of the present invention. The artificial intelligence algorithm may evaluate language ability by predicting the subject's language age.

The learning support unit 132 may pre-process the training data to be fed into the artificial intelligence algorithm, and may then feed the training data into the algorithm. The artificial intelligence algorithm may be, for example, a convolutional neural network (CNN) model, and may perform sentence classification through an RCNN, in which the concept of an RNN is applied to the CNN.

First, the operation in which the learning support unit 132 pre-processes the training data will be described.

The learning support unit 132 may acquire material to be used as training data. This material may include, for example, a previously prepared data set and speech data extracted by the speech data extraction unit 131.

To remove data unsuitable for training, the learning support unit 132 may perform a pre-processing operation that removes noise from the acquired material (a data set composed of conversation content in text form, speech data extracted during data collection, etc.). Here, the learning support unit 132 may define sentence structures unsuitable for training as noise. Because free speech accepts every utterance the subject (counselee) produces during the consultation, utterance boundaries must be delimited before the utterance can be used as the unit of analysis. Once boundaries are drawn, simple affirmative or negative responses and short utterances at or below a predetermined threshold (e.g., three words or fewer) may be recorded as the subject's utterances. Such utterances offer little to distinguish the groups during training and can therefore lead to incorrect learning. The learning support unit 132 may accordingly define such utterances as noise data and remove them. Table 1 below summarizes the noise data.

However, removing such noise data from the PSA group can cause two problems. First, the amount of data may become too small. Second, because such noise is itself a representative feature of the PSA group, removing it may make training harder. Therefore, when pre-processing the PSA training data, the learning support unit 132 may set the noise removal criterion lower than that used for the SA training data, so that less data is removed than for the SA group. For example, the learning support unit 132 may leave all noise in place except simple answers. The learning support unit 132 may also set a limit on the difference in sentence counts so that, after noise removal, the SA training data and the PSA training data contain similar numbers of sentences. For example, the learning support unit 132 may require the difference between the number of sentences in the SA training data and in the PSA training data to be 10% or less. Each sentence in the training data may be labeled with its corresponding age group. By learning the individual sentences of the training data together with the age group label attached to each, the artificial intelligence algorithm becomes able to determine which age group a subject's utterance corresponds to. Table 2 below shows the number of sentences extracted from each group and examples of labeled sentences.
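The group-dependent noise removal and the balance check described above can be sketched as follows. Several details are assumptions made for illustration: the list of simple answers, whitespace-based word counting, and measuring the 10% difference relative to the larger set are not specified in the text, and all names are hypothetical.

```python
SIMPLE_ANSWERS = {"yes", "no", "yeah", "nope"}  # hypothetical examples


def is_simple_answer(sentence):
    """True if the sentence is only a simple affirmative/negative reply."""
    return sentence.strip().lower().rstrip(".!") in SIMPLE_ANSWERS


def remove_noise(sentences, group, min_words=4):
    """Drop noise sentences; the PSA group uses the looser criterion.

    Simple answers are removed for both groups. Short utterances
    (fewer than `min_words` words, i.e. three words or fewer by default)
    are removed only for the SA group.
    """
    kept = []
    for sentence in sentences:
        if is_simple_answer(sentence):
            continue
        if group == "SA" and len(sentence.split()) < min_words:
            continue
        kept.append(sentence)
    return kept


def is_balanced(n_sa, n_psa, max_diff=0.10):
    """Check that the sentence counts differ by at most `max_diff`,
    measured relative to the larger set (an assumption)."""
    larger = max(n_sa, n_psa)
    return larger > 0 and abs(n_sa - n_psa) / larger <= max_diff


raw = ["Yes.", "I like it", "I went to the park today"]
sa_clean = remove_noise(raw, "SA")
psa_clean = remove_noise(raw, "PSA")
```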

In this way, the learning support unit 132 may generate the training data separately as SA training data and PSA training data, and may adjust the number of sentences in each according to its respective criterion.

After generating the training data, the learning support unit 132 may feed it into the artificial intelligence algorithm for training, and may derive and apply the resulting parameters (e.g., the weights required for the convolution operation, regularization settings, etc.).

The artificial intelligence algorithm will be described with reference to FIG. 5.

FIG. 5 is a diagram schematically showing the configuration of the artificial intelligence algorithm according to an embodiment of the present invention.

As shown in FIG. 5, according to an embodiment of the present invention, the artificial intelligence algorithm for evaluating the subject's language ability may perform convolution operations in which filters of different sizes (2, 3, 4) slide over an N*k Sentence Matrix. The algorithm may perform 1-max pooling on the three resulting Feature Maps and concatenate the extracted primary feature vectors to generate a final feature vector. It may then perform binary classification on the final feature vector in a Softmax layer to which dropout is applied.
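The forward pass described for FIG. 5 can be sketched in plain Python as below. This is an illustrative sketch, not the patented model: it uses one filter per size, a ReLU activation, no bias terms, and inverted dropout, all of which are assumptions that the text does not specify. The weights are random placeholders rather than trained values.

```python
import math
import random


def conv_feature_map(sentence_matrix, filt):
    """Slide an h x k filter down an N x k sentence matrix (ReLU assumed)."""
    h = len(filt)
    fmap = []
    for start in range(len(sentence_matrix) - h + 1):
        total = 0.0
        for i in range(h):
            row = sentence_matrix[start + i]
            total += sum(w * x for w, x in zip(filt[i], row))
        fmap.append(max(0.0, total))
    return fmap


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def forward(sentence_matrix, filters, out_weights, drop_p=0.0, rng=random):
    """Convolution -> 1-max pooling -> concatenation -> dropout -> softmax."""
    # 1-max pooling keeps one scalar per feature map; the list is the
    # concatenated final feature vector.
    features = [max(conv_feature_map(sentence_matrix, f)) for f in filters]
    if drop_p > 0.0:  # inverted dropout, applied only during training
        features = [0.0 if rng.random() < drop_p else x / (1.0 - drop_p)
                    for x in features]
    scores = [sum(w * x for w, x in zip(row, features)) for row in out_weights]
    return softmax(scores)  # two class probabilities (binary classification)


rng = random.Random(0)
N, k = 6, 4  # sequence length and embedding dimension (toy values)
sentence = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(N)]
filters = [[[rng.uniform(-1, 1) for _ in range(k)] for _ in range(h)]
           for h in (2, 3, 4)]
out_weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
probs = forward(sentence, filters, out_weights)
```

With a filter of height h, each feature map has N - h + 1 entries, so the three filter sizes yield maps of different lengths; 1-max pooling is what makes the concatenated vector a fixed size.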

The convolutional neural network is an artificial intelligence method used mainly in image recognition and processing. To use a convolutional neural network for natural language processing, the learning support unit 132 performs a word embedding operation that expresses each utterance as an embedding matrix. Just as the input to the network in image processing is expressed as a matrix the size of the image in pixels, a sentence must be expressed as a matrix. Expressing a sentence as a matrix first requires expressing each word as a vector.

To meet this requirement, the learning support unit 132 may extract the words in the sentences constituting the training data (e.g., speech data of previous subjects, a previously prepared data set, etc.) for artificial intelligence training. The learning support unit 132 may then perform a series of operations to express each word of a sentence as a k-dimensional vector.

When the maximum number of words in the sentences constituting the data set is N, the learning support unit 132 may define N as the Sequence Length and build an N*k Sentence Matrix. To build the Sentence Matrix, if a sentence contains n words and n ≤ N, the learning support unit 132 may fill the remaining word positions up to N with padding so that every sentence is expressed with the same length.
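Building the padded Sentence Matrix can be sketched as follows. The sketch assumes zero vectors for padding and for words missing from the embedding table, and it truncates sentences longer than the sequence length; none of these choices is stated in the text, and the function name is hypothetical.

```python
def build_sentence_matrix(words, embeddings, seq_len, k):
    """Return a seq_len x k matrix for one sentence.

    words: list of tokens in the sentence (n tokens).
    embeddings: dict mapping a token to its k-dimensional vector.
    Rows beyond the n-th are zero padding (an assumption); unknown
    tokens are also mapped to the zero vector.
    """
    pad = [0.0] * k
    rows = [list(embeddings.get(word, pad)) for word in words[:seq_len]]
    rows += [list(pad) for _ in range(seq_len - len(rows))]
    return rows


toy_embeddings = {"i": [1.0, 0.0], "go": [0.0, 1.0]}
matrix = build_sentence_matrix(["i", "go"], toy_embeddings, seq_len=4, k=2)
```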

In building the Sentence Matrix, the learning support unit 132 may use pre-trained vector representations. The invention is not limited to this approach, however: according to an embodiment of the present invention, the vector values may be randomly initialized and then updated through training to values suitable for the classification task. To this end, the learning support unit 132 may initially use a previously prepared data set as training data with the vector values set arbitrarily, and then perform training with subjects' speech data to update them.

In addition, the learning support unit 132 may control the overall operation of the artificial intelligence algorithm required for training.

The convolutional layer may include a layer in which filters of different sizes perform the convolution operation, and a pooling layer that extracts feature vectors from the generated Feature Maps. The convolutional layer according to an embodiment of the present invention may include multiple filters whose sizes are chosen close to the single filter size that showed the best performance. The learning support unit 132 according to an embodiment of the present invention may control the multiple filters of the convolutional layer so that vectors representing important features are extracted from each filter's feature map through pooling (Feature Extraction). The learning support unit 132 may then control the network to pass the result through a Fully Connected Layer and output the classification result after normalization in the final Softmax Layer (Classification).

The learning support unit 132 may preferably apply dropout regularization in the convolutional neural network. Dropout is a representative regularization method for addressing the overfitting problem that arises in deep neural networks, in which units are randomly dropped during training. According to an embodiment of the present invention, dropout may be applied with a probability of, for example, 0.4 to 0.6, and may be applied between the convolutional layer and the fully connected layer.

Before training the artificial intelligence algorithm, the learning support unit 132 may first set the hyperparameters (Hyper Parameter). Examples of the hyperparameters are given in Table 3.

Along with setting the hyperparameters, the learning support unit 132 may perform training (Training), cross validation (Cross Validation), and data set configuration. The data set may be configured as shown in Table 4.

The scores obtained by configuring and training the model with this data set and these hyperparameters are shown in Table 5.

Unlike the positive and negative classes found in sentiment analysis, age ranges from the twenties up to 100 years or more. The distribution of prediction results is therefore analyzed using data from age groups other than those used for training as input. For example, when data from a subject in their thirties is used as input, the predictions will lean toward the SA group. In contrast, for a child aged 4 to 5, the predictions will lean toward the PSA group. On this basis, if the predictions for a subject of high school age lean toward the results typical of the infant group, it can be concluded that the subject's language age is lower than their actual age.
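The comparison between predicted language age and actual age can be sketched as a simple check. The school entry age of 7 and the 50% majority cut-off are hypothetical parameters chosen for illustration; the text does not give numeric criteria for this decision.

```python
def language_age_below_actual(actual_age, p_sa,
                              school_entry_age=7, majority=50.0):
    """Flag a subject whose utterances lean PSA although they are of school age.

    actual_age: registered age of the subject in years.
    p_sa: percentage of the subject's utterances predicted as SA.
    school_entry_age, majority: hypothetical thresholds.
    """
    expected_sa = actual_age >= school_entry_age
    predicted_sa = p_sa >= majority
    return expected_sa and not predicted_sa


flag_high_schooler = language_age_below_actual(17, p_sa=30.0)
```

A preschool-aged subject whose utterances lean PSA is not flagged, since that outcome matches their actual age.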

Equations 1 and 2 below express, for the distribution of an input subject's utterances, the probability of prediction as the SA group and the probability of prediction as the PSA group, respectively.

The SA group prediction probability may be calculated as in Equation 1 below.

[Equation 1]

P(SA) = (number of predictions to SA) / (total utterances supposed SA) * 100

The PSA group prediction probability may be calculated as in Equation 2 below.

[Equation 2]

P(PSA) = (number of predictions to PSA) / (total utterances supposed PSA) * 100

The denominator of P(SA), the SA group prediction probability, is the total number of utterances when all of the actual utterances are assumed to belong to the SA group, and the numerator is the number of input utterances predicted as SA. Likewise, the denominator of P(PSA), the PSA group prediction probability, is the total number of utterances assumed to belong to the PSA group, and the numerator is the number actually predicted as PSA. These equations are used to obtain the prediction probabilities reported in the experiments that follow.
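Equations 1 and 2 share one form and can be computed with a single helper, sketched below. The function name and the input representation (a list of per-utterance predicted labels) are assumptions for illustration.

```python
def group_prediction_rate(predictions, group):
    """Percentage of utterances predicted as `group`.

    predictions: predicted labels ('SA' or 'PSA') for utterances that are
                 all supposed to belong to `group`.
    """
    if not predictions:
        raise ValueError("at least one utterance is required")
    hits = sum(1 for label in predictions if label == group)
    return hits / len(predictions) * 100


predicted = ["SA", "SA", "PSA", "SA"]
p_sa = group_prediction_rate(predicted, "SA")    # Equation 1
p_psa = group_prediction_rate(predicted, "PSA")  # Equation 2 form
```

When both rates are computed over the same set of utterances, as here, they sum to 100.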

The learning support unit 132 may repeat the training operation until the SA group prediction probability and the PSA group prediction probability reach preset values.

According to an embodiment of the present invention, the learning support unit 132 may implement the artificial intelligence algorithm so that it analyzes a subject's language ability on the basis of statistical data such as the distribution of prediction probabilities by age group and the distribution of individual probabilities within each age group.

Tables 6 and 7 are referred to for a better understanding of the prediction probabilities by age group among these statistical data.

Table 6 shows the age group distribution of the subjects included in the training data.

Table 7 shows the model's prediction probabilities for preschool age (PSA) and school age (SA) for the age groups classified as in Table 6 (the age groups used for training) and for groups not used for training: a PSA-side group (e.g., infant) and an SA-side group (e.g., middle/high school).

As shown in Table 7, the lower the age, the higher the probability of prediction as the preschool age (PSA) group, and the higher the age, the higher the probability of prediction as the school age (SA) group. For Adult 3, the proportion of utterances predicted as school age (SA) falls again with aging, which is consistent with existing research indicating that language ability declines as age increases. In addition, the predictions for the infant and middle/high school data, which were not used for training, show an accuracy of 70% or higher.

The distribution of prediction probabilities for the utterances of the individual counselees in each age group may be calculated as follows. This individual distribution makes it possible to identify the causes of the results in Table 7 and to formulate a method for predicting the language age of an actual individual counselee.

FIG. 6 is a graph showing the individual prediction probability distributions within each age group, which serve as the basis for the statistical analysis operation according to an embodiment of the present invention.

FIG. 6 shows the distribution of per-individual prediction probabilities within each age group. In the graphs, a circle represents the probability that an individual's speech data is predicted as school age (SA), and a square represents the probability that it is predicted as preschool age (PSA). Panel (a) shows the individual probability distribution for the Elementary group, (b) for the Adult 1 group, (c) for the Adult 2 group, and (d) for the Adult 3 group.

Looking at the prediction results for individual speakers, the Elementary group shows a smaller gap between the predicted PSA probability and the predicted SA probability than the other groups. In about 40% of cases, the PSA probability is larger than the SA probability. This may indicate that this is an age range in which language development is actively under way. For example, speaker 6 (the 6th graph from the left) shows a very high level of language development, whereas speaker 16 (the 16th from the left) shows much slower development.

In contrast, in the Adult 1 and Adult 2 groups the predicted SA probability is consistently higher than the predicted PSA probability. The gap between the two is larger in Adult 2, which shows that language development has reached its peak. In the Adult 3 group, however, the SA and PSA probabilities are reversed again, which is attributable to the decline of language ability with aging.

The artificial intelligence algorithm for language ability evaluation according to an embodiment of the present invention may be configured to infer the language ability of a new subject based on the characteristics exhibited by these individual probability distributions. Specifically, the algorithm may be implemented to first compute the SA and PSA probabilities of a particular subject, and then to place the result in the corresponding graph and compare it with the values of other subjects.

In addition, to determine the subject's language ability, the artificial intelligence algorithm according to an embodiment of the present invention may be configured to examine the number of words the subject uses in an utterance (i.e., the sentence length), the characteristic vocabulary, and the utterance structure.

The artificial intelligence algorithm may be configured to judge language ability to be lower as the number of words in a particular subject's uttered sentences becomes smaller. The distribution of sentence lengths varies by age group, and sentences tend to become longer as age increases. This age-dependent sentence-length characteristic can serve as an indicator that distinguishes the Elementary group from the Adult groups.

[Table 8] below shows examples of utterances with an unusual distribution, and [Table 9] shows examples of utterances that were classified correctly. The example utterances are abbreviated for convenience; actual utterances may consist of more words than the examples.

The artificial intelligence algorithm may also determine whether characteristic vocabulary associated with each age group is present and, when a particular characteristic word is found, derive the corresponding age group. For example, because the TV programs that are mainly watched differ by age group, the algorithm may derive the subject's age group implied by a particular TV program title.

The artificial intelligence algorithm may also predict the subject's age based on the structure of the uttered sentences. Because the collected speech data is based on conversation held in a relaxed atmosphere, some sentence constituents will often be omitted. Taking this into account, the algorithm may be configured to rate the speaker's language age lower the more omissions the sentence structure contains.

Table 10 below shows the results of an analysis performed after modifying sentences that contain Elementary-group characteristic vocabulary and their sentence structure. The analysis shows that the probability of being predicted as the SA group increased in every age group (compared with Table 7).

In addition, the artificial intelligence algorithm may be implemented to determine whether a language disorder is present while taking into account that, in some adult groups (Adult 3), the probability of being predicted as preschool age may be higher than in the other school-age groups.

The present invention can evaluate a subject's language ability using an artificial intelligence algorithm designed in consideration of the characteristics described above.

The evaluation support unit 133 may evaluate the subject's language ability using the artificial intelligence algorithm according to an embodiment of the present invention. The evaluation support unit 133 may perform the analysis by feeding the subject's analysis data (e.g., conversation data or speech data) into the artificial intelligence algorithm that has been trained, and whose parameters have been set, by the learning support unit 132. The learning support unit 132 may support carrying out the analysis according to the operating sequence of the artificial intelligence algorithm described above.

For example, from the conversation data or speech data extracted by the speech data extraction unit 131, the evaluation support unit 133 may extract the number of words in each of the subject's uttered sentences and perform word embedding (the process of representing a sentence as a matrix) based on a padding operation (filling a sentence so that it reaches a preset length). The word embedding operation may be performed in a manner similar to the process by which the learning support unit 132 trains the artificial intelligence algorithm on the data set. Because the word embedding operation has already been described in detail, further description is omitted.
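
The padding and embedding step can be illustrated with a minimal sketch. The padding token, the toy vocabulary, the embedding dimension, and the function names below are assumptions made for illustration and do not come from the specification, which does not fix an embedding method in this passage.

```python
import numpy as np

PAD = "<pad>"  # hypothetical padding token

def pad_tokens(tokens, max_len):
    """Truncate or right-pad a token list to exactly max_len tokens."""
    return (tokens + [PAD] * max_len)[:max_len]

def sentence_matrix(tokens, vocab, embeddings, max_len):
    """Return a (max_len, dim) matrix with one embedding row per token."""
    padded = pad_tokens(tokens, max_len)
    # Words missing from the toy vocabulary fall back to the padding row.
    ids = [vocab.get(tok, vocab[PAD]) for tok in padded]
    return embeddings[ids]

# Toy vocabulary and randomly initialised embedding table (illustrative only).
vocab = {PAD: 0, "i": 1, "like": 2, "robots": 3}
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 4))
embeddings[0] = 0.0  # the padding row carries no signal

tokens = "i like robots".split()
word_count = len(tokens)                       # number of words in the utterance
matrix = sentence_matrix(tokens, vocab, embeddings, max_len=5)
```

Every sentence ends up as a matrix of the same shape, which is what allows a fixed set of convolution filters to be applied to sentences of different lengths.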

The evaluation support unit 133 may then perform a convolution operation on the sentence matrix produced by the word embedding operation, according to the determined parameters of the artificial intelligence algorithm (e.g., filter size). Through the convolution operation, the evaluation support unit 133 may predict school-age status (a PSA or SA decision) for each of the multiple uttered sentences included in the subject's speech data. In this way, the evaluation support unit 133 obtains the result information produced by the artificial intelligence algorithm.
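
As a rough sketch of how a convolution over a sentence matrix can yield a PSA or SA decision, the example below slides filters of different heights over the matrix rows. The ReLU, the max-over-positions pooling, the softmax output layer, and the random weights are assumptions chosen for illustration; the passage above names only the convolution and the filter size.

```python
import numpy as np

def conv_features(matrix, filters):
    """Slide each (height, dim) filter down the rows and max-pool the responses."""
    feats = []
    for f in filters:
        h = f.shape[0]
        responses = [np.sum(matrix[i:i + h] * f)
                     for i in range(matrix.shape[0] - h + 1)]
        feats.append(np.max(np.maximum(responses, 0.0)))  # ReLU, then max over positions
    return np.array(feats)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def predict_sentence(matrix, filters, weights, bias):
    """Return [P(PSA), P(SA)] for one sentence matrix."""
    return softmax(weights @ conv_features(matrix, filters) + bias)

rng = np.random.default_rng(1)
matrix = rng.normal(size=(5, 4))                      # 5 tokens, 4-dim embeddings
filters = [rng.normal(size=(h, 4)) for h in (2, 3)]   # filter sizes 2 and 3
weights = rng.normal(size=(2, len(filters)))          # untrained, illustrative weights
bias = np.zeros(2)

probs = predict_sentence(matrix, filters, weights, bias)
label = ("PSA", "SA")[int(np.argmax(probs))]
```

In a trained model the filters and output weights would come from the learning step rather than from a random generator.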

Furthermore, the evaluation support unit 133 may perform additional evaluation of the subject's language ability based on the result values produced by the artificial intelligence algorithm.

The evaluation support unit 133 may then sum the school-age predictions for the individual sentences to determine, for each subject as a whole, the predicted probability of being school age (SA probability) and the predicted probability of being preschool age (PSA probability).
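
One simple way to read this aggregation is as the share of a subject's sentences assigned to each class. The sketch below assumes that reading; the function name and the normalisation by sentence count are illustrative assumptions, since the text says only that the per-sentence results are summed.

```python
from collections import Counter

def subject_probabilities(sentence_labels):
    """Turn per-sentence 'SA'/'PSA' labels into subject-level SA and PSA shares."""
    total = len(sentence_labels)
    if total == 0:
        raise ValueError("no sentences to aggregate")
    counts = Counter(sentence_labels)
    return {"SA": counts["SA"] / total, "PSA": counts["PSA"] / total}

labels = ["SA", "SA", "PSA", "SA"]   # predictions for four sentences of one subject
result = subject_probabilities(labels)
```

For the four example sentences, three are labeled SA and one PSA, so the subject-level SA share is 0.75 and the PSA share is 0.25.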

The evaluation support unit 133 may determine at least one numerical value among the SA probability and the PSA probability for an individual subject. It may then compare the subject's actual age with the age predicted from at least one of those values. For example, the evaluation support unit 133 may compare the statistical SA and PSA probabilities corresponding to the subject's actual age with the SA and PSA probabilities obtained by analyzing the subject's speech data.

Through this comparison, the evaluation support unit 133 may determine that there is no abnormality when the difference between the statistical probability value corresponding to the subject's actual age group and the predicted age probability value based on the subject's speech data is less than a reference value. Conversely, when the comparison of the subject's predicted age probability with the statistical predicted age probability for the subject's actual age group shows that the subject's SA probability is lower than the statistical SA probability by the reference value or more (or that the subject's PSA probability is higher than the statistical PSA probability by the reference value or more), the evaluation support unit 133 may determine that the subject has a language disorder.

On the other hand, when the same comparison shows that the subject's SA probability is higher than the statistical SA probability by the reference value or more (or that the subject's PSA probability is lower than the statistical PSA probability by the reference value or more), the evaluation support unit 133 may determine that the subject's language ability is excellent. Furthermore, the evaluation support unit 133 may determine a grade for the subject's language ability according to the size of the numerical difference.
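
The three outcomes described above (no abnormality, language disorder, excellent) can be sketched as a comparison against a reference value. The threshold of 0.15 and the example probabilities are hypothetical, as the specification does not give concrete numbers here, and the function name is an assumption.

```python
def assess(subject_sa, reference_sa, threshold=0.15):
    """Compare a subject's SA probability with the statistical SA probability
    for the subject's actual age group. The threshold is a hypothetical value."""
    diff = subject_sa - reference_sa
    if diff <= -threshold:
        return "possible language disorder"   # SA lower by the reference value or more
    if diff >= threshold:
        return "excellent"                    # SA higher by the reference value or more
    return "no abnormality"                   # difference below the reference value

low = assess(subject_sa=0.50, reference_sa=0.80)
typical = assess(subject_sa=0.75, reference_sa=0.80)
high = assess(subject_sa=0.95, reference_sa=0.70)
```

Because the PSA and SA probabilities of a two-class prediction move in opposite directions, a check on the SA difference covers the parenthetical PSA conditions as well under this reading.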

FIG. 7 is a flowchart showing the sequence of the language ability evaluation operation according to an embodiment of the present invention.

The control unit 130 of the server 100 according to an embodiment of the present invention may perform operation S105 of processing a preset data set (arbitrary conversation material for training) into training data for the artificial intelligence algorithm. The control unit 130 may then perform operation S100 of training the artificial intelligence algorithm with the training data. In this process, the control unit 130 may extract and apply the computation parameters of the artificial intelligence algorithm required for the language ability evaluation operation. This training is not limited to a single pass; data from subjects who have already been analyzed may be processed into training data so that additional training can be performed.

The control unit 130 of the server may then perform operation S115 of obtaining, from the user device, voice conversation data consisting of a conversation between an examiner and a subject. The subject device and the examiner device may place a call to each other, and the content of the conversation during the call may be obtained as conversation data. The conversation data obtained at this stage may be in voice form.

The control unit 130 of the server may perform operation S120 of converting the obtained voice conversation data into text conversation data and extracting the subject's speech data from the text conversation data.

The control unit 130 of the server may then perform operation S125 of processing a particular subject's speech data into analysis data to be fed into the trained artificial intelligence algorithm. A noise removal operation may be performed at this stage.

The control unit 130 of the server may perform operation S130 of feeding the analysis data into the artificial intelligence algorithm and running the computation, and operation S135 of evaluating the subject's language ability from the result of that computation. In doing so, the control unit 130 may evaluate the subject's language ability level by comparing the subject's predicted age probability, produced by the artificial intelligence algorithm, with the statistical predicted age probability corresponding to the subject's actual age.
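
The evaluation-side steps S120 to S135 can be pictured as a small pipeline. In the sketch below every stage is passed in as a function, and all of the stage implementations in the demo are stand-in stubs invented for illustration (a fixed transcript, a word-count classifier, and so on); none of them reflects the actual speech recognition, speaker extraction, or trained model.

```python
from typing import Callable, List

def run_evaluation(
    voice_data: bytes,
    to_text: Callable[[bytes], List[dict]],      # S120: speech-to-text, returns turns
    is_subject: Callable[[dict], bool],          # S120: selects the subject's turns
    prepare: Callable[[List[str]], List[str]],   # S125: analysis data, noise removal
    classify: Callable[[str], str],              # S130: per-sentence "SA" or "PSA"
    assess: Callable[[float], str],              # S135: judgment from the SA share
) -> str:
    turns = to_text(voice_data)
    utterances = [t["text"] for t in turns if is_subject(t)]
    sentences = prepare(utterances)
    if not sentences:
        raise ValueError("no usable utterances from the subject")
    labels = [classify(s) for s in sentences]
    return assess(labels.count("SA") / len(labels))

# Stub stages for demonstration only.
demo_turns = [
    {"speaker": "examiner", "text": "what did you do today"},
    {"speaker": "subject", "text": "i went to school and played with my friends"},
    {"speaker": "subject", "text": "um"},
]
verdict = run_evaluation(
    b"",
    to_text=lambda _: demo_turns,
    is_subject=lambda t: t["speaker"] == "subject",
    prepare=lambda us: [u for u in us if len(u.split()) > 1],
    classify=lambda s: "SA" if len(s.split()) >= 5 else "PSA",
    assess=lambda share: "no abnormality" if share >= 0.5 else "needs review",
)
```

Passing the stages as parameters keeps the ordering of the steps visible while leaving each stage free to be replaced by a real implementation.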

In short, the language ability evaluation system according to various embodiments of the present invention may comprise a user device and a server. The user device provides the server with conversation data that includes the speech data of a subject whose language ability is to be measured. The server performs an artificial intelligence algorithm computation using the subject's conversation data received from the user device and evaluates the subject's language ability level according to the result, and may determine whether there is an abnormality by comparing the subject's predicted age information, computed by the artificial intelligence algorithm, with the previously registered actual age of the subject.

The user device may comprise a subject device and an examiner device. The server may comprise a communication unit that supports matching and call connection between an arbitrary subject device and examiner device and receives voice conversation data in which the conversation held during the call is recorded, and a control unit that causes the conversation between the subject and the examiner during the call to be recorded so as to generate voice conversation data and requests at least one of the subject device and the examiner device to transmit the generated voice conversation data to the server.

The control unit may comprise: a speech data extraction unit that converts the voice conversation data into text form to generate text conversation data and extracts the subject's speech data from the text conversation data based on the frequency characteristics of the subject's voice; a learning support unit that generates training data for the artificial intelligence algorithm, including at least one of the subject's speech data and a preset data set, and trains the artificial intelligence algorithm with the training data; and an evaluation support unit that generates analysis data from the subject's speech data, feeds the generated analysis data into the artificial intelligence algorithm, and runs the computation to determine whether the subject's language ability is abnormal.

When generating the training data, the learning support unit labels each uttered sentence with expected age information so that the analysis result for a subject's speech data is produced as a probability of belonging to the SA (school age) group and a probability of belonging to the PSA (preschool age) group. It then separates the SA-group training data from the PSA-group training data and performs noise removal, setting the noise removal criterion for the PSA-group training data to a lower value than that for the SA-group training data so that a smaller proportion is removed as noise. The criterion is set differently because, if the SA-group criterion were applied unchanged to the PSA-group training data, most of the learnable data would likely be treated as noise.
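
The effect of a group-specific criterion can be sketched as follows. This passage does not say what the noise criterion measures, so the sketch assumes, purely for illustration, that it is a minimum word count per sentence; the thresholds and sample sentences are likewise hypothetical.

```python
# Hypothetical criterion: minimum words per sentence, lower for the PSA group.
MIN_WORDS = {"SA": 4, "PSA": 2}

def remove_noise(samples, min_words=MIN_WORDS):
    """Keep (sentence, label) pairs whose word count meets the label's threshold."""
    return [(s, lab) for s, lab in samples if len(s.split()) >= min_words[lab]]

samples = [
    ("mommy look", "PSA"),
    ("want juice now", "PSA"),
    ("no", "PSA"),
    ("i finished my homework before dinner", "SA"),
    ("yes okay", "SA"),
]

cleaned = remove_noise(samples)
# Applying the SA threshold to both groups for comparison.
uniform = remove_noise(samples, {"SA": 4, "PSA": 4})
```

With the lower PSA threshold, two of the three short preschool utterances survive; with the SA threshold applied to both groups, none of them does, which mirrors the reason given above for setting the criterion differently.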

The evaluation support unit determines whether the subject's language ability is abnormal from the result of analyzing the analysis data with the artificial intelligence algorithm. It computes a predicted age probability for the subject's speech data through that analysis, compares the subject's predicted age probability with the statistical predicted age probability corresponding to the subject's actual age group, and may determine that the subject has a language disorder when the subject's SA probability is lower than the statistical SA probability by the reference value or more.

Meanwhile, when the comparison of the subject's predicted age probability with the statistical predicted age probability for the subject's actual age group shows that the subject's SA probability is higher than the statistical SA probability by the reference value or more, the evaluation support unit may evaluate the degree of excellence of the subject's language ability (e.g., by computing an excellence grade) according to the numerical difference.

Although the present invention has been described in detail with reference to the examples above, those skilled in the art can adapt, change, and modify these examples without departing from the scope of the present invention. In short, achieving the intended effect of the present invention does not require separately including every functional block shown in the drawings or following every sequence shown in the drawings exactly as illustrated; embodiments that do not do so may still fall within the technical scope of the present invention as set forth in the claims.

100: server
110: communication unit
120: storage unit
130: control unit
131: speech data extraction unit
132: learning support unit
133: evaluation support unit

Claims

An artificial intelligence-based language ability evaluation system comprising a user device and a server, wherein
the user device
provides the server with conversation data including speech data of a subject whose language ability is to be measured, and
the server
performs an artificial intelligence algorithm computation using the subject's conversation data received from the user device, evaluates the subject's language ability level according to the result of the computation, and determines whether there is an abnormality by comparing predicted age information of the subject, computed through the artificial intelligence algorithm, with the previously registered actual age of the subject.

The system of claim 1, wherein
the user device
comprises a subject device and an examiner device, and
the server comprises:
a communication unit that supports matching and call connection between an arbitrary subject device and examiner device, and receives voice conversation data in which the conversation held during a call is recorded; and
a control unit that causes the conversation between the subject and the examiner during the call to be recorded so as to generate voice conversation data, and requests at least one of the subject device and the examiner device to transmit the generated voice conversation data to the server.

The system of claim 2, wherein
the control unit comprises:
a speech data extraction unit that converts the voice conversation data into text form to generate text conversation data, and extracts the subject's speech data from the text conversation data based on the frequency characteristics of the subject's voice;
a learning support unit that generates training data for the artificial intelligence algorithm, including at least one of the subject's speech data and a preset data set, and trains the artificial intelligence algorithm with the training data; and
an evaluation support unit that generates analysis data for the artificial intelligence algorithm from the subject's speech data, feeds the generated analysis data into the artificial intelligence algorithm, and runs the computation to determine whether the subject's language ability is abnormal.

The system of claim 3, wherein
the learning support unit,
when generating the training data, labels each uttered sentence with expected age information so that the analysis result for the subject's speech data is produced as a probability of belonging to the SA (school age) group and a probability of belonging to the PSA (preschool age) group, and
separates the SA-group training data from the PSA-group training data and performs noise removal, setting the noise removal criterion for the PSA-group training data to a lower value than that for the SA-group training data so as to reduce the proportion removed as noise.

The system of claim 3, wherein
the evaluation support unit
determines whether the subject's language ability is abnormal from the result of analyzing the analysis data with the artificial intelligence algorithm,
computes a predicted age probability for the subject's speech data through the analysis of the artificial intelligence algorithm, compares the subject's predicted age probability with the statistical predicted age probability corresponding to the subject's actual age group, and determines that the subject has a language disorder when the subject's SA probability is lower than the statistical SA probability by a reference value or more, and
when the comparison shows that the subject's SA probability is higher than the statistical SA probability by the reference value or more, determines that the subject's language ability is excellent and determines the degree of excellence of the subject's language ability according to the numerical difference.

An artificial intelligence-based language ability evaluation system comprising a user device and a server, wherein
the user device
provides the server with conversation data including speech data of a subject whose language ability is to be measured,
the server
performs an artificial intelligence algorithm computation using the subject's conversation data received from the user device, evaluates the subject's language ability level according to the result of the computation, and determines whether there is an abnormality by comparing predicted age information of the subject, computed through the artificial intelligence algorithm, with the previously registered actual age of the subject, and
the server comprises:
a communication unit that supports matching and call connection between an arbitrary subject device and examiner device, and receives voice conversation data in which the conversation held during a call is recorded; and
a control unit that causes the conversation between the subject and the examiner during the call to be recorded so as to generate voice conversation data, and requests at least one of the subject device and the examiner device to transmit the generated voice conversation data to the server.

An artificial intelligence-based language ability evaluation system comprising a user device and a server, wherein
the user device
provides the server with conversation data including speech data of a subject whose language ability is to be measured,
the server
performs an artificial intelligence algorithm computation using the subject's conversation data received from the user device, evaluates the subject's language ability level according to the result of the computation, and determines whether there is an abnormality by comparing predicted age information of the subject, computed through the artificial intelligence algorithm, with the previously registered actual age of the subject, and
the server comprises:
a speech data extraction unit that converts the voice conversation data into text form to generate text conversation data, and extracts the subject's speech data from the text conversation data based on the frequency characteristics of the subject's voice;
a learning support unit that generates training data for the artificial intelligence algorithm, including at least one of the subject's speech data and a preset data set, and trains the artificial intelligence algorithm with the training data; and
an evaluation support unit that generates analysis data for the artificial intelligence algorithm from the subject's speech data, feeds the generated analysis data into the artificial intelligence algorithm, and runs the computation to determine whether the subject's language ability is abnormal.

A language ability evaluation method performed by a language ability evaluation system comprising a user device and a server, the method comprising:
processing, by the server, a preset data set into training data for training an artificial intelligence algorithm;
training, by the server, the artificial intelligence algorithm using the training data;
obtaining, by the server, voice conversation data consisting of a conversation between an examiner and a subject from the user device;
converting, by the server, the obtained voice conversation data into text conversation data, and extracting the subject's speech data from the text conversation data;
processing, by the server, the speech data of a particular subject into analysis data to be fed into the trained artificial intelligence algorithm; and
evaluating, by the server, the subject's language ability from the result of feeding the analysis data into the artificial intelligence algorithm and running the computation.