KR100904049B1

KR100904049B1 - System and Method for Classifying Named Entities from Speech Recognition

Info

Publication number: KR100904049B1
Application number: KR1020070068250A
Authority: KR
Inventors: 김성기; 조주형
Original assignee: 주식회사 예스피치
Priority date: 2007-07-06
Filing date: 2007-07-06
Publication date: 2009-06-23
Also published as: KR20090004216A

Abstract

The present invention relates to a statistical semantic classification system and method for speech recognition. The system comprises: a speech recognition unit that extracts an input speech signal into a predefined text corpus and divides it into eojeol (space-delimited word phrase) units; a morpheme analyzer that segments the eojeol-level text corpus into morpheme units; a mapping unit that uses a concept dictionary, in which words sharing the same semantic concept are grouped, to group each morpheme segmented by the morpheme analyzer into a representative word; and a semantic classification unit that substitutes the representative words into the vector generated by a statistics-based semantic classifier and automatically assigns them to semantic categories.

The statistical semantic classification system and method for speech recognition according to the present invention apply a morpheme analyzer, which segments Korean text into morphemes on an eojeol-by-eojeol basis, together with a stopword dictionary and a concept dictionary, to the search process. This stabilizes the dictionary size and improves Korean speech recognition and semantic classification performance.

Speech, morpheme, vector extraction unit, concept dictionary, stopword dictionary

Description

System and Method for Classifying Named Entities from Speech Recognition

FIG. 1 is a block diagram of a conventional speech recognition system.

FIG. 2 is a schematic diagram of a speech recognition and semantic classification system according to a preferred embodiment of the present invention.

FIG. 3 is a block diagram of a semantic classification system according to a preferred embodiment of the present invention.

FIG. 4 is a block diagram of the semantic classifier of the semantic classification system according to a preferred embodiment of the present invention.

FIG. 5 is a diagram showing the structure of a neural-network-based integrated speech recognizer according to the present invention.

FIG. 6 is a diagram showing an example in which the neural-network-based integrated speech recognizer of FIG. 5 is applied.

FIG. 7 is a flowchart of the statistical semantic classification system for speech recognition according to the present invention.

FIG. 8 is a flowchart of the semantic classifier of the statistical semantic classification system for speech recognition according to the present invention.

<Explanation of reference numerals for the main parts of the drawings>

10: speech recognition unit
15: training data unit
20: morpheme analyzer
22: morpheme pronunciation dictionary DB
24: management module
26: classification module
28: recognition module
30: mapping unit
32: concept dictionary DB
34: grouping module
40: stopword removal unit
42: stopword dictionary DB
44: stopword removal module
50: feature selection unit
60: vector extraction unit
70: vector learning unit
80: semantic classification unit

The present invention relates to a statistical semantic classification system and method for speech recognition and, more particularly, to a system and method that automatically classify the semantic category of a user's utterance on a morpheme basis, thereby improving Korean speech recognition and semantic classification performance.

In general, speech recognition technology enables an interface between a person and a computer through the human voice. A computer analyzes a person's voice, which has particular frequencies depending on pronunciation, converts it into an electrical signal, extracts the frequency characteristics of the speech signal to understand the pronunciation, and performs a task according to the understood speech. This technology has been commercialized and is applied in fields such as telephone dialing, toy control, language learning, and home appliance control.

In conventional speech recognition technology, as shown in FIG. 1, when a speaker speaks into a terminal (100), the uttered speech signal is delivered to the speech recognition system (100), which extracts information from it and performs computation on it. The speech signal uttered by the speaker is finally converted into text (300).

The modules constituting the conventional speech recognition system (100) perform training and computation in broadly five parts. As shown in FIG. 1, these are a feature extraction unit (110), an acoustic model unit (212), a pronunciation model unit (222), a language model unit (232), and a post-processing unit (240).

The feature extraction unit (110) extracts useful features from the speech signal, such as perceptually meaningful feature representations that reflect human auditory characteristics and features that are robust to variations in noise environment, speaker, and channel.

The acoustic model unit (212) represents how speech signals can be expressed, based on a speech database (210). The acoustic models most widely used in recent speech recognizers are based on the hidden Markov model (HMM). The basic unit of an acoustic model is a phoneme or a phoneme-like unit. Each model represents one acoustic model unit and usually consists of three states. In most cases only left-to-right transitions between states are allowed. The observation probability of the speech feature vector in each state is expressed as a discrete probability distribution or a continuous probability density function (pdf).
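The left-to-right, three-state topology described above can be illustrated with a minimal sketch. The function name and the self-loop probability of 0.6 are assumptions chosen for illustration; the patent gives no transition values, and observation probabilities are not modeled here.

```python
def left_to_right_transitions(num_states=3, stay_prob=0.6):
    """Build a left-to-right HMM transition matrix.

    Each state may only stay where it is or advance to the next state.
    stay_prob is an illustrative value, not taken from the patent.
    """
    matrix = []
    for i in range(num_states):
        row = [0.0] * num_states
        if i == num_states - 1:
            # Final state keeps only its self-loop in this sketch.
            row[i] = 1.0
        else:
            row[i] = stay_prob            # remain in the same state
            row[i + 1] = 1.0 - stay_prob  # advance one state to the right
        matrix.append(row)
    return matrix


A = left_to_right_transitions()
for row in A:
    print(row)
```

Every entry below the diagonal is zero, which is what "left-to-right only" means in matrix form.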

The pronunciation model unit (222) converts written phonemes into pronounced phonemes, because the phonemes actually trained are pronounced phonemes rather than written ones. This model is usually implemented by building a pronunciation dictionary DB (database) (220), either by setting simple rules based on standard pronunciation rules or by defining entries that take into account a particular environment, speaker characteristics, and even dialects.

The language model unit (232) can be regarded as the grammar of the speech recognizer. It extracts grammar from a text corpus DB (230) so that, during training and search, grammatical sentences are selected in preference to arbitrary ones. Because the language model unit (232) reduces the recognizer's search space and raises the probability of grammatical sentences, it also contributes to a higher recognition rate.

The post-processing unit (240), which is omitted in some systems, usually selects candidate texts with high recognition scores from the recognizer and then learns and applies further processed linguistic information or error patterns to find the most appropriate text (300).

Meanwhile, for a computer system to understand the text (300) selected by the post-processing unit (240), it must identify the concepts denoted by the words in the utterance and the relationships among those concepts that the sentence is meant to express. Existing semantic classification methods for this purpose include rule-based methods, which design and apply a semantic grammar, and automatic semantic categorization of utterances based on probabilistic statistics.

A semantic grammar combines the syntactic and semantic processing of an utterance and can be useful in restricted domains, but its rules are difficult to design and generalize, and the grammar is hard to extend.

By contrast, statistics-based automatic semantic categorization of utterances can reduce the cost of rule development, because the system learns the regularities of the language automatically as long as training data is available. A statistics-based semantic classifier is a system built by collecting utterances, having people sort them into predefined semantic categories, and statistically training a machine learning algorithm on the resulting training data. In the conventional, commercially deployed semantic classifier model, a statistical spoken-language recognizer recognizes the user's utterance, the recognition result is passed on as text, and a statistical semantic classifier assigns it to a semantic category class, so that user utterances are automatically assigned to semantic categories.

However, because most conventional semantic classifier models were built for English, they are suited to selecting recognition words that are both the recognition unit and the spacing unit of English, but not to handling keywords obtained through morphological analysis or to handling synonyms.

In particular, Korean is an agglutinative language in which an eojeol (a space-delimited word phrase) consists of a content morpheme and morphemes such as particles and endings. Methods have therefore been proposed that treat a sentence as a sequence of morphemes rather than a sequence of words and keep the number of such morphemes largely fixed, thereby stabilizing the dictionary size and raising the speech recognition rate.

To solve the problems of the prior art described above, an object of the present invention is to provide a statistical semantic classification system and method for speech recognition that apply a morpheme analyzer, which segments Korean text into morphemes on an eojeol-by-eojeol basis, together with a stopword dictionary and a concept dictionary, to the search process, thereby stabilizing the dictionary size and improving Korean speech recognition and semantic classification performance.

To achieve the above object, the present invention provides a statistical semantic classification system for speech recognition, comprising: a speech recognition unit that extracts an input speech signal into a predefined text corpus and divides it into eojeol units; a morpheme analyzer that segments the eojeol-level text corpus into morpheme units; a mapping unit that uses a concept dictionary, in which words sharing the same semantic concept are grouped, to group each morpheme segmented by the morpheme analyzer into a representative word; and a semantic classification unit that substitutes the representative words into the vector generated by a statistics-based semantic classifier and automatically assigns them to semantic categories.

The present invention also provides a statistical semantic classification system for speech recognition in which the semantic classifier comprises: a training data unit in which text corpora generated by the speech recognition unit are collected; a morpheme analyzer that segments the text corpora collected in the training data unit into morpheme units; a mapping unit that uses a concept dictionary, in which words sharing the same semantic concept are grouped, to group each morpheme segmented by the morpheme analyzer into a representative word; a feature selection unit that selects, from among the representative words, features useful for distinguishing categories; and a vector extraction unit that extracts the features selected by the feature selection unit as a specific vector by means of a machine learning algorithm.

The present invention also provides a statistical semantic classification system for speech recognition in which the morpheme analyzer comprises: a morpheme pronunciation dictionary DB in which morphemes are classified into basic units; a management module that manages the morpheme pronunciation dictionary DB; a classification module that uses the management module to classify the text corpus generated by the speech recognition unit into individual morphemes; and a recognition module that recognizes the morphemes classified by the classification module.

The present invention also provides a statistical semantic classification system for speech recognition in which the mapping unit comprises: a concept dictionary DB in which words sharing the same semantic concept are grouped; and a grouping module that looks up each morpheme segmented by the morpheme analyzer in the concept dictionary DB and groups it into a representative word.

The present invention also provides a statistical semantic classification system for speech recognition that further comprises a stopword removal unit for removing, from among the representative words grouped by the mapping unit, stopwords that carry no particular information.

The present invention also provides a statistical semantic classification system for speech recognition in which the stopword removal unit comprises: a stopword dictionary DB composed of stopwords such as phrases, words, and onomatopoeia that carry no particular information; and a stopword removal module that uses the stopword dictionary DB to remove such stopwords from among the representative words.

The present invention also provides a statistical semantic classification system for speech recognition in which the feature selection unit comprises: a feature selection module that extracts the mutual information and the chi-square statistic for the representative words; and a word extraction module that passes data whose semantic classification feature value is low in the feature selection module to the stopword removal unit so that it is removed.

The present invention also provides a statistical semantic classification system for speech recognition in which the vector extraction unit has a neural network structure including an input layer, a hidden layer, and an output layer, and is configured to adjust the weights between layers through the machine learning algorithm based on that neural network structure, generate and fuse target output values for input patterns, and thereby extract a speech feature vector for the speech signal.
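As a rough illustration of the input, hidden, and output layer structure named above, the following sketch propagates one input pattern through a single hidden layer. The sigmoid activation, the layer sizes, and the weight values are all assumptions made for the example; the patent does not specify them, and the weight adjustment (training) step is not shown.

```python
import math


def sigmoid(x):
    """Logistic activation (an assumed choice; the patent names none)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, w_output):
    """Propagate an input pattern through one hidden layer to the output layer.

    w_hidden[j][i] is the weight from input i to hidden node j;
    w_output[k][j] is the weight from hidden node j to output node k.
    """
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_output]
    return hidden, output


# Illustrative weights only: 3 inputs, 2 hidden nodes, 1 output node.
hidden, output = forward(
    [1.0, 0.0, 1.0],
    [[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]],
    [[0.7, -0.6]],
)
print(hidden, output)
```

Training would repeatedly compare `output` with a target value and adjust `w_hidden` and `w_output`, which is the weight adjustment the paragraph above refers to.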

The present invention also provides a statistical semantic classification system for speech recognition that further comprises a vector learning unit, which applies the machine learning algorithm to the vector extracted by the vector extraction unit so that the vector is learned automatically.

The present invention also provides a statistical semantic classification system for speech recognition in which the machine learning algorithm is an artificial neural networks (ANNs) algorithm.

The present invention also provides a statistical semantic classification method for speech recognition that uses the statistical semantic classification system for speech recognition, comprising: a speech recognition step of extracting, from an input speech signal, the text corpus used for speech recognition and dividing it into eojeol units; a morpheme analysis step of segmenting the eojeol-level text corpus into morpheme units; a mapping step of using a concept dictionary, in which words sharing the same semantic concept are grouped, to group each morpheme segmented in the morpheme analysis step into a representative word; and a semantic classification step of substituting the representative words into the vector generated by the semantic classifier and automatically assigning them to semantic categories.

The present invention also provides a statistical semantic classification method for speech recognition in which the semantic classifier involves: a training data collection step in which text corpora generated by the speech recognition unit are collected; a morpheme analysis step of segmenting the eojeol-level text corpus into morpheme units; a mapping step of using a concept dictionary, in which words sharing the same semantic concept are grouped, to group each morpheme segmented in the morpheme analysis step into a representative word; a stopword removal step of removing, from among the representative words grouped in the mapping step, stopwords that carry no particular information; a feature selection step of selecting, from the representative words remaining after the stopword removal step, features useful for distinguishing categories; and a vector extraction step of extracting the features as a speech feature vector by means of the machine learning algorithm.

The present invention also provides a statistical semantic classification method for speech recognition that further comprises a stopword removal step of removing, from among the representative words grouped in the mapping step, stopwords that carry no particular information.

The present invention also provides a statistical semantic classification method for speech recognition that further comprises a vector learning step of applying the machine learning algorithm to the vector extracted in the vector extraction step so that the vector is learned automatically.

Hereinafter, a statistical semantic classification system and method for speech recognition according to a preferred embodiment of the present invention will be described in more detail with reference to the accompanying drawings.

FIG. 2 is a schematic diagram of a speech recognition and semantic classification system according to a preferred embodiment of the present invention.

Referring to FIG. 2, when a speaker speaks into a terminal (100), the uttered speech signal is delivered to the speech recognition system (100), which extracts information from it and performs computation on it. As in the conventional system (FIG. 1), the speech recognition system (100) comprises a feature extraction unit (110), an acoustic model unit (212), a pronunciation model unit (222), a language model unit (232), and a post-processing unit (240). Because this configuration is identical to that of the conventional speech recognition system (100), a detailed description is omitted.

The text (300) selected by the post-processing unit (240) is then automatically assigned to semantic categories by the semantic classification system (5), so that a computer system can understand the meaning of the text (300).

FIG. 3 is a block diagram of a semantic classification system according to a preferred embodiment of the present invention.

Referring to FIG. 3, the semantic classification system (5) of the present invention is a system for automatic semantic categorization of utterances based on probabilistic statistics, and comprises a speech recognition unit (10), a morpheme analyzer (20), a mapping unit (30), a stopword removal unit (40), and a semantic classification unit (80).

The speech recognition unit (10) extracts the text (300) selected by the speech recognition system of FIG. 2 into a predefined text corpus and divides it into eojeol units; that is, it splits the text (300) into eojeols and recognizes it as a specific text corpus.

The morpheme analyzer (20) analyzes each morpheme, the smallest meaningful unit, within an eojeol, in which a bundle of morphemes appears in surface form, in order to determine the underlying structure of the words used in the actual sentence. It segments the recognized text corpus from eojeol units into morpheme units. For the segmented morphemes, words sharing the same concept are then grouped by the mapping unit (30) described below, and stopwords such as unnecessary affixes are removed by the stopword removal unit (40).

Through this eojeol separation by morphological analysis, the words expressing the content of the utterance are extracted as morphemes rather than eojeols, so that only the roots essential to content analysis can be selected. As a result, words whose features reflect the content well can be chosen.
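To make the eojeol-to-morpheme step concrete, here is a deliberately simplified sketch that splits each space-delimited eojeol into a stem and a trailing particle. The particle list and function names are hypothetical, and this suffix-stripping heuristic stands in for a real dictionary-driven morpheme analyzer; it will mis-split words that merely end in a particle-like syllable.

```python
# Illustrative subset of Korean particles; a real analyzer uses a full dictionary.
PARTICLES = ["에서", "으로", "을", "를", "이", "가", "은", "는", "에"]


def split_eojeol(eojeol, particles=PARTICLES):
    """Split one eojeol into [stem, particle] if it ends with a known particle."""
    # Try longer particles first so that "에서" wins over "에".
    for particle in sorted(particles, key=len, reverse=True):
        if eojeol.endswith(particle) and len(eojeol) > len(particle):
            return [eojeol[: -len(particle)], particle]
    return [eojeol]


def segment(text):
    """Segment whitespace-delimited eojeols into a flat list of morphemes."""
    morphemes = []
    for eojeol in text.split():
        morphemes.extend(split_eojeol(eojeol))
    return morphemes


print(segment("요금을 조회"))  # "요금을" (fee + object particle), "조회" (lookup)
```

Once the particle is separated, the stem can be passed on alone, which is how only the root needed for content analysis is retained.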

The morpheme analyzer (20) comprises a morpheme pronunciation dictionary DB (22) in which morphemes are classified into basic units, a management module (24) that manages the morpheme pronunciation dictionary DB (22), a classification module (26) that uses the morpheme pronunciation dictionary DB (22) to classify the text corpus into individual morphemes, and a recognition module (28) that recognizes the morphemes classified by the classification module (26). In operation, the classification module (26) first works together with the morpheme pronunciation dictionary DB (22), in which morphemes are organized into groups of basic phoneme units, to classify and model the text corpus as individual morphemes. The recognition module (28) then recognizes the text corpus classified by the classification module (26) in morpheme units.

The mapping unit (30) is configured to group each morpheme segmented by the morpheme analyzer (20) into a representative word by means of a grouping module (34) that uses a concept dictionary DB (32), in which words sharing the same semantic concept are grouped. Through the mapping unit (30), the text corpus is remapped to a representative entity name with the same semantic concept, or to a newly defined entity name that represents the meaning, so that several text corpus entries with the same meaning are filtered into one. Here, the concept dictionary DB (32) is a database storing a dictionary that groups synonyms sharing the same semantic concept, excluding similar-sounding words. It serves as a tool for addressing the sparse data problem that frequently arises in existing statistical models, compensating for a weakness of those models.
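The grouping into representative words can be sketched as a plain dictionary lookup. The dictionary entries below are hypothetical examples, not taken from the patent, and passing unknown morphemes through unchanged is an assumption about how misses would be handled.

```python
# Hypothetical concept dictionary: each variant maps to one representative word.
CONCEPT_DICT = {
    "요금": "요금",    # fee
    "사용료": "요금",  # usage charge -> fee
    "이용료": "요금",  # usage fee -> fee
    "해지": "해지",    # termination
    "취소": "해지",    # cancellation -> termination
}


def map_to_representatives(morphemes, concept_dict=CONCEPT_DICT):
    """Replace each morpheme with its representative word.

    Morphemes missing from the dictionary pass through unchanged
    (an assumption made for this sketch).
    """
    return [concept_dict.get(m, m) for m in morphemes]


mapped = map_to_representatives(["이용료", "취소", "방법"])
print(mapped)
```

Collapsing synonyms this way shrinks the vocabulary the classifier must learn, which is how the concept dictionary eases the sparse data problem.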

The stopword removal unit (40) consists of a stopword dictionary DB (42), composed of stopwords such as phrases, words, and onomatopoeia that carry no particular information, and a stopword removal module (44), which uses the stopword dictionary DB (42) to remove such stopwords from among the representative words grouped by the mapping unit (30). It thereby yields utterance data from which uninformative stopwords, that is, meaningless phrases, words, onomatopoeia, and the like, have been removed. Because unnecessary words are excluded from the utterance data that has passed through the concept dictionary step, the cost of analyzing needless data during keyword feature value extraction is reduced. In some cases the stopword removal unit (40) may be omitted.
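A minimal sketch of dictionary-based stopword removal follows. The stopword entries are hypothetical filler words chosen for illustration; the patent does not list the contents of its stopword dictionary.

```python
# Hypothetical stopword dictionary: fillers that carry no category information.
STOPWORDS = {"음", "어", "그", "저기"}


def remove_stopwords(words, stopwords=STOPWORDS):
    """Drop every word that appears in the stopword dictionary, keeping order."""
    return [w for w in words if w not in stopwords]


cleaned = remove_stopwords(["음", "요금", "저기", "조회"])
print(cleaned)
```

Using a set for the dictionary keeps each membership check constant-time, so the filter stays cheap even for a large stopword list.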

The semantic classification unit (80) substitutes the representative words into the vector generated by the statistics-based semantic classifier (82), so that the representative words are automatically assigned to semantic categories. The semantic classifier (82) is described in detail below with reference to FIG. 4.

In this way, a speech signal uttered through the terminal (100) is converted into selected text by the speech recognition system (100), and the selected text is recognized as a text corpus by the speech recognition unit (10). It then passes through the morpheme analyzer (20), the mapping unit (30), the stopword removal unit (40), and the semantic classification unit (80), and is automatically assigned to semantic categories so that a computer system can understand the meaning of the text.

Referring to FIG. 4, the semantic classifier (82) of the semantic classification system (5) according to a preferred embodiment of the present invention comprises a training data unit (15), a morpheme analyzer (20), a mapping unit (30), a stopword removal unit (40), a feature selection unit (50), a vector extraction unit (60), and a vector learning unit (70).

The training data unit (15) is configured to collect a large number of text corpora generated by the speech recognition unit (10) of FIG. 3.

형태소분석기(20)는 여러 형태소들의 묶음이 표층 형태로 나타나는 하나의 어절로부터 의미를 갖는 최소 단위인 각 형태소를 분석하여, 실제의 문장에 사용되는 단어의 원래의 구조를 파악하는 것으로서, 학습데이터부(15)에서 수집된 텍스트 코퍼스를 형태소분석기를 사용하여 어절 단위에서 형태소 단위로 분절하며, 이렇게 분절된 형태소는 하기 맵핑부(30)를 통하여 동일한 개념을 가지는 단어를 그룹화시키고, 불용어제거부(40)를 통하여 불필요한 접사 등의 불용어를 제거시키게 된다. The morpheme analyzer 20 analyzes each morpheme, which is a minimum unit having a meaning, from a single word in which a bundle of various morphemes is represented in a surface form, and grasps the original structure of a word used in an actual sentence. The text corpus collected in (15) is segmented from word units to morpheme units using a morpheme analyzer, and the segmented morphemes are grouped with words having the same concept through the mapping unit 30, and the stopword unit 40 ) To eliminate unnecessary words such as unnecessary macros.

그리고 상기 형태소분석기(20), 맵핑부(30), 및 불용어제거부(40)는 상기 도2와 동일하므로, 상세한 설명은 생략하기로 한다.In addition, since the morpheme analyzer 20, the mapping unit 30, and the stop word removing unit 40 are the same as those of FIG. 2, detailed description thereof will be omitted.

The feature selection unit (50) serves to extract keywords and consists of a feature selection module and a word extraction module for improving the semantic categorization performance for utterances.

The feature selection module selects, from the representative words produced by the mapping unit (30) or the stopword removal unit (40), words likely to be useful for distinguishing categories. To do so, it computes the mutual information and the chi-square statistic of each representative word appearing in the training data, and only words with high values are selected for use in the semantic classifier.

The word extraction module passes data whose semantic classification feature value, as computed by the feature selection module, is low to the stopword removal unit (40) so that it is removed.

Table 1, Equation 1, and Equation 2 below illustrate the feature selection module of the feature selection unit (50). In Table 1, A is the number of utterances belonging to the semantic category that contain the feature candidate word, and B is the number of utterances not belonging to the category that contain the feature candidate word. C is the number of utterances belonging to the category that do not contain the feature candidate word, and D is the number of utterances not belonging to the category that do not contain the feature candidate word.

[Table 1]

                              Semantic category    NOT semantic category
Feature candidate word        A                    B
NOT feature candidate word    C                    D

The chi-square statistic of Equation 1 and the mutual information of Equation 2 are computed from the counts A, B, C, and D of Table 1 and serve as the feature statistics used to select a list of words with good feature quality.

[Equation 1]

Here, N is the total number of utterances.

[Equation 2]
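The bodies of Equations 1 and 2 are not reproduced in this text. As a hedged sketch, the code below assumes the standard contingency-table forms commonly used for text feature selection, which match the definitions of A, B, C, D, and N given above: chi-square = N(AD - CB)^2 / ((A+C)(B+D)(A+B)(C+D)) and mutual information = log(A*N / ((A+C)(A+B))). Whether the patent uses exactly these forms is an assumption, and the function names and sample counts are hypothetical.

```python
import math


def chi_square(a, b, c, d):
    """Chi-square statistic from the Table 1 counts (assumed standard form)."""
    n = a + b + c + d  # total number of utterances
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the word or category never varies
    return n * (a * d - c * b) ** 2 / denom


def mutual_information(a, b, c, d):
    """Mutual information estimate from the Table 1 counts (assumed standard form)."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # the word never co-occurs with the category
    return math.log(a * n / ((a + c) * (a + b)))


def select_features(counts, k):
    """Keep the k candidate words with the largest chi-square value."""
    ranked = sorted(counts, key=lambda w: chi_square(*counts[w]), reverse=True)
    return ranked[:k]


# Hypothetical (A, B, C, D) counts for two candidate words in one category.
counts = {"BALANCE": (40, 5, 10, 45), "the": (25, 25, 25, 25)}
selected = select_features(counts, 1)
```

A word that is spread evenly across categories, like the second entry, scores zero on both statistics and is the kind of low-value candidate the word extraction module would hand to the stopword removal unit (40).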

The vector extraction unit (60) extracts the features selected by the feature selection unit (50) as a specific vector, using a vector space model to express the characteristics of an utterance with the words selected above. The vector space model is built with a machine learning algorithm based on neural network theory; with this algorithm, training utterance data can be represented as vectors using only the list of high-information words selected by the feature selection unit (50). The machine learning algorithm is an ANNs (Artificial Neural Networks) algorithm.
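One simple way to picture the vector space representation is a fixed-length vector with one position per selected feature word. The sketch below assumes a binary presence encoding; the patent does not state the weighting scheme, and `utterance_to_vector` and the sample feature list are hypothetical.

```python
def utterance_to_vector(representative_words, feature_list):
    """Encode an utterance as a 0/1 vector over the selected feature words.

    Assumption: binary presence. The source does not specify the weighting.
    """
    present = set(representative_words)
    return [1.0 if feature in present else 0.0 for feature in feature_list]


# Hypothetical feature list as it might come from the feature selection unit (50).
features = ["BALANCE", "TRANSFER", "INQUIRE"]
vector = utterance_to_vector(["BALANCE", "INQUIRE", "today"], features)
```

Words outside the feature list, such as the last word in the example, simply do not contribute to the vector, which is how restricting the representation to high-information words keeps its dimension fixed.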

The vector learning unit (70) is configured to apply the machine learning algorithm to the vector extracted by the vector extraction unit (60) so that the vector is learned automatically; it is described below with reference to FIG. 5.

The vector learning unit (70), trained in this way by the neural-network-based integrated speech recognizer, is used in a speech recognition system such as that of FIG. 5. After the morpheme analysis step and the concept dictionary step, it statistically classifies the meaning of an utterance, which makes it possible to identify the user's intent and provide a service suited to the user's purpose.

FIG. 5 shows the structure of the neural-network-based integrated speech recognizer according to the present invention. Referring to FIG. 5, the vector extracted by the vector extraction unit (60) of FIG. 4 is learned with an ANNs (Artificial Neural Networks) algorithm, a machine learning algorithm based on neural network theory. The ANNs algorithm consists of an input layer for receiving data; a hidden layer that performs learning on the signal from the input layer and the output signal of the previous layer; and an output layer that performs learning on the signal from the input layer and the output signal of the hidden layer and outputs the final result. Through this machine learning algorithm, the neural network weights between layers are adjusted to generate and fuse target output values for input patterns, thereby extracting a speech feature vector for the speech signal.

FIG. 6 shows an example in which the neural-network-based integrated speech recognizer of FIG. 5 is applied. X1 to Xn are the input values of the neural network, that is, the input layer; h1 to hk are the hidden layer; and y1 to ys are the outputs for the input values, that is, the output layer. The input is the semantic vector of an utterance composed of features, and each output Ys is the statistic for the utterance belonging to category Ys. Learning is performed by taking the category with the largest output value as the semantic category of the utterance.
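The decision rule described for FIG. 6, taking the category with the largest output, can be illustrated with a small forward pass. This is a minimal sketch under several assumptions: a plain feed-forward network with sigmoid units, hand-picked rather than trained weights, and no direct input-to-output connection. All names (`layer`, `classify`, the weight matrices) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(weights, biases, inputs):
    """Compute one fully connected layer with sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def classify(x, w_hidden, b_hidden, w_out, b_out):
    """Return (index of the largest output, all outputs) for input vector x."""
    h = layer(w_hidden, b_hidden, x)   # h1..hk
    y = layer(w_out, b_out, h)         # y1..ys
    best = max(range(len(y)), key=lambda i: y[i])
    return best, y


# Hand-picked illustrative weights (n = 2 inputs, k = 2 hidden, s = 2 outputs).
W_HIDDEN = [[4.0, -4.0], [-4.0, 4.0]]
B_HIDDEN = [0.0, 0.0]
W_OUT = [[4.0, -4.0], [-4.0, 4.0]]
B_OUT = [0.0, 0.0]

category_a, outputs_a = classify([1.0, 0.0], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
category_b, outputs_b = classify([0.0, 1.0], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
```

In a trained system the weights would come from the vector learning unit (70); here they are chosen only so that each input feature drives a different output unit.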

Referring to the drawing, speech is first recognized and converted into text by the speech recognition system (100), and the speech recognition unit (10) extracts the text as a text corpus and classifies it into word units (S110). The text corpus classified into word units is then segmented into morpheme units by the morpheme analyzer (20) (S120).

The text corpus segmented into morpheme units is grouped into representative words by the mapping unit (30), which groups words that share the same concept. Specifically, the grouping module (34) looks up each morpheme segmented by the morpheme analyzer (20) in the concept dictionary DB (32) and groups it under a representative word (S130). Stopwords among the representative words, such as unnecessary affixes, may then be removed by the stopword removal unit (40).
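The grouping in step S130 amounts to replacing each morpheme with the representative word of its concept group. The sketch below is illustrative only: the entries of `CONCEPT_DB` and the function name are hypothetical, and passing unknown morphemes through unchanged is an assumption, since the patent does not say how morphemes missing from the concept dictionary DB (32) are handled.

```python
# Hypothetical stand-in for the concept dictionary DB (32):
# each morpheme maps to the representative word of its concept group.
CONCEPT_DB = {
    "balance": "BALANCE",
    "remaining": "BALANCE",
    "funds": "BALANCE",
    "check": "INQUIRE",
    "lookup": "INQUIRE",
}


def map_to_representative(morphemes, concept_db=CONCEPT_DB):
    """Replace each morpheme with its representative word.

    Assumption: morphemes not found in the dictionary are kept as-is.
    """
    return [concept_db.get(m, m) for m in morphemes]


mapped = map_to_representative(["check", "remaining", "funds", "today"])
```

Collapsing synonyms this way means the later feature statistics are computed over a smaller, more stable vocabulary, which is the dictionary-size effect the description attributes to the concept dictionary.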

The mapping unit (30) and the stopword removal unit (40) filter the text corpus segmented into morpheme units in order to reduce the amount of data the feature selection unit (50) must compare when analyzing feature values.

The representative words grouped by the mapping unit (30) are substituted into the vector generated by the statistics-based semantic classifier (82) and automatically assigned to the respective semantic categories (S140).

FIG. 8 is a flowchart of the semantic classifier of the statistical semantic classification system for speech recognition according to the present invention.

Referring to the drawing, the learning data unit (15) first collects the text corpus generated by the speech recognition unit (10) (S210). The collected text corpus, that is, the training data, is segmented into morpheme units by the morpheme analyzer (20) (S220).

The text corpus segmented into morpheme units is grouped into representative words by the mapping unit (30), which groups words that share the same concept (S230). Specifically, the grouping module (34) looks up each morpheme segmented by the morpheme analyzer (20) in the concept dictionary DB (32) and groups it under a representative word (S240). Stopwords among the representative words that carry little information, such as unnecessary affixes, may then be removed by the stopword removal unit (40). The mapping unit (30) and the stopword removal unit (40) filter the text corpus segmented into morpheme units in order to reduce the amount of data the feature selection unit (50) must compare when analyzing feature values.

The feature selection unit (50) selects, from the representative words, features useful for distinguishing categories (S250). The features selected by the feature selection unit (50) are extracted as a specific vector by the vector extraction unit, and the extracted vector is learned automatically by a machine learning algorithm using the ANNs algorithm (S260).

The specific vector learned in this way is used by the semantic classification unit (80) to recognize the speaker's speech. By statistically classifying the meaning of the utterance, the system can identify the user's intent and provide a service suited to the user's purpose.

Although the present invention has been described in detail in the above embodiments, it will be apparent to those skilled in the art that various changes and modifications are possible within the technical spirit of the invention, and such changes and modifications naturally fall within the scope of the appended claims.

As described above, the statistical semantic classification system and method for speech recognition according to the present invention apply to the search process a morpheme analyzer that segments Korean words into morphemes, together with a stopword dictionary and a concept dictionary, and thereby stabilize the dictionary size and improve the performance of Korean speech recognition and semantic classification.

Claims

delete

In the statistical semantic classification system for speech recognition,

A speech recognition unit for extracting an input speech signal into a predefined text corpus and classifying it into word units;

A morpheme analyzer for segmenting the text corpus classified into word units into morpheme units;

A mapping unit which introduces a concept dictionary in which words having the same concept are semantically grouped, and groups each morpheme segmented by the morpheme analyzer into a representative word; and

A semantic classification unit for substituting the representative word into a vector generated by a statistics-based semantic classifier and automatically assigning it to each semantic category;

Including,

wherein the semantic classifier comprises:

A learning data unit in which a text corpus generated by the speech recognition unit is collected;

A morpheme analyzer for segmenting the text corpus collected in the learning data unit into morpheme units;

A mapping unit which introduces a concept dictionary in which words having the same concept are semantically grouped, and groups each morpheme segmented by the morpheme analyzer into a representative word;

A feature selection unit for selecting, from the representative words, features useful for distinguishing categories; and

A vector extraction unit for extracting the feature selected by the feature selection unit into a specific vector by a machine learning algorithm;

A statistical semantic classification system for speech recognition comprising the above elements.

In the statistical semantic classification system for speech recognition,

Including,

wherein the morpheme analyzer comprises:

A morpheme phoneme dictionary DB in which each morpheme is classified into basic units;

A management module for managing the morpheme phoneme dictionary DB;

A classification module for classifying the text corpus generated by the speech recognition unit into each morpheme using the management module; and

A recognition module for recognizing the morphemes classified by the classification module;

A statistical semantic classification system for speech recognition comprising the above elements.

In the statistical semantic classification system for speech recognition,

Including,

wherein the mapping unit comprises:

A concept dictionary DB in which words having the same concept are semantically grouped; and

A grouping module for looking up each morpheme segmented by the morpheme analyzer in the concept dictionary DB and grouping it under a representative word;

A statistical semantic classification system for speech recognition comprising the above elements.

In the statistical semantic classification system for speech recognition,

Including,

A statistical semantic classification system for speech recognition, characterized in that it further comprises a stopword removal unit for removing stopwords that carry little information from the representative words grouped by the mapping unit.

The system of claim 5, wherein

the stopword removal unit comprises:

A stopword dictionary DB consisting of stopwords such as phrases, words, and onomatopoeia that carry little information; and

A stopword removal module for removing, using the stopword dictionary DB, stopwords that carry little information from the representative words;

A statistical semantic classification system for speech recognition comprising the above elements.

The system of claim 2, wherein

the feature selection unit comprises:

A feature selection module for extracting the mutual information and the chi-square statistic for the representative words; and

A word extraction module for passing data found by the feature selection module to have a low semantic classification feature value to a stopword removal unit, which removes stopwords that carry little information from the representative words grouped by the mapping unit, so that the data is removed;

A statistical semantic classification system for speech recognition comprising the above elements.

The system of claim 2, wherein

the vector extraction unit:

Has a neural network structure consisting of an input layer, a hidden layer, and an output layer, and adjusts the neural network weights between layers through the machine learning algorithm based on the neural network structure to generate and fuse target output values for input patterns, thereby extracting a speech feature vector for the speech signal.

The system of claim 2,

Further comprising a vector learning unit for automatically learning the vector by applying the machine learning algorithm to the vector extracted by the vector extraction unit.

The system according to claim 8 or 9, wherein

The machine learning algorithm is an ANNs (Artificial Neural Networks) algorithm.

delete

In a statistical semantic classification method for speech recognition using the statistical semantic classification system for speech recognition of claim 2,

A speech recognition step of extracting a text corpus used for speech recognition from the input speech signal and classifying it into word units;

A morpheme analysis step of segmenting the text corpus classified into word units into morpheme units;

A mapping step of grouping each morpheme segmented in the morpheme analysis step into a representative word by introducing a concept dictionary in which words having the same concept are semantically grouped; And

A semantic classification step of substituting the representative word into the vector generated by the semantic classifier and automatically assigning it to each semantic category;

Including,

wherein the semantic classification step comprises:

A learning data collection step of collecting the text corpus generated by the speech recognition unit;

A mapping step of grouping each morpheme segmented in the morpheme analysis step into a representative word by introducing a concept dictionary in which words having the same concept are semantically grouped;

A stopword removal step of removing stopwords that carry little information from the representative words grouped in the mapping step;

A feature selection step of selecting, from the representative words from which stopwords were removed in the stopword removal step, features useful for distinguishing categories; and

A vector extraction step of extracting the feature into a speech feature vector by the machine learning algorithm;

A statistical semantic classification method for speech recognition comprising the above steps.

Including,

Further comprising a stopword removal step of removing stopwords that carry little information from the representative words grouped in the mapping step.

The method of claim 12,

Further comprising a vector learning step of automatically learning the vector by applying the machine learning algorithm to the vector extracted in the vector extraction step.