KR20030069377A

KR20030069377A - Apparatus and method for detecting topic in speech recognition system

Info

Publication number: KR20030069377A
Application number: KR1020020008978A
Authority: KR
Inventors: 김진영; 최승호; 이경록
Original assignee: 대한민국(전남대학교총장)
Priority date: 2002-02-20
Filing date: 2002-02-20
Publication date: 2003-08-27

Abstract

PURPOSE: An apparatus and a method for detecting a topic in a speech recognition system are provided that raise discriminability by increasing the number of components in the basic analysis unit, thereby reducing topic detection errors. CONSTITUTION: When voice data is input to the topic detection apparatus, a phoneme recognizer (100) recognizes phonemes from the input voice data. A key string detector (110) groups the recognized phonemes and integrates them into basic analysis units. A topic comparator (120) calculates the probability that each basic analysis unit appears in the topic data of pre-stored training data, judges the suitability of each topic from that appearance probability, and assigns a score. A topic detector (130) detects, among the scored topics, those at or above a preset threshold.

Description

Topic detection device and method of speech recognition system {Apparatus and method for detecting topic in speech recognition system}

The present invention relates to an apparatus and method for detecting a topic in a speech recognition system and, more particularly, to an apparatus and method that group phonemes recognized from input voice data into basic analysis units of N phonemes each and detect a topic from those units.

In general, topic detection in a speech recognition system refers to detecting, in input voice data, a topic corresponding to a title, subject, or important content, so that the main content of multimedia data can be identified by analysis and used to determine the content.

In the past, topics were determined manually, with a person examining the multimedia data directly. With the explosive growth of multimedia information, however, manual classification of topics is now regarded as infeasible.

Accordingly, topic detection apparatuses that automatically analyze multimedia data and classify it by topic have recently been developed and put into use.

For example, for voice data in a news item in which "President Kim Dae-jung's tour of Europe" is spoken, a topic detection apparatus responds with topics such as "International" and "President Kim Dae-jung". Such a topic detection function makes it possible to provide important index information when searching multimedia data in a digital library or on the Internet.

In general, a topic detection apparatus consists of a preprocessor, a topic comparator, and a topic detector.

The preprocessor recognizes the input voice data in terms of basic analysis units. When the basic analysis unit is the phoneme, a phoneme recognizer (10) is used as the preprocessor, as shown in FIG. 1; when the basic analysis unit is the word, a word recognizer (11) is used as the preprocessor, as shown in FIG. 2.

The topic comparator (20) judges how well the basic analysis units recognized by the preprocessor fit each topic, assigns a score according to the result, and performs classification according to the assigned scores. Here, suitability is determined by calculating the probability that a basic analysis unit appears in the corresponding topic data of pre-stored training data.

The topic detector (30) compares the per-topic scores produced by the topic comparator (20), detects only those relatively high-scoring topics that are at or above a preset threshold, and accepts them as the topics of the input speech.

In the topic detection apparatus described above, one of the most critical matters is the choice of basic analysis unit; in general, either the word or the phoneme is used.

A topic detection apparatus that uses the word as the basic analysis unit has the advantage of few recognition errors. However, its computational cost grows in proportion to the number of key words, which makes it difficult to apply to a multimedia environment with a large number of topics, and its configuration becomes relatively complex.

A topic detection apparatus that uses the phoneme as the basic analysis unit has a limited number of key phonemes compared with the word-based case, so it can be applied easily without added cost even when the domain is extended, and its configuration is simple. However, the low recognition rate of the phoneme recognizer causes many errors, and the number of key phonemes is so small that the characteristics of each topic cannot be expressed adequately when the apparatus is applied to a multimedia environment with a large number of topics.

That is, conventional topic detection apparatuses each have shortcomings, depending on the type of basic analysis unit, that need to be remedied.

The present invention has been devised to solve the above problems of the prior art. Its object is to provide an apparatus and method for detecting a topic in a speech recognition system that keep the configuration simple by using the phoneme as the basic analysis unit, compensate for topic detection errors through key string detection, and express the characteristics of a large number of topics adequately.

FIGS. 1 and 2 are schematic block diagrams showing conventional topic detection apparatuses;

FIG. 3 is a schematic block diagram showing a topic detection apparatus according to a preferred embodiment of the present invention;

FIG. 4 is a diagram showing an example of detecting basic phoneme units using the topic detection apparatus shown in FIG. 3;

FIG. 5 is a flowchart showing the processing performed by the topic detection apparatus shown in FIG. 3.

<Explanation of reference numerals for the main parts of the drawings>

100: phoneme recognizer    110: key string detector

120: topic comparator    130: topic detector

To achieve the above object, a topic detection apparatus of a speech recognition system according to a preferred embodiment of the present invention comprises: a phoneme recognizer that recognizes phonemes from input voice data; a key string detector that groups the phonemes recognized by the phoneme recognizer N at a time and integrates each group into a basic analysis unit; a topic comparator that, for each basic analysis unit detected by the N-key string detector, calculates the probability that the unit appears in the topic data of pre-stored training data, judges the suitability of each topic from the calculated appearance probability, and assigns a score; and a topic detector that detects and outputs, among the topics scored by the topic comparator, those at or above a preset threshold.

To achieve the above object, a topic detection method of a speech recognition system according to a preferred embodiment of the present invention comprises: a phoneme recognition step of recognizing phonemes from input voice data; an N-key string detection step of grouping the recognized phonemes N at a time and integrating each group into a basic analysis unit; a topic comparison step of calculating, for each basic analysis unit detected in the N-key string detection step, the probability that the unit appears in the topic data of pre-stored training data, judging the suitability of each topic from the calculated appearance probability, and assigning a score; and a topic detection step of detecting and outputting, among the topics scored in the topic comparison step, those at or above a preset threshold.

Hereinafter, a preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 3 is a schematic block diagram showing a topic detection apparatus according to a preferred embodiment of the present invention. As shown in the figure, the topic detection apparatus according to the present invention comprises: a phoneme recognizer (100) that recognizes phonemes from input voice data; a key string detector (110) that groups the phonemes recognized by the phoneme recognizer (100) N at a time and integrates each group into a basic analysis unit; a topic comparator (120) that, for each basic analysis unit detected by the key string detector (110), calculates the probability that the unit appears in the topic data of pre-stored training data, judges the suitability of each topic from the calculated appearance probability, and assigns a score; and a topic detector (130) that detects and outputs, among the topics scored by the topic comparator (120), those at or above a preset threshold.

The operation of the present invention configured as above will now be described with reference to the accompanying drawings.

First, when voice data is input to the topic detection apparatus of the present invention (S10), the phoneme recognizer (100) recognizes phonemes from the input voice data (S20), and the key string detector (110) groups the recognized phonemes N at a time and integrates each group into a basic analysis unit (S30).

As an example, the process of recognizing phonemes and detecting basic analysis units when N is 4 and the subject of the input voice data is "President Kim Dae-jung's tour of Europe" is described with reference to FIG. 5.

Because the subject of the input voice data is "President Kim Dae-jung's tour of Europe", the Korean utterance for "President Kim Dae-jung" occurs many times. The phoneme recognizer (100) recognizes this utterance as the phoneme sequence "gz ii mf dz ai jz uu ng dz ai th oo ng rr yv ng". When the key string detector (110) converts this sequence into 4-key strings, the result is "gz ii mf dz", "ii mf dz ai", "mf dz ai jz", "dz ai jz uu", "ai jz uu ng", "jz uu ng dz", "uu ng dz ai", "ng dz ai th", and so on, and each group of phonemes bound into a key string becomes a basic analysis unit.
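The 4-key string conversion in this example behaves like a sliding window over the phoneme sequence. The following is a minimal sketch, assuming the window advances one phoneme at a time (as the listed key strings suggest); the function name `key_strings` is illustrative and does not come from the patent.

```python
def key_strings(phonemes, n):
    """Return every run of n consecutive phonemes, sliding one phoneme at a time."""
    return [" ".join(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


# Phoneme sequence given in the description for "President Kim Dae-jung".
phonemes = "gz ii mf dz ai jz uu ng dz ai th oo ng rr yv ng".split()

# With N = 4, each element of `units` is one basic analysis unit.
units = key_strings(phonemes, 4)
print(units[:3])  # ['gz ii mf dz', 'ii mf dz ai', 'mf dz ai jz']
```

A sequence shorter than N yields no key strings under this sketch; how the apparatus handles that case is not stated in the description.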

Next, for each basic analysis unit detected in step S30, the topic comparator (120) calculates the probability that the unit appears in the topic data of pre-stored training data, judges the suitability of each topic from the calculated appearance probability, and assigns a score (S40).
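The description does not give a formula for the appearance probability or the score. The sketch below is one possible reading, under stated assumptions: the probability is a relative frequency of the key string in each topic's training data, add-one smoothing is applied so unseen key strings do not give a zero probability, and the score is the mean log probability over the input units. The function names, the data layout, and the toy training data are all hypothetical.

```python
import math
from collections import Counter


def train_topic_counts(training_data):
    """Count key-string occurrences per topic.

    training_data maps a topic name to the list of key strings observed
    in that topic's training material (a hypothetical layout).
    """
    return {topic: Counter(strings) for topic, strings in training_data.items()}


def topic_scores(units, topic_counts, vocab_size):
    """Score each topic by the mean log appearance probability of the input units.

    Add-one smoothing is an assumption of this sketch; the patent does not
    specify how unseen key strings are handled.
    """
    scores = {}
    for topic, counts in topic_counts.items():
        total = sum(counts.values())
        log_prob = sum(
            math.log((counts[u] + 1) / (total + vocab_size)) for u in units
        )
        scores[topic] = log_prob / len(units) if units else float("-inf")
    return scores


# Toy training data, invented purely for illustration.
training = {
    "politics": ["gz ii mf dz", "ii mf dz ai", "gz ii mf dz"],
    "sports": ["aa bb cc dd", "bb cc dd ee"],
}
counts = train_topic_counts(training)
vocab = len({s for strings in training.values() for s in strings})
scores = topic_scores(["gz ii mf dz", "ii mf dz ai"], counts, vocab)
```

In this toy case the "politics" topic scores higher than "sports", because both input key strings occur in its training data.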

The topic detector (130) detects and outputs, among the topics scored in step S40, those at or above a preset threshold.
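The thresholding performed by the topic detector can be sketched as a simple filter. The function name `detect_topics`, the example scores, and the threshold value are illustrative assumptions; the patent does not state a score scale or a threshold.

```python
def detect_topics(scores, threshold):
    """Return the topics whose score is at or above the threshold, best first."""
    kept = [(topic, s) for topic, s in scores.items() if s >= threshold]
    return [topic for topic, _ in sorted(kept, key=lambda ts: ts[1], reverse=True)]


# Scores and threshold are invented for illustration.
example_scores = {"International": 0.82, "President Kim Dae-jung": 0.91, "Sports": 0.12}
detected = detect_topics(example_scores, threshold=0.5)
```

With these example values, both "President Kim Dae-jung" and "International" pass the threshold, which mirrors the multi-topic response described earlier for the news example.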

Although the present invention has been described above by way of a specific embodiment, it is not limited to that embodiment. Those skilled in the art can readily make various changes and modifications to the present invention, and it should be borne in mind that such changes or modifications fall within the scope of the present invention as long as they use its features.

As described above, the topic detection apparatus of a speech recognition system according to the present invention uses the key string detector to group the phonemes output by the phoneme recognizer into basic analysis units, each containing as many phonemes as the set key value. Forming basic analysis units in this way gives each unit more components than when a single phoneme is the basic analysis unit, which raises discriminability. Accordingly, the present invention can reduce topic detection errors compared with using the phoneme alone as the basic analysis unit.
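The gain in discriminability can be illustrated with a simple counting argument. This is an illustrative sketch and not a formula from the patent: the symbol $P$ for the size of the phoneme inventory and the value $P = 40$ are assumptions, since the description does not state the inventory size.

```latex
% Upper bound on the number of distinct basic analysis units
% when N phonemes from an inventory of size P are grouped together.
\[
  \#\text{units} \le P^{N},
  \qquad\text{e.g. } P = 40,\; N = 4:\quad 40^{4} = 2{,}560{,}000
  \quad\text{versus}\quad 40 \text{ units for } N = 1 .
\]
```

Only a fraction of these combinations will occur in real speech, so the bound indicates how much larger the space of possible units becomes, not how many units a trained system would hold.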

That is, the present invention keeps the simple configuration of the conventional phoneme-based approach while, by grouping several phonemes into a basic analysis unit, raising discriminability to a level comparable to using the word as the basic analysis unit, so that topic detection errors are reduced. It thus combines only the advantages of the conventional word-based and phoneme-based topic detection apparatuses and delivers optimal efficiency, with a simple configuration and few topic detection errors.

In addition, because a limited number of phonemes are combined N at a time, the patterns of information that can be expressed become more varied, so the characteristics of each topic can be expressed properly in a multimedia environment with many topics. Moreover, because the N phonemes carry the phoneme arrangement pattern of a key word, the units are more intuitive.

Claims

A topic detection apparatus of a speech recognition system, comprising:

a phoneme recognizer that recognizes phonemes from input voice data;

a key string detector that groups the phonemes recognized by the phoneme recognizer N at a time and integrates each group into a basic analysis unit;

a topic comparator that, for each basic analysis unit detected by the key string detector, calculates the probability that the unit appears in the topic data of pre-stored training data, judges the suitability of each topic from the calculated appearance probability, and assigns a score; and

a topic detector that detects and outputs, among the topics scored by the topic comparator, those at or above a preset threshold.

A topic detection method of a speech recognition system, comprising:

a phoneme recognition step of recognizing phonemes from input voice data;

an N-key string detection step of grouping the recognized phonemes N at a time and integrating each group into a basic analysis unit;

a topic comparison step of calculating, for each basic analysis unit detected in the N-key string detection step, the probability that the unit appears in the topic data of pre-stored training data, judging the suitability of each topic from the calculated appearance probability, and assigning a score; and

a topic detection step of detecting and outputting, among the topics scored in the topic comparison step, those at or above a preset threshold.