KR101065941B1

KR101065941B1 - Topic identification system and method using various relations between spoken words

Info

Publication number: KR101065941B1
Application number: KR1020040077432A
Authority: KR
Inventors: 장두성
Original assignee: 주식회사 케이티
Priority date: 2004-09-24
Filing date: 2004-09-24
Publication date: 2011-09-19
Also published as: KR20060028326A

Abstract

The present invention relates to a spoken-language topic identification system and method that use relations between words contained in speech. It aims to provide a system and method that extract attributes from the content a user utters (at least one sentence) by way of key words, relations between words (hypernym/hyponym relations, syntactic relations, etc.), and relations between sentences (rhetorical relations), classify the sentence represented by the extracted attributes, and automatically determine which of the predefined topics it belongs to.

To this end, the spoken-language topic identification system of the present invention comprises: speech recognition means for recognizing a user's spoken utterance; attribute extraction means for extracting key words from the recognized utterance sentences, extracting 'hypernym/hyponym words' based on the hierarchical relations of each key word, extracting 'inter-word syntactic relations' based on the syntactic relations between key words, and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and attribute classification means for classifying, by topic, the sentences represented by the attributes extracted by the attribute extraction means.

Speech recognition, spoken-language topic identification, word relations, sentence relations, automatic call routing

Description

Topic identification system and method using various relations between spoken words

FIG. 1 is a block diagram of an embodiment of the spoken-language topic identification system according to the present invention;

FIG. 2 is a detailed block diagram of an embodiment of the attribute extractor of FIG. 1 according to the present invention;

FIG. 3 is a flowchart of an embodiment of the training process for spoken-language topic identification according to the present invention;

FIG. 4 is a flowchart of an embodiment of the spoken-language topic identification method according to the present invention.

* Explanation of reference numerals for the main parts of the drawings *

10: speech recognizer 20: sentence classifier

21: attribute extractor 22: attribute classifier

The present invention relates to a spoken-language topic identification system and method that use relations between words contained in speech. More specifically, it relates to a system and method that extract attributes from the content a user utters (at least one sentence) by way of key words, relations between words (hypernym/hyponym relations, syntactic relations, etc.), and relations between sentences (rhetorical relations), classify the sentence represented by the extracted attributes, and automatically determine which of the predefined topics it belongs to.

In general, a widely known speech recognition approach uses the Hidden Markov Model (HMM). Recognition is carried out by a Viterbi search, which compares the HMMs trained in advance for the candidate words against the features of the input speech and selects the most similar candidate word.
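The Viterbi comparison described above can be sketched as follows. This is a minimal illustration, not the recognizer of the cited prior art: the two toy word models, their probabilities, and the quantized feature symbols "a" and "b" are all hypothetical.

```python
import math
from typing import Dict, List

def viterbi_log_score(obs: List[str],
                      start: Dict[str, float],
                      trans: Dict[str, Dict[str, float]],
                      emit: Dict[str, Dict[str, float]]) -> float:
    """Log-probability of the single best state path for obs under one HMM."""
    def log(p: float) -> float:
        return math.log(p) if p > 0 else float("-inf")

    states = list(start)
    # Initialise with the first observation.
    best = {s: log(start[s]) + log(emit[s].get(obs[0], 0.0)) for s in states}
    # Extend the best path one observation at a time.
    for symbol in obs[1:]:
        best = {
            s: max(best[prev] + log(trans[prev].get(s, 0.0)) for prev in states)
            + log(emit[s].get(symbol, 0.0))
            for s in states
        }
    return max(best.values())

# Hypothetical pre-trained word models (two-state, left-to-right).
shared_trans = {"s1": {"s1": 0.5, "s2": 0.5}, "s2": {"s2": 1.0}}
word_models = {
    "yes": dict(start={"s1": 1.0, "s2": 0.0}, trans=shared_trans,
                emit={"s1": {"a": 0.9, "b": 0.1}, "s2": {"a": 0.2, "b": 0.8}}),
    "no": dict(start={"s1": 1.0, "s2": 0.0}, trans=shared_trans,
               emit={"s1": {"a": 0.1, "b": 0.9}, "s2": {"a": 0.8, "b": 0.2}}),
}

obs = ["a", "a", "b"]  # hypothetical quantized features of the input speech
best_word = max(word_models, key=lambda w: viterbi_log_score(obs, **word_models[w]))
print(best_word)
```

Working in log-probabilities avoids numerical underflow on long observation sequences, which is why the sketch adds logs instead of multiplying probabilities.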

To aid understanding, a typical speech recognition system is described first.

First, when speech is input, an endpoint detector finds the speech section, excluding the silent sections before and after the speech. A feature extractor then extracts speech features from the signal in that speech section.
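As a rough illustration of endpoint detection, the sketch below marks frames whose short-time energy exceeds a threshold and returns the span from the first to the last such frame. The text does not say how the endpoint detector works, so the energy criterion, the frame length, and the threshold value are all assumptions.

```python
from typing import List, Optional, Tuple

def find_speech_span(samples: List[float],
                     frame_len: int = 4,
                     energy_threshold: float = 0.01) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the speech section, or None if all is silence."""
    active_frame_starts = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)  # mean squared amplitude
        if energy >= energy_threshold:
            active_frame_starts.append(start)
    if not active_frame_starts:
        return None
    end = min(active_frame_starts[-1] + frame_len, len(samples))
    return active_frame_starts[0], end

# Silence, then eight samples of "speech", then silence again.
signal = [0.0] * 8 + [0.5, -0.4, 0.6, -0.5] * 2 + [0.0] * 8
print(find_speech_span(signal))
```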

Next, a Viterbi searcher uses the speech feature values to select, from the words registered in a pronunciation dictionary built on a phoneme model database, the words with the highest likelihood.

Then, an utterance verifier uses the words selected by the Viterbi searcher to segment the feature sequence into phoneme units, and uses anti-phoneme models to obtain a likelihood ratio confidence score for each phoneme.

Utterance verification is a scheme that uses a confidence value (confidence score or confidence measure) to decide whether a given speech recognition result should be accepted or rejected. The confidence is a measure of how trustworthy the recognition result is: a high confidence value means the result can be trusted and should be accepted, whereas a low value means the result is hard to trust and should be rejected.

Finally, if a word is rejected, the utterance verifier repeats the verification process described above on the next candidate word.
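The accept/reject loop over candidate words might look like the following sketch. It assumes that each candidate comes with per-phoneme log-likelihoods under the phoneme model and under the anti-phoneme model, and that the word-level confidence is the mean of the per-phoneme log-likelihood ratios; the averaging rule, the threshold, and the sample scores are illustrative assumptions, not details given in the text.

```python
from typing import List, Optional, Tuple

# (log-likelihood under phoneme model, log-likelihood under anti-phoneme model)
PhonemeScores = List[Tuple[float, float]]

def word_confidence(phonemes: PhonemeScores) -> float:
    """Mean per-phoneme log-likelihood ratio (assumed aggregation rule)."""
    ratios = [log_p - log_p_anti for log_p, log_p_anti in phonemes]
    return sum(ratios) / len(ratios)

def verify_candidates(candidates: List[Tuple[str, PhonemeScores]],
                      threshold: float) -> Optional[str]:
    """Accept the first candidate whose confidence reaches the threshold; None if all are rejected."""
    for word, phonemes in candidates:
        if word_confidence(phonemes) >= threshold:
            return word   # accept
        # otherwise reject and move on to the next candidate
    return None

# Hypothetical candidates, ordered by Viterbi likelihood.
candidates = [
    ("hotel", [(-3.0, -2.5), (-4.0, -3.8)]),   # anti-models fit better: low confidence
    ("motel", [(-3.0, -5.0), (-4.0, -6.0)]),   # phoneme models fit better: high confidence
]
print(verify_candidates(candidates, threshold=1.0))
```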

When a sentence is recognized, the same utterance verification process applies, with only a grammar added, so that verification is performed at the sentence level.

Building on such a speech recognition system, services such as an automatic call routing service become possible, in which a machine answers calls coming in to a telephone branch office or the like and automatically connects them to the desired telephone service or customer service department. Such a service calls for a spoken-language topic identification system.

A conventional spoken-language topic identification system is described below with reference to FIG. 1.

Typically, a spoken-language topic identification system consists of a speech recognizer (10) that recognizes the user's speech and a sentence classifier (20) that classifies the recognized sentence into one of several predefined topics. The input to the sentence classifier (20) is the user's current utterance sentence or the sentences the user uttered previously. The sentence classifier (20) consists of an attribute extractor (21), which extracts from these only the main attributes needed for classification, and an attribute classifier (22), which classifies the sentence represented by the extracted attributes.

A variety of classification techniques are used for the attribute classifier (22) of a spoken-language topic identification system, including latent semantic indexing (LSI), the support vector machine (SVM), and the naive Bayes classifier. Whatever type of classifier is used, however, prior work shares the common feature of using the key words found in the recognized sentence as the classifier's classification attributes.

For example, when a user says "호텔 예약을 하고 싶은데요." ("I would like to make a hotel reservation."), the key words that can be obtained are '호텔' (hotel), '예약' (reservation), and '하다' (to do). The user's intent can be said to be expressed by these key words, which are extracted as attributes.

In a call routing service, however, most utterances contain only two or three key words, and those words rarely include one that expresses the user's intent. In such a service, classification attributes made up of only a few key words are a major factor that degrades topic identification performance.

The present invention has been proposed to solve the above problem. Its object is to provide a spoken-language topic identification system and method that extract attributes from the content a user utters (at least one sentence) by way of key words, relations between words (hypernym/hyponym relations, syntactic relations, etc.), and relations between sentences (rhetorical relations), classify the sentence represented by the extracted attributes, and automatically determine which of the predefined topics it belongs to.

Other objects and advantages of the present invention can be understood from the following description and will become clearer from the embodiments of the present invention. It will also be readily apparent that the objects and advantages of the present invention can be realized by the means set forth in the claims and combinations thereof.

To achieve the above object, a first system of the present invention is a spoken-language topic identification system comprising: speech recognition means for recognizing a user's spoken utterance; attribute extraction means for extracting key words from the recognized utterance sentences, extracting 'hypernym/hyponym words' based on the hierarchical relations of each key word, and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and attribute classification means for classifying, by topic, the sentences represented by the attributes extracted by the attribute extraction means.

A second system of the present invention is a spoken-language topic identification system comprising: speech recognition means for recognizing a user's spoken utterance; attribute extraction means for extracting key words from the recognized utterance sentences, extracting 'inter-word syntactic relations' based on the syntactic relations between key words, and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and attribute classification means for classifying, by topic, the sentences represented by the attributes extracted by the attribute extraction means.

A third system of the present invention is a spoken-language topic identification system comprising: speech recognition means for recognizing a user's spoken utterance; attribute extraction means for extracting key words from the recognized utterance sentences and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and attribute classification means for classifying, by topic, the sentences represented by the attributes extracted by the attribute extraction means.

A fourth system of the present invention is a spoken-language topic identification system comprising: speech recognition means for recognizing a user's spoken utterance; attribute extraction means for extracting key words from the recognized utterance sentences, extracting 'hypernym/hyponym words' based on the hierarchical relations of each key word, extracting 'inter-word syntactic relations' based on the syntactic relations between key words, and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and attribute classification means for classifying, by topic, the sentences represented by the attributes extracted by the attribute extraction means.

The method of the present invention is a spoken-language topic identification method applied to a spoken-language topic identification system, comprising: a speech recognition step of recognizing a user's spoken utterance; an attribute extraction step of extracting the attributes needed for classification by extracting key words from at least one recognized utterance sentence and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; a training step of taking as input a set of multiple sentences labeled with topic codes and outputting an attribute set for each topic code; and an attribute classification step of comparing the attributes extracted in the attribute extraction step with the per-topic-code attribute sets output in the training step, finding the topic code of the attribute set closest to the attributes of the recognized sentence, calculating the similarity, and confirming that topic code as the topic code of the spoken utterance.

Deleted

The present invention automatically extracts, from the user's utterance, not only the key words but also the vertical and horizontal relations between them, and uses these together as attributes for classification. Enriching the attribute information that expresses the user's intent in this way ultimately improves topic identification performance.

Here, the vertical relation between words is the hypernym/hyponym relation, which can be obtained from an ontology such as WordNet. The available horizontal relations are syntactic relations and rhetorical relations, which can be obtained through syntactic parsing of sentences and rhetorical structure analysis across sentences.

Thus, in order to automatically determine which of the predefined topics the user's spoken content belongs to, the present invention goes through attribute extraction, training, and topic identification, and feeds the attribute classifier with attributes that express the key words in the user's speech, the relations between words, and the relations between sentences, thereby improving spoken topic identification performance.

The above objects, features, and advantages will become clearer from the following detailed description taken in conjunction with the accompanying drawings, so that a person of ordinary skill in the art to which the present invention pertains can readily practice its technical idea. In describing the present invention, detailed descriptions of related known techniques are omitted where they would unnecessarily obscure the gist of the invention. A preferred embodiment of the present invention is described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram of an embodiment of the spoken-language topic identification system according to the present invention.

As shown in FIG. 1, the spoken-language topic identification system consists of a speech recognizer (10) that recognizes the user's speech and a sentence classifier (20) that classifies the recognized sentence into one of several predefined topics. The sentence classifier (20) in turn consists of an attribute extractor (21) and an attribute classifier (22).

The attribute extractor (21) takes as input the user's current utterance sentence or the sentences the user uttered previously and extracts only the attributes needed for classification. The attribute classifier (22) classifies the sentence represented by the extracted attributes.

The attributes used for classification in the present invention are the key words used in the sentence, the hypernyms of those words, the syntactic relations between words, and the rhetorical relations between sentences.

Accordingly, as shown in FIG. 2, the attribute extractor (21) consists of a key word extractor (211), a hypernym/hyponym relation extractor (212), a syntactic structure analyzer (213), and a rhetorical structure analyzer (214).

The spoken-language topic identification system according to the present invention therefore includes a speech recognizer (10) for recognizing the user's spoken utterance, an attribute extractor (21) for extracting key words from the recognized utterance sentence and extracting 'hypernym/hyponym words' based on the hierarchical relations of each key word, and an attribute classifier (22) for classifying, by topic, the sentence represented by the attributes extracted by the attribute extractor (21).

The attribute extractor (21) also extracts key words from the recognized utterance sentence and extracts 'inter-word syntactic relations' based on the syntactic relations between the key words.

In addition, when the utterance consists of at least two sentences, the attribute extractor (21) extracts key words from the recognized utterance sentences and extracts 'inter-sentence rhetorical relations' based on the rhetorical relations between the sentences.

The process by which the attribute extractor (21) extracts attributes is described in more detail below.

When the sentence obtained from the user's speech is "객실 예약을 하고 싶은데요." ("I would like to reserve a room."), the key words obtained from it by the key word extractor (211) are '객실' (room), '예약' (reservation), and '하다' (to do). The hypernym of each key word can be extracted by the hypernym/hyponym relation extractor (212). The hypernyms that can be extracted from these words are '호텔' (hotel), the hypernym of '객실', and '고객대응' (customer service), the hypernym of '예약'.
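The key word and hypernym extraction in this example can be sketched as below. The sketch assumes the recognized sentence has already been split into (lemma, part-of-speech) pairs by some upstream morphological analyzer, that key words are the nouns and verbs, and that the ontology is a plain word-to-hypernym dictionary; the tag names, the `HYPERNYMS` table, and both function names are hypothetical stand-ins for the extractors (211) and (212).

```python
from typing import Dict, List, Tuple

Morpheme = Tuple[str, str]  # (lemma, part-of-speech tag)

# Hypothetical toy ontology: word -> direct hypernym.
HYPERNYMS: Dict[str, str] = {"객실": "호텔", "예약": "고객대응"}

def extract_key_words(morphemes: List[Morpheme]) -> List[str]:
    """Keep content words only (assumed here to be nouns and verbs)."""
    return [lemma for lemma, pos in morphemes if pos in ("noun", "verb")]

def extract_hypernyms(key_words: List[str],
                      ontology: Dict[str, str] = HYPERNYMS) -> List[str]:
    """Look up the hypernym of each key word that the ontology knows."""
    return [ontology[word] for word in key_words if word in ontology]

# "객실 예약을 하고 싶은데요." after a hypothetical morphological analysis.
sentence = [("객실", "noun"), ("예약", "noun"), ("을", "particle"),
            ("하다", "verb"), ("싶다", "auxiliary")]

key_words = extract_key_words(sentence)
hypernyms = extract_hypernyms(key_words)
print(key_words, hypernyms)
```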

In addition, the syntactic structure analyzer (213) extracts the syntactic relations between the key words. Here the extracted syntactic relation is '예약-을-하다' (reservation-OBJ-do), the object relation between '예약' and '하다'. Through this process, the attributes of the sentence are expressed as {'객실', '예약', '하다', '호텔', '고객대응', '예약-을-하다'}.
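A real syntactic structure analyzer would parse the sentence; as a simplified stand-in, the sketch below detects a noun, a particle, and a verb in sequence and joins them into a relation attribute, then assembles the attribute set listed above. The surface pattern, the tag names, and the function name are assumptions made for illustration only.

```python
from typing import List, Set, Tuple

Morpheme = Tuple[str, str]  # (lemma, part-of-speech tag)

def extract_syntactic_relations(morphemes: List[Morpheme]) -> List[str]:
    """Join each noun-particle-verb sequence into a 'noun-particle-verb' attribute."""
    relations = []
    for (w1, p1), (w2, p2), (w3, p3) in zip(morphemes, morphemes[1:], morphemes[2:]):
        if (p1, p2, p3) == ("noun", "particle", "verb"):
            relations.append(f"{w1}-{w2}-{w3}")
    return relations

# "객실 예약을 하고 싶은데요." after a hypothetical morphological analysis.
sentence = [("객실", "noun"), ("예약", "noun"), ("을", "particle"),
            ("하다", "verb"), ("싶다", "auxiliary")]

relations = extract_syntactic_relations(sentence)

# Key words and hypernyms as given in the example in the text.
key_words = ["객실", "예약", "하다"]
hypernyms = ["호텔", "고객대응"]
attribute_set: Set[str] = set(key_words) | set(hypernyms) | set(relations)
print(sorted(attribute_set))
```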

As another example, when the user's speech is "어제 예약을 했는데요. 그런데, 오늘 일이 생겨서요." ("I made a reservation yesterday. But something came up today."), the rhetorical structure analyzer (214) can extract an inter-sentence rhetorical relation such as '(예약-을-하다) - 반전 - (일-이-생기다)', where '반전' denotes a reversal (contrast) relation. Including this rhetorical relation, the attributes of the utterance are expressed as {'어제', '예약', '하다', '오늘', '일', '생기다', '날짜', '고객대응', '예약-을-하다', '일-이-생기다', '(예약-을-하다) - 반전 - (일-이-생기다)'}.
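One simple way to approximate this step is to map the connective that opens a sentence to a rhetorical relation label and link the core syntactic relations of the two adjacent sentences. The text does not describe how the rhetorical structure analyzer (214) works internally, so the connective lookup table, the `Sentence` type, and the function name below are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional

class Sentence(NamedTuple):
    connective: Optional[str]   # discourse connective opening the sentence, if any
    core_relation: str          # main syntactic relation of the sentence

# Assumed cue table; only the 그런데 -> 반전 (reversal) pairing comes from the example.
CONNECTIVE_TO_RELATION: Dict[str, str] = {"그런데": "반전"}

def extract_rhetorical_relations(sentences: List[Sentence]) -> List[str]:
    """Link adjacent sentences whose second member opens with a known connective."""
    relations = []
    for previous, current in zip(sentences, sentences[1:]):
        label = CONNECTIVE_TO_RELATION.get(current.connective)
        if label is not None:
            relations.append(
                f"({previous.core_relation}) - {label} - ({current.core_relation})")
    return relations

# "어제 예약을 했는데요. 그런데, 오늘 일이 생겨서요."
utterance = [Sentence(None, "예약-을-하다"), Sentence("그런데", "일-이-생기다")]
rhetorical = extract_rhetorical_relations(utterance)
print(rhetorical)
```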

The training process of the spoken-language topic identification system is now described with reference to FIG. 3.

The training process for spoken-language topic identification takes as input a large set of sentences labeled with topic codes and outputs an attribute set for each topic code.

Specifically, one sentence is first selected from the large set of sentences labeled with topic codes (301), the attributes of that sentence are extracted (302), and the extracted attributes are added to the attribute set for the topic code of the selected sentence (303).

If sentences remain, steps 301 to 303 are repeated for them; once the extracted attributes of all sentences have been added, the attribute set for each topic code is output (305).
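The training loop of steps 301 to 305 can be sketched as follows. The attribute extractor is passed in as a function so that the loop stays independent of how attributes are obtained; the whitespace-splitting extractor, the English toy sentences, and the topic codes "RESERVATION" and "BILLING" are placeholders invented for this illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set, Tuple

def train(labeled_sentences: Iterable[Tuple[str, str]],
          extract_attributes: Callable[[str], Set[str]]) -> Dict[str, Set[str]]:
    """Build one attribute set per topic code from (topic_code, sentence) pairs."""
    attribute_sets: Dict[str, Set[str]] = defaultdict(set)
    for topic_code, sentence in labeled_sentences:      # step 301: select a sentence
        attributes = extract_attributes(sentence)        # step 302: extract attributes
        attribute_sets[topic_code].update(attributes)    # step 303: add to the code's set
    return dict(attribute_sets)                          # step 305: output the sets

# Hypothetical labeled corpus with pre-segmented toy sentences.
corpus = [
    ("RESERVATION", "room reservation do"),
    ("RESERVATION", "hotel reservation"),
    ("BILLING", "bill pay do"),
]
topic_attribute_sets = train(corpus, lambda s: set(s.split()))
print(topic_attribute_sets)
```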

FIG. 4 is a flowchart of an embodiment of the spoken-language topic identification method according to the present invention, showing the procedure that takes the user's speech as input and outputs its topic code.

As shown in FIG. 4, the spoken-language topic identification method according to the present invention recognizes the user's speech and outputs a sentence (401), then extracts attributes from the recognized sentence (402).

As described above, attribute extraction extracts the key words from the recognized utterance sentence and then extracts 'hypernym/hyponym words' based on the hierarchical relations of each key word, 'inter-word syntactic relations' based on the syntactic relations between key words, or, when the utterance consists of at least two sentences, 'inter-sentence rhetorical relations' based on the rhetorical relations between the sentences. The vertical relation between words (hypernym/hyponym relation), the horizontal relation between words (syntactic relation), and the horizontal relation between sentences (rhetorical relation) can each be used as an independent classification attribute together with the key word attributes, or some or all of them can be used in combination.
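Because the three relation types can be used individually or in any combination alongside the key words, a configurable assembly step is one natural way to express this. The sketch below uses a flag enumeration for that purpose; the `Relation` enum and `assemble_attributes` function are illustrative names, not part of the described system.

```python
from enum import Flag, auto
from typing import Iterable, Set

class Relation(Flag):
    HYPERNYM = auto()     # vertical relation between words
    SYNTACTIC = auto()    # horizontal relation between words
    RHETORICAL = auto()   # horizontal relation between sentences

def assemble_attributes(key_words: Iterable[str],
                        hypernyms: Iterable[str],
                        syntactic: Iterable[str],
                        rhetorical: Iterable[str],
                        use: Relation) -> Set[str]:
    """Key words are always included; each relation type only if enabled in `use`."""
    attributes = set(key_words)
    if Relation.HYPERNYM in use:
        attributes |= set(hypernyms)
    if Relation.SYNTACTIC in use:
        attributes |= set(syntactic)
    if Relation.RHETORICAL in use:
        attributes |= set(rhetorical)
    return attributes

selected = assemble_attributes(
    key_words=["예약", "하다"],
    hypernyms=["고객대응"],
    syntactic=["예약-을-하다"],
    rhetorical=["(예약-을-하다) - 반전 - (일-이-생기다)"],
    use=Relation.HYPERNYM | Relation.RHETORICAL,
)
print(sorted(selected))
```

Enabling all three flags corresponds to the fourth system described earlier, while `Relation.HYPERNYM | Relation.RHETORICAL` corresponds to the first.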

Next, the extracted attributes are compared with the per-topic-code attribute sets output by the process of FIG. 3, the topic code of the attribute set closest to the attributes of the recognized sentence is found, and the similarity is calculated (403).

If the calculated similarity is greater than or equal to a preset threshold, the found topic code is confirmed and output as the topic code of the input speech (405); if it is below the threshold, topic identification is judged to have failed and a topic identification failure code is output (406).
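Steps 403 to 406 can be sketched as below. The text does not specify the similarity measure, so this sketch assumes the fraction of the utterance's attributes that appear in a topic's attribute set; the threshold of 0.5, the failure code string, and the toy topic sets are likewise assumptions.

```python
from typing import Dict, Set, Tuple

FAILURE_CODE = "TOPIC_UNKNOWN"  # hypothetical topic identification failure code

def classify(attributes: Set[str],
             topic_attribute_sets: Dict[str, Set[str]],
             threshold: float = 0.5) -> Tuple[str, float]:
    """Return (topic_code, similarity), or (FAILURE_CODE, similarity) below the threshold."""
    best_code, best_similarity = None, 0.0
    for code, topic_attributes in topic_attribute_sets.items():
        # Assumed similarity: share of the utterance's attributes covered by the topic set.
        similarity = (len(attributes & topic_attributes) / len(attributes)
                      if attributes else 0.0)
        if similarity > best_similarity:                 # step 403: closest attribute set
            best_code, best_similarity = code, similarity
    if best_code is not None and best_similarity >= threshold:
        return best_code, best_similarity                # step 405: confirm topic code
    return FAILURE_CODE, best_similarity                 # step 406: report failure

topic_sets = {
    "RESERVATION": {"room", "reservation", "do", "hotel"},
    "BILLING": {"bill", "pay", "do"},
}
print(classify({"room", "reservation", "do"}, topic_sets))
print(classify({"weather", "today"}, topic_sets))
```

With such set-overlap scoring, adding hypernym and relation attributes gives each short utterance more chances to match a topic set, which is the effect the description attributes to the richer attribute representation.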

The method of the present invention as described above can be implemented as a program and stored on a computer-readable recording medium (CD-ROM, RAM, ROM, floppy disk, hard disk, magneto-optical disk, etc.). Since a person of ordinary skill in the art can readily carry this out, it is not described in further detail.

The present invention described above is not limited by the foregoing embodiments and the accompanying drawings, since a person of ordinary skill in the art can make various substitutions, modifications, and changes without departing from the technical idea of the invention.

By using the relations between words and sentences contained in speech as classification attributes, the present invention enriches the attribute information that expresses the user's intent and thereby improves spoken topic identification performance.

Claims

A spoken-language topic identification system using relations between words contained in speech, comprising:

speech recognition means for recognizing a user's spoken utterance;

attribute extraction means for extracting key words from the recognized utterance sentences, extracting 'hypernym/hyponym words' based on the hierarchical relations of each key word, and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and

attribute classification means for classifying, by topic, the sentences represented by the attributes extracted by the attribute extraction means.

The system of claim 1,

wherein the attribute extraction means

further extracts 'inter-word syntactic relations' based on the syntactic relations between the key words extracted from the recognized utterance sentences.

delete

A spoken-language topic identification system, comprising:

speech recognition means for recognizing a user's spoken utterance;

attribute extraction means for extracting key words from the recognized utterance sentences, extracting 'inter-word syntactic relations' based on the syntactic relations between key words, and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and

The system of claim 4,

wherein the attribute extraction means

further extracts 'hypernym/hyponym words' based on the hierarchical relations of each key word extracted from the recognized utterance sentences.

delete

A spoken-language topic identification system, comprising:

speech recognition means for recognizing a user's spoken utterance;

attribute extraction means for extracting key words from the recognized utterance sentences and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and

The system of claim 7,

wherein the attribute extraction means

delete

A spoken-language topic identification system, comprising:

speech recognition means for recognizing a user's spoken utterance;

attribute extraction means for extracting key words from the recognized utterance sentences, extracting 'hypernym/hyponym words' based on the hierarchical relations of each key word, extracting 'inter-word syntactic relations' based on the syntactic relations between key words, and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences; and

delete

The system according to any one of claims 1, 2, 4, 5, 7, 8, and 10,

wherein the attribute extraction means comprises:

key word extraction means for extracting key words from the recognized utterance sentences;

hypernym/hyponym relation extraction means for extracting a hypernym or hyponym of each key word in the utterance sentences;

syntactic structure analysis means for extracting the syntactic relations between key words in the utterance sentences; and

rhetorical structure analysis means for extracting the rhetorical relations between the utterance sentences

A spoken-language topic identification method using relations between words contained in speech, applied to a spoken-language topic identification system, the method comprising:

a speech recognition step of recognizing a user's spoken utterance;

an attribute extraction step of extracting the attributes needed for classification by extracting key words from at least one recognized utterance sentence and extracting 'inter-sentence rhetorical relations' based on the rhetorical relations between utterance sentences;

a training step of taking as input a set of multiple sentences labeled with topic codes and outputting an attribute set for each topic code; and

an attribute classification step of comparing the attributes extracted in the attribute extraction step with the per-topic-code attribute sets output in the training step, finding the topic code of the attribute set closest to the attributes of the recognized sentence, calculating the similarity, and confirming that topic code as the topic code of the spoken utterance.

The method of claim 13,

wherein the attribute extraction step

extracts key words from the recognized utterance sentences and extracts 'hypernym/hyponym words' based on the hierarchical relations of each key word.

The method of claim 13,

wherein the attribute extraction step

extracts key words from the recognized utterance sentences and extracts 'inter-word syntactic relations' based on the syntactic relations between the key words.

delete

The method according to any one of claims 13 to 15,

wherein the spoken-language topic identification system

is usable for an automatic call routing service.

The method according to any one of claims 13 to 15,

wherein the attribute classification step

compares the extracted attributes with the attribute set for each topic code, finds the topic code of the attribute set closest to the attributes of the recognized sentence, calculates the similarity, confirms and outputs the found topic code as the topic code of the input speech when the calculated similarity is greater than or equal to a preset threshold, and, when the similarity is below the threshold, judges that topic identification has failed and outputs a topic identification failure code.

delete