JPH0421899A

JPH0421899A - Voice recognizing device

Info

Publication number: JPH0421899A
Application number: JP2127280A
Authority: JP
Inventors: Tatsuya Kimura; 達也木村; Tatsuro Ito; 達朗伊藤; Seiji Hiraoka; 平岡　省二
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1990-05-16
Filing date: 1990-05-16
Publication date: 1992-01-24

Abstract

PURPOSE: To let a speaker quickly access the intended word by classifying a plurality of word candidates or sentence candidates by classification item and displaying the classified result. CONSTITUTION: A speech analysis circuit 1 receives the input speech and outputs a time series of feature parameters representing the speech features to a pattern matching circuit 2. The circuit 2 matches this feature parameter time series against the parameter time series supplied as standard patterns from a standard pattern storage circuit 6, for a predetermined group of words, and outputs the matching result for each word to a candidate selection circuit 3. The circuit 3 selects a plurality of word candidates, starting from those with the best matching results, and supplies them to a candidate classification circuit 4, which consults the classification items stored in a classification item information storage circuit 7 and classifies the word candidates by classification item. A screen display device 5 displays the word candidates under their respective classification items.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application

The present invention relates to a speech recognition device that recognizes words or sentences spoken by a person.

Prior Art

Various methods exist for realizing a device that recognizes human speech.

One example is a method based on matching acoustic parameter patterns. As prior art, this acoustic-parameter pattern-matching method is described below for the case of word recognition.

FIG. 3 shows a typical configuration of a conventional speech recognition system.

In FIG. 3, the input speech is fed to the speech analysis circuit 11. The speech analysis circuit 11 analyzes the input speech and outputs a time series of feature parameters representing the characteristics of the speech to the pattern matching circuit 12.

The pattern matching circuit 12 is also supplied with a parameter time series from the standard pattern storage circuit 16.

The standard pattern storage circuit 16 stores, for each word, a parameter time series prepared for use as a standard pattern. The pattern matching circuit 12 matches the feature parameter time series obtained by the speech analysis circuit 11 against the standard pattern parameter time series supplied from the standard pattern storage circuit 16, for a predetermined group of words, and outputs the matching result for each word to the recognition result determination circuit 13. The recognition result determination circuit 13 receives the matching results from the pattern matching circuit 12, determines the word that gives the best matching result, and outputs that word externally as the recognition result of the speech recognition device.
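The decision step of the conventional device can be sketched as follows. This is a minimal illustration, not the patent's circuit: the function name `determine_result`, the score values, and the convention that a lower score means a better match are all assumptions made for the example.

```python
from typing import Dict


def determine_result(scores: Dict[str, float]) -> str:
    """Return the single best-matching word (assumed: lower score = better match)."""
    if not scores:
        raise ValueError("no matching results")
    return min(scores, key=scores.get)


# Hypothetical matching distances for one utterance of "hatsumei".
scores = {"shitsurei": 0.41, "hatsumei": 0.43, "onsei": 0.90}
result = determine_result(scores)
print(result)
```

With these made-up scores the intended word loses by a small margin, and because only one word is output, the near miss is not visible to the user.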

Problems to be Solved by the Invention

A conventional speech recognition device matches the time-series parameters of the input pattern against those of each standard pattern and takes the word that gives the best matching result as the recognition result. In speech recognition, however, so-called recognition errors, in which the recognition result differs from what the speaker intended, occur with a certain frequency. When a recognition error occurs, some remedy such as re-entering the same word is needed. With a method that yields only one recognition result, as in the conventional example of FIG. 3, the speaker therefore needs extra time and effort to access the intended word after an error, which is a problem of operability.

In view of the above problem, an object of the present invention is to provide a speech recognition device that allows the speaker to access the intended word in a short time when a recognition error occurs.

Means for Solving the Problems

To achieve this object, the present invention comprises: speech recognition means for outputting a plurality of word candidates or sentence candidates as recognition results; candidate classification means for classifying the plurality of word candidates or sentence candidates output by the speech recognition means by classification item; and a screen display device that outputs and displays the plurality of word candidates or sentence candidates by classification item, based on the result of the classification by the candidate classification means.

Operation

With the above configuration, the present invention displays a plurality of recognition results as candidates, grouped by classification item. If the correct word is among the candidates, the speaker can pick it out from the displayed candidates and thus access the intended word with a single utterance.

Embodiment

The present invention is described below by way of an embodiment.

FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention.

In FIG. 1, the input speech is fed to a speech analysis circuit 1. The speech analysis circuit 1 analyzes the input speech and outputs a time series of feature parameters representing the characteristics of the speech to a pattern matching circuit 2.

The pattern matching circuit 2 is also supplied with a parameter time series from a standard pattern storage circuit 6. The standard pattern storage circuit 6 stores, for each word, a parameter time series prepared for use as a standard pattern. The pattern matching circuit 2 matches the feature parameter time series obtained by the speech analysis circuit 1 against the standard pattern parameter time series supplied from the standard pattern storage circuit 6, for a predetermined group of words, and outputs the matching result for each word to a candidate selection circuit 3.
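The text does not say how the two parameter time series are matched. One common technique for comparing sequences of different lengths is dynamic time warping (DTW), so the sketch below uses it purely as an assumption; `dtw_distance`, `match_all`, and the toy one-dimensional frames are hypothetical names and data, not taken from the patent.

```python
import math
from typing import Dict, List, Sequence

Frame = Sequence[float]  # one feature-parameter vector


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two feature frames."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(inp: List[Frame], ref: List[Frame]) -> float:
    """Accumulated DTW cost between an input and a standard pattern (lower = better)."""
    n, m = len(inp), len(ref)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(inp[i - 1], ref[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def match_all(inp: List[Frame], templates: Dict[str, List[Frame]]) -> Dict[str, float]:
    """Match the input against every word's standard pattern, one score per word."""
    return {word: dtw_distance(inp, ref) for word, ref in templates.items()}


# Toy data: the input is a time-stretched copy of template "a".
templates = {"a": [[0.0], [1.0], [2.0]], "b": [[5.0], [5.0]]}
scores = match_all([[0.0], [0.0], [1.0], [2.0]], templates)
print(scores)
```

The per-word score dictionary corresponds to the matching results output for each word; any other distance measure could be substituted without changing the later stages.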

The candidate selection circuit 3 receives the matching results from the pattern matching circuit 2, selects a plurality of word candidates in order from the best matching result, and supplies them to the candidate classification circuit 4.
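The selection step amounts to keeping the N best-scoring words. A minimal sketch, assuming that matching results are distances (lower is better) and using a hypothetical function name `select_candidates` and made-up scores:

```python
from typing import Dict, List, Tuple


def select_candidates(scores: Dict[str, float], n: int = 5) -> List[Tuple[int, str]]:
    """Return up to n (rank, word) pairs, best match first (assumed: lower = better)."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [(rank, word) for rank, (word, _) in enumerate(ranked[:n], start=1)]


# Hypothetical matching distances.
candidates = select_candidates(
    {"shitsurei": 0.41, "hatsumei": 0.43, "onsei": 0.90, "gazou": 1.20}, n=3
)
print(candidates)
```

The number of candidates `n` is a design parameter that the text leaves open; a larger value raises the chance that the intended word is included, at the cost of a more crowded display.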

The candidate classification circuit 4 receives the word candidates supplied from the candidate selection circuit 3 and classifies them by classification item by consulting the classification items stored in the classification item information storage circuit 7. The classified result is passed to the screen display device 5 in the final stage and displayed by classification item.
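The classification step can be sketched as a lookup followed by grouping. The dictionary `item_of` stands in for the classification item information storage circuit 7; its contents, the function name `classify_candidates`, and the fallback label are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


def classify_candidates(
    candidates: List[Tuple[int, str]], item_of: Dict[str, str]
) -> Dict[str, List[Tuple[int, str]]]:
    """Group (rank, word) candidates under their classification item."""
    groups: Dict[str, List[Tuple[int, str]]] = {}
    for rank, word in candidates:
        item = item_of.get(word, "unclassified")  # fallback label is an assumption
        groups.setdefault(item, []).append((rank, word))
    return groups


# Hypothetical stored classification items (word -> item).
item_of = {"shitsurei": "sa", "hatsumei": "ha", "onsei": "a"}
groups = classify_candidates([(1, "shitsurei"), (2, "hatsumei")], item_of)
print(groups)
```

Keeping the rank next to each word lets the display stage mark the first and second candidates even after they have been spread across different items.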

FIG. 2 shows an example of a screen displayed on the screen display device 5.

In this example, the classification criterion is the group ("dan") of the Japanese 50-sound syllabary to which the first syllable of the word belongs: the word "hatsumei" (invention) is classified under the "ha" group, and the word "onsei" (speech) under the "a" group.
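This criterion can be sketched as a table lookup on the first kana of a word's reading. The table below, the function name `first_syllable_group`, the choice to file voiced kana under their unvoiced group, and the restriction to hiragana readings are all assumptions; the patent gives only the two examples above.

```python
# Hypothetical table: group label -> hiragana belonging to that group.
GROUPS = {
    "あ": "あいうえお",
    "か": "かきくけこがぎぐげご",
    "さ": "さしすせそざじずぜぞ",
    "た": "たちつてとだぢづでど",
    "な": "なにぬねの",
    "は": "はひふへほばびぶべぼぱぴぷぺぽ",
    "ま": "まみむめも",
    "や": "やゆよ",
    "ら": "らりるれろ",
    "わ": "わをん",
}

# Invert the table once: kana -> group label.
GROUP_OF = {kana: label for label, members in GROUPS.items() for kana in members}


def first_syllable_group(reading: str) -> str:
    """Return the group label for the first kana of a hiragana reading."""
    if not reading:
        raise ValueError("empty reading")
    return GROUP_OF.get(reading[0], "other")  # non-hiragana input falls back to "other"


print(first_syllable_group("はつめい"))  # reading of "invention"
print(first_syllable_group("おんせい"))  # reading of "speech"
```

Other criteria, such as word category or length, could be plugged in at the same point, since the later display stage only needs a label per candidate.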

FIG. 2 shows an example in which the candidate recognition results for the input speech "hatsumei" (invention) are displayed on the screen display device 5. In this example, the word "shitsurei" was recognized as the first-ranked candidate, and the correct word "hatsumei" was recognized as the second-ranked candidate.

That is, the first-ranked candidate is displayed with a double underline, and the second-ranked candidate is displayed with a single underline.
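A plain-text rendering of such a display might look like the sketch below. Since plain text cannot underline, the markers `[=]` (standing for the double underline) and `[-]` (standing for the single underline) are stand-ins chosen for this example, and the function name `render` and the third candidate are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed text stand-ins for the double and single underline.
RANK_MARK = {1: "[=]", 2: "[-]"}


def render(groups: Dict[str, List[Tuple[int, str]]]) -> str:
    """Render one line per classification item, marking ranks 1 and 2."""
    lines = []
    for item, entries in groups.items():
        cells = [word + RANK_MARK.get(rank, "") for rank, word in entries]
        lines.append(f"{item}: " + "  ".join(cells))
    return "\n".join(lines)


# Candidates from the FIG. 2 example plus one hypothetical third candidate.
screen = render({
    "さ": [(1, "しつれい")],
    "は": [(2, "はつめい"), (3, "はつおん")],
})
print(screen)
```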

The conventional speech recognition device shown in FIG. 3 yields only the first-ranked result, so when a recognition error occurs the same word must be spoken again. With the speech recognition device of the present invention shown in FIG. 1, by contrast, the user can pick the intended word from the information displayed as in FIG. 2, using input means such as a touch panel on the screen of the screen display device 5, key input, or a mouse. The user can thus access the intended word in a short time, which improves operability.

Furthermore, when speech recognition is used as the input for information retrieval or the like, a vocabulary size exceeding the practical limit of speech recognition performance may be required. Even in such a case, with the speech recognition device of the present invention the probability that the intended word is among the obtained candidates is higher than with the conventional method of displaying a single recognition result, so the vocabulary size handled can be set larger than before.
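The probability argument can be written compactly. As an illustrative assumption (these symbols do not appear in the original), let $p_k$ be the probability that the intended word is ranked $k$-th by the matching stage and let $N$ be the number of candidates displayed. Because the events "ranked $k$-th" are mutually exclusive:

```latex
P(\text{intended word among the top } N)
  = \sum_{k=1}^{N} p_k
  \;\ge\; p_1
  = P(\text{intended word ranked first})
```

The inequality is strict whenever any of $p_2, \dots, p_N$ is nonzero, which is the case in which displaying several candidates helps.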

Effects of the Invention

As described above, the present invention comprises: speech recognition means for outputting a plurality of word candidates or sentence candidates as recognition results; candidate classification means for classifying the plurality of word candidates or sentence candidates output by the speech recognition means by classification item; and a screen display device that outputs and displays the plurality of word candidates or sentence candidates by classification item, based on the result of the classification by the candidate classification means.

With this configuration, a plurality of recognition results are displayed as candidates, grouped by classification item. If the correct word is among the candidates, the speaker can pick it out from the displayed candidates and thus access the intended word in a short time.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing a speech recognition device according to an embodiment of the present invention. FIG. 2 is a front view of the screen display device, illustrating the operation of the speech recognition device according to the embodiment. FIG. 3 is a block diagram showing a conventional speech recognition device.

1: speech analysis circuit
2: pattern matching circuit
3: candidate selection circuit
4: candidate classification circuit
5: screen display device
6: standard pattern storage circuit
7: classification item information storage circuit

Claims

A speech recognition device comprising: speech recognition means for outputting a plurality of word candidates or sentence candidates as recognition results; candidate classification means for classifying the plurality of word candidates or sentence candidates output by the speech recognition means by classification item; and a screen display device that outputs and displays the plurality of word candidates or sentence candidates for each of the classification items, based on the result of the classification by the candidate classification means.