JPS6211898A - Voice recognition equipment - Google Patents

Voice recognition equipment

Info

Publication number
JPS6211898A
Authority
JP
Japan
Prior art keywords
syllable
dictionary
continuous
input
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP60150179A
Other languages
Japanese (ja)
Inventor
義典 北原
薮内 繁
大島 義光
正博 阿部
武市 宣之
遠藤 裕英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to JP60150179A
Publication of JPS6211898A

Abstract

(57) [Abstract] This publication contains application data that predates electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

[Field of Application of the Invention]

The present invention relates to a method that recognizes speech uttered word by word or phrase by phrase and selects the correct sequence from among multiple candidate sequences by using a dictionary that stores the continuous syllable patterns, such as two-syllable and three-syllable patterns, that can occur in the language. More specifically, it relates to improving the performance of that method.

[Background of the Invention]

As described in the Information Processing Society of Japan National Convention, IH-6 (March 1983), and in the Proceedings of the Acoustical Society of Japan, 2-7-4 (October 1982), the conventional method uses a dictionary that stores the continuous syllable patterns, such as two-syllable and three-syllable patterns, that can occur in Japanese, and selects the correct sequence from the multiple candidate sequences obtained by speech recognition. However, it is difficult to collect every continuous syllable pattern that can occur in Japanese, and performance on newly coined words was not considered.

[Object of the Invention]

An object of the present invention is to provide a syllable recognition device that achieves high recognition performance even for newly coined words.

[Summary of the Invention]

To achieve the above object, when the recognition result for the input speech is incorrect, the present invention accepts the correct sequence as input, decomposes that sequence into continuous syllable patterns such as two-syllable and three-syllable patterns, and compares each pattern with the continuous syllable patterns already stored in the dictionary. Any pattern that is not present in the dictionary is additionally registered in it.

[Embodiment of the Invention]

An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of an example speech recognition device used to explain the invention in detail. Input word or phrase speech is matched against the phoneme/syllable standard patterns 3 in the acoustic recognition unit 1, which forms and outputs a lattice of multiple candidates for each syllable.

A known technique can be used to implement the acoustic recognition unit 1 (see, for example, Yajima et al., "Recognition Method for a Japanese Speech Input Device," IECE National Convention, 1602 (March 1985)).

FIG. 2 is an example of the candidate lattice produced when the acoustic recognition unit 1 recognizes the phrase 「カンガエテイル」 (kangaeteiru).

The candidate lattice consists of candidate syllables and their acoustic similarities. A larger acoustic similarity value indicates that the candidate is acoustically closer to the input syllable. For each syllable position, the candidates are arranged in descending order of acoustic similarity. This candidate lattice is sent to the candidate sequence selection unit 2.

Combining syllables from this lattice yields 6,400 candidate phrase sequences. FIG. 3(a) shows the ten sequences with the highest sums of acoustic similarity.
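
The ranking step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the lattice contents, the similarity values, the romanized syllable labels, and the function name `top_sequences` are all hypothetical. The sketch enumerates every combination of one candidate per syllable position (so the number of sequences is the product of the per-position candidate counts) and keeps the k sequences with the largest similarity sums.

```python
from heapq import nlargest
from itertools import product

# Hypothetical lattice: one list per syllable position; each entry is
# (candidate syllable, acoustic similarity). Larger means more similar.
lattice = [
    [("ka", 90), ("ga", 70)],
    [("n", 85), ("mu", 60)],
    [("ga", 80), ("ka", 75)],
]

def top_sequences(lattice, k=10):
    """Return the k candidate sequences with the largest similarity sums."""
    scored = (
        (sum(score for _, score in path), [syl for syl, _ in path])
        for path in product(*lattice)  # one candidate per position
    )
    # The key compares only the similarity sum, never the syllable lists.
    return nlargest(k, scored, key=lambda item: item[0])

best = top_sequences(lattice, k=3)
```

Exhaustive enumeration is only practical for short phrases with few candidates per position; a real system would likely prune the search, which this sketch does not attempt.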

At this stage, the correct sequence 「カンガエテイル」 (kangaeteiru) ranks sixth. The candidate sequence selection unit 2 narrows down these candidate sequences using the continuous syllable dictionary 4. As shown in FIG. 3(b), the continuous syllable dictionary 4 stores two-syllable sequences that can occur in Japanese, such as 「カン」 (kan) and 「エテ」 (ete). These may also be stored separately by position: phrase-initial, phrase-medial, and phrase-final. The patterns are not limited to two syllables and may be n syllables long (n ≥ 2). Patterns that cannot occur in Japanese, such as 「テピ」 (tepi) and 「ガペ」 (gape), are not stored in the continuous syllable dictionary 4. First, each candidate sequence in FIG. 3(a) is decomposed, one sequence at a time, into continuous syllables.

For example, 「カンガエテピル」 (kangaetepiru) is decomposed into 「カン」 (kan), 「ンガ」 (nga), 「ガエ」 (gae), 「エテ」 (ete), 「テピ」 (tepi), and 「ピル」 (piru). Next, these six continuous syllables are checked against the entries of the continuous syllable dictionary 4. If all six are present in the dictionary, the candidate sequence is stored, together with its acoustic similarity sum, in the candidate sequence output memory 2′. If even one of the six is absent from the dictionary, the candidate sequence is not stored in the candidate sequence output memory 2′. Through this operation, the three sequences in FIG. 3(c) are stored as recognition candidates, together with their acoustic similarity sums, in the candidate sequence output memory 2′, and the contents of that memory are shown on the display device 5.
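
The decomposition and dictionary check described above can be sketched as follows. This is an illustrative sketch only: the function names `decompose` and `select_candidates`, the romanized syllable labels, the dictionary contents, and the similarity sums are assumptions made for the example, and a Python set stands in for the continuous syllable dictionary 4. A candidate is kept only if every one of its n-syllable patterns is found in the dictionary.

```python
def decompose(syllables, n=2):
    """Split a syllable sequence into its overlapping n-syllable patterns."""
    return [tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]

def select_candidates(candidates, dictionary, n=2):
    """Keep (score, syllables) candidates whose patterns are all in the dictionary."""
    return [
        (score, syls)
        for score, syls in candidates
        if all(pattern in dictionary for pattern in decompose(syls, n))
    ]

# Hypothetical dictionary of two-syllable patterns (romanized kana).
dictionary = {
    ("ka", "n"), ("n", "ga"), ("ga", "e"),
    ("e", "te"), ("te", "i"), ("i", "ru"),
}

# Hypothetical candidates with made-up acoustic similarity sums.
candidates = [
    (610, ["ka", "n", "ga", "e", "te", "pi", "ru"]),  # contains "te pi", "pi ru"
    (600, ["ka", "n", "ga", "e", "te", "i", "ru"]),
]

selected = select_candidates(candidates, dictionary)
```

Here the higher-scoring candidate is rejected because the patterns ("te", "pi") and ("pi", "ru") are not in the dictionary, so only the second candidate survives. Passing a different `n` covers the n-syllable case (n ≥ 2) mentioned in the text.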

When the input speech contains a newly coined word, the continuous syllable patterns stored in the continuous syllable dictionary 4 may be insufficient, and the output of the candidate sequence selection unit 2 may not include the correct sequence. FIG. 4(a) shows an example of the candidate sequences generated from the candidate lattice output by the acoustic recognition unit 1 when 「ワードプロセッサハ」 ("word processor" followed by the particle ハ) is recognized. FIG. 4(b) is an example of the continuous syllable dictionary 4. With this dictionary, the output of the candidate sequence selection unit 2 is as shown in FIG. 4(c), and it does not contain the correct sequence 「ワードプロセッサハ」. The cause is that the continuous syllable 「ドプ」 (dopu) is not present in the continuous syllable dictionary 4; unless 「ドプ」 is registered, 「ワードプロセッサハ」 will never be recognized.

To solve this, when the output sequences of the candidate sequence selection unit 2 are shown on the display device 5 and the correct sequence is not among them, the correct sequence is entered using a means such as the keyboard 7. The continuous syllable learning registration unit 6 decomposes the entered correct sequence into continuous syllables. In the example above, it is decomposed into eight two-syllable patterns: 「ワー」, 「ード」, 「ドプ」, 「プロ」, 「ロセ」, 「セッ」, 「ッサ」, and 「サハ」. Next, these eight input continuous syllables are loaded one at a time into the input continuous syllable buffer memory 6′, the entries of the continuous syllable dictionary 4 are loaded one at a time into the dictionary continuous syllable buffer memory 6″, and the contents of the two buffer memories are compared. If the contents do not match, the next entry of the continuous syllable dictionary 4 is loaded into the dictionary continuous syllable buffer memory 6″ and compared with the input continuous syllable buffer memory 6′. If the contents match, the next input continuous syllable is loaded into the input continuous syllable buffer memory 6′, the entries of the continuous syllable dictionary 4 are again loaded one at a time into the dictionary continuous syllable buffer memory 6″, and the two buffer memories are compared.

This procedure is repeated, and any input continuous syllable that matches none of the entries in the continuous syllable dictionary 4 is additionally registered in the dictionary as a new entry. In the example above, 「ドプ」 is registered as a new entry.
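
The learning and registration step can be sketched as follows. This is a hedged illustration rather than the patent's implementation: the text describes comparing each input pattern with the dictionary entries one at a time through two buffer memories, whereas this sketch swaps in a Python set membership test, which gives the same result. The function names, the romanized syllable labels ("-" for the long-vowel mark and "Q" for the geminate mark), and the initial dictionary contents are all assumptions.

```python
def decompose(syllables, n=2):
    """Split a syllable sequence into its overlapping n-syllable patterns."""
    return [tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]

def register_missing(correct_syllables, dictionary, n=2):
    """Add every pattern of the correct sequence that the dictionary lacks.

    Returns the list of newly registered patterns, in order of appearance.
    """
    added = []
    for pattern in decompose(correct_syllables, n):
        if pattern not in dictionary:  # stands in for the entry-by-entry comparison
            dictionary.add(pattern)
            added.append(pattern)
    return added

# Hypothetical dictionary that already holds seven of the eight patterns.
dictionary = {
    ("wa", "-"), ("-", "do"), ("pu", "ro"), ("ro", "se"),
    ("se", "Q"), ("Q", "sa"), ("sa", "ha"),
}

# Correct sequence entered by the user, one label per kana unit.
correct = ["wa", "-", "do", "pu", "ro", "se", "Q", "sa", "ha"]

added = register_missing(correct, dictionary)
```

After this call the dictionary contains ("do", "pu"), so a later recognition pass that filters candidates against the dictionary can accept the correct sequence. Calling `register_missing` again with the same sequence adds nothing.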

[Effect of the Invention]

As described above, according to the present invention, only sequences that can exist as Japanese are selected from the acoustic recognition candidate strings by using the continuous syllable dictionary, and, because new entries for the continuous syllable dictionary are learned, the correct candidate sequence can be output even for new or coined words.

[Brief Description of the Drawings]

FIG. 1 is an overall block diagram of one embodiment of the present invention. FIG. 2 is an example of the candidate lattice output by the acoustic recognition unit when the phrase 「カンガエテイル」 is input. FIG. 3(a) shows the ten sequences with the highest acoustic similarity sums among the syllable-combination sequences obtained from the candidate lattice when the phrase 「カンガエテイル」 is input; FIG. 3(b) is an example of the continuous syllable dictionary; and FIG. 3(c) shows the candidate sequences output by the candidate sequence selection unit and stored in memory. FIG. 4(a), (b), and (c) correspond to FIG. 3(a), (b), and (c), respectively, for the input phrase 「ワードプロセッサハ」.

1: acoustic recognition unit
2: candidate sequence selection unit
2′: candidate sequence output memory
3: phoneme/syllable standard patterns
4: continuous syllable dictionary
5: display device
6: continuous syllable learning registration unit
6′: input continuous syllable buffer memory
6″: dictionary continuous syllable buffer memory
7: keyboard

[Drawings: FIG. 2; FIG. 3 (a), (b), (c); FIG. 4 (a), (b), (c)]

Claims (1)

[Claims]

A speech recognition device that recognizes speech uttered word by word or phrase by phrase in units of phonemes or syllables, selects the correct sequence from among multiple recognition candidate sequences by using a dictionary that stores the continuous syllable patterns, such as two-syllable and three-syllable patterns, that can occur in the language, and converts the selected sequence into character codes or similar codes, characterized in that, when the input speech is not correctly converted into character codes, a correct sequence is input, the input sequence is decomposed into continuous syllable patterns such as two-syllable and three-syllable patterns, each such pattern is compared with the patterns in the dictionary, and any pattern that is not present in the dictionary is additionally registered in the dictionary.
JP60150179A 1985-07-10 1985-07-10 Voice recognition equipment Pending JPS6211898A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP60150179A JPS6211898A (en) 1985-07-10 1985-07-10 Voice recognition equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP60150179A JPS6211898A (en) 1985-07-10 1985-07-10 Voice recognition equipment

Publications (1)

Publication Number Publication Date
JPS6211898A true JPS6211898A (en) 1987-01-20

Family

ID=15491229

Family Applications (1)

Application Number Title Priority Date Filing Date
JP60150179A Pending JPS6211898A (en) 1985-07-10 1985-07-10 Voice recognition equipment

Country Status (1)

Country Link
JP (1) JPS6211898A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8774438B2 (en) 2010-08-09 2014-07-08 Kabushiki Kaisha Audio-Technica Microphone unit and highly directional microphone

Similar Documents

Publication Publication Date Title
US5832428A (en) Search engine for phrase recognition based on prefix/body/suffix architecture
JPH10503033A (en) Speech recognition method and device based on new word modeling
US20070016420A1 (en) Dictionary lookup for mobile devices using spelling recognition
JPS6211898A (en) Voice recognition equipment
JP2820093B2 (en) Monosyllable recognition device
JPS62165267A (en) Voice word processor device
JP2966002B2 (en) Voice recognition device
JPS6219899A (en) Sentence voice recognition equipment
JP2002189490A (en) Method of pinyin speech input
JPS6283796A (en) Voice input unit
JP2817406B2 (en) Continuous speech recognition method
JP3009709B2 (en) Japanese speech recognition method
JP3353769B2 (en) Character recognition device, character recognition method, and character recognition program recording medium
JPS61177575A (en) Forming device of japanese document
JPS63249199A (en) Voice recognition system
JPH0552506B2 (en)
JPS6283797A (en) Recognition equipment
JPS61261798A (en) Voice recognition equipment
JPH04112269A (en) Lattice searching system using difference of similarity between recognitive candidate
JPS6073592A (en) Voice recognition equipment for specific speaker
JPH0916575A (en) Pronunciation dictionary device
JP2000214881A (en) Apparatus and method for sound recognition linguistic model generation
JPS61189599A (en) Symbol comparator
JPS588379A (en) Kana (japanese syllabary)-kanji (chinese character) converting system
JPS60179835A (en) Input method of japanese voice input device