JPS6242200A

JPS6242200A - Voice recognition equipment

Info

Publication number: JPS6242200A
Application number: JP60182521A
Authority: JP
Inventors: 山岸　美奈
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1985-08-20
Filing date: 1985-08-20
Publication date: 1987-02-24

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

[Technical field]

The present invention relates to a speech recognition device.

[Prior art]

In speech recognition devices that recognize a plurality of different types of speech, it is widely known to use the same utterance interval for every type of speech. With this approach, however, the different types of speech are hard to classify, and a high recognition rate cannot be obtained.

A technique that uses the utterance interval is known from the speech-input document creation device disclosed in Japanese Patent Application Laid-Open No. 56-11404L. In that device, however, the utterance interval is used to delimit words and sentences, not to classify different types of speech. Furthermore, if different types of speech, for example words and monosyllables, are recognized by the same method, recognition accuracy deteriorates, so it is usual to provide separate recognition units even when the input unit is shared. Speech is generally classified by the length of the input speech, the spectral change in the power-stable portion, and similar features. This has the drawback that, for example, the monosyllable "re" and the word "rei" ("example"), or the syllable "kyo" and the word "kyou" ("today"), cannot be distinguished accurately.

[Object]

The present invention was made in view of the circumstances described above, and in particular aims to provide a speech recognition device with a high recognition rate.

[Constitution]

To achieve the above object, the present invention is characterized by one of the following. First, in a speech recognition device that has a speech input unit and a speech recognition unit and that recognizes a plurality of different types of speech, when the type of speech about to be input differs from the type of the immediately preceding input speech, an utterance interval is taken that differs from the utterance interval used when the type is the same as that of the immediately preceding input speech. Alternatively, in a speech recognition device comprising a sound collection unit, a detection unit that detects the speech portion of the collected signal, and a recognition unit capable of recognizing a plurality of different types of speech, the time interval between the speech to be recognized and the speech recognized before and after it is measured, and the recognition unit for the speech signal is selected according to the length of that time interval. Alternatively, in a speech recognition device comprising a sound collection unit, a detection unit that detects the speech portion of the collected signal, a recognition unit capable of recognizing different types of speech, and a display unit that displays recognition results, the time interval between the speech to be recognized and the speech recognized before and after it is measured, and one of the recognition results is selected and displayed according to the length of that time interval. The present invention is described below on the basis of embodiments.

In a speech recognition device that recognizes a plurality of different types of speech, when the type of speech about to be input differs from the type of the immediately preceding input speech, the present invention takes a longer utterance interval than the interval used when the type is the same. This makes the types of speech easier to classify and thereby raises the recognition rate.

FIG. 1 is a block diagram for explaining one embodiment of the present invention. In the figure, 1 is a speech input unit, 2 is a speech recognition unit, and 3 is an output unit. The speech recognition unit 2 consists of a word recognition unit 2A and a monosyllable recognition unit 2B, and both monosyllables and words are input from the speech input unit 1.

FIG. 2 shows the overall processing flow of the above speech recognition device. Let t1 be the utterance interval when monosyllable input or word input continues, and t2 be the utterance interval when monosyllable input and word input alternate (where t1 < t2), and suppose it is agreed that the beginning of a sentence is always input as a monosyllable. Then, for example, for the sentence "kyou wa tenki ga yoi desu" ("The weather is nice today"), when the three words "tenki", "yoi", and "desu" are registered in the word dictionary patterns, the sentence is uttered as follows:

kyo (t1) u (t1) wa (t2) tenki (t2) ga (t2) yoi (t1) desu
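As a minimal sketch of the utterance rule of this embodiment, the following Python assigns t1 between two adjacent units of the same type and t2 where the type changes. The split of the example sentence into the units kyo, u, wa, tenki, ga, yoi, desu, and every name in the code, are assumptions made for illustration and do not come from the specification.

```python
from typing import List, Tuple

MONO = "monosyllable"
WORD = "word"

def plan_intervals(units: List[Tuple[str, str]]) -> List[str]:
    """Label the pause to leave between each pair of adjacent units.

    t1: the next unit has the same type as the previous one.
    t2: the type changes (the description assumes t1 < t2).
    """
    return [
        "t1" if prev_kind == next_kind else "t2"
        for (_, prev_kind), (_, next_kind) in zip(units, units[1:])
    ]

# Hypothetical segmentation of the example sentence; "tenki", "yoi" and
# "desu" are the three words registered in the word dictionary.
SENTENCE = [
    ("kyo", MONO), ("u", MONO), ("wa", MONO),
    ("tenki", WORD), ("ga", MONO), ("yoi", WORD), ("desu", WORD),
]

if __name__ == "__main__":
    print(plan_intervals(SENTENCE))
```

The list has one entry per pause, so a sentence of seven units yields six interval labels.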

FIG. 3 shows a modification of the embodiment shown in FIG. 1. In the embodiment of FIG. 1, the beginning of a sentence is always input as a monosyllable. In this modification, a designation unit 4 is provided, and it designates whether the first input speech is a monosyllable or a word.
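Because a short pause (t1) means the speech type is unchanged and a longer pause (t2) means it has changed, the type of every utterance can be inferred from the measured pauses once the type of the first utterance is known. The sketch below illustrates that idea. The threshold, the millisecond values, and all names are assumptions, since the specification gives no numeric values. The `first_type` parameter plays the role of the designation unit 4.

```python
from typing import List

MONO = "monosyllable"
WORD = "word"

def infer_types(gaps_ms: List[float], threshold_ms: float,
                first_type: str = MONO) -> List[str]:
    """Infer the type of each utterance from the pauses between utterances.

    A pause shorter than the threshold is treated as t1 (type unchanged);
    any other pause is treated as t2 (type switches).
    """
    other = {MONO: WORD, WORD: MONO}
    types = [first_type]
    for gap in gaps_ms:
        previous = types[-1]
        types.append(previous if gap < threshold_ms else other[previous])
    return types

if __name__ == "__main__":
    # Hypothetical pauses for a sentence that starts with a monosyllable.
    gaps = [200.0, 200.0, 600.0, 600.0, 600.0, 200.0]
    print(infer_types(gaps, threshold_ms=400.0))
```

With `first_type=MONO` the function behaves like the embodiment of FIG. 1; passing `WORD` corresponds to designating a word as the first input.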

FIGS. 4 and 5 are block diagrams for explaining other embodiments of the present invention. In the figures, 11 is a sound collection unit, 12 is a detection unit, 13 is a time interval measurement unit, 14 is a word recognition unit, 15 is a monosyllable recognition unit, and 16 is an output display unit. For speech input in this case, for example, the interval from the immediately preceding utterance is t1 when a monosyllable is input and t2 when a word is input (where t1 ≠ t2).

If the example sentence used for the embodiment of FIG. 1 is input under the same conditions (three registered words), it is uttered as follows (in this example, t1 < t2):

kyo (t1) u (t1) wa (t2) tenki (t1) ga (t2) yoi (t2) desu
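In these embodiments the pause depends only on the type of the unit that follows it: t1 before a monosyllable and t2 before a word. A minimal Python sketch of that rule and of its inverse is given below. The segmentation of the sentence, the threshold, and all names are assumptions for illustration only.

```python
from typing import List, Tuple

MONO = "monosyllable"
WORD = "word"

def plan_intervals(units: List[Tuple[str, str]]) -> List[str]:
    """Label each pause by the type of the unit that follows it."""
    return ["t1" if kind == MONO else "t2" for _, kind in units[1:]]

def type_from_gap(gap_ms: float, threshold_ms: float) -> str:
    """Classify the upcoming utterance from the pause before it (t1 < t2)."""
    return MONO if gap_ms < threshold_ms else WORD

# Hypothetical segmentation of the example sentence.
SENTENCE = [
    ("kyo", MONO), ("u", MONO), ("wa", MONO),
    ("tenki", WORD), ("ga", MONO), ("yoi", WORD), ("desu", WORD),
]

if __name__ == "__main__":
    print(plan_intervals(SENTENCE))
```

Unlike a rule based on type changes, this rule needs no knowledge of the previous utterance's type, which is why the pause alone can drive the selection.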

Thus, in the embodiment shown in FIG. 4, the time interval measurement unit 13 measures the time interval between the speech to be recognized and the speech recognized before and after it, and the recognition unit for the speech signal, that is, either the word recognition unit 14 or the monosyllable recognition unit 15, is selected according to the length of that interval. In the embodiment shown in FIG. 5, one of the recognition results is selected according to the length of the measured time interval.
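The difference between the two embodiments can be sketched as follows: in the FIG. 4 style the interval chooses which recognition unit runs, while in the FIG. 5 style the interval chooses which recognition result is shown. This is only an illustration under stated assumptions. The sketch uses only the pause before the utterance, assumes t1 < t2 with a single threshold, assumes that both recognizers run in the FIG. 5 style, and uses stand-in recognizer functions that always return fixed strings. None of these details are given in the specification.

```python
from typing import Callable, List

Recognizer = Callable[[bytes], str]

def recognize_select_unit(signal: bytes, gap_ms: float, threshold_ms: float,
                          word_rec: Recognizer, mono_rec: Recognizer) -> str:
    """FIG. 4 style: pick one recognition unit, then run only that unit."""
    chosen = mono_rec if gap_ms < threshold_ms else word_rec
    return chosen(signal)

def recognize_select_result(signal: bytes, gap_ms: float, threshold_ms: float,
                            word_rec: Recognizer, mono_rec: Recognizer) -> str:
    """FIG. 5 style: run both units, then pick which result to display."""
    word_result = word_rec(signal)
    mono_result = mono_rec(signal)
    return mono_result if gap_ms < threshold_ms else word_result

# Stand-in recognizers (hypothetical); they record which unit was invoked.
calls: List[str] = []

def word_rec(signal: bytes) -> str:
    calls.append("word")
    return "rei"

def mono_rec(signal: bytes) -> str:
    calls.append("mono")
    return "re"

if __name__ == "__main__":
    print(recognize_select_unit(b"", 200.0, 400.0, word_rec, mono_rec))
    print(recognize_select_result(b"", 600.0, 400.0, word_rec, mono_rec))
```

Selecting the unit first avoids running the recognizer that is not needed, while selecting among results keeps both candidates available until the display step.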

[Effects]

As is clear from the above description, according to the present invention, a device that recognizes a plurality of different types of speech can classify the types of speech without a special circuit for that purpose, so a low-cost, high-performance recognition device can be obtained.

[Brief description of the drawings]

FIG. 1 is a block diagram for explaining one embodiment of the speech recognition device according to the present invention. FIG. 2 is a flowchart for explaining the operation of the embodiment shown in FIG. 1. FIG. 3 shows a modification of the embodiment shown in FIG. 1. FIGS. 4 and 5 are block diagrams each showing another embodiment of the present invention.

1: speech input unit, 2: recognition unit, 2A: word recognition unit, 2B: monosyllable recognition unit, 3: output unit, 4: designation unit, 11: sound collection unit, 12: detection unit, 13: time interval measurement unit, 14: word recognition unit, 15: monosyllable recognition unit, 16: output display unit.

Patent applicant: Ricoh Co., Ltd.

FIG. 1, FIG. 2, FIG. 3, FIG. 4, FIG. 5

Procedural amendment

1985 (Showa 60) Patent Application No. 182521
2. Title of the invention: Speech recognition device
3. Person making the amendment
Relationship to the case: Patent applicant
Address: 1-3-6 Nakamagome, Ota-ku, Tokyo
Name: (674) Ricoh Co., Ltd.
Representative: 浜　１）　広
4. Agent
Address: 1-2-7 Furo-cho, Naka-ku, Yokohama 231
6. Subject of the amendment: the "Detailed description of the invention" section of the specification
7. Content of the amendment: on page 5, line 1 of the specification, "by taking a longer utterance interval" is amended to "by making the utterance interval longer (or shorter)".

Claims

[Claims]

(1) A speech recognition device that has a speech input unit and a speech recognition unit and that recognizes a plurality of different types of speech, characterized in that, when the type of speech to be input differs from the type of the immediately preceding input speech, an utterance interval is taken that differs from the utterance interval used when the type is the same as that of the immediately preceding input speech.

(2) The speech recognition device according to claim (1), characterized in that the type of the first input speech is designated.

(3) A speech recognition device comprising a sound collection unit, a detection unit that detects the speech portion of the collected signal, and a recognition unit capable of recognizing a plurality of different types of speech, characterized in that the time interval between the speech to be recognized and the speech recognized before and after it is measured, and the recognition unit for the speech signal is selected according to the length of that time interval.

(4) A speech recognition device comprising a sound collection unit, a detection unit that detects the speech portion of the collected signal, a recognition unit capable of recognizing different types of speech, and a display unit that displays recognition results, characterized in that the time interval between the speech to be recognized and the speech recognized before and after it is measured, and one of the recognition results is selected and displayed according to the length of that time interval.