JPS60203997A

JPS60203997A - Speaker shift system for voice recognition equipment

Info

Publication number: JPS60203997A
Application number: JP59061546A
Authority: JP
Inventors: 松丸　宏; 好高久間
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1984-03-28
Filing date: 1984-03-28
Publication date: 1985-10-15

Abstract

(57) [Summary] This publication contains application data that predates electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

[Field of application of the invention]

The present invention relates to an improvement of the speaker change method in a speech recognition device, and in particular to a speaker change method suitable for a speaker-dependent speech recognition device.

[Background of the invention]

Speech recognition technology is broadly divided into two types: speaker-independent recognition and speaker-dependent recognition. Speaker-independent recognition is of course the more desirable, but with current technology only digits can be recognized reliably, so the usable vocabulary is limited and the range of applications is extremely narrow. Speaker-dependent recognition, on the other hand, has reached a technically practical level but has the drawback of high cost. Therefore, so that many people can use the same recognition device, the system must be configured so that dozens of people can use it even though it is speaker-dependent. The capability of switching from one speaker to another is called the speaker change function.

The conventional method of changing speakers has been to physically replace the memory in the recognition device that stores the speaker's speech recognition patterns, that is, to swap in a floppy disk containing the new speaker's own voice patterns. With this method, however, the operator has to exchange floppy disks every time the speaker changes, which makes operation cumbersome and inconvenient.

[Purpose of the invention]

An object of the present invention is to provide a speaker change method for a speech recognition device that allows the speaker to be changed simply and reliably.

[Summary of the invention]

The present invention is characterized by comprising a speech recognition unit; a recognition pattern memory that stores recognition patterns set for each speaker in order to identify multiple kinds of speech from multiple speakers; an operating means that generates a machine output signal corresponding to each speaker; and a speaker identification means that receives this machine output signal and identifies the speaker it corresponds to. Based on the speaker identification result, the above patterns are switched to those of the identified speaker.

This makes it possible to change speakers in an extremely simple manner.

In the most desirable embodiment, the output of a speech synthesis means is used as the machine output signal, and the speech recognition unit itself also serves as the speaker identification means.

This makes it possible to change speakers reliably without requiring any special additional equipment.

[Embodiments of the invention]

FIG. 1 shows an embodiment of the present invention.

The voice input/output fixed station 1 and the voice input mobile station 2 are linked by radio.

The voice input/output fixed station 1 consists of an antenna 3a, a radio 4a, a speech recognition device 5, a speech synthesizer 10, a processing device 7, and a speech recognition pattern external memory 8. The voice input mobile station 2 consists of an antenna 3b, a radio 4b, a speech synthesizer 11, a speaker change console 12, headphones 14, and a microphone 13.

In FIG. 1, there are five speakers, A to E. Information for each person, such as an identification number, code, and name, is stored in the memory of the speech synthesizer 11 in the voice input mobile station 2.

The recognition patterns used to recognize that synthesized output are loaded from the speech recognition pattern external memory 8 into the speaker recognition pattern memory 6a in the speech recognition device 5 of the voice input/output fixed station 1, where they remain resident. To change speakers, the operator sets the rotary switch 15 on the speaker change console 12 to his or her own name and presses the change interrupt button 16.

This interrupt causes the speech synthesizer 11 to synthesize and output the information corresponding to that speaker, for example the name. The resulting machine output signal, for example the synthesized speech, passes through the radio 4b and the antenna 3b, reaches the antenna 3a and the radio 4a of the voice input/output fixed station 1, and is received by the speech recognition device 5. The speech recognition device 5 pattern-matches the received synthesized sound (machine output signal) against the patterns in its own recognition pattern memory 6a to identify the speaker. At the same time, it sets the patterns of the words that this speaker is about to input by voice (for speaker C, for example, patterns C1 to C5) from the speech recognition pattern external memory 8 into the recognition pattern memory 6b. The speech synthesizer 10 then retrieves the recognized person's name from the speech synthesis data 9 and sends it back to the voice input mobile station 2 via the radio 4a and the antenna 3a.
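The exchange described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patent's implementation: the patent does not specify a pattern format or matching measure, so patterns are modeled as small hypothetical feature vectors, matching as nearest squared Euclidean distance, and the class and method names (`FixedStation`, `change_speaker`, `recognize`) are invented for the example. Only the roles of memories 6a, 6b, and 8 follow the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical feature vector standing in for a stored pattern.
Pattern = Tuple[float, ...]


def distance(a: Pattern, b: Pattern) -> float:
    """Squared Euclidean distance, an assumed stand-in for pattern matching."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class FixedStation:
    # Memory 6a: one resident identification pattern per speaker
    # (the pattern of that speaker's synthesized name).
    id_patterns: Dict[str, Pattern]
    # External memory 8: real-voice word patterns for every speaker.
    external_memory: Dict[str, Dict[str, Pattern]]
    # Memory 6b: word patterns of the currently active speaker only.
    active_patterns: Dict[str, Pattern] = field(default_factory=dict)
    active_speaker: Optional[str] = None

    def change_speaker(self, machine_signal: Pattern) -> str:
        """Identify the speaker from the machine output signal, then
        load that speaker's word patterns from memory 8 into memory 6b."""
        speaker = min(
            self.id_patterns,
            key=lambda s: distance(self.id_patterns[s], machine_signal),
        )
        self.active_patterns = dict(self.external_memory[speaker])
        self.active_speaker = speaker
        # The returned name is what would be synthesized and sent back
        # to the mobile station as confirmation.
        return speaker

    def recognize(self, voice_signal: Pattern) -> str:
        """Match a real-voice input against the active speaker's patterns."""
        return min(
            self.active_patterns,
            key=lambda w: distance(self.active_patterns[w], voice_signal),
        )


# Example with two speakers and two words each (all values are made up).
station = FixedStation(
    id_patterns={"A": (0.0, 0.0), "C": (1.0, 1.0)},
    external_memory={
        "A": {"A1": (0.1, 0.9), "A2": (0.9, 0.1)},
        "C": {"C1": (0.2, 0.8), "C2": (0.8, 0.2)},
    },
)
confirmed = station.change_speaker((0.9, 1.1))  # noisy copy of C's name pattern
word = station.recognize((0.25, 0.75))          # matched against C's words only
```

Keeping the identification patterns resident while swapping only the word patterns reflects the split between memories 6a and 6b in the description: the identification step must work for every speaker at any time, whereas real-voice matching only ever needs the active speaker's set.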

The synthesized sound received by the antenna 3b of the voice input mobile station 2 passes through the radio 4b to the speaker's headphones 14, and the speaker confirms that the speaker change has been carried out correctly. From then on, the speaker performs the various voice inputs through the microphone 13.

From that point on, the real-voice recognition patterns to be matched by the speech recognition unit 5 have been reliably switched to those of the relevant speaker, so adequate speech recognition is possible even at the current level of speech recognition technology.

The operation for changing speakers need not be a dedicated operation required solely for that purpose; it can also be replaced by the normal operation performed when use ends or the normal operation performed when use begins. In that case, a speaker change to the person starting to use the device always takes place at the start of use.

The term "machine output signal" is used to distinguish it from an output signal corresponding to a real voice. Besides the speech synthesis output of the above embodiment, it may be, for example, a frequency signal in which a different frequency is assigned to each speaker. In that case too, the speech recognition unit can also be used for speaker identification.
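As an illustration of the frequency-signal variant, the sketch below identifies a speaker from a single tone. The patent gives no frequencies, sample rate, or detection method, so the values in `SPEAKER_TONES`, the 8000 Hz sample rate, and the use of the Goertzel algorithm are all assumptions chosen for the example.

```python
import math

# Assumed values for illustration only; none of these appear in the patent.
SAMPLE_RATE = 8000
SPEAKER_TONES = {"A": 600.0, "B": 800.0, "C": 1000.0, "D": 1200.0, "E": 1400.0}


def goertzel_power(samples, freq, sample_rate=SAMPLE_RATE):
    """Signal power at one frequency, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def identify_speaker(samples):
    """Return the speaker whose assigned tone carries the most power."""
    return max(
        SPEAKER_TONES,
        key=lambda s: goertzel_power(samples, SPEAKER_TONES[s]),
    )


def tone(freq, n=400):
    """Generate n samples of a pure sine tone at the given frequency."""
    return [math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


speaker = identify_speaker(tone(1000.0))  # the tone assigned to speaker C
```

A per-frequency detector like this is one simple way to tell a small set of fixed tones apart; in the arrangement the text describes, the speech recognition unit itself would instead match the received tone against stored speaker identification patterns.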

It goes without saying that various other electrical signals can also be used.

[Effect of the invention]

According to the present invention, speakers can be changed simply and reliably, which reduces the possibility that the speech recognition device becomes unusable because a speaker change cannot be made.

[Brief explanation of the drawing]

The figure is a block diagram of a speaker change system according to an embodiment of the present invention.

Claims

[Claims]

1. A speaker change system for a speech recognition device, comprising: means for inputting a signal corresponding to the real voice of a speaker to a speech recognition unit; the speech recognition unit, which recognizes the input signal by pattern matching; a real-voice recognition pattern memory unit that stores recognition patterns set for each speaker in order to identify the speakers' voices; operating means that generates a machine output signal corresponding to each speaker; and speaker identification means that identifies the speaker corresponding to the machine output signal, wherein the pattern of the speech recognition unit is switched to the recognition pattern of the corresponding speaker in response to the speaker identification.

2. The speaker change system for a speech recognition device according to item 1, wherein the speaker identification means is the speech recognition unit, which compares the machine output signal with a speaker identification pattern set for each speaker.

3. The speaker change system for a speech recognition device according to item 2, wherein the machine output signal is the output of a speech synthesis means that generates a different speech signal for each speaker.

4. The speaker change system for a speech recognition device according to item 2, wherein the machine output signal is a frequency signal in which a different frequency is assigned to each speaker.