JPH03175497A

JPH03175497A - Interrupting device for voice synthesis

Info

Publication number: JPH03175497A
Application number: JP1315721A
Authority: JP
Inventors: Yasushi Yamazaki; 泰山崎
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1989-12-05
Filing date: 1989-12-05
Publication date: 1991-07-30
Anticipated expiration: 2014-06-28
Also published as: JP2913310B2

Abstract

PURPOSE: To interrupt voice synthesis automatically, and restart it later, by detecting sudden noise from the difference that remains between two signals once the differences due to environmental conditions have been minimized. CONSTITUTION: So that voice synthesis can be interrupted temporarily when a synthesized voice suddenly cannot be heard because of ambient noise, either or both of the output voice of a voice synthesizer 1 and the synthesized voice collected through a microphone are passed through a non-recursive (FIR) filter that incorporates the characteristics of the speaker, microphone, etc., and are thereby corrected. The two signals are compared with the differences due to environmental conditions minimized, and sudden noise is detected from the remaining difference. When the noise level is larger than a specific threshold value, a voice synthesis interruption signal (1) is generated. Consequently, the voice synthesis is interrupted automatically and restarted afterward.

Description

[Detailed Description of the Invention]

[Summary]

The invention relates to a method for temporarily interrupting speech synthesis when synthesized speech suddenly cannot be heard because of ambient noise. Its purpose is to automate the interruption and thereby improve operability, for example during operation of a parts collection system. In a speech synthesizer of, for example, the stored-synthesis (analysis-synthesis) type or the rule-synthesis type, a non-recursive (FIR) filter incorporating environmental characteristics such as those of the speaker and microphone is applied to either or both of the output speech from the synthesizer and the synthesized speech recorded through the microphone, so as to correct the signal. The two signals are compared after the difference due to environmental conditions has been minimized, and sudden noise is detected from the remaining difference. If the noise level exceeds a specific threshold, an interruption signal to the speech synthesizer is generated, speech synthesis is interrupted automatically, and speech synthesis is subsequently resumed.

[Industrial application field]

The present invention relates to a speech synthesis interruption method that temporarily interrupts speech synthesis, and later resumes it, when synthesized speech cannot be heard because of sudden noise while the user is listening to it.

Parts collection systems and similar systems that use synthesized speech are already known. When external noise such as a ringing telephone occurs, however, the synthesized speech can become inaudible and the parts can no longer be collected.

If an interrupting device could interrupt speech synthesis automatically in such a case, without manual intervention, synthesis could be interrupted even when, for example, both of the parts collector's hands are occupied, which would improve the operability of the parts collection system and the like.

[Prior art and problems to be solved by the invention]

FIG. 9 is a diagram illustrating a conventional speech synthesis interruption method.

Conventionally, in a parts collection system or the like, when the synthesized speech from the speech synthesizer 1 became temporarily inaudible because of ambient noise or the like, the user interrupted speech synthesis by manually operating the interrupt switch 3.

Consequently, a user who was working with both hands had to stop that work in order to operate the interrupt switch, which made for poor operability.

In view of the above drawbacks, an object of the present invention is to automate the interruption of speech synthesis that was conventionally performed with an interrupt switch, by providing a device that detects noise likely to prevent the user from hearing the synthesized speech and automatically interrupts speech synthesis.

[Means to solve the problem]

The above problem is solved by a speech synthesis interrupting device configured as follows.

(1) The device comprises: a microphone that picks up the synthesized speech output from the speaker of the speech synthesizer; an analog/digital converter (A/D) that converts the output of the microphone into a digital signal; an analog/digital converter (A/D) that converts the analog signal input from the speech synthesizer to the speaker into a digital signal; a signal correction section that delays the output of the analog/digital converter (A/D) by a fixed time and amplifies (or attenuates) it; a difference signal power calculation section that calculates the short-time power of the difference between the output of the signal correction section and the output obtained by digitizing, in an analog/digital converter (A/D), the analog signal from the speech synthesizer or the microphone; and an interruption signal generation section that generates a speech synthesis interruption signal when the output of the difference signal power calculation section exceeds a threshold.

(2) In the above speech synthesis interrupting device, a non-recursive (FIR) filter is used as the signal correction section.

(3) In the above speech synthesis interrupting device, the analog/digital converter (A/D) for the signal from the speech synthesizer is omitted, and the digital input to the digital/analog converter (D/A) inside the speech synthesizer is passed directly to the signal correction section.

(4) The above speech synthesis interrupting device includes a coefficient calculation section that calculates the optimal coefficients of the non-recursive (FIR) filter from the outputs of the analog/digital converters (A/D).

(5) In the above speech synthesis interrupting device, in place of the signal correction section that corrects the output from the speech synthesizer, a signal correction section is used that delays the output of the microphone by a fixed time and amplifies (or attenuates) it.
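
The simplest form of signal correction named in item (1), a fixed delay followed by amplification or attenuation, might be sketched as follows. This is only an illustration under stated assumptions: the function names, the use of sample counts for the delay, and the zero padding at the start are hypothetical choices, not details given in the patent.

```python
def correct_delay_gain(x, delay, gain):
    """Delay `x` by `delay` samples and scale it by `gain`.

    Assumption: samples before the start of the signal are treated as zero,
    and the output has the same length as the input.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    kept = max(len(x) - delay, 0)          # samples that still fit after shifting
    padding = [0.0] * min(delay, len(x))   # silence inserted at the front
    return padding + [gain * v for v in x[:kept]]


def difference_signal(corrected, mic):
    """Sample-wise difference between the microphone signal and the corrected signal."""
    return [m - c for c, m in zip(corrected, mic)]
```

If the microphone signal really is just a delayed, scaled copy of the synthesizer output, the difference signal is zero everywhere; anything left over is a candidate for sudden noise.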

[Operation]

That is, according to the present invention, in order to detect noise automatically during speech synthesis, the quality of the synthesized speech itself is compared with, for example, the quality of the synthesized speech as affected by the characteristics of the surrounding environment, such as the speaker and microphone (noise included).

To that end, a filter incorporating the characteristics of the speaker, microphone, etc. is applied to the synthesized speech. The signal correction section applies this filter, and the coefficient calculation section determines its coefficients. The coefficients can be determined, for example, as follows.

One example of the signal correction section is a non-recursive (finite impulse response, FIR) filter.

Its configuration is shown in FIG. 8.

Here, let the input X_n to the signal correction section be the data of the synthesized sound itself, and let Y_n be the synthesized speech picked up by the microphone.

Noise is detected by comparing this Y_n with the output Y'_n of the signal correction section. The coefficients α_0, α_1, ..., α_p of the signal correction section are calculated by the coefficient calculation section.
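
A non-recursive (FIR) filter computes each output sample as a weighted sum of the current input sample and the p preceding ones. The following is a minimal sketch of such a signal correction step, not the patent's implementation; the names `fir_filter`, `x`, and `alpha` are hypothetical, and samples before the start of the signal are assumed to be zero.

```python
def fir_filter(x, alpha):
    """Apply a non-recursive (FIR) filter with coefficients alpha[0..p] to x.

    Output sample n is alpha[0]*x[n] + alpha[1]*x[n-1] + ... + alpha[p]*x[n-p].
    Assumption: x[k] is taken as zero for k < 0.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, a in enumerate(alpha):
            if n - k >= 0:
                acc += a * x[n - k]
        y.append(acc)
    return y
```

Feeding a unit impulse through the filter returns the coefficients themselves, which is a convenient way to check an implementation.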

This is because, to extract the true noise efficiently, the quality of the synthesized speech signals being compared must be matched under the conditions prevailing when speech synthesis starts, including the delay between the speaker and the microphone and any stationary noise.

When correcting the input signal X_n, the characteristics of the speaker and microphone and the stationary noise must be taken into account. In the coefficient determination phase, the coefficients are chosen to minimize the difference between the corrected signal Y'_n and the synthesized speech Y_n from the microphone. In other words, the coefficients are chosen to minimize everything other than the true noise, for example noise governed by environmental conditions.

From FIG. 8,

$$Y'_n = X_n \alpha_0 + X_{n-1} \alpha_1 + X_{n-2} \alpha_2 + \cdots + X_{n-p} \alpha_p .$$

The error ε is then

$$\varepsilon = \sum_n \left( Y_n - Y'_n \right)^2 = \sum_n \left( Y_n - X_n \alpha_0 - X_{n-1} \alpha_1 - \cdots - X_{n-p} \alpha_p \right)^2 .$$

The coefficients (α_0 to α_p) are determined so as to minimize this error ε. To do so, ε is partially differentiated with respect to each coefficient and the result is set to zero. That is, the coefficients (α_0 to α_p) are determined by solving the system of p+1 simultaneous equations (0) to (p):

$$\sum_n \left( Y_n - X_n \alpha_0 - X_{n-1} \alpha_1 - \cdots - X_{n-p} \alpha_p \right) \left( -X_n \right) = 0 \qquad (0)$$

$$\sum_n \left( Y_n - X_n \alpha_0 - X_{n-1} \alpha_1 - \cdots - X_{n-p} \alpha_p \right) \left( -X_{n-1} \right) = 0 \qquad (1)$$

$$\vdots$$

$$\sum_n \left( Y_n - X_n \alpha_0 - X_{n-1} \alpha_1 - \cdots - X_{n-p} \alpha_p \right) \left( -X_{n-p} \right) = 0 \qquad (p)$$
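
The system of p+1 equations (0) to (p) is linear in the coefficients, so it can be solved directly. The sketch below builds the equations from recorded samples and solves them by Gaussian elimination. It is an illustration only: the patent does not say how the equations are solved, and the function name, the choice to sum over samples n ≥ p, and the singularity tolerance are assumptions.

```python
def solve_fir_coefficients(x, y, p):
    """Least-squares FIR coefficients alpha[0..p] that map x onto y.

    Builds the p+1 linear equations
        sum_j alpha[j] * sum_n x[n-j]*x[n-i] = sum_n y[n]*x[n-i],  i = 0..p
    (the zero-gradient conditions of the squared error) and solves them.
    Assumption: the sums run over n = p .. len(x)-1 so no padding is needed.
    """
    m = p + 1
    a = [[0.0] * m for _ in range(m)]
    b = [0.0] * m
    for n in range(p, min(len(x), len(y))):
        for i in range(m):
            b[i] += y[n] * x[n - i]
            for j in range(m):
                a[i][j] += x[n - j] * x[n - i]

    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:  # hypothetical tolerance
            raise ValueError("equations are singular; use a richer test signal")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]

    # Back substitution.
    alpha = [0.0] * m
    for i in reversed(range(m)):
        s = b[i] - sum(a[i][j] * alpha[j] for j in range(i + 1, m))
        alpha[i] = s / a[i][i]
    return alpha
```

When the microphone signal is an exact FIR-filtered copy of the synthesizer output, this recovers the filter coefficients; with real recordings it returns the coefficients that leave the smallest squared residual.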

With the coefficients determined in this way, the digitized version of the analog signal from the speech synthesizer can, for example, be brought close to (corrected toward) the digitized synthesized speech from the microphone, which makes the true noise easier to extract.

The corrected signal and the digitized signal from the microphone are then compared in terms of short-time power to find true noise that exceeds a specific threshold, and the result is used as the interruption signal to the speech synthesizer.
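
The comparison step can be sketched as a frame-by-frame power measurement of the difference signal followed by a threshold test. The patent does not specify the frame length, whether frames overlap, or how the threshold is chosen, so `frame_len`, the non-overlapping framing, and the function names below are assumptions made for illustration.

```python
def short_time_power(signal, frame_len):
    """Mean squared value of each consecutive, non-overlapping frame.

    Assumption: a trailing partial frame is ignored.
    """
    powers = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        powers.append(sum(v * v for v in frame) / frame_len)
    return powers


def detect_interrupt(corrected, mic, frame_len, threshold):
    """Return the index of the first frame whose difference power exceeds
    `threshold`, or None if no frame does."""
    diff = [m - c for c, m in zip(corrected, mic)]
    for index, power in enumerate(short_time_power(diff, frame_len)):
        if power > threshold:
            return index
    return None
```

A returned frame index corresponds to the moment at which the interruption signal would be raised; `None` means synthesis continues undisturbed.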

Accordingly, in the present invention, noise that arises in the surrounding environment while synthesized speech is being listened to, and that differs from the synthesized speech, is extracted automatically, and using the extracted signal as the interruption signal allows speech synthesis to be interrupted automatically.

[Embodiments]

Embodiments of the present invention will be described in detail below with reference to the drawings.

FIG. 1 shows one embodiment of the present invention, FIG. 2 is an operation flow diagram of the present invention, and FIGS. 3 to 7 show other embodiments of the present invention. The means necessary for carrying out the invention is, for example, means that detects sudden noise by comparing the short-time power difference between (a) the digitized output speech from the speech synthesizer 1 after a filter incorporating the characteristics of the environment, such as the speaker 2 and the microphone 4, has been applied to it and (b) the digitized synthesized speech picked up through the microphone 4 from the output of the speaker 2, and that uses the result as the interruption signal to the speech synthesizer 1. The same reference numerals denote the same objects throughout the drawings.

The system of the embodiment shown in FIG. 1 (Embodiment 1) consists of a speech synthesizer 1, a speaker 2, a microphone 4, and an interrupting device 5. The interrupting device 5 of the present invention consists of: an analog/digital converter (A/D) 51 that converts the synthesized sound picked up by the microphone 4, noise included (analog data), into digital data; an analog/digital converter (A/D) 52 that converts the synthesized sound itself (analog data) into digital data; a signal correction section 54 for the synthesized sound; a coefficient calculation section 53 that determines the coefficients of the signal correction section 54; a difference signal power calculation section 55 that compares the corrected synthesized sound with the noise-containing synthesized sound from the microphone 4; and an interruption signal generation section 56 that interrupts synthesis when the noise is markedly large.

The signal correction section 54 consists of an FIR filter such as that shown in FIG. 8, and its coefficients are calculated by the coefficient calculation section 53, which solves equations (0) to (p) above. The interruption signal generation section 56 compares an externally supplied threshold with the input from the difference signal power calculation section 55 and outputs the interruption signal to the speech synthesizer 1 when the input is the larger of the two.

FIG. 2 shows the operation flow. First, the coefficient calculation section 53 determines the coefficients of the signal correction section 54. To do this, the speech synthesizer 1 synthesizes and outputs a test message (for example, something like "It is fine weather today"); from the synthesized sound itself and the synthesized sound picked up by the microphone 4, coefficients suited to the surrounding environment at that time (including the speaker 2 and the microphone 4) can be determined. Once the coefficients have been determined, actual speech synthesis starts, and it is interrupted as soon as noise is detected. Speech synthesis is then resumed, for example from the utterance that was interrupted.
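
The flow just described (calibrate with a test message, synthesize, interrupt on noise, resume from the interrupted utterance) can be sketched as a small control loop. Everything named here is hypothetical: `calibrate` and `noise_detected` stand in for the coefficient calculation and the noise detector, and `max_attempts` is a safety limit added for the sketch. The patent does not state how long the device waits before resuming.

```python
def run_session(messages, calibrate, noise_detected, max_attempts=1000):
    """Play `messages` in order, replaying any message that was interrupted.

    calibrate:       callable run once before synthesis starts.
    noise_detected:  callable taking the attempt number (0-based) and
                     returning True if sudden noise occurred during it.
    Returns a log of (event, message) tuples.
    """
    calibrate()
    log = [("calibrated", None)]
    index = 0
    attempt = 0
    while index < len(messages):
        if attempt >= max_attempts:
            raise RuntimeError("noise never cleared; giving up")
        if noise_detected(attempt):
            # Interrupt, then retry the same message on the next attempt.
            log.append(("interrupted", messages[index]))
        else:
            log.append(("played", messages[index]))
            index += 1
        attempt += 1
    return log
```

Resuming at the interrupted message, rather than at the next one, is what ensures the listener does not miss an instruction that was drowned out.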

FIG. 3 shows another embodiment of the present invention (Embodiment 2). This embodiment has no analog/digital converter (A/D) 52 for converting the analog signal from the speech synthesizer 1 into a digital signal. Instead, internal data is taken from the speech synthesizer 1 in digital form and input directly to the signal correction section 54 and the coefficient calculation section 53. Compared with Embodiment 1 shown in FIG. 1, this has the advantage of saving the analog/digital conversion of the synthesized speech.

FIG. 4 shows another embodiment (Embodiment 3). In place of the coefficient calculation section 53 and the signal correction section 54 of the embodiment in FIG. 1, it uses a similar coefficient calculation section 58 and signal correction section 57.

In this scheme, the speech picked up by the microphone 4 is corrected so as to remove the speaker and microphone characteristics. The synthesized speech recorded by the microphone 4 is brought as close as possible to the synthesized speech produced by the speech synthesizer 1 itself, which makes the true noise easier to extract. This is the reverse of what the coefficient calculation section 53 and the signal correction section 54 in FIG. 1 do.

FIG. 5 shows another embodiment (Embodiment 4). As in the embodiment shown in FIG. 3 (Embodiment 2), this embodiment has no analog/digital converter (A/D) 52 for converting the analog synthesized speech from the speech synthesizer 1 into a digital signal; internal data is taken from the speech synthesizer 1 in digital form and input to the coefficient calculation section 58. It is a combination of Embodiments 2 and 3.

FIG. 6 shows another embodiment (Embodiment 5). As in Embodiment 1 shown in FIG. 1, the synthesized signal from the speech synthesizer 1 is corrected and, as in Embodiment 3 shown in FIG. 4, the synthesized speech recorded by the microphone 4 is also corrected; the difference between the two is then detected to generate the interruption signal. It is a combination of Embodiments 1 and 3.

FIG. 7 shows another embodiment (Embodiment 6). Here the analog/digital converter (A/D) 52 for the analog speech from the speech synthesizer 1 in Embodiment 5 is omitted, and internal data is taken from the speech synthesizer 1 in digital form and input directly to the signal correction section 53 and the coefficient calculation section 510. It is a combination of Embodiments 5 and 4.

As described above, various configurations are possible when the present invention is put into practice.
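
The six embodiments differ along two axes: which signal path is corrected, and whether the analog/digital converter 52 is present or the synthesizer's internal digital data is used directly. The table below restates that as data; the class and field names are hypothetical and exist only for this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Embodiment:
    number: int                 # embodiment number in the description
    figure: int                 # figure that shows it
    corrects_synth_side: bool   # synthesizer output is filtered
    corrects_mic_side: bool     # microphone signal is filtered
    has_adc_52: bool            # False: internal digital data is used directly


EMBODIMENTS = [
    Embodiment(1, 1, True, False, True),
    Embodiment(2, 3, True, False, False),
    Embodiment(3, 4, False, True, True),
    Embodiment(4, 5, False, True, False),
    Embodiment(5, 6, True, True, True),
    Embodiment(6, 7, True, True, False),
]


def without_adc_52():
    """Embodiment numbers that take digital data straight from the synthesizer."""
    return [e.number for e in EMBODIMENTS if not e.has_adc_52]
```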

In summary, the present invention temporarily interrupts speech synthesis in the speech synthesizer when synthesized speech suddenly cannot be heard because of ambient noise. It is characterized in that, for example, a filter incorporating the characteristics of the surrounding environment (speaker, microphone, etc.) is applied to the digitized output speech of the speech synthesizer to obtain synthesized speech close to that recorded by the microphone, and the short-time power difference between this signal and the digitized synthesized speech picked up through the microphone from the speaker output is compared. The influence of the surrounding environment is thereby removed, sudden noise is detected, and the result is used as the interruption signal to the speech synthesizer.

[Effect of the invention]

As described in detail above, the speech synthesis interrupting device of the present invention temporarily interrupts speech synthesis when synthesized speech suddenly cannot be heard because of ambient noise. In a speech synthesizer of, for example, the stored-synthesis (analysis-synthesis) type or the rule-synthesis type, a non-recursive (FIR) filter incorporating the characteristics of the speaker, microphone, etc. is applied to either or both of the output speech from the synthesizer and the synthesized speech recorded through the microphone to correct the signal; the two signals are compared after the difference due to environmental conditions has been minimized, and sudden noise is detected. If the noise level exceeds a specific threshold, a speech synthesis interruption signal is generated, speech synthesis is interrupted automatically, and it is subsequently resumed. As a result, noise that arises in the surrounding environment while synthesized speech is being listened to, and that differs from the synthesized speech, is extracted automatically, and using the extracted signal as the interruption signal allows speech synthesis to be interrupted automatically.

[Brief explanation of drawings]

FIG. 1 is a diagram showing one embodiment of the present invention. FIG. 2 is an operation flow diagram of the present invention. FIGS. 3 to 7 are diagrams showing other embodiments of the present invention. FIG. 8 is a diagram showing a configuration example of a non-recursive filter. FIG. 9 is a diagram illustrating a conventional speech synthesis interruption method.

In the drawings, 1 is the speech synthesizer, 2 is the speaker, 3 is the interrupt switch, 4 is the microphone, 5 is the interrupting device, 51 and 52 are analog/digital converters (A/D), 53, 58, and 510 are coefficient calculation sections, 54 and 57 are signal correction sections, 55 is the difference signal power calculation section, 56 is the interruption signal generation section, and ① is the interruption signal.

Claims

[Claims]

(1) A speech synthesis interrupting device comprising: a microphone (4) that picks up the synthesized speech output from the speaker (2) of a speech synthesizer (1); an analog/digital converter (A/D) (51) that converts the output of the microphone (4) into a digital signal; an analog/digital converter (A/D) (52) that converts the analog signal input from the speech synthesizer (1) to the speaker (2) into a digital signal; a signal correction section (54) that delays the output of the analog/digital converter (A/D) (51 or 52) by a fixed time and amplifies (or attenuates) it; a difference signal power calculation section (55) that calculates the short-time power of the difference between the output of the signal correction section (54) and the output obtained by digitizing, in an analog/digital converter (A/D) (52 or 51), the analog signal from the speech synthesizer (1) or the microphone (4); and an interruption signal generation section (56) that generates a speech synthesis interruption signal when the output of the difference signal power calculation section (55) exceeds a threshold.

(2) The speech synthesis interrupting device according to claim 1, wherein a non-recursive (FIR) filter is used as the signal correction section (54).

(3) The speech synthesis interrupting device according to claim 1, wherein the analog/digital converter (A/D) (52) is omitted and the digital input to the digital/analog converter (D/A) in the speech synthesizer (1) is passed directly to the signal correction section (54).

(4) The speech synthesis interrupting device according to claims 1 and 2, further comprising a coefficient calculation section (53) that calculates the optimal coefficients of the non-recursive (FIR) filter from the outputs of the analog/digital converters (A/D) (51, 52).

(5) The speech synthesis interrupting device according to claim 1, wherein, in place of the signal correction section that corrects the output from the speech synthesizer (1), a signal correction section (57) is used that delays the output of the microphone (4) by a fixed time and amplifies (or attenuates) it.