JPH01267700A

JPH01267700A - Speech processor

Info

Publication number: JPH01267700A
Application number: JP63095693A
Authority: JP
Inventors: Shunji Tanaka; 田中　俊二
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-04-20
Filing date: 1988-04-20
Publication date: 1989-10-25

Abstract

PURPOSE: To obtain smooth, noise-free speech by editing the output of an inverse filter at each pitch period output by a pitch extractor, leading the edited output to a synthesis filter, and determining the characteristics of the inverse filter and the synthesis filter from the output of a spectrum analyzer. CONSTITUTION: The output of the inverse filter 8 is edited at each pitch period, which is the output of the pitch extractor 6, and led to the synthesis filter 10. The characteristics of the inverse filter 8 and the synthesis filter 10 are determined by the output of the spectrum analyzer 7. The output of an editor 9 is input to the synthesis filter 10 (which, under control of the output of the spectrum analyzer 7, has the inverse characteristics of the inverse filter 8) to obtain a speech signal in which the phoneme information is restored. The input/output speed ratio need not be an integer or the reciprocal of one, such as 2 or 1/3; it may be a value such as 1.2 or 1.1. Because the signal being edited is an uncorrelated residual signal, no noise arises. Consequently, smooth, noise-free speech is obtained.

Description

[Detailed Description of the Invention]

[Industrial Field of Application]

The present invention relates to a speech processing device that allows fast or slow listening with no change in the uttered phonemes, even when the tape speed is changed during playback on a tape recorder.

[Conventional technology]

Conventionally, one example of this type of speech processing device is the method called VSC (Variable Speech Control), which prevents changes in the phonemes by using different speeds for writing the speech waveform into memory and for reading it out.

[Problem to be solved by the invention]

The VSC method in the conventional speech processing device described above edits the waveform without regard to the pitch of the speech waveform. As a result, noise arises at the joins between waveform segments and is very harsh on the ear.

[Means to solve the problem]

The speech processing device of the present invention has two time-base clock signal input terminals, one for a first clock signal and one for a second clock signal. A pitch extractor, a spectrum analyzer, and an inverse filter, each of which receives the speech signal as input, operate on the first clock signal, and a synthesis filter operates on the second clock signal. The output of the inverse filter is edited at each pitch period, which is the output of the pitch extractor, and led to the synthesis filter. The output of the spectrum analyzer determines the characteristics of both the inverse filter and the synthesis filter.

[Effect]

In the present invention, smooth, noise-free speech is obtained by editing the signal at the residual stage in accordance with the pitch period.

[Example]

An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing one embodiment of the present invention.

In the figure, 1 is a speech input terminal. 2 is a speed signal input terminal and 3 is a read speed signal input terminal; together these form the two time-base clock signal input terminals for the first clock signal and the second clock signal. 4 is a speech output terminal.

5 is an A/D converter. 6, 7, and 8 are a pitch extractor, a spectrum analyzer, and an inverse filter, respectively, each of which receives the speech signal output by the A/D converter 5. The pitch extractor 6, the spectrum analyzer 7, and the inverse filter 8 are configured to operate on the clock signal from the speed signal input terminal 2. 9 is an editor and 10 is a synthesis filter; the synthesis filter 10 is configured to operate on the clock signal from the read speed signal input terminal 3. 11 is a D/A converter.

The output of the inverse filter 8 is edited at each pitch period, which is the output of the pitch extractor 6, and led to the synthesis filter 10. The output of the spectrum analyzer 7 determines the characteristics of the inverse filter 8 and the synthesis filter 10.

FIGS. 2 and 3 are diagrams for explaining the operation of FIG. 1. FIG. 2 shows the case where the input speed of the waveform being edited is greater than the read speed, and FIG. 3 shows the case where the input speed is less than the read speed. In each figure, (a) shows the input residual and (b) shows the output residual.

Next, the operation of the embodiment shown in FIG. 1 is described with reference to FIGS. 2 and 3.

First, the speech signal applied to the speech input terminal 1 is converted to digital form by the A/D converter 5.

The conversion rate is set by the clock signal input to the speed signal input terminal 2, which is proportional to the tape speed of the tape recorder. The digitized speech is then led to the pitch extractor 6, the spectrum analyzer 7, and the inverse filter 8. The pitch extractor 6, the spectrum analyzer 7, and the inverse filter 8 operate on the same timing clock as the A/D converter 5.

Next, the inverse filter 8 uses the spectrum signal obtained by the spectrum analyzer 7 to produce a residual signal.
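As a hedged illustration of this inverse-filtering step, the sketch below assumes that the spectrum analyzer supplies linear-prediction coefficients, which the source does not state. The function name `inverse_filter` and the coefficient list `coeffs` are hypothetical. Under that assumption, the residual is the part of each sample that the preceding samples fail to predict.

```python
def inverse_filter(signal, coeffs):
    """Return the residual e[n] = s[n] - sum_k coeffs[k-1] * s[n-k].

    Assumption (not from the source): the spectrum analyzer output is a
    list of linear-prediction coefficients `coeffs`.
    """
    residual = []
    for n, sample in enumerate(signal):
        predicted = 0.0
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:  # samples before the start are treated as zero
                predicted += a * signal[n - k]
        residual.append(sample - predicted)
    return residual


# A decaying signal that a one-tap predictor (0.5) explains exactly,
# so only the first sample survives in the residual.
signal = [1.0, 0.5, 0.25, 0.125]
residual = inverse_filter(signal, [0.5])
```

In this toy case the residual is a single impulse, which illustrates why the residual carries little of the spectral (phoneme) structure of the input.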

This residual signal is led to the editor 9 and edited in units of the pitch period obtained by the pitch extractor 6.

Suppose a clock signal of n kHz is input from the speed signal input terminal 2 and a clock signal of m kHz is input from the read speed signal input terminal 3. When n > m, one pitch period is cut out of the residual signal input to the editor 9, stretched by a factor of n/m, rejoined, and output. FIG. 2 shows the case n = 2m. In FIG. 2, (イ) marks the portion that is discarded.
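The n = 2m case can be sketched as follows. This is a minimal sketch under stated assumptions: the pitch period is taken to be a constant number of samples (a real pitch extractor would supply a varying period), and the helper names `split_into_periods` and `keep_every_other_period` are hypothetical.

```python
def split_into_periods(residual, period_length):
    """Cut the residual into consecutive pitch periods of equal length.

    Assumption: a constant pitch period in samples. Trailing samples that
    do not fill a whole period are ignored.
    """
    last_start = len(residual) - period_length
    return [residual[i:i + period_length]
            for i in range(0, last_start + 1, period_length)]


def keep_every_other_period(residual, period_length):
    """Editing for n = 2m: keep one pitch period, discard the next."""
    periods = split_into_periods(residual, period_length)
    kept = periods[::2]
    return [sample for period in kept for sample in period]


# Four pitch periods of three samples each; the leading value tags the period.
residual = [1, 0, 0, 2, 0, 0, 3, 0, 0, 4, 0, 0]
edited = keep_every_other_period(residual, 3)
```

The kept samples are then read out at m kHz, half the input rate, so each kept period lasts n/m = 2 times as long and the output occupies the same playback time as the input.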

The output of the editor 9 is then input to the synthesis filter 10 (which, under control of the output of the spectrum analyzer 7, has the inverse characteristics of the inverse filter 8), and a speech signal with the phoneme information restored is obtained. When n > m, fast speech is output without a rise in pitch.
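The synthesis step can be illustrated with an all-pole recursion. This is a sketch under an assumption the source does not state, namely that the spectrum analyzer output is a list of linear-prediction coefficients; the name `synthesis_filter` and the parameter `coeffs` are hypothetical.

```python
def synthesis_filter(residual, coeffs):
    """Rebuild s[n] = e[n] + sum_k coeffs[k-1] * s[n-k].

    Assumption (not from the source): `coeffs` are linear-prediction
    coefficients, so this recursion undoes a prediction-error filter
    that uses the same coefficients.
    """
    output = []
    for n, e in enumerate(residual):
        sample = float(e)
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:  # outputs before the start are treated as zero
                sample += a * output[n - k]
        output.append(sample)
    return output


# A single residual impulse is shaped back into a decaying waveform.
speech = synthesis_filter([1.0, 0.0, 0.0, 0.0], [0.5])
```

Because the spectral shaping is reapplied after editing, the editor only ever joins residual segments, not the shaped waveform itself.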

Conversely, the case n < m corresponds to a tape speed slower than standard. In this case, as shown in FIG. 3, the input residual (a) is shortened by a factor of n/m, duplicated, and led to the synthesis filter 10.

FIGS. 2 and 3 show input/output speed ratios of 2 and 1/3, respectively, but the ratio need not be an integer or the reciprocal of one; a value such as 1.2 or 1.1 is also acceptable.
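The source does not say how pitch periods are chosen for a non-integer ratio. One possible rule, offered purely as an assumption, is to take for output period j the input period with index floor(j × ratio), where ratio is the input/output speed ratio n/m. The function name `select_periods` is hypothetical.

```python
import math
from fractions import Fraction


def select_periods(num_periods, ratio):
    """Return the input pitch-period index used for each output period.

    Assumed rule (not from the source): output period j reuses input
    period floor(j * ratio). A ratio above 1 skips periods; a ratio
    below 1 repeats them.
    """
    ratio = Fraction(ratio)
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    indices = []
    j = 0
    while math.floor(j * ratio) < num_periods:
        indices.append(math.floor(j * ratio))
        j += 1
    return indices


halved = select_periods(6, Fraction(2))        # ratio 2: every other period
tripled = select_periods(2, Fraction(1, 3))    # ratio 1/3: each period 3 times
mixed = select_periods(12, Fraction(6, 5))     # ratio 1.2: 10 of 12 periods
```

Exact fractions are used instead of floats so that a ratio such as 1.2 does not pick up rounding error in the floor computation.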

Because what is edited is the uncorrelated residual signal, no noise arises.

For convenience of explanation, an all-digital embodiment using an A/D converter and a D/A converter has been shown. However, the values need not be digital; a discrete-time, continuous-amplitude system can easily be built using an analog storage element such as a BBD.

The usual way to use the device, as a tape recorder that remains intelligible when the tape is sped up or slowed down, is to link the speed signal input terminal 2 to the tape speed and keep the read speed signal input terminal 3 fixed. If instead the speed signal input terminal 2 is kept fixed and the read speed signal input terminal 3 is made variable, a voice with an altered timbre can be produced, which allows special effects.

[Effect of the invention]

As described above, the present invention has the effect that smooth, noise-free speech can be obtained by editing the signal at the residual stage in accordance with the pitch period. It can also produce altered-sounding voices for special effects.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing one embodiment of the present invention, and FIGS. 2 and 3 are diagrams for explaining the operation of FIG. 1.

1: speech input terminal
2: speed signal input terminal
3: read speed signal input terminal
6: pitch extractor
7: spectrum analyzer
8: inverse filter
9: editor
10: synthesis filter

Patent applicant: NEC Corporation

Claims

[Claims]

A speech processing device comprising two time-base clock signal input terminals, one for a first clock signal and one for a second clock signal, wherein a pitch extractor, a spectrum analyzer, and an inverse filter, each receiving a speech signal as input, operate on the first clock signal; a synthesis filter operates on the second clock signal; the output of the inverse filter is edited at each pitch period, which is the output of the pitch extractor, and led to the synthesis filter; and the output of the spectrum analyzer determines the characteristics of the inverse filter and the synthesis filter.