JPH06222794A

JPH06222794A - Voice speed conversion method

Info

Publication number: JPH06222794A
Application number: JP5009737A
Authority: JP
Inventors: Ryoji Suzuki; 良二鈴木; Masayuki Misaki; 正之三崎
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1993-01-25
Filing date: 1993-01-25
Publication date: 1994-08-12
Anticipated expiration: 2016-03-19
Also published as: JP3147562B2

Abstract

PURPOSE: To provide a voice speed conversion method that outputs natural-sounding speech with little waveform discontinuity and little loss of data by multiplying a first signal and a second signal by window functions and then adding them. CONSTITUTION: A first signal of time length T is read starting at the input pointer, followed by a second signal of time length T that immediately succeeds it. The correlation function between the first signal and the second signal is calculated, and the time delay Tc at which the correlation function is maximal is found. The first signal is multiplied by a window function whose amplitude gradually increases, shaped according to Tc, and the second signal is multiplied by a window function whose amplitude gradually decreases, likewise shaped according to Tc. The two windowed signals are shifted to the position of the time delay Tc and added. The resulting signal and a third signal that follows the first signal are then output for an interval of only α(T - Tc)/(α - 1), where α is the time-axis conversion ratio.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a voice speed conversion method that changes only the duration of speech without changing its fundamental frequency.

[0002]

[Prior Art] In recent years, voice speed conversion devices have been used for fast or slow listening to voice signals recorded on tape recorders and similar media.

[0003] A conventional voice speed conversion device of this kind is described below with reference to the drawings.

[0004] FIG. 8 shows the configuration of a conventional voice speed conversion device. In FIG. 8, 81 is an A/D converter, 82 is a buffer, 83 is a speed control circuit, 84 is a data reading circuit, 85 is a muting circuit, and 86 is a D/A converter.

[0005] The operation of the voice speed conversion device configured as above is described below.

[0006] First, the analog input signal is converted to a digital signal by the A/D converter 81 and written to the buffer 82. Next, the speed control circuit 83 controls the data reading circuit 84 according to the time-axis conversion ratio and reads data from the buffer 82. This reading method allows the playback speed to be varied. To shorten the playback time, the data to be read is thinned out in block units. To lengthen the playback time, the data to be read is repeated in block units. The discontinuities between blocks are muted by the muting circuit 85, and the D/A converter 86 converts the result to an analog signal for output.
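As a rough illustration of the block-wise reading described above, the following Python sketch thins out or repeats whole blocks to reach a target ratio. The function name `blockwise_convert`, the block-selection rule (`int(j / alpha)`), and the omission of muting are assumptions made for this sketch; the source does not say which blocks the conventional device drops or repeats.

```python
def blockwise_convert(samples, block_len, alpha):
    """Change duration by dropping or repeating whole blocks (no crossfade).

    alpha is output time / input time. The rule for picking blocks is an
    assumption of this sketch, not something the source specifies.
    """
    n_blocks = len(samples) // block_len
    n_out = int(round(n_blocks * alpha))
    out = []
    for j in range(n_out):
        # Map output block j back to an input block; clamp to the last block.
        i = min(int(j / alpha), n_blocks - 1)
        out.extend(samples[i * block_len:(i + 1) * block_len])
    return out


fast = blockwise_convert(list(range(8)), 2, 0.5)  # every other block is dropped
slow = blockwise_convert(list(range(8)), 2, 2.0)  # every block is read twice
```

The jumps at block boundaries (for example from sample 1 straight to sample 4 in `fast`) are the kind of discontinuity that the muting circuit 85 is there to hide.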

[0007] FIG. 9 shows the time-series relationship between an original sound and the sounds obtained by time-axis conversion of it. Shown schematically, (a) is the original sound, (b) is the signal converted with time-axis conversion ratio α = 0.5, and (c) is the signal converted with time-axis conversion ratio α = 2.0. Here the time-axis conversion ratio α is defined by the following equation.

[0008]

[Equation 1]

$$\alpha = \frac{\text{output time}}{\text{input time}}$$

[0009]

[Problems to Be Solved by the Invention] However, with the above configuration, when the time axis is compressed to increase the speed, data is thinned out, so consonants and the like are lost and intelligibility falls. In addition, the joints between blocks are discontinuous. Although they are muted to reduce this, the amplitude and phase remain discontinuous, so only speech lacking in naturalness can be obtained.

[0010] Other conventional voice speed conversion devices use the pitch period of the input signal, as in TDHS (Time Domain Harmonic Scaling). However, when music or noise is superimposed on the input signal, pitch extraction is difficult, so such methods cannot be applied and are not suitable.

[0011] In view of the above problems, the present invention provides a voice speed conversion method capable of outputting natural-sounding speech with little waveform discontinuity and little loss of data.

[0012]

[Means for Solving the Problems] To achieve this object, the voice speed conversion method of the present invention calculates the correlation function between a first signal of finite time length T and a second signal of finite time length T that follows the first signal, and finds the time delay Tc at which the value of the correlation function is maximal. It multiplies the first signal and the second signal respectively by window functions whose amplitudes vary complementarily in time and which are determined based on the time delay Tc. It adds the windowed first signal and the windowed second signal at the position of the time delay Tc, and outputs a third signal in continuation of the added signal. The added signal and the third signal are output for only a time length determined based on the time-axis conversion ratio α (= output time / input time), the time delay Tc, and the finite time length T. The starting points of the first signal and the second signal for the next round of processing are determined based on α, Tc, and T. By repeating all of the above processing, the method changes the playback time of the speech relative to the length of the original sound.

[0013]

[Operation] With this method, multiplying the first signal and the second signal by window functions before adding them reduces dropouts and amplitude discontinuities in the added signal. Adding the windowed first signal and the windowed second signal at the position of the time delay Tc at which the correlation function is maximal reduces phase discontinuities.

[0014] Furthermore, the signal obtained by adding the windowed first signal and the windowed second signal, together with the third signal that follows this added signal, is output for only a time length determined based on the time-axis conversion ratio α, the time delay Tc at which the correlation function is maximal, and the finite time length T. As a result, little of the signal is lost and conversion to an arbitrary speed is possible.

[0015]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0016] The present invention provides a voice speed conversion method that can output natural-sounding speech, with little discontinuity in signal amplitude and phase and with no loss of data, for time-axis conversion ratios in the range α ≥ 1.0.

[0017] Here the time-axis conversion ratio α is defined by the following equation.

[0018]

[Equation 2]

$$\alpha = \frac{\text{output time}}{\text{input time}}$$

[0019] FIG. 1 shows a flowchart of the voice speed conversion method in the first embodiment of the present invention. Its operation is described below.

[0020] First, in step 11, the input pointer is reset. Next, in step 12, a first signal (X_A) of length T is read starting at the input pointer. In step 13, T is added to the input pointer. Next, in step 14, a second signal (X_B) of length T is read starting at the input pointer.

[0021] In step 15, the correlation function between the first signal X_A and the second signal X_B is calculated, and the time delay Tc at which its value is maximal is searched for. Next, in step 16, the first signal X_A is multiplied by a window function of gradually increasing amplitude, based on the time delay Tc just found. In step 17, the second signal X_B is multiplied by a window function of gradually decreasing amplitude, likewise based on Tc. Next, in step 18, the windowed first signal and the windowed second signal are shifted to the position of the time delay Tc and then added. In step 19, the signal added in step 18 and the signal that follows the first signal X_A, namely the third signal (X_C) starting at the current input pointer, are output for an interval of only α(T - Tc)/(α - 1). Next, in step 20, (2T - αT - Tc)/(α - 1) is added to the input pointer. Processing then returns to step 12.
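The pointer arithmetic of steps 11 to 20 can be traced with a short Python sketch. It only tracks positions and lengths, not audio samples. The function name `stretch_schedule`, the example values T = 100 and α = 3/2, and the list of Tc values are assumptions chosen for illustration; exact fractions are used so the ratio can be checked without rounding.

```python
from fractions import Fraction


def stretch_schedule(T, alpha, tcs):
    """Trace the input pointer through steps 12-20 for a list of Tc values.

    Returns (plan, pointer): plan holds (start of X_A, output length) per
    pass, and pointer is the input pointer after the last pass.
    """
    pointer = Fraction(0)                                  # step 11
    plan = []
    for tc in tcs:
        start_a = pointer                                  # step 12: X_A starts here
        pointer += T                                       # step 13: X_B and X_C start here
        out_len = alpha * (T - tc) / (alpha - 1)           # step 19: length written out
        plan.append((start_a, out_len))
        pointer += (2 * T - alpha * T - tc) / (alpha - 1)  # step 20
    return plan, pointer


plan, end = stretch_schedule(100, Fraction(3, 2), [0, 10, -10])
total_out = sum(length for _, length in plan)
```

With these example values the three passes write 300, 270, and 330 samples while the pointer advances 600 samples in total, so the output-to-input ratio is 900/600 = 3/2.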

[0022] FIG. 2 shows a flowchart of the processing in step 15 of FIG. 1, which calculates the correlation function between the first signal X_A and the second signal X_B and searches for the time delay Tc at which the value of the correlation function is maximal. Its operation is described below.

[0023] First, in steps 201, 202, and 203, the time delay τ, the time delay Tc at which the correlation function is maximal, and the maximum value Rmax of the correlation function are initialized to 0. Next, in step 204, the correlation function R(τ) between the first signal X_A and the second signal X_B is calculated for non-negative time delay τ, as shown in (Equation 3).

[0024]

[Equation 3]

[0025] In step 205, if the correlation function R(τ) obtained in step 204 is not greater than the maximum value Rmax found so far, processing branches to step 208. Otherwise, step 206 updates Rmax to R(τ) and step 207 updates Tc to τ. Next, step 208 increases the time delay τ by one point. In step 209, if τ has not exceeded τmax+, processing returns to step 204, and steps 204 to 208 are repeated until τ exceeds τmax+. Once that condition is met, step 210 initializes τ to -1. Next, in step 211, the correlation function R(τ) between the first signal X_A and the second signal X_B is calculated for negative time delay τ, as shown in (Equation 4).

[0026]

[Equation 4]

[0027] In step 212, if the correlation function R(τ) obtained in step 211 is not greater than the maximum value Rmax found so far, processing branches to step 215. Otherwise, step 213 updates Rmax to R(τ) and step 214 updates Tc to τ. Next, step 215 decreases the time delay τ by one point. In step 216, if τ is not less than τmax-, processing returns to step 211, and steps 211 to 215 are repeated until τ becomes less than τmax-. Finally, step 217 outputs the time delay Tc at which the value of the correlation function is maximal.
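The search of FIG. 2 can be sketched in Python as follows. Equations 3 and 4 are not available in this text, so the `correlation` helper uses a plain unnormalized cross-correlation as a stand-in; its exact form, the direction of the lag, and the names `correlation`, `find_tc`, `tau_max_pos`, and `tau_max_neg` are assumptions of this sketch. The loop structure and the strict "greater than" update follow steps 201 to 217.

```python
def correlation(x_a, x_b, tau):
    """Assumed stand-in for Equations 3 and 4: pairs x_a[i + tau] with x_b[i]."""
    n = len(x_a)
    if tau >= 0:
        return sum(x_a[i + tau] * x_b[i] for i in range(n - tau))
    return sum(x_a[i] * x_b[i - tau] for i in range(n + tau))


def find_tc(x_a, x_b, tau_max_pos, tau_max_neg):
    """Return the lag Tc in [tau_max_neg, tau_max_pos] with the largest R(tau)."""
    tau, tc, r_max = 0, 0, 0.0              # steps 201-203
    while tau <= tau_max_pos:               # steps 204-209: non-negative lags
        r = correlation(x_a, x_b, tau)
        if r > r_max:                       # update only on a strictly larger value
            r_max, tc = r, tau
        tau += 1
    tau = -1                                # step 210
    while tau >= tau_max_neg:               # steps 211-216: negative lags
        r = correlation(x_a, x_b, tau)
        if r > r_max:
            r_max, tc = r, tau
        tau -= 1
    return tc                               # step 217


tc_pos = find_tc([0, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0], 4, -4)
tc_neg = find_tc([1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], 4, -4)
```

Because Rmax starts at 0 and is replaced only by a strictly larger value, a pair of segments with no positive correlation at any lag leaves Tc at its initial value 0.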

[0028] FIG. 3 is a schematic diagram of the processing in steps 16, 17, and 18 of FIG. 1.

[0029] FIG. 3(a) shows the case Tc = 0, (b) the case Tc > 0, and (c) the case Tc < 0, where Tc is the time delay at which the value of the correlation function is maximal. In each case, the first signal is multiplied by a window function whose amplitude gradually increases over time, the second signal is multiplied by a window function whose amplitude gradually decreases over time, and the two are added after being offset by Tc. The shape of the window functions is varied according to Tc. The time length of the added result is (T - Tc).
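One way to picture steps 16 to 18 is the Python sketch below, which produces an added signal of length T - Tc for all three cases. The source specifies the window shapes and alignment only through FIG. 3, so everything concrete here is an assumption: the linear ramp, the convention that output index t pairs `x_b[t]` with `x_a[t + tc]`, and the name `overlap_add_stretch`. It also assumes |Tc| is smaller than T.

```python
def overlap_add_stretch(x_a, x_b, tc):
    """Crossfade from x_b (fading out) into x_a (fading in), offset by tc.

    The two gains always sum to 1 (complementary windows), and the result
    has len(x_a) - tc samples. Window shape and alignment are assumptions.
    """
    t_len = len(x_a)
    out_len = t_len - tc
    start = max(0, -tc)          # first output index where x_a contributes
    end = min(t_len, out_len)    # one past the last index where x_b contributes
    span = end - start           # length of the region where both overlap
    out = []
    for t in range(out_len):
        if t < start:
            gain_a = 0.0         # only x_b is present
        elif t >= end:
            gain_a = 1.0         # only x_a is present
        else:
            gain_a = (t - start + 0.5) / span   # rising ramp for x_a
        a = x_a[t + tc] if 0 <= t + tc < t_len else 0.0
        b = x_b[t] if t < t_len else 0.0
        out.append(gain_a * a + (1.0 - gain_a) * b)
    return out


ones = [1.0] * 8
lengths = [len(overlap_add_stretch(ones, ones, tc)) for tc in (0, 2, -2)]
```

Under these assumptions the added signal starts at the beginning of X_B and ends at the end of X_A, so the third signal X_C, which follows X_A, continues from it without a jump.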

[0030] FIG. 4 schematically shows a processing example of the voice speed conversion method of the present invention described above.

[0031] FIG. 4(a) is the input signal and (b) is the output signal for time-axis conversion ratio α = 3/2. The time delay that maximizes the correlation function of X_A1 and X_B1 is T_c1 = 0, that of X_A2 and X_B2 is T_c2 > 0, and that of X_A3 and X_B3 is T_c3 < 0. The sum of the time lengths of the signal obtained by adding the first signal X_An and the second signal X_Bn, and of the third signal X_Cn that follows the first signal X_An, is α(T - T_cn)/(α - 1). It is thus determined from the time-axis conversion ratio α, the time delay T_cn at which the correlation function is maximal, and the finite time length T. The ratio of the time length of the output signal to the time length of the input signal (X_C1 + X_C2 + X_C3) equals the specified time-axis conversion ratio α (= 3/2). Because X_Cn is output unchanged and every interval of the input signal is used, no information at all is lost in the output signal.
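That the length ratio comes out as exactly α can be checked from the quantities in steps 13, 19, and 20. In the short derivation below, the symbols ΔI (input consumed per pass) and ΔO (output produced per pass) are introduced here for illustration and do not appear in the original.

```latex
\Delta I = T + \frac{2T - \alpha T - T_c}{\alpha - 1}
         = \frac{T(\alpha - 1) + 2T - \alpha T - T_c}{\alpha - 1}
         = \frac{T - T_c}{\alpha - 1},
\qquad
\Delta O = \frac{\alpha (T - T_c)}{\alpha - 1},
\qquad
\frac{\Delta O}{\Delta I} = \alpha .
```

Since the added signal has length T - Tc, the third signal contributes ΔO - (T - Tc) = (T - Tc)/(α - 1), which equals ΔI. This is consistent with the input length being written as the sum of the X_Cn segments.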

[0032] As described above, according to this embodiment, multiplying the first signal X_A by a gradually increasing window function and the second signal X_B by a gradually decreasing window function before adding them reduces amplitude discontinuities in the added signal. Adding the two windowed signals at the position of the time delay Tc at which the correlation function is maximal reduces phase discontinuities. Furthermore, by outputting the added signal and the third signal X_C that follows the first signal X_A for only a time length determined from the time-axis conversion ratio α, the time delay Tc, and the finite time length T, the input signal can easily be stretched and output, without loss of signal, for time-axis conversion ratios in the range α ≥ 1.0.

[0033] A second embodiment of the present invention is described below with reference to the drawings. The present invention provides a voice speed conversion method that can output natural-sounding speech, with little discontinuity in signal amplitude and phase and with little loss of data, for time-axis conversion ratios in the range α ≤ 1.0.

[0034] FIG. 5 shows a flowchart of the voice speed conversion method in the second embodiment of the present invention. Its operation is described below.

[0035] First, in step 51, the input pointer is reset. Next, in step 52, a first signal X_A of length T is read starting at the input pointer. In step 53, T is added to the input pointer. Next, in step 54, a second signal X_B of length T is read starting at the input pointer. In step 55, the correlation function between the first signal X_A and the second signal X_B is calculated, and the time delay Tc at which its value is maximal is searched for.

[0036] Next, in step 56, the first signal X_A is multiplied by a gradually decreasing window function, based on the time delay Tc just found. In step 57, the second signal X_B is multiplied by a gradually increasing window function, likewise based on Tc. Next, in step 58, the windowed first signal and the windowed second signal are shifted to the position of the time delay Tc and then added. In step 59, T is added to the input pointer. Next, in step 60, the signal added in step 58 and the signal that follows the second signal X_B, namely the third signal X_C starting at the current input pointer, are output for an interval of only α(T - Tc)/(1 - α). In step 61, (2αT - T - Tc)/(1 - α) is added to the input pointer. Processing then returns to step 52.
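The pointer arithmetic of steps 51 to 61 can be traced in the same spirit with a short Python sketch that tracks only positions and lengths. The function name `compress_schedule`, the example values T = 90 and α = 2/3, and the Tc values are assumptions for illustration. The quantity `xc_len` is derived here by subtracting the added-signal length (T + Tc), which the source gives for FIG. 6, from the total output length; it is not stated in the source in this form.

```python
from fractions import Fraction


def compress_schedule(T, alpha, tcs):
    """Trace the input pointer through steps 52-61 for a list of Tc values.

    Returns (plan, pointer): plan holds (start of X_A, output length,
    length taken from X_C) per pass; pointer is the final input pointer.
    """
    pointer = Fraction(0)                                      # step 51
    plan = []
    for tc in tcs:
        start_a = pointer                                      # step 52: X_A starts here
        pointer += T                                           # step 53: X_B starts here
        pointer += T                                           # step 59: X_C starts here
        out_len = alpha * (T - tc) / (1 - alpha)               # step 60: length written out
        xc_len = out_len - (T + tc)                            # derived: part taken from X_C
        plan.append((start_a, out_len, xc_len))
        pointer += (2 * alpha * T - T - tc) / (1 - alpha)      # step 61
    return plan, pointer


plan, end = compress_schedule(90, Fraction(2, 3), [0, 10, -10])
total_out = sum(out_len for _, out_len, _ in plan)
```

With these example values the three passes write 180, 160, and 200 samples while the pointer advances 810 samples, giving 540/810 = 2/3. In each pass the step-61 increment equals `xc_len`, so the next X_A begins exactly where the output from X_C stopped.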

[0037] The processing in step 55 of FIG. 5, which calculates the correlation function between the first signal X_A and the second signal X_B and searches for the time delay Tc at which the value of the correlation function is maximal, is the same as that of the first embodiment of the present invention shown in FIG. 2.

[0038] FIG. 6 is a schematic diagram of the processing in steps 56, 57, and 58 of FIG. 5.

[0039] FIG. 6(a) shows the case Tc = 0, (b) the case Tc > 0, and (c) the case Tc < 0, where Tc is the time delay at which the value of the correlation function is maximal. In each case, the first signal is multiplied by a window function whose amplitude gradually decreases over time, the second signal is multiplied by a window function whose amplitude gradually increases over time, and the two are added after being offset by Tc. The shape of the window functions is varied according to Tc. The time length of the added result is (T + Tc).

[0040] FIG. 7 schematically shows a processing example of the voice speed conversion method described above. FIG. 7(a) is the input signal and (b) is the output signal for time-axis conversion ratio α = 2/3. The time delay that maximizes the correlation function of X_A1 and X_B1 is T_c1 = 0, that of X_A2 and X_B2 is T_c2 > 0, and that of X_A3 and X_B3 is T_c3 < 0. The sum of the time lengths of the signal obtained by adding the first signal X_An and the second signal X_Bn, and of the third signal X_Cn that follows the second signal X_Bn, is α(T - T_cn)/(1 - α). It is thus determined from the time-axis conversion ratio α, the time delay T_cn at which the correlation function is maximal, and the finite time length T. The ratio of the time length of the output signal to the time length of the input signal equals the specified time-axis conversion ratio α (= 2/3). Because the entire input signal is used, as the first signal X_An, the second signal X_Bn, or the third signal X_Cn, little information is lost in the output signal.

[0041] As described above, according to this embodiment, multiplying the first signal X_A by a gradually decreasing window function and the second signal X_B by a gradually increasing window function before adding them reduces amplitude discontinuities in the added signal. Adding the two windowed signals at the position of the time delay Tc at which the correlation function is maximal reduces phase discontinuities. Furthermore, by outputting the added signal and the third signal X_C that follows the second signal X_B for only a time length determined from the time-axis conversion ratio α, the time delay Tc, and the finite time length T, the input signal can easily be compressed and output, with little loss of signal, for time-axis conversion ratios in the range α ≤ 1.0.

[0042]

[Effects of the Invention] As is clear from the above description, in the present invention the first signal and the second signal are multiplied by window functions whose amplitudes vary complementarily in time and are then added, which reduces amplitude discontinuities in the added signal. In addition, the windowed first signal and the windowed second signal are added at the position of the time delay at which the value of the correlation function is maximal, which reduces phase discontinuities.

[0043] Furthermore, the signal obtained by adding the windowed first signal and the windowed second signal, together with the third signal that follows this added signal, is output for only a time length determined based on the time-axis conversion ratio α, the time delay Tc at which the value of the correlation function is maximal, and the finite time length T. This gives the excellent effect that little of the signal is lost and conversion to an arbitrary speed is possible.

[Brief description of drawings]

[FIG. 1] Flowchart of the voice speed conversion method in the first embodiment of the present invention.

[FIG. 2] Flowchart of the correlation function calculation in the voice speed conversion method in the first embodiment of the present invention.

[FIG. 3] Schematic diagram of the window-function weighting, and of the addition at the position of the time delay at which the value of the correlation function is maximal, in the voice speed conversion method in the first embodiment of the present invention.

[FIG. 4] Schematic diagram of the input signal and output signal of the voice speed conversion method in the first embodiment of the present invention.

[FIG. 5] Flowchart of the voice speed conversion method in the second embodiment of the present invention.

[FIG. 6] Schematic diagram of the window-function weighting, and of the addition at the position of the time delay at which the value of the correlation function is maximal, in the voice speed conversion method in the second embodiment of the present invention.

[FIG. 7] Schematic diagram of the input signal and output signal of the voice speed conversion method in the second embodiment of the present invention.

[FIG. 8] Block diagram of a conventional voice speed conversion device.

[FIG. 9] Schematic diagram of the input signal and output signal of a conventional voice speed conversion device.

[Explanation of symbols]

81 A/D converter
82 Buffer
83 Speed control circuit
84 Data reading circuit
85 Muting circuit
86 D/A converter

Claims

[Claims]

1. A voice speed conversion method characterized by: calculating the correlation function between a first signal of finite time length T and a second signal of finite time length T following the first signal, and finding the time delay Tc at which the value of the correlation function is maximal; multiplying the first signal and the second signal respectively by window functions whose amplitudes vary complementarily in time and which are determined based on the time delay Tc at which the value of the correlation function is maximal; adding the first signal multiplied by its window function and the second signal multiplied by its window function at the position of the time delay Tc at which the value of the correlation function is maximal; outputting a third signal in continuation of the added signal; outputting the added signal and the third signal for only a time length determined based on the time-axis conversion ratio α (= output time / input time), the time delay Tc at which the value of the correlation function is maximal, and the finite time length T; determining the starting points of the first signal and the second signal for the next round of processing based on the time-axis conversion ratio α, the time delay Tc at which the value of the correlation function is maximal, and the finite time length T; and repeating all of the above processing, thereby changing the playback time of the speech relative to the length of the original sound.

2. A voice speed conversion method characterized by: calculating the correlation function between a first signal of finite time length T and a second signal of finite time length T following the first signal, and finding the time delay Tc at which the value of the correlation function is maximal; multiplying the first signal by a window function whose amplitude gradually increases over time and which is determined based on the time delay Tc at which the value of the correlation function is maximal; multiplying the second signal by a window function whose amplitude gradually decreases over time and which is determined based on the time delay Tc at which the value of the correlation function is maximal; adding the first signal multiplied by its window function and the second signal multiplied by its window function at the position of the time delay Tc at which the value of the correlation function is maximal; outputting a third signal in continuation of the added signal; outputting the added signal and the third signal for only the time length {α(T - Tc)/(α - 1)} (where α is the time-axis conversion ratio, output time / input time); taking as the starting point of the first signal for the next round of processing the point delayed by {(T - Tc)/(α - 1)} from the starting point of the first signal; and repeating all of the above processing, thereby changing the playback time of the speech to 1.0 times or more the length of the original sound.

3. A voice speed conversion method characterized by: calculating the correlation function between a first signal of finite time length T and a second signal of finite time length T following the first signal, and finding the time delay Tc at which the value of the correlation function is maximal; multiplying the first signal by a window function whose amplitude gradually decreases over time and which is determined based on the time delay Tc at which the value of the correlation function is maximal; multiplying the second signal by a window function whose amplitude gradually increases over time and which is determined based on the time delay Tc at which the value of the correlation function is maximal; adding the first signal multiplied by its window function and the second signal multiplied by its window function at the position of the time delay Tc at which the value of the correlation function is maximal; outputting a third signal in continuation of the added signal; outputting the added signal and the third signal for only the time length {α(T - Tc)/(1 - α)} (where α is the time-axis conversion ratio, output time / input time); taking as the starting point of the first signal for the next round of processing the point delayed by {(T - Tc)/(1 - α)} from the starting point of the first signal; and repeating all of the above processing, thereby changing the playback time of the speech to 1.0 times or less the length of the original sound.