JPH06222794A - Voice speed conversion method - Google Patents

Voice speed conversion method

Info

Publication number
JPH06222794A
JPH06222794A JP5009737A JP973793A JPH06222794A JP H06222794 A JPH06222794 A JP H06222794A JP 5009737 A JP5009737 A JP 5009737A JP 973793 A JP973793 A JP 973793A JP H06222794 A JPH06222794 A JP H06222794A
Authority
JP
Japan
Prior art keywords
signal
time
correlation function
time delay
multiplied
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP5009737A
Other languages
Japanese (ja)
Other versions
JP3147562B2 (en)
Inventor
Ryoji Suzuki
良二 鈴木
Masayuki Misaki
正之 三崎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Priority to JP00973793A (patent JP3147562B2)
Priority to DE69428612T (patent DE69428612T2)
Priority to US08/187,295 (patent US5630013A)
Priority to EP94101057A (patent EP0608833B1)
Publication of JPH06222794A
Application granted
Publication of JP3147562B2
Anticipated expiration
Expired - Fee Related

Abstract

PURPOSE: To provide a voice speed conversion method that outputs natural-sounding speech with little waveform discontinuity and little loss of data, by multiplying a first signal and a second signal by window functions and then adding them. CONSTITUTION: A first signal of time length T is read starting at an input pointer, followed by a second signal of time length T that succeeds it. The correlation function between the first signal and the second signal is calculated, and the time delay Tc at which the value of the correlation function is maximum is found. The first signal is multiplied by a window function whose amplitude gradually increases, and the second signal by a window function whose amplitude gradually decreases, both determined from Tc. The two windowed signals are shifted to the position of the time delay Tc, where the correlation function is maximum, and added. The added signal and a third signal succeeding the first signal are then output for an interval of α(T-Tc)/(α-1), where α is the time-axis conversion ratio.

Description

Detailed Description of the Invention

[0001]

Field of the Invention: The present invention relates to a voice speed conversion method that changes only the duration of speech without changing its fundamental frequency.

[0002]

Description of the Related Art: In recent years, voice speed conversion devices have been used for fast or slow listening to voice signals recorded on tape recorders and the like.

[0003] A conventional voice speed conversion device of this kind is described below with reference to the drawings.

[0004] FIG. 8 shows the configuration of a conventional voice speed conversion device. In FIG. 8, 81 is an A/D converter, 82 is a buffer, 83 is a speed control circuit, 84 is a data reading circuit, 85 is a muting circuit, and 86 is a D/A converter.

[0005] The operation of the voice speed conversion device configured as above is described below.

[0006] First, the analog input signal is converted into a digital signal by the A/D converter 81 and written into the buffer 82. Next, the speed control circuit 83 controls the data reading circuit 84 according to the time-axis conversion ratio, and the data are read from the buffer 82. This reading method allows the playback speed to be varied. To shorten the playback time, the data are thinned out in block units as they are read. To lengthen the playback time, the data are repeated in block units. The muting circuit 85 then mutes the discontinuities between blocks, and the D/A converter 86 converts the result into an analog signal for output.
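The block-wise thinning and repetition described for the conventional device can be sketched as follows. This is a minimal illustration only: the function name `naive_block_convert`, the block-index mapping, and the omission of the muting stage are assumptions, not details taken from the patent.

```python
def naive_block_convert(samples, block_len, alpha):
    """Prior-art style conversion: repeat or drop whole blocks.

    alpha = output time / input time. Output block k is copied from input
    block floor(k / alpha) (an assumed mapping), so blocks repeat for
    alpha > 1 and are skipped for alpha < 1. No muting is applied.
    """
    if block_len <= 0 or alpha <= 0:
        raise ValueError("block_len and alpha must be positive")
    n_in = len(samples) // block_len          # whole input blocks only
    n_out = int(n_in * alpha)                 # whole output blocks
    out = []
    for k in range(n_out):
        src = min(int(k / alpha), n_in - 1)   # source block index
        out.extend(samples[src * block_len:(src + 1) * block_len])
    return out
```

With four blocks of two samples each, `alpha = 0.5` keeps blocks 0 and 2 and discards the rest, which is the data loss the following paragraphs criticize. The block boundaries are joined with no regard for waveform continuity.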

[0007] FIG. 9 schematically shows the time-series relationship between an original sound and its time-axis-converted versions: (a) is the original sound, (b) is the signal converted with a time-axis conversion ratio α = 0.5, and (c) is the signal converted with α = 2.0. Here the time-axis conversion ratio α is defined by the following equation.

[0008]

[Equation 1] α = (output time) / (input time)

[0009]

Problems to be Solved by the Invention: With the above configuration, however, when the time axis is compressed to increase the speed, thinning out the data drops consonants and the like, which lowers intelligibility. In addition, the junctions between blocks are discontinuous. Although the junctions are muted to reduce this, the amplitude and phase remain discontinuous, so only speech of poor naturalness is obtained.

[0010] Other conventional voice speed conversion devices use the pitch period of the input signal, as in TDHS (Time Domain Harmonic Scaling). However, pitch extraction is difficult when music or noise is superimposed on the input signal, so such methods cannot be applied in that case and are not suitable.

[0011] In view of the above problems, the present invention provides a voice speed conversion method that can output natural-sounding speech with little waveform discontinuity and little loss of data.

[0012]

Means for Solving the Problems: To achieve this object, the voice speed conversion method of the present invention calculates the correlation function between a first signal of finite time length T and a second signal of finite time length T that follows the first signal, and finds the time delay Tc at which the value of the correlation function is maximum. The first signal and the second signal are each multiplied by window functions whose amplitudes vary complementarily in time and which are determined from the time delay Tc. The windowed first signal and the windowed second signal are added at the position of the time delay Tc, and a third signal is output continuously after the added signal. The added signal and the third signal are output for a time length determined from the time-axis conversion ratio α (= output time / input time), the time delay Tc, and the finite time length T. The starting points of the first and second signals for the next pass are likewise determined from α, Tc, and T. Repeating all of the above processing changes the playback time of the speech relative to the length of the original sound.

[0013]

Operation: With this method, multiplying the first signal and the second signal by window functions before adding them reduces dropouts and amplitude discontinuities in the added signal. Adding the windowed first signal and the windowed second signal at the position of the time delay Tc, where the value of the correlation function is maximum, reduces phase discontinuity.

[0014] Furthermore, the signal obtained by adding the windowed first and second signals, and the third signal that follows it, are output for a time length determined from the time-axis conversion ratio α, the time delay Tc at which the correlation function is maximum, and the finite time length T. This keeps signal loss small and allows conversion to an arbitrary speed.

[0015]

Embodiments: Embodiments of the present invention are described below with reference to the drawings.

[0016] The present invention provides a voice speed conversion method that can output natural-sounding speech, with little discontinuity in signal amplitude and phase and no loss of data, for time-axis conversion ratios α in the range α ≥ 1.0.

[0017] Here the time-axis conversion ratio α is defined by the following equation.

[0018]

[Equation 2] α = (output time) / (input time)

[0019] FIG. 1 is a flowchart of the voice speed conversion method in the first embodiment of the present invention. Its operation is described below.

[0020] First, in step 11, the input pointer is reset. Next, in step 12, a first signal (XA) of length T is read starting at the input pointer. In step 13, T is added to the input pointer. Then, in step 14, a second signal (XB) of length T is read starting at the input pointer.

[0021] In step 15, the correlation function of the first signal XA and the second signal XB is calculated, and the time delay Tc at which its value is maximum is searched for. In step 16, the first signal XA is multiplied by a window function whose amplitude gradually increases, determined from the time delay Tc. In step 17, the second signal XB is multiplied by a window function whose amplitude gradually decreases, likewise determined from Tc. In step 18, the windowed first signal and the windowed second signal are shifted to the position of the time delay Tc and then added. In step 19, the signal added in step 18 and the signal that follows the first signal XA, that is, a third signal (XC) starting at the current input pointer, are output for an interval of α(T-Tc)/(α-1). In step 20, (2T-αT-Tc)/(α-1) is added to the input pointer. Processing then returns to step 12.
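The length bookkeeping in steps 12 to 20 can be checked with a small sketch. It assumes T and Tc are measured in samples and uses exact fractions, whereas a real implementation would have to round to whole samples. The function name `expansion_step` and its return layout are hypothetical. The added-signal length T-Tc is the value the description gives for FIG. 3.

```python
from fractions import Fraction

def expansion_step(T, Tc, alpha):
    """Lengths for one pass of steps 12 to 20 (first embodiment, alpha > 1).

    Returns (added_len, xc_len, output_len, pointer_advance) as exact
    fractions of a sample count.
    """
    T, Tc, alpha = Fraction(T), Fraction(Tc), Fraction(alpha)
    if alpha <= 1:
        raise ValueError("the step-19 formula divides by (alpha - 1)")
    added_len = T - Tc                                # length of the added signal
    output_len = alpha * (T - Tc) / (alpha - 1)       # step 19 output interval
    xc_len = output_len - added_len                   # part supplied by XC
    step20 = (2 * T - alpha * T - Tc) / (alpha - 1)   # step 20 increment
    pointer_advance = T + step20                      # step 13 plus step 20
    return added_len, xc_len, output_len, pointer_advance
```

For T = 300, Tc = 0 and α = 3/2 this gives an added signal of 300 samples, 600 samples of XC, 900 output samples, and a net pointer advance of 600 input samples. The advance always equals the XC length, so each pass emits exactly α times the input it consumes.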

[0022] FIG. 2 is a flowchart of the processing in step 15 of FIG. 1, which calculates the correlation function of the first signal XA and the second signal XB and searches for the time delay Tc at which the value of the correlation function is maximum. Its operation is described below.

[0023] First, in steps 201, 202, and 203, the time delay τ, the time delay Tc at which the correlation function is maximum, and the maximum value Rmax of the correlation function are each initialized to 0. Next, in step 204, the correlation function R(τ) of the first signal XA and the second signal XB is calculated for non-negative time delay τ, as shown in (Equation 3).

[0024]

[Equation 3]

[0025] In step 205, if the correlation function R(τ) obtained in step 204 is not larger than the maximum value Rmax found so far, processing branches to step 208. Otherwise, step 206 updates Rmax to R(τ) and step 207 updates Tc to τ. Step 208 then increases the time delay τ by one sample point. In step 209, if τ does not exceed τmax+, processing returns to step 204, so steps 204 to 208 are repeated until τ exceeds τmax+. Once that condition is met, step 210 initializes τ to -1. Next, in step 211, the correlation function R(τ) of the first signal XA and the second signal XB is calculated for negative time delay τ, as shown in (Equation 4).

[0026]

[Equation 4]

[0027] In step 212, if the correlation function R(τ) obtained in step 211 is not larger than the maximum value Rmax found so far, processing branches to step 215. Otherwise, step 213 updates Rmax to R(τ) and step 214 updates Tc to τ. Step 215 then decreases the time delay τ by one sample point. In step 216, if τ is not smaller than τmax-, processing returns to step 211, so steps 211 to 215 are repeated until τ becomes smaller than τmax-. Finally, step 217 outputs the time delay Tc at which the value of the correlation function is maximum.
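The search of FIG. 2 can be sketched as below. Equations 3 and 4 are not reproduced in this text, so the correlation formula used here, R(τ) = Σ xa[n+τ]·xb[n] over the valid overlap, is an assumption. The function name `find_best_lag` and its parameter names are also hypothetical.

```python
def find_best_lag(xa, xb, tau_max_pos, tau_max_neg):
    """Return the lag Tc in [tau_max_neg, tau_max_pos] that maximizes R(tau).

    Assumed definition: R(tau) = sum over n of xa[n + tau] * xb[n], taken
    over the samples where both indices are valid. tau_max_neg is <= 0.
    """
    def corr(tau):
        lo = max(0, -tau)
        hi = min(len(xb), len(xa) - tau)
        return sum(xa[n + tau] * xb[n] for n in range(lo, hi))

    tc, r_max = 0, 0.0                 # steps 201 to 203
    tau = 0
    while tau <= tau_max_pos:          # steps 204 to 209: non-negative lags
        r = corr(tau)
        if r > r_max:                  # step 205: update only if strictly larger
            r_max, tc = r, tau
        tau += 1
    tau = -1                           # step 210
    while tau >= tau_max_neg:          # steps 211 to 216: negative lags
        r = corr(tau)
        if r > r_max:
            r_max, tc = r, tau
        tau -= 1
    return tc                          # step 217
```

Under this convention a result Tc means that xb[n] best matches xa[n + Tc]. Because Rmax starts at 0 and is replaced only by strictly larger values, the search returns Tc = 0 when no lag gives a positive correlation.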

[0028] FIG. 3 is a schematic diagram of the processing in steps 16, 17, and 18 of FIG. 1.

[0029] FIG. 3(a) shows the case where the time delay at which the correlation function is maximum is Tc = 0, (b) the case Tc > 0, and (c) the case Tc < 0. In each case, the first signal is multiplied by a window function whose amplitude gradually increases in time, the second signal is multiplied by a window function whose amplitude gradually decreases in time, and the two are added after being shifted relative to each other by the time delay Tc. The shape of the window functions is varied according to Tc. The added result has a time length of (T-Tc).
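A minimal sketch of steps 16 to 18 follows. The patent does not give the window shape in this text, so the linear ramp is an assumption, as is the alignment convention that `xa[i + tc]` lines up with `xb[i]`. The name `overlap_add_expand` is hypothetical. The sketch does reproduce the stated output length of T-Tc.

```python
def overlap_add_expand(xa, xb, tc):
    """Cross-fade from xb (fading out) into xa (fading in).

    xa is placed tc samples early relative to xb, so xa[i + tc] lines up
    with xb[i]. The result has len(xa) - tc samples. The linear ramp is an
    assumed window shape.
    """
    T = len(xa)
    if len(xb) != T or abs(tc) >= T - 1:
        raise ValueError("need equal-length frames and |tc| < T - 1")
    out_len = T - tc
    start = max(0, -tc)        # first output index where xa is available
    end = min(T, out_len)      # one past the last index where both exist
    out = []
    for i in range(out_len):
        if i < start:
            w_a = 0.0          # only xb exists here
        elif i >= end:
            w_a = 1.0          # only xa exists here
        else:
            w_a = (i - start) / (end - start - 1)   # rising window for xa
        a = xa[i + tc] if 0 <= i + tc < T else 0.0
        b = xb[i] if i < T else 0.0
        out.append(w_a * a + (1.0 - w_a) * b)       # complementary windows
    return out
```

The result begins exactly on the first sample of `xb` and ends exactly on the last sample of `xa`, so it can be spliced between the preceding output and the signal that follows `xa` without an amplitude jump at either end.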

[0030] FIG. 4 schematically shows a processing example of the voice speed conversion method of the present invention described above.

[0031] FIG. 4(a) shows the input signal and (b) the output signal for a time-axis conversion ratio α = 3/2. The time delay at which the correlation function of XA1 and XB1 is maximum is Tc1 = 0, the time delay for XA2 and XB2 is Tc2 > 0, and the time delay for XA3 and XB3 is Tc3 < 0. The combined time length of the signal obtained by adding the first signal XAn and the second signal XBn, and of the third signal XCn that follows the first signal XAn, is α(T-Tcn)/(α-1). It is thus determined from the time-axis conversion ratio α, the time delay Tcn at which the correlation function is maximum, and the finite time length T. The ratio of the time length of the output signal to the time length of the input signal (XC1+XC2+XC3) equals the set time-axis conversion ratio α (= 3/2). Because XCn is output unchanged and every section of the input signal is used, no information is lost in the output signal.

[0032] As described above, in this embodiment the first signal XA is multiplied by a gradually increasing window function and the second signal XB by a gradually decreasing window function before the two are added, which reduces amplitude discontinuity in the added signal. Adding the windowed first signal and the windowed second signal at the position of the time delay Tc, where the value of the correlation function is maximum, reduces phase discontinuity. Furthermore, the added signal and the third signal XC that follows the first signal XA are output for a time length determined from the time-axis conversion ratio α, the time delay Tc, and the finite time length T. The input signal can therefore be expanded easily, without signal loss, at a time-axis conversion ratio in the range α ≥ 1.0.

[0033] A second embodiment of the present invention is described below with reference to the drawings. The present invention provides a voice speed conversion method that can output natural-sounding speech, with little discontinuity in signal amplitude and phase and little loss of data, for time-axis conversion ratios α in the range α ≤ 1.0.

[0034] FIG. 5 is a flowchart of the voice speed conversion method in the second embodiment of the present invention. Its operation is described below.

[0035] First, in step 51, the input pointer is reset. Next, in step 52, a first signal XA of length T is read starting at the input pointer. In step 53, T is added to the input pointer. Then, in step 54, a second signal XB of length T is read starting at the input pointer. In step 55, the correlation function of the first signal XA and the second signal XB is calculated, and the time delay Tc at which its value is maximum is searched for.

[0036] Next, in step 56, the first signal XA is multiplied by a gradually decreasing window function determined from the time delay Tc found above. In step 57, the second signal XB is multiplied by a gradually increasing window function, likewise determined from Tc. In step 58, the windowed first signal and the windowed second signal are shifted to the position of the time delay Tc and then added. In step 59, T is added to the input pointer. In step 60, the signal added in step 58 and the signal that follows the second signal XB, that is, a third signal XC starting at the current input pointer, are output for an interval of α(T-Tc)/(1-α). In step 61, (2αT-T-Tc)/(1-α) is added to the input pointer. Processing then returns to step 52.
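The step 60 and step 61 formulas can be checked for consistency with a short derivation. The symbols p, p', L_out and L_XC are introduced here for illustration only. The added-signal length T+Tc is the value the description gives for FIG. 6.

```latex
% p  : input pointer at step 52 (assumed notation)
% p' : input pointer after step 61 (assumed notation)
\begin{aligned}
p' &= p + T + T + \frac{2\alpha T - T - T_c}{1-\alpha}
    = p + \frac{T - T_c}{1-\alpha}, \\
L_{\mathrm{out}} &= \frac{\alpha\,(T - T_c)}{1-\alpha}
    = \alpha\,(p' - p), \\
L_{X_C} &= L_{\mathrm{out}} - (T + T_c)
    = \frac{2\alpha T - T - T_c}{1-\alpha}.
\end{aligned}
```

Each pass therefore emits α times the input it consumes, and the XC length equals the step 61 increment, so XC ends exactly where the next first signal begins. Under these formulas the XC length is non-negative only when α ≥ (T+Tc)/(2T), which suggests, as an inference rather than a statement from the patent, that a single pass cannot compress much below α = 1/2.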

[0037] The processing in step 55 of FIG. 5, which calculates the correlation function of the first signal XA and the second signal XB and searches for the time delay Tc at which the value of the correlation function is maximum, is the same as that of the first embodiment shown in FIG. 2.

[0038] FIG. 6 is a schematic diagram of the processing in steps 56, 57, and 58 of FIG. 5.

[0039] FIG. 6(a) shows the case where the time delay at which the correlation function is maximum is Tc = 0, (b) the case Tc > 0, and (c) the case Tc < 0. In each case, the first signal is multiplied by a window function whose amplitude gradually decreases in time, the second signal is multiplied by a window function whose amplitude gradually increases in time, and the two are added after being shifted relative to each other by the time delay Tc. The shape of the window functions is varied according to Tc. The added result has a time length of (T+Tc).

[0040] FIG. 7 schematically shows a processing example of the voice speed conversion method described above. FIG. 7(a) shows the input signal and (b) the output signal for a time-axis conversion ratio α = 2/3. The time delay at which the correlation function of XA1 and XB1 is maximum is Tc1 = 0, the time delay for XA2 and XB2 is Tc2 > 0, and the time delay for XA3 and XB3 is Tc3 < 0. The combined time length of the signal obtained by adding the first signal XAn and the second signal XBn, and of the third signal XCn that follows the second signal XBn, is α(T-Tcn)/(1-α). It is thus determined from the time-axis conversion ratio α, the time delay Tcn at which the correlation function is maximum, and the finite time length T. The ratio of the time length of the output signal to the time length of the input signal equals the set time-axis conversion ratio α (= 2/3). Because the whole input signal is used in the first signal XAn, the second signal XBn, and the third signal XCn, little information is lost in the output signal.

[0041] As described above, in this embodiment the first signal XA is multiplied by a gradually decreasing window function and the second signal XB by a gradually increasing window function before the two are added, which reduces amplitude discontinuity in the added signal. Adding the windowed first signal and the windowed second signal at the position of the time delay Tc, where the value of the correlation function is maximum, reduces phase discontinuity. Furthermore, the added signal and the third signal XC that follows the second signal XB are output for a time length determined from the time-axis conversion ratio α, the time delay Tc, and the finite time length T. The input signal can therefore be compressed easily, with little signal loss, at a time-axis conversion ratio in the range α ≤ 1.0.

[0042]

Effects of the Invention: As is clear from the above description, the present invention multiplies the first signal and the second signal by window functions whose amplitudes vary complementarily in time before adding them, which reduces amplitude discontinuity in the added signal. Adding the windowed first signal and the windowed second signal at the position of the time delay at which the value of the correlation function is maximum reduces phase discontinuity.

[0043] Furthermore, the signal obtained by adding the windowed first and second signals, and the third signal that follows it, are output for a time length determined from the time-axis conversion ratio α, the time delay Tc at which the correlation function is maximum, and the finite time length T. This gives the excellent effect that signal loss is small and conversion to an arbitrary speed is possible.

Brief Description of the Drawings

FIG. 1 is a flowchart of the voice speed conversion method according to the first embodiment of the present invention.

FIG. 2 is a flowchart of the correlation function calculation in the voice speed conversion method according to the first embodiment of the present invention.

FIG. 3 is a schematic diagram of the weighting by window functions, and of the addition at the time delay position where the value of the correlation function is maximum, in the voice speed conversion method according to the first embodiment of the present invention.

FIG. 4 is a schematic diagram of the input and output signals of the voice speed conversion method according to the first embodiment of the present invention.

FIG. 5 is a flowchart of the voice speed conversion method according to the second embodiment of the present invention.

FIG. 6 is a schematic diagram of the weighting by window functions, and of the addition at the time delay position where the value of the correlation function is maximum, in the voice speed conversion method according to the second embodiment of the present invention.

FIG. 7 is a schematic diagram of the input and output signals of the voice speed conversion method according to the second embodiment of the present invention.

FIG. 8 is a block diagram of a conventional voice speed conversion device.

FIG. 9 is a schematic diagram of the input and output signals of a conventional voice speed conversion device.

Explanation of Symbols

81 A/D converter
82 Buffer
83 Speed control circuit
84 Data reading circuit
85 Muting circuit
86 D/A converter

Claims (3)

[Claims]
1. A voice speed conversion method characterized by: calculating a correlation function between a first signal of finite time length T and a second signal of finite time length T that follows the first signal, and finding the time delay Tc at which the value of the correlation function is maximum; multiplying the first signal and the second signal respectively by window functions whose amplitudes change complementarily over time and which are determined based on the time delay Tc; adding the windowed first signal and the windowed second signal at the position of the time delay Tc; outputting a third signal continuously after the added signal; outputting the added signal and the third signal for a time length determined based on the time-axis conversion ratio α (= output time / input time), the time delay Tc, and the finite time length T; determining the start points of the first signal and the second signal for the next round of processing based on the time-axis conversion ratio α, the time delay Tc, and the finite time length T; and repeating all of the above processing, thereby changing the playback time of the voice relative to the length of the original sound.
2. A voice speed conversion method characterized by: calculating a correlation function between a first signal of finite time length T and a second signal of finite time length T that follows the first signal, and finding the time delay Tc at which the value of the correlation function is maximum; multiplying the first signal by a window function whose amplitude gradually increases over time and which is determined based on the time delay Tc; multiplying the second signal by a window function whose amplitude gradually decreases over time and which is determined based on the time delay Tc; adding the windowed first signal and the windowed second signal at the position of the time delay Tc; outputting a third signal continuously after the added signal; outputting the added signal and the third signal for a time length of α(T − Tc)/(α − 1), where α is the time-axis conversion ratio given by output time / input time; setting the start point of the first signal for the next round of processing to the point obtained by delaying the start point of the current first signal by (T − Tc)/(α − 1); and repeating all of the above processing, thereby changing the playback time of the voice to 1.0 times the length of the original sound or more.
3. A voice speed conversion method characterized by: calculating a correlation function between a first signal of finite time length T and a second signal of finite time length T that follows the first signal, and finding the time delay Tc at which the value of the correlation function is maximum; multiplying the first signal by a window function whose amplitude gradually decreases over time and which is determined based on the time delay Tc; multiplying the second signal by a window function whose amplitude gradually increases over time and which is determined based on the time delay Tc; adding the windowed first signal and the windowed second signal at the position of the time delay Tc; outputting a third signal continuously after the added signal; outputting the added signal and the third signal for a time length of α(T − Tc)/(1 − α), where α is the time-axis conversion ratio given by output time / input time; setting the start point of the first signal for the next round of processing to the point obtained by delaying the start point of the current first signal by (T − Tc)/(1 − α); and repeating all of the above processing, thereby changing the playback time of the voice to 1.0 times the length of the original sound or less.
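The expansion procedure of claim 2 can be illustrated with a minimal Python sketch. It is one possible reading of the claim, not the patented implementation. The function names `find_delay` and `expand` and the parameter `max_delay` are hypothetical. The normalised correlation, the linear window shapes, the alignment of sample Tc + n of the first signal with sample n of the second signal, the overlap length of T − Tc samples, the pass-through of the first frame, and the rounding to whole samples are all assumptions that the claims do not specify.

```python
import math


def find_delay(first, second, max_delay):
    """Return the lag Tc (in samples) that maximises a normalised correlation.

    Assumption: lag Tc aligns first[Tc + n] with second[n].
    """
    T = len(first)
    best_lag, best_score = 0, -math.inf
    for lag in range(max_delay + 1):
        a, b = first[lag:], second[:T - lag]
        energy = sum(v * v for v in a) * sum(v * v for v in b)
        dot = sum(u * v for u, v in zip(a, b))
        score = dot / math.sqrt(energy) if energy > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def expand(x, alpha, T, max_delay):
    """Stretch the sample list x by roughly alpha (> 1), following claim 2."""
    if alpha <= 1.0:
        raise ValueError("this sketch covers expansion only (alpha > 1)")
    if not 0 <= max_delay < T:
        raise ValueError("max_delay must satisfy 0 <= max_delay < T")
    out = list(x[:T])  # assumption: the very first frame is copied unchanged
    p = 0              # start of the first signal; output has reached x[p + T]
    while p + 2 * T <= len(x):
        first, second = x[p:p + T], x[p + T:p + 2 * T]
        tc = find_delay(first, second, max_delay)
        n = T - tc                                  # overlap length T - Tc
        advance = max(1, round(n / (alpha - 1.0)))  # (T - Tc) / (alpha - 1)
        if p + T + advance > len(x):
            break
        # Complementary linear windows: the second signal fades out while
        # the first signal fades in, so their weights always sum to 1.
        for k in range(n):
            rise = k / n
            out.append((1.0 - rise) * second[k] + rise * first[tc + k])
        # Third signal: the input that follows the first signal.
        out.extend(x[p + T:p + T + advance])
        p += advance  # next first signal starts (T - Tc) / (alpha - 1) later
    out.extend(x[p + T:])  # flush the remaining input unchanged
    return out
```

Under these assumptions each pass emits (T − Tc) + (T − Tc)/(α − 1) = α(T − Tc)/(α − 1) samples while the input pointer advances by (T − Tc)/(α − 1), so the ratio of output to input per pass is α, as the claim states. For example, `expand(samples, 1.5, 200, 100)` on a list of samples should return a list about 1.5 times as long, apart from the unstretched frames at the start and end.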
JP00973793A 1993-01-25 1993-01-25 Audio speed conversion method Expired - Fee Related JP3147562B2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP00973793A JP3147562B2 (en) 1993-01-25 1993-01-25 Audio speed conversion method
DE69428612T DE69428612T2 (en) 1993-01-25 1994-01-25 Method and device for carrying out a time scale modification of speech signals
US08/187,295 US5630013A (en) 1993-01-25 1994-01-25 Method of and apparatus for performing time-scale modification of speech signals
EP94101057A EP0608833B1 (en) 1993-01-25 1994-01-25 Method of and apparatus for performing time-scale modification of speech signals

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP00973793A JP3147562B2 (en) 1993-01-25 1993-01-25 Audio speed conversion method

Publications (2)

Publication Number Publication Date
JPH06222794A (en) 1994-08-12
JP3147562B2 JP3147562B2 (en) 2001-03-19

Family

ID=11728630

Family Applications (1)

Application Number Title Priority Date Filing Date
JP00973793A Expired - Fee Related JP3147562B2 (en) 1993-01-25 1993-01-25 Audio speed conversion method

Country Status (1)

Country Link
JP (1) JP3147562B2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998041976A1 (en) * 1997-03-14 1998-09-24 Nippon Hoso Kyokai Speaking speed changing method and device
JP2003510625A (en) * 1998-10-09 2003-03-18 ヘジェナ, ドナルド ジェイ. ジュニア Method and apparatus for preparing a creation filtered by listener interest
JP2005275010A (en) * 2004-03-25 2005-10-06 Casio Comput Co Ltd Voice extension device, voice extension method and program
WO2007086365A1 (en) * 2006-01-24 2007-08-02 Matsushita Electric Industrial Co., Ltd. Conversion device

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
WO2006137425A1 (en) 2005-06-23 2006-12-28 Matsushita Electric Industrial Co., Ltd. Audio encoding apparatus, audio decoding apparatus and audio encoding information transmitting apparatus

Cited By (7)

Publication number Priority date Publication date Assignee Title
WO1998041976A1 (en) * 1997-03-14 1998-09-24 Nippon Hoso Kyokai Speaking speed changing method and device
US6205420B1 (en) 1997-03-14 2001-03-20 Nippon Hoso Kyokai Method and device for instantly changing the speed of a speech
JP2003510625A (en) * 1998-10-09 2003-03-18 ヘジェナ, ドナルド ジェイ. ジュニア Method and apparatus for preparing a creation filtered by listener interest
JP2005275010A (en) * 2004-03-25 2005-10-06 Casio Comput Co Ltd Voice extension device, voice extension method and program
WO2007086365A1 (en) * 2006-01-24 2007-08-02 Matsushita Electric Industrial Co., Ltd. Conversion device
US8073704B2 (en) 2006-01-24 2011-12-06 Panasonic Corporation Conversion device
JP5096932B2 (en) * 2006-01-24 2012-12-12 パナソニック株式会社 Conversion device

Also Published As

Publication number Publication date
JP3147562B2 (en) 2001-03-19

Similar Documents

Publication Publication Date Title
US5630013A (en) Method of and apparatus for performing time-scale modification of speech signals
EP0910065B1 (en) Speaking speed changing method and device
US7173986B2 (en) Nonlinear overlap method for time scaling
US5781885A (en) Compression/expansion method of time-scale of sound signal
JP3147562B2 (en) Audio speed conversion method
US6085157A (en) Reproducing velocity converting apparatus with different speech velocity between voiced sound and unvoiced sound
KR100656968B1 (en) Speech rate conversion apparatus, method and computer-readable record medium thereof
JPS5982608A (en) System for controlling reproducing speed of sound
JPH0232399A (en) Voice synthesizing device
JP3156020B2 (en) Audio speed conversion method
JPH1078791A (en) Pitch converter
JP3162945B2 (en) Video tape recorder
JPS642960B2 (en)
JP3357742B2 (en) Speech speed converter
JP2532731B2 (en) Voice speed conversion device and voice speed conversion method
JPH09152889A (en) Speech speed transformer
JPH0762800B2 (en) Pitch conversion method
JP2669088B2 (en) Audio speed converter
JPH03123397A (en) Device and method for converting voice speed
JPH01267700A (en) Speech processor
JPS63234299A (en) Voice analysis/synthesization system
JPH08292789A (en) Speech speed changing device
JPH05303400A (en) Method and device for audio reproduction
JPH09154107A (en) Video and sound signal reproducing device
JPH05181497A (en) Pitch conversion device

Legal Events

Date Code Title Description
LAPS Cancellation because of no payment of annual fees