JPH08115097A - Acoustic reproduction device

Info

Publication number: JPH08115097A
Application number: JP6249340A
Authority: JP
Inventors: Hiroki Onishi; 宏樹大西
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1994-10-14
Filing date: 1994-10-14
Publication date: 1996-05-07
Anticipated expiration: 2017-03-04
Also published as: JP3263546B2

Abstract

PURPOSE: To provide a sound reproducing device that corrects the tempo and pitch of an accompaniment to match the pitch of a singer, even when there is a tempo difference, within an allowable range, between the singer's voice and the accompaniment. CONSTITUTION: The device comprises a first pitch extracting means 3 that extracts the pitch of a first input voice, a second pitch extracting means 6 that extracts the pitch of a second input voice, storage means 4 and 7 that store the time history of each pitch, a computing means 8 that computes the tempo and pitch differences between the first and second input voices from the pitch time histories using a nonlinear pattern matching method, correcting means 9 and 10 that correct the tempo and pitch of the second input voice so that they approximate those of the first input voice, and a means 13 that reproduces either the second input voice corrected by the means 9 and 10 or a third input voice to which the computed tempo and pitch differences have been applied.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a sound reproducing device that corrects the tempo and pitch of an accompaniment sound to match, for example, the tempo and pitch of a singer's voice.

[0002]

[Prior Art] Conventionally, a pitch adjuster has been connected to a sound reproducing device such as a karaoke machine so that only the pitch of the accompaniment is changed, without changing the tempo of the tune.

[0003] Regarding adjustment of the pitch of the accompaniment sound of a sound reproducing device, Japanese Patent Laid-Open No. 5-35286 discloses a method that, even partway through the accompaniment, detects the pitch difference between the singer's voice and the accompaniment within a preset time and then automatically corrects the pitch of the accompaniment so that it approximates the pitch of the singer.

[0004]

[Problem to Be Solved by the Invention] However, the conventional method of automatically correcting the pitch of the accompaniment so that it approximates the pitch of the singer's voice assumes that there is no time lag between the singer's voice and the accompaniment, that is, no tempo difference between them. If a tempo difference does arise under this assumption, the pitch of the accompaniment is corrected erroneously.

[0005] The present invention has been made in view of the above problem, and its object is to provide a sound reproducing device that corrects the tempo and pitch of the accompaniment to match the pitch of the singer even when there is a tempo difference, within a tolerance, between the singer's voice and the accompaniment.

[0006]

[Means for Solving the Problem] A sound reproducing device according to the present invention comprises: first pitch extracting means for extracting the pitch of a first input voice; second pitch extracting means for extracting the pitch of a second input voice; storage means for storing the time history of each pitch; calculating means for calculating, from the pitch time histories and by a nonlinear pattern matching method, the tempo difference and pitch difference between the first input voice and the second input voice; correcting means for correcting the tempo and pitch of the second input voice so that they approximate the tempo and pitch of the first input voice; and means for reproducing either the second input voice corrected by the correcting means or a third input voice to which the calculated tempo difference and pitch difference have been applied. In addition, DP matching is used as the nonlinear pattern matching method: the tempo difference between the first and second input voices is calculated from the slope of the time normalization function obtained by the DP matching, and the average pitch difference, calculated from the pitch time history of the second input voice corrected according to that tempo difference and the pitch time history of the first input voice, is then taken as the pitch difference between the first and second input voices.

[0007] Further, a limit is placed on the slope of the time normalization function obtained by the DP matching, so that the tempo difference of the first input voice relative to the tempo of the second input voice is not corrected beyond a set value.

[0008] Further, a correction period is set during which the tempo and pitch of the second input voice are corrected to approximate those of the first input voice; after the correction period, the second input voice or the third input voice is reproduced with the correction amounts for the tempo difference and pitch difference held constant.

[0009] In addition, a hand switch is provided for instructing whether the tempo and pitch of the second input voice or the third input voice are to be corrected, and the operator can specify the correction period with this hand switch.

[0010]

[Operation] On the operator's instruction, the sound reproducing device of the present invention receives a command to start correcting the tempo and pitch of the second or third input voice. It first extracts the pitches of the first and second input voices and stores the pitch time history of each voice over a fixed period.

[0011] Next, using these pitch time histories, the tempo difference and pitch difference between the first input voice and the second input voice are calculated by a nonlinear pattern matching method.

[0012] When DP matching is used as the nonlinear pattern matching method, the tempo difference between the first and second input voices is calculated from the slope of the time normalization function obtained by the DP matching. If the slope lies within a preset allowable range, it is taken as the tempo difference between the first and second input voices; if it lies outside that range, the tempo difference is judged to be uncorrectable.

[0013] If the tempo difference is within the allowable range, the pitch time history of the second input voice is corrected by the time normalization function so that its tempo approximates that of the pitch time history of the first input voice. The average pitch difference calculated from this corrected pitch time history of the second input voice and the pitch time history of the first input voice is taken as the pitch difference between the first and second input voices.

[0014] Finally, using the calculated tempo difference and pitch difference, either the second input voice is corrected so that its tempo and pitch approximate those of the first input voice and is then reproduced, or the third input voice is corrected in the same way and reproduced.

[0015]

[Embodiment] FIG. 1 is a schematic configuration diagram of the sound reproducing device according to the present invention applied to a karaoke machine. Normally, the singer's voice is converted into an analog voice signal by the microphone 1 and sent to the mixing circuit 12. Meanwhile, the musical sound information recording medium 5 stores a digital accompaniment sound signal known as MIDI (Musical Instrument Digital Interface) information. This digital accompaniment signal passes through the tempo difference correction circuit 9 and the pitch difference correction circuit 10 without being corrected, is converted into an analog accompaniment signal by the D/A converter 11, and is sent to the mixing circuit 12. The mixing circuit 12 mixes and amplifies the analog accompaniment signal and the analog voice signal, and the speaker 13 reproduces them as a sound in which the singer's voice and the accompaniment are combined.

[0016] The case in which the tempo and pitch of the accompaniment are corrected and reproduced in response to an operation by the singer is now described, following the processing procedure shown in FIG. 2.

[0017] In this embodiment, the first input voice is the singer's voice, the second input voice is a teacher sound signal that is stored in the musical sound information storage medium 5 and serves as a model for the singer, and the third input voice is the accompaniment sound signal stored in the musical sound information storage medium 5.

[0018] In step S1, the singer operates the hand switch attached to the microphone 1, and the karaoke machine according to the present invention thereby receives a correction start command for correcting the accompaniment.

[0019] Immediately afterwards, the singer's analog voice signal (the first input voice) is converted into a digital voice signal by the A/D converter 2, on a path separate from the normal signal path described above. The sampling frequency of the A/D converter 2 is 2 kHz, and before sampling the voice signal passes through a low-pass filter with a cutoff frequency of 1 kHz.

[0020] In step S2, the pitch extraction circuit 3 processes the digital voice signal sequentially in time, using a signal processing technique such as the autocorrelation method, to calculate the pitch of the singer's voice, and the time history of the singer's pitch is stored in the buffer memory 4. In this embodiment, the pitch extraction circuit 3 extracts a pitch averaged over each 20 msec measurement interval, and the buffer memory 4 stores a 6 sec time history of the singer's pitch.
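As an illustration of the autocorrelation method named above, the following is a minimal sketch and not the circuit described in the patent. At the stated 2 kHz sampling frequency, one 20 msec measurement interval holds 40 samples. The function name `estimate_pitch_hz`, the search range `fmin`/`fmax`, and the 0.9 peak threshold (used to prefer the shortest plausible period) are assumptions made for this example.

```python
import math


def estimate_pitch_hz(frame, sample_rate=2000, fmin=80.0, fmax=500.0):
    """Estimate the fundamental frequency of one frame by autocorrelation.

    fmin, fmax and the 0.9 threshold are illustrative assumptions.
    Returns 0.0 when no periodicity is found (for example, silence).
    """
    n = len(frame)
    min_lag = max(1, int(sample_rate / fmax))      # shortest period searched
    max_lag = min(int(sample_rate / fmin), n - 1)  # longest period searched
    scores = {}
    for lag in range(min_lag, max_lag + 1):
        overlap = n - lag
        # Normalise by the overlap length so long lags are not penalised.
        scores[lag] = sum(frame[i] * frame[i + lag] for i in range(overlap)) / overlap
    if not scores:
        return 0.0
    peak = max(scores.values())
    if peak <= 0.0:
        return 0.0
    # Take the shortest lag whose score is close to the peak; this avoids
    # reporting a multiple of the true period (an octave error).
    for lag in sorted(scores):
        if scores[lag] >= 0.9 * peak:
            return sample_rate / lag
    return 0.0


# One 20 msec frame (40 samples at 2 kHz) of a 200 Hz tone.
frame = [math.sin(2 * math.pi * 200 * i / 2000) for i in range(40)]
pitch = estimate_pitch_hz(frame)
```

Repeating this for each consecutive 40-sample frame over 6 sec would yield 300 pitch values, matching the length of the singer's time history in the embodiment.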

[0021] Meanwhile, as described above, the musical sound information storage medium 5 stores the teacher sound signal that serves as a model for the singer (the second input voice) and the digital accompaniment sound signal (the third input voice) as so-called MIDI information, in which signals such as pitch, timbre and volume are separated in advance. The pitch extraction circuit 6 therefore extracts the pitch of the teacher sound, and a 5 sec time history of the teacher sound's pitch is stored in the buffer memory 7. Like the singer's pitch time history, the teacher sound's pitch time history stored in the buffer memory 7 is thinned out or linearly interpolated so that it consists of the average pitch for each 20 msec measurement interval.
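The linear interpolation onto a 20 msec grid can be sketched as follows. This is only an illustration: the input format (ascending time stamps in seconds paired with pitch values such as MIDI note numbers) and the name `resample_pitch` are assumptions, and the sketch samples the interpolated curve at each frame time rather than averaging within the frame.

```python
def resample_pitch(times, pitches, frame_sec=0.02, duration_sec=5.0):
    """Linearly interpolate (time, pitch) breakpoints onto a fixed frame grid.

    times must be ascending; values before the first breakpoint or after
    the last one are held constant. The input format is an assumption.
    """
    n_frames = round(duration_sec / frame_sec)
    out = []
    k = 0  # index of the last breakpoint at or before the current frame time
    for f in range(n_frames):
        t = f * frame_sec
        while k + 1 < len(times) and times[k + 1] <= t:
            k += 1
        if t <= times[0]:
            out.append(float(pitches[0]))
        elif k + 1 == len(times):
            out.append(float(pitches[-1]))
        else:
            w = (t - times[k]) / (times[k + 1] - times[k])
            out.append(pitches[k] + w * (pitches[k + 1] - pitches[k]))
    return out


# A pitch glide from note 60 to note 70 over 5 sec gives 250 frame values.
history = resample_pitch([0.0, 5.0], [60.0, 70.0])
```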

[0022] Thus the buffer memory 4 holds the singer's pitch time history for the 6 sec following the correction start command, and the buffer memory 7 holds the teacher sound's pitch time history for the 5 sec following the correction start command. Using these pitch time histories, the nonlinear pattern matching circuit 8 calculates the tempo difference and pitch difference between the singer's voice and the teacher sound.

[0023] Next, the method of calculating the tempo difference and pitch difference in step S3 is described. In this embodiment, they are calculated by the DP matching method, which is described in the Journal of the Acoustical Society of Japan, Vol. 27, No. 9, pp. 483-487. In this embodiment, a time axis normalization function is obtained from the singer's pitch time history and the teacher sound's pitch time history such that, at each point in the 5 sec following the correction start command, the accumulated pitch difference (the accumulated absolute value of the pitch difference between the two) up to that point is minimized.

[0024] In this embodiment, the teacher sound's pitch time history covers 5 sec and the pitch measurement interval is 20 msec, so it is expressed as time-series data of 250 pitches, A₁, A₂, ..., A₂₅₀. Likewise, the singer's pitch time history covers 6 sec and is expressed as time-series pitch data B₁, B₂, ..., B₃₀₀. FIG. 3 shows an example of calculating the time axis normalization function by the DP matching method. Starting from the grid point (A₁, B₁), the minimum accumulated pitch difference from this origin is obtained at every grid point as its accumulated distance; the time axis normalization function is then found by tracing, backwards in time from the grid point (A₂₅₀, B_tr) at which the pattern matching ends, the path along which the accumulated distance decreases. Here B_tr is the singer's pitch datum that corresponds to the teacher sound's pitch datum A₂₅₀.
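The accumulated-distance table and backward trace described above can be sketched as a small dynamic-programming routine. This is a minimal sketch under stated assumptions, not the patent's circuit: the three-way step pattern (horizontal, vertical, diagonal) is an assumption, since the text defers the details of DP matching to the cited journal, and the name `dp_match` and the `band` parameter are hypothetical. A band of 50 frames restricts the search to index offsets of at most 50 between the two series, which is the region spanned by end points B₂₀₀ to B₃₀₀ opposite A₂₅₀.

```python
def dp_match(teacher, singer, band=50):
    """Align two pitch series by DP matching inside a diagonal band.

    Returns (path, cost): path is a list of (teacher_index, singer_index)
    pairs from (0, 0) to the best end point on the last teacher frame,
    and cost is the accumulated absolute pitch difference along it.
    """
    n, m = len(teacher), len(singer)
    inf = float("inf")
    g = [[inf] * m for _ in range(n)]  # accumulated distance per grid point
    for i in range(n):
        for j in range(max(0, i - band), min(m, i + band + 1)):
            d = abs(teacher[i] - singer[j])
            if i == 0 and j == 0:
                g[i][j] = d
                continue
            best_prev = min(
                g[i - 1][j] if i > 0 else inf,
                g[i][j - 1] if j > 0 else inf,
                g[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            g[i][j] = d + best_prev
    # End point: the singer frame with the smallest accumulated distance
    # on the last teacher frame (the role of B_tr in the text).
    last = n - 1
    j_end = min(range(max(0, last - band), min(m, last + band + 1)),
                key=lambda j: g[last][j])
    # Trace back towards (0, 0), always stepping to the cheapest predecessor.
    path = [(last, j_end)]
    i, j = last, j_end
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((g[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((g[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((g[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path, g[last][j_end]


# A singer who holds each note twice as long as the teacher sound.
path, cost = dp_match([60, 62, 64], [60, 60, 62, 62, 64, 64], band=3)
```

Because grid points outside the band keep an infinite accumulated distance, the backward trace never leaves the band, which is one simple way to keep the resulting path inside a matching window.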

[0025] In FIG. 3, the horizontal axis is the teacher sound's pitch time history and the vertical axis is the singer's pitch time history, and the plane is made up of the coordinates called grid points mentioned above. By the DP matching method, each grid point on line segment 35, from (A₂₅₀, B₂₀₀) to (A₂₅₀, B₃₀₀), is given an accumulated distance, that is, the minimum accumulated pitch difference from the grid point (A₁, B₁). The time axis normalization function 31 is obtained by taking the grid point (A₂₅₀, B_tr) with the smallest accumulated distance as the starting point and tracing the path along which the accumulated distance decreases in the reverse time direction. In this embodiment, the singer's pitch measurement datum B₂₀₀ corresponds to 4 sec after the correction start command, and the time axis normalization function 31 shows that, although the singer started singing later than the teacher sound immediately after the correction start command, the singer finished earlier than the teacher sound at 5 sec after the command. In FIG. 3, line segment 32 has a slope of 1 and passes through (A₁, B₁) and (A₂₅₀, B₂₅₀); if the time axis normalization function coincides with line segment 32, there is no time lag, that is, no tempo difference, between the teacher sound and the singer's voice. Line segment 33 (slope 1, passing through (A₂₅₀, B₃₀₀)) and line segment 34 (slope 1, passing through (A₂₅₀, B₂₀₀)) bound a coordinate region called the time axis matching window. If the time axis normalization function leaves the window between line segments 33 and 34, it is regarded as unrealistic and the tempo difference is not corrected. In this embodiment, however, if the backward path search described above is about to leave the matching window, another grid point inside the window with the smallest accumulated distance is forcibly selected, so a time axis normalization function within the window is always obtained. In this embodiment, an approximating straight line 36 was fitted to the time axis normalization function 31 by the least squares method, and the slope of line 36 was taken as the tempo difference correction amount.
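The least-squares step can be sketched as an ordinary straight-line fit over the grid points of the alignment path, with teacher frame indices on the horizontal axis and singer frame indices on the vertical axis as in FIG. 3. The function name `fit_slope` and the representation of the path as (teacher_index, singer_index) pairs are assumptions made for this example.

```python
def fit_slope(path):
    """Least-squares slope of singer index against teacher index.

    path is a list of (teacher_index, singer_index) pairs. A slope below 1
    means the singer covers the same passage in fewer frames, i.e. sings
    faster than the teacher sound.
    """
    n = len(path)
    mean_x = sum(i for i, _ in path) / n
    mean_y = sum(j for _, j in path) / n
    sxx = sum((i - mean_x) ** 2 for i, _ in path)
    if sxx == 0:
        raise ValueError("path must span more than one teacher frame")
    sxy = sum((i - mean_x) * (j - mean_y) for i, j in path)
    return sxy / sxx


# A path that advances one singer frame per two teacher frames.
slope = fit_slope([(0, 0), (2, 1), (4, 2)])  # 0.5
```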

[0026] Next, the processing in the tempo difference correction circuit 9 in step S5 is described. Using the tempo difference correction amount obtained by the nonlinear pattern matching circuit 8, the tempo difference correction circuit 9 sets the corrected playback speed of the accompaniment to the normal playback speed divided by the tempo difference correction amount. In this embodiment, a tempo difference correction amount of 0.97 was obtained.
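As a worked example of this division, the sketch below applies the embodiment's correction amount of 0.97. The name `corrected_speed` and the allowable range `lo`/`hi` are assumptions: the text states that a limit exists but gives no numeric bounds, so 0.8 and 1.2 are placeholders, and returning the speed unchanged for an out-of-range value is one possible reading of "uncorrectable".

```python
def corrected_speed(normal_speed, tempo_correction, lo=0.8, hi=1.2):
    """Return the playback speed after tempo correction.

    The speed is the normal speed divided by the correction amount.
    lo and hi are hypothetical bounds on the correction amount; outside
    them the speed is left unchanged.
    """
    if not lo <= tempo_correction <= hi:
        return normal_speed
    return normal_speed / tempo_correction


# With the embodiment's value of 0.97, a 120 BPM accompaniment
# would play at about 123.7 BPM.
bpm = corrected_speed(120.0, 0.97)
```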

[0027] Finally, the processing in the pitch difference correction circuit 10 in step S6 is described. Following the path of the time axis normalization function 31 obtained by the nonlinear pattern matching circuit 8 and shown in FIG. 3, the pitch difference correction circuit 10 takes, for each point of the singer's pitch time history, the pitch difference from the teacher sound that corresponds to it on the time axis, averages the sum of these differences, and corrects the pitch of the accompaniment using this average pitch difference as the pitch difference correction amount. This is explained concretely with reference to FIG. 4. The upper part of FIG. 4 shows an example of the singer's pitch time history stored in the buffer memory 7, and the lower part shows an example of the teacher sound's pitch time history stored in the buffer memory 4. Following the path of the obtained time axis normalization function 31 gives a correspondence that is nonlinear in time: A₂₅₀ corresponds to B_tr, A₂₁₅ to B₂₀₅, ..., and A₁ to B₁₇. The pitch difference correction amount is calculated as the average of the pitch differences between these corresponding points.
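The averaging over corresponding points can be sketched as follows. The sign convention (singer minus teacher, so a negative value means the singer is lower) and the use of semitone-like pitch units are assumptions; the text only states that the average pitch difference along the path is used as the correction amount. The name `mean_pitch_difference` is hypothetical.

```python
def mean_pitch_difference(teacher, singer, path):
    """Average pitch difference over the aligned pairs of an alignment path.

    path is a list of (teacher_index, singer_index) pairs. The result is
    singer minus teacher, an assumed sign convention.
    """
    diffs = [singer[j] - teacher[i] for i, j in path]
    return sum(diffs) / len(diffs)


teacher = [60, 62, 64]
singer = [58, 58, 60, 60, 62, 62]  # two units lower and half as fast
path = [(0, 0), (0, 1), (1, 2), (1, 3), (2, 4)]
shift = mean_pitch_difference(teacher, singer, path)  # -2.0
```

Comparing the two series frame by frame without the path would mix different notes and give a misleading average, which is the erroneous correction the patent attributes to the prior method.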

[0028] The tempo difference and pitch difference corrections described above are applied to the digital accompaniment sound signal obtained from the musical sound information storage medium 5. From then on, the digital accompaniment signal, corrected by constant correction amounts, is converted into an analog signal by the D/A converter 11, mixed with the singer's voice signal in the mixing circuit 12, and reproduced by the speaker 13.

[0029] When the correction start command is not issued, the digital accompaniment sound signal supplied from the musical sound information storage medium 5 passes through the tempo difference correction circuit 9 and the pitch difference correction circuit 10 without being corrected, is converted into an analog signal by the D/A converter 11, is mixed with the singer's voice signal in the mixing circuit 12, and is reproduced by the speaker 13. In this embodiment, the sampling rate of the D/A converter 11 is 44.1 kHz, and the accompaniment signal immediately after D/A conversion passes through a low-pass filter with a cutoff frequency of 20 kHz.

[0030] Needless to say, even when the musical sound information storage medium 5 does not store a teacher sound signal that serves as a model of how to sing (the second input voice), the pitch time history of the accompaniment sound signal (the third input voice) can be used in place of that of the teacher sound signal to calculate the tempo difference and pitch difference relative to the singer's voice (the first input voice), so that the tempo and pitch of the accompaniment itself can be corrected.

[0031]

[Effects of the Invention] With the sound reproducing device according to the present invention, the second or third input voice can be corrected to match the tempo and pitch of the first input voice at a timing chosen by the operator. For example, when the device is applied to a karaoke machine in which the first input voice is the singer's voice and the accompaniment is the correction target, an accompaniment corrected to the singer's tempo and pitch is reproduced at the timing the singer wants, so the singer can sing comfortably and karaoke becomes more entertaining.

[0032] Moreover, the singer can instruct the karaoke machine to start the correction by operating the hand switch, which gives very good operability. In addition, because the correction amounts for the tempo difference and pitch difference are limited, karaoke can be enjoyed without the accompaniment being overcorrected and making the singer feel uncomfortable.

[Brief description of drawings]

FIG. 1 is a schematic configuration diagram of the sound reproducing device of the present invention.

FIG. 2 is a diagram showing an outline of the processing procedure of the sound reproducing device of the present invention.

FIG. 3 is an example of a DP matching calculation result produced by the nonlinear pattern matching circuit that forms part of the sound reproducing device of the present invention.

FIG. 4 is a diagram showing the pitch time history of the teacher sound and the pitch time history of the singer.

[Explanation of symbols]

1 ... Microphone
2 ... A/D converter
3, 6 ... Pitch extraction circuits
4, 7 ... Buffer memories
8 ... Nonlinear pattern matching circuit
9 ... Tempo difference correction circuit
10 ... Pitch difference correction circuit
11 ... D/A converter
12 ... Mixing circuit
13 ... Speaker

Claims

[Claims]

1. A sound reproducing device comprising: first pitch extracting means for extracting the pitch of a first input voice; second pitch extracting means for extracting the pitch of a second input voice; storage means for storing the time history of each pitch; calculating means for calculating, from the pitch time histories and by a nonlinear pattern matching method, the tempo difference and pitch difference between the first input voice and the second input voice; correcting means for correcting the tempo and pitch of the second input voice so that they approximate the tempo and pitch of the first input voice; and means for reproducing either the second input voice corrected by the correcting means or a third input voice to which the calculated tempo difference and pitch difference have been applied.

2. The sound reproducing device according to claim 1, wherein DP matching is used as the nonlinear pattern matching method, the tempo difference between the first input voice and the second input voice is calculated from the slope of the time normalization function obtained by the DP matching, and the average pitch difference calculated from the pitch time history of the second input voice corrected according to that tempo difference and the pitch time history of the first input voice is then taken as the pitch difference between the first input voice and the second input voice.

3. The sound reproducing device according to claim 2, wherein a limit is placed on the slope of the time normalization function obtained by the DP matching, so that the tempo difference of the first input voice relative to the tempo of the second input voice is not corrected beyond a set value.

4. The sound reproducing device according to claim 1, wherein a correction period is set during which the tempo and pitch of the second input voice are corrected to approximate the tempo and pitch of the first input voice, and after the correction period the second input voice or the third input voice is reproduced with the correction amounts for the tempo difference and pitch difference held constant.

5. The sound reproducing device according to claim 4, further comprising a hand switch for instructing whether the tempo and pitch of the second input voice or the third input voice are to be corrected, wherein the operator can specify the correction period with the hand switch.