JP2006301019A

JP2006301019A - Pitch-notifying device and program

Info

Publication number: JP2006301019A
Application number: JP2005118590A
Authority: JP
Inventors: Kentaro Katahira; 健太郎片平
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2005-04-15
Filing date: 2005-04-15
Publication date: 2006-11-02

Abstract

PROBLEM TO BE SOLVED: To provide a means for supporting improvement in performance ability by notifying a user of the performance of other parts that affects the user's own performance.

SOLUTION: A pitch identifying section 1013 identifies the pitch of the user's singing sounds, based on the pitch of guide sounds indicated by reference pitch data 1021. When the pitch of the singing sounds deviates from the pitch of the guide sounds, a performing part identifying section 1014 estimates the part of the accompaniment sounds causing the deviation, based on the relation between the pitch of the singing sounds identified by the pitch identifying section 1013, the pitch of the guide sounds, and the pitch of the accompaniment sounds indicated by the reference pitch data 1021. An image signal generating section 1015 generates image signals indicating the performing part estimated by the performing part identifying section 1014 and displays them on a display 13. Thus, the user can practice singing while checking the other performing parts that affect the user's singing performance.

COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a pitch notification technique for improving performance ability.

There are techniques for evaluating the singing ability of a karaoke singer. For example, Patent Document 1 discloses a music evaluation apparatus that evaluates singing ability and instrumental performance ability more accurately by performing fuzzy inference.
Patent Document 1: Japanese Patent No. 3074693

Incidentally, when a performer sings, or in particular plays a fretless instrument such as a violin, together with other performance parts, the performer may be influenced by the sounds of the other parts and produce a sound whose pitch deviates from the pitch that should be produced. With a music evaluation apparatus according to the prior art, when such an erroneous performance occurs, the performer is notified, for example, of a poor evaluation of the performance as a whole, or of an evaluation stating that the pitch is not correct. However, because the performer is not notified of the cause of the incorrect pitch, the performer cannot improve his or her performance efficiently.

In view of the above, an object of the present invention is to provide means for supporting the improvement of performance ability by notifying a performer of the performance of another performance part that is influencing the performer's own performance.

To achieve the above object, the present invention provides a pitch notification device comprising: storage means for storing reference pitch data that indicates, in time series, the pitch of the sound to be produced by one performance part among a plurality of performance parts in a piece of music played by the plurality of performance parts; input means for acquiring a sound signal representing the sound produced by the one performance part; pitch specifying means for specifying the pitch of the sound represented by the sound signal acquired by the input means; performance part specifying means for specifying another performance part that is influencing the performance of the one performance part at one timing, based on the relationship among the pitch of the sound indicated by the reference pitch data of the one performance part at the one timing, the pitch of the sound of the one performance part specified by the pitch specifying means at the one timing, and the pitch of the sound of each of the performance parts other than the one performance part at the one timing; and output means for outputting data indicating the performance part specified by the performance part specifying means or the pitch of the sound produced by that performance part.

With a pitch notification device configured in this way, the user can know which of the other performance parts is influencing his or her performance.

In a preferred aspect, the reference pitch data stored in the storage means is data indicating, in time series, the pitch of the sound to be produced by each of the plurality of performance parts, and the performance part specifying means, when specifying the other performance part, uses the pitch of the sound indicated by the reference pitch data of each performance part other than the one performance part at one timing as the pitch of the sound of each performance part other than the one performance part at that timing.

In another preferred aspect, the input means acquires sound signals representing the sounds produced by each of the plurality of performance parts, the pitch specifying means specifies the pitch of each of the sounds produced by the plurality of performance parts as represented by the sound signals acquired by the input means, and the performance part specifying means, when specifying the other performance part, uses the pitch of the sound of each performance part other than the one performance part specified by the pitch specifying means at one timing as the pitch of the sound of each performance part other than the one performance part at that timing.

In another preferred aspect, the storage means stores actual performance pitch data indicating, in time series, the pitches specified by the pitch specifying means, and the performance part specifying means specifies the other performance part based on the relationship among: the pitch of the sound of the one performance part specified by the pitch specifying means at one timing in the performance currently in progress; the pitch of the sound indicated by the actual performance pitch data of the one performance part at the same timing in a past performance; and the pitch of the sound produced, at the same timing in the current performance, by a performance part that participates in the current performance but did not participate in the past performance.

In another preferred aspect, the storage means stores actual performance pitch data indicating, in time series, the pitches specified by the pitch specifying means, and the performance part specifying means specifies the other performance part based on the relationship among: the pitch of the sound of the one performance part specified by the pitch specifying means at one timing in the performance currently in progress; the pitch of the sound indicated by the actual performance pitch data of the one performance part at the same timing in a past performance; and the pitch of the sound produced, at the same timing in the past performance, by a performance part that participated in the past performance but does not participate in the current performance.

In another preferred aspect, the storage means stores actual performance pitch data indicating, in time series, the pitches specified by the pitch specifying means, and the performance part specifying means specifies the other performance part based on the correlation between (a) the difference between the pitch indicated by the reference pitch data of the one performance part and the pitch indicated by the actual performance pitch data of the one performance part, and (b) the difference between the pitch indicated by the reference pitch data of the one performance part and the pitch of the sound of each performance part other than the one performance part.

In another preferred aspect, the pitch specifying means calculates the value of a function that includes, as a variable, the difference between a candidate for the pitch of the sound represented by the sound signal acquired by the input means and the pitch indicated by the reference pitch data of the one performance part, and specifies, based on the value of the function, the pitch of the sound produced by the one performance part at the one timing.

In the above preferred aspect, the input means may be a plurality of input means each provided in association with one of the plurality of performance parts, and the function may include, as a variable, a numerical value indicating, for a given frequency, the difference between the amplitude of the frequency component at that frequency of the sound signal acquired by the input means corresponding to the one performance part and the amplitude of the frequency component at that frequency of the sound signal acquired by at least one of the input means corresponding to the performance parts other than the one performance part.

The present invention also provides a program that causes a computer to execute the processing performed by the above pitch notification device.

[1. First Embodiment]
A first embodiment of the present invention is described below, taking as an example a case in which a singer (hereinafter referred to as the "user") practices singing along with an automatically played accompaniment. FIG. 1 is a diagram showing the configuration of a performance training system 1 according to the first embodiment. The performance training system 1 includes a pitch notification device 10, which outputs a sound signal representing the accompaniment sound of a piece of music and outputs an image signal for displaying a graph of the pitch of the user's singing sound and the pitch of the model singing sound, together with a speaker 11, a microphone 12, a display 13, and a keyboard 14 connected to the pitch notification device 10.

In the following description, pitch is expressed as a frequency in units of [Hz]. The pitch of a sound is therefore the fundamental frequency of the waveform of that sound. However, the numerical value indicating pitch is not limited to a frequency; for example, it may be expressed as a value converted so that one semitone corresponds to 100. In that case, the unit is [cent].
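The relationship between the two units can be sketched as follows. This is a minimal illustration rather than anything specified in the text: the function names and the choice of A4 = 440 Hz as the zero point of the cent scale are assumptions; the description only states that one semitone corresponds to 100.

```python
import math

A4_HZ = 440.0  # assumed zero point of the cent scale; not given in the text


def hz_to_cents(freq_hz, ref_hz=A4_HZ):
    """Interval from ref_hz up to freq_hz in cents (100 cents per semitone)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)


def cents_to_hz(cents, ref_hz=A4_HZ):
    """Inverse of hz_to_cents."""
    return ref_hz * 2.0 ** (cents / 1200.0)
```

One octave is 1200 cents on this scale, so a deviation of a few cents between a sung pitch and a reference pitch has the same musical meaning regardless of register, which is not true of a deviation measured in Hz.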

The speaker 11 receives a sound signal from the pitch notification device 10, converts the received sound signal into sound, and emits it. The microphone 12 collects ambient sound and outputs a sound signal representing the collected sound to the pitch notification device 10. The display 13 receives an image signal from the pitch notification device 10 and displays a screen containing graphics and characters according to the received image signal. The keyboard 14 has a plurality of keys and transmits signals corresponding to the user's key operations to the pitch notification device 10.

The pitch notification device 10 includes: a control unit 101 that controls the components of the pitch notification device 10; a storage unit 102 that stores a program instructing the various processes of the control unit 101 and various data used by the control unit 101, and that also serves as a work area for the control unit 101; an input/output interface 103 through which the pitch notification device 10 exchanges signals with external devices; and an oscillator 104 that generates a clock signal at predetermined time intervals and delivers it to the control unit 101. The clock signal generated by the oscillator 104 is used to synchronize processing among the components of the pitch notification device 10 and to measure time, such as the elapsed time from the beginning of the music.

The storage unit 102 stores in advance reference pitch data 1021 indicating the performance information of a piece of music. FIG. 2 is a diagram illustrating the contents of the reference pitch data 1021. For one piece of music, the reference pitch data 1021 contains data such as the pitch of the sound to be produced by each of the performance parts playing the music and the lyrics to be displayed to the user, together with data indicating the timing at which the processing related to that data should be executed. The example of FIG. 2 shows part of the contents of the reference pitch data 1021 for a piece of music played by a vocal part, a flute part, and a piano part.

Each row of data in the reference pitch data 1021 (hereinafter referred to as "event data with timing") has the items "timing", "type", and "content". "Timing" indicates a timing in the music, "type" indicates the type of processing to be executed at that timing, and "content" indicates the content of the processing to be executed.

For example, the event data with timing [03:01:000 / lyrics display / 「どうしてないているの」] illustrated in FIG. 2 indicates that the lyrics 「どうしてないているの」 ("Why are you crying?") should be displayed at tick "000" of the first beat of the third measure of the music. A "tick" is a time interval obtained by dividing the length of one beat into a predetermined number of parts; a specific timing in the music is indicated by expressing the elapsed time from the beginning of a beat as a number of ticks. Thus, for example, [03:01:000] indicates the start of the first beat of the third measure. In the following, one tick is assumed to be 1/480 of a beat as an example.
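The measure:beat:tick notation can be turned into a single tick count, which makes timings easy to compare. The sketch below assumes ASCII digits, measures and beats counted from 1, and four beats per measure; the time signature is not stated in the text, so `beats_per_measure` is a hypothetical parameter, as are the function names.

```python
TICKS_PER_BEAT = 480  # from the text: one tick is 1/480 of a beat


def parse_timing(text):
    """Split 'MM:BB:TTT' into integer (measure, beat, tick)."""
    measure, beat, tick = (int(part) for part in text.split(":"))
    return measure, beat, tick


def to_absolute_ticks(text, beats_per_measure=4):
    """Ticks elapsed since the start of measure 1, beat 1.

    beats_per_measure is an assumption; the text does not give a meter.
    """
    measure, beat, tick = parse_timing(text)
    beats = (measure - 1) * beats_per_measure + (beat - 1)
    return beats * TICKS_PER_BEAT + tick
```

Under these assumptions a timing in measure 0, such as [00:01:000], maps to a negative tick count, which is one way of representing an event that takes effect before the music starts.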

The event data with timing [00:01:000 / tone color / flute] illustrated in FIG. 2 indicates that the tone color of the sound produced in that performance part is "flute". Here, the timing [00:01:000] indicates a point before or at the start of the music.

The event data with timing [03:01:240 / note-on / C4 (72)] indicates that a sound of pitch "C4" should be produced with intensity "72" at tick "240" of the first beat of the third measure of the music. The event data with timing [03:01:420 / note-off / C4] indicates that the sound of pitch "C4" should be silenced (sounding stopped) at tick "420" of the first beat of the third measure.

As illustrated by the event data with timing [03:01:240 / note-on / C4 (72)], the reference pitch data 1021 expresses the pitch of the sound to be produced not as a frequency but as a pitch name such as "C4". This is to improve the readability of the data; data that directly indicates the pitch, such as the frequency "261.626" (Hz), may be used instead of the pitch name "C4".

The format of the reference pitch data 1021 is not limited to that illustrated in FIG. 2; for example, it may conform to the SMF (Standard Musical Instrument Digital Interface File, i.e., Standard MIDI File) format.

The control unit 101 includes a playback unit 1011 that issues music playback instructions according to the reference pitch data 1021. On receiving an instruction to start playback of the music through the user's operation of the keyboard 14, the playback unit 1011 continuously measures the elapsed time from the start of the music by counting the clock signals received from the oscillator 104. When the timing indicated in the "timing" column of an item of event data with timing in the reference pitch data 1021 arrives, the playback unit 1011 adds data indicating the performance part to the data in the "type" and "content" columns of that item and delivers the result to another component of the control unit 101 according to the contents of the "type" column. Hereinafter, the data consisting of the "performance part", "type", and "content" columns that the playback unit 1011 delivers to other components is referred to as "event data". Event data has a form such as [piano / note-on / D4 (63)].
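The step in which due items are tagged with their performance part and handed on can be sketched as below. This is an illustration only: the record types, the per-part dictionary layout, and the use of absolute tick counts as timings are assumptions, not the data layout of FIG. 2.

```python
from collections import namedtuple

# Hypothetical record types; field names follow the column names in the text.
TimedEvent = namedtuple("TimedEvent", "tick type content")
EventData = namedtuple("EventData", "part type content")


def due_events(reference_data, prev_tick, now_tick):
    """Return EventData for items whose timing falls in (prev_tick, now_tick].

    reference_data maps a part name to its list of TimedEvent items.
    """
    due = []
    for part, events in reference_data.items():
        for ev in events:
            if prev_tick < ev.tick <= now_tick:
                due.append((ev.tick, EventData(part, ev.type, ev.content)))
    due.sort(key=lambda pair: pair[0])  # deliver in timing order
    return [data for _, data in due]
```

Calling `due_events` once per clock update with the previous and current tick counts yields each item exactly once, after which a dispatcher could route it by its `type` field.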

The control unit 101 includes a sound source unit 1012 that generates, according to the event data received from the playback unit 1011, a sound signal representing a sound of the specified tone color and the specified pitch. The sound source unit 1012 can simultaneously generate sound signals representing sounds of different tone colors, pitches, and intensities using, for example, the FM (Frequency Modulation) method. However, the method of the sound source unit 1012 is not limited to FM; any other method, such as PCM (Pulse Code Modulation) or physical modeling, may be used. The sound source unit 1012 receives from the playback unit 1011 event data whose "type" column is "tone color", "note-on", or "note-off". According to the received event data, the sound source unit 1012 generates, for each performance part, a sound signal representing the specified sound and delivers the generated sound signal to the input/output interface 103.

The input/output interface 103 includes a sound signal output unit 1031 that outputs sound signals to an external device. The sound signal output unit 1031 receives a sound signal from the sound source unit 1012 and outputs it to the speaker 11. On receiving the sound signal from the sound signal output unit 1031, the speaker 11 converts it into sound and emits it. As a result, the user can hear the performance indicated by the reference pitch data 1021. Here, the performance sound of the vocal part is a guide sound indicating the pitch that the user should sing.

Before the performance of the music starts, the user can select which performance parts are to be sounded from the speaker 11. For example, when the user instructs that the performance be carried out with the vocal and flute sounds muted, the playback unit 1011 delivers event data to the sound source unit 1012 according to only the piano part data in the reference pitch data 1021. As a result, the guide sound of the vocal part and the accompaniment sound of the flute part are not sounded from the speaker 11.

The input/output interface 103 includes a sound signal input unit 1032 that receives sound signals from an external device. When the user instructs the pitch notification device 10 to start the performance and sings along with the accompaniment sound and guide sound of the music emitted from the speaker 11, the microphone 12 collects the ambient sound, including the user's singing sound, and outputs a sound signal representing the collected sound to the sound signal input unit 1032. On receiving the sound signal from the microphone 12, the sound signal input unit 1032 stores it in the storage unit 102 as actual performance waveform data 1022.

The control unit 101 includes a pitch specifying unit 1013 that specifies the pitch of the user's singing sound in near real time. From the actual performance waveform data 1022 that is stored sequentially in the storage unit 102, the pitch specifying unit 1013 extracts the most recently stored waveform data of a predetermined duration. Hereinafter, waveform data of this predetermined duration is referred to as a "frame". A frame must be long enough for the pitch to be specified, yet short enough that the time lag between the specified pitch and the pitch of the sound currently being produced is acceptable to the user. In the following, the length of one frame is assumed to be 100 milliseconds as an example.

Various methods have been proposed for specifying the pitch of a sound from its waveform data. In the following, as an example, the pitch specifying unit 1013 specifies the pitch based on the power spectrum of the waveform data of the sound. However, the pitch specifying unit 1013 may use any existing method to specify the pitch, provided that the method can specify the pitch in near real time.

The pitch specifying unit 1013 applies processing such as a fast Fourier transform to one frame extracted from the actual performance waveform data 1022 and calculates the distribution of the frequency components contained in the waveform represented by that frame. The pitch specifying unit 1013 then obtains a graph showing the envelope of the graph of the calculated frequency component distribution. FIG. 3 is a diagram illustrating a graph of the envelope of the frequency component distribution calculated by the pitch specifying unit 1013. In FIG. 3, the frequency band denoted W is the range over which the center frequency of a typical human voice is distributed.
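The frequency analysis of a single frame can be sketched as follows. To keep the example self-contained it uses a direct discrete Fourier transform instead of the fast Fourier transform named in the text (the two give the same magnitudes; the direct form is only slower), and it omits the envelope step. The function name and the `max_hz` cutoff are assumptions for illustration.

```python
import cmath
import math


def magnitude_spectrum(frame, sample_rate, max_hz):
    """Return (frequency_hz, magnitude) pairs for DFT bins up to max_hz.

    frame is a list of samples; the bin spacing is sample_rate / len(frame).
    """
    n = len(frame)
    bin_hz = sample_rate / n
    spectrum = []
    for k in range(int(max_hz / bin_hz) + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        spectrum.append((k * bin_hz, abs(acc) / n))
    return spectrum
```

With a 100 ms frame the bin spacing is 10 Hz whatever the sample rate, which illustrates the trade-off the text describes: a longer frame resolves pitch more finely but lags further behind the sound being sung.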

According to the distribution of frequency components shown in the graph of FIG. 3, the amplitude of the frequency component takes a local maximum at six frequencies, ω_1 to ω_6. A frequency at which the amplitude takes a local maximum in this way is a candidate for the fundamental frequency of the sound represented by the sound waveform. However, a sound normally contains not only the component at its fundamental frequency but also many harmonic components of that fundamental frequency. In addition, in the performance training system 1, the sound collected by the microphone 12 contains, mixed with the user's singing sound, the accompaniment sound and guide sound generated by the sound source unit 1012 of the pitch notification device 10 and emitted from the speaker 11. Consequently, it is not easy to determine which of the frequencies ω_1 to ω_6 shown in FIG. 3 is the fundamental frequency of the user's singing sound.
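Finding the candidate frequencies amounts to locating local maxima of the spectrum. A minimal sketch, assuming the spectrum is given as a list of (frequency, amplitude) pairs in ascending frequency order; the function name and the strict-inequality test are choices made for illustration.

```python
def local_maxima(spectrum):
    """Return the (frequency, amplitude) points larger than both neighbours."""
    peaks = []
    for i in range(1, len(spectrum) - 1):
        _, left = spectrum[i - 1]
        freq, amp = spectrum[i]
        _, right = spectrum[i + 1]
        if amp > left and amp > right:
            peaks.append((freq, amp))
    return peaks
```

Every peak returned this way is only a candidate: it may be the singer's fundamental, one of its harmonics, or a component of the accompaniment picked up by the microphone.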

Therefore, the pitch specifying unit 1013 receives from the playback unit 1011 event data whose performance part is "vocal" and whose type is "note-on", and uses the pitch indicated by the received event data, that is, the pitch of the singing sound that the vocal part should be producing at that time, to determine which fundamental frequency candidate is the correct fundamental frequency.

Specifically, the pitch specifying unit 1013 converts into a frequency the pitch name shown in the "content" column of the event data most recently received from the playback unit 1011 whose performance part is "vocal" and whose type is "note-on". For example, if the pitch name indicated by the event data is "A4", the pitch specifying unit 1013 obtains "440.000" as the frequency corresponding to that pitch name. Hereinafter, the frequency obtained by this conversion is denoted ω_0. The pitch indicated by the event data is hereinafter referred to as the "reference pitch".
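The conversion from a pitch name to a frequency can be sketched as below, assuming twelve-tone equal temperament with A4 = 440 Hz. The text gives only the example values (A4 as 440.000 Hz and C4 as 261.626 Hz), which this tuning reproduces; the function name and the accepted name syntax are assumptions.

```python
import re

# Semitone offsets of the natural notes within an octave starting at C.
SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_name_to_hz(name, a4_hz=440.0):
    """Convert a name such as 'C4' or 'F#3' to Hz (equal temperament assumed)."""
    m = re.fullmatch(r"([A-G])([#b]?)(-?\d+)", name)
    if m is None:
        raise ValueError("unrecognised pitch name: " + name)
    letter, accidental, octave = m.groups()
    semitone = SEMITONE[letter] + {"#": 1, "b": -1, "": 0}[accidental]
    note_number = 12 * (int(octave) + 1) + semitone  # C4 -> 60, A4 -> 69
    return a4_hz * 2.0 ** ((note_number - 69) / 12)
```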

Subsequently, the pitch specifying unit 1013 extracts, from the frequencies ω_1 to ω_6 shown in FIG. 3, those included in the frequency band W. In this example, the frequencies ω_2 and ω_3 are extracted. The pitch specifying unit 1013 then calculates, for each extracted frequency, the value of a function such as that shown in (Equation 1) below.

Here, s(ω) in (Equation 1) denotes the amplitude of the frequency component at frequency ω. k is an arbitrary natural number, and ω_k denotes a candidate for the fundamental frequency of the user's singing sound. In the example of FIG. 3, k = 2 or k = 3. The first term on the right side of (Equation 1) represents the amplitude of the frequency component at the fundamental frequency candidate. The second term on the right side represents the difference between the frequency of the reference pitch indicated by the reference pitch data 1021 and the fundamental frequency candidate; the difference is squared so that the term always takes a value of 0 or more. c in (Equation 1) is a variable that determines the coefficient indicating how much the amplitude of the frequency component at the fundamental frequency candidate contributes to the function f(ω_k); an appropriate value in the range from 0 to 1 is chosen empirically.

（式１）により示される関数ｆ（ω_ｋ）の値は、基本周波数の候補における周波数成分の振幅に対し、基準ピッチデータ１０２１により示される基準音高の周波数と基本周波数の候補との差に基づくバイアスを加えた値である。ピッチ特定部１０１３はｋ＝２およびｋ＝３の各々に関し関数ｆ（ω_ｋ）の値を算出し、それらの値が最大である場合のω_ｋを、ユーザが現在発音している歌唱音の基本周波数として特定する。 The value of the function f(ω_k) given by (Equation 1) is the amplitude of the frequency component at the fundamental frequency candidate, biased according to the difference between the frequency of the reference pitch indicated by the reference pitch data 1021 and the fundamental frequency candidate. The pitch specifying unit 1013 calculates the value of f(ω_k) for each of k = 2 and k = 3, and identifies the ω_k that yields the largest value as the fundamental frequency of the singing sound the user is currently producing.

従来技術により単純に周波数成分が最大となる周波数を音波形の基本周波数として選択する場合、例えば図３において、ω_３が歌唱音の正しい基本周波数であり、ω_２はフルートパートによる伴奏音の基本周波数であったような場合、より振幅の大きいω_２が誤って歌唱音の基本周波数として特定される。これに対し、ピッチ特定部１０１３が用いる上記（式１）のような関数値においては、基準音高の周波数から大きく外れた基本周波数の候補に関しては、第２項の絶対値が大きくなる結果、関数ｆ（ω_ｋ）の値が小さくなり、最終的に基本周波数として選択されることがなく、正しい基本周波数の特定が行われる可能性が高まる。 When the frequency having the maximum frequency component is simply selected as the fundamental frequency of the sound waveform according to the prior art, for example, in FIG. 3, ω ₃ is the correct fundamental frequency of the singing sound, and ω ₂ is the basic accompaniment sound by the flute part. If it is a frequency, ω ₂ having a larger amplitude is erroneously specified as the fundamental frequency of the singing sound. On the other hand, in the function value as in the above (formula 1) used by the pitch specifying unit 1013, as a result of the absolute value of the second term being increased with respect to the fundamental frequency candidate greatly deviating from the frequency of the reference pitch, The value of the function f (ω _k ) becomes small and is not finally selected as the fundamental frequency, and the possibility that the correct fundamental frequency is specified increases.

ただし、上記（式１）はピッチ特定部１０１３が歌唱音の基本周波数を特定するために用いる関数式の例示であって、他にも様々な関数式が利用可能であることは言うまでもない。要すれば、基準音高の周波数と基本周波数の候補との差を反映させることにより、複数の基本周波数の候補から正しい基本周波数を特定する方法であれば、如何なる方法であってもピッチ特定部１０１３がピッチを特定するための方法として採用可能である。 Note that (Equation 1) above is merely one example of a function that the pitch specifying unit 1013 may use to identify the fundamental frequency of the singing sound, and it goes without saying that various other functions can be used. In short, the pitch specifying unit 1013 may adopt any method that identifies the correct fundamental frequency from among a plurality of fundamental frequency candidates by taking into account the difference between the frequency of the reference pitch and each candidate.
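The selection of a fundamental frequency biased toward the reference pitch can be illustrated with a short sketch. The exact form of (Equation 1) is not reproduced in this text, so the score below is an assumption that merely matches the verbal description (an amplitude term, a squared-difference term, and a weight c between 0 and 1). The function names and the sample peak values are hypothetical.

```python
# Sketch of reference-biased fundamental-frequency selection.
# ASSUMPTION: the score has the form
#     f(w_k) = c * s(w_k) - (1 - c) * (w_k - w_0) ** 2
# which follows the verbal description but is not taken from the source equation.

def score(amplitude: float, omega_k: float, omega_0: float, c: float) -> float:
    return c * amplitude - (1.0 - c) * (omega_k - omega_0) ** 2

def pick_fundamental(candidates, omega_0: float, c: float = 0.5) -> float:
    """candidates: iterable of (frequency_hz, amplitude) pairs inside band W."""
    best_freq, _ = max(candidates, key=lambda fa: score(fa[1], fa[0], omega_0, c))
    return best_freq

# Hypothetical spectrum peaks: the louder peak (392 Hz) is an accompaniment
# note, the quieter one (436 Hz) is the singer, and the guide note is 440 Hz.
peaks = [(392.0, 1.0), (436.0, 0.8)]
loudest = max(peaks, key=lambda fa: fa[1])[0]
chosen = pick_fundamental(peaks, omega_0=440.0)
print(loudest, chosen)  # 392.0 436.0
```

A plain maximum-amplitude rule picks the accompaniment peak, while the biased score picks the candidate nearer the reference pitch. In a real system the two terms would likely need compatible scales (for example, normalized amplitudes), which this sketch does not address.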

また、ピッチ特定部１０１３が周波数分布の振幅が極値をとる場合の周波数を基本周波数の候補として用いる代わりに、例えば周波数帯Ｗに含まれる周波数を１Ｈｚごとに順次取り出して基本周波数の候補とし、それらの基本周波数の候補について（式１）の関数（ω_ｋ）の値を算出し、算出した値が最大となるω_ｋを基本周波数として特定するようにしてもよい。 Further, instead of using the frequencies at which the amplitude of the frequency distribution takes an extreme value as the fundamental frequency candidates, the pitch specifying unit 1013 may, for example, take the frequencies included in the frequency band W one by one at 1 Hz intervals as fundamental frequency candidates, calculate the value of the function f(ω_k) of (Equation 1) for each of them, and identify the ω_k that maximizes the calculated value as the fundamental frequency.

ピッチ特定部１０１３は、上記の基本周波数の特定処理を、例えば１０ミリ秒ごとに行い、その結果を示すデータをその時点の楽曲におけるタイミングを示すデータに対応付けて、順次、記憶部１０２に記憶する。ピッチ特定部１０１３は演奏が終了すると、その演奏において演奏音の発音が行われていた演奏パートを示すデータを、順次記憶した基本周波数およびタイミングを示すデータの集まりに対応付けて、実演奏ピッチデータ１０２３として記憶部１０２に記憶する。図４は実演奏ピッチデータ１０２３の内容を例示した図である。なお、ユーザにより演奏トレーニングシステム１を用いた歌唱練習が複数回行われた場合、その各々の歌唱練習に対応する実演奏ピッチデータ１０２３が記憶部１０２に記憶されることになる。 The pitch specifying unit 1013 performs the above fundamental frequency identification process, for example, every 10 milliseconds, and sequentially stores data indicating the result in the storage unit 102 in association with data indicating the corresponding timing in the music. When the performance ends, the pitch specifying unit 1013 associates data indicating the performance parts that were sounding during the performance with the accumulated set of fundamental frequency and timing data, and stores the result in the storage unit 102 as actual performance pitch data 1023. FIG. 4 illustrates the contents of the actual performance pitch data 1023. When the user has practiced singing with the performance training system 1 multiple times, actual performance pitch data 1023 corresponding to each practice session is stored in the storage unit 102.

制御部１０１は、ユーザの歌唱音に影響を与えている他の演奏パートを特定する演奏パート特定部１０１４を備えている。演奏パート特定部１０１４の処理は、過去の演奏に関する実演奏ピッチデータ１０２３が記憶部１０２に記憶されているか否か、記憶されている場合にいずれの演奏パートによる演奏に関する実演奏ピッチデータ１０２３が記憶されているか、現在の演奏において発音している演奏パートはいずれであるか、に応じて異なる。以下、それらのバリエーションごとに演奏パート特定部１０１４の処理を説明する。 The control unit 101 includes a performance part specifying unit 1014 that specifies another performance part that affects the user's singing sound. The performance part specifying unit 1014 processes whether or not the actual performance pitch data 1023 related to the past performance is stored in the storage unit 102 and, if stored, the actual performance pitch data 1023 related to the performance by any performance part is stored. It depends on which part is being played or which part of the current performance is sounding. Hereinafter, the processing of the performance part specifying unit 1014 will be described for each variation.

（ケース１：過去の演奏に関する実演奏ピッチデータがなく、発音中の伴奏パートが単声の場合）
まず、記憶部１０２にまだ過去の演奏に関する実演奏ピッチデータ１０２３が記憶されておらず、ボーカルパート以外の演奏パート、すなわち伴奏パートのうち発音を行っているものが１つであり、かつフルートのように単声楽器である場合、演奏パート特定部１０１４は再生部１０１１からボーカルパートおよび発音中の伴奏パートに関する種別「ノートオン」のイベントデータを受け取る。また、演奏パート特定部１０１４はピッチ特定部１０１３からピッチ特定部１０１３により特定された現時点の歌唱音のピッチを示すデータ（以下、「現時点実演奏ピッチデータ」と呼ぶ）を順次受け取る。 (Case 1: There is no actual performance pitch data related to past performances, and the accompaniment part being pronounced is a single voice)
First, suppose that no actual performance pitch data 1023 relating to past performances is yet stored in the storage unit 102, that exactly one performance part other than the vocal part (that is, one accompaniment part) is sounding, and that it is a monophonic instrument such as a flute. In this case, the performance part specifying unit 1014 receives from the playback unit 1011 the event data of type “note on” for the vocal part and for the sounding accompaniment part. The performance part specifying unit 1014 also sequentially receives, from the pitch specifying unit 1013, data indicating the current pitch of the singing sound identified by the pitch specifying unit 1013 (hereinafter referred to as “current actual performance pitch data”).

演奏パート特定部１０１４は、それらのデータにより示されるガイド音のピッチ、伴奏音のピッチおよびユーザの歌唱音のピッチの高低関係に基づき、演奏中において伴奏音により歌唱音が影響を受けている箇所を特定する。以下、具体例を用いてその処理を説明する。今、演奏中のあるタイミングにおいて、演奏パート特定部１０１４が最後に受け取った各演奏パートに関するイベントデータおよび現時点実演奏ピッチデータが以下のとおりであったとする。
（ａ）イベントデータ［ボーカル／ノートオン／Ｃ４］
（ｂ）イベントデータ［フルート／ノートオン／Ａ５］
（ｃ）現時点実演奏ピッチデータ［２５８．４１５］ Based on the relative heights of the guide sound pitch, the accompaniment sound pitch, and the user's singing sound pitch indicated by these data, the performance part specifying unit 1014 identifies the points in the performance at which the singing sound is affected by the accompaniment sound. The process is described below using a specific example. Suppose that, at a certain timing during the performance, the event data most recently received by the performance part specifying unit 1014 for each performance part and the current actual performance pitch data are as follows.
(a) Event data [vocal / note on / C4]
(b) Event data [flute / note on / A5]
(c) Current actual performance pitch data [258.415]

ここで、上記（ａ）はガイド音の音高を示し、上記（ｂ）は伴奏音の音高を示している。演奏パート特定部１０１４は、伴奏音の音高がガイド音の音高の前後にまたがる１オクターブの音域に入るように、伴奏音の音高をオクターブ単位でシフトさせる。上記の例の場合、ガイド音の音高は［Ｃ４］であり、伴奏音の音高は［Ａ５］であるため、演奏パート特定部１０１４は伴奏音の音高を［Ａ４］にシフトダウンする。このように、伴奏音の音高をピッチシフトさせるのは、楽曲において音の絶対的な高低よりも音階上の高低の方が楽曲においては意味を有するためである。以下、ガイド音に伴奏音の音高を近づける音高のシフト処理を「音高シフト処理」と呼ぶ。 Here, (a) above indicates the pitch of the guide sound, and (b) above indicates the pitch of the accompaniment sound. The performance part specifying unit 1014 shifts the pitch of the accompaniment sound in octave units so that it falls within a one-octave range straddling the pitch of the guide sound. In the above example, the pitch of the guide sound is [C4] and the pitch of the accompaniment sound is [A5], so the performance part specifying unit 1014 shifts the pitch of the accompaniment sound down to [A4]. The accompaniment pitch is shifted in this way because, in music, the position of a note within the scale is more meaningful than its absolute height. Hereinafter, this process of shifting the pitch of the accompaniment sound toward the guide sound is referred to as the “pitch shift process”.
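The octave-wise shift can be sketched in the frequency domain as follows. This is only an illustration: it assumes that “a one-octave range straddling the guide pitch” means within half an octave above or below the guide frequency, and the function name `shift_into_guide_octave` is hypothetical. Under that reading, an A5 accompaniment note (880 Hz when A4 = 440 Hz) lands on 220.000 Hz, which is the accompaniment frequency used in the worked example in this description.

```python
import math

def shift_into_guide_octave(accomp_hz: float, guide_hz: float) -> float:
    """Move accomp_hz by whole octaves until it lies within half an octave
    of guide_hz (ASSUMED reading of the one-octave range around the guide)."""
    upper = guide_hz * math.sqrt(2.0)   # half an octave above the guide
    lower = guide_hz / math.sqrt(2.0)   # half an octave below the guide
    f = accomp_hz
    while f >= upper:
        f /= 2.0   # down one octave
    while f < lower:
        f *= 2.0   # up one octave
    return f

# Guide C4 (261.626 Hz) and accompaniment A5 (880 Hz, assuming A4 = 440 Hz).
print(shift_into_guide_octave(880.0, 261.626))  # 220.0
```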

続いて、演奏パート特定部１０１４はガイド音の音高およびピッチシフト後の伴奏音（以下、単に「伴奏音」と呼ぶ）の音高を、対応する周波数に変換する。例えば、音高［Ｃ４］は周波数［２６１．６２６］Ｈｚに、音高［Ａ４］は周波数［２２０．０００］Ｈｚに各々変換される。以下、音高名により示される音高を周波数に変換する処理を「音高変換処理」と呼ぶ。演奏パート特定部１０１４は音高変換処理により、ガイド音の周波数［２６１．６２６］Ｈｚ、伴奏音の周波数［２２０．０００］Ｈｚおよびユーザの歌唱音の周波数［２５８．４１５］Ｈｚを得る。以下、それらの値をｆ_ａ、ｆ_ｂおよびｆ_ｃと呼ぶ。 Subsequently, the performance part specifying unit 1014 converts the pitch of the guide sound and the pitch of the accompaniment sound after the pitch shift (hereinafter simply referred to as “accompaniment sound”) into a corresponding frequency. For example, pitch [C4] is converted to frequency [261.626] Hz, and pitch [A4] is converted to frequency [220.000] Hz. Hereinafter, the process of converting the pitch indicated by the pitch name into a frequency is referred to as “pitch conversion process”. The performance part specifying unit 1014 obtains the guide sound frequency [261.626] Hz, the accompaniment sound frequency [220.000] Hz, and the user singing sound frequency [258.415] Hz by pitch conversion processing. Hereinafter, these values are referred to as f _a , f _b, and f _c .

続いて、演奏パート特定部１０１４は歌唱音のガイド音からのずれの指標として、ｄ＝（｜ｆ_ｃ−ｆ_ａ｜）／ｆ_ａ×１００（％）を算出する。演奏パート特定部１０１４はこのように算出したずれの指標が所定の閾値、例えば１．００（％）以上である場合、歌唱音が修正を要する程度にガイド音からずれていると判定する。以下、指標ｄにより歌唱音が修正を要する程度にずれていることを検出する処理を「ずれ検出処理」と呼ぶ。 Subsequently, the performance part specifying unit 1014 calculates d = (|f_c − f_a|) / f_a × 100 (%) as an index of the deviation of the singing sound from the guide sound. When the deviation index thus calculated is equal to or greater than a predetermined threshold, for example 1.00 (%), the performance part specifying unit 1014 determines that the singing sound deviates from the guide sound to an extent that requires correction. Hereinafter, the process of detecting, by means of the index d, that the singing sound deviates to an extent requiring correction is referred to as the “deviation detection process”.

ずれ検出処理において修正を要するずれを検出した場合、演奏パート特定部１０１４は伴奏音がそのずれに影響を与えているか否かを判断する。具体的には、演奏パート特定部１０１４は伴奏音のピッチがガイド音のピッチから見て歌唱音のピッチと同じ側にあり、かつ歌唱音のピッチよりも離れている場合、歌唱音のずれが伴奏音に影響されているものと判定する。 When a deviation requiring correction is detected in the deviation detection process, the performance part specifying unit 1014 determines whether the accompaniment sound is contributing to that deviation. Specifically, when the pitch of the accompaniment sound lies on the same side of the guide sound pitch as the pitch of the singing sound, and is farther from the guide sound pitch than the singing sound pitch is, the performance part specifying unit 1014 determines that the deviation of the singing sound is influenced by the accompaniment sound.

例えば上記の例の場合、ずれの指標はｄ＝約１．２３となり、所定の閾値１．００（％）以上である。また、ガイド音の周波数（ｆ_ａ＝２６１．６２６）からみて伴奏音の周波数（ｆ_ｂ＝２２０．０００）が歌唱音の周波数（ｆ_ｃ＝２５８．４１５）と同じ側にあり、かつガイド音の周波数と伴奏音の周波数との差の絶対値（｜ｆ_ａ−ｆ_ｂ｜＝４１．６２６）が、ガイド音の周波数と歌唱音の周波数との差の絶対値（｜ｆ_ａ−ｆ_ｃ｜＝３．２１１）よりも大きい。従って、演奏パート特定部１０１４は歌唱音のずれが伴奏音により影響されている、と推定する。以下、ある伴奏音の周波数、ガイド音の周波数および歌唱音の周波数の関係に基づきその伴奏音が歌唱音のずれに影響を与えていると推定する処理を「影響推定処理」と呼ぶ。 For example, in the above case the deviation index is d ≈ 1.23, which is equal to or greater than the predetermined threshold of 1.00 (%). In addition, viewed from the frequency of the guide sound (f_a = 261.626), the frequency of the accompaniment sound (f_b = 220.000) lies on the same side as the frequency of the singing sound (f_c = 258.415), and the absolute value of the difference between the guide sound frequency and the accompaniment sound frequency (|f_a − f_b| = 41.626) is larger than the absolute value of the difference between the guide sound frequency and the singing sound frequency (|f_a − f_c| = 3.211). Accordingly, the performance part specifying unit 1014 estimates that the deviation of the singing sound is influenced by the accompaniment sound. Hereinafter, the process of estimating, from the relationship among the frequency of an accompaniment sound, the frequency of the guide sound, and the frequency of the singing sound, that the accompaniment sound is influencing the deviation of the singing sound is referred to as the “influence estimation process”.
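The deviation detection process and the influence estimation process for a single accompaniment sound can be sketched with the values from the example above. The function names and the default threshold argument are illustrative choices, not names from the source.

```python
def deviation_percent(f_sing: float, f_guide: float) -> float:
    # d = |f_c - f_a| / f_a * 100 (%)
    return abs(f_sing - f_guide) / f_guide * 100.0

def is_influenced(f_guide: float, f_accomp: float, f_sing: float,
                  threshold: float = 1.0) -> bool:
    """True when the singing deviates by at least `threshold` percent and the
    accompaniment lies on the same side of the guide, farther away than the singing."""
    if deviation_percent(f_sing, f_guide) < threshold:
        return False  # no deviation requiring correction
    same_side = (f_accomp - f_guide) * (f_sing - f_guide) > 0
    farther = abs(f_accomp - f_guide) > abs(f_sing - f_guide)
    return same_side and farther

f_a, f_b, f_c = 261.626, 220.000, 258.415  # guide, accompaniment, singing
print(round(deviation_percent(f_c, f_a), 2))  # 1.23
print(is_influenced(f_a, f_b, f_c))           # True
```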

制御部１０１は、イベントデータにより示されるガイド音および伴奏音のピッチと、ユーザの歌唱音のピッチをグラフで示す画面の画像信号を生成する画像信号生成部１０１５を備えている。演奏パート特定部１０１４は、影響推定処理において伴奏音が歌唱音のずれに影響を与えていると推定した場合、その伴奏音を発音している演奏パートを示すデータ（以下、「演奏パートデータ」と呼ぶ）を画像信号生成部１０１５に引き渡す。また、演奏パート特定部１０１４は伴奏音が歌唱音に影響を与えていないと推定した場合、いずれの演奏パートも歌唱に影響を与えていないことを示すデータとして、例えば演奏パートデータ「なし」を画像信号生成部１０１５に引き渡す。 The control unit 101 includes an image signal generation unit 1015 that generates an image signal for a screen showing, as a graph, the pitches of the guide sound and accompaniment sound indicated by the event data together with the pitch of the user's singing sound. When the performance part specifying unit 1014 estimates in the influence estimation process that the accompaniment sound is influencing the deviation of the singing sound, it passes data indicating the performance part producing that accompaniment sound (hereinafter referred to as “performance part data”) to the image signal generation unit 1015. When the performance part specifying unit 1014 estimates that the accompaniment sound is not influencing the singing sound, it passes, for example, performance part data “none” to the image signal generation unit 1015 as data indicating that no performance part is influencing the singing.

（ケース２：過去の演奏に関する実演奏ピッチデータがなく、発音中の伴奏パートが複数もしくは多声の場合）
記憶部１０２にまだ過去の演奏に関する実演奏ピッチデータ１０２３が記憶されておらず、発音を行っている伴奏パートが複数ある場合や伴奏パートが多声楽器である場合、演奏パート特定部１０１４は再生部１０１１からボーカルパートおよび発音中の全ての伴奏パートに関する種別「ノートオン」のイベントデータを受け取る。また、演奏パート特定部１０１４はピッチ特定部１０１３から現時点実演奏ピッチデータを順次受け取る。 (Case 2: When there is no actual performance pitch data related to past performances and there are multiple or polyphonic accompaniment parts)
When no actual performance pitch data 1023 relating to past performances is yet stored in the storage unit 102, and either a plurality of accompaniment parts are sounding or the accompaniment part is a polyphonic instrument, the performance part specifying unit 1014 receives from the playback unit 1011 the event data of type “note on” for the vocal part and for all sounding accompaniment parts. The performance part specifying unit 1014 also sequentially receives the current actual performance pitch data from the pitch specifying unit 1013.

伴奏パートがピアノのように多声楽器である場合、演奏パート特定部１０１４はその伴奏パートにより同時に発音される和音の列を複数のメロディラインに分離し、以下の処理においてそれぞれのメロディラインを１つの単声楽器により演奏される音の列として扱う。 When the accompaniment part is a polyphonic instrument such as a piano, the performance part specifying unit 1014 separates the sequence of chords sounded simultaneously by that accompaniment part into a plurality of melody lines, and in the subsequent processing treats each melody line as a sequence of notes played by a single monophonic instrument.

続いて、演奏パート特定部１０１４は、各々の伴奏音に関し音高シフト処理を行い、伴奏音の音高をガイド音の音高の近辺にシフトさせる。その後、演奏パート特定部１０１４はガイド音の音高およびシフト後の伴奏音（以下、単に「伴奏音」と呼ぶ）の音高に関し音高変換処理を行い、各々の周波数を得る。 Subsequently, the performance part specifying unit 1014 performs a pitch shift process on each accompaniment sound, and shifts the pitch of the accompaniment sound to the vicinity of the pitch of the guide sound. Thereafter, the performance part specifying unit 1014 performs pitch conversion processing on the pitch of the guide sound and the pitch of the accompaniment sound after the shift (hereinafter simply referred to as “accompaniment sound”) to obtain each frequency.

演奏パート特定部１０１４は、上記のように得られた周波数を用いて、ずれ検出処理を行う。演奏パート特定部１０１４はピッチ特定部１０１３から現時点実演奏ピッチデータを受け取るごとに上述した一連の処理を行う。演奏パート特定部１０１４は、ずれ検出処理を行うと、ずれの有無を示すデータと、ガイド音、各々の伴奏音および歌唱音のピッチを示す周波数を対応付けたデータを生成し、生成したデータを順次、作業用ピッチデータ１０２４として一時的に記憶部１０２に記憶する。図５は作業用ピッチデータ１０２４の内容を例示した図である。図５において、伴奏音１は伴奏パート「フルート」の音のピッチを示し、伴奏音２〜伴奏音４は各々、伴奏パート「ピアノ」の第１声〜第３声の音のピッチを示している。 Using the frequencies obtained as described above, the performance part specifying unit 1014 performs the deviation detection process. The performance part specifying unit 1014 performs the series of processes described above each time it receives current actual performance pitch data from the pitch specifying unit 1013. Each time it performs the deviation detection process, the performance part specifying unit 1014 generates data that associates an indication of whether a deviation exists with the frequencies indicating the pitches of the guide sound, each accompaniment sound, and the singing sound, and sequentially stores the generated data temporarily in the storage unit 102 as work pitch data 1024. FIG. 5 illustrates the contents of the work pitch data 1024. In FIG. 5, accompaniment sound 1 indicates the pitch of the accompaniment part “flute”, and accompaniment sounds 2 to 4 indicate the pitches of the first to third voices, respectively, of the accompaniment part “piano”.

演奏パート特定部１０１４は、ずれ検出処理において修正を要するずれがあると判定した場合、複数の伴奏音のいずれによってそのずれが引き起こされているかを推定する。そのため、演奏パート特定部１０１４はまず各々の伴奏音について影響推定処理を行い、その処理において肯定的な結果が出た伴奏音を、ユーザの歌唱を誤らせている原因の伴奏音の候補として抽出する。つまり、ガイド音からみて、歌唱音と同じ側の音域であり、かつ歌唱音よりもガイド音から離れている伴奏音が、歌唱音に影響を与えている伴奏音の候補として抽出される。以下、歌唱音に影響を与えている伴奏音の候補を抽出する処理を「候補音抽出処理」と呼ぶ。 When the performance part specifying unit 1014 determines in the deviation detection process that there is a deviation requiring correction, it estimates which of the plurality of accompaniment sounds is causing the deviation. To do so, the performance part specifying unit 1014 first performs the influence estimation process on each accompaniment sound and extracts each accompaniment sound for which the result is affirmative as a candidate for the accompaniment sound that is leading the user's singing astray. That is, an accompaniment sound that lies on the same side of the guide sound as the singing sound and is farther from the guide sound than the singing sound is extracted as a candidate for the accompaniment sound influencing the singing sound. Hereinafter, this process of extracting candidate accompaniment sounds is referred to as the “candidate sound extraction process”.

演奏の開始直後等であって、作業用ピッチデータ１０２４に記憶されているデータのうち、「ずれの有無」が「有」であるデータの数がまだ少ない場合、演奏パート特定部１０１４は候補音抽出処理によって抽出した伴奏音のうち歌唱音のピッチに最も近いピッチの伴奏音を発音している演奏パートを、歌唱音のずれに影響を与えている演奏パートであると推定する。 If there is still a small number of data in which “presence / absence of deviation” is “present” among the data stored in the work pitch data 1024, such as immediately after the start of the performance, the performance part specifying unit 1014 selects the candidate sound. A performance part that produces an accompaniment sound having a pitch closest to the pitch of the singing sound among the accompaniment sounds extracted by the extraction process is estimated to be a performance part that affects the deviation of the singing sound.
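The candidate sound extraction process, followed by the nearest-pitch rule used while few deviation records exist, might look like the sketch below. The frequencies are borrowed from the guide, accompaniment, and singing values listed elsewhere in this description; the dictionary layout, labels, and function names are hypothetical.

```python
def extract_candidates(f_guide, f_sing, accomp):
    """accomp maps a sound label to (part_name, shifted_frequency_hz)."""
    return {
        label: (part, f)
        for label, (part, f) in accomp.items()
        if (f - f_guide) * (f_sing - f_guide) > 0       # same side as the singing
        and abs(f - f_guide) > abs(f_sing - f_guide)    # farther from the guide
    }

def nearest_part(f_sing, candidates):
    if not candidates:
        return None  # corresponds to performance part data "none"
    label = min(candidates, key=lambda k: abs(candidates[k][1] - f_sing))
    return candidates[label][0]

accomp = {
    "accomp1": ("flute", 220.000),
    "accomp2": ("piano", 196.000),
    "accomp3": ("piano", 246.942),
    "accomp4": ("piano", 293.665),
}
cands = extract_candidates(261.626, 258.415, accomp)
print(sorted(cands))                 # ['accomp1', 'accomp2', 'accomp3']
print(nearest_part(258.415, cands))  # piano
```

Accompaniment sound 4 is excluded because it lies above the guide sound while the singing lies below it; among the remaining candidates, 246.942 Hz is closest to the singing sound.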

一方、作業用ピッチデータ１０２４に記憶されているデータのうち、「ずれの有無」が「有」であるデータの数が十分に多い場合、演奏パート特定部１０１４は作業用ピッチデータ１０２４に記憶されているデータの中から「ずれの有無」が「有」であるデータを抽出し、候補音抽出処理において抽出した伴奏音の演奏パートの各々に関し、抽出したデータにより示されるガイド音のピッチと歌唱音のピッチとの差と、ガイド音のピッチと伴奏音のピッチとの差との相関係数を算出する。 On the other hand, when the number of data having “existence of deviation” is “present” among the data stored in the work pitch data 1024 is sufficiently large, the performance part specifying unit 1014 is stored in the work pitch data 1024. For each of the performance parts of the accompaniment sounds extracted in the candidate sound extraction process, data with “presence / absence” of “exist” is extracted from the recorded data, and the pitch and singing of the guide sound indicated by the extracted data A correlation coefficient between the difference between the pitch of the sound and the difference between the pitch of the guide sound and the pitch of the accompaniment sound is calculated.

例えば、あるタイミングにおいて伴奏音１（フルート）が歌唱音に影響を与えている伴奏音の候補として抽出された場合、作業用ピッチデータ１０２４のデータのうち「ずれの有無」が「有」であるものに含まれるガイド音の周波数の列をｆ_ａｋ、伴奏音１の周波数の列をｆ_ｂｋ、歌唱音の周波数の列をｆ_ｃｋ（ただし、ｋ＝１〜ｎの自然数、ｎは「ずれの有無」が「有」であるデータの数）とすると、演奏パート特定部１０１４はｋに応じて変化する（ｆ_ｂｋ−ｆ_ａｋ）と（ｆ_ｃｋ−ｆ_ａｋ）の組合せについての相関係数を算出する。そのように算出される相関係数は、過去に歌唱音がガイド音からずれた箇所において、ガイド音のピッチを基準とした場合における、歌唱音の変化と伴奏音の変化の相関関係を示す指標である。 For example, suppose that at a certain timing accompaniment sound 1 (flute) is extracted as a candidate for the accompaniment sound influencing the singing sound. Among the entries of the work pitch data 1024 whose “presence / absence of deviation” is “present”, let f_ak be the sequence of guide sound frequencies, f_bk the sequence of frequencies of accompaniment sound 1, and f_ck the sequence of singing sound frequencies (where k is a natural number from 1 to n, and n is the number of entries whose “presence / absence of deviation” is “present”). The performance part specifying unit 1014 then calculates the correlation coefficient between (f_bk − f_ak) and (f_ck − f_ak), both of which vary with k. The correlation coefficient thus calculated is an index of how closely, at the points where the singing sound deviated from the guide sound in the past, the variation of the singing sound tracks the variation of the accompaniment sound when both are measured relative to the pitch of the guide sound.

演奏パート特定部１０１４は、候補音抽出処理において抽出した伴奏音の演奏パートの全てについて相関係数を算出すると、最も大きい相関係数が算出された演奏パートを、歌唱音のずれに影響を与えている演奏パートであると推定する。 When the performance part specifying unit 1014 has calculated the correlation coefficient for every performance part whose accompaniment sound was extracted in the candidate sound extraction process, it estimates that the performance part with the largest correlation coefficient is the one influencing the deviation of the singing sound.
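The correlation-based selection can be sketched as follows. The text does not say which correlation coefficient is meant; this sketch assumes the Pearson coefficient. The sample frequency sequences are made-up values chosen only to make the behavior easy to follow, and the function names are hypothetical.

```python
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # correlation undefined; treated as 0 here (assumption)
    return cov / math.sqrt(vx * vy)

def most_correlated_part(guide, sing, accomp_by_part):
    """Correlate (f_bk - f_ak) with (f_ck - f_ak) for each candidate part."""
    sing_dev = [c - a for c, a in zip(sing, guide)]
    scores = {
        part: pearson([b - a for b, a in zip(freqs, guide)], sing_dev)
        for part, freqs in accomp_by_part.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical records where "presence / absence of deviation" is "present".
guide = [200.0, 200.0, 200.0]
sing = [198.0, 196.0, 194.0]
accomp = {"flute": [190.0, 180.0, 170.0], "piano": [170.0, 180.0, 190.0]}
best, scores = most_correlated_part(guide, sing, accomp)
print(best)  # flute
```

In this made-up data the singing drifts further below the guide as the flute does, so the flute deviations correlate positively with the singing deviations while the piano deviations correlate negatively.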

演奏パート特定部１０１４は、上記のように歌唱音のずれに影響を与えている演奏パートを推定すると、推定した演奏パートを示す演奏パートデータを画像信号生成部１０１５に引き渡す。また、演奏パート特定部１０１４はいずれの演奏パートも歌唱音のずれに影響を与えていないと判定した場合、例えば演奏パートデータ「なし」を画像信号生成部１０１５に引き渡す。 When the performance part specifying unit 1014 estimates the performance part that affects the deviation of the singing sound as described above, the performance part specifying unit 1014 passes the performance part data indicating the estimated performance part to the image signal generation unit 1015. If the performance part specifying unit 1014 determines that none of the performance parts has an effect on the singing sound shift, the performance part specifying unit 1014 delivers, for example, performance part data “none” to the image signal generation unit 1015.

（ケース３：現在の演奏に参加している伴奏パートが参加していない過去の演奏に関する実演奏ピッチデータが記憶されている場合）
現在行われている演奏において発音を行っている伴奏パートのいずれかが参加していない過去の演奏に関する実演奏ピッチデータ１０２３が記憶部１０２に記憶されている場合、演奏パート特定部１０１４は再生部１０１１からボーカルパートおよび発音中の全ての伴奏パートに関する種別「ノートオン」のイベントデータを受け取る。また、演奏パート特定部１０１４はピッチ特定部１０１３から現時点実演奏ピッチデータを順次受け取る。 (Case 3: When actual performance pitch data relating to past performances in which an accompaniment part participating in the current performance does not participate is stored)
When the storage unit 102 stores actual performance pitch data 1023 relating to a past performance in which one of the accompaniment parts sounding in the current performance did not participate, the performance part specifying unit 1014 receives from the playback unit 1011 the event data of type “note on” for the vocal part and for all sounding accompaniment parts. The performance part specifying unit 1014 also sequentially receives the current actual performance pitch data from the pitch specifying unit 1013.

演奏パート特定部１０１４は、多声楽器の伴奏パートの和音の列を複数のメロディラインに分離する処理、伴奏音に関する音高シフト処理、ガイド音および伴奏音に関する音高変換処理を行い、ガイド音の周波数、伴奏音の各々の周波数および現在の演奏における歌唱音の周波数を得る。 The performance part specifying unit 1014 performs the process of separating the chord sequence of a polyphonic accompaniment part into a plurality of melody lines, the pitch shift process on the accompaniment sounds, and the pitch conversion process on the guide sound and accompaniment sounds, thereby obtaining the frequency of the guide sound, the frequency of each accompaniment sound, and the frequency of the singing sound in the current performance.

続いて、演奏パート特定部１０１４は記憶部１０２から、過去の演奏に関する実演奏ピッチデータ１０２３に含まれるデータのうち、現在演奏されている楽曲中のタイミングに対応するタイミングのデータを読み出す。以下、そのように読み出されたデータを「過去実演奏ピッチデータ」と呼ぶ。今、以上のようにして得られた周波数を示すデータが以下のとおりであるものとする。
（ａ）ガイド音［２６１．６２６］
（ｂ１）伴奏音１（フルート）［２２０．０００］
（ｂ２）伴奏音２（ピアノ１）［１９６．０００］
（ｂ３）伴奏音３（ピアノ２）［２４６．９４２］
（ｂ４）伴奏音４（ピアノ３）［２９３．６６５］
（ｃ）現時点実演奏ピッチデータ［２５８．４１５］
（ｄ）過去実演奏ピッチデータ［２６０．５２１］ Subsequently, the performance part specifying unit 1014 reads out, from the storage unit 102, data of timing corresponding to the timing in the currently played music from the data included in the actual performance pitch data 1023 regarding the past performance. Hereinafter, the data thus read is referred to as “past actual performance pitch data”. It is assumed that the data indicating the frequency obtained as described above is as follows.
(a) Guide sound [261.626]
(b1) Accompaniment sound 1 (flute) [220.000]
(b2) Accompaniment sound 2 (piano 1) [196.000]
(b3) Accompaniment sound 3 (piano 2) [246.942]
(b4) Accompaniment sound 4 (piano 3) [293.665]
(c) Current actual performance pitch data [258.415]
(d) Past actual performance pitch data [260.521]

ここで、記憶部１０２に記憶されている過去の演奏に関する実演奏ピッチデータ１０２３により示される、演奏において発音していた演奏パートが「ピアノ」であった場合、上記のデータは以下の事実を示している。
（イ）現在行われている演奏において、歌唱音（２５８．４１５Ｈｚ）はガイド音（２６１．６２６Ｈｚ）から修正を要する程度にずれている。
（ロ）過去に行われた演奏において、歌唱音（２６０．５２１Ｈｚ）は現在の歌唱音よりガイド音（２６１．６２６Ｈｚ）に近かった。
（ハ）過去に行われた演奏においては伴奏パート「フルート」は発音していなかったが、現在の演奏においては伴奏パート「フルート」が発音している。 Here, when the performance part that was sounding in the past performance, as indicated by the actual performance pitch data 1023 stored in the storage unit 102, was “piano”, the above data indicate the following facts.
(A) In the performance currently being performed, the singing sound (258.415 Hz) deviates from the guide sound (261.626 Hz) to the extent that requires correction.
(B) In performances performed in the past, the singing sound (260.521 Hz) was closer to the guide sound (261.626 Hz) than the current singing sound.
(C) The accompaniment part “flute” was not pronounced in the performances performed in the past, but the accompaniment part “flute” is pronounced in the current performance.

上記のことから、ユーザは現在の歌唱において伴奏パート「フルート」の音に引きずられて本来出すべき音のピッチよりも低いピッチの音を出している可能性がある、と推論される。そこで、演奏パート特定部１０１４は伴奏パート「フルート」の伴奏音１が、現在の歌唱音のずれに影響を与えているか否かの推定を行う。その推定の方法は、上述したケース１におけるものと同様である。 From the above, it is inferred that there is a possibility that the user is producing a sound with a pitch lower than that of the sound that should be originally produced by being dragged by the sound of the accompaniment part “flute” in the current song. Therefore, the performance part specifying unit 1014 estimates whether or not the accompaniment sound 1 of the accompaniment part “flute” affects the deviation of the current singing sound. The estimation method is the same as that in Case 1 described above.

一方、記憶部１０２に記憶されている過去の演奏に関する実演奏ピッチデータ１０２３により示される、演奏において発音していた演奏パートが「フルート」であった場合、上記のデータは以下の事実を示している。
（イ）現在行われている演奏において、歌唱音（２５８．４１５Ｈｚ）はガイド音（２６１．６２６Ｈｚ）から修正を要する程度にずれている。
（ロ）過去に行われた演奏において、歌唱音（２６０．５２１Ｈｚ）は現在の歌唱音よりガイド音（２６１．６２６Ｈｚ）に近かった。
（ハ）過去に行われた演奏においては伴奏パート「ピアノ」は発音していなかったが、現在の演奏においては伴奏パート「ピアノ」が発音している。 On the other hand, when the performance part that was sounding in the past performance, as indicated by the actual performance pitch data 1023 stored in the storage unit 102, was “flute”, the above data indicate the following facts.
(A) In the performance currently being performed, the singing sound (258.415 Hz) deviates from the guide sound (261.626 Hz) to the extent that requires correction.
(B) In performances performed in the past, the singing sound (260.521 Hz) was closer to the guide sound (261.626 Hz) than the current singing sound.
(C) The accompaniment part “piano” was not pronounced in the performances performed in the past, but the accompaniment part “piano” is pronounced in the current performance.

上記のことから、ユーザは現在の歌唱において伴奏パート「ピアノ」のいずれかの音に引きずられて本来出すべき音のピッチよりも低いピッチの音を出している、と推論される。そこで、演奏パート特定部１０１４は伴奏パート「ピアノ」から分離された伴奏音２〜４のいずれが現在の歌唱音のずれに影響を与えている伴奏音であるかを推定する。その推定の方法は、上述したケース２におけるものと同様である。 From the above, it is inferred that the user is making a sound with a pitch lower than the pitch of the sound that should be originally produced by being dragged by any sound of the accompaniment part “piano” in the current song. Therefore, the performance part specifying unit 1014 estimates which of the accompaniment sounds 2 to 4 separated from the accompaniment part “piano” is an accompaniment sound that affects the deviation of the current singing sound. The estimation method is the same as that in Case 2 described above.

演奏パート特定部１０１４は、上記のように歌唱音のずれに影響を与えている伴奏音の演奏パートを推定すると、推定した演奏パートを示す演奏パートデータを画像信号生成部１０１５に引き渡す。また、演奏パート特定部１０１４はいずれの演奏パートも歌唱音のずれに影響を与えていないと判定した場合、例えば演奏パートデータ「なし」を画像信号生成部１０１５に引き渡す。 When the performance part specifying unit 1014 estimates the performance part of the accompaniment sound that affects the deviation of the singing sound as described above, the performance part specifying unit 1014 hands over the performance part data indicating the estimated performance part to the image signal generation unit 1015. If the performance part specifying unit 1014 determines that none of the performance parts has an effect on the singing sound shift, the performance part specifying unit 1014 delivers, for example, performance part data “none” to the image signal generation unit 1015.
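The reasoning in Case 3 can be sketched as a comparison between the current and past performances. The condition used here (the current singing deviates by at least the threshold, and the past singing was closer to the guide) is one reading of facts (イ) to (ハ) above rather than a rule stated in the source, and the function names are hypothetical. The parts returned would then be examined with the Case 1 or Case 2 procedure.

```python
def deviation_percent(f_sing: float, f_guide: float) -> float:
    return abs(f_sing - f_guide) / f_guide * 100.0

def newly_added_suspects(current_parts, past_parts, f_guide,
                         f_sing_now, f_sing_past, threshold=1.0):
    """Return accompaniment parts present now but absent in the past performance,
    when the singing is off now and was closer to the guide before (assumed rule)."""
    d_now = deviation_percent(f_sing_now, f_guide)
    d_past = deviation_percent(f_sing_past, f_guide)
    if d_now >= threshold and d_past < d_now:
        return set(current_parts) - set(past_parts)
    return set()

# Values from the Case 3 example: guide, current singing, past singing.
print(newly_added_suspects({"flute", "piano"}, {"piano"},
                           261.626, 258.415, 260.521))  # {'flute'}
print(newly_added_suspects({"flute", "piano"}, {"flute"},
                           261.626, 258.415, 260.521))  # {'piano'}
```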

（ケース４：現在の演奏に参加していない伴奏パートが参加していた過去の演奏に関する実演奏ピッチデータが記憶されている場合）
現在行われている演奏において発音を行っていない伴奏パートのいずれかが参加している過去の演奏に関する実演奏ピッチデータ１０２３が記憶部１０２に記憶されている場合、演奏パート特定部１０１４は再生部１０１１から、ボーカルパートおよび過去の演奏において発音していた全ての伴奏パートに関する種別「ノートオン」のイベントデータを受け取る。また、演奏パート特定部１０１４はピッチ特定部１０１３から現時点実演奏ピッチデータを順次受け取る。 (Case 4: Actual performance pitch data related to past performances in which accompaniment parts not participating in the current performance participated are stored)
When the storage unit 102 stores actual performance pitch data 1023 relating to a past performance in which an accompaniment part that is not sounding in the current performance participated, the performance part specifying unit 1014 receives from the playback unit 1011 the event data of type “note on” for the vocal part and for all accompaniment parts that were sounding in the past performance. The performance part specifying unit 1014 also sequentially receives the current actual performance pitch data from the pitch specifying unit 1013.

続いて、演奏パート特定部１０１４は記憶部１０２から、過去の演奏に関する実演奏ピッチデータ１０２３から過去実演奏ピッチデータ、すなわち現在演奏されている楽曲中のタイミングに対応するタイミングのデータを読み出す。今、以上のようにして得られた周波数を示すデータが以下のとおりであるものとする。
（ａ）ガイド音［２６１．６２６］
（ｂ１）伴奏音１（フルート）［２２０．０００］
（ｂ２）伴奏音２（ピアノ１）［１９６．０００］
（ｂ３）伴奏音３（ピアノ２）［２４６．９４２］
（ｂ４）伴奏音４（ピアノ３）［２９３．６６５］
（ｃ）現時点実演奏ピッチデータ［２６１．１０２］
（ｄ）過去実演奏ピッチデータ［２６５．３１４］ Subsequently, the performance part specifying unit 1014 reads the past actual performance pitch data from the actual performance pitch data 1023 related to the past performance, that is, the timing data corresponding to the timing in the currently played music from the storage unit 102. It is assumed that the data indicating the frequency obtained as described above is as follows.
(a) Guide sound [261.626]
(b1) Accompaniment sound 1 (flute) [220.000]
(b2) Accompaniment sound 2 (piano 1) [196.000]
(b3) Accompaniment sound 3 (piano 2) [246.942]
(b4) Accompaniment sound 4 (piano 3) [293.665]
(c) Current actual performance pitch data [261.102]
(d) Past actual performance pitch data [265.314]

Here, if "flute" is the only accompaniment part sounding in the current performance, the above data indicate the following facts.
(i) In the current performance, the singing sound (261.102 Hz) is close enough to the guide sound (261.626 Hz) that no correction is required.
(ii) In the past performance, the singing sound (265.314 Hz) deviated from the guide sound (261.626 Hz) to an extent that requires correction.
(iii) The accompaniment part "piano" sounded in the past performance but is not sounding in the current performance.

From the above, it is inferred that in the past performance the user was pulled by one of the sounds of the accompaniment part "piano" and sang at a pitch higher than the intended pitch. The performance part specifying unit 1014 therefore estimates which of accompaniment sounds 2 to 4, separated from the accompaniment part "piano" of the past performance, affected the deviation of the past singing sound. The estimation method is the same as in Case 2 described above.

On the other hand, if "piano" is the only accompaniment part sounding in the current performance, the above data indicate the following facts.
(i) In the current performance, the singing sound (261.102 Hz) is close enough to the guide sound (261.626 Hz) that no correction is required.
(ii) In the past performance, the singing sound (265.314 Hz) deviated from the guide sound (261.626 Hz) to an extent that requires correction.
(iii) The accompaniment part "flute" sounded in the past performance but is not sounding in the current performance.

From the above, it is inferred that in the past performance the user may have been pulled by the sound of the accompaniment part "flute" and sung at a pitch higher than the intended pitch. The performance part specifying unit 1014 therefore estimates whether accompaniment sound 1 of the accompaniment part "flute" in the past performance affected the deviation of the past singing sound. The estimation method is the same as in Case 1 described above.

Having estimated the performance part whose accompaniment sound affected the deviation of the singing sound as described above, the performance part specifying unit 1014 passes performance part data indicating the estimated performance part to the image signal generation unit 1015. If the performance part specifying unit 1014 determines that no performance part affected the deviation of the singing sound, it passes, for example, the performance part data "none" to the image signal generation unit 1015.
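The first step of the Case 4 reasoning above can be sketched as follows. This is a minimal illustration, not the procedure of the embodiment: the function names, the use of cents as the deviation measure, and the 10-cent tolerance are all assumptions, since the text only speaks of a deviation "that requires correction" without giving a number. The sketch only narrows down the candidate parts; deciding which individual accompaniment sound was responsible would still follow the Case 1 or Case 2 method.

```python
import math

def cents(freq_hz, ref_hz):
    """Signed deviation of freq_hz from ref_hz in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)

def case4_candidates(guide_hz, current_hz, past_hz,
                     past_parts, current_parts, threshold_cents=10.0):
    """Return parts that may have pulled the past singing off pitch.

    threshold_cents is a hypothetical tolerance chosen for illustration.
    """
    current_ok = abs(cents(current_hz, guide_hz)) <= threshold_cents
    past_off = abs(cents(past_hz, guide_hz)) > threshold_cents
    if not (current_ok and past_off):
        return []
    # Parts that sounded in the past performance but are silent now.
    return sorted(set(past_parts) - set(current_parts))

# Values (a), (c) and (d) from the example, with only "flute" sounding now.
suspects = case4_candidates(261.626, 261.102, 265.314,
                            past_parts={"flute", "piano"},
                            current_parts={"flute"})
print(suspects)
```

With the example values, the current singing sound is within a few cents of the guide sound while the past singing sound is roughly a quarter of a semitone sharp, so the part that has dropped out since the past performance is returned as the suspect.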

As described above, the image signal generation unit 1015 receives from the performance part specifying unit 1014 performance part data indicating the performance part estimated to be affecting the deviation of the singing sound. It also receives from the playback unit 1011 event data indicating the pitch of the sound produced by each performance part and the lyrics to be displayed, and receives the current actual performance pitch data from the pitch specifying unit 1013. Using these data, the image signal generation unit 1015 generates an image signal that shows the user the lyrics to be sung, the pitch of the guide sound, the pitches of the accompaniment sounds, the pitch of the singing sound, the places where the singing sound deviated from the guide sound, and the performance part estimated to have caused each deviation.

The input/output interface 103 includes an image signal output unit 1033 that outputs the image signal generated by the image signal generation unit 1015 to an external device. The image signal generation unit 1015 transmits the generated image signal to the display 13 via the image signal output unit 1033.

FIG. 6 shows the image displayed on the display 13 according to the image signal transmitted from the image signal output unit 1033. In the image shown in FIG. 6, the horizontal axis indicates the time elapsed as the music progresses, and the vertical axis indicates the pitch of the sounds produced. Characters 131 indicate the lyrics. Line 132 indicates the current timing in the music.

Polylines 133 to 135 indicate the pitches of the guide sound, accompaniment part 1 (flute), and accompaniment part 2 (piano 1), respectively. Curve 136 indicates the pitch of the singing sound. In FIG. 6, the polylines for accompaniment parts 3 and 4 (piano 2 and piano 3) are omitted to simplify the drawing. The polyline of an accompaniment part that took part in the past performance but does not take part in the current performance is distinguished from the polylines of the parts in the current performance, for example by drawing it as a dotted line.

Markers 137 and 138 are attached to polylines 134 and 135, respectively. These markers indicate the period during which the pitch of the singing sound deviated greatly from the pitch of the guide sound, together with the performance part estimated to have caused the deviation and the pitch of the sound produced by that part. Here, however, accompaniment part 1 (flute), shown by polyline 134, does not take part in the current performance; marker 137 shows that it affected the singing sound during the indicated period of the past performance. Accordingly, the pitch of the current singing sound does not deviate greatly from the pitch of the guide sound during the period indicated by marker 137. In other words, the display shows that the performance in that period has improved with accompaniment part 1 (flute) removed. Arrows 139 and 140 each indicate which accompaniment part is affecting the singing sound.

As described above, with the performance training system 1 the user can check in real time how far the pitch of his or her current performance sound deviates from the pitch that should be produced, and can also learn which other performance part is causing the deviation. By repeating performance training while taking care not to be pulled by the sound of the part thus notified, the user can improve his or her performance ability effectively.

In the above description the user's performance part is the vocal part, but it may instead be an instrument such as a violin. In that case, nothing differs from the above except the range of the frequency band W that the pitch specifying unit 1013 uses when extracting fundamental frequency candidates.

In the above description the reference pitch data 1021 takes the form of performance control data as illustrated in FIG. 2, but it may instead be data that expresses the pitch of the sound to be produced at each timing as a frequency or the like, as illustrated in FIG. 5. In that case, the pitch conversion processing that converts a note pitch into a frequency becomes unnecessary.
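For reference, the pitch conversion mentioned above can be sketched as follows, assuming that the "note on" events carry MIDI-style note numbers and that twelve-tone equal temperament with A4 = 440 Hz is used. Neither assumption is stated in the text, and the function name `note_to_hz` is hypothetical.

```python
def note_to_hz(note_number, a4_hz=440.0):
    """Convert a MIDI-style note number to a frequency in Hz.

    Assumes 12-tone equal temperament with note 69 (A4) tuned to a4_hz.
    """
    return a4_hz * 2.0 ** ((note_number - 69) / 12.0)

# Note numbers for C4, A3, B3 and D4.
for name, note in [("C4", 60), ("A3", 57), ("B3", 59), ("D4", 62)]:
    print(name, round(note_to_hz(note), 3))
```

Under these assumptions the results agree with the guide sound (261.626 Hz) and with accompaniment sounds 1, 3 and 4 (220.000, 246.942 and 293.665 Hz) in the Case 4 example.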

The image shown in FIG. 6 is only one example of an image with which the performance training system 1 notifies the user of other performance parts that affect the performance; various other display modes are of course possible. For example, the color or thickness of a polyline may be changed instead of attaching a marker, or an arrow may show how the user's performance sound is being pulled by the sound of another performance part.

The pitch notification device 10 may be realized by dedicated hardware, or by causing a general-purpose computer to execute processing according to an application program. When the pitch notification device 10 is realized by a general-purpose computer, each component of the control unit 101 is realized as a function of that computer: its CPU (Central Processing Unit) and a DSP (Digital Signal Processor) operating under the control of the CPU concurrently execute processing according to the modules contained in the application program.

[2. Second Embodiment]
A second embodiment of the present invention is described below, taking as an example a case in which a plurality of users practice a chorus. FIG. 7 shows the configuration of the performance training system 2 according to the second embodiment. The performance training system 2 differs from the performance training system 1 in that it has a plurality of microphones 12, one per user; its other components are the same. In FIG. 7, components identical or similar to those of the performance training system 1 therefore carry the same reference numerals as in FIG. 1. However, because part of the control unit's processing differs from that of the pitch notification device 10, the pitch notification device of the performance training system 2 is referred to as the pitch notification device 20. The operation of the performance training system 2 has much in common with that of the performance training system 1, so only the differences are described below.

For the purpose of explanation, assume that three users, users 19-1 to 19-3, use the performance training system 2, and that the system has microphones 12-1 to 12-3 used by users 19-1 to 19-3, respectively. Assume further that users 19-1 to 19-3 are in charge of chorus parts 1 to 3, respectively, in the choral piece being practiced.

The pitch specifying unit 1013 in the control unit 101 of the pitch notification device 20 receives the sound signals representing the sounds collected by microphones 12-1 to 12-3 and, from these three sound signals, specifies the fundamental frequency of the singing sound of each of users 19-1 to 19-3. As in the performance training system 1, the sound collected by each of microphones 12-1 to 12-3 contains the sound of the accompaniment parts output from the sound source unit 1012 and emitted by the speaker 11. The frequency components of each user's singing sound collected by microphones 12-1 to 12-3 also include harmonics of the fundamental frequency.

In addition, because microphones 12-1 to 12-3 are placed in the same space in the performance training system 2, microphone 12-1, for example, mainly collects the singing sound of user 19-1 but also picks up the singing sounds of the nearby users 19-2 and 19-3. Therefore, when specifying the fundamental frequency of the singing sound of user 19-1 from the sound signal acquired from microphone 12-1, the pitch specifying unit 1013 of the pitch notification device 20 calculates the value of a function that includes as a variable the difference, at each fundamental frequency candidate, between the amplitude of the frequency component of the sound signal from microphone 12-1 and the amplitudes of the frequency components of the sound signals from the other microphones 12, that is, microphones 12-2 and 12-3. It then specifies the fundamental frequency based on that value.

Specifically, as in the performance training system 1, the pitch specifying unit 1013 first calculates the frequency ω_0 representing the pitch of the guide sound of chorus part 1 indicated by the event data received from the playback unit 1011. Next, the pitch specifying unit 1013 performs frequency analysis on the actual performance waveform data 1022 representing the sound signal input from microphone 12-1 and obtains a graph of its envelope. From this graph, the pitch specifying unit 1013 extracts as fundamental frequency candidates the frequencies at which the amplitude takes a local maximum within the frequency band W in which the fundamental frequency of the human voice is distributed. These candidates are denoted ω_1 to ω_k below, where k is the number of candidates extracted.
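The candidate extraction step can be sketched as below, assuming the frequency analysis has already produced a list of frequency bins and their envelope amplitudes. The function name, the sample spectrum, and the limits chosen for the band W (80 Hz to 1000 Hz) are illustrative assumptions; the text does not give the band's numerical range.

```python
def fundamental_candidates(freqs_hz, amplitudes, band_hz=(80.0, 1000.0)):
    """Return frequencies inside band_hz where the amplitude has a local maximum.

    band_hz stands in for the frequency band W; its limits are hypothetical.
    """
    low, high = band_hz
    candidates = []
    for i in range(1, len(freqs_hz) - 1):
        in_band = low <= freqs_hz[i] <= high
        is_peak = amplitudes[i - 1] < amplitudes[i] >= amplitudes[i + 1]
        if in_band and is_peak:
            candidates.append(freqs_hz[i])
    return candidates

# Made-up envelope: peaks at 150, 250, 350 Hz (in band) and 1250 Hz (out of band).
freqs = [100.0, 150.0, 200.0, 250.0, 300.0, 350.0, 400.0, 1200.0, 1250.0, 1300.0]
amps = [0.1, 0.3, 0.2, 0.9, 0.4, 0.5, 0.1, 0.2, 0.8, 0.1]
print(fundamental_candidates(freqs, amps))
```

The peak at 1250 Hz is discarded because it lies outside the assumed band, which mirrors how restricting the search to W keeps harmonics and unrelated sounds out of the candidate list.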

Next, the pitch specifying unit 1013 performs frequency analysis on the actual performance waveform data 1022 representing the sound signals input from microphones 12-2 and 12-3 and calculates the amplitude of the frequency component at each frequency. Below, the frequency component at frequency ω of the sound signal input from each of microphones 12-1 to 12-3 is denoted s_i(ω), where i is 1, 2, or 3, corresponding to microphones 12-1 to 12-3.

Next, the pitch specifying unit 1013 calculates the value of a function such as the one shown in (Equation 2) below.

The first and third terms on the right side of (Equation 2) play the same roles as the first and second terms on the right side of (Equation 1) in the performance training system 1, respectively. The second term on the right side of (Equation 2) is the sum of the differences, at the fundamental frequency candidate ω_k, between the frequency component of the sound collected by microphone 12-1 and the frequency components of the sounds collected by the other microphones 12. This second term takes a large value when ω_k is a component of a sound that, among the same sounds collected by microphones 12-1 to 12-3, was collected at a higher volume by microphone 12-1. Accordingly, f_1(ω_k) takes a larger value at the fundamental frequency of the singing sound of user 19-1, who sings near microphone 12-1, than at the fundamental frequencies of the singing sounds of users 19-2 and 19-3. The pitch specifying unit 1013 specifies the ω_k that maximizes f_1(ω_k) as the fundamental frequency of the singing sound of user 19-1.

The pitch specifying unit 1013 calculates the same kind of function value for the actual performance waveform data 1022 representing the sound signals input from microphones 12-2 and 12-3, and specifies the frequencies at which the calculated values are largest as the fundamental frequencies of the singing sounds of users 19-2 and 19-3. That is, (Equation 2) is generalized as (Equation 3).

In (Equation 2) and (Equation 3), a and b are variables that determine the coefficients indicating the contribution of each term; appropriate values are chosen empirically. (Equation 3) is only an example of a function that the pitch specifying unit 1013 uses to specify the fundamental frequency of a singing sound and, as in the performance training system 1, various other functions can be used. In short, the pitch specifying unit 1013 may adopt any method that reflects in the evaluation index the difference between the frequency components of the sound signal collected by the microphone 12 of interest and those of the sound signals collected by the other microphones 12, and thereby prevents the fundamental frequency of a user 19 singing toward another microphone 12 from being mistakenly specified as that of the user 19 singing toward the microphone 12 of interest.
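The idea behind the evaluation function can be sketched as follows. Because (Equation 2) and (Equation 3) are not reproduced in this text, the exact form below is an assumption built only from the verbal description of the three terms: the amplitude at the microphone of interest, the summed amplitude difference against the other microphones weighted by a, and a penalty for distance from the guide pitch ω_0 weighted by b. The function names, the sample spectra, and the values of a and b are all hypothetical.

```python
def score(candidate_hz, guide_hz, spectra, mic, a=1.0, b=0.01):
    """Evaluate one fundamental frequency candidate for microphone index `mic`.

    spectra[j][w] is the amplitude s_j(w) seen by microphone j at frequency w.
    The form of this score is an assumed stand-in for (Equation 3).
    """
    own = spectra[mic][candidate_hz]
    # Sum of differences against every other microphone at this frequency.
    others = sum(own - s[candidate_hz] for j, s in enumerate(spectra) if j != mic)
    return own + a * others - b * abs(candidate_hz - guide_hz)

def pick_fundamental(candidates_hz, guide_hz, spectra, mic, a=1.0, b=0.01):
    """Return the candidate with the highest score for the given microphone."""
    return max(candidates_hz, key=lambda w: score(w, guide_hz, spectra, mic, a, b))

# Made-up amplitudes: each voice is loudest at its own singer's microphone.
spectra = [
    {262.0: 0.9, 330.0: 0.5, 392.0: 0.3},  # microphone 12-1
    {262.0: 0.3, 330.0: 1.0, 392.0: 0.3},  # microphone 12-2
    {262.0: 0.2, 330.0: 0.4, 392.0: 0.9},  # microphone 12-3
]
candidates = [262.0, 330.0, 392.0]
print(pick_fundamental(candidates, 261.626, spectra, mic=0))
print(pick_fundamental(candidates, 329.628, spectra, mic=1))
```

In this toy data the voice near 330 Hz leaks into microphone 12-1, but it is louder at microphone 12-2, so the difference term lowers its score for microphone 12-1 and the candidate near the guide pitch is selected instead.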

The pitch specifying unit 1013 generates data indicating the fundamental frequencies of the singing sounds of users 19-1 to 19-3 specified as described above, that is, the pitches of chorus parts 1 to 3. It sequentially stores the generated data in the storage unit 102 as actual performance pitch data 1023 and also passes it to the performance part specifying unit 1014 and the image signal generation unit 1015.

Instead of receiving from the playback unit 1011 event data indicating the pitches of the guide sounds of chorus parts 1 to 3, the performance part specifying unit 1014 of the pitch notification device 20 receives from the pitch specifying unit 1013 data indicating the pitches of the singing sounds of chorus parts 1 to 3, and uses that data to specify the performance part. The processing itself is the same as in the performance training system 1.

The image signal generation unit 1015 uses the performance part data indicating the performance part specified by the performance part specifying unit 1014, the event data indicating the pitch of the guide sound passed from the playback unit 1011, and the current actual performance pitch data for the singing sound of each of users 19-1 to 19-3 passed from the pitch specifying unit 1013 to generate an image signal showing which other singing sound is pulling the singing sound of each of users 19-1 to 19-3. It then transmits the generated image signal to the display 13.

FIG. 8 shows an example of the image displayed on the display 13 in the performance training system 2. In FIG. 8, polylines 231 to 233 indicate the pitches of the guide sounds of chorus parts 1 to 3, respectively, and curves 234 and 235 indicate the pitches of the singing sounds of users 19-2 and 19-3, respectively. Because user 19-1 does not take part in the current performance, no curve is shown for that user's singing sound. Markers 236 to 238 and arrows 239 to 241 show how the singing sounds are being affected by the sounds of other chorus parts.

As described above, with the performance training system 2, as with the performance training system 1, a user can improve performance ability effectively while taking care not to be pulled by the sounds of other performance parts. Moreover, with the performance training system 2, even when a plurality of users perform at the same time, all of them can improve their respective performance abilities simultaneously.

FIG. 1 is a diagram showing the configuration of the performance training system according to the first embodiment.
FIG. 2 is a diagram illustrating the contents of the reference pitch data according to the first embodiment.
FIG. 3 is a diagram illustrating a graph generated by the pitch specifying unit according to the first embodiment.
FIG. 4 is a diagram illustrating the contents of the actual performance pitch data according to the first embodiment.
FIG. 5 is a diagram illustrating the contents of the working pitch data according to the first embodiment.
FIG. 6 is a diagram showing the image displayed on the display according to the first embodiment.
FIG. 7 is a diagram showing the configuration of the performance training system according to the second embodiment.
FIG. 8 is a diagram showing the image displayed on the display according to the second embodiment.

Explanation of symbols

1, 2 ... performance training system, 10, 20 ... pitch notification device, 11 ... speaker, 12 ... microphone, 13 ... display, 14 ... keyboard, 101 ... control unit, 102 ... storage unit, 103 ... input/output interface, 104 ... oscillator, 1011 ... playback unit, 1012 ... sound source unit, 1013 ... pitch specifying unit, 1014 ... performance part specifying unit, 1015 ... image signal generation unit, 1021 ... reference pitch data, 1022 ... actual performance waveform data, 1023 ... actual performance pitch data, 1024 ... working pitch data, 1031 ... sound signal output unit, 1032 ... sound signal input unit, 1033 ... image signal output unit

Claims

Storage means for storing reference pitch data indicating time-sequentially the pitch of the sound to be pronounced by one of the plurality of performance parts in a music played by a plurality of performance parts;
Input means for obtaining a sound signal indicating a sound produced by the one performance part;
Pitch specifying means for specifying the pitch of the sound indicated by the sound signal acquired by the input means;
Performance part specifying means for specifying another performance part that affects the performance of the one performance part at one timing, based on the relationship among the pitch of the sound indicated by the reference pitch data of the one performance part at the one timing, the pitch of the sound of the one performance part specified by the pitch specifying means at the one timing, and the pitch of each sound of the performance parts other than the one performance part among the plurality of performance parts at the one timing;
A pitch notification device comprising: output means for outputting data indicating the pitch of the performance part specified by the performance part specifying means or the sound produced by the performance part.

The reference pitch data stored by the storage means is data indicating in time series the pitch of the sound to be generated by each of the plurality of performance parts.
When specifying the other performance part, the performance part specifying means uses the pitch of the sound indicated by the reference pitch data of each performance part other than the one performance part at the one timing as the pitch of each sound of the performance parts other than the one performance part at the one timing. The pitch notification device according to claim 1.

The input means obtains a sound signal indicating a sound produced by each of the plurality of performance parts;
The pitch specifying means specifies the pitch of the sound for each of the sounds produced by the plurality of performance parts indicated by the sound signal acquired by the input means,
When specifying the other performance part, the performance part specifying means uses the pitch of each sound of the performance parts other than the one performance part specified by the pitch specifying means at the one timing as the pitch of each sound of the performance parts other than the one performance part at the one timing. The pitch notification device according to claim 1.

The storage means stores actual performance pitch data indicating the pitch specified by the pitch specifying means in time series,
The performance part specifying means specifies the other performance part based on the relationship among the pitch of the sound of the one performance part specified by the pitch specifying means at one timing during the performance currently being performed, the pitch of the sound indicated by the actual performance pitch data of the one performance part at the one timing during a performance performed in the past, and the pitch, at the one timing during the performance currently being performed, of the sound of a performance part that takes part in the current performance and did not take part in the past performance. The pitch notification device according to claim 1.

The storage means stores actual performance pitch data indicating the pitch specified by the pitch specifying means in time series,
The performance part specifying means specifies the other performance part based on the relationship among the pitch of the sound of the one performance part specified by the pitch specifying means at one timing during the performance currently being performed, the pitch of the sound indicated by the actual performance pitch data of the one performance part at the one timing during a performance performed in the past, and the pitch, at the one timing during the past performance, of the sound of a performance part that does not take part in the current performance and took part in the past performance. The pitch notification device according to claim 1.

The storage means stores actual performance pitch data indicating the pitch specified by the pitch specifying means in time series,
The performance part specifying means specifies the other performance part based on the correlation between two differences: the difference between the pitch of the sound indicated by the reference pitch data of the one performance part and the pitch of the sound indicated by the actual performance pitch data of the one performance part, and the difference between the pitch of the sound indicated by the reference pitch data of the one performance part and the pitch of each sound of the performance parts other than the one performance part. The pitch notification device according to claim 1.

The pitch specifying means specifies the pitch of the sound produced by the one performance part at the one timing based on the value of a function that includes, as a variable, the difference between a candidate for the pitch of the sound indicated by the sound signal acquired by the input means and the pitch of the sound indicated by the reference pitch data of the one performance part. The pitch notification device according to claim 1.
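As a minimal sketch of such a function, the Python example below scores each pitch candidate by its strength minus a penalty proportional to its distance from the reference pitch and selects the best-scoring candidate. The claim only requires that the difference from the reference pitch be a variable of the function; the linear form, the `strength` input, the `penalty` weight, and the name `select_pitch` are assumptions made for illustration.

```python
def select_pitch(candidates, reference_pitch, penalty=0.5):
    """Choose one pitch from several candidates.

    candidates: list of (pitch, strength) pairs, where `strength` is an
        assumed salience value such as a spectral peak height.
    reference_pitch: pitch indicated by the reference pitch data.
    penalty: assumed weight applied to the distance from the reference.
    """
    def score(candidate):
        pitch, strength = candidate
        return strength - penalty * abs(pitch - reference_pitch)

    return max(candidates, key=score)[0]


# Hypothetical candidates in semitones: a strong accompaniment note far from
# the reference, and a weaker sung note close to it.
candidates = [(48.0, 1.0), (60.3, 0.8), (72.1, 0.6)]
chosen = select_pitch(candidates, reference_pitch=60.0)
```

Weighting by distance from the reference keeps a loud accompaniment note that leaks into the microphone from being mistaken for the performer's own pitch.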

The input means comprises a plurality of input means, each provided in association with one of the plurality of performance parts,
The function includes as a variable, for one frequency, a numerical value indicating the difference between the amplitude of the frequency component at the one frequency of the sound signal acquired by the input means corresponding to the one performance part and the amplitude of the frequency component at the one frequency of the sound signal acquired by at least one of the input means corresponding to the performance parts other than the one performance part. The pitch notification device according to claim 7.
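The amplitude-difference variable in this claim can be illustrated as follows. This Python sketch shows only that one term, in isolation from the rest of the function; the dictionary representation of a spectrum, the use of the largest amplitude among the other inputs, and the name `dominant_frequency` are assumptions, not details from the specification.

```python
def dominant_frequency(own_spectrum, other_spectra):
    """Return the frequency at which the performer's own input exceeds the
    other parts' inputs by the largest amplitude margin.

    own_spectrum: dict mapping frequency (Hz) -> amplitude, from the input
        means associated with the one performance part.
    other_spectra: list of such dicts, one per other input means.
    """
    def margin(freq):
        # Largest amplitude at this frequency among the other inputs
        # (0.0 if there are no other inputs or the frequency is missing).
        leak = max((s.get(freq, 0.0) for s in other_spectra), default=0.0)
        return own_spectrum[freq] - leak

    return max(own_spectrum, key=margin)


# Hypothetical spectra: 220 Hz is loud in the performer's microphone but
# almost as loud in another part's microphone, so it is likely leakage.
own = {220.0: 0.9, 440.0: 0.7}
others = [{220.0: 0.8, 440.0: 0.1}]
freq = dominant_frequency(own, others)
```

A frequency component that is equally strong in another part's input is more likely to originate from that part, which is why the margin rather than the raw amplitude is compared.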

A program for causing a computer to execute:
a process of storing reference pitch data indicating, in time series, the pitch of the sound to be produced by one of a plurality of performance parts in a piece of music performed by the plurality of performance parts;
a process of acquiring a sound signal indicating the sound produced by the one performance part;
a process of specifying the pitch of the sound indicated by the acquired sound signal;
a process of specifying another performance part that affects the performance of the one performance part at one timing, based on the relationship among the pitch of the sound indicated by the reference pitch data of the one performance part at the one timing, the pitch of the sound of the one performance part specified at the one timing, and the pitch of the sound, at the one timing, of each of the performance parts other than the one performance part; and
a process of outputting data indicating the specified performance part or the pitch of the sound produced by that performance part.
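The part-specifying and output processes of the program claim can be sketched for a single timing as below. The decision rule used here (among the other parts lying on the same side of the reference as the deviation, pick the one nearest to the produced pitch) is an assumed example of a "relationship" among the three pitches, and the `tolerance` value, the semitone units, and the name `influencing_part` are likewise hypothetical.

```python
def influencing_part(reference, produced, others, tolerance=0.3):
    """Estimate which other part pulled the performer off pitch at one timing.

    reference: pitch from the reference pitch data of the one part.
    produced: pitch specified from the acquired sound signal.
    others: dict mapping part name -> pitch of that part at the same timing.
    tolerance: assumed deviation (semitones) still treated as in tune.
    Returns a part name, or None if no part is suspected.
    """
    deviation = produced - reference
    if abs(deviation) <= tolerance:
        return None
    # Keep only parts lying on the same side of the reference as the deviation.
    same_side = {
        name: pitch for name, pitch in others.items()
        if (pitch - reference) * deviation > 0
    }
    if not same_side:
        return None
    return min(same_side, key=lambda name: abs(same_side[name] - produced))


# Hypothetical timing: the singer is 0.6 semitones sharp.
others = {"guitar": 62.0, "bass": 55.0, "strings": 64.0}
part = influencing_part(reference=60.0, produced=60.6, others=others)
if part is not None:
    print(f"Possibly influenced by: {part} (pitch {others[part]})")
```

Returning `None` when the deviation is within tolerance means nothing is output while the performer is in tune.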