JP6488767B2

JP6488767B2 - Singing evaluation device and program

Info

Publication number: JP6488767B2
Application number: JP2015041620A
Authority: JP
Inventors: 川嶋　隆宏
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2015-03-03
Filing date: 2015-03-03
Publication date: 2019-03-27
Anticipated expiration: 2035-03-03
Also published as: JP2016161831A

Description

The present invention relates to a technique for evaluating singing.

Various singing evaluation techniques have been proposed that analyze the sung voice (hereinafter "singing voice") of a piece of music having a plurality of singing parts and evaluate the skill of the singing for each singing part. For example, Patent Document 1 discloses, in a technique for evaluating the singing voices of singers across a plurality of singing parts, a configuration that evaluates the skill of the singing for each singing part by comparing the pitch of the singing voice (singing pitch) with evaluation data representing the pitch that the user should sing.

Patent Document 1: JP 2008-268368 A

However, the configuration of Patent Document 1 requires evaluation data to be prepared for each of the plurality of singing parts, so the burden of creating the evaluation data is large. In view of these circumstances, an object of the present invention is to appropriately evaluate the skill of singing voices for a plurality of singing parts without presupposing the existence of evaluation data.

To solve the above problem, a singing evaluation device according to a first aspect of the present invention comprises: a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal representing one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal representing another singing voice of the piece of music is stable; a representative value calculation unit that calculates, for each overlapping section in which one of the first pitch stable sections and one of the second pitch stable sections set by the section setting unit overlap on the time axis, a representative value of the pitch of the first singing signal and a representative value of the pitch of the second singing signal; and an evaluation unit that evaluates the degree of harmony between the one singing voice and the other singing voice according to whether the representative value of the pitch of the first singing signal and the representative value of the pitch of the second singing signal in each overlapping section are in a predetermined pitch relationship.
In this configuration, the degree of harmony between the singing voices is evaluated according to whether the two representative values are in the predetermined pitch relationship. Therefore, even when no evaluation data exists for the singing parts of the piece of music, the degree of harmony between the singing voice of one singing part and the singing voice of another singing part can be evaluated appropriately.
Furthermore, in this configuration, the relationship between the two representative values is determined only for each overlapping section in which a first pitch stable section and a second pitch stable section overlap. Compared with a configuration in which the singing pitch over the entire piece of music is evaluated, the singing pitch can therefore be evaluated appropriately while reducing the computational load.
Here, an example of the predetermined pitch relationship is one in which the representative value of one pitch stable section and the representative value of the other pitch stable section differ by an integer multiple of a semitone (100 cents) in twelve-tone equal temperament. Examples of the representative value include the mean and the median.
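The semitone relationship described above can be illustrated with a minimal Python sketch. It assumes that both representative values are already expressed in cents. The function names and the tolerance parameter are hypothetical and are not taken from the specification, which does not state at this point how close to an exact multiple of 100 cents the difference must be.

```python
SEMITONE_CENTS = 100.0


def deviation_from_semitone_multiple(rp_a: float, rp_b: float) -> float:
    """Distance in cents between |rp_a - rp_b| and the nearest integer multiple of 100 cents."""
    remainder = abs(rp_a - rp_b) % SEMITONE_CENTS
    return min(remainder, SEMITONE_CENTS - remainder)


def in_pitch_relationship(rp_a: float, rp_b: float, tolerance_cents: float = 10.0) -> bool:
    """True when the two representative values differ by roughly a whole number of semitones.

    tolerance_cents is an assumed parameter for this sketch only.
    """
    return deviation_from_semitone_multiple(rp_a, rp_b) <= tolerance_cents
```

A deviation of 0 cents means the two voices are exactly a whole number of semitones apart, and 50 cents is the worst case under this measure.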

In a preferred example, the singing evaluation device according to the first aspect comprises a pitch analysis unit that sequentially analyzes the pitch of the first singing signal, and a receiving unit that receives pitches sequentially analyzed from the second singing signal. The section setting unit sets a plurality of first pitch stable sections in which the pitch of the first singing signal analyzed by the pitch analysis unit is stable, and a plurality of second pitch stable sections in which the pitch of the second singing signal received by the receiving unit is stable. In this configuration, the pitch analyzed from the second singing signal is received by the receiving unit, so the pitch analysis unit does not need to analyze the pitch of the second singing signal. The processing load on the pitch analysis unit can therefore be reduced compared with a configuration in which the pitch analysis unit analyzes the pitch of the second singing signal in addition to that of the first singing signal.

In a preferred example of the singing evaluation device according to the first aspect, the pitch analysis unit sequentially analyzes the pitch of the first singing signal at each first analysis point on the time axis, the receiving unit receives the pitch of the second singing signal sequentially analyzed at each second analysis point on the time axis, and the section setting unit matches mutually corresponding first analysis points and second analysis points with each other on the time axis and then sets the overlapping sections in which the first pitch stable sections and the second pitch stable sections overlap. Because the first analysis points and the second analysis points are matched with each other on the time axis, this configuration has the advantage that the degree of harmony between the pitch of one singing part (for example, the one singing voice) and the pitch of another singing part (for example, the other singing voice) can be evaluated appropriately at analysis points that coincide on the time axis.

A singing evaluation device according to a second aspect of the present invention comprises: a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal representing one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal representing another singing voice of the piece of music is stable; a representative value calculation unit that calculates a representative value of the pitch in each of the first pitch stable sections and a representative value of the pitch in each of the second pitch stable sections; a pitch distribution generation unit that generates a first frequency distribution of the representative pitch values over the plurality of first pitch stable sections and a second frequency distribution of the representative pitch values over the plurality of second pitch stable sections; an analysis processing unit that creates a first evaluation distribution by dividing the first frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the unit ranges for each pitch across the plurality of unit ranges, and that creates a second evaluation distribution from the second frequency distribution in the same way; and an evaluation unit that evaluates, based on the first evaluation distribution and the second evaluation distribution, the degree of similarity in singing pitch between the one singing voice and the other singing voice.
In this configuration, the degree of similarity in singing pitch tendency between the one singing voice and the other singing voice is evaluated according to the first frequency distribution of the representative pitch values of the first singing signal over the first pitch stable sections and the second frequency distribution of the representative pitch values of the second singing signal over the second pitch stable sections. Therefore, even when no evaluation data exists to be compared with the singing voices, the degree of similarity in singing pitch tendency between the two singing voices can be evaluated appropriately. There is also the advantage that, compared with a configuration in which all sections of the first and second singing signals are evaluated, the singing pitch can be evaluated appropriately while reducing the computational load.
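One possible reading of the analysis processing step (dividing a frequency distribution into unit ranges centered on scale tones, superimposing them, and summing the frequencies) is sketched below in Python. The sketch assumes that the scale tones lie at multiples of 100 cents, that each unit range spans 50 cents either side of its scale tone, and that frequencies are counted in 10-cent bins. The function name and the bin width are hypothetical, and the comparison of the two resulting evaluation distributions is deliberately left out because this passage does not specify it.

```python
from collections import Counter


def evaluation_distribution(representatives_cents, bin_cents=10):
    """Fold representative pitches onto a single 100-cent unit range centered on a scale tone.

    Returns a dict mapping the lower edge of each deviation bin (in cents,
    relative to the nearest scale tone) to the summed frequency.
    """
    folded = Counter()
    for value in representatives_cents:
        # Deviation from the nearest multiple of 100 cents, in [-50, 50).
        deviation = (value + 50.0) % 100.0 - 50.0
        bin_index = int(deviation // bin_cents)
        folded[bin_index * bin_cents] += 1
    return dict(folded)
```

Under these assumptions, a singer whose stable pitches cluster near the scale tones produces a distribution concentrated around the bins closest to zero.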

The singing evaluation device according to each of the above aspects may be realized by dedicated hardware (electronic circuitry) or by cooperation between a general-purpose arithmetic processing device such as a CPU (Central Processing Unit) and a program. The program of the present invention may be provided in a form stored on a computer-readable recording medium and installed on a computer. The recording medium is, for example, a non-transitory recording medium; an optical recording medium (optical disc) such as a CD-ROM is a good example, but any known form of recording medium, such as a semiconductor recording medium or a magnetic recording medium, may be used. The program of the present invention may also be provided by distribution over a communication network and installed on a computer. The present invention is also specified as a method of operating the singing evaluation device according to each of the aspects described above (a singing evaluation method).

Schematic diagram of the singing evaluation system 1 according to the first embodiment.
Graph of the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2.
Explanatory diagram of the setting of the analysis section TA and the first pitch stable section TSA.
Flowchart of the operation of the singing evaluation unit 22 (pitch analysis unit 222, section setting unit 224).
Flowchart of the operation of the singing evaluation unit 22 (evaluation unit 228).
Explanatory diagram showing an example of the distribution of the evaluation value with respect to the difference value C.
Schematic diagram of the singing evaluation system 1 according to the second embodiment.
Graph of the pitch PA of the first singing signal V1 analyzed by the pitch analysis unit 222 and the pitch PB of the second singing signal V2 received by the communication device 15.
Graph of the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2 after the section setting unit 224 has matched the first analysis points KA with the second analysis points KB.
Schematic diagram of the singing evaluation system 1 according to the fourth embodiment.
Explanatory diagram of the processing of the pitch distribution generation unit 225 of the fourth embodiment.
Explanatory diagram of the processing of the analysis processing unit 227 of the fourth embodiment.
Flowchart of the operation of the singing evaluation unit 22 of the fourth embodiment.
Schematic diagram of the singing evaluation system 1 according to the fifth embodiment.

<First Embodiment>
FIG. 1 is a schematic diagram of a singing evaluation system 1 according to the first embodiment. As illustrated in FIG. 1, the singing evaluation system 1 includes a terminal device D1 used by a user U1 and a terminal device D2 used by a user U2. The first embodiment illustrates a configuration in which the terminal device D1 is used as the singing evaluation device. The user U1 sings toward the terminal device D1, and the user U2 sings toward the terminal device D2.

The terminal device D2 is a communication terminal such as a mobile phone or a smartphone, and includes a sound collection device 34 and a communication device 35. The sound collection device 34 collects the singing voice of the user U2 singing a piece of music and generates a second singing signal V2. The communication device 35 is a communication device for communicating with the terminal device D1, and transmits the second singing signal V2 generated by the sound collection device 34 to the terminal device D1.

The terminal device D1 of the first embodiment is a device that evaluates the degree of harmony between the singing voice of the user U1 singing a piece of music (first singing voice) and the singing voice of the user U2 of the terminal device D2 singing the same piece of music (second singing voice). It is realized by a computer system comprising an arithmetic processing device 10, a storage device 12, an input device 13, a sound collection device 14, a communication device 15, a display device 16, and a sound emitting device 18. For example, an information processing device such as a portable communication terminal (mobile phone or smartphone) carried by the user or a personal computer is used as the terminal device D1. The sound collection device 14 is a device (microphone) that collects ambient sound. The sound collection device 14 of the first embodiment collects the singing voice of the user U1 of the terminal device D1 singing the piece of music and generates a first singing signal V1. The input device 13 is a device (for example, a touch panel) that accepts instructions from the user.

The storage device 12 stores a program PGM executed by the arithmetic processing device 10 and various data used by the arithmetic processing device 10. A known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of several types of recording media, may be employed as the storage device 12. Specifically, the storage device 12 stores music data L. The music data L includes accompaniment data B, which defines the accompaniment sound of the piece of music in time series, and lyric data Q, which represents the lyrics. The accompaniment data B is, for example, a music file in a format such as MP3. When the user selects a desired piece of music, the accompaniment data B of that piece is played back. The display device 16 (for example, a liquid crystal display panel) displays images as instructed by the arithmetic processing device 10. For example, the lyric data Q of the piece of music selected by the user and the evaluation result for the singing voice (evaluation value S) are displayed on the display device 16. The communication device 15 is a communication device for communicating with the terminal device D2. Specifically, the communication device 15 wirelessly receives the second singing signal V2 transmitted by the communication device 35 of the terminal device D2. The communication between the communication device 15 and the terminal device D2 may be communication via a communication network 400 such as a mobile communication network or the Internet, or may be short-range wireless communication such as wireless LAN or Bluetooth (registered trademark).

The arithmetic processing device 10 (CPU) of FIG. 1 executes the program PGM stored in the storage device 12 and thereby controls the elements of the terminal device D1 as a whole. Specifically, the arithmetic processing device 10 realizes a plurality of functions (a singing evaluation unit 22, a reproduction processing unit 26, and a display processing unit 28) for evaluating the singing represented by the first singing signal V1 generated by the sound collection device 14 of the terminal device D1 and the second singing signal V2 generated by the sound collection device 34 of the terminal device D2. A configuration in which the functions of the arithmetic processing device 10 are distributed over a plurality of devices, or a configuration in which a dedicated electronic circuit (for example, a DSP) realizes some of the functions of the arithmetic processing device 10, may also be employed.

The reproduction processing unit 26 mixes the first singing signal V1 generated by the sound collection device 14 with the accompaniment data B read from the storage device 12, converts the result into an analog signal, and supplies it to the sound emitting device 18. The sound emitting device 18 (for example, a speaker or headphones) emits sound corresponding to the signal supplied from the reproduction processing unit 26. That is, a mixture of the accompaniment sound of the piece of music represented by the accompaniment data B and the singing voice of the user U1 is emitted from the sound emitting device 18. The display processing unit 28 causes the display device 16 to display various images. Specifically, the display processing unit 28 causes the display device 16 to display the lyric data Q included in the piece of music selected for singing and the evaluation value S representing the result of the singing evaluation generated by the singing evaluation unit 22.

The singing evaluation unit 22 is a means for evaluating the degree of harmony between the pitch of the singing voice of the user U1 (singing pitch) and the singing pitch of the user U2 by analyzing the first singing signal V1 and the second singing signal V2. It includes a pitch analysis unit 222, a section setting unit 224, a representative value calculation unit 226, and an evaluation unit 228.

The pitch analysis unit 222 of the first embodiment sequentially analyzes, at a predetermined period, the pitch PA of the first singing signal V1 generated by the sound collection device 14. The pitch analysis unit 222 of the first embodiment likewise sequentially analyzes, at a predetermined period, the pitch PB of the second singing signal V2 received by the communication device 15.

FIG. 2 is a graph of the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2 analyzed by the pitch analysis unit 222. The pitch analysis unit 222 sequentially identifies the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2 at a period (for example, every 10 ms) that is sufficiently short compared with the durations of notes that can be expected in a piece of music. Any known pitch detection technique may be employed to identify the pitch of the singing voice.

The section setting unit 224 of FIG. 1 sets, in parallel with the progress of the singing, a plurality of first pitch stable sections TSA in which the pitch PA of the first singing signal V1 analyzed by the pitch analysis unit 222 is stable. Specifically, as illustrated in FIG. 2, the section setting unit 224 sets, as first pitch stable sections TSA (TSA1, TSA2, TSA3, ...), the sections of the first singing signal V1 in which the variation of the pitch PA (that is, the difference between the highest and lowest values of the pitch PA) stays within a predetermined range. Similarly, the section setting unit 224 sets, in parallel with the progress of the singing, a plurality of second pitch stable sections TSB in which the pitch PB of the second singing signal V2 analyzed by the pitch analysis unit 222 is stable. Specifically, using the same method as for the first pitch stable sections TSA, the section setting unit 224 sets, as second pitch stable sections TSB (TSB1, TSB2, TSB3, ...), the sections of the second singing signal V2 in which the variation of the pitch PB (that is, the difference between the highest and lowest values of the pitch PB) stays within a predetermined range.

The section setting unit 224 also identifies the overlapping sections RTS in which each of the first pitch stable sections TSA and each of the second pitch stable sections TSB overlap on the time axis. As illustrated in FIG. 2, an overlapping section RTS is a range in which one of the first pitch stable sections TSA (TSA1, TSA2, TSA3, ...) and one of the second pitch stable sections TSB (TSB1, TSB2, TSB3, ...) overlap each other on the time axis.
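The identification of the overlapping sections RTS amounts to intersecting two lists of time intervals. The following Python sketch assumes that each pitch stable section is represented as a (start, end) pair in seconds; the function name and this data representation are hypothetical and are not taken from the specification.

```python
def overlapping_sections(sections_a, sections_b):
    """Return the time ranges where a section of A and a section of B overlap.

    sections_a and sections_b are lists of (start, end) pairs; the result is
    a sorted list of (start, end) pairs with start < end.
    """
    overlaps = []
    for start_a, end_a in sections_a:
        for start_b, end_b in sections_b:
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # keep only intersections of non-zero length
                overlaps.append((start, end))
    return sorted(overlaps)
```

For example, the sections [(0.0, 1.0), (2.0, 3.0)] and [(0.5, 2.5)] overlap in (0.5, 1.0) and (2.0, 2.5). The pairwise loop is chosen for clarity; since the stable sections are ordered in time, a single merge pass would also be possible.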

The representative value calculation unit 226 of FIG. 1 calculates, for the overlapping sections RTS set by the section setting unit 224 (that is, the sections in which each of the first pitch stable sections TSA and each of the second pitch stable sections TSB overlap on the time axis), a representative value RPA of the pitch PA of the first singing signal V1 (the representative pitch of the first singing voice) and a representative value RPB of the pitch PB of the second singing signal V2 (the representative pitch of the second singing voice). As illustrated in FIG. 2, the representative value calculation unit 226 calculates the representative value RPA (RPA1, RPA2, RPA3, ...) of the pitch PA and the representative value RPB (RPB1, RPB2, RPB3, ...) of the pitch PB for each overlapping section RTS. Specifically, the average of the plurality of pitches PA (PB) identified in each first pitch stable section TSA (second pitch stable section TSB) is calculated as the representative value RPA (RPB).
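The calculation of a representative value can be sketched in Python as follows. The sketch assumes that the analyzed pitches are held as (time, pitch in cents) pairs and takes the average over the samples that fall inside one overlapping section, which is one interpretation of the description above. The function name and the data layout are hypothetical.

```python
from statistics import mean


def representative_value(samples, start, end):
    """Mean pitch of the (time, pitch) samples with start <= time < end.

    Returns None when no sample falls inside the section.
    """
    inside = [pitch for time, pitch in samples if start <= time < end]
    return mean(inside) if inside else None
```

Calling this once with the samples of the first singing signal and once with those of the second, for the same overlapping section, would give the pair corresponding to RPA and RPB. The median mentioned earlier as an alternative representative value could be substituted by using `statistics.median` in place of `mean`.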

The evaluation unit 228 of FIG. 1 evaluates the degree of harmony between the first singing voice and the second singing voice according to whether the representative value RPA of the pitch PA of the first singing signal V1 and the representative value RPB of the pitch PB of the second singing signal V2, calculated by the representative value calculation unit 226 for each overlapping section RTS, are in a predetermined pitch relationship. The evaluation unit 228 of the present embodiment focuses on whether the first singing voice and the second singing voice are in a pitch relationship that sounds harmonious to the ear, and outputs an evaluation value S representing the degree of harmony between the first singing voice and the second singing voice.

FIG. 4 is a flowchart of the operation of the pitch analysis unit 222 and the section setting unit 224. The processing of FIG. 4 starts, for example, when playback of the music data L starts. FIG. 4 illustrates the processing performed on the first singing signal V1 by the pitch analysis unit 222 and the section setting unit 224. The processing performed on the second singing signal V2 is the same as that performed on the first singing signal V1, so a detailed description of it is omitted.

When the pitch analysis unit 222 identifies the pitch PA for one time point on the time axis (hereinafter "analysis point") of the first singing signal V1 (SA1), the section setting unit 224 sets, as illustrated in FIG. 3, an analysis section TA of predetermined length whose end point is the analysis point KA at which the pitch PA was identified (SA2). The analysis section TA is the temporal section subject to analysis as defined by a time window function, and is set to a length (for example, 200 ms) that is sufficiently longer than the period at which the pitch analysis is performed (10 ms). The analysis section TA therefore contains a plurality of pitches PA, including the pitch PA newly identified for the analysis point KA and the pitches PA preceding the analysis point KA.

The section setting unit 224 identifies the maximum value PA-MAX and the minimum value PA-MIN of the plurality of pitches PA in the analysis section TA, and determines whether the difference value R (absolute value) between the maximum value PA-MAX and the minimum value PA-MIN is below a predetermined threshold PATH (SA3). The smaller the difference value R (that is, the narrower the distribution width of the pitches PA in the analysis section TA), the more stable the pitch PA of the first singing signal V1 can be judged to be. For example, the threshold PATH may be set to 50 cents in twelve-tone equal temperament.

When the difference value R is below the threshold PATH (SA3: YES), the section setting unit 224 includes the analysis section TA in the first pitch stable section TSA (SA4). In the analysis section TAn of FIG. 3, the difference value R between the maximum value PA-MAX and the minimum value PA-MIN of the pitch PA is below the threshold PATH, so the analysis section TAn containing the analysis point KAn is included in the first pitch stable section TSA. The section setting unit 224 stores the pitch PA at the analysis point KA in the storage device 12 (SA5).

Until the song ends (SA9: NO), each time the pitch analysis unit 222 identifies a pitch PA (SA1), the section setting unit 224 sets an analysis section TA whose end point is the analysis point KA of that pitch PA (SA2) and determines whether the difference value R of the pitch PA in that analysis section TA is below the threshold PATH (SA3). That is, as understood from FIG. 3, the analysis section TA is shifted along the time axis at every period (10 ms) at which the pitch analysis unit 222 identifies the pitch PA, and at each position it is determined whether the analysis section TA is to be included in the first pitch stable section TSA. Consequently, the first pitch stable section TSA extends progressively along the time axis each time the pitch analysis unit 222 identifies a pitch PA that keeps the spread below the threshold PATH.

On the other hand, when the difference value R of the pitch PA in the analysis section TA is equal to or greater than the threshold PATH (SA3: NO), the section setting unit 224 does not include the analysis point KA in the first pitch stable section TSA (SA6). The section setting unit 224 then determines whether the analysis section TA immediately preceding the current analysis section TA lies within the first pitch stable section TSA (SA7). If the result is affirmative, the section setting unit 224 fixes the end point (analysis point KA) of the immediately preceding analysis section TA as the end point of one first pitch stable section TSA (SA8). In other words, the section setting unit 224 sets first pitch stable sections TSA one after another in parallel with the progress of the singing. With this configuration, only the first pitch stable sections TSA, in which the pitch PA varies little, are made the object of singing evaluation out of the whole song. As the first pitch stable sections TSA and the second pitch stable sections TSB are set in this manner, the pitches PA contained in the first pitch stable sections TSA and the pitches PB contained in the second pitch stable sections TSB accumulate in the storage device 12. Next, as illustrated in FIG. 2, the section setting unit 224 identifies the overlapping sections RTS in which a first pitch stable section TSA and a second pitch stable section TSB overlap on the time axis.
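Steps SA1 to SA8 can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function name `stable_sections`, the representation of pitches as one value in cents per analysis point, the requirement that the window be full before it is judged, and the choice to start a section at the beginning of the first stable window are all assumptions made for the example.

```python
from collections import deque

def stable_sections(pitches, period_ms=10, window_ms=200, threshold_cents=50.0):
    """Return (start_ms, end_ms) pairs of pitch-stable sections.

    pitches: one pitch value in cents per analysis point (one every period_ms).
    """
    n_window = window_ms // period_ms          # analysis points per analysis section TA
    window = deque(maxlen=n_window)            # the most recent n_window pitches
    sections = []
    start = None                               # start time of the section currently open
    for k, pitch in enumerate(pitches):        # SA1: a new pitch PA is identified
        window.append(pitch)                   # SA2: analysis section TA ends at point k
        full = len(window) == n_window
        spread = max(window) - min(window)     # SA3: difference value R
        if full and spread < threshold_cents:
            if start is None:                  # SA4: TA joins (or opens) a stable section
                start = (k - n_window + 1) * period_ms
        elif start is not None:                # SA7/SA8: close at the previous analysis point
            sections.append((start, (k - 1) * period_ms))
            start = None
    if start is not None:                      # the song ended inside a stable section
        sections.append((start, (len(pitches) - 1) * period_ms))
    return sections
```

For example, a held note of 300 ms, a 50 ms upward glide, and a second held note yield two stable sections separated by the glide and by the time the 200 ms window needs to fill with the new note.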

Next, singing evaluation by the evaluation unit 228 is described. FIG. 5 is a flowchart of the processing (evaluation processing) in which the evaluation unit 228 evaluates the degree of harmony between the pitch of the first singing voice and the pitch of the second singing voice. The evaluation processing of FIG. 5 is executed each time the section setting unit 224 sets an overlapping section RTSn (n = 1, 2, 3, 4, ...) in the song and the representative value calculation unit 226 calculates the representative value RPA of the pitch PA and the representative value RPB of the pitch PB within that overlapping section RTS.
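The overlapping sections RTSn on which the evaluation processing runs can be obtained by intersecting the two lists of stable sections. The following sketch assumes that each list holds (start, end) pairs sorted by time and that the sections within one list do not overlap each other; the function name `overlapping_sections` is hypothetical.

```python
def overlapping_sections(sections_a, sections_b):
    """Intersect two time-sorted lists of (start, end) stable sections."""
    overlaps = []
    i = j = 0
    while i < len(sections_a) and j < len(sections_b):
        start = max(sections_a[i][0], sections_b[j][0])
        end = min(sections_a[i][1], sections_b[j][1])
        if start < end:                        # the two sections share a time span
            overlaps.append((start, end))
        # advance whichever section ends first
        if sections_a[i][1] < sections_b[j][1]:
            i += 1
        else:
            j += 1
    return overlaps
```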

The evaluation unit 228 calculates the difference value C between the representative value RPA of the pitch PA and the representative value RPB of the pitch PB in the overlapping section RTS (SB1). Specifically, the difference value C is computed from the representative values RPA and RPB of the overlapping section RTS newly set by the section setting unit 224. When the difference between the representative value RPA and the representative value RPB exceeds 1200 cents (one octave), an integer multiple of 1200 cents is subtracted from the difference so that the resulting difference value C is 1200 cents or less.
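Step SB1 can be illustrated as below. The sketch assumes that both representative values are expressed in cents and uses the remainder operation to subtract the integer multiple of 1200 cents; a difference of exactly two or more whole octaves therefore folds to 0 rather than 1200, which is one possible reading of "1200 cents or less". The function name `difference_value` is hypothetical.

```python
def difference_value(rpa_cents, rpb_cents):
    """Difference value C between two representative pitches, folded into one octave."""
    diff = abs(rpa_cents - rpb_cents)
    if diff > 1200:                # more than one octave apart
        diff = diff % 1200         # subtract an integer multiple of 1200 cents
    return diff
```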

Assume that the song is composed of the twelve scale tones of twelve-tone equal temperament (the discrete pitches that make up the scale of the temperament). When the voice of user U1 singing the song (the first singing voice) and the voice of user U2 singing the song (the second singing voice) are in harmony, the difference between the pitch of the first singing voice and the pitch of the second singing voice is expected to approximate or equal an integer multiple of the interval between two adjacent scale tones (100 cents, a semitone). Therefore, when the difference value C between the representative value RPA and the representative value RPB of one overlapping section RTS corresponds to an integer multiple of 100 cents, the singing pitch of user U1 and the singing pitch of user U2 can be judged to be in harmony in that overlapping section RTS. Accordingly, when the difference value C in the overlapping section RTS corresponds to an integer multiple of 100 cents, the evaluation unit 228 of the first embodiment regards the two pitches as harmonized and increases the evaluation value S. Conversely, when the difference value C deviates from an integer multiple of 100 cents, the pitch of the first singing voice and the pitch of the second singing voice are presumed not to be in harmony, and the evaluation unit 228 decreases the evaluation value S.

For example, when the difference value C between the representative value RPA and the representative value RPB is 300 cents (a minor third) or 400 cents (a major third), both of which are integer multiples of 100 cents, the pitch of the first singing voice and the pitch of the second singing voice harmonize and give a pleasant impression to the ear. Likewise, when the difference value is 700 cents (a perfect fifth), 1200 cents (one octave, that is, a perfect octave), or 0 cents (a perfect unison), that is, when the two pitches form a consonance, they harmonize and give a resonant impression. In view of this, the evaluation value S may be raised further when the difference value C is 300 cents or 400 cents, or when the two pitches are consonant (that is, when the difference value C is 700 cents, 1200 cents, or 0 cents).

The evaluation unit 228 determines whether the difference value C between the representative value RPA and the representative value RPB in the overlapping section RTS approximates (or equals) an integer multiple of 100 cents (SB2). Specifically, it determines whether the difference value C lies within a predetermined range (for example ±10%) that contains an integer multiple of 100 cents. When the difference value C approximates an integer multiple of 100 cents (SB2: YES), the evaluation unit 228 sets, for the overlapping section RTSn, the evaluation value (score) Sn acquired up to the overlapping section RTSn to the sum of the evaluation value Sn-1 acquired from the start of the song up to the overlapping section RTSn-1 and a predetermined value Δ (a positive integer), that is, Sn = Sn-1 + Δ (SB3).

Conversely, when the difference value C deviates from an integer multiple of 100 cents (SB2: NO), the evaluation value Sn acquired up to the n-th overlapping section RTSn is set to the evaluation value Sn-1, acquired from the start of the song up to the overlapping section RTSn-1, minus the predetermined value Δ (a positive integer), that is, Sn = Sn-1 − Δ (SB4). The evaluation unit 228 outputs the evaluation value Sn set in SB3 or SB4 to the display processing unit 28 of the terminal device D1, and the display processing unit 28 displays the evaluation value Sn of the singing on the display device 16 (SB5). The evaluation value Sn thus changes from moment to moment in parallel with playback of the song (the progress of the singing). The evaluation value Sn calculated by the evaluation unit 228 is also transmitted by the communication device 15 to the terminal device D2, which displays the evaluation value Sn in the same way as the terminal device D1.
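Steps SB2 to SB4 can be sketched as follows. The sketch interprets the "±10%" range as ±10 cents around each integer multiple of 100 cents (10% of a semitone); that interpretation, the default Δ = 1, and the function name `update_score` are assumptions made for the example.

```python
def update_score(prev_score, c_cents, delta=1, tolerance_cents=10.0):
    """Return Sn given Sn-1 and the difference value C of overlapping section RTSn."""
    nearest_multiple = 100 * round(c_cents / 100)   # closest integer multiple of 100 cents
    deviation = abs(c_cents - nearest_multiple)
    if deviation <= tolerance_cents:                # SB2: YES, C approximates a multiple
        return prev_score + delta                   # SB3: Sn = Sn-1 + delta
    return prev_score - delta                       # SB4: Sn = Sn-1 - delta
```

Calling the function once per overlapping section, with the previous result as `prev_score`, reproduces the running score that is displayed during playback.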

The method of calculating the evaluation value S from the difference value C may be changed as appropriate. For example, the evaluation value S may be set so as to follow the relationship with the difference value C shown in FIG. 6. As illustrated in FIG. 6, a distribution of the evaluation value S is set in advance for each of a plurality of ranges of the difference value C, and the evaluation value S corresponding to the difference value C between the representative value RPA and the representative value RPB is selected from among the plurality of evaluation values S. Considering that listeners tend to perceive a high degree of harmony for a difference value C of a minor third (3 semitones) or a major third (4 semitones), a high evaluation value S is set for the difference value C corresponding to a minor third (C = 300) and for the difference value C corresponding to a major third (C = 400).

As described above, in the first embodiment the singing is evaluated according to whether the difference value C between the representative value RPA and the representative value RPB in the overlapping section RTS, in which the first pitch stable section TSA and the second pitch stable section TSB overlap on the time axis, is in a predetermined pitch relationship (specifically, an integer multiple of the interval between adjacent scale tones under a particular temperament such as twelve-tone equal temperament). The singing can therefore be evaluated appropriately even when no score data exists for the individual singing parts.

In the first embodiment, the degree of harmony between the pitch of the first singing voice and the pitch of the second singing voice is evaluated according to the representative value RPA and the representative value RPB in each overlapping section RTS, in which a first pitch stable section TSA (where the pitch PA of the first singing signal V1 is stable) and a second pitch stable section TSB (where the pitch PB of the second singing signal V2 is stable) overlap. In a configuration that evaluates the singing over the whole song (hereinafter the "comparative example"), including sections where the pitch glides continuously between notes and sections where the pitch fluctuates as a singing expression (for example vibrato sections), the evaluation covers sections in which the pitch is unstable, so an appropriate evaluation is not always obtained. In the first embodiment, by contrast, the degree of harmony is evaluated from the representative values RPA and RPB calculated in the overlapping sections RTS of the pitch-stable sections TSA and TSB, so that, compared with the comparative example, an appropriate evaluation based on the pitches of stable sections can be achieved.

<Second Embodiment>
In the first embodiment, the second singing signal V2 is transmitted from the terminal device D2 to the terminal device D1, and the terminal device D1 analyzes the pitch PB of the second singing signal V2. In the second embodiment, the pitch PB of the second singing signal V2 is analyzed by the terminal device D2 and transmitted sequentially from the terminal device D2 to the terminal device D1.

FIG. 7 is a schematic diagram of the singing evaluation system 1 of the second embodiment. In the second embodiment, a pitch analysis unit 36 is added to the terminal device D2 used by user U2. The pitch analysis unit 36 sequentially analyzes the pitch PB of the second singing signal V2 generated by the sound collecting device 34. The pitch analysis unit 36 analyzes the pitch PB in the same way as the pitch analysis unit 222 of the first embodiment. As illustrated in FIG. 7, the communication device 35 of the second embodiment transmits the pitch PB of the second singing signal V2 analyzed by the pitch analysis unit 36 to the terminal device D1. The communication device 15 of the terminal device D1 sequentially receives the pitch PB transmitted from the terminal device D2.

The pitch analysis unit 222 of the terminal device D1 sequentially analyzes the pitch PA of the first singing signal V1 and notifies the section setting unit 224 of the analyzed pitch PA. The section setting unit 224 sets a plurality of first pitch stable sections TSA in which the pitch PA analyzed by the pitch analysis unit 222 is stable, and a plurality of second pitch stable sections TSB in which the pitch PB received by the communication device 15 is stable. The remaining processing is the same as in the first embodiment, so its description is omitted.

As understood from the above description, the second embodiment achieves the same effects as the first embodiment. In addition, because the pitch analysis unit 222 does not need to analyze the pitch PB of the second singing signal V2, the second embodiment has the advantage of reducing the computational load on the terminal device D1.

<Third Embodiment>
In the second embodiment, the section setting unit 224 obtains the pitch PA from the pitch analysis unit 222 in the terminal device D1 and obtains the pitch PB received by the communication device 15. Because the pitch PB is transmitted from the terminal device D2 to the terminal device D1, even when users U1 and U2 sing the same song in synchronization with each other, the time at which the section setting unit 224 obtains the pitch PB for a given point in the song may lag the time at which it obtains the pitch PA for that point, for example by the communication delay between the terminal device D1 and the terminal device D2. Specifically, as illustrated in FIG. 8, a delay (Δt) due to communication may arise between the time at which the section setting unit 224 obtains the pitch PA for each point in the song (for example, the position on the time axis of the analysis point KAn [n = 1, 2, 3, ...]) and the time at which it obtains the pitch PB for the same point in the song (for example, the position on the time axis of the analysis point KBn [n = 1, 2, 3, ...]). The third embodiment therefore executes processing that compensates for this delay.

The configuration of the singing evaluation system 1 of the third embodiment is the same as that of the second embodiment. When user U1 selects a song to sing, an instruction to play the song is transmitted from the terminal device D1 to the terminal device D2. Because the song starts at the same time on the terminal device D1 and the terminal device D2, users U1 and U2 sing the same song in synchronization. That is, the time at which the terminal device D1 analyzes the pitch (analysis point KA) coincides with the time at which the terminal device D2 analyzes the pitch (analysis point KB). In the third embodiment, each time the pitch analysis unit 222 of the terminal device D1 analyzes the pitch PA of the first singing signal V1, it notifies the section setting unit 224 of information indicating the position of the analysis point KA on the time axis (hereinafter "first time information"). Likewise, each time the pitch analysis unit 36 of the terminal device D2 analyzes the pitch PB of the second singing signal V2, it transmits the analyzed pitch PB, together with information indicating the position of the analysis point KB on the time axis (hereinafter "second time information"), to the terminal device D1 via the communication device 35. The communication device 15 of the terminal device D1 receives the pitch PB and the second time information transmitted by the communication device 35. Examples of the time information include the time elapsed since the start of the song and the clock time.

FIG. 9 is an explanatory diagram of the processing by the section setting unit 224 of the third embodiment. The section setting unit 224 of the third embodiment associates the first time information notified by the pitch analysis unit 222 with the second time information notified by the pitch analysis unit 36 through communication between the communication device 35 and the communication device 15. In this way, among the plurality of analysis points KA indicated by the first time information (first analysis points KA) and the plurality of analysis points KB indicated by the second time information (second analysis points KB), those that correspond to each other (that is, to the same point in the song) are made to coincide on the time axis.

After making the mutually corresponding analysis points KA and KB coincide on the time axis, the section setting unit 224 sets the first pitch stable sections TSA and the second pitch stable sections TSB by the same procedure as in the first embodiment, and sets the overlapping sections RTS in which a first pitch stable section TSA and a second pitch stable section TSB overlap on the time axis.
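The matching of analysis points by time information can be sketched as follows. The sketch assumes that both devices stamp each pitch with the elapsed time since the start of the song in milliseconds and that the two devices analyze at identical time points; the dictionary representation and the function name `align_by_time` are hypothetical.

```python
def align_by_time(pitches_a, pitches_b):
    """Pair pitches PA and PB that carry the same time information.

    pitches_a, pitches_b: dicts mapping elapsed time (ms since song start) to pitch.
    Returns a time-sorted list of (time_ms, pitch_a, pitch_b) tuples.
    """
    common_times = sorted(pitches_a.keys() & pitches_b.keys())
    return [(t, pitches_a[t], pitches_b[t]) for t in common_times]
```

Because the pairing relies only on the time stamps, the order and delay with which the pitches PB arrive over the network do not affect the result.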

As described above, the third embodiment also achieves the same effects as the preceding embodiments. In the third embodiment, the first analysis points KA at which the pitch PA of the first singing signal V1 is analyzed and the second analysis points KB at which the pitch PB of the second singing signal V2 is analyzed are made to coincide on the time axis. Therefore, even when a delay occurs in the communication between the terminal device D1 and the terminal device D2, the degree of harmony between the first singing voice and the second singing voice can be evaluated appropriately.

<Fourth Embodiment>
In each of the preceding embodiments, the degree of harmony between the pitch of the first singing voice and the pitch of the second singing voice is evaluated according to whether the first singing voice and the second singing voice are in a predetermined pitch relationship. The fourth embodiment evaluates the degree of similarity in singing pitch between the first singing voice and the second singing voice.

FIG. 10 is a schematic diagram of the singing evaluation system 1 of the fourth embodiment. As can be seen from FIG. 10, the terminal device D1 of the fourth embodiment adds a pitch distribution generation unit 225 and an analysis processing unit 227 to the configuration of the terminal device D1 of the first embodiment. The function and operation of the pitch analysis unit 222 are the same as in the first embodiment, so their detailed description is omitted.

The section setting unit 224 of the fourth embodiment sets a plurality of first pitch stable sections TSA in which the pitch PA of the first singing signal V1 is stable and a plurality of second pitch stable sections TSB in which the pitch PB of the second singing signal V2 is stable. It does not set the overlapping sections RTS in which a first pitch stable section TSA and a second pitch stable section TSB overlap. Once the first pitch stable sections TSA and the second pitch stable sections TSB have been set over the whole song, the representative value calculation unit 226 calculates, by the same method as in the first embodiment, the representative value RPA of the pitch PA in each of the first pitch stable sections TSA and the representative value RPB of the pitch PB in each of the second pitch stable sections TSB. The representative value calculation unit 226 stores the calculated representative values RPA and RPB sequentially in the storage device 12. The storage device 12 therefore holds as many representative values RPA as there are first pitch stable sections TSA in the song and as many representative values RPB as there are second pitch stable sections TSB. The pitch distribution generation unit 225 generates a first frequency distribution, which is the frequency distribution of the representative values RPA over the plurality of first pitch stable sections TSA, and a second frequency distribution, which is the frequency distribution of the representative values RPB over the plurality of second pitch stable sections TSB.

FIG. 11 is an explanatory diagram of the processing of the pitch distribution generation unit 225. The pitch distribution generation unit 225 reads from the storage device 12 the representative values RPA of the plurality of first pitch stable sections TSA and creates the first frequency distribution (histogram) HA of those representative values RPA. The frequency distribution HA gives the frequency of the representative values RPA in each of a plurality of classes obtained by dividing the numerical range of the representative value RPA sufficiently finely compared with the pitch difference between adjacent scale tones (100 cents). In the same way, the pitch distribution generation unit 225 reads from the storage device 12 the representative values RPB of the plurality of second pitch stable sections TSB and creates the second frequency distribution HB of those representative values RPB.
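The generation of a frequency distribution such as HA can be sketched as follows. The class width of 10 cents is an assumption (the description only requires it to be sufficiently finer than 100 cents), as are the function name `pitch_histogram` and the convention of labelling each class by its lower bound in cents.

```python
from collections import Counter

def pitch_histogram(representatives, bin_cents=10):
    """Count representative pitches (in cents) per class of width bin_cents.

    Each class is keyed by its lower bound in cents.
    """
    return Counter(int(r // bin_cents) * bin_cents for r in representatives)
```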

The analysis processing unit 227 of FIG. 10 sums the plurality of partial distributions of the first frequency distribution HA generated by the pitch distribution generation unit 225, each partial distribution being the distribution over a predetermined range that contains the pitch of one scale tone. Specifically, the analysis processing unit 227 divides the first frequency distribution HA into a plurality of unit ranges Tu (Tu1, Tu2, Tu3, Tu4, Tu5, Tu6), one per scale tone, sums the distributions of the unit ranges Tu to create one first evaluation distribution QA, and calculates a representative value of the first evaluation distribution QA. In the fourth embodiment, the average value A1 of the first evaluation distribution QA is calculated as its representative value. The unit range Tu corresponding to any one scale tone is the range of ±50 cents centered on the pitch of that scale tone. In the same way, the analysis processing unit 227 divides the second frequency distribution HB into a plurality of unit ranges Tu, one per scale tone, sums the distributions of the unit ranges Tu to create one second evaluation distribution QB, and calculates its representative value (the average value A2 of the second evaluation distribution QB).

FIG. 12 is an explanatory diagram of the first evaluation distribution QA created by the analysis processing unit 227. The analysis processing unit 227 maps the distribution of each unit range Tu onto a range of a numerical value α spanning ±50 cents (−50 ≤ α ≤ +50), in which the pitch of the scale tone (1200 cents, 1300 cents, 1400 cents, 1500 cents, 1600 cents, and 1700 cents in FIG. 11) corresponds to 0, and then sums the frequencies of the distributions over the plurality of unit ranges Tu for each numerical value α. The analysis processing unit 227 then normalizes the summed frequencies so that their total within the ±50 cent range (the area under the distribution) equals a predetermined value (for example, 1), thereby generating the first evaluation distribution QA. In other words, the analysis processing unit 227 divides the frequency distribution H (HA, HB) into a plurality of unit ranges Tu, each centered on the pitch of a scale tone, superimposes the unit ranges Tu on one another, and sums the frequencies of the distributions of the unit ranges Tu for each numerical value α (pitch) over the plurality of unit ranges Tu (and further normalizes the summed frequencies) to generate the first evaluation distribution QA. The function G(α) in FIG. 12 expresses the normalized first evaluation distribution QA with the numerical value α as its variable. The analysis processing unit 227 calculates the average value A1 of the first evaluation distribution QA. The analysis processing unit 227 also generates the second evaluation distribution QB by the same method as the first evaluation distribution QA and calculates its average value A2.
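As a rough sketch of this folding step, the following code superimposes all ±50 cent unit ranges, normalizes the result so that the weights sum to 1, and computes the mean deviation. The function name, the histogram layout (class center in cents mapped to a count), and the sample data are assumptions for illustration. A value lying exactly on the boundary between two unit ranges is assigned here to the upper scale tone (α = −50), a tie-breaking choice the source does not specify.

```python
import math


def evaluation_distribution(histogram, semitone=100.0):
    """Fold a pitch histogram onto deviations from the nearest scale tone.

    histogram maps a class center pitch (cents) to its count.
    Returns (q, mean): q maps each deviation alpha to a normalized weight
    whose sum is 1, and mean is the average deviation in cents.
    """
    folded = {}
    for pitch, count in histogram.items():
        # Nearest scale tone; exact half-way values go to the upper tone.
        nearest_tone = math.floor(pitch / semitone + 0.5) * semitone
        alpha = pitch - nearest_tone
        folded[alpha] = folded.get(alpha, 0) + count
    total = sum(folded.values())
    if total == 0:
        return {}, 0.0
    q = {alpha: count / total for alpha, count in sorted(folded.items())}
    mean = sum(alpha * weight for alpha, weight in q.items())
    return q, mean


# Hypothetical first frequency distribution HA (class center -> count).
ha = {1205.0: 3, 1295.0: 1, 1410.0: 2, 1515.0: 2}
qa, a1 = evaluation_distribution(ha)
```

With this sample data the mean deviation `a1` is positive, which corresponds to a singer who tends to sing slightly above the scale tones.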

As understood from the above description, the first evaluation distribution QA expresses how the representative values RPA of the pitch PA tend to be distributed relative to the pitch of each of the plurality of scale tones. Specifically, when the average value A1 of the first evaluation distribution QA is large, it can be inferred that the user U1 tends to sing sharp, that is, above the proper pitch of the scale tones. Similarly, the second evaluation distribution QB expresses how the representative values RPB of the pitch PB tend to be distributed relative to the pitch of each scale tone. Specifically, when the average value A2 of the second evaluation distribution QB is large, it can be inferred that the user U2 tends to sing above the proper pitch of the scale tones.

The evaluation unit 228 evaluates the degree of similarity in singing pitch between the first singing voice and the second singing voice according to the average value A1 of the first evaluation distribution QA and the average value A2 of the second evaluation distribution QB calculated by the analysis processing unit 227. When the singing pitches of the first singing voice and the second singing voice are close (for example, when both the user U1 and the user U2 tend to sing above the pitch of the scale tones), the average value A1 and the average value A2 are expected to be close to each other. Conversely, when the singing pitches of the first singing voice and the second singing voice diverge (for example, when the user U1 tends to sing above the scale tones while the user U2 tends to sing below them), the average value A1 and the average value A2 are expected to diverge. In view of this tendency, the evaluation unit 228 of the fourth embodiment sets the evaluation value S to a larger value the closer the average value A1 of the first evaluation distribution QA is to the average value A2 of the second evaluation distribution QB, on the grounds that the singing pitches of the first singing voice and the second singing voice are similar throughout the music. Conversely, the evaluation unit 228 sets the evaluation value S to a smaller value the more the average value A1 and the average value A2 diverge, on the grounds that the singing pitches of the first singing voice and the second singing voice diverge throughout the music.
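The source states only that the evaluation value S grows as A1 and A2 approach each other and shrinks as they diverge; it gives no formula. One possible mapping, shown purely as an assumption, is a linear one: since both averages lie within ±50 cents, their difference is at most 100 cents, and S can be scaled from 100 (identical averages) down to 0 (maximum difference). The function name and the 0 to 100 scale are hypothetical.

```python
def similarity_score(a1, a2, max_diff=100.0, full_score=100.0):
    """Map the gap between two mean deviations (cents) to a score.

    Assumed linear mapping: a gap of 0 gives full_score, and a gap of
    max_diff or more gives 0. The source does not prescribe this formula.
    """
    diff = min(abs(a1 - a2), max_diff)
    return full_score * (1.0 - diff / max_diff)


# Both singers slightly sharp: the averages are close, so S is high.
s_similar = similarity_score(12.0, 9.0)
# One singer sharp and the other flat: the averages diverge, so S is lower.
s_divergent = similarity_score(30.0, -30.0)
```

Any monotonically decreasing function of |A1 − A2| would satisfy the behavior described in the text equally well.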

FIG. 13 is a flowchart of the operation of the singing evaluation unit 22 (the pitch distribution generation unit 225, the analysis processing unit 227, and the evaluation unit 228) of the fourth embodiment. The evaluation process of FIG. 13 is executed once the representative value calculation unit 226 has calculated, over the entire music, the representative value RPA of the pitch PA in each first pitch stable section TSAn (n = 1, 2, 3, 4, ...) and the representative value RPB of the pitch PB in each second pitch stable section TSBn (n = 1, 2, 3, 4, ...).

The pitch distribution generation unit 225 creates the first frequency distribution HA of the representative values RPA over the plurality of first pitch stable sections TSA in the music, and creates the second frequency distribution HB of the representative values RPB over the plurality of second pitch stable sections TSB in the music (SD1). The analysis processing unit 227 creates the first evaluation distribution QA from the plurality of unit ranges Tu obtained by dividing the first frequency distribution HA generated by the pitch distribution generation unit 225 for each scale tone, and creates the second evaluation distribution QB from the plurality of unit ranges Tu obtained by dividing the second frequency distribution HB for each scale tone (SD2). The analysis processing unit 227 then calculates the average value A1 of the first evaluation distribution QA and the average value A2 of the second evaluation distribution QB (SD3).

The evaluation unit 228 evaluates the degree of similarity in singing pitch between the first singing voice and the second singing voice according to the average value A1 of the first evaluation distribution QA and the average value A2 of the second evaluation distribution QB calculated by the analysis processing unit 227 (SD4). For example, the closer the average value A1 is to the average value A2, the larger the evaluation value S is set, on the grounds that the singing pitch of the first singing voice and the singing pitch of the second singing voice are similar throughout the music. Conversely, the more the average value A1 and the average value A2 diverge, the smaller the evaluation value S is set, on the grounds that the two singing pitches diverge throughout the music. The display processing unit 28 causes the display device 16 to display the evaluation value S (SD5).

As understood from the above description, in the fourth embodiment the degree of similarity in singing pitch between the first singing voice and the second singing voice is evaluated according to the average value A1 of the first evaluation distribution QA (the representative value of the first evaluation distribution QA), generated from the first frequency distribution HA of the representative values RPA in the first pitch stable sections TSA, and the average value A2 of the second evaluation distribution QB (the representative value of the second evaluation distribution QB), generated from the second frequency distribution HB of the representative values RPB in the second pitch stable sections TSB. Therefore, as with the first embodiment, the configuration of the fourth embodiment makes it possible to appropriately evaluate the degree of similarity in singing pitch between the user U1 and the user U2 without requiring score data for each singing part of the music. In addition, because the fourth embodiment uses the representative values RPA in the first pitch stable sections TSA and the representative values RPB in the second pitch stable sections TSB as the objects of the singing evaluation, the singing pitch can be evaluated more appropriately than in a configuration that evaluates the singing pitch over all sections of the music, including sections in which the pitch transition is unstable.

<Fifth Embodiment>
Each of the embodiments described above illustrates a configuration in which the terminal device D1 used by the user U1 serves as the singing evaluation device. The fifth embodiment illustrates a configuration in which a management device 500 that communicates with the terminal devices D (D1, D2) serves as the singing evaluation device.

FIG. 14 is a schematic diagram of the singing evaluation system 1 of the fifth embodiment. As illustrated in FIG. 14, the singing evaluation system 1 of the fifth embodiment includes the management device 500, the terminal device D1 used by the user U1, and the terminal device D2 used by the user U2. The terminal device D1 and the terminal device D2 each have the same configuration as the terminal device D2 illustrated in FIG. 7. That is, the pitch PA of the first singing signal V1 indicating the first singing voice of the user U1 is sequentially transmitted from the terminal device D1 to the management device 500, and the pitch PB of the second singing signal V2 indicating the second singing voice of the user U2 is sequentially transmitted from the terminal device D2 to the management device 500.

The management device 500 is a device that evaluates the degree of pitch harmony between the singing voice of the user U1 and the singing voice of the user U2 and transmits an evaluation value S indicating the evaluation result to both the terminal device D1 of the user U1 and the terminal device D2 of the user U2. It includes a singing evaluation unit 52 and a communication device 54. The communication device 54 communicates with each of the terminal device D1 and the terminal device D2. Specifically, the communication device 54 sequentially receives the pitch PA transmitted by the terminal device D1 and the pitch PB transmitted by the terminal device D2. The singing evaluation unit 52 evaluates the degree of harmony between the pitch of the singing voice of the user U1 and the pitch of the singing voice of the user U2. Specifically, the singing evaluation unit 52 of the fifth embodiment includes a section setting unit 224, a representative value calculation unit 226, and an evaluation unit 228, as does the singing evaluation unit 22 of the second embodiment. The processing of the section setting unit 224, the representative value calculation unit 226, and the evaluation unit 228 is the same as in the second embodiment, so a detailed description is omitted. The evaluation value S calculated by the evaluation unit 228 is transmitted from the communication device 54 to the terminal device D1 and the terminal device D2 and displayed on the display device of each.

The fifth embodiment achieves the same effects as the embodiments described above. In addition, in the configuration of the fifth embodiment, the management device 500 evaluates the degree of harmony between the pitch of the first singing voice of the user U1 and the pitch of the second singing voice of the user U2. This has the advantage that the singing evaluation unit 22 need not be installed in the terminal device D1. Although FIG. 14 illustrates a configuration in which the terminal device D1 transmits the pitch PA of the first singing signal V1 and the terminal device D2 transmits the pitch PB of the second singing signal V2, a configuration may also be adopted in which the terminal device D1 transmits the first singing signal V1 itself to the management device 500 and the terminal device D2 transmits the second singing signal V2 itself to the management device 500. Specifically, in that configuration a pitch analysis unit 222 that analyzes the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2 is added to the singing evaluation unit 52 of the management device 500.

<Modifications>
Each of the embodiments described above can be modified in various ways. Specific modifications are illustrated below. Two or more modifications arbitrarily selected from the following examples may be combined as appropriate.

(1) In the fourth embodiment, the second singing signal V2 is transmitted from the terminal device D2 to the terminal device D1, and the terminal device D1 analyzes the pitch PB of the second singing signal V2. However, as in the second and third embodiments, the pitch PB of the second singing signal V2 may instead be analyzed by the terminal device D2 and sequentially transmitted from the terminal device D2 to the terminal device D1.

(2) In the fourth embodiment, the average value A1 of the first evaluation distribution QA and the average value A2 of the second evaluation distribution QB are illustrated as the representative values of the respective distributions, but the representative value that characterizes each of the first evaluation distribution QA and the second evaluation distribution QB is not limited to the average value. For example, for each of the first evaluation distribution QA and the second evaluation distribution QB, an index value other than the average (such as the median or the mode) or any of various other statistics, such as the variance or the second moment, may be calculated as the representative value. For example, the singing pitches of the first singing voice and the second singing voice can be evaluated as more similar the closer the variances or second moments of the first evaluation distribution QA and the second evaluation distribution QB are to each other.
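The alternative representative values named in modification (2) can be computed from a normalized evaluation distribution as in the sketch below. The function name and the input layout (deviation α in cents mapped to a weight, with weights summing to 1) are assumptions made for illustration. The second moment is taken here about the origin (α = 0), which is one common reading of the term; the source does not say which is meant.

```python
def distribution_statistics(q):
    """Summarize a normalized evaluation distribution.

    q maps each deviation alpha (cents) to a weight; weights sum to 1.
    Returns the mean, median, mode, variance, and second moment about 0.
    """
    if not q:
        raise ValueError("evaluation distribution is empty")
    items = sorted(q.items())
    mean = sum(alpha * w for alpha, w in items)
    variance = sum(w * (alpha - mean) ** 2 for alpha, w in items)
    second_moment = sum(w * alpha ** 2 for alpha, w in items)
    mode = max(items, key=lambda item: item[1])[0]
    # Weighted median: first alpha whose cumulative weight reaches 0.5.
    cumulative = 0.0
    median = items[-1][0]
    for alpha, w in items:
        cumulative += w
        if cumulative >= 0.5:
            median = alpha
            break
    return {
        "mean": mean,
        "median": median,
        "mode": mode,
        "variance": variance,
        "second_moment": second_moment,
    }


# Hypothetical symmetric evaluation distribution.
stats = distribution_statistics({-10.0: 0.25, 0.0: 0.5, 10.0: 0.25})
```

Comparing, for example, the `variance` entries computed for QA and QB gives a measure of whether the two singers scatter around the scale tones to a similar extent, independently of whether they sing sharp or flat on average.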

(3) In the fifth embodiment, the degree of harmony between the first singing voice and the second singing voice is evaluated according to whether or not the representative value RPA of the pitch PA of the first singing signal V1 and the representative value RPB of the pitch PB of the second singing signal V2 are in a predetermined pitch relationship. Alternatively, the pitch distribution generation unit 225 and the analysis processing unit 227 may be added to the management device 500 of the fifth embodiment so that the degree of similarity in singing pitch between the first singing voice and the second singing voice is evaluated.

(4) Each of the embodiments described above illustrates a configuration in which the music data L, which includes the accompaniment data B and the lyrics data Q, is stored in the storage device 12 of the terminal device D1. Alternatively, the terminal device D1 may receive the music data L by communicating with a music providing server (not shown) over a communication network 400 (for example, a mobile communication network or the Internet).

(5) Each of the embodiments described above illustrates a configuration in which the evaluation value S is displayed as the result of the singing evaluation. Instead of the evaluation value S, text information such as "OK" or "NG" may be displayed as the evaluation information.
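A minimal sketch of modification (5) follows. The source does not state how the text is chosen, so the threshold rule, the threshold of 70, and the function name are all assumptions; they presuppose an evaluation value S on a 0 to 100 scale.

```python
def evaluation_label(score, threshold=70.0):
    """Convert a numeric evaluation value S into display text.

    threshold is a hypothetical cut-off; the source does not specify one.
    """
    return "OK" if score >= threshold else "NG"


label_high = evaluation_label(85.0)
label_low = evaluation_label(40.0)
```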

(6) Each of the embodiments described above illustrates a configuration in which the user U1 sings toward the terminal device D1 while the user U2 sings toward the terminal device D2, but the user U1 and the user U2 may instead sing toward a single terminal device. Specifically, for example, a plurality of sound collecting devices (14A and 14B) may be provided in the terminal device D1, with the user U1 singing toward the sound collecting device 14A and the user U2 singing toward the sound collecting device 14B. This configuration also achieves the same effects as the first embodiment.

DESCRIPTION OF SYMBOLS: 1 ... singing evaluation system, 10 ... arithmetic processing device, 12 ... storage device, 14 ... sound collecting device, 15 ... communication device, 16 ... display device, 18 ... sound emitting device, 22 ... singing evaluation unit, 26 ... reproduction processing unit, 28 ... display processing unit, 34 ... sound collecting device, 35 ... communication device, 36 ... pitch analysis unit, 52 ... singing evaluation unit, 54 ... communication device, 222 ... pitch analysis unit, 224 ... section setting unit, 225 ... pitch distribution generation unit, 226 ... representative value calculation unit, 227 ... analysis processing unit, 228 ... evaluation unit, 500 ... management device, B ... accompaniment data, Q ... lyrics data, L ... music data, TSA ... first pitch stable section, TSB ... second pitch stable section, RTS ... overlapping section, Tu ... unit range.

Claims

A singing evaluation device comprising:
a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal indicating one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal indicating another singing voice of the piece of music is stable;
a representative value calculation unit that calculates, for each overlapping section in which one of the plurality of first pitch stable sections and one of the plurality of second pitch stable sections set by the section setting unit overlap on the time axis, a representative value of the pitch of the first singing signal and a representative value of the pitch of the second singing signal; and
an evaluation unit that evaluates the degree of harmony between the one singing voice and the other singing voice according to whether or not the representative value of the pitch of the first singing signal and the representative value of the pitch of the second singing signal in each overlapping section are in a predetermined pitch relationship.

The singing evaluation device according to claim 1, further comprising:
a pitch analysis unit that sequentially analyzes the pitch of the first singing signal; and
a receiving unit that receives pitches sequentially analyzed from the second singing signal,
wherein the section setting unit sets the plurality of first pitch stable sections, in which the pitch of the first singing signal analyzed by the pitch analysis unit is stable, and the plurality of second pitch stable sections, in which the pitch of the second singing signal received by the receiving unit is stable.

The singing evaluation device according to claim 2, wherein
the pitch analysis unit sequentially analyzes the pitch of the first singing signal at each of a plurality of first analysis points on the time axis,
the receiving unit receives the pitch of the second singing signal sequentially analyzed at each of a plurality of second analysis points on the time axis, and
the section setting unit aligns mutually corresponding ones of the plurality of first analysis points and the plurality of second analysis points on the time axis, and then sets the overlapping sections in which the first pitch stable sections and the second pitch stable sections overlap.

A singing evaluation device comprising:
a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal indicating one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal indicating another singing voice of the piece of music is stable;
a representative value calculation unit that calculates a representative value of the pitch in each of the plurality of first pitch stable sections and a representative value of the pitch in each of the plurality of second pitch stable sections;
a pitch distribution generation unit that generates a first frequency distribution indicating the frequency distribution of the representative values of the pitch over the plurality of first pitch stable sections, and a second frequency distribution indicating the frequency distribution of the representative values of the pitch over the plurality of second pitch stable sections;
an analysis processing unit that creates a first evaluation distribution by dividing the first frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the distributions of the unit ranges for each pitch over the plurality of unit ranges, and that creates a second evaluation distribution by dividing the second frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the distributions of the unit ranges for each pitch over the plurality of unit ranges; and
an evaluation unit that evaluates the degree of similarity in singing pitch between the one singing voice and the other singing voice based on the first evaluation distribution and the second evaluation distribution.

A program that causes a computer to function as:
a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal indicating one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal indicating another singing voice of the piece of music is stable;
a representative value calculation unit that calculates, for each overlapping section in which one of the plurality of first pitch stable sections and one of the plurality of second pitch stable sections set by the section setting unit overlap on the time axis, a representative value of the pitch of the first singing signal and a representative value of the pitch of the second singing signal; and
an evaluation unit that evaluates the degree of harmony between the one singing voice and the other singing voice according to whether or not the representative value of the pitch of the first singing signal and the representative value of the pitch of the second singing signal in each overlapping section are in a predetermined pitch relationship.

A program that causes a computer to function as:
a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal indicating one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal indicating another singing voice of the piece of music is stable;
a representative value calculation unit that calculates a representative value of the pitch in each of the plurality of first pitch stable sections and a representative value of the pitch in each of the plurality of second pitch stable sections;
a pitch distribution generation unit that generates a first frequency distribution indicating the frequency distribution of the representative values of the pitch over the plurality of first pitch stable sections, and a second frequency distribution indicating the frequency distribution of the representative values of the pitch over the plurality of second pitch stable sections;
an analysis processing unit that creates a first evaluation distribution by dividing the first frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the distributions of the unit ranges for each pitch over the plurality of unit ranges, and that creates a second evaluation distribution by dividing the second frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the distributions of the unit ranges for each pitch over the plurality of unit ranges; and
an evaluation unit that evaluates the degree of similarity in singing pitch between the one singing voice and the other singing voice based on the first evaluation distribution and the second evaluation distribution.