JP6488767B2 - Singing evaluation device and program - Google Patents

Publication number: JP6488767B2
Authority: JP (Japan)
Prior art keywords: pitch, singing, unit, stable, evaluation
Legal status: Active (an assumption, not a legal conclusion)
Application number: JP2015041620A
Other languages: Japanese (ja)
Other versions: JP2016161831A (en)
Inventor: 川嶋 隆宏
Current Assignee: Yamaha Corp (the listed assignees may be inaccurate)
Original Assignee: Yamaha Corp
Application filed by: Yamaha Corp
Priority to: JP2015041620A
Earlier publication: JP2016161831A
Granted publication: JP6488767B2

Landscapes

  • Auxiliary Devices For Music (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)

Description

The present invention relates to a technique for evaluating singing.

Various singing evaluation techniques have been proposed that, for a piece of music having a plurality of singing parts, analyze the sung voice (hereinafter referred to as the "singing voice") to evaluate the skill of the singing for each singing part. For example, Patent Document 1 discloses, in a technique for evaluating the singing voices of singers across a plurality of singing parts, a configuration that evaluates the skill of the singing for each singing part by comparing the pitch of the singing voice (singing pitch) with evaluation data representing the pitch that the user should sing.

Patent Document 1: JP 2008-268368 A

However, the configuration of Patent Document 1 requires evaluation data to be prepared for each of the plurality of singing parts, so the burden of creating the evaluation data is large. In view of these circumstances, an object of the present invention is to appropriately evaluate the skill of singing voices for a plurality of singing parts without presupposing the existence of evaluation data.

To solve the above problems, a singing evaluation device according to a first aspect of the present invention includes: a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal representing one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal representing another singing voice of the piece of music is stable; a representative value calculation unit that calculates, for each overlapping section in which one of the first pitch stable sections and one of the second pitch stable sections set by the section setting unit overlap on the time axis, a representative value of the pitch of the first singing signal and a representative value of the pitch of the second singing signal; and an evaluation unit that evaluates the degree of harmony between the one singing voice and the other singing voice according to whether the representative value of the pitch of the first singing signal and the representative value of the pitch of the second singing signal in each overlapping section are in a predetermined pitch relationship. In this configuration, the degree of harmony between the singing voices is evaluated according to whether the two representative values are in the predetermined pitch relationship. Therefore, even when no evaluation data exists for the singing parts of the music, the degree of harmony between the singing voice of one singing part and the singing voice of another singing part can be evaluated appropriately.
Furthermore, in this configuration, the relationship between the representative value of the pitch of the first singing signal and the representative value of the pitch of the second singing signal is determined for each overlapping section in which a first pitch stable section and a second pitch stable section overlap. The singing pitch can therefore be evaluated appropriately with a lower computational load than in a configuration that evaluates the singing pitch over the entire piece of music.
The predetermined pitch relationship is, for example, a relationship in which the representative value of one pitch stable section and the representative value of the other pitch stable section differ by an integer multiple of a semitone (100 cents) in twelve-tone equal temperament. Examples of the representative value include the mean and the median.
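The semitone relationship described above can be illustrated with a minimal Python sketch. The function names, the 440 Hz reference used for the cent conversion, and the 20-cent tolerance are assumptions made for illustration only; the text states only that the representative values differ by an integer multiple of 100 cents.

```python
import math


def hz_to_cents(freq_hz: float, ref_hz: float = 440.0) -> float:
    """Convert a frequency to cents relative to ref_hz (assumed reference)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)


def in_pitch_relationship(rep_a_cents: float, rep_b_cents: float,
                          tolerance_cents: float = 20.0) -> bool:
    """Return True if the two representative pitches differ by roughly an
    integer multiple of a semitone (100 cents).

    tolerance_cents is a hypothetical parameter; the source does not say how
    close to an exact multiple the difference must be.
    """
    diff = abs(rep_a_cents - rep_b_cents)
    remainder = diff % 100.0
    # Distance from the difference to the nearest multiple of 100 cents.
    deviation = min(remainder, 100.0 - remainder)
    return deviation <= tolerance_cents
```

With these assumptions, two representative values a major third apart (400 cents) satisfy the relationship, while values 450 cents apart do not.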

A preferred example of the singing evaluation device according to the first aspect includes a pitch analysis unit that sequentially analyzes the pitch of the first singing signal and a receiving unit that receives pitches sequentially analyzed from the second singing signal. The section setting unit sets a plurality of first pitch stable sections in which the pitch of the first singing signal analyzed by the pitch analysis unit is stable, and a plurality of second pitch stable sections in which the pitch of the second singing signal received by the receiving unit is stable. In this configuration, the pitch analyzed from the second singing signal is received by the receiving unit, so the pitch analysis unit does not need to analyze the pitch of the second singing signal. The processing load on the pitch analysis unit can therefore be reduced compared with a configuration in which the pitch analysis unit analyzes the pitch of the second singing signal in addition to that of the first singing signal.

In a preferred example of the singing evaluation device according to the first aspect, the pitch analysis unit sequentially analyzes the pitch of the first singing signal at each first analysis point on the time axis, and the receiving unit receives the pitch of the second singing signal sequentially analyzed at each second analysis point on the time axis. The section setting unit matches corresponding first analysis points and second analysis points with each other on the time axis and then sets the overlapping sections in which the first pitch stable sections and the second pitch stable sections overlap. Because the first analysis points and the second analysis points are matched on the time axis, this configuration has the advantage that the degree of harmony between the pitch of one singing part (for example, the one singing voice) and the pitch of another singing part (for example, the other singing voice) can be evaluated appropriately at analysis points that coincide on the time axis.

A singing evaluation device according to a second aspect of the present invention includes: a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal representing one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal representing another singing voice of the piece of music is stable; a representative value calculation unit that calculates a representative value of the pitch in each of the first pitch stable sections and a representative value of the pitch in each of the second pitch stable sections; a pitch distribution generation unit that generates a first frequency distribution, which is the frequency distribution of the representative pitch values over the plurality of first pitch stable sections, and a second frequency distribution, which is the frequency distribution of the representative pitch values over the plurality of second pitch stable sections; an analysis processing unit that creates a first evaluation distribution by dividing the first frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the unit ranges for each pitch across the plurality of unit ranges, and that creates a second evaluation distribution from the second frequency distribution in the same manner; and an evaluation unit that evaluates the degree of similarity in singing pitch between the one singing voice and the other singing voice based on the first evaluation distribution and the second evaluation distribution. In this configuration, the degree of similarity in singing-pitch tendency between the one singing voice and the other singing voice is evaluated according to the first frequency distribution of the representative pitch values of the first singing signal over the first pitch stable sections and the second frequency distribution of the representative pitch values of the second singing signal over the second pitch stable sections. Therefore, even when no evaluation data to be compared with the singing voice exists, the degree of similarity in singing-pitch tendency between the two singing voices can be evaluated appropriately. A further advantage is that the singing pitch can be evaluated appropriately with a lower processing load than in a configuration in which all sections of the first singing signal and the second singing signal are evaluated.
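The superposition of unit ranges described for the second aspect can be sketched as follows. This is one possible reading, assuming that pitches are expressed in cents, that scale tones lie at multiples of 100 cents, and that each unit range spans 50 cents on either side of a scale tone. The 10-cent bin width and the histogram-intersection similarity measure are assumptions introduced for illustration; the text does not specify how the two evaluation distributions are compared.

```python
def evaluation_distribution(rep_values_cents, bin_width=10):
    """Fold a histogram of representative pitches onto a single 100-cent
    unit range centred on the nearest scale tone.

    Index 0 corresponds to -50 cents from the scale tone; the centre bin
    corresponds to pitches on (or just above) the scale tone.
    """
    n_bins = 100 // bin_width
    folded = [0] * n_bins
    for value in rep_values_cents:
        # Position inside the unit range, shifted so that -50 cents maps to 0.
        position = (value + 50.0) % 100.0
        folded[int(position // bin_width)] += 1
    return folded


def distribution_similarity(dist_a, dist_b):
    """Histogram intersection of the normalised distributions (assumed
    measure, not taken from the source). Returns a value in [0, 1]."""
    total_a, total_b = sum(dist_a), sum(dist_b)
    if total_a == 0 or total_b == 0:
        return 0.0
    return sum(min(a / total_a, b / total_b) for a, b in zip(dist_a, dist_b))
```

A singer whose stable pitches cluster on the scale tones produces a distribution peaked at the centre bin, while a singer who is consistently sharp or flat produces a peak shifted to one side, so comparing the two folded distributions gives a rough indication of how similar the singers' pitch tendencies are.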

The singing evaluation device according to each of the above aspects can be realized by dedicated hardware (electronic circuits) or by cooperation between a program and a general-purpose arithmetic processing device such as a CPU (Central Processing Unit). The program of the present invention can be provided in a form stored on a computer-readable recording medium and installed on a computer. The recording medium is, for example, a non-transitory recording medium. An optical recording medium (optical disc) such as a CD-ROM is a good example, but any known type of recording medium, such as a semiconductor recording medium or a magnetic recording medium, may be used. The program of the present invention can also be provided by distribution over a communication network and installed on a computer. The present invention is also specified as a method of operating the singing evaluation device according to each of the aspects described above (a singing evaluation method).

Schematic diagram of the singing evaluation system 1 according to the first embodiment.
Graph of the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2.
Explanatory diagram of the setting of the analysis section TA and the first pitch stable section TSA.
Flowchart of the operation of the singing evaluation unit 22 (pitch analysis unit 222, section setting unit 224).
Flowchart of the operation of the singing evaluation unit 22 (evaluation unit 228).
Explanatory diagram showing an example of the distribution of evaluation values with respect to the difference value C.
Schematic diagram of the singing evaluation system 1 according to the second embodiment.
Graph of the pitch PA of the first singing signal V1 analyzed by the pitch analysis unit 222 and the pitch PB of the second singing signal V2 received by the communication device 15.
Graph of the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2 after the section setting unit 224 has matched the first analysis points KA with the second analysis points KB.
Schematic diagram of the singing evaluation system 1 according to the fourth embodiment.
Explanatory diagram of the processing of the pitch distribution generation unit 225 of the fourth embodiment.
Explanatory diagram of the processing of the analysis processing unit 227 of the fourth embodiment.
Flowchart of the operation of the singing evaluation unit 22 of the fourth embodiment.
Schematic diagram of the singing evaluation system 1 according to the fifth embodiment.

<First Embodiment>
FIG. 1 is a schematic diagram of a singing evaluation system 1 according to the first embodiment. As illustrated in FIG. 1, the singing evaluation system 1 includes a terminal device D1 used by a user U1 and a terminal device D2 used by a user U2. The first embodiment illustrates a configuration in which the terminal device D1 is used as the singing evaluation device. The user U1 sings toward the terminal device D1. The user U2 sings toward the terminal device D2.

The terminal device D2 is a communication terminal such as a mobile phone or a smartphone, and includes a sound collection device 34 and a communication device 35. The sound collection device 34 collects the singing voice of the user U2 singing a piece of music and generates a second singing signal V2. The communication device 35 is a communication device for communicating with the terminal device D1, and transmits the second singing signal V2 generated by the sound collection device 34 to the terminal device D1.

The terminal device D1 of the first embodiment is a device that evaluates the degree of harmony between the singing voice of the user U1 singing a piece of music (first singing voice) and the singing voice of the user U2 of the terminal device D2 singing the same piece of music (second singing voice). It is realized by a computer system including an arithmetic processing device 10, a storage device 12, an input device 13, a sound collection device 14, a communication device 15, a display device 16, and a sound emitting device 18. For example, an information processing device such as a portable communication terminal carried by the user (a mobile phone or smartphone) or a personal computer is used as the terminal device D1. The sound collection device 14 is a device (microphone) that collects ambient sound. The sound collection device 14 of the first embodiment collects the singing voice of the user U1 of the terminal device D1 singing the piece of music and generates a first singing signal V1. The input device 13 is a device (for example, a touch panel) that accepts instructions from the user.

The storage device 12 stores a program PGM executed by the arithmetic processing device 10 and various data used by the arithmetic processing device 10. A known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of several types of recording media, may be employed as the storage device 12. Specifically, the storage device 12 stores music data L. The music data L includes accompaniment data B, which specifies the accompaniment sound of the music in time series, and lyrics data Q, which represents the lyrics. The accompaniment data B is, for example, a music file in a format such as MP3. When the user selects a desired piece of music, the accompaniment data B of that piece is played back. The display device 16 (for example, a liquid crystal display panel) displays images as instructed by the arithmetic processing device 10. For example, the lyrics data Q of the piece of music selected by the user and the evaluation result for the singing voice (evaluation value S) are displayed on the display device 16. The communication device 15 is a communication device for communicating with the terminal device D2. Specifically, the communication device 15 wirelessly receives the second singing signal V2 transmitted by the communication device 35 of the terminal device D2.
The communication between the communication device 15 and the terminal device D2 may take place over a communication network 400 such as a mobile communication network or the Internet, or may be short-range wireless communication such as wireless LAN or the Bluetooth (registered trademark) standard.

The arithmetic processing device 10 (CPU) in FIG. 1 controls the elements of the terminal device D1 as a whole by executing the program PGM stored in the storage device 12. Specifically, the arithmetic processing device 10 realizes a plurality of functions (a singing evaluation unit 22, a reproduction processing unit 26, and a display processing unit 28) for evaluating the singing represented by the first singing signal V1 generated by the sound collection device 14 of the terminal device D1 and the second singing signal V2 generated by the sound collection device 34 of the terminal device D2. A configuration in which the functions of the arithmetic processing device 10 are distributed across a plurality of devices, or a configuration in which a dedicated electronic circuit (for example, a DSP) realizes some of the functions of the arithmetic processing device 10, may also be employed.

The reproduction processing unit 26 mixes the first singing signal V1 generated by the sound collection device 14 with the accompaniment data B read from the storage device 12, converts the result into an analog signal, and supplies it to the sound emitting device 18. The sound emitting device 18 (for example, a speaker or headphones) emits sound corresponding to the signal supplied from the reproduction processing unit 26. That is, a mix of the accompaniment sound of the music represented by the accompaniment data B and the singing voice of the user U1 is emitted from the sound emitting device 18. The display processing unit 28 causes the display device 16 to display various images. Specifically, the display processing unit 28 causes the display device 16 to display the lyrics data Q included in the piece of music selected for singing and the evaluation value S indicating the result of the singing evaluation generated by the singing evaluation unit 22.

The singing evaluation unit 22 is a means for evaluating the degree of harmony between the pitch of the singing voice of the user U1 (singing pitch) and the singing pitch of the user U2 by analyzing the first singing signal V1 and the second singing signal V2. It includes a pitch analysis unit 222, a section setting unit 224, a representative value calculation unit 226, and an evaluation unit 228.

The pitch analysis unit 222 of the first embodiment sequentially analyzes, at a predetermined period, the pitch PA of the first singing signal V1 generated by the sound collection device 14. The pitch analysis unit 222 of the first embodiment also sequentially analyzes, at the predetermined period, the pitch PB of the second singing signal V2 received by the communication device 15.

FIG. 2 is a graph of the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2 analyzed by the pitch analysis unit 222. The pitch analysis unit 222 sequentially identifies the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2 at a period (for example, every 10 ms) that is sufficiently short compared with the duration of the notes that can be expected in the music. Any known pitch detection technique may be employed to identify the pitch of the singing voice.

In parallel with the progress of the singing, the section setting unit 224 in FIG. 1 sets a plurality of first pitch stable sections TSA in which the pitch PA of the first singing signal V1 analyzed by the pitch analysis unit 222 is stable. Specifically, as illustrated in FIG. 2, the section setting unit 224 sets, as a first pitch stable section TSA (TSA1, TSA2, TSA3, ...), each section of the first singing signal V1 in which the variation of the pitch PA (that is, the difference between the highest and lowest values of the pitch PA) stays within a predetermined range. Similarly, in parallel with the progress of the singing, the section setting unit 224 sets a plurality of second pitch stable sections TSB in which the pitch PB of the second singing signal V2 analyzed by the pitch analysis unit 222 is stable. Specifically, using the same method as for the first pitch stable sections TSA, the section setting unit 224 sets, as a second pitch stable section TSB (TSB1, TSB2, TSB3, ...), each section of the second singing signal V2 in which the variation of the pitch PB (that is, the difference between the highest and lowest values of the pitch PB) stays within a predetermined range.

The section setting unit 224 also identifies overlapping sections RTS in which the first pitch stable sections TSA and the second pitch stable sections TSB overlap on the time axis. As illustrated in FIG. 2, an overlapping section RTS is a range in which one of the first pitch stable sections TSA (TSA1, TSA2, TSA3, ...) and one of the second pitch stable sections TSB (TSB1, TSB2, TSB3, ...) overlap each other on the time axis.
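The identification of overlapping sections can be sketched as an intersection of two sorted interval lists. This is an illustrative sketch only: the function name and the representation of each section as a (start, end) pair in seconds are assumptions, and each list is assumed to be sorted and free of internal overlaps.

```python
def overlapping_sections(sections_a, sections_b):
    """Return the time ranges where a section from sections_a and a section
    from sections_b overlap.

    Each argument is a list of (start, end) tuples sorted by start time.
    """
    overlaps = []
    i = j = 0
    while i < len(sections_a) and j < len(sections_b):
        start = max(sections_a[i][0], sections_b[j][0])
        end = min(sections_a[i][1], sections_b[j][1])
        if start < end:
            overlaps.append((start, end))
        # Advance whichever section finishes first; the other one may still
        # overlap the next section of the opposite list.
        if sections_a[i][1] < sections_b[j][1]:
            i += 1
        else:
            j += 1
    return overlaps
```

For example, with stable sections (0.0, 1.0) and (2.0, 3.0) for one singer and (0.5, 2.5) for the other, the overlapping sections are (0.5, 1.0) and (2.0, 2.5).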

The representative value calculation unit 226 in FIG. 1 calculates, for the overlapping sections RTS set by the section setting unit 224 (that is, the sections in which the first pitch stable sections TSA and the second pitch stable sections TSB overlap on the time axis), a representative value RPA of the pitch PA of the first singing signal V1 (the representative pitch value of the first singing voice) and a representative value RPB of the pitch PB of the second singing signal V2 (the representative pitch value of the second singing voice). As illustrated in FIG. 2, the representative value calculation unit 226 calculates the representative value RPA (RPA1, RPA2, RPA3, ...) of the pitch PA and the representative value RPB (RPB1, RPB2, RPB3, ...) of the pitch PB for each overlapping section RTS. Specifically, the average of the plurality of pitches PA (PB) identified in each first pitch stable section TSA (second pitch stable section TSB) is calculated as the representative value RPA (RPB).
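As an illustration of the representative value calculation, the following sketch averages the pitch samples that fall inside one section. The function name, the parallel lists of analysis times and pitches, and the choice to average only the samples inside the given section are assumptions; the text states only that an average of the identified pitches is used as the representative value.

```python
from statistics import mean


def representative_pitch(times, pitches, section):
    """Return the mean pitch of the samples whose analysis time lies inside
    section, or None if the section contains no samples.

    times and pitches are parallel lists (one entry per analysis point);
    section is a (start, end) tuple with an exclusive end.
    """
    start, end = section
    inside = [p for t, p in zip(times, pitches) if start <= t < end]
    return mean(inside) if inside else None
```

Applying the function to both singers' pitch series with the same section yields one pair of representative values per section, which can then be checked against the predetermined pitch relationship.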

The evaluation unit 228 in FIG. 1 evaluates the degree of harmony between the first singing voice and the second singing voice according to whether the representative value RPA of the pitch PA of the first singing signal V1 and the representative value RPB of the pitch PB of the second singing signal V2, calculated by the representative value calculation unit 226 for each overlapping section RTS, are in a predetermined pitch relationship. The evaluation unit 228 of the present embodiment focuses on whether the first singing voice and the second singing voice are in a pitch relationship that sounds harmonious to the listener, and outputs an evaluation value S that rates the degree of harmony between the first singing voice and the second singing voice.

FIG. 4 is a flowchart of the operation of the pitch analysis unit 222 and the section setting unit 224. For example, the processing of FIG. 4 starts when playback of the music data L starts. FIG. 4 illustrates the processing performed on the first singing signal V1 by the pitch analysis unit 222 and the section setting unit 224. The processing for the second singing signal V2 is the same as that for the first singing signal V1, so a detailed description is omitted.

When the pitch analysis unit 222 identifies the pitch PA at one point on the time axis of the first singing signal V1 (hereinafter referred to as an "analysis point") (SA1), the section setting unit 224 sets, as illustrated in FIG. 3, an analysis section TA of predetermined length whose end point is the analysis point KA at which the pitch PA was identified (SA2). The analysis section TA is the time interval subject to analysis, as defined by a time window function, and is set to a length (for example, 200 ms) sufficiently longer than the period at which pitch analysis is performed (10 ms). The analysis section TA therefore contains a plurality of pitches PA, including the pitch PA newly identified at the analysis point KA and the pitches PA identified before the analysis point KA.

The section setting unit 224 identifies the maximum value PA-MAX and the minimum value PA-MIN of the plurality of pitches PA within the analysis section TA, and determines whether the difference value R (absolute value) between PA-MAX and PA-MIN is below a predetermined threshold PATH (SA3). The smaller the difference value R (that is, the narrower the distribution width of the pitch PA within the analysis section TA), the more stable the pitch PA of the first singing signal V1 can be judged to be. The threshold PATH may be set to, for example, 50 cents in twelve-tone equal temperament.
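The stability test of step SA3 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes pitches are expressed in cents, uses the example values from the text (10 ms period, 200 ms section, 50 cent threshold), and the names `make_window` and `is_stable` are hypothetical. Treating a partially filled window as not stable is also an assumption.

```python
from collections import deque

FRAME_MS = 10       # pitch-analysis period given in the text
WINDOW_MS = 200     # example length of the analysis section TA
PATH_CENTS = 50.0   # example value of the threshold PATH


def make_window():
    """Fixed-length buffer holding the pitches PA inside one analysis section TA."""
    return deque(maxlen=WINDOW_MS // FRAME_MS)


def is_stable(window, threshold=PATH_CENTS):
    """Return True when R = PA-MAX - PA-MIN over the section is below the threshold."""
    if len(window) < window.maxlen:
        return False  # assumption: an incomplete section is never judged stable
    return max(window) - min(window) < threshold
```

Appending each newly analyzed pitch to the buffer and calling `is_stable` reproduces the per-analysis-point decision; the oldest pitch drops out automatically as the section slides forward.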

When the difference value R is below the threshold PATH (SA3: YES), the section setting unit 224 includes the analysis section TA in the first pitch stable section TSA (SA4). In the analysis section TAn of FIG. 3, the difference value R between the maximum value PA-MAX and the minimum value PA-MIN of the pitch PA is below the threshold PATH, so the analysis section TAn containing the analysis point KAn is included in the first pitch stable section TSA. The section setting unit 224 then stores the pitch PA at the analysis point KA in the storage device 12 (SA5).

Until the music ends (SA9: NO), each time the pitch analysis unit 222 specifies a pitch PA (SA1), the section setting unit 224 sets an analysis section TA whose end point is the analysis point KA of that pitch PA (SA2) and determines whether the difference value R of the pitch PA in that analysis section TA is below the threshold PATH (SA3). That is, as understood from FIG. 3, the analysis section TA is moved sequentially along the time axis at each period (10 ms) at which the pitch analysis unit 222 specifies the pitch PA, and at each position it is determined whether the analysis section TA is included in the first pitch stable section TSA. Accordingly, the first pitch stable section TSA extends progressively along the time axis each time the pitch analysis unit 222 specifies a pitch PA that keeps the distribution width below the threshold PATH.

On the other hand, when the difference value R of the pitch PA in the analysis section TA is equal to or greater than the threshold PATH (SA3: NO), the section setting unit 224 does not include the analysis point KA in the first pitch stable section TSA (SA6). The section setting unit 224 then determines whether the analysis section TA immediately preceding the current analysis section TA lies within the first pitch stable section TSA (SA7). If the determination is affirmative, the section setting unit 224 fixes the end point (analysis point KA) of the immediately preceding analysis section TA as the end point of one first pitch stable section TSA (SA8). In other words, the section setting unit 224 sets first pitch stable sections TSA sequentially in parallel with the progress of the singing. With this configuration, only the first pitch stable sections TSA, in which the pitch PA fluctuates little, can be made the target of singing evaluation out of the entire piece of music. As the first pitch stable sections TSA and the second pitch stable sections TSB are set sequentially by the above procedure, the pitches PA included in the first pitch stable sections TSA and the pitches PB included in the second pitch stable sections TSB accumulate in the storage device 12. Next, as illustrated in FIG. 2, the section setting unit 224 identifies overlapping sections RTS in which a first pitch stable section TSA and a second pitch stable section TSB overlap on the time axis.
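The loop of steps SA1 to SA9 and the subsequent identification of overlapping sections can be sketched offline as below. This is only an illustrative reading of the flowchart: the function names are hypothetical, pitches are assumed to be a list of cent values sampled once per analysis period with no unvoiced frames, and sections are returned as inclusive frame-index pairs.

```python
def stable_sections(pitches, win=20, threshold=50.0):
    """Return (start, end) frame indices of pitch stable sections.

    win is the analysis-section length in frames (200 ms / 10 ms in the text).
    """
    sections = []
    start = None
    for k in range(win - 1, len(pitches)):          # k is the analysis point (SA1, SA2)
        window = pitches[k - win + 1 : k + 1]
        if max(window) - min(window) < threshold:   # SA3: YES
            if start is None:
                start = k - win + 1                 # whole section joins the stable section (SA4)
        elif start is not None:                     # SA7: previous section was stable
            sections.append((start, k - 1))         # SA8: previous analysis point ends it
            start = None
    if start is not None:                           # music ended inside a stable section
        sections.append((start, len(pitches) - 1))
    return sections


def overlapping_sections(a, b):
    """Intersect two sorted lists of inclusive (start, end) sections."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out
```

Running `stable_sections` on each singer's pitch track and passing both results to `overlapping_sections` yields candidate overlapping sections RTS, over which representative values could then be computed.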

Next, the singing evaluation by the evaluation unit 228 is described. FIG. 5 is a flowchart of the process (evaluation process) by which the evaluation unit 228 evaluates the degree of harmony between the pitch of the first singing voice and the pitch of the second singing voice. The evaluation process of FIG. 5 is executed each time the section setting unit 224 sets an overlapping section RTSn (n = 1, 2, 3, 4, ...) in the music and the representative value calculation unit 226 calculates the representative value RPA of the pitch PA and the representative value RPB of the pitch PB within that overlapping section RTS.

The evaluation unit 228 calculates the difference value C between the representative value RPA of the pitch PA and the representative value RPB of the pitch PB in the overlapping section RTS (SB1). Specifically, the difference value C is calculated for the overlapping section RTS newly set by the section setting unit 224. When the difference between the representative value RPA and the representative value RPB exceeds 1200 cents (one octave), an integer multiple of 1200 cents is subtracted from the difference so that the resulting difference value C is 1200 cents or less.
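Step SB1 can be illustrated with a short sketch. The function name `difference_value` is hypothetical, the representative values are assumed to be in cents, and taking the absolute difference is an assumption (the text does not state the sign convention for C).

```python
def difference_value(rpa, rpb):
    """Difference value C in cents, reduced to at most one octave (1200 cents)."""
    c = abs(rpa - rpb)      # assumption: C is taken as a non-negative difference
    while c > 1200.0:       # subtract whole octaves only while C exceeds 1200
        c -= 1200.0
    return c
```

Because the reduction applies only when the difference exceeds 1200 cents, an exact octave stays at 1200 rather than collapsing to 0, which matches the later treatment of 1200 cents and 0 cents as separate consonant cases.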

Assume that the music is composed of the twelve scale tones of twelve-tone equal temperament (the discrete pitches that make up the scale of the temperament). When the voice of user U1 singing the music (first singing voice) and the voice of user U2 singing the same music (second singing voice) are in harmony, the difference between the pitch of the first singing voice and the pitch of the second singing voice is expected to approximate or match an integer multiple of the interval between two adjacent scale tones (100 cents, corresponding to a semitone). Therefore, when the difference value C between the representative value RPA and the representative value RPB of one overlapping section RTS corresponds to an integer multiple of 100 cents, the singing pitch of user U1 and the singing pitch of user U2 can be judged to be in harmony in that overlapping section RTS. The evaluation unit 228 of the first embodiment therefore increases the evaluation value S when the difference value C between the representative value RPA and the representative value RPB in the overlapping section RTS corresponds to an integer multiple of 100 cents, on the ground that the pitch of the first singing voice and the pitch of the second singing voice are in harmony. Conversely, when the difference value C deviates from an integer multiple of 100 cents, the two pitches are presumed not to be in harmony, and the evaluation unit 228 decreases the evaluation value S.

For example, when the difference value C between the representative value RPA and the representative value RPB is 300 cents (minor third) or 400 cents (major third), both of which are integer multiples of 100 cents, the pitch of the first singing voice and the pitch of the second singing voice harmonize and give an audibly pleasant impression. Likewise, when the difference value is 700 cents (perfect fifth), 1200 cents (one octave, that is, a perfect octave), or 0 cents (perfect unison), that is, when the two pitches are in a consonant relationship, they harmonize and give a resonant impression. In view of this, the evaluation value S may be made higher when the difference value C between the pitch of the first singing voice and the pitch of the second singing voice is 300 cents or 400 cents, or when the two pitches are in a consonant relationship (that is, the difference value C is 700 cents, 1200 cents, or 0 cents).

The evaluation unit 228 determines whether the difference value C between the representative value RPA and the representative value RPB in the overlapping section RTS approximates (or matches) an integer multiple of 100 cents (SB2). Specifically, it is determined whether the difference value C lies within a predetermined range (for example, ±10%) containing an integer multiple of 100 cents. When the difference value C approximates an integer multiple of 100 cents (SB2: YES), the evaluation unit 228 sets, as the evaluation value (score) Sn acquired up to the overlapping section RTSn, the sum of the evaluation value Sn-1 acquired from the start of the music up to the overlapping section RTSn-1 and a predetermined value Δ (a positive integer) (Sn = Sn-1 + Δ) (SB3).

On the other hand, when the difference value C deviates from an integer multiple of 100 cents (SB2: NO), the evaluation unit 228 sets, as the evaluation value Sn acquired up to the n-th overlapping section RTSn, the value obtained by subtracting the predetermined value Δ (a positive integer) from the evaluation value Sn-1 acquired from the start of the music up to the overlapping section RTSn-1 (Sn = Sn-1 − Δ) (SB4). The evaluation unit 228 outputs the evaluation value Sn set in SB3 or SB4 to the display processing unit 28 of the terminal device D1. The display processing unit 28 causes the display device 16 to display the evaluation value Sn of the singing (SB5). The evaluation value Sn thus changes from moment to moment in parallel with the reproduction of the music (the progress of the singing). The evaluation value Sn calculated by the evaluation unit 228 is also transmitted to the terminal device D2 by the communication device 15, and the terminal device D2 displays the evaluation value Sn in the same manner as the terminal device D1.
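Steps SB2 to SB4 amount to a running score that rises or falls by Δ per overlapping section. The sketch below is illustrative only: the function names are hypothetical, Δ = 1 is an arbitrary choice, and the "±10%" range in the text is interpreted here as ±10 cents around each multiple of 100 cents, which is an assumption.

```python
def is_near_semitone_multiple(c, tolerance=10.0):
    """SB2: is C within the tolerance (in cents) of an integer multiple of 100 cents?"""
    deviation = abs(c - 100.0 * round(c / 100.0))
    return deviation <= tolerance


def update_score(prev_score, c, delta=1):
    """SB3 / SB4: Sn = Sn-1 + delta when C is near a multiple of 100 cents, else Sn-1 - delta."""
    if is_near_semitone_multiple(c):
        return prev_score + delta
    return prev_score - delta
```

Calling `update_score` once per overlapping section RTSn, feeding in the previous result, gives the moment-to-moment evaluation value that the display step SB5 would show.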

The method of calculating the evaluation value S from the difference value C may be changed as appropriate. For example, the evaluation value S may be set so as to have the relationship with the difference value C shown in FIG. 6. As illustrated in FIG. 6, a distribution of the evaluation value S is set in advance for each of a plurality of ranges of the difference value C, and the evaluation value S corresponding to the difference value C between the representative value RPA and the representative value RPB is selected from among them. Considering that listeners tend to perceive a high degree of harmony for differences of a minor third (3 semitones) or a major third (4 semitones), a high evaluation value S is set for the difference value C corresponding to a minor third (C = 300) and for the difference value C corresponding to a major third (C = 400).

As described above, in the configuration of the first embodiment, the singing is evaluated according to whether the difference value C between the representative value RPA and the representative value RPB in the overlapping section RTS, in which the first pitch stable section TSA and the second pitch stable section TSB overlap on the time axis, is in a predetermined pitch relationship (specifically, an integer multiple of the interval between adjacent scale tones under a specific temperament such as twelve-tone equal temperament). The singing can therefore be evaluated appropriately even when no score data exists for each of the plurality of singing parts.

In the first embodiment, the degree of harmony between the pitch of the first singing voice and the pitch of the second singing voice is evaluated according to the representative value RPA and the representative value RPB in each overlapping section RTS, in which a first pitch stable section TSA (where the pitch PA of the first singing signal V1 is stable) overlaps a second pitch stable section TSB (where the pitch PB of the second singing signal V2 is stable). Consider a configuration (hereinafter the "comparative example") that evaluates the singing over the entire piece of music, including sections in which the pitch transitions continuously between notes and sections in which the pitch fluctuates as a singing expression (for example, vibrato sections). Because the singing evaluated in the comparative example includes sections in which the pitch is unstable, the evaluation is not necessarily appropriate. In the first embodiment, by contrast, the degree of harmony is evaluated according to the representative values RPA and RPB calculated in overlapping sections RTS where stable sections overlap, so a more appropriate evaluation based on pitches in stable sections can be achieved than with the comparative example.

<Second Embodiment>
In the first embodiment, the second singing signal V2 is transmitted from the terminal device D2 to the terminal device D1, and the terminal device D1 analyzes the pitch PB of the second singing signal V2. In the second embodiment, the pitch PB of the second singing signal V2 is analyzed in the terminal device D2 and transmitted sequentially from the terminal device D2 to the terminal device D1.

FIG. 7 is a schematic diagram of the singing evaluation system 1 of the second embodiment. In the second embodiment, a pitch analysis unit 36 is added to the terminal device D2 used by user U2. The pitch analysis unit 36 sequentially analyzes the pitch PB of the second singing signal V2 generated by the sound collecting device 34. The method by which the pitch analysis unit 36 analyzes the pitch PB is the same as that used by the pitch analysis unit 222 of the first embodiment. As illustrated in FIG. 7, the communication device 35 of the second embodiment transmits the pitch PB of the second singing signal V2 analyzed by the pitch analysis unit 36 to the terminal device D1. The communication device 15 of the terminal device D1 sequentially receives the pitch PB transmitted from the terminal device D2.

The pitch analysis unit 222 of the terminal device D1 sequentially analyzes the pitch PA of the first singing signal V1 and notifies the section setting unit 224 of the analyzed pitch PA. The section setting unit 224 sets a plurality of first pitch stable sections TSA in which the pitch PA analyzed by the pitch analysis unit 222 is stable, and a plurality of second pitch stable sections TSB in which the pitch PB received by the communication device 15 is stable. The remainder is the same as in the first embodiment, so its description is omitted.

As understood from the above description, the configuration of the second embodiment achieves the same effects as the first embodiment. In addition, because the pitch analysis unit 222 does not need to analyze the pitch PB of the second singing signal V2, the configuration of the second embodiment has the advantage of reducing the computational load on the terminal device D1.

<Third Embodiment>
In the second embodiment, the section setting unit 224 acquires the pitch PA from the pitch analysis unit 222 in the terminal device D1 and acquires the pitch PB as received by the communication device 15. Because the pitch PB is transmitted from the terminal device D2 to the terminal device D1, even when users U1 and U2 sing the same music in synchronization, the time at which the section setting unit 224 acquires the pitch PB for a particular time point in the music can lag the time at which it acquires the pitch PA for that time point by, for example, the communication delay between the terminal device D1 and the terminal device D2. Specifically, as illustrated in FIG. 8, a delay (Δt) due to communication can arise between the time at which the section setting unit 224 acquires the pitch PA for each time point in the music (for example, the position on the time axis of the analysis point KAn [n = 1, 2, 3, ...]) and the time at which it acquires the pitch PB for the same time point (for example, the position on the time axis of the analysis point KBn [n = 1, 2, 3, ...]). The third embodiment therefore executes processing that compensates for this delay.

The configuration of the singing evaluation system 1 of the third embodiment is the same as that of the second embodiment. When user U1 selects a piece of music to sing, an instruction to reproduce that music is transmitted from the terminal device D1 to the terminal device D2. Because the start time of the music coincides between the terminal device D1 and the terminal device D2, users U1 and U2 sing the same music in synchronization. That is, the time points at which the terminal device D1 analyzes the pitch (analysis points KA) coincide with the time points at which the terminal device D2 analyzes the pitch (analysis points KB). In the third embodiment, each time the pitch analysis unit 222 of the terminal device D1 analyzes a pitch PA of the first singing signal V1, it notifies the section setting unit 224 of information indicating the position of the analysis point KA on the time axis (hereinafter "first time information"). Likewise, each time the pitch analysis unit 36 of the terminal device D2 analyzes a pitch PB of the second singing signal V2, it transmits the analyzed pitch PB and information indicating the position of the analysis point KB on the time axis (hereinafter "second time information") to the terminal device D1 via the communication device 35. The communication device 15 of the terminal device D1 receives the pitch PB and the second time information transmitted by the communication device 35. Examples of the time information include the elapsed time from the start of the music and the clock time.

FIG. 9 is an explanatory diagram of the processing by the section setting unit 224 of the third embodiment. The section setting unit 224 of the third embodiment associates the first time information notified by the pitch analysis unit 222 with the second time information notified by the pitch analysis unit 36 via communication between the communication device 35 and the communication device 15. In this way, among the plurality of analysis points KA indicated by the first time information (first analysis points KA) and the plurality of analysis points KB indicated by the second time information (second analysis points KB), those that correspond to each other (that is, to the same time point in the music) are made to coincide on the time axis.

After making the mutually corresponding analysis points KA and KB coincide on the time axis, the section setting unit 224 sets the first pitch stable sections TSA and the second pitch stable sections TSB by the same procedure as in the first embodiment, and sets the overlapping sections RTS in which a first pitch stable section TSA and a second pitch stable section TSB overlap on the time axis.
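One simple way to picture this alignment is to key both pitch streams by their time information and pair the entries that share a time stamp. The sketch below is an illustration under assumptions, not the patented method: the function name `align_by_time` is hypothetical, and time information is assumed to be the elapsed time in milliseconds from the start of the music, identical on both terminals for the same analysis point.

```python
def align_by_time(first, second):
    """Pair pitches PA and PB that belong to the same time point in the music.

    first:  iterable of (time_ms, pitch_pa) from the local pitch analysis
    second: iterable of (time_ms, pitch_pb) received over the network,
            possibly late or out of order
    Returns a list of (time_ms, pitch_pa, pitch_pb) in the order of `first`.
    """
    pb_by_time = dict(second)   # second time information -> pitch PB
    return [(t, pa, pb_by_time[t]) for t, pa in first if t in pb_by_time]
```

Because pairing depends only on the time stamps, the arrival delay Δt of the pitch PB does not shift the paired sequences relative to each other; analysis points whose counterpart has not yet arrived are simply left out until it does.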

As described above, the configuration of the third embodiment also achieves the same effects as the preceding embodiments. In the third embodiment, the first analysis points KA at which the pitch PA of the first singing signal V1 is analyzed and the second analysis points KB at which the pitch PB of the second singing signal V2 is analyzed are made to coincide on the time axis. Therefore, even when a delay occurs in communication between the terminal device D1 and the terminal device D2, the degree of harmony between the first singing voice and the second singing voice can be evaluated appropriately.

<Fourth Embodiment>
In each of the preceding embodiments, the degree of harmony between the pitch of the first singing voice and the pitch of the second singing voice is evaluated according to whether the first singing voice and the second singing voice are in a predetermined pitch relationship. In the fourth embodiment, the degree of similarity in singing pitch between the first singing voice and the second singing voice is evaluated.

FIG. 10 is a schematic diagram of the singing evaluation system 1 of the fourth embodiment. As understood from FIG. 10, the terminal device D1 of the fourth embodiment adds a pitch distribution generation unit 225 and an analysis processing unit 227 to the configuration of the terminal device D1 of the first embodiment. The function and operation of the pitch analysis unit 222 are the same as in the first embodiment, so a detailed description is omitted.

The section setting unit 224 of the fourth embodiment sets a plurality of first pitch stable sections TSA in which the pitch PA of the first singing signal V1 is stable and a plurality of second pitch stable sections TSB in which the pitch PB of the second singing signal V2 is stable. The section setting unit 224 of the fourth embodiment does not set overlapping sections RTS in which a first pitch stable section TSA and a second pitch stable section TSB overlap. When the setting of the first pitch stable sections TSA and the second pitch stable sections TSB has been completed over the entire piece of music, the representative value calculation unit 226 calculates, by the same method as in the first embodiment, the representative value RPA of the pitch PA in each of the plurality of first pitch stable sections TSA and the representative value RPB of the pitch PB in each of the plurality of second pitch stable sections TSB. The representative value calculation unit 226 sequentially stores the calculated representative values RPA and RPB in the storage device 12. The storage device 12 therefore holds as many representative values RPA as there are first pitch stable sections TSA in the music and as many representative values RPB as there are second pitch stable sections TSB. The pitch distribution generation unit 225 generates a first frequency distribution indicating the frequency distribution of the representative values RPA over the plurality of first pitch stable sections TSA, and a second frequency distribution indicating the frequency distribution of the representative values RPB over the plurality of second pitch stable sections TSB.

FIG. 11 is an explanatory diagram of the processing by the pitch distribution generation unit 225. The pitch distribution generation unit 225 reads the representative values RPA of the plurality of first pitch stable sections TSA from the storage device 12 and creates a first frequency distribution (histogram) HA of the representative values RPA. The frequency distribution HA is the distribution of the frequencies of the representative values RPA in each of a plurality of classes obtained by dividing the numerical range of the representative value RPA sufficiently finely compared with the pitch difference between adjacent scale tones (100 cents). In the same manner as for the first frequency distribution HA, the pitch distribution generation unit 225 reads the representative values RPB of the plurality of second pitch stable sections TSB from the storage device 12 and creates a second frequency distribution HB indicating the frequency distribution of the representative values RPB.

The analysis processing unit 227 of FIG. 10 sums, across the first frequency distribution HA generated by the pitch distribution generation unit 225, the distributions in a plurality of predetermined ranges each containing the pitch of a scale tone. Specifically, the analysis processing unit 227 divides the first frequency distribution HA into a plurality of unit ranges Tu (Tu1, Tu2, Tu3, Tu4, Tu5, Tu6), one per scale tone, sums the distributions of the unit ranges Tu to create a single first evaluation distribution QA, and calculates a representative value of the first evaluation distribution QA. In the fourth embodiment, the average value A1 of the first evaluation distribution QA is calculated as its representative value. The unit range Tu corresponding to any one scale tone is the range of ±50 cents centered on the pitch of that scale tone. In the same manner as for the first evaluation distribution QA, the analysis processing unit 227 divides the second frequency distribution HB into a plurality of unit ranges Tu, one per scale tone, sums the distributions of the unit ranges Tu to create a single second evaluation distribution QB, and calculates a representative value of the second evaluation distribution QB (the average value A2 of the second evaluation distribution QB).
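Folding every unit range Tu onto a single ±50 cent axis can be sketched as follows. This is a hedged illustration rather than the patented implementation: the function name `evaluation_distribution`, the 10 cent class width, and the normalization to a total of 1 are assumptions, and scale-tone pitches are assumed to fall on integer multiples of 100 cents (as in the 1200 to 1700 cent example of FIG. 11).

```python
def evaluation_distribution(rep_values, bin_width=10):
    """Fold representative values (cents) onto deviations from the nearest scale tone.

    Returns (dist, mean): dist is a normalized histogram over [-50, +50) cents,
    mean is its average deviation (the analogue of A1 or A2).
    """
    if not rep_values:
        raise ValueError("at least one representative value is required")
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for v in rep_values:
        alpha = (v + 50.0) % 100.0 - 50.0              # deviation from nearest multiple of 100
        index = min(int((alpha + 50.0) // bin_width), n_bins - 1)
        counts[index] += 1                             # same class summed across all unit ranges
    total = sum(counts)
    dist = [c / total for c in counts]                 # assumption: normalize so the total is 1
    centers = [-50.0 + bin_width * (i + 0.5) for i in range(n_bins)]
    mean = sum(p * x for p, x in zip(dist, centers))
    return dist, mean
```

Applying the function separately to the representative values RPA and RPB gives one folded distribution and one average per singer, in the spirit of QA with A1 and QB with A2.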

FIG. 12 is an explanatory diagram of the first evaluation distribution QA created by the analysis processing unit 227. The analysis processing unit 227 maps the distribution of each unit range Tu onto a range of a numerical value α spanning ±50 cents (−50 ≤ α ≤ +50), in which the pitch of the scale tone (1200 cents, 1300 cents, 1400 cents, 1500 cents, 1600 cents, and 1700 cents in FIG. 11) is taken as 0, and then sums the frequencies of the distributions over the plurality of unit ranges Tu for each value of α. The analysis processing unit 227 then generates the first evaluation distribution QA by normalizing the summed frequencies so that the integrated value obtained by further summing them within the ±50 cent range (the total area under the distribution) equals a predetermined value (for example, 1). In other words, the analysis processing unit 227 divides the frequency distribution H (HA, HB) into a plurality of unit ranges Tu each centered on the pitch of a scale tone, superimposes the unit ranges Tu on one another, and sums the frequencies of the distributions of the unit ranges Tu for each value of α (pitch) over the plurality of unit ranges Tu (and further normalizes the summed frequencies), thereby generating the first evaluation distribution QA. The function G(α) in FIG. 12 represents the normalized first evaluation distribution QA with the numerical value α as its variable. The analysis processing unit 227 calculates the average value A1 of the first evaluation distribution QA. The analysis processing unit 227 also generates the second evaluation distribution QB by the same method as for the first evaluation distribution QA and calculates its average value A2.
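The folding and normalization just described might look like the sketch below. It assumes a histogram keyed by the lower edge of each class in cents, a class width of 5 cents, scale tones at multiples of 100 cents, and a half-open range −50 ≤ α < +50 so that no class is counted twice; all names and sample values are illustrative, not taken from the source.

```python
def evaluation_distribution(histogram, bin_width=5):
    """Fold a cent histogram onto deviations alpha and normalize to total 1."""
    folded = {}
    for lower_edge, count in histogram.items():
        center = lower_edge + bin_width / 2  # midpoint of the class in cents
        # Deviation from the nearest scale tone (assumed at multiples of 100 cents).
        alpha = (center + 50) % 100 - 50
        folded[alpha] = folded.get(alpha, 0) + count
    total = sum(folded.values())
    if total == 0:
        return {}
    # Normalize so that the summed frequencies equal 1.
    return {alpha: count / total for alpha, count in sorted(folded.items())}


def distribution_mean(distribution):
    """Average deviation (in cents) of a normalized evaluation distribution."""
    return sum(alpha * weight for alpha, weight in distribution.items())


# Hypothetical first frequency distribution HA (class lower edge -> count).
ha = {1200: 2, 1295: 1, 1310: 1, 1400: 1}
qa = evaluation_distribution(ha)
a1 = distribution_mean(qa)
```

With this sample input the classes near 1200 and 1400 cents fall on the same α, which is the superimposition of unit ranges that the description calls for. A positive `a1` corresponds to a tendency to sing sharp.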

As understood from the above description, the first evaluation distribution QA expresses the tendency of the distribution of the representative values RPA of the pitch PA relative to each of the plurality of scale tones. Specifically, when the average value A1 of the first evaluation distribution QA is large, it can be estimated that the user U1 tends to sing at a pitch higher than the proper pitch of the scale tone. Similarly, the second evaluation distribution QB expresses the tendency of the distribution of the representative values RPB of the pitch PB relative to the pitch of each scale tone. Specifically, when the average value A2 of the second evaluation distribution QB is large, it can be estimated that the user U2 tends to sing at a pitch higher than the proper pitch of the scale tone.

The evaluation unit 228 evaluates the degree of similarity in singing pitch between the first singing voice and the second singing voice according to the average value A1 of the first evaluation distribution QA and the average value A2 of the second evaluation distribution QB calculated by the analysis processing unit 227. When the singing pitches of the first singing voice and the second singing voice are close to each other (for example, when both the user U1 and the user U2 tend to sing higher than the pitch of the scale tone), the average value A1 and the average value A2 are expected to be close to each other. On the other hand, when the singing pitches of the first singing voice and the second singing voice diverge (for example, when the user U1 tends to sing higher than the scale tone while the user U2 tends to sing lower than the scale tone), the average value A1 and the average value A2 are expected to diverge. In view of this tendency, the evaluation unit 228 of the fourth embodiment sets the evaluation value S to a larger value the closer the average value A1 of the first evaluation distribution QA is to the average value A2 of the second evaluation distribution QB, on the basis that the singing pitches of the first singing voice and the second singing voice are similar throughout the piece of music. Conversely, the evaluation unit 228 sets the evaluation value S to a smaller value the more the average value A1 and the average value A2 diverge, on the basis that the singing pitches of the first singing voice and the second singing voice diverge throughout the piece of music.
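The description fixes only the direction of the relationship (a smaller gap between A1 and A2 gives a larger S), not the formula. One possible mapping, shown purely as an assumption, is a linear scale from 0 to 100 over the largest gap that two averages in the ±50 cent range can have; the function name and both default parameters are hypothetical.

```python
def evaluation_value(a1, a2, max_score=100.0, max_gap=100.0):
    """Map the gap between two average deviations (cents) to a score.

    Assumed linear mapping: identical averages give max_score, and a gap of
    max_gap cents or more gives 0. The source does not specify this formula.
    """
    gap = min(abs(a1 - a2), max_gap)
    return max_score * (1.0 - gap / max_gap)


# Both singers slightly sharp: averages close, score high.
s_similar = evaluation_value(12.0, 10.0)
# One singer sharp, the other flat: averages far apart, score lower.
s_different = evaluation_value(30.0, -20.0)
```

Any monotonically decreasing function of |A1 − A2| would satisfy the description equally well.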

FIG. 13 is a flowchart of the operation of the singing evaluation unit 22 (the pitch distribution generation unit 225, the analysis processing unit 227, and the evaluation unit 228) of the fourth embodiment. The evaluation process of FIG. 13 is executed once the representative value calculation unit 226 has calculated, over the entire piece of music, the representative value RPA of the pitch PA in each first pitch stable section TSAn (n = 1, 2, 3, 4, ...) and the representative value RPB of the pitch PB in each second pitch stable section TSBn (n = 1, 2, 3, 4, ...).

The pitch distribution generation unit 225 creates the first frequency distribution HA of the representative values RPA over the plurality of first pitch stable sections TSA in the piece of music, and creates the second frequency distribution HB of the representative values RPB over the plurality of second pitch stable sections TSB in the piece of music (SD1). The analysis processing unit 227 creates the first evaluation distribution QA from the plurality of unit ranges Tu obtained by dividing the first frequency distribution HA for each scale tone, and creates the second evaluation distribution QB from the plurality of unit ranges Tu obtained by dividing the second frequency distribution HB for each scale tone (SD2). The analysis processing unit 227 calculates the average value A1 of the first evaluation distribution QA and the average value A2 of the second evaluation distribution QB (SD3).

The evaluation unit 228 evaluates the degree of similarity in singing pitch between the first singing voice and the second singing voice according to the average value A1 of the first evaluation distribution QA and the average value A2 of the second evaluation distribution QB calculated by the analysis processing unit 227 (SD4). For example, the closer the average value A1 is to the average value A2, the larger the evaluation value S is set, on the basis that the singing pitch of the first singing voice and the singing pitch of the second singing voice are similar throughout the piece of music. Conversely, the more the average value A1 and the average value A2 diverge, the smaller the evaluation value S is set, on the basis that the two singing pitches diverge throughout the piece of music. The display processing unit 28 causes the display device 16 to display the evaluation value S (SD5).

As understood from the above description, in the fourth embodiment the degree of similarity in singing pitch between the first singing voice and the second singing voice is evaluated according to the average value A1 of the first evaluation distribution QA (the representative value of the first evaluation distribution QA), which is generated from the first frequency distribution HA of the representative values RPA in the first pitch stable sections TSA, and the average value A2 of the second evaluation distribution QB (the representative value of the second evaluation distribution QB), which is generated from the second frequency distribution HB of the representative values RPB in the second pitch stable sections TSB. Therefore, as in the first embodiment, the configuration of the fourth embodiment makes it possible to appropriately evaluate the degree of similarity in singing pitch between the user U1 and the user U2 without requiring score data for each singing part of the piece of music. Furthermore, in the fourth embodiment the representative values RPA in the first pitch stable sections TSA and the representative values RPB in the second pitch stable sections TSB are used as the objects of singing evaluation, so the singing pitch can be evaluated more appropriately than in a configuration that evaluates the singing pitch over all sections of the piece of music, including sections in which the pitch transition is unstable.

<Fifth Embodiment>
Each of the above embodiments illustrates a configuration in which the terminal device D1 used by the user U1 serves as the singing evaluation device. The fifth embodiment illustrates a configuration in which a management device 500 that communicates with the terminal devices D (D1, D2) serves as the singing evaluation device.

FIG. 14 is a schematic diagram of the singing evaluation system 1 of the fifth embodiment. As illustrated in FIG. 14, the singing evaluation system 1 of the fifth embodiment includes the management device 500, the terminal device D1 used by the user U1, and the terminal device D2 used by the user U2. The terminal device D1 and the terminal device D2 each have the same configuration as the terminal device D2 illustrated in FIG. 7. That is, the pitch PA of the first singing signal V1 representing the first singing voice of the user U1 is sequentially transmitted from the terminal device D1 to the management device 500, and the pitch PB of the second singing signal V2 representing the second singing voice of the user U2 is sequentially transmitted from the terminal device D2 to the management device 500.

The management device 500 evaluates the degree of pitch harmony between the singing voice of the user U1 and the singing voice of the user U2 and transmits an evaluation value S indicating the evaluation result to both the terminal device D1 of the user U1 and the terminal device D2 of the user U2. The management device 500 includes a singing evaluation unit 52 and a communication device 54. The communication device 54 communicates with each of the terminal device D1 and the terminal device D2. Specifically, the communication device 54 sequentially receives the pitch PA transmitted by the terminal device D1 and the pitch PB transmitted by the terminal device D2. The singing evaluation unit 52 evaluates the degree of harmony between the pitch of the singing voice of the user U1 and the pitch of the singing voice of the user U2. Specifically, the singing evaluation unit 52 of the fifth embodiment includes a section setting unit 224, a representative value calculation unit 226, and an evaluation unit 228, like the singing evaluation unit 22 of the second embodiment. The processing of the section setting unit 224, the representative value calculation unit 226, and the evaluation unit 228 is the same as in the second embodiment, so a detailed description is omitted. The evaluation value S calculated by the evaluation unit 228 is transmitted from the communication device 54 to the terminal device D1 and the terminal device D2 and displayed on the display device of each.
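For orientation, the harmony evaluation that the management device 500 performs can be sketched along the lines of the claims: find the overlapping sections of the two sets of stable sections, then test whether the representative pitches in each overlapping section stand in a predetermined pitch relationship. The sketch below is a set of assumptions rather than the disclosed method: the accepted intervals (unison or octave, minor third, major third, perfect fifth), the 30 cent tolerance, the percentage score, and all function names are hypothetical.

```python
def overlapping_sections(tsa, tsb):
    """Return the time intervals where a section of tsa overlaps a section of tsb."""
    overlaps = []
    for a_start, a_end in tsa:
        for b_start, b_end in tsb:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only overlaps with positive duration
                overlaps.append((start, end))
    return sorted(overlaps)


def in_pitch_relation(rpa, rpb, intervals=(0, 300, 400, 700), tolerance=30):
    """True if the octave-folded gap in cents is near one of the assumed intervals."""
    gap = abs(rpa - rpb) % 1200
    for interval in intervals:
        distance = abs(gap - interval)
        # Measure distance around the octave so that 1190 cents counts as near 0.
        if min(distance, 1200 - distance) <= tolerance:
            return True
    return False


def harmony_score(pairs):
    """Percentage of overlapping sections whose pitches are in the assumed relation."""
    if not pairs:
        return 0.0
    hits = sum(1 for rpa, rpb in pairs if in_pitch_relation(rpa, rpb))
    return 100.0 * hits / len(pairs)


# Hypothetical stable sections (start, end) in seconds for each singer.
tsa = [(0.0, 2.0), (3.0, 5.0)]
tsb = [(1.0, 3.5), (4.5, 6.0)]
rts = overlapping_sections(tsa, tsb)

# Hypothetical representative pitches (RPA, RPB) in cents, one pair per overlap.
score = harmony_score([(6000.0, 6410.0), (6000.0, 6150.0)])
```

How the representative value within each overlapping section is obtained (for example, a mean of the pitches inside it) is left out here because this section does not specify it.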

The fifth embodiment achieves the same effects as the embodiments described above. In addition, in the configuration of the fifth embodiment, the degree of harmony between the pitch of the first singing voice of the user U1 and the pitch of the second singing voice of the user U2 is evaluated by the management device 500. This has the advantage that the terminal device D1 does not need to include the singing evaluation unit 22. Although FIG. 14 illustrates a configuration in which the terminal device D1 transmits the pitch PA of the first singing signal V1 and the terminal device D2 transmits the pitch PB of the second singing signal V2, a configuration may also be adopted in which the terminal device D1 transmits the first singing signal V1 to the management device 500 and the terminal device D2 transmits the second singing signal V2 to the management device 500. In that case, a pitch analysis unit 222 that analyzes the pitch PA of the first singing signal V1 and the pitch PB of the second singing signal V2 is added to the singing evaluation unit 52 of the management device 500.

<Modifications>
Each of the above embodiments can be modified in various ways. Specific modifications are illustrated below. Two or more modifications arbitrarily selected from the following examples may be combined as appropriate.

(1) In the fourth embodiment, the second singing signal V2 is transmitted from the terminal device D2 to the terminal device D1 and the pitch PB of the second singing signal V2 is analyzed by the terminal device D1. However, as in the second and third embodiments, the pitch PB of the second singing signal V2 analyzed by the terminal device D2 may instead be sequentially transmitted from the terminal device D2 to the terminal device D1.

(2) In the fourth embodiment, the average value A1 of the first evaluation distribution QA and the average value A2 of the second evaluation distribution QB are used as the representative values of the respective distributions, but the representative value that characterizes each of the first evaluation distribution QA and the second evaluation distribution QB is not limited to the average value. For example, for each of the first evaluation distribution QA and the second evaluation distribution QB, an index value other than the average value (such as the median or the mode) or any of various statistics such as the variance or the second moment may be calculated as the representative value. For example, the singing pitches of the first singing voice and the second singing voice can be evaluated as more similar the closer the variance or the second moment of the first evaluation distribution QA is to that of the second evaluation distribution QB.

(3) In the fifth embodiment, the degree of harmony between the first singing voice and the second singing voice is evaluated according to whether the representative value RPA of the pitch PA of the first singing signal V1 and the representative value RPB of the pitch PB of the second singing signal V2 are in a predetermined pitch relationship. Alternatively, the pitch distribution generation unit 225 and the analysis processing unit 227 may be added to the management device 500 of the fifth embodiment so that the degree of similarity in singing pitch between the first singing voice and the second singing voice is evaluated.

(4) Each of the above embodiments illustrates a configuration in which the music data L, which includes the accompaniment data B and the lyrics data Q, is stored in the storage device 12 of the terminal device D1. Alternatively, the terminal device D1 may receive the music data L by communicating with a music providing server (not shown) via a communication network 400 (for example, a mobile communication network or the Internet).

(5) Each of the above embodiments illustrates a configuration in which the evaluation value S is displayed as the result of the singing evaluation. Instead of the evaluation value S, text information such as "OK" or "NG" may be displayed as evaluation information.

(6) Each of the above embodiments illustrates a configuration in which the user U1 sings toward the terminal device D1 while the user U2 sings toward the terminal device D2, but the user U1 and the user U2 may instead sing toward a single terminal device. Specifically, for example, a plurality of sound collecting devices (14A and 14B) may be added to the terminal device D1 so that the user U1 sings toward the sound collecting device 14A and the user U2 sings toward the sound collecting device 14B. This configuration also achieves the same effects as the first embodiment.
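As an illustration of modification (2), the alternative representative values of an evaluation distribution can be computed as in the following sketch. It assumes the distribution is a mapping from the deviation α (in cents) to a normalized weight summing to 1; the function name, the weighted-median rule, and the sample distribution are assumptions.

```python
def distribution_statistics(distribution):
    """Return mean, median, mode and variance of a normalized distribution.

    distribution maps a deviation alpha (cents) to its weight; weights sum to 1.
    """
    items = sorted(distribution.items())
    mean = sum(alpha * weight for alpha, weight in items)
    variance = sum(weight * (alpha - mean) ** 2 for alpha, weight in items)
    # Mode: the deviation with the largest weight.
    mode = max(items, key=lambda item: item[1])[0]
    # Weighted median: first deviation at which the cumulative weight reaches 0.5.
    median = items[-1][0]
    cumulative = 0.0
    for alpha, weight in items:
        cumulative += weight
        if cumulative >= 0.5:
            median = alpha
            break
    return {"mean": mean, "median": median, "mode": mode, "variance": variance}


# Hypothetical normalized evaluation distribution QA.
qa = {-2.5: 0.2, 2.5: 0.6, 12.5: 0.2}
stats = distribution_statistics(qa)
```

Two singers could then be compared on any of these statistics, for example by the absolute difference of their variances.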

DESCRIPTION OF SYMBOLS 1 ... singing evaluation system, 10 ... arithmetic processing device, 12 ... storage device, 14 ... sound collecting device, 15 ... communication device, 16 ... display device, 18 ... sound emitting device, 22 ... singing evaluation unit, 26 ... reproduction processing unit, 28 ... display processing unit, 34 ... sound collecting device, 35 ... communication device, 36 ... pitch analysis unit, 52 ... singing evaluation unit, 54 ... communication device, 222 ... pitch analysis unit, 224 ... section setting unit, 225 ... pitch distribution generation unit, 226 ... representative value calculation unit, 227 ... analysis processing unit, 228 ... evaluation unit, 500 ... management device, B ... accompaniment data, Q ... lyrics data, L ... music data, TSA ... first pitch stable section, TSB ... second pitch stable section, RTS ... overlapping section, Tu ... unit range.

Claims (6)

A singing evaluation device comprising:
a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal representing one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal representing another singing voice of the piece of music is stable;
a representative value calculation unit that calculates, for each overlapping section in which one of the plurality of first pitch stable sections and one of the plurality of second pitch stable sections set by the section setting unit overlap on the time axis, a representative value of the pitch of the first singing signal and a representative value of the pitch of the second singing signal; and
an evaluation unit that evaluates the degree of harmony between the one singing voice and the other singing voice according to whether the representative value of the pitch of the first singing signal and the representative value of the pitch of the second singing signal in each overlapping section are in a predetermined pitch relationship.
The singing evaluation device according to claim 1, further comprising:
a pitch analysis unit that sequentially analyzes the pitch of the first singing signal; and
a receiving unit that receives pitches sequentially analyzed from the second singing signal,
wherein the section setting unit sets a plurality of first pitch stable sections of the first singing signal in which the pitch analyzed by the pitch analysis unit is stable, and a plurality of second pitch stable sections of the second singing signal in which the pitch received by the receiving unit is stable.
The singing evaluation device according to claim 2, wherein:
the pitch analysis unit sequentially analyzes the pitch of the first singing signal at each first analysis point on the time axis;
the receiving unit receives the pitch of the second singing signal sequentially analyzed at each second analysis point on the time axis; and
the section setting unit sets the overlapping sections in which the first pitch stable sections and the second pitch stable sections overlap after aligning, on the time axis, the first analysis points and the second analysis points that correspond to each other.
A singing evaluation device comprising:
a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal representing one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal representing another singing voice of the piece of music is stable;
a representative value calculation unit that calculates a representative value of the pitch in each of the plurality of first pitch stable sections and a representative value of the pitch in each of the plurality of second pitch stable sections;
a pitch distribution generation unit that generates a first frequency distribution representing the frequency distribution of the representative values of the pitch over the plurality of first pitch stable sections, and a second frequency distribution representing the frequency distribution of the representative values of the pitch over the plurality of second pitch stable sections;
an analysis processing unit that creates a first evaluation distribution by dividing the first frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the distributions of the unit ranges for each pitch over the plurality of unit ranges, and that creates a second evaluation distribution by dividing the second frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the distributions of the unit ranges for each pitch over the plurality of unit ranges; and
an evaluation unit that evaluates the degree of similarity in singing pitch between the one singing voice and the other singing voice on the basis of the first evaluation distribution and the second evaluation distribution.
A program that causes a computer to function as:
a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal representing one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal representing another singing voice of the piece of music is stable;
a representative value calculation unit that calculates, for each overlapping section in which one of the plurality of first pitch stable sections and one of the plurality of second pitch stable sections set by the section setting unit overlap on the time axis, a representative value of the pitch of the first singing signal and a representative value of the pitch of the second singing signal; and
an evaluation unit that evaluates the degree of harmony between the one singing voice and the other singing voice according to whether the representative value of the pitch of the first singing signal and the representative value of the pitch of the second singing signal in each overlapping section are in a predetermined pitch relationship.
A program that causes a computer to function as:
a section setting unit that sets a plurality of first pitch stable sections in which the pitch of a first singing signal representing one singing voice of a piece of music is stable, and a plurality of second pitch stable sections in which the pitch of a second singing signal representing another singing voice of the piece of music is stable;
a representative value calculation unit that calculates a representative value of the pitch in each of the plurality of first pitch stable sections and a representative value of the pitch in each of the plurality of second pitch stable sections;
a pitch distribution generation unit that generates a first frequency distribution representing the frequency distribution of the representative values of the pitch over the plurality of first pitch stable sections, and a second frequency distribution representing the frequency distribution of the representative values of the pitch over the plurality of second pitch stable sections;
an analysis processing unit that creates a first evaluation distribution by dividing the first frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the distributions of the unit ranges for each pitch over the plurality of unit ranges, and that creates a second evaluation distribution by dividing the second frequency distribution into a plurality of unit ranges each centered on the pitch of a scale tone, superimposing the unit ranges on one another, and summing the frequencies of the distributions of the unit ranges for each pitch over the plurality of unit ranges; and
an evaluation unit that evaluates the degree of similarity in singing pitch between the one singing voice and the other singing voice on the basis of the first evaluation distribution and the second evaluation distribution.
JP2015041620A 2015-03-03 2015-03-03 Singing evaluation device and program Active JP6488767B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2015041620A JP6488767B2 (en) 2015-03-03 2015-03-03 Singing evaluation device and program

Publications (2)

Publication Number Publication Date
JP2016161831A JP2016161831A (en) 2016-09-05
JP6488767B2 true JP6488767B2 (en) 2019-03-27

Family

ID=56846899

Country Status (1)

Country Link
JP (1) JP6488767B2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019092784A1 (en) * 2017-11-07 2019-05-16 ヤマハ株式会社 Evaluation device, evaluation method, and evaluation program
CN113571030B (en) * 2021-07-21 2023-10-20 浙江大学 MIDI music correction method and device based on hearing harmony evaluation

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3293745B2 (en) * 1996-08-30 2002-06-17 ヤマハ株式会社 Karaoke equipment
JP3888371B2 (en) * 1996-11-20 2007-02-28 ヤマハ株式会社 Sound signal analyzing apparatus and method
JP4271667B2 (en) * 2005-03-17 2009-06-03 株式会社第一興商 Karaoke scoring system for scoring duet synchronization
JP2006284796A (en) * 2005-03-31 2006-10-19 Yamaha Corp Musical sound signal transmitting terminal and musical sound signal receiving terminal
JP2007156330A (en) * 2005-12-08 2007-06-21 Taito Corp Karaoke device with compatibility determination function
JP2011215292A (en) * 2010-03-31 2011-10-27 Yamaha Corp Singing determination device and karaoke device
JP6177050B2 (en) * 2013-08-22 2017-08-09 株式会社第一興商 Online karaoke system

Legal Events

Date Code Title Description
A621 Written request for application examination
    Free format text: JAPANESE INTERMEDIATE CODE: A621
    Effective date: 20180125
A977 Report on retrieval
    Free format text: JAPANESE INTERMEDIATE CODE: A971007
    Effective date: 20181225
TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)
    Free format text: JAPANESE INTERMEDIATE CODE: A01
    Effective date: 20190129
A61 First payment of annual fees (during grant procedure)
    Free format text: JAPANESE INTERMEDIATE CODE: A61
    Effective date: 20190211
R151 Written notification of patent or utility model registration
    Ref document number: 6488767
    Country of ref document: JP
    Free format text: JAPANESE INTERMEDIATE CODE: R151