JP5428458B2

JP5428458B2 - Evaluation device

Info

Publication number: JP5428458B2
Application number: JP2009082065A
Authority: JP
Inventors: 幸生多田
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2009-03-30
Filing date: 2009-03-30
Publication date: 2014-02-26
Anticipated expiration: 2029-03-30
Also published as: JP2010237257A

Description

The present invention relates to a technique for evaluating singing.

Techniques for scoring how well a user sings with a karaoke device are known. For example, Patent Documents 1 to 3 disclose techniques for scoring a user's singing in a karaoke device that plays back a karaoke performance. Patent Documents 4 and 5 disclose techniques in which a teacher uses a computer to play back the singing voice of a remote student, corrects the student's singing, and sends the corrections to the student. Patent Document 6 discloses a technique in which a server of an online karaoke system distributes karaoke data to a communication terminal used by a singer while receiving the singer's singing voice signal from that terminal; when distribution of the karaoke data ends, the server judges the singer's singing ability from the received singing voice signal and sends the result to the communication terminal.

Patent Document 1: JP 2003-216168 A
Patent Document 2: JP 2007-121550 A
Patent Document 3: JP H10-222182 A
Patent Document 4: Japanese Patent No. 4087087
Patent Document 5: JP 2003-15673 A
Patent Document 6: JP 2007-17849 A

If each karaoke device is given a scoring function for the user's singing, as in Patent Documents 1 to 3 above, the cost of the karaoke device rises. In addition, because a karaoke device must also handle tasks such as playing back the karaoke performance, it cannot run precise scoring that imposes a heavy processing load. Furthermore, adding a new scoring function means adding it to every karaoke device one by one, which takes time and effort.
An object of the present invention is to evaluate a user's singing voice for each evaluation section in a device other than the communication device that performs processing such as playback of a karaoke performance, and to output the evaluation results sequentially.

The present invention provides an evaluation device comprising: first storage means for storing model voice data representing a model singing voice; acquisition means for acquiring, by streaming from a communication device, voice data representing a user's singing voice; second storage means for storing the voice data acquired by the acquisition means; dividing means for dividing the time axis of the voice data stored in the second storage means into a plurality of evaluation sections in which the singing voice is to be evaluated; evaluation means for selecting the evaluation sections produced by the dividing means in an order corresponding to their positions on the time axis, comparing the voice data included in the selected evaluation section with the model voice data stored in the first storage means for the section corresponding to that evaluation section, and thereby evaluating the user's singing voice in the selected evaluation section and generating an evaluation result; and transmission means for transmitting the generated evaluation result to the communication device every time the evaluation means generates an evaluation result for an evaluation section.

In a preferred aspect of the present invention, delimiter data indicating the boundary positions on the time axis of each phrase included in the model singing voice is attached to the model voice data, and the dividing means may divide the time axis of the voice data stored in the second storage means into a plurality of evaluation sections corresponding to the phrase sections delimited by the delimiter data in the model voice data stored in the first storage means.

In a preferred aspect of the present invention, there are a plurality of communication devices of different types, and when the acquisition means acquires the voice data from a communication device of a predetermined type, the evaluation means may perform a simpler evaluation than when the voice data is acquired from a communication device of a type other than the predetermined type.

According to the present invention, a device other than the communication device that performs processing such as playback of a karaoke performance can evaluate a user's singing voice for each evaluation section and output the evaluation results sequentially.

FIG. 1 is a block diagram showing the configuration of a scoring system according to an embodiment.
FIG. 2 is a block diagram showing the configuration of a karaoke device of the scoring system.
FIG. 3 is a block diagram showing the configuration of a scoring server device of the scoring system.
FIG. 4 is a sequence diagram showing the operation of the scoring system.
FIG. 5 is a diagram showing an example of the song data and model song data used by the scoring system.

[Configuration]
FIG. 1 is a block diagram showing the configuration of a scoring system 1 according to this embodiment. As shown in the figure, the scoring system 1 includes a plurality of karaoke devices 10 and a scoring server device 20. The scoring server device 20 and the karaoke devices 10 are connected via a network N such as the Internet. The karaoke device 10 functions as the communication device according to the present invention, and the scoring server device 20 functions as the evaluation device according to the present invention.

(Karaoke device)
Next, the configuration of the karaoke device 10 will be described. The karaoke device 10 is installed, for example, in a karaoke parlor. FIG. 2 is a block diagram showing the configuration of the karaoke device 10. As shown in the figure, the karaoke device 10 includes a CPU (Central Processing Unit) 11, a memory 12, a communication unit 13, a storage unit 14, an operation unit 15, a display unit 16, a sound collection unit 17, a sound source unit 18, and a sound emitting unit 19. The CPU 11 controls each unit of the karaoke device 10 by executing a program stored in the memory 12. The memory 12 includes, for example, a ROM (Read Only Memory) and a RAM (Random Access Memory), and stores programs and data used by the CPU 11. The communication unit 13 communicates with the scoring server device 20 connected via the network N. The storage unit 14 includes, for example, a hard disk, and stores a device ID used to identify each karaoke device 10 and a plurality of karaoke data Dk used when a user sings. The karaoke data Dk includes performance data and lyrics data. The performance data represents the performance sound of a song; this performance sound contains no singing voice. The lyrics data represents the lyrics of the song. The operation unit 15 includes, for example, a plurality of operation buttons, and inputs an operation signal corresponding to the user's operation to the CPU 11. The display unit 16 includes, for example, a liquid crystal display, and displays images under the control of the CPU 11. The sound collection unit 17 includes, for example, a microphone and an A/D converter; it generates an analog signal corresponding to the collected sound, converts the analog signal into digital data by A/D conversion, and outputs the digital data. The sound source unit 18 generates an audio signal corresponding to the performance data stored in the storage unit 14 and supplies it to the sound emitting unit 19. The sound emitting unit 19 includes, for example, a speaker, a D/A converter, and an amplifier; it converts the audio signal supplied from the sound source unit 18 into an analog signal by D/A conversion, amplifies the analog signal, and emits it as sound.

(Scoring server device)
Next, the configuration of the scoring server device 20 will be described. The scoring server device 20 may be installed in the same karaoke parlor as the karaoke device 10 or at a different location. FIG. 3 is a block diagram showing the configuration of the scoring server device 20. As shown in the figure, the scoring server device 20 includes a CPU 21, a memory 22, a communication unit 23, a storage unit 24, and a processing unit 25. The CPU 21 controls each unit of the scoring server device 20 by executing a program stored in the memory 22. The memory 22 includes, for example, a ROM and a RAM, and stores programs and data used by the CPU 21. The communication unit 23 communicates with each karaoke device 10 connected via the network N. The storage unit 24 includes, for example, a hard disk, and stores a plurality of model song data Dr representing model singing voices. That is, the storage unit 24 functions as the first storage means that stores model voice data (model song data) representing a model singing voice. Delimiter data k indicating the boundary positions on the time axis of each phrase included in the model singing voice is attached to the model song data Dr. The processing unit 25 is, for example, a DSP (Digital Signal Processor); it compares the song data Dx transmitted from the karaoke device 10 with the model song data Dr stored in the storage unit 24 and performs a scoring process that evaluates the skill of the user's singing.

[Operation]
Next, the operation of the scoring system 1 according to this embodiment will be described. FIG. 4 is a sequence diagram showing the operation of the scoring system 1. First, the user operates the operation unit 15 of the karaoke device 10 to enter the song ID of the song to be sung and to instruct the start of the performance. The song ID is, for example, the name of the song or a number assigned to the song. In response to this operation, the CPU 11 has the communication unit 13 establish communication with the scoring server device 20 (step S11), and then transmits the song ID entered by the user and the device ID stored in the storage unit 14 to the scoring server device 20 (step S12). At this time, the communication unit 13 communicates with the scoring server device 20 using a published API (Application Programming Interface). When the song ID and the device ID are transmitted from the karaoke device 10, the CPU 21 of the scoring server device 20 receives them through the communication unit 23 (step S13).

The CPU 11 of the karaoke device 10 also sequentially reads the karaoke data Dk of the song with the entered song ID from the storage unit 14 and performs karaoke playback processing (step S14). Specifically, the CPU 11 supplies the performance data included in the karaoke data Dk read from the storage unit 14 to the sound source unit 18. The sound source unit 18 generates an audio signal corresponding to the supplied performance data and supplies it to the sound emitting unit 19, which emits the corresponding sound. The performance of the song specified by the user thus begins. The CPU 11 also supplies the lyrics data included in the karaoke data Dk read from the storage unit 14 to the display unit 16 in synchronization with the performance, so that the lyrics of the specified song are displayed.

The user sings toward the sound collection unit 17 along with the performance of the song. The sound collection unit 17 collects the user's singing voice and generates song data Dx representing it. The CPU 11 sequentially supplies the song data Dx generated by the sound collection unit 17 to the sound emitting unit 19, which emits the corresponding singing voice together with the performance of the song. The CPU 11 also sequentially supplies the song data Dx to the communication unit 13. The communication unit 13 transmits the song data Dx supplied by the CPU 11 to the scoring server device 20 by streaming (step S15). In streaming transmission, data is divided into PDUs (Protocol Data Units), for example packets, and transmitted PDU by PDU. The receiving side can therefore receive and process the data one PDU at a time without waiting for the entire data to arrive.
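The PDU-by-PDU transmission described above can be pictured with a minimal Python sketch. It is illustrative only: the function name `split_into_pdus` and the fixed `pdu_size` are assumptions made for this example, and the patent does not specify a packet size or a transport protocol.

```python
from typing import Iterator


def split_into_pdus(song_data: bytes, pdu_size: int = 1024) -> Iterator[bytes]:
    """Yield successive chunks of song_data, each at most pdu_size bytes.

    pdu_size is a hypothetical value; the last chunk may be shorter.
    """
    if pdu_size <= 0:
        raise ValueError("pdu_size must be positive")
    for offset in range(0, len(song_data), pdu_size):
        # Each chunk can be sent as soon as it is cut, so the receiver
        # never has to wait for the complete recording.
        yield song_data[offset:offset + pdu_size]
```

Joining the yielded chunks in order reproduces the original data, which is what lets the receiving side rebuild the song data incrementally.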

When the song data Dx is transmitted from the karaoke device 10, the CPU 21 of the scoring server device 20 receives it through the communication unit 23 (step S16) and stores the received song data in the memory 22. That is, the CPU 21 functions as the acquisition means that acquires voice data (song data) representing the user's singing voice from the karaoke device 10 by streaming, and the memory 22 functions as the second storage means that stores the voice data acquired by the CPU 21. As described above, the song data Dx arrives PDU by PDU, so the song data is stored in the memory 22 sequentially, one PDU at a time.
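One way to picture this sequential storage is a buffer that grows with each PDU and reports which sections have been received in full. The sketch below is an assumption-laden illustration: the class `SongBuffer` and the idea of expressing section boundaries as byte offsets (`section_ends`) are hypothetical and are not taken from the patent.

```python
from typing import List


class SongBuffer:
    """Accumulates streamed PDUs and hands out sections once they are complete."""

    def __init__(self, section_ends: List[int]) -> None:
        # section_ends: ascending byte offsets at which each section ends
        # (a hypothetical layout chosen for this example).
        self._data = bytearray()
        self._section_ends = section_ends
        self._next = 0  # index of the next section not yet handed out

    def append(self, pdu: bytes) -> List[bytes]:
        """Store one PDU and return every section that is now fully received."""
        self._data.extend(pdu)
        ready = []
        while (self._next < len(self._section_ends)
               and self._section_ends[self._next] <= len(self._data)):
            begin = self._section_ends[self._next - 1] if self._next else 0
            end = self._section_ends[self._next]
            ready.append(bytes(self._data[begin:end]))
            self._next += 1
        return ready
```

A caller would invoke `append` once per received PDU and pass any returned sections on for scoring.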

Next, the processing unit 25 divides the time axis of the song data Dx stored in the memory 22 into a plurality of scoring sections to be scored (step S17). Each scoring section serves as an evaluation section in which the user's singing voice is evaluated. That is, the processing unit 25 functions as the dividing means that divides the time axis of the voice data stored in the memory 22 into a plurality of evaluation sections. Referring to FIG. 5, the processing unit 25 first identifies, in the model song data Dr stored in the storage unit 24, the phrase sections f1, f2, f3, f4, ... delimited by the delimiter data k. The processing unit 25 then divides the time axis of the song data Dx stored in the memory 22 into scoring sections corresponding to the identified phrase sections. In this example, the time axis of the song data Dx is divided into a scoring section h1 corresponding to the phrase section f1, a scoring section h2 corresponding to the phrase section f2, a scoring section h3 corresponding to the phrase section f3, a scoring section h4 corresponding to the phrase section f4, and so on. That is, the processing unit 25 divides the time axis of the voice data stored in the memory 22 into a plurality of evaluation sections corresponding to the phrase sections delimited by the delimiter data in the model voice data stored in the storage unit 24.
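The division in step S17 can be sketched as follows. This is a minimal illustration under stated assumptions: delimiter positions are taken to be times in seconds from the start of the song, the song data is taken to be a flat list of samples, and the names `phrase_sections` and `split_song_data` are invented for the example.

```python
from typing import List, Sequence, Tuple


def phrase_sections(delimiters: Sequence[float],
                    total_duration: float) -> List[Tuple[float, float]]:
    """Turn delimiter times into consecutive (start, end) phrase sections."""
    inner = sorted(d for d in delimiters if 0.0 < d < total_duration)
    bounds = [0.0] + inner + [total_duration]
    return list(zip(bounds[:-1], bounds[1:]))


def split_song_data(samples: Sequence[float], sample_rate: int,
                    sections: Sequence[Tuple[float, float]]) -> List[List[float]]:
    """Cut the user's samples into scoring sections matching the phrase sections."""
    pieces = []
    for start, end in sections:
        first = int(round(start * sample_rate))
        last = int(round(end * sample_rate))
        pieces.append(list(samples[first:last]))
    return pieces
```

Because both functions work on the same time axis, scoring section h1 lines up with phrase section f1, h2 with f2, and so on.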

Next, the processing unit 25 selects, from the divided scoring sections, the scoring section to be scored this time (step S18). In this example, no scoring section has been scored yet, so the first scoring section on the time axis, h1, is selected. The processing unit 25 then performs the scoring process based on the song data Dx included in the selected scoring section and the model song data Dr stored in the storage unit 24 (step S19). Specifically, the processing unit 25 compares the song data Dx included in the selected scoring section with the model song data Dr included in the phrase section corresponding to that scoring section, and calculates a score according to their similarity. The processing unit 25 then takes the calculated score as the scoring result and generates scoring result data representing it. That is, the processing unit 25 functions as the evaluation means that compares the voice data included in the evaluation section selected from the divided evaluation sections with the model voice data stored in the storage unit 24 for the corresponding section, thereby evaluating the user's singing voice in the selected evaluation section and generating an evaluation result. In this example, a score is calculated according to the similarity between the song data Dx included in the scoring section h1 in FIG. 5 and the model song data Dr included in the phrase section f1; the calculated score is used as the scoring result for the scoring section h1, and scoring result data representing that result is generated.
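The text does not say how similarity is measured, so the following Python sketch substitutes one simple possibility purely for illustration: both sections are assumed to have already been reduced to per-frame pitch values in Hz, and the score is the percentage of frames in which the user's pitch lies within a tolerance of the model pitch. The function name `section_score` and the 50-cent default tolerance are assumptions, not part of the patent.

```python
import math
from typing import Sequence


def section_score(sung_hz: Sequence[float], model_hz: Sequence[float],
                  tolerance_cents: float = 50.0) -> float:
    """Return a 0-100 score: the share of frames whose pitch matches the model."""
    frames = min(len(sung_hz), len(model_hz))
    if frames == 0:
        return 0.0
    hits = 0
    for sung, model in zip(sung_hz, model_hz):
        if sung <= 0 or model <= 0:
            continue  # an unvoiced frame never counts as a match
        # Pitch distance in cents (1200 cents = one octave).
        cents = abs(1200.0 * math.log2(sung / model))
        if cents <= tolerance_cents:
            hits += 1
    return 100.0 * hits / frames
```

Any other similarity measure (volume, timing, spectral distance) could be dropped in behind the same signature.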

When the scoring result data is generated, the CPU 21 has the communication unit 23 transmit it to the karaoke device 10 identified by the device ID received in step S13 described above (step S20). In this example, scoring result data representing the scoring result for the scoring section h1 in FIG. 5 is transmitted to the karaoke device 10. The processing unit 25 then determines whether the scoring section selected in step S18 is the last scoring section on the time axis (step S21). In this example, as shown in FIG. 5, the scoring section h1 is not the last scoring section on the time axis, so the processing unit 25 determines that the selected scoring section is not the last one (step S21: NO). In this case, the processing unit 25 returns to step S18 and selects the scoring section to be scored next. Specifically, because the scoring section h1 has already been scored, the processing unit 25 selects the scoring section h2, which follows h1 on the time axis. That is, the processing unit 25 selects the divided evaluation sections in an order corresponding to their positions on the time axis. The processing unit 25 then scores the scoring section h2 in step S19 in the same way as above, and the CPU 21 transmits scoring result data representing the scoring result for the scoring section h2 to the karaoke device 10 in step S20. That is, the CPU 21 functions as the transmission means that transmits the generated evaluation result to the karaoke device 10 every time the processing unit 25 generates an evaluation result for an evaluation section. In this way, the processing unit 25 and the CPU 21 repeat steps S18 to S21 until it is determined in step S21 that the selected scoring section is the last scoring section on the time axis.
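The loop over steps S18 to S21 can be summarized in a short Python sketch. The names `run_scoring_loop`, `score_fn`, and `send_result`, and the dictionary layout of the result, are hypothetical; the point is only that each result is sent as soon as it exists rather than after the whole song.

```python
from typing import Any, Callable, Dict, Sequence


def run_scoring_loop(sung_sections: Sequence[Any],
                     model_sections: Sequence[Any],
                     score_fn: Callable[[Any, Any], float],
                     send_result: Callable[[Dict[str, Any]], None]) -> int:
    """Score sections in time order, sending each result immediately.

    Returns the number of results sent.
    """
    total = min(len(sung_sections), len(model_sections))
    index = 0
    while index < total:
        # S18: select the next section in time-axis order.
        sung, model = sung_sections[index], model_sections[index]
        # S19: score the selected section against the matching model section.
        score = score_fn(sung, model)
        # S20: transmit the result as soon as it has been generated.
        send_result({"section": index + 1, "score": score})
        # S21: continue unless this was the last section.
        index += 1
    return index
```

In a real server `send_result` would write to the connection opened by the karaoke device; here it is just a callback so the loop stays testable.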

Meanwhile, the CPU 11 of the karaoke device 10 waits for scoring result data from the scoring server device 20 while performing steps S14 and S15 described above. When scoring result data is transmitted from the scoring server device 20, the CPU 11 receives it through the communication unit 13 (step S22). The CPU 11 then displays the scoring result represented by the received scoring result data on the display unit 16 (step S23). For example, when scoring result data representing the scoring result for the scoring section h1 in FIG. 5 has been transmitted from the scoring server device 20 as described above, the scoring result for the scoring section h1 is displayed. Because the karaoke playback processing continues while steps S22 and S23 are performed, the lyrics of the song are being shown on the display unit 16. Therefore, when displaying the scoring result in step S23, the CPU 11 either temporarily suspends the lyrics display to show the scoring result or displays a composite of the lyrics and the scoring result.

Next, the CPU 11 determines whether the user's singing has ended, based on the state of the karaoke playback processing of step S14 (step S24). For example, if the karaoke playback processing has not ended, the CPU 11 determines that the user's singing has not ended (step S24: NO). In this case, the CPU 11 returns to step S22 and waits for new scoring result data. When new scoring result data is transmitted from the scoring server device 20, the CPU 11 receives it through the communication unit 13 in step S22 and displays the scoring result it represents on the display unit 16 in step S23, in the same way as above. The scoring results for the scoring sections in FIG. 5 are thus displayed on the display unit 16 one after another.

When, in step S14, the karaoke data Dk of the song specified by the user has been read and processed up to its last position on the time axis, the karaoke playback processing ends. In this case, the CPU 11 determines that the user's singing has ended (step S24: YES), has the communication unit 13 disconnect from the scoring server device 20 (step S25), and ends this processing.

According to the embodiment described above, the scoring server device 20, which is separate from the karaoke device 10 that plays back the karaoke performance, can evaluate the user's singing voice for each evaluation section and output the evaluation results sequentially. Because the individual karaoke devices 10 do not need a scoring function for the user's singing, the cost of the karaoke device 10 can be reduced. Furthermore, because the scoring server device 20 does not play back the karaoke performance, it can run precise scoring that imposes a heavy processing load. In addition, a new scoring function only has to be added to the scoring server device 20, so adding one is easy.

[Modifications]
The above is a description of the embodiment, which can be modified as follows. The following modifications may also be combined as appropriate.
(Modification 1)
In the embodiment described above, the time axis of the singing data is divided into a plurality of scoring sections corresponding to the phrase sections of the model singing data, but the sections used for the division are not limited to phrase sections. For example, the time axis of the singing data may be divided into scoring sections corresponding to the sections on the time axis of each melody included in the singing data. Alternatively, it may be divided into scoring sections corresponding to sections in which no breath is taken. Such sections may be identified, for example, by analyzing the singing data to detect the timings at which breaths are taken and using the detected timings. Alternatively, if the positions on the time axis at which breaths are taken are determined in advance, the sections may be identified from those timings.
The karaoke device 10 may also designate the scoring sections. In this case, the CPU 11 of the karaoke device 10 transmits scoring section information representing the scoring sections to the scoring server device 20 together with the music ID and the device ID described above. The processing unit 25 of the scoring server device 20 then divides the time axis of the singing data based on the scoring sections represented by the scoring section information transmitted from the karaoke device 10.
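The breath-based division in Modification 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the breath timings have already been detected (or were determined in advance) and are given in seconds. The function name `split_by_breaths` and its parameters are hypothetical.

```python
from typing import List, Tuple


def split_by_breaths(duration: float, breath_times: List[float]) -> List[Tuple[float, float]]:
    """Split the time axis [0, duration] into scoring sections at breath timings.

    Assumption: breath_times are seconds from the start of the singing data.
    Timings outside the recording are ignored; duplicates are merged.
    """
    # Keep only cut points strictly inside the recording, sorted and de-duplicated.
    cuts = sorted({t for t in breath_times if 0.0 < t < duration})
    bounds = [0.0] + cuts + [duration]
    # Each pair of neighbouring bounds is one section with no breath inside it.
    return list(zip(bounds[:-1], bounds[1:]))


sections = split_by_breaths(10.0, [2.5, 6.0])
```

Each returned `(start, end)` pair would then be scored in time order, in the same way as the phrase sections of the embodiment.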

(Modification 2)
In the embodiment described above, when the song to be sung consists of, for example, a first verse and a second verse, the first verse as a whole may be scored at the point when the user finishes singing it. In this case, delimiter data representing the position on the time axis of the boundary between the first and second verses is added to the singing data. When the processing unit 25 of the scoring server device 20 finishes scoring the scoring section that contains this delimiter data, it sets the section from the position where the singing starts to the position where the first verse ends as a scoring section on the time axis of the singing data, and scores that section in the same way as described above.
The singing as a whole may also be scored at the point when the user finishes singing. In this case, when the processing unit 25 finishes scoring the scoring section that contains the end of the singing data, it sets the section from the position where the singing starts to the position where it ends as a scoring section on the time axis of the singing data, and scores that section in the same way as described above.

Furthermore, the content of the scoring process may be changed when the singing as a whole is scored. For example, the scoring process for each evaluation section may calculate only a score representing the similarity between the singing data and the model singing data, while the scoring process for the whole singing may, in addition to that score, count the number of times singing techniques such as "kobushi" and "shakuri" are used. Points may also be added, on top of the similarity-based score, for factors such as vibrato, rushing ("hashiri"), delaying ("tame"), intonation, performance time, clarity of low and high notes, and the degree of agreement in a duet. In the scoring process for the whole singing, the score may also be calculated based on whether appropriate singing techniques are used at appropriate timings.
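The difference between per-section scoring and whole-song scoring described above might look like the following sketch. It assumes the similarity is already available as a value between 0 and 1 and that technique detection has produced a list of labels. The bonus weights in `BONUS_PER_USE`, the 100-point cap, and all function names are hypothetical, since the text does not specify how points are added.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical bonus points per detected use of a technique (not from the source).
BONUS_PER_USE: Dict[str, float] = {"kobushi": 0.5, "shakuri": 0.5, "vibrato": 1.0}


def section_score(similarity: float) -> float:
    """Per-section scoring: only the similarity-based score is computed."""
    return round(100.0 * similarity, 1)


def whole_song_result(similarity: float, techniques: Iterable[str]) -> dict:
    """Whole-song scoring: similarity score plus technique counts and bonus points."""
    counts = Counter(techniques)
    bonus = sum(BONUS_PER_USE.get(name, 0.0) * n for name, n in counts.items())
    # Cap the total at 100 points (an assumption made for this sketch).
    score = min(100.0, 100.0 * similarity + bonus)
    return {"score": round(score, 1), "technique_counts": dict(counts)}


result = whole_song_result(0.8, ["vibrato", "kobushi", "vibrato"])
```

Keeping the per-section path this small is what lets intermediate results be returned quickly, while the richer whole-song result is produced once at the end.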

(Modification 3)
In the embodiment described above, the singing data is transmitted to the scoring server device 20 as it is, but only feature amounts of the singing data may be transmitted instead. The feature amounts are, for example, the frequency characteristics, pitch, or rhythm of the singing data. In this case, the CPU 11 of the karaoke device 10 extracts the feature amounts from the singing data generated by the sound collection unit 17 and transmits them to the scoring server device 20 through the communication unit 13. The CPU 21 and the processing unit 25 of the scoring server device 20 then perform the same processing as described above using the feature amounts transmitted from the karaoke device 10.

Alternatively, scoring may be based on the feature amounts of the singing data while the user is singing, and the singing as a whole may be scored based on the singing data itself once the user has finished. In this case, until the user finishes singing, the CPU 11 of the karaoke device 10 extracts feature amounts from the singing data and transmits them to the scoring server device 20. The processing unit 25 of the scoring server device 20 performs the same processing as described above using the feature amounts transmitted from the karaoke device 10. Scoring with only the feature amounts reduces the load of the scoring process and therefore shortens the time it takes, so the scoring result data can be transmitted to the karaoke device 10 quickly. When the user finishes singing, the CPU 11 of the karaoke device 10 transmits the singing data itself to the scoring server device 20. On receiving the singing data, the processing unit 25 of the scoring server device 20 sets the entire time axis of the singing data as a scoring section and scores that section in the same way as described above. Scoring with the singing data itself allows a detailed evaluation, so the user can obtain detailed scoring results.

The CPU 11 of the karaoke device 10 may also detect the state of the network N and, when the available bandwidth of the network N is narrow, transmit feature amounts of the singing data instead of the singing data. In this case, the CPU 21 and the processing unit 25 of the scoring server device 20 perform the same processing as described above using the feature amounts transmitted from the karaoke device 10.
In addition, when communication has been established with a plurality of karaoke devices 10, the processing unit 25 of the scoring server device 20 may extract feature amounts from the singing data stored in the memory 22 and perform the same processing as described above using the extracted feature amounts. The reason is that, when communication has been established with a plurality of karaoke devices 10, the singing data transmitted from each karaoke device 10 has to be scored, so the processing load of the processing unit 25 needs to be kept low.
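The bandwidth-dependent choice between sending the singing data and sending only its feature amounts can be sketched on the client side as below. The threshold value, the function names, and the two toy features (mean absolute amplitude and zero-crossing count) are assumptions made for illustration. The text names frequency characteristics, pitch, and rhythm as candidate features and does not say how the network state is measured.

```python
from typing import List, Tuple

# Hypothetical threshold below which the network is treated as "narrow".
BANDWIDTH_THRESHOLD_KBPS = 256.0


def extract_features(samples: List[float]) -> List[float]:
    """Toy stand-in for feature extraction: mean |amplitude| and zero crossings."""
    if not samples:
        return [0.0, 0.0]
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return [mean_abs, float(crossings)]


def choose_payload(samples: List[float], available_kbps: float) -> Tuple[str, List[float]]:
    """Return ("features", ...) on a narrow network, otherwise ("raw", ...)."""
    if available_kbps < BANDWIDTH_THRESHOLD_KBPS:
        return ("features", extract_features(samples))
    return ("raw", samples)


kind, payload = choose_payload([1.0, -1.0, 1.0], available_kbps=100.0)
```

Tagging the payload with its kind lets the server pick the matching scoring path without inspecting the data.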

(Modification 4)
In the embodiment described above, the CPU 21 of the scoring server device 20 may charge the user who sang. In this case, the user operates the operation unit 15 to enter his or her user ID. In response to this operation, the CPU 11 transmits the entered user ID to the scoring server device 20 through the communication unit 13. The CPU 21 of the scoring server device 20 then calculates an amount that depends on the amount of computation required by the scoring process described above, and performs a billing process so that the calculated amount is charged to the user represented by the user ID transmitted from the karaoke device 10. Alternatively, the CPU 21 may perform the billing process so that a predetermined amount is charged to the user.
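The two billing options in Modification 4 (an amount that depends on the computation, or a predetermined amount) can be sketched as follows. The class `BillingLedger`, the notion of "compute units", and the unit price are hypothetical. The text does not define how the amount of computation is measured or priced.

```python
from typing import Dict, Optional


class BillingLedger:
    """Accumulates charges per user ID (illustrative only)."""

    def __init__(self, unit_price: int = 2) -> None:
        self.unit_price = unit_price          # hypothetical price per compute unit
        self.charges: Dict[str, int] = {}

    def charge(self, user_id: str, compute_units: int,
               flat_fee: Optional[int] = None) -> int:
        """Charge either a flat fee or an amount proportional to the computation."""
        if flat_fee is not None:
            amount = flat_fee
        else:
            amount = compute_units * self.unit_price
        self.charges[user_id] = self.charges.get(user_id, 0) + amount
        return amount


ledger = BillingLedger()
usage_based = ledger.charge("user-001", compute_units=150)
```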

(Modification 5)
In the embodiment described above, only the karaoke device 10 is connected to the scoring server device 20, but the devices connected to the scoring server device 20 are not limited to the karaoke device 10. For example, a mobile phone or a portable game machine may be connected to the scoring server device 20. In this case, like the karaoke device 10 described above, these devices transmit singing data representing the user's singing to the scoring server device 20 while performing the karaoke playback process, and display the scoring result represented by the scoring result data transmitted from the scoring server device 20. In other words, both the mobile phone and the portable game machine function as the communication device according to the present invention.

The scoring server device 20 may also change the content of the scoring process according to the type of the device that transmits the singing data. The type may be a predetermined classification of devices, such as karaoke device, game machine, or mobile phone, or it may be a type based on the purpose of use, such as a device used to obtain a full-fledged evaluation of the singing voice or a device used to obtain a simple evaluation. Alternatively, it may be a type based on the processing capability of the device. In this case, the storage unit 24 of the scoring server device 20 stores device types and scoring algorithms in association with each other. The CPU 21 of the scoring server device 20 acquires the type from the device that transmits the singing data and identifies the scoring algorithm stored in association with the acquired type. The processing unit 25 then performs the scoring process described above using the identified algorithm.

Assume here that the storage unit 24 stores the type "karaoke device" in association with a "scoring algorithm that performs a detailed evaluation", and the type "mobile phone" in association with a "scoring algorithm that performs a simple evaluation". In this case, for example, when singing data and the type "karaoke device" are transmitted from the karaoke device 10, the processing unit 25 of the scoring server device 20 performs the scoring process using the "scoring algorithm that performs a detailed evaluation" associated with the type "karaoke device" in the storage unit 24. When singing data and the type "mobile phone" are transmitted from a mobile phone, the processing unit 25 performs the scoring process using the "scoring algorithm that performs a simple evaluation" associated with the type "mobile phone" in the storage unit 24. In other words, when the processing unit 25 acquires voice data from a communication device of a predetermined type, it performs a simpler evaluation than when it acquires voice data from a communication device of a different type.
A simple evaluation here means that the scoring process has fewer processing steps, or takes less processing time to evaluate the same singing voice. Specifically, it is achieved by reducing the number of evaluation items, omitting processing with a heavy load, and so on. Although the case of two scoring algorithms has been described as an example, three or more scoring algorithms with different degrees of simplicity may be provided.
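The association between device types and scoring algorithms held in the storage unit 24 can be sketched as a lookup table. The two algorithm bodies below are placeholders: the "detailed" one reports more evaluation items than the "simple" one, which is one of the ways of simplifying that the text mentions. The type strings, the function names, and the fallback to the simple algorithm for unknown types are assumptions.

```python
from typing import Callable, Dict, List

Scores = Dict[str, float]


def simple_scoring(similarities: List[float]) -> Scores:
    """Fewer evaluation items: only the overall score."""
    if not similarities:
        raise ValueError("at least one evaluated section is required")
    return {"score": round(100.0 * sum(similarities) / len(similarities), 1)}


def detailed_scoring(similarities: List[float]) -> Scores:
    """More evaluation items: overall score plus the weakest section."""
    result = simple_scoring(similarities)
    lowest = min(range(len(similarities)), key=lambda i: similarities[i])
    result["lowest_section"] = lowest
    result["lowest_score"] = round(100.0 * similarities[lowest], 1)
    return result


# Device type -> scoring algorithm, mirroring the table in the storage unit.
ALGORITHMS: Dict[str, Callable[[List[float]], Scores]] = {
    "karaoke_device": detailed_scoring,
    "mobile_phone": simple_scoring,
}


def score_for(device_type: str, similarities: List[float]) -> Scores:
    # Falling back to the simple algorithm for unknown types is an assumption.
    return ALGORITHMS.get(device_type, simple_scoring)(similarities)


phone_result = score_for("mobile_phone", [0.5, 1.0])
```

Adding a third algorithm of intermediate simplicity only requires another entry in `ALGORITHMS`.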

(Modification 6)
In the embodiment described above, a score corresponding to the similarity between the singing data and the model singing data is used as the scoring result, but the information used as the scoring result is not limited to this. For example, after the score corresponding to the similarity between the singing data and the model singing data has been calculated, the factors that lower the score in the user's singing voice may be identified, and advice for raising the score may be used as the scoring result in addition to the calculated score.

(Modification 7)
In the embodiment described above, a configuration in which the karaoke device 10 and the scoring server device 20 are separate devices has been described as an example, but the karaoke device 10 may have the functions of the scoring server device 20. In this case, for example, the karaoke devices 10 exchange information with each other to determine a karaoke device 10 that is not performing the karaoke playback process, or one that is performing it but has a low processing load, and that karaoke device 10 functions as the scoring server device 20. Alternatively, when a management device is connected to the karaoke devices 10, the management device may detect a karaoke device 10 that is not performing the karaoke playback process, or one that is performing it but has a low processing load, and operate the detected karaoke device 10 as the scoring server device 20.
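The selection of a karaoke device to act as the scoring server in Modification 7 can be sketched as below, whether it is run by a management device or by the devices themselves after exchanging status information. The `DeviceStatus` fields, the load scale of 0 to 1, and the `max_load` cutoff for "low load" are assumptions. The text does not define how load is measured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeviceStatus:
    device_id: str
    playing: bool   # True while the karaoke playback process is running
    load: float     # hypothetical processing load, 0.0 (idle) to 1.0 (saturated)


def pick_scoring_host(devices: List[DeviceStatus], max_load: float = 0.5) -> Optional[str]:
    """Prefer a device that is not playing; otherwise a playing device with low load."""
    idle = [d for d in devices if not d.playing]
    if idle:
        return min(idle, key=lambda d: d.load).device_id
    light = [d for d in devices if d.load <= max_load]
    if light:
        return min(light, key=lambda d: d.load).device_id
    return None  # no suitable device; the caller must handle this case


host = pick_scoring_host([
    DeviceStatus("room-1", playing=True, load=0.2),
    DeviceStatus("room-2", playing=False, load=0.4),
])
```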

(Modification 8)
In the embodiment described above, an example in which the scoring server device 20 scores the skill of singing has been described, but what the scoring server device 20 scores is not limited to singing. For example, the scoring server device 20 may score the skill of the user's English conversation. Assume here that the user speaks English to a terminal device such as a mobile phone or a computer device, and that the terminal device transmits voice data representing the user's English conversation to the scoring server device 20. Like the karaoke device 10 described above, this terminal device includes a sound collection unit 17 and a sound emission unit 19 and is connected to the scoring server device 20 via the network N. In this case, model voice data representing a model English conversation is stored in advance in the storage unit 24 of the scoring server device 20. The processing unit 25 then performs a scoring process that rates the skill of the user's English conversation, in the same way as described above, based on the voice data transmitted from the terminal device and the model voice data stored in the storage unit 24.

(Modification 9)
In the embodiment described above, the model voice data Dr stored in the storage unit 24 of the scoring server device 20 may represent the model singing voice itself, or may be parameters representing the notes, frequency, rhythm, and so on of the model voice. In short, the storage unit 24 only needs to store model voice data representing a model singing voice. That is, "voice data" in the present invention is not limited to data representing the voice itself and may be parameters representing the characteristics of the voice.

(Modification 10)
In the embodiment described above, the processing performed by the CPU 21 may be performed by the processing unit 25, and the processing performed by the processing unit 25 may be performed by the CPU 21. Furthermore, the processing performed by the CPU 21 or the processing unit 25 may be realized by one or more hardware resources, or by the CPU 21 executing one or more programs. Such a program can be provided stored on a computer-readable recording medium, such as a magnetic recording medium (magnetic tape, magnetic disk, and the like), an optical recording medium (optical disc and the like), a magneto-optical recording medium, or a semiconductor memory. The program can also be downloaded via a network such as the Internet.

DESCRIPTION OF SYMBOLS
1 ... Scoring system, 10 ... Karaoke device, 11 ... CPU, 12 ... Memory, 13 ... Communication unit, 14 ... Storage unit, 15 ... Operation unit, 16 ... Display unit, 17 ... Sound collection unit, 18 ... Sound source unit, 19 ... Sound emission unit, 20 ... Scoring server device, 21 ... CPU, 22 ... Memory, 23 ... Communication unit, 24 ... Storage unit, 25 ... Processing unit.

Claims

An evaluation device comprising:
first storage means for storing model voice data representing a model singing voice;
acquisition means for acquiring, by streaming, voice data representing a user's singing voice from a communication device;
second storage means for storing the voice data acquired by the acquisition means;
dividing means for dividing the time axis of the voice data stored in the second storage means into a plurality of evaluation sections in which the singing voice is to be evaluated;
evaluation means for selecting the evaluation sections divided by the dividing means in order of their position on the time axis, evaluating the user's singing voice in the selected evaluation section by comparing the voice data included in the selected evaluation section with the model voice data, stored in the first storage means, that is included in the section corresponding to that evaluation section, and generating an evaluation result; and
transmission means for transmitting the generated evaluation result to the communication device each time the evaluation means generates an evaluation result for an evaluation section,
wherein the communication device includes a karaoke device and a communication device of a type other than the karaoke device, and
the evaluation means changes the processing content of the evaluation depending on whether the acquisition means has acquired the voice data from the karaoke device or from a communication device of a type other than the karaoke device.

The evaluation device according to claim 1, wherein
delimiter data indicating the position on the time axis of the boundary of each phrase included in the model singing voice is added to the model voice data, and
the dividing means divides the time axis of the voice data stored in the second storage means into a plurality of evaluation sections corresponding to the phrase sections delimited by the delimiter data in the model voice data stored in the first storage means.

The evaluation device according to claim 1, wherein, when the acquisition means has acquired the voice data from a communication device of a type other than the karaoke device, the evaluation means performs the evaluation with simpler content than when the voice data has been acquired from the karaoke device.

The evaluation device according to any one of claims 1 to 3, wherein
until the user finishes singing, the communication device extracts a feature amount from the voice data representing the user's singing voice and transmits the extracted feature amount to the evaluation device, and
the evaluation means evaluates the singing voice based on the feature amount when the feature amount is transmitted from the communication device.

The evaluation device according to any one of claims 1 to 4, wherein
the evaluation device and the communication device are connected via a network,
the communication device detects the state of the network and, when the available bandwidth of the network is narrower than a threshold, extracts a feature amount from the voice data representing the user's singing voice and transmits the extracted feature amount to the evaluation device, and
the evaluation means evaluates the singing voice based on the feature amount when the feature amount is transmitted from the communication device.

The evaluation device according to any one of claims 1 to 5, further comprising
a communication unit that communicates with a plurality of communication devices,
wherein, when the communication unit has established communication with a plurality of communication devices, the evaluation means extracts a feature amount from the voice data stored in the second storage means and evaluates the singing voice based on the extracted feature amount.