JP6954780B2

JP6954780B2 - Karaoke equipment

Info

Publication number: JP6954780B2
Application number: JP2017147475A
Authority: JP
Inventors: 政之鎌田
Original assignee: Daiichikosho Co Ltd
Current assignee: Daiichikosho Co Ltd
Priority date: 2017-07-31
Filing date: 2017-07-31
Publication date: 2021-10-27
Anticipated expiration: 2037-07-31
Also published as: JP2019028251A

Description

The present invention relates to a karaoke device.

A karaoke device can score karaoke singing by comparing singing voice data, extracted from the singing voice input through a microphone, with reference data indicating the main melody of the song being played.

For example, Patent Document 1 discloses a technique that extracts pitch data and note-length data from a singing voice signal input from a microphone during a karaoke performance, and compares them with a guide melody read out in parallel with the performance, thereby scoring the skill of the singing.

On the other hand, singers differ in skill, so a singer who sings with great effort still cannot obtain a good score if the deviation from the reference data is large. Patent Document 2 therefore discloses a technique that extracts, from the voice corresponding to the singing, an emotional feature amount correlated with how emotion is put into the singing, and analyzes the emotion put into the singing based on that feature amount. Such a technique makes it possible to evaluate the singer's emotional expression.

Patent Document 1: Japanese Unexamined Patent Publication No. H10-69216
Patent Document 2: Japanese Unexamined Patent Publication No. H10-187178

Karaoke songs (and each singing section within a song) express various atmospheres, such as a cheerful or a sad atmosphere, through their lyrics and tune. Singing with emotional expression that takes the atmosphere of each karaoke song into account can therefore be regarded as skillful karaoke singing.

With the technique disclosed in Patent Document 2, however, singing hard (for example, with high volume or a high degree of enthusiasm) yields a high evaluation regardless of the atmosphere of the karaoke song. That is, the technique disclosed in Patent Document 2 cannot evaluate emotional expression in consideration of the atmosphere of the karaoke song.

An object of the present invention is to provide a karaoke device capable of appropriately evaluating whether singing is performed with emotional expression suited to the atmosphere of the karaoke song.

The main invention for achieving the above object is a karaoke device comprising: an emotion reference data acquisition unit that acquires, for each singing section, emotion reference data indicating the emotion to be expressed when singing a karaoke song; a singing emotion data generation unit that generates, for each singing section, singing emotion data obtained by emotion analysis of the singing voice obtained by singing the karaoke song; and an emotion expression evaluation unit that evaluates the emotional expression of each singing section by comparing the emotion reference data with the singing emotion data.
Other features of the present invention will become clear from the description and drawings below.

According to the present invention, it is possible to appropriately evaluate whether singing is performed with emotional expression suited to the atmosphere of the karaoke song.

A diagram showing an example of the hardware configuration of the karaoke device according to the embodiment.
A diagram showing an example of the software configuration of the device main body according to the embodiment.
A diagram showing an example of the data stored in the word-emotion type database according to the embodiment.
A diagram showing an example of the data stored in the tune-emotion type database according to the embodiment.
A diagram showing the lyrics and tune of each singing section included in the karaoke song according to the embodiment.
A diagram showing the total score of each singing section included in the karaoke song according to the embodiment.
A diagram showing the emotion reference data according to the embodiment.
A diagram showing the singing emotion data according to the embodiment.
A diagram showing the evaluation of emotional expression according to the embodiment.
A flowchart showing the processing of the karaoke device according to the embodiment.
A diagram showing an example of the software configuration of the device main body according to a modification.
A diagram showing the emotion reference data according to the modification.

<Embodiment>
The karaoke device 1 according to the present embodiment will be described with reference to FIGS. 1 to 9.

== Karaoke device ==
The karaoke device 1 is a device that plays the karaoke accompaniment of a song selected by a singer and with which the singer performs karaoke singing. As shown in FIG. 1, the karaoke device 1 includes a karaoke main body 10, a speaker 20, a display device 30, a microphone 40, and a remote control device 50.

The speaker 20 emits sound based on the sound emission signal from the karaoke main body 10. The display device 30 displays video and images on its screen based on signals from the karaoke main body 10. The microphone 40 converts the singer's singing voice (the voice input to the microphone 40) into an analog singing voice signal and inputs it to the karaoke main body 10.

(Karaoke main body hardware)
As shown in FIG. 1, the karaoke main body 10 includes a control unit 11, a communication unit 12, a storage unit 13, an acoustic processing unit 14, a display processing unit 15, and an operation unit 16. Each component is connected to a bus B via an interface (not shown).

The karaoke main body 10 performs various kinds of control related to karaoke singing, such as controlling the karaoke performance of the selected song, controlling the display of lyrics and background images, and processing the singing voice signal input through the microphone 40.

The control unit 11 includes a CPU 11a and a memory 11b. The CPU 11a realizes various control functions by executing an operation program stored in the memory 11b. The memory 11b is a storage device that stores the programs executed by the CPU 11a and temporarily stores various information while a program is running.

The communication unit 12 provides an interface for connecting the karaoke main body 10 to a communication line via a router (not shown).

The storage unit 13 is a large-capacity storage device that stores various types of data, for example a hard disk drive. The storage unit 13 stores a plurality of pieces of music data used by the karaoke device 1 for karaoke performance.

Each piece of music data is assigned identification information (a music ID) that identifies the individual karaoke song. The music data includes accompaniment data, reference data, background image data, lyrics data, and attribute information. The accompaniment data is MIDI-format data from which the karaoke performance sound is produced. The reference data is used as the standard when scoring a singer's karaoke singing, and includes pitch data, note-length data, timing data, and the like. The background image data corresponds to the background image displayed on the display device 30 or the like during the karaoke performance. The lyrics data relates to the lyrics (lyric captions) displayed on the display device 30 or the like. The attribute information is information about the song, such as the song title, singer name, lyricist and composer names, and genre.
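The structure of the music data described above can be pictured as a simple record. The following is only an illustrative sketch: the class names, field names, types, and units are assumptions introduced here and are not specified by the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReferenceNote:
    """One entry of the reference data (encoding and units are assumed)."""
    pitch: int      # assumed: MIDI note number
    length: float   # assumed: note length in beats
    timing: float   # assumed: onset position in beats


@dataclass
class Attributes:
    """Attribute information about the song."""
    title: str
    singer: str
    lyricist_composer: str
    genre: str


@dataclass
class MusicData:
    """Hypothetical container mirroring the music data fields in the text."""
    music_id: str                    # identification information (music ID)
    accompaniment: bytes             # MIDI-format accompaniment data
    reference: List[ReferenceNote]   # standard used for scoring
    background_image: bytes          # background image data
    lyrics: List[str]                # lyrics, here one string per singing section
    attributes: Attributes


song_x = MusicData(
    music_id="X",
    accompaniment=b"",
    reference=[ReferenceNote(pitch=69, length=1.0, timing=0.0)],
    background_image=b"",
    lyrics=["you not"],
    attributes=Attributes("Song X", "Singer", "Writer", "Pop"),
)
```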

Under the control of the control unit 11, the acoustic processing unit 14 controls the performance of the karaoke song and processes the singing voice signal input through the microphone 40. The acoustic processing unit 14 includes, for example, a MIDI sound source, a mixer, and an amplifier (none of which are shown). The control unit 11 sequentially reads out the accompaniment data of the reserved song based on a tempo clock signal and inputs it to the MIDI sound source. The MIDI sound source generates a musical tone signal based on the accompaniment data. The mixer mixes the musical tone signal and the singing voice signal output from the microphone 40 at an appropriate ratio and outputs the result to the amplifier. The amplifier amplifies the mixed signal from the mixer and outputs it to the speaker 20 as a sound emission signal. As a result, the speaker 20 emits the karaoke performance sound and the singing voice from the microphone 40 based on the sound emission signal.

Under the control of the control unit 11, the display processing unit 15 performs processing related to the various displays on the display device 30. For example, the display processing unit 15 causes the display device 30 to display video in which lyrics and various icons are superimposed on the background image during a karaoke performance.

The operation unit 16 consists of panel switches, a remote control receiving circuit, and the like. In response to the singer's operation of the panel switches of the karaoke device 1 or of the remote control device 50, it outputs operation signals, such as a song selection signal or a performance stop signal, to the control unit 11. The control unit 11 detects the operation signal from the operation unit 16 and executes the corresponding process.

The remote control device 50 is a device for performing various operations on the karaoke main body 10. The singer can use the remote control device 50 to select (reserve) the karaoke song they wish to sing, among other operations.

(Karaoke main body software)
FIG. 2 is a diagram showing an example of the software configuration of the karaoke main body 10. The karaoke main body 10 includes a word-emotion type database 100, a tune-emotion type database 200, an emotion reference data acquisition unit 300, a singing emotion data generation unit 400, an emotion expression evaluation unit 500, a scoring processing unit 600, and a karaoke singing evaluation unit 700. The word-emotion type database 100 and the tune-emotion type database 200 are provided as part of the storage area of the storage unit 13. The emotion reference data acquisition unit 300, the singing emotion data generation unit 400, the emotion expression evaluation unit 500, the scoring processing unit 600, and the karaoke singing evaluation unit 700 are realized by the CPU 11a executing a program stored in the memory 11b.

[Word-emotion type database]
The word-emotion type database 100 associates emotion types with each of the words contained in the lyrics of various karaoke songs. Emotion types correspond to general emotional expressions such as joy, anger, sorrow, and pleasure. FIG. 3 shows part of the data stored in the word-emotion type database 100. In this example, three emotion types are set: "joy", "anger", and "sadness". Each word is given a predetermined score (in this example, five levels from 1 to 5 points) for each emotion type. For example, the word "not" is given the scores "joy: 1 point, anger: 3 points, sadness: 5 points".

[Tune-emotion type database]
The tune-emotion type database 200 associates the tunes used in karaoke songs with emotion types. FIG. 4 shows part of the data stored in the tune-emotion type database 200. In this example, as in FIG. 3, three emotion types are set: "joy", "anger", and "sadness". In the example of FIG. 4, tunes are classified by chord. Each chord is given a predetermined score (in this example, five levels from 1 to 5 points) for each emotion type. For example, a major chord is given the scores "joy: 5 points, anger: 1 point, sadness: 1 point". Here, "major" refers to major chords such as C, E, F, and G, and "minor" refers to minor chords such as Dm, Em, and Am.

It is sufficient that at least one emotion type is associated with each word or each tune. For example, only the single most dominant emotion type may be associated with a word (or tune), or conversely, four or more emotion types may be associated. It is also possible that the lyrics use words that are not registered in the word-emotion type database 100. The word-emotion type database 100 may therefore store scores such as "joy: 1 point, anger: 1 point, sadness: 1 point" under an "other" entry.
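The two databases and the "other" fallback can be sketched as plain lookup tables. This is a minimal illustration, not the stored format: the names `WORD_DB`, `TUNE_DB`, `OTHER`, `lookup_word`, `chord_quality`, and `lookup_chord` are hypothetical, the entries are limited to the values quoted in the text, and the chord-name parsing is a simplification that distinguishes only major and minor chords.

```python
from typing import Dict

Scores = Dict[str, int]

# Entries quoted in the text (FIG. 3 and FIG. 4); real tables would hold many more.
WORD_DB: Dict[str, Scores] = {
    "you": {"joy": 5, "anger": 3, "sadness": 5},
    "not": {"joy": 1, "anger": 3, "sadness": 5},
}
TUNE_DB: Dict[str, Scores] = {
    "major": {"joy": 5, "anger": 1, "sadness": 1},
    "minor": {"joy": 1, "anger": 3, "sadness": 5},
}
# Fallback for words that are not registered.
OTHER: Scores = {"joy": 1, "anger": 1, "sadness": 1}


def lookup_word(word: str) -> Scores:
    """Return the per-emotion scores of a word, or the 'other' entry."""
    return dict(WORD_DB.get(word, OTHER))


def chord_quality(chord: str) -> str:
    """Classify a chord name as 'major' or 'minor' (assumed naming scheme)."""
    suffix = chord[1:].lstrip("#b")  # drop the root letter and any accidental
    is_minor = suffix.startswith("m") and not suffix.startswith("maj")
    return "minor" if is_minor else "major"


def lookup_chord(chord: str) -> Scores:
    """Return the per-emotion scores of a chord via its quality."""
    return dict(TUNE_DB[chord_quality(chord)])
```

With these tables, `lookup_chord("Am")` resolves to the minor entry, and an unregistered word falls back to the uniform "other" scores.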

[Emotion reference data acquisition unit]
The emotion reference data acquisition unit 300 acquires, for each singing section, emotion reference data indicating the emotion to be expressed when singing a karaoke song.

Each singing section of a karaoke song has an atmosphere that suits how it should be sung. That is, when singing a karaoke song, there is an appropriate emotion to be expressed in each singing section. For example, for a karaoke song with a sad atmosphere, it is appropriate to express sadness. The emotion reference data is the data referred to when judging whether such emotional expression is performed appropriately. The emotion reference data is acquired based on at least one of the lyrics and the tune of the karaoke song.

FIG. 5 shows the lyrics and tune of each singing section included in a karaoke song X. The lyrics of each section consist of a plurality of words. The emotion reference data acquisition unit 300 acquires emotion reference data for each singing section shown in FIG. 5.

For example, the lyrics of singing section A contain two words, "you" and "not". The chord of singing section A is "Am" (see FIG. 5).

In this case, the emotion reference data acquisition unit 300 reads the score for each emotion type from the word-emotion type database 100 for the word "you" and for the word "not". Referring to the example of FIG. 3, the word "you" has the scores "joy: 5 points, anger: 3 points, sadness: 5 points", and the word "not" has the scores "joy: 1 point, anger: 3 points, sadness: 5 points".

The emotion reference data acquisition unit 300 adds up, for each emotion type, the scores read for the lyrics (the two words) of singing section A. In the example of singing section A, the result is "joy: 6 points, anger: 6 points, sadness: 10 points".

The emotion reference data acquisition unit 300 also reads the score for each emotion type for the chord "Am" from the tune-emotion type database 200. Referring to the example of FIG. 4, the chord "Am" is a minor chord, so its scores are "joy: 1 point, anger: 3 points, sadness: 5 points".

The emotion reference data acquisition unit 300 adds, for each emotion type, the score based on the lyrics of singing section A (joy: 6 points, anger: 6 points, sadness: 10 points) and the score based on the tune of singing section A (joy: 1 point, anger: 3 points, sadness: 5 points) to obtain the total score (joy: 7 points, anger: 9 points, sadness: 15 points). In the present embodiment, the score based on lyrics consisting of the plurality of words in one singing section and the score based on one tune are simply added, but the scores may instead be weighted before addition according to the length of the singing section, the number of words and tunes, and so on.
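The addition for singing section A can be reproduced with a few lines of Python. This is a sketch under assumptions: the function name `add_scores` and the optional `weights` parameter are hypothetical, and the weighting scheme shown (one multiplier per score set) is only one possible reading of the weighting the text allows.

```python
from typing import Dict, Optional, Sequence

Scores = Dict[str, float]


def add_scores(score_sets: Sequence[Scores],
               weights: Optional[Sequence[float]] = None) -> Scores:
    """Add several per-emotion score sets, optionally weighting each set."""
    if weights is None:
        weights = [1.0] * len(score_sets)  # simple addition, as in the embodiment
    total: Scores = {}
    for scores, weight in zip(score_sets, weights):
        for emotion, points in scores.items():
            total[emotion] = total.get(emotion, 0) + weight * points
    return total


# Values for singing section A quoted in the text.
word_you = {"joy": 5, "anger": 3, "sadness": 5}
word_not = {"joy": 1, "anger": 3, "sadness": 5}
chord_am = {"joy": 1, "anger": 3, "sadness": 5}

lyrics_score = add_scores([word_you, word_not])     # joy 6, anger 6, sadness 10
total_score = add_scores([lyrics_score, chord_am])  # joy 7, anger 9, sadness 15
```

Passing, say, `weights=[1.0, 2.0]` in the second call would let the tune count twice as much as the lyrics; the source does not fix any particular weights.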

The emotion reference data acquisition unit 300 obtains the total score for each emotion type in the same way for the remaining singing sections B to H (see FIG. 6A). The emotion reference data acquisition unit 300 then acquires the emotion reference data by calculating, for each singing section, the ratio of each emotion type's total score (see FIG. 6B). In this way, the emotion reference data can be acquired as a numerical value for each emotion type.
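The ratio step can be illustrated with the totals for singing section A. The sketch below assumes the ratios are expressed out of 100 and rounded to the nearest integer; the source does not state the rounding rule, and the function name `to_ratio` is hypothetical. Under that assumption the result agrees with the values 23, 29, and 48 that the evaluation example uses for section A.

```python
from typing import Dict


def to_ratio(total: Dict[str, float]) -> Dict[str, int]:
    """Convert per-emotion totals to shares of 100 (rounding is an assumption)."""
    grand_total = sum(total.values())
    if grand_total == 0:
        return {emotion: 0 for emotion in total}
    return {emotion: round(100 * points / grand_total)
            for emotion, points in total.items()}


section_a_total = {"joy": 7, "anger": 9, "sadness": 15}
section_a_reference = to_ratio(section_a_total)
```

Because each share is rounded independently, the shares of a section are not guaranteed to sum to exactly 100 for every input.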

The present embodiment describes an example in which the emotion reference data is acquired based on both the lyrics (words) and the tune, but it may be acquired based on only one of them (for example, words only). The emotion reference data also need not be a numerical value based on the ratio of emotion types. For example, for a given singing section, the emotion type with the highest score may be set to 100 points and the other emotion types to 0 points. In the example of singing section A in FIG. 6B, "sadness", which has the highest score, may be set to 100 points and "joy" and "anger" to 0 points. In addition, the present embodiment describes an example in which one singing section contains one chord (tune), but a singing section may contain a plurality of chords. In that case, the emotion reference data acquisition unit 300 may acquire, as the score of that singing section, the sum of the scores of the plurality of chords for each emotion type.
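The two variants in this paragraph, giving 100 points to the top emotion type only and summing the scores of several chords in one section, can be sketched as follows. The function names `winner_take_all` and `sum_chords` are hypothetical, and the tie-breaking behaviour (the first emotion type in dictionary order wins) is an assumption the source does not address.

```python
from typing import Dict, Sequence


def winner_take_all(scores: Dict[str, float]) -> Dict[str, int]:
    """Give 100 points to the highest-scoring emotion type and 0 to the rest."""
    top = max(scores, key=lambda emotion: scores[emotion])  # ties: first key wins
    return {emotion: (100 if emotion == top else 0) for emotion in scores}


def sum_chords(chord_scores: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Sum the per-emotion scores of every chord in one singing section."""
    total: Dict[str, float] = {}
    for scores in chord_scores:
        for emotion, points in scores.items():
            total[emotion] = total.get(emotion, 0) + points
    return total


# Section A ratios used in the evaluation example of the text.
one_hot_a = winner_take_all({"joy": 23, "anger": 29, "sadness": 48})

# A hypothetical section containing one minor and one major chord.
minor = {"joy": 1, "anger": 3, "sadness": 5}
major = {"joy": 5, "anger": 1, "sadness": 1}
two_chord_section = sum_chords([minor, major])
```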

[Singing emotion data generation unit]
The singing emotion data generation unit 400 generates, for each singing section, singing emotion data obtained by emotion analysis of the singing voice obtained by singing the karaoke song.

The singing emotion data indicates the emotional expression put into the actual karaoke singing. In the present embodiment, the singing emotion data is expressed numerically. It is obtained by performing emotion analysis on the singing voice. Known techniques can be used for the emotion analysis (for example, Patent Document 2 or the "Voice Heart Analysis Service" of Hitachi Systems, Ltd.). As a specific example, the singing emotion data generation unit 400 extracts from the singing voice an emotional feature amount correlated with how emotion is put into the singing. The singing emotion data generation unit 400 then analyzes the emotion put into the singing based on the extracted emotional feature amount. The emotion analysis is performed for each predetermined singing section. Finally, the singing emotion data generation unit 400 quantifies the information obtained by the emotion analysis and classifies it into predetermined emotion types, thereby generating singing emotion data for each singing section.

Alternatively, the singing emotion data can be generated using known artificial intelligence techniques. For example, there is a development support kit for PCs, "ST Emotion SDK" (AGI Co., Ltd.), whose voice emotion recognition engine ST detects emotional states such as joy, anger, and sadness, together with the intensity of excitement on a 10-level scale, from conversational speech. If the singing emotion data generation unit 400 has functions similar to such a kit and is trained by machine learning to perform emotion analysis of singing voices, it can detect the singer's emotion. In this case, the singing emotion data generation unit 400 can generate singing emotion data for each singing section by quantifying the detected information and classifying it into predetermined emotion types.

FIG. 7 shows the singing emotion data obtained when a singer sings the lyrics of the karaoke song X shown in FIG. 5. In this example, singing emotion data is generated by classifying the emotion analysis result of each singing section into three emotion types. In the example of FIG. 7, the scores are adjusted so that the total of the three emotion types is 100 points or less. For a given singing section, the singing emotion data may instead set the emotion type with the highest score to 100 points and the other emotion types to 0 points. It is sufficient that at least one emotion type is associated with each singing section. For example, only the single most dominant emotion type may be associated with a singing section, or conversely, four or more emotion types may be associated.

[Emotion expression evaluation unit]
The emotion expression evaluation unit 500 evaluates the emotional expression of each singing section by comparing the emotion reference data with the singing emotion data.

Emotional expression is evaluated by quantifying how much emotion is put into the karaoke singing. Specifically, the emotion expression evaluation unit 500 compares the emotion reference data with the singing emotion data for each emotion type, and evaluates the emotional expression using the sum of the scores obtained for each emotion type. For this purpose, the emotion reference data and the singing emotion data are preferably composed of the same set of emotion types.

For example, suppose that the emotion reference data acquisition unit 300 has acquired the emotion reference data shown in FIG. 6B and the singing emotion data generation unit 400 has generated the singing emotion data shown in FIG. 7.

In this case, for singing section A, the emotion expression evaluation unit 500 takes the AND of the emotion reference data score and the singing emotion data score for each emotion type. Specifically, the results are "joy: 23 points & 10 points = 10 points", "anger: 29 points & 0 points = 0 points", and "sadness: 48 points & 20 points = 20 points"; in each of these examples, the result equals the smaller of the two scores. The emotion expression evaluation unit 500 then takes the sum of the scores of the emotion types (10 points + 20 points = 30 points) as the evaluation of singing section A. The emotion expression evaluation unit 500 performs the same processing for singing sections B to H and evaluates the emotional expression of each singing section (see FIG. 8). Because the emotion reference data scores of a singing section total at most 100 points and the singing emotion data scores of a singing section also total at most 100 points, the maximum evaluation for each singing section is 100 points.
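The evaluation of singing section A can be sketched in code. The source calls the per-type operation an AND; the figures it gives (23 and 10 give 10, 29 and 0 give 0, 48 and 20 give 20) match taking the smaller of the two values, so this sketch assumes a per-type minimum. The function name `evaluate_section` is hypothetical.

```python
from typing import Dict


def evaluate_section(reference: Dict[str, float],
                     sung: Dict[str, float]) -> float:
    """Sum, over emotion types, the overlap between reference and sung scores.

    The overlap is taken as the smaller of the two scores (an assumption
    that reproduces the worked figures in the text).
    """
    return sum(min(points, sung.get(emotion, 0))
               for emotion, points in reference.items())


# Singing section A: reference data (FIG. 6B) and singing emotion data (FIG. 7).
reference_a = {"joy": 23, "anger": 29, "sadness": 48}
sung_a = {"joy": 10, "anger": 0, "sadness": 20}

score_a = evaluate_section(reference_a, sung_a)
```

Under this reading, a singer reaches the full score for a section only when the sung emotion distribution matches the reference distribution in every emotion type.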

The emotion expression evaluation unit 500 can present the evaluation result to the singer. For example, the emotion expression evaluation unit 500 can display the evaluation result shown in FIG. 8 on the display device 30. Alternatively, the emotion expression evaluation unit 500 can calculate the average of the scores of the singing sections (53.125 points in the example of FIG. 8) and present that average.

If, in the emotion reference data, the emotion type with the highest score is set to 100 points and the others to 0 points, and, in the singing emotion data, every emotion type other than the one with the highest score is set to 0 points, then the emotion expression evaluation unit 500 in effect evaluates whether the dominant emotion type matches in each singing section.

[Scoring processing unit]
The scoring processing unit 600 calculates a scoring value by scoring the singing voice data extracted from the singing voice based on at least one of pitch, volume, and singing technique.

Known techniques can be used for scoring karaoke singing. For example, the scoring processing unit 600 extracts singing voice data such as pitch data and volume data from the singing voice signal input from the microphone 40, and calculates the scoring value by comparing it with the reference data of the karaoke song.

[Karaoke singing evaluation unit]
The karaoke singing evaluation unit 700 evaluates the singing of the karaoke song based on the evaluation of emotional expression and the scoring value obtained from that singing.

本実施形態におけるカラオケ歌唱の評価は、リファレンスデータに沿ったカラオケ歌唱が行われたか、及び適切な感情表現が行われたかを総合的に評価することにより行う。 The evaluation of the karaoke singing in the present embodiment is performed by comprehensively evaluating whether the karaoke singing is performed according to the reference data and whether the appropriate emotional expression is performed.

たとえば、カラオケ楽曲Ｘ全体の感情表現評価部５００による各歌唱区間の感情表現の評価の平均値が６０点であり、採点処理部６００による採点結果が８５点であったとする。 For example, assume that, over the whole of karaoke song X, the average of the per-section emotional expression evaluations by the emotional expression evaluation unit 500 is 60 points, and that the scoring result by the scoring processing unit 600 is 85 points.

この場合、カラオケ歌唱評価部７００は、これらのスコアを用いてカラオケ楽曲Ｘの歌唱評価を行う。たとえば、「採点結果：感情表現の評価＝９：１」の重み付けが設定されている場合、カラオケ歌唱評価部７００は、（８５点×０．９）＋（６０点×０．１）＝８２．５点をカラオケ楽曲Ｘの歌唱評価のスコアとして算出する。 In this case, the karaoke singing evaluation unit 700 uses these scores to evaluate the singing of karaoke song X. For example, when a weighting of "scoring result : evaluation of emotional expression = 9 : 1" is set, the karaoke singing evaluation unit 700 calculates (85 points x 0.9) + (60 points x 0.1) = 82.5 points as the singing evaluation score of karaoke song X.

或いは、感情表現の評価を採点結果に対する１０％のボーナス点として加算する場合、カラオケ歌唱評価部７００は、８５点＋（６０点×０．１）＝９１点をカラオケ楽曲Ｘの歌唱評価のスコアとして算出する。なお、このように加算方式を採用する場合、上限は、カラオケ装置１の採点機能が備える最大値（たとえば１００点）とすることが好ましい。 Alternatively, when 10% of the emotional expression evaluation is added to the scoring result as bonus points, the karaoke singing evaluation unit 700 calculates 85 points + (60 points x 0.1) = 91 points as the singing evaluation score of karaoke song X. When this additive method is adopted, the upper limit is preferably the maximum value supported by the scoring function of the karaoke device 1 (for example, 100 points).
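
The two ways of combining the scores described above (the 9:1 weighting and the 10% bonus with an upper limit) can be sketched as follows. The function and parameter names are assumptions for illustration; only the numbers 85, 60, 9:1, 10%, and the 100-point cap come from the description.

```python
def weighted_total(scoring, emotion, scoring_weight=9, emotion_weight=1):
    """Combine the scoring value and the emotion evaluation by a weight ratio."""
    total_weight = scoring_weight + emotion_weight
    return (scoring * scoring_weight + emotion * emotion_weight) / total_weight


def bonus_total(scoring, emotion, bonus_percent=10, max_score=100):
    """Add a percentage of the emotion evaluation as a bonus, capped at max_score."""
    return min(max_score, scoring + emotion * bonus_percent / 100)


print(weighted_total(85, 60))  # 82.5
print(bonus_total(85, 60))     # 91.0
print(bonus_total(98, 60))     # 100 (98 + 6 would exceed the cap)
```

Passing the weights as a ratio rather than as fractions keeps the weighting example in the same 9:1 form used in the text.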

カラオケ歌唱評価部７００は、評価結果を歌唱者に提示することができる。たとえば、カラオケ歌唱評価部７００は、算出したスコアを表示装置３０に表示させることが可能である。 The karaoke singing evaluation unit 700 can present the evaluation result to the singer. For example, the karaoke singing evaluation unit 700 can display the calculated score on the display device 30.

＝＝カラオケ装置１の動作について＝＝ == Operation of the karaoke device 1 ==
次に、図９を参照して本実施形態におけるカラオケ装置１の動作の具体例について述べる。図９は、カラオケ装置１の動作例を示すフローチャートである。 Next, a specific example of the operation of the karaoke device 1 in the present embodiment will be described with reference to FIG. 9. FIG. 9 is a flowchart showing an operation example of the karaoke device 1.

カラオケ装置１は、歌唱者が選曲したカラオケ楽曲Ｘの伴奏データに基づいてカラオケ演奏を行う（カラオケ演奏。ステップ１０）。歌唱者はカラオケ演奏に合わせてカラオケ歌唱を行う。 The karaoke device 1 performs a karaoke performance based on the accompaniment data of the karaoke music X selected by the singer (Karaoke performance, step 10). The singer sings karaoke along with the karaoke performance.

カラオケ楽曲Ｘのカラオケ歌唱が終了した後、感情リファレンスデータ取得部３００は、カラオケ楽曲Ｘを歌唱する際に表現すべき感情を示す感情リファレンスデータを歌唱区間毎に取得する（感情リファレンスデータの取得。ステップ１１）。 After the karaoke singing of the karaoke song X is completed, the emotion reference data acquisition unit 300 acquires the emotion reference data indicating the emotion to be expressed when singing the karaoke song X for each singing section (acquisition of the emotion reference data). Step 11).

歌唱感情データ生成部４００は、カラオケ楽曲Ｘを歌唱することにより得られた歌唱音声を感情分析し、歌唱区間毎に歌唱感情データを生成する（歌唱感情データの生成。ステップ１２）。 The singing emotion data generation unit 400 performs emotion analysis on the singing voice obtained by singing karaoke song X, and generates singing emotion data for each singing section (generation of singing emotion data, step 12).

感情表現評価部５００は、ステップ１１で取得した感情リファレンスデータと、ステップ１２で生成した歌唱感情データとを比較することにより、歌唱区間毎の感情表現の評価を行う（感情表現の評価。ステップ１３）。 The emotional expression evaluation unit 500 evaluates the emotional expression for each singing section by comparing the emotion reference data acquired in step 11 with the singing emotion data generated in step 12 (evaluation of emotional expression, step 13).

採点処理部６００は、カラオケ楽曲Ｘを歌唱することにより得られた歌唱音声から抽出した歌唱音声データをリファレンスデータと比較することで、採点値を算出する（採点値の算出。ステップ１４）。 The scoring processing unit 600 calculates the scoring value by comparing the singing voice data extracted from the singing voice obtained by singing the karaoke music X with the reference data (calculation of the scoring value, step 14).

カラオケ歌唱評価部７００は、ステップ１３で行われた感情表現の評価、及びステップ１４で算出された採点値に基づいて、カラオケ楽曲Ｘの歌唱評価を行う（カラオケ歌唱の評価。ステップ１５）。 The karaoke singing evaluation unit 700 evaluates the karaoke song X based on the evaluation of the emotional expression performed in step 13 and the scoring value calculated in step 14 (evaluation of karaoke singing, step 15).

なお、上記例では、カラオケ楽曲Ｘのカラオケ歌唱が終了した後に処理する例について述べたが、感情表現の評価は、少なくとも一の歌唱区間のカラオケ歌唱が終了した後に行われることでもよい。たとえば、歌唱区間Ａの歌唱終了後に上記ステップ１１〜１４の処理を行い、歌唱区間Ｂの歌唱終了後に上記ステップ１１〜１４の処理を行い、歌唱区間Ｃの歌唱終了後に・・・・というように、カラオケ楽曲Ｘの終了までステップ１１〜１４の処理を繰り返し行い、最後に各歌唱区間における感情表現の評価と採点値に基づいて、カラオケ楽曲Ｘの歌唱評価を行ってもよい。 The above example describes processing performed after the karaoke singing of karaoke song X has ended, but the emotional expression may instead be evaluated after the karaoke singing of at least one singing section has ended. For example, steps 11 to 14 may be performed after the singing of singing section A ends, again after the singing of singing section B ends, again after the singing of singing section C ends, and so on, repeating steps 11 to 14 until the end of karaoke song X. Finally, the singing of karaoke song X may be evaluated based on the emotional expression evaluation and the scoring value of each singing section.
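
The per-section variant of steps 11 to 15 described above can be sketched as follows. This is only an outline under stated assumptions: the callables passed in stand for the units 300 to 700, their names and signatures are hypothetical, and averaging the per-section values before the final evaluation is one possible choice that the description does not prescribe.

```python
from statistics import mean


def evaluate_song(sections, get_reference, analyze_emotion, compare,
                  score_section, combine):
    """Run steps 11 to 14 once per singing section, then step 15 once at the end."""
    emotion_scores = []
    scoring_values = []
    for section in sections:
        reference = get_reference(section)                   # step 11
        singing = analyze_emotion(section)                   # step 12
        emotion_scores.append(compare(reference, singing))   # step 13
        scoring_values.append(score_section(section))        # step 14
    # Step 15: overall evaluation from the averaged per-section results (assumed).
    return combine(mean(scoring_values), mean(emotion_scores))


# Stub callables with fixed results, used only to show the call shape.
total = evaluate_song(
    ["A", "B", "C"],
    get_reference=lambda s: {"joy": 100},
    analyze_emotion=lambda s: {"joy": 100},
    compare=lambda ref, sung: 60,
    score_section=lambda s: 85,
    combine=lambda scoring, emotion: (scoring * 9 + emotion) / 10,
)
print(total)
```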

また、感情リファレンスデータは、カラオケ楽曲の歌詞や曲調に基づいて生成したものを記憶部１３に記憶しておくことでもよい。この場合、感情リファレンスデータ取得部３００は、記憶部１３から感情リファレンスデータを直接取得する。 Further, emotion reference data generated based on the lyrics and tone of the karaoke song may be stored in the storage unit 13 in advance. In this case, the emotion reference data acquisition unit 300 acquires the emotion reference data directly from the storage unit 13.

このように本実施形態に係るカラオケ装置１は、カラオケ楽曲を歌唱する際に表現すべき感情を示す感情リファレンスデータを歌唱区間毎に取得する感情リファレンスデータ取得部３００と、カラオケ楽曲を歌唱することにより得られた歌唱音声を感情分析した歌唱感情データを歌唱区間毎に生成する歌唱感情データ生成部４００と、感情リファレンスデータと歌唱感情データとを比較することにより、歌唱区間毎の感情表現の評価を行う感情表現評価部５００とを有する。 As described above, the karaoke device 1 according to the present embodiment has: an emotion reference data acquisition unit 300 that acquires, for each singing section, emotion reference data indicating the emotion to be expressed when singing a karaoke song; a singing emotion data generation unit 400 that generates, for each singing section, singing emotion data by performing emotion analysis on the singing voice obtained by singing the karaoke song; and an emotional expression evaluation unit 500 that evaluates the emotional expression for each singing section by comparing the emotion reference data with the singing emotion data.

歌唱音声を感情分析した歌唱感情データと感情リファレンスデータとを比較することにより、カラオケ楽曲の雰囲気を考慮した感情表現がなされている場合には評価が高くなる。一方、単に大きな声で歌唱した場合のように、雰囲気を考慮しない歌唱は評価が低くなる。すなわち、本実施形態に係るカラオケ装置１によれば、カラオケ楽曲の雰囲気にふさわしい感情表現で歌唱が行われているかを適切に評価できる。 By comparing the singing emotion data obtained by emotionally analyzing the singing voice with the emotion reference data, the evaluation is high when the emotion expression considering the atmosphere of the karaoke song is made. On the other hand, singing that does not consider the atmosphere, such as when singing in a loud voice, has a low evaluation. That is, according to the karaoke device 1 according to the present embodiment, it is possible to appropriately evaluate whether or not the singing is performed with an emotional expression suitable for the atmosphere of the karaoke music.

また、本実施形態に係るカラオケ装置１において、感情リファレンスデータ及び歌唱感情データは、共通する複数の感情タイプで構成され、感情表現評価部５００は、感情タイプ毎に感情リファレンスデータと歌唱感情データとの比較を行う。このように、複数の感情タイプ毎に感情リファレンスデータと歌唱感情データとの比較を行うことにより、カラオケ楽曲の雰囲気にふさわしい感情表現がなされているかどうかをより正確に判断できる。 Further, in the karaoke device 1 according to the present embodiment, the emotion reference data and the singing emotion data are composed of a plurality of common emotion types, and the emotional expression evaluation unit 500 compares the emotion reference data with the singing emotion data for each emotion type. Comparing the two data sets for each of the plurality of emotion types makes it possible to determine more accurately whether the emotional expression suits the atmosphere of the karaoke song.

また、本実施形態に係るカラオケ装置１は、歌唱音声から抽出した歌唱音声データを、音高、音量及び歌唱技法の少なくとも一つに基づいて採点することにより採点値を算出する採点処理部６００と、カラオケ楽曲の歌唱に基づく感情表現の評価及び採点値に基づいて、カラオケ楽曲の歌唱の評価を行うカラオケ歌唱評価部７００と、を有する。このようなカラオケ装置１によれば、感情表現の評価を含むカラオケ歌唱の総合評価を行うことができる。 Further, the karaoke device 1 according to the present embodiment includes a scoring processing unit 600 that calculates a scoring value by scoring the singing voice data extracted from the singing voice based on at least one of pitch, volume and singing technique. It also has a karaoke singing evaluation unit 700 that evaluates the singing of karaoke songs based on the evaluation of emotional expressions based on the singing of karaoke songs and the scoring value. According to such a karaoke device 1, it is possible to perform a comprehensive evaluation of karaoke singing including an evaluation of emotional expression.

また、感情リファレンスデータ取得部３００は、カラオケ楽曲の歌詞及び曲調の少なくとも一方に基づいて、感情リファレンスデータを取得する。このように、カラオケ楽曲の歌詞及び曲調の少なくとも一方を用いることにより、カラオケ楽曲の雰囲気を反映した感情リファレンスデータの取得が可能となる。 Further, the emotion reference data acquisition unit 300 acquires emotion reference data based on at least one of the lyrics and the tone of the karaoke song. In this way, by using at least one of the lyrics and the tone of the karaoke music, it is possible to acquire the emotion reference data that reflects the atmosphere of the karaoke music.

＜変形例＞ <Modification>
上記実施形態では、感情リファレンスデータ取得部３００が、カラオケ楽曲の歌詞及び曲調に基づいて、感情リファレンスデータを取得する例について述べた。 The above embodiment described an example in which the emotion reference data acquisition unit 300 acquires emotion reference data based on the lyrics and tone of the karaoke song.

図１０は、本変形例に係るカラオケ本体１０のソフトウェア構成例を示す図である。カラオケ本体１０は、感情リファレンスデータ取得部３００、歌唱感情データ生成部４００、感情表現評価部５００、採点処理部６００、カラオケ歌唱評価部７００、及び感情リファレンスデータ記憶部８００を備える。感情リファレンスデータ記憶部８００は、記憶部１３の記憶領域の一部として提供される。 FIG. 10 is a diagram showing a software configuration example of the karaoke main body 10 according to this modification. The karaoke body 10 includes an emotion reference data acquisition unit 300, a singing emotion data generation unit 400, an emotion expression evaluation unit 500, a scoring processing unit 600, a karaoke singing evaluation unit 700, and an emotion reference data storage unit 800. The emotion reference data storage unit 800 is provided as a part of the storage area of the storage unit 13.

［感情リファレンスデータ記憶部］ [Emotion reference data storage unit]
感情リファレンスデータ記憶部８００は、カラオケ楽曲の原曲歌手の歌唱音声を感情分析した歌唱感情データに基づく感情リファレンスデータを記憶する。原曲歌手は、カラオケ楽曲を歌唱するプロの歌手等であり、カラオケ楽曲の歌唱を最も上手く歌える者である。すなわち、各歌唱区間における感情表現も正確に再現することができる。 The emotion reference data storage unit 800 stores emotion reference data based on singing emotion data obtained by performing emotion analysis on the singing voice of the original singer of the karaoke song. The original singer is, for example, the professional singer of the karaoke song, and is the person who can sing it best. That is, the original singer can also accurately reproduce the emotional expression in each singing section.

感情リファレンスデータの記憶は、たとえば新たなカラオケ楽曲が配信される都度行われる。具体的に、歌唱感情データ生成部４００は、新たなカラオケ楽曲の原曲歌手の歌唱分析を行い、複数の感情タイプからなる歌唱感情データを生成する。歌唱感情データ生成部４００は、歌唱感情データを元に、複数の感情タイプの比率を１００点満点で正規化し、感情リファレンスデータとして記憶する。図１１は、上記実施形態における歌唱区間Ａ〜Ｈの感情リファレンスデータの例を示した図である。図１１から明らかなように、いずれの歌唱区間においても感情タイプの合計スコアが１００点満点になるように設定されている。 The emotion reference data is stored, for example, each time a new karaoke song is distributed. Specifically, the singing emotion data generation unit 400 analyzes the singing of the original singer of the new karaoke song and generates singing emotion data composed of a plurality of emotion types. Based on this singing emotion data, the singing emotion data generation unit 400 normalizes the ratio of the emotion types so that they total 100 points, and stores the result as emotion reference data. FIG. 11 is a diagram showing an example of emotion reference data for the singing sections A to H in the above embodiment. As is clear from FIG. 11, the scores of the emotion types are set so that they total 100 points in every singing section.
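
The normalization to a 100-point total described above can be sketched as follows. The function name, the emotion type names, and the raw values are assumptions for illustration and do not reproduce FIG. 11; proportional scaling is one straightforward way to make the per-section scores total 100.

```python
def normalize_to_100(raw_scores):
    """Scale raw emotion-type values proportionally so that they total 100 points."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("the raw emotion scores must have a positive total")
    return {emotion: 100 * value / total for emotion, value in raw_scores.items()}


# Hypothetical raw analysis output for one singing section of the original singer.
raw = {"joy": 3, "sadness": 1}
print(normalize_to_100(raw))  # {'joy': 75.0, 'sadness': 25.0}
```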

［感情リファレンスデータ取得部］ [Emotion reference data acquisition unit]
本変形例に係る感情リファレンスデータ取得部３００は、カラオケ楽曲を歌唱する際に表現すべき感情を示す感情リファレンスデータを歌唱区間毎に取得する。 The emotion reference data acquisition unit 300 according to this modification acquires, for each singing section, emotion reference data indicating the emotion to be expressed when singing a karaoke song.

感情リファレンスデータ取得部３００は、感情リファレンスデータ記憶部８００から感情リファレンスデータを取得する。たとえば、カラオケ楽曲Ｘが選曲された場合、感情リファレンスデータ取得部３００は、感情リファレンスデータ記憶部８００からカラオケ楽曲Ｘの感情リファレンスデータを読み出す。 The emotion reference data acquisition unit 300 acquires emotion reference data from the emotion reference data storage unit 800. For example, when the karaoke music X is selected, the emotion reference data acquisition unit 300 reads the emotion reference data of the karaoke music X from the emotion reference data storage unit 800.

このように、本変形例に係るカラオケ装置１によれば、感情表現の評価を行う都度、感情リファレンスデータを算出する必要が無い。また、原曲歌手の歌唱音声から感情リファレンスデータを作成することで、原曲歌手の歌唱に沿った適切な感情表現の評価が可能となる。 As described above, the karaoke device 1 according to this modification does not need to calculate the emotion reference data each time the emotional expression is evaluated. In addition, creating the emotion reference data from the singing voice of the original singer makes it possible to evaluate the emotional expression appropriately, in line with the original singer's performance.

＜その他＞ <Others>
上記実施形態は、例として提示したものであり、発明の範囲を限定するものではない。上記の構成は、適宜組み合わせて実施することが可能であり、発明の要旨を逸脱しない範囲で、種々の省略、置き換え、変更を行うことができる。上記実施形態やその変形は、発明の範囲や要旨に含まれると同様に、特許請求の範囲に記載された発明とその均等の範囲に含まれる。 The above embodiment is presented as an example and does not limit the scope of the invention. The above configurations can be implemented in appropriate combinations, and various omissions, replacements, and changes can be made without departing from the gist of the invention. The above embodiment and its modifications are included in the scope and gist of the invention, and likewise in the invention described in the claims and its equivalents.

１カラオケ装置 1 Karaoke device
１００単語−感情タイプデータベース 100 Word-emotion type database
２００曲調−感情タイプデータベース 200 Tone-emotion type database
３００感情リファレンスデータ取得部 300 Emotion reference data acquisition unit
４００歌唱感情データ生成部 400 Singing emotion data generation unit
５００感情表現評価部 500 Emotional expression evaluation unit
６００採点処理部 600 Scoring processing unit
７００カラオケ歌唱評価部 700 Karaoke singing evaluation unit

Claims

A karaoke device comprising:
an emotion reference data acquisition unit that acquires, for each singing section, emotion reference data indicating the emotion to be expressed when singing a karaoke song;
a singing emotion data generation unit that generates, for each singing section, singing emotion data by performing emotion analysis on the singing voice obtained by singing the karaoke song; and
an emotional expression evaluation unit that evaluates the emotional expression for each singing section by comparing the emotion reference data with the singing emotion data,
wherein the emotion reference data indicates emotions corresponding to the types of singing emotion data that the singing emotion data generation unit can generate,
the emotion reference data and the singing emotion data are composed of a plurality of common emotion types, and
the emotional expression evaluation unit compares the emotion reference data with the singing emotion data for each emotion type.

The karaoke device according to claim 1, further comprising:
a scoring processing unit that calculates a scoring value by scoring singing voice data extracted from the singing voice based on at least one of pitch, volume, and singing technique; and
a karaoke singing evaluation unit that evaluates the singing of the karaoke song based on the scoring value and on the evaluation of the emotional expression in the singing of the karaoke song.

The karaoke device according to claim 1 or 2, wherein the emotion reference data acquisition unit acquires the emotion reference data based on at least one of the lyrics and the tone of the karaoke song.

The karaoke device according to claim 1 or 2, further comprising an emotion reference data storage unit that stores emotion reference data based on singing emotion data obtained by performing emotion analysis on the singing voice of the original singer of the karaoke song,
wherein the emotion reference data acquisition unit acquires the emotion reference data from the emotion reference data storage unit.