JP2019090936A

JP2019090936A - Singing support device and karaoke device

Info

Publication number: JP2019090936A
Application number: JP2017219874A
Authority: JP
Inventors: Keiichi Tokuda; Keiichiro Oura; Kazuhiro Nakamura
Original assignee: Techno Speech Inc
Current assignee: Techno Speech Inc
Priority date: 2017-11-15
Filing date: 2017-11-15
Publication date: 2019-06-13
Anticipated expiration: 2037-11-15
Also published as: JP6399715B1

Abstract

To provide a technique capable of supporting singing without interrupting the singing of a singer. SOLUTION: A singing support device for supporting the singing of a singer comprises: a guide melody synthesis section that generates, by voice synthesis, two or more types of guide melodies which correspond to the music to be sung and guide the singing of the singer; and a guide melody output section that switches between the two or more types of guide melodies and outputs them in time with the music. SELECTED DRAWING: Figure 3

Description

The present invention relates to technology for a singing support device.

Among conventional networked karaoke devices, singing support devices that support a singer with a guide melody are known. A guide melody is a main melody or a model vocal voice that is played to guide the singing when the singer cannot sing along with the performance of a karaoke song (see, for example, Patent Document 1 and Patent Document 2).

Patent Document 1: JP 2005-148627 A
Patent Document 2: JP 2005-049410 A

However, if an ordinary guide vocal is played over a part that the singer is able to sing, it overlaps with the singer's own singing, so the singer may be distracted and find it hard to sing, and the audience may find it hard to listen. A technique that can support singing without disturbing the singer's singing has therefore been desired.

The present invention has been made to solve the above problem and can be realized in the following aspects.

(1) According to one aspect of the present invention, a singing support device that supports a singer's singing is provided. The singing support device includes: a guide melody synthesis unit that generates, by speech synthesis, two or more types of guide melodies that correspond to the song being sung and guide the singer's singing; and a guide melody output unit that switches between the two or more types of guide melodies and outputs them in time with the song. Because this singing support device can switch between multiple types of guide melodies, it can support the singer in a form suited to the singing, for example with a melody sound that does not overlap with the singer's voice. Singing can therefore be supported without disturbing the singer's singing.

(2) In the singing support device of the above aspect, the guide melody synthesis unit may synthesize the guide melodies using an acoustic model whose acoustic parameters have been learned by a statistical method. This singing support device can generate guide melodies by speech synthesis from a small amount of data.

(3) In the singing support device of the above aspect, the guide melody synthesis unit may generate, by speech synthesis using the acoustic model, at least a melody sound without lyrics and a vocal sound with lyrics as the two or more types of guide melodies. With this singing support device, the melody sound without lyrics supports the singer without overlapping with the singer's singing. Singing can therefore be supported without disturbing the singer's singing.

(4) In the singing support device of the above aspect, the melody sound may be synthesized based on a first acoustic model obtained by learning singing sounds of the song that carry no lyrics, such as humming or scat, and the vocal sound may be synthesized based on a second acoustic model obtained by learning singing sounds that use the lyrics of the song. This singing support device can easily generate two or more types of guide melodies by speech synthesis.

(5) The singing support device of the above aspect may further include a singing situation determination unit that determines the degree of the singer's singing, and the guide melody output unit may switch the guide melody according to the determined degree of singing. Because this singing support device can switch the guide melody according to the singing situation, it can support the singer with a melody sound more effectively without overlapping with the singer's singing.

The present invention can be realized in various forms, for example as a karaoke device or karaoke system equipped with the singing support device of this aspect, a singing support method, or a computer program. The computer program may be recorded on a non-transitory tangible computer-readable recording medium.

An explanatory diagram showing an outline of a karaoke device that includes the singing support device.
A diagram showing the various acoustic parameters included in the acoustic model.
A flowchart showing the guide melody output process.
An explanatory diagram showing an outline of a karaoke device that includes the singing support device of the second embodiment.
A diagram defining the relationship among average volume, singing likelihood, average pitch difference, and guide melody.

A. First embodiment:
FIG. 1 is an explanatory diagram showing an outline of a karaoke device 200 that includes a singing support device 100 according to one embodiment of the present invention. The karaoke device 200 includes the singing support device 100, a music playback device 105, and a mixing amplifier 50. The singing support device 100 is the part of the karaoke device 200 that supports the singer's singing with a guide melody. In this embodiment, the singing support device 100 includes a guide melody synthesis unit 10, a singing situation determination unit 20, and a guide melody output unit 30.

The guide melody synthesis unit 10 generates two or more types of guide melodies for each song by speech synthesis. A "guide melody" is a main melody or a model vocal voice for guiding the singer's singing. In this embodiment, the guide melody synthesis unit 10 performs speech synthesis using an acoustic model whose acoustic parameters have been learned by a statistical method. More specifically, it generates the guide melodies using a hidden Markov model (hereinafter also "HMM") or a deep neural network (hereinafter also "DNN"). The acoustic parameters used to train the acoustic model are described in detail later.

The singing situation determination unit 20 determines the degree of the singer's singing (the singing voice) input from a voice input unit 40 (for example, a microphone). The degree of singing means the extent to which the singer is able to sing the song. In this embodiment, the singing situation determination unit 20 makes this determination based on, for example, the volume, pitch, and lyrics of the input singing.

The guide melody output unit 30 switches among the plurality of guide melodies generated by the guide melody synthesis unit 10 according to the degree of singing determined by the singing situation determination unit 20, and outputs the selected guide melody to a speaker 60 via the mixing amplifier 50.

The music playback device 105 is the part of the karaoke device 200 that plays the music. The music playback device 105 includes a storage unit 15, a control unit 25, and a music output unit 35. The music playback device 105 is operated from an operation unit 70 (for example, a remote control), through which the song to be played is selected.

The storage unit 15 acquires and stores, from a network NW, the data of the songs to be played, for example accompaniment performance data created in accordance with the MIDI standard (hereinafter "MIDI data"). The control unit 25 controls the music output unit 35 in response to operation instructions input from the operation unit 70. For example, it controls the order in which songs are played and handles fast-forwarding and pausing of a song. The storage unit 15 may also hold some or all of the MIDI data in advance instead of acquiring it from the network NW.

The music output unit 35 acquires the MIDI data from the storage unit 15, decodes it to generate the karaoke music to be played, and outputs the karaoke music to the speaker 60 via the mixing amplifier 50. The mixing amplifier 50 mixes the guide melody, the singer's singing input from the voice input unit 40, and the karaoke music, and outputs the result from the speaker 60.

FIG. 2 is a diagram showing the various acoustic parameters modeled by the acoustic model. The fundamental frequency is generally handled as the log fundamental frequency p_t, and its related parameters include the voiced/unvoiced distinction and the first derivative (Δp_t) and second derivative (Δ²p_t) of the log fundamental frequency. These are sometimes called sound source information. Unvoiced segments have no value of the log fundamental frequency p_t, so voiced and unvoiced segments are distinguished by, for example, filling the unvoiced segments with a predetermined constant. The spectral parameters include the mel-cepstrum c_t and its first derivative (Δc_t) and second derivative (Δ²c_t). These are sometimes called spectral information. In addition to the sound source information and spectral information, this embodiment also handles singing expression information.
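The handling of unvoiced segments described above can be sketched as follows. This is only an illustration: the sentinel value `UNVOICED`, the function names, and the convention that an input of 0 Hz marks an unvoiced frame are assumptions made for this example and are not specified in the description.

```python
import math

# Hypothetical sentinel constant written into unvoiced frames (an assumption).
UNVOICED = -1.0e10


def to_log_f0(f0_hz):
    """Convert per-frame F0 in Hz to log F0; frames with F0 <= 0 are unvoiced."""
    return [math.log(f) if f > 0 else UNVOICED for f in f0_hz]


def voiced_flags(log_f0):
    """Recover the voiced/unvoiced decision from the sentinel constant."""
    return [value != UNVOICED for value in log_f0]


f0 = [0.0, 220.0, 233.1, 0.0]   # two unvoiced frames around two voiced frames
log_f0 = to_log_f0(f0)
flags = voiced_flags(log_f0)
```

Storing a constant in the unvoiced frames keeps the parameter sequence the same length as the frame sequence while still letting later stages tell the two kinds of frame apart.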

The singing expression information includes the period V1f_t and amplitude V1a_t of the pitch vibrato and the period V2f_t and amplitude V2a_t of the loudness vibrato, each modeled per phoneme. Each of these four quantities also has a corresponding first derivative (Δ) and second derivative (Δ²), but for convenience of illustration they are omitted from FIG. 2. Among the above parameters, the first and second derivatives of each parameter, beginning with the mel-cepstrum c_t, are used to take temporal variation into account. Taking these dynamic features into account makes the transitions between sounds smooth when the singing voice is synthesized. A description of speech synthesis methods that use dynamic features is omitted.
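As an illustration of the first and second derivatives mentioned above, the following sketch computes them for a one-dimensional parameter track. The window coefficients (a centered difference for Δ and a three-point second difference for Δ²) and the edge handling by repeating the end frames are assumptions chosen for this example; the description does not state which windows are used.

```python
def delta(x):
    """First-order dynamic feature: 0.5 * (x[t+1] - x[t-1]), edges repeated."""
    n = len(x)
    return [0.5 * (x[min(t + 1, n - 1)] - x[max(t - 1, 0)]) for t in range(n)]


def delta2(x):
    """Second-order dynamic feature: x[t-1] - 2*x[t] + x[t+1], edges repeated."""
    n = len(x)
    return [x[max(t - 1, 0)] - 2.0 * x[t] + x[min(t + 1, n - 1)] for t in range(n)]


track = [1.0, 2.0, 4.0, 7.0]   # e.g. one mel-cepstral coefficient over four frames
d1 = delta(track)
d2 = delta2(track)
```

The static values and both derivative tracks would then be modeled together, so that the generated parameter sequence is constrained to change smoothly from frame to frame.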

When a DNN is used as the acoustic model, the spectrum may be modeled instead of the mel-cepstrum c_t, or the speech waveform may be modeled directly in place of the above acoustic parameters.

In this embodiment, the acoustic model includes a first acoustic model obtained by learning singing sounds without lyrics, such as humming or scat, and a second acoustic model obtained by learning singing sounds that use lyrics.

FIG. 3 is a flowchart showing the guide melody output process in this embodiment. The guide melody output process starts at the same time as playback of a karaoke song and switches the guide melody according to the degree of singing. For convenience of illustration, the guide melody is abbreviated as "guide" in the figure. The guide melody output unit 30 first sets the guide melody to be output to "melody sound" (step S100). A "melody sound" is a sound consisting only of the melody, without lyrics. For the melody sound, an instrument sound or a voice such as humming is used, for example.

Next, the singing situation determination unit 20 determines whether the part of the song currently being played is a singing section (step S110). If it is not a singing section (step S110: NO), there is no need to switch the guide according to the degree of singing, so the singing situation determination unit 20 determines whether the song has ended (step S210).

If the part being played is a singing section (step S110: YES), the singing situation determination unit 20 acquires the singer's singing from the voice input unit 40 (step S120). The singing situation determination unit 20 then calculates the average volume of the acquired singing (step S130). The average volume can be obtained, for example, as the average of the singing volume over a section of predetermined length.
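One possible way to compute such an average volume is sketched below. The use of the RMS level in decibels relative to full scale, the sample range of -1 to 1, and the `floor_db` value returned for silence are all assumptions made for this example; the description only requires an average over a section of predetermined length.

```python
import math


def average_volume_db(samples, floor_db=-120.0):
    """RMS level in dBFS of one analysis window of samples in the range [-1, 1]."""
    if not samples:
        return floor_db
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # A silent window has no finite level in dB, so report the floor instead.
    return 20.0 * math.log10(rms) if rms > 0 else floor_db


loud = average_volume_db([0.5, -0.5, 0.5, -0.5])
silent = average_volume_db([0.0, 0.0, 0.0, 0.0])
```

The resulting value would be compared with the volume threshold in step S140.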

Next, the singing situation determination unit 20 determines whether the calculated average volume is at or above a predetermined threshold (step S140). The threshold can be set arbitrarily. If the average volume is below the threshold (step S140: NO), in other words if the singer is judged not to be singing, the guide melody output unit 30 sets the guide melody to be output to "vocal sound" (step S165) and proceeds to step S210. A "vocal sound" is a singing voice with lyrics. In this embodiment, the vocal-sound guide melody is generated by speech synthesis using the second acoustic model described above.

If the average volume is at or above the threshold (step S140: YES), in other words if the singer is judged to be singing, the singing situation determination unit 20 calculates the singing likelihood of the singing acquired in step S120 (step S150). The singing likelihood indicates how plausible the singing is over a section of predetermined length, more specifically how correct the lyrics and their note assignment (the timing at which the lyrics are pronounced) are. The singing likelihood is higher when the singer sings as written in the score.

Because the lyrics are known in advance, their correctness in the singing likelihood calculation may be judged using speech recognition, for example, or by checking whether the phonemes are correct. The correctness of the note assignment may be judged, for example, by obtaining a phoneme alignment that indicates from which time to which time each phoneme of the singing voice extends, and checking how well it matches the note boundaries in the score.
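The following sketch shows one simple score that combines the two checks described above. It is not a statistical likelihood and is not taken from the description: the function name, the data layout, the tolerance `tolerance_s`, and the scoring rule (the fraction of expected phonemes that are both correct and on time) are assumptions for illustration. The phoneme alignment itself is assumed to have been produced beforehand by some aligner that is not shown.

```python
def singing_likelihood(expected, aligned, tolerance_s=0.15):
    """Fraction of expected phonemes sung with the right label near the note onset.

    expected: list of (phoneme, note_onset_seconds) taken from the score.
    aligned:  list of (phoneme, onset_seconds, end_seconds) from the singer,
              in the same order as `expected`.
    """
    if not expected:
        return 0.0
    hits = 0
    for (exp_phoneme, note_onset), (phoneme, onset, _end) in zip(expected, aligned):
        label_ok = phoneme == exp_phoneme                     # lyric check
        timing_ok = abs(onset - note_onset) <= tolerance_s    # note-boundary check
        if label_ok and timing_ok:
            hits += 1
    return hits / len(expected)


expected = [("k", 0.0), ("a", 0.1), ("r", 0.5), ("a", 0.6)]
aligned = [("k", 0.02, 0.1), ("a", 0.12, 0.5), ("r", 0.9, 1.0), ("o", 0.62, 0.9)]
score = singing_likelihood(expected, aligned)
```

In this example the third phoneme is late and the fourth is the wrong phoneme, so half of the expected phonemes count as correct.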

Next, the singing situation determination unit 20 determines whether the calculated singing likelihood is at or above a predetermined threshold (step S160). The threshold can be set arbitrarily. If the singing likelihood is below the threshold (step S160: NO), in other words if the singer is judged not to be singing properly, the guide melody output unit 30 sets the guide melody to be output to "vocal sound" (step S165) and proceeds to step S210. The "vocal sound" set in step S165 immediately after step S140 and the "vocal sound" set in step S165 immediately after step S160 need not be the same guide melody; for example, they may differ in volume.

If the singing likelihood is at or above the threshold (step S160: YES), in other words if the singer is judged to be singing properly, the singing situation determination unit 20 calculates the average pitch difference of the singing acquired in step S120 (step S170). The average pitch difference can be obtained, for example, as the average of the absolute differences between the pitch of the singing and the pitch in the score over a section of predetermined length. The pitch can be obtained, for example, by extracting the fundamental frequency. Instead of the average pitch difference, the number of times the singing is off by a semitone or more within a section of predetermined length may be used, for example.
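Both measures described above can be sketched as follows. Measuring the difference in semitones, skipping frames where either pitch is missing (represented here as 0 Hz), and counting misses per frame are assumptions made for this example; the function names are likewise hypothetical.

```python
import math


def hz_to_semitones(f_hz, ref_hz=440.0):
    """Pitch in semitones relative to a reference frequency."""
    return 12.0 * math.log2(f_hz / ref_hz)


def _frame_differences(sung_hz, score_hz):
    """Absolute pitch differences in semitones for frames where both pitches exist."""
    return [abs(hz_to_semitones(s) - hz_to_semitones(r))
            for s, r in zip(sung_hz, score_hz) if s > 0 and r > 0]


def average_pitch_difference(sung_hz, score_hz):
    """Mean absolute pitch difference in semitones, or None if no frame is usable."""
    diffs = _frame_differences(sung_hz, score_hz)
    return sum(diffs) / len(diffs) if diffs else None


def semitone_miss_count(sung_hz, score_hz):
    """Alternative measure: number of frames that are off by a semitone or more."""
    return sum(1 for d in _frame_differences(sung_hz, score_hz) if d >= 1.0)


sung = [440.0, 880.0, 0.0]      # last frame has no detected pitch
score = [440.0, 440.0, 440.0]
avg = average_pitch_difference(sung, score)
misses = semitone_miss_count(sung, score)
```

Working in semitones rather than in hertz makes the same threshold meaningful for low and high notes alike.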

Next, the singing situation determination unit 20 determines whether the calculated average pitch difference is at or below a predetermined threshold (step S180). The threshold can be set arbitrarily. If the average pitch difference is above the threshold (step S180: NO), in other words if the singer is judged not to be singing at the correct pitch, the guide melody output unit 30 sets the guide melody to be output to "melody sound" (step S195) and proceeds to step S210. In this embodiment, the melody-sound guide melody is generated by speech synthesis using the first acoustic model described above.

If the average pitch difference is at or below the threshold (step S180: YES), in other words if the singer is judged, based on the average volume, singing likelihood, and average pitch difference, to be singing at the correct pitch, the guide melody output unit 30 determines whether the guide is set to "melody sound" (step S190). If the guide is not "melody sound" (step S190: NO), the guide melody output unit 30 sets the guide melody to be output to "melody sound" (step S195) and proceeds to step S210.

If the guide is "melody sound" (step S190: YES), the guide melody output unit 30 determines whether a predetermined time has elapsed since the guide was changed to "melody sound" (step S200). The predetermined time can be set arbitrarily. If the predetermined time has not elapsed (step S200: NO), the guide melody output unit 30 sets the guide melody to be output to "none" (step S205) and proceeds to step S210. Even when the singing situation determination unit 20 judges that the singer is able to sing, the guide melody output unit 30 keeps the output guide melody as "melody sound" until the predetermined time elapses and sets it to "none" after the predetermined time has elapsed, so that the singer is supported until the singing stabilizes.

If the predetermined time has elapsed (step S200: YES), the singing situation determination unit 20 determines whether the song has ended (step S210).

If the song has not ended (step S210: NO), the process is repeated from step S110. If the song has ended (step S210: YES), the guide melody output process ends.
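The decision made on each pass through a singing section (steps S130 to S205) can be sketched as a single function. Several points are assumptions for this example rather than statements from the description. The step labels attach step S205 to the "S200: NO" branch, while the explanatory sentence says the guide stays "melody sound" until the predetermined time elapses; the sketch follows the explanatory sentence. It also assumes that a guide already set to "none" stays "none" while all three checks pass, and that the timer restarts whenever the pitch check fails. The names `Thresholds` and `next_guide` and all threshold values are hypothetical. The likelihood and pitch difference are passed as functions so that they are computed only when an earlier check has passed.

```python
from dataclasses import dataclass

VOCAL, MELODY, NONE = "vocal sound", "melody sound", "none"


@dataclass
class Thresholds:
    min_volume: float       # S140 threshold
    min_likelihood: float   # S160 threshold
    max_pitch_diff: float   # S180 threshold
    hold_seconds: float     # predetermined time checked in S200


def next_guide(current, melody_since, now, volume, likelihood_fn, pitch_diff_fn, th):
    """Return (guide, melody_since) for one pass through a singing section."""
    if volume < th.min_volume:                  # S140: NO -> S165
        return VOCAL, None
    if likelihood_fn() < th.min_likelihood:     # S150, S160: NO -> S165
        return VOCAL, None
    if pitch_diff_fn() > th.max_pitch_diff:     # S170, S180: NO -> S195
        return MELODY, now                      # assumption: timer restarts here
    if current == NONE:                         # assumption: stay silent once silent
        return NONE, None
    if current != MELODY:                       # S190: NO -> S195
        return MELODY, now
    if now - melody_since >= th.hold_seconds:   # S200 (assumed direction)
        return NONE, None                       # S205
    return MELODY, melody_since


th = Thresholds(min_volume=-40.0, min_likelihood=0.6, max_pitch_diff=1.0, hold_seconds=5.0)
state = next_guide(VOCAL, None, 2.0, -20.0, lambda: 0.9, lambda: 0.3, th)
```

A caller would start with the guide set to "melody sound" (step S100), call this function for each analysis window inside a singing section, and leave the guide unchanged outside singing sections.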

According to the singing support device 100 of this embodiment described above, the guide melody output unit 30 switches the guide melody according to the degree of singing. More specifically, when the average volume is below its threshold or the singing likelihood is below its threshold, in other words when the singer is not managing to sing, the guide melody output unit 30 sets the guide melody to the vocal sound. When the average volume and the singing likelihood are at or above their thresholds and the average pitch difference exceeds its threshold, in other words when the singer is managing the lyrics but not the pitch, the guide melody output unit 30 sets the guide melody to the melody sound. When the average volume and the singing likelihood are at or above their thresholds and the average pitch difference is at or below its threshold, in other words when the singer is singing correctly, the guide melody output unit 30 sets the guide melody to none. Singing can therefore be supported without disturbing the singer's singing.

In this embodiment, the singing situation determination unit 20 evaluates the average volume, the singing likelihood, and the average pitch difference in that order. For example, when the average volume is below its threshold, the guide melody output unit 30 sets the guide melody to the vocal sound without the singing likelihood or the average pitch difference being calculated or evaluated. The processing performed by the singing situation determination unit 20 can therefore be kept to a minimum.

In this embodiment, the guide melody synthesis unit 10 performs speech synthesis using an acoustic model whose acoustic parameters have been learned by a statistical method. Guide melodies can therefore be generated by speech synthesis without recording a voice for each song.

In this embodiment, the guide melody synthesis unit 10 generates a melody sound without lyrics using the first acoustic model, obtained by learning singing sounds without lyrics, and generates a vocal sound with lyrics using the second acoustic model, obtained by learning singing sounds that use lyrics. The guide melody synthesis unit 10 can therefore easily generate two or more types of guide melodies. In addition, the melody sound without lyrics supports the singer without overlapping with the singer's singing. Singing can therefore be supported without disturbing the singer's singing.

B. Second embodiment:
FIG. 4 is an explanatory diagram showing an outline of the karaoke device 200 that includes the singing support device 100 of the second embodiment. The singing support device of the second embodiment differs from the singing support device 100 of the first embodiment in that it receives operation instructions from an operation unit 70A. Its other components are the same as those of the first embodiment, so only the operation unit 70A is described below.

The operation unit 70A controls the switching of the guide melody. More specifically, the singer can switch the guide melody output by the guide melody output unit 30 by operating the operation unit 70A. The operation unit 70A has buttons 71 to 73. Button 71 sets the guide melody to "vocal sound", button 72 sets it to "melody sound", and button 73 sets it to "none".

For example, when the singer performs an operation to switch the guide melody from the vocal sound to the melody sound, more specifically when the singer selects button 72, the guide melody output unit 30 receives the guide melody switching instruction from the operation unit 70A and switches the guide melody from the vocal sound to the melody sound.
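The button handling of the second embodiment can be sketched as a simple lookup. The class and method names, the initial guide setting, and the choice to ignore unknown button numbers are assumptions made for this example; only the mapping of buttons 71 to 73 comes from the description.

```python
from enum import Enum


class Guide(Enum):
    VOCAL = "vocal sound"
    MELODY = "melody sound"
    NONE = "none"


# Mapping of the buttons on operation unit 70A to guide melody settings.
BUTTON_TO_GUIDE = {71: Guide.VOCAL, 72: Guide.MELODY, 73: Guide.NONE}


class GuideMelodyOutput:
    """Hypothetical stand-in for the guide melody output unit 30."""

    def __init__(self, initial=Guide.MELODY):
        self.current = initial   # initial setting is an assumption

    def on_button(self, button_id):
        """Apply a switching instruction; unknown buttons leave the guide unchanged."""
        guide = BUTTON_TO_GUIDE.get(button_id)
        if guide is not None:
            self.current = guide
        return self.current


output = GuideMelodyOutput(initial=Guide.VOCAL)
after_press = output.on_button(72)   # singer selects button 72
```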

According to the singing support device 100 of this embodiment described above, the guide melody output by the guide melody output unit 30 is switched by the singer's operation, so the guide melody output unit 30 can switch the guide melody as the singer requests. Singing can therefore be supported without disturbing the singer's singing.

C. Other embodiments:
In the above embodiments, the guide melody synthesis unit 10 performs speech synthesis using an acoustic model whose acoustic parameters have been learned by a statistical method. Instead, the guide melody synthesis unit 10 may perform speech synthesis using a waveform concatenation method. A guide melody generated by speech synthesis using the statistically trained acoustic model and a guide melody generated by speech synthesis using the waveform concatenation method may also be switched and output.

In the above embodiments, the guide melody output unit 30 switches among and outputs the two or more types of guide melodies generated by the guide melody synthesis unit 10, but the guide melody synthesis unit 10 may instead generate the guide melody in real time. For example, the guide melody synthesis unit 10 may generate the guide melody by switching the trained acoustic model used for speech synthesis according to the degree of singing.

In the above embodiment, the guide melody synthesis unit 10 generates a melody sound without lyrics using a first acoustic model obtained by learning singing sounds without lyrics, and generates a vocal sound with lyrics using a second acoustic model obtained by learning singing sounds that use lyrics. However, the types of guide melody and the acoustic models are not limited to these. Instead, for example, the guide melody synthesis unit 10 may generate a male vocal sound using an acoustic model obtained by sampling singing by a male voice, and a female vocal sound using an acoustic model obtained by sampling singing by a female voice.

In the above embodiment, the guide melody output unit 30 switches the guide melody among "vocal sound", "melody sound", and "none" according to the degree of singing, but the invention is not limited to this. For example, steps S200 and S205 in FIG. 3 may be omitted so that a guide melody is always output.

In the above embodiment, the singing situation determination unit 20 determines the degree of singing from the volume, lyrics, and pitch of the input singing, more specifically from the average volume, the singing likelihood, and the average pitch difference. However, the determination of the degree of singing is not limited to these factors. For example, factors such as intonation, vibrato, and pitch scoops (shakuri) may also be used.

In the above embodiment, when there are multiple singers, the degree of singing may be determined for each singer, and the guide state may be matched to the singer who is singing correctly. With this arrangement, a singer who cannot sing the part can use the voice of the singer who can as a guide, so guidance by the guide melody can be kept to a minimum. Singing support can therefore be provided without interfering with the singers' singing.
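As an illustration only, the multi-singer variant could be sketched as follows. The ordering of guide states from least to most assistance (`GUIDE_ORDER`) and the function name `guide_for_group` are assumptions made for this sketch and do not appear in the embodiment.

```python
# Hypothetical ordering of guide states, from least to most assistance.
GUIDE_ORDER = ["none", "melody", "vocal"]


def guide_for_group(per_singer_guides):
    """Return the guide state of the singer who is singing best.

    per_singer_guides holds the guide state that would be chosen for each
    singer individually. The group state follows the singer who needs the
    least assistance, so the other singers can follow that singer's voice.
    """
    if not per_singer_guides:
        raise ValueError("at least one singer is required")
    return min(per_singer_guides, key=GUIDE_ORDER.index)
```

For two singers whose individual states are "vocal" and "melody", this sketch selects "melody" for the group.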

In the above embodiment, the span of singing used to determine the degree of singing may be a span delimited by a predetermined rule other than time. For example, measures or phrases can be used as the rule.

In the above embodiment, the length of singing used to determine the degree of singing may be made longer the closer the value used for the determination is to the threshold, and shorter the farther it is from the threshold. For example, the long case may be the past one phrase and the short case the past one measure. With this arrangement, the guide melody output unit 30 can switch immediately to an appropriate guide melody according to the degree of singing. In addition, when it is difficult to determine whether the singer is singing or not, a more careful and accurate determination can be made.
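The two-level example above (one phrase when close to the threshold, one measure otherwise) could be sketched as below. The function name `window_for` and the `margin` parameter that defines "close" are hypothetical; the text does not specify how closeness is measured.

```python
def window_for(value, threshold, margin):
    """Choose the span of past singing to evaluate.

    Returns "phrase" (the longer span) when the value lies within `margin`
    of the threshold, where the decision is uncertain, and "measure"
    (the shorter span) when the value is clearly above or below it.
    """
    return "phrase" if abs(value - threshold) < margin else "measure"
```

A value far from the threshold therefore leads to a quick decision over a short span, while a borderline value is evaluated over a longer one.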

In the above embodiment, the average over the singing used for the determination may be calculated with weights that depend on how far back in time each portion lies. For example, the weight may be made larger the closer the singing is to the present and smaller the further it lies in the past. With this arrangement, the determination reacts quickly to the most recent singing while still taking past singing into account, so the determination result for the degree of singing is stable.
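One possible form of such a weighted average is sketched below. The exponential decay and the `decay` parameter are assumptions chosen for illustration; the text only requires that more recent singing receive larger weights.

```python
def recency_weighted_mean(values, decay=0.5):
    """Weighted mean of per-frame values ordered from oldest to newest.

    The newest value has weight 1 and each step into the past multiplies
    the weight by `decay` (0 < decay <= 1), so recent singing dominates.
    """
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

With `decay=1.0` the function reduces to the ordinary mean, so the unweighted case of the embodiment is covered as a special case.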

In the above embodiment, the average volume and the average pitch difference are used to determine the degree of singing, but another statistical measure such as the median or the mode may be used instead.
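As a small sketch of this substitution, the summary statistic can be made selectable using Python's standard `statistics` module. The function name `summarize` and its `method` argument are hypothetical.

```python
import statistics


def summarize(samples, method="mean"):
    """Reduce per-frame samples (e.g. volume or pitch difference) to one value."""
    if method == "mean":
        return statistics.mean(samples)
    if method == "median":
        return statistics.median(samples)
    if method == "mode":
        return statistics.mode(samples)
    raise ValueError(f"unknown method: {method}")
```

The median and mode are less affected than the mean by a single outlying frame, which is one reason such a substitution might be preferred.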

In the above embodiment, the singing situation determination unit 20 evaluates the average volume, the singing likelihood, and the average pitch difference in sequence to determine the degree of singing. Instead, the singing situation determination unit 20 may determine the degree of singing based on a map or function that defines the relationship among the average volume, the singing likelihood, and the average pitch difference. FIG. 5 is a diagram defining the relationship among the average volume, the singing likelihood, the average pitch difference, and the guide melody. According to FIG. 5, when the average volume is below its threshold and the singing likelihood and the average pitch difference are at or above their thresholds, the guide melody is set to none. The first vocal sound, the second vocal sound, and the third vocal sound differ from one another in volume. Note that FIG. 5 is merely one example of the types of guide melody set according to the degree of singing.
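A map-based determination of this kind could be sketched as a lookup table keyed by the three threshold comparisons. Only the single combination stated in the text is reproduced here; the remaining contents of FIG. 5 are not given in this section, so the fallback value, the threshold values, and all names in the sketch are placeholders.

```python
# Key: (volume >= threshold, likelihood >= threshold, pitch_diff >= threshold).
# Only this one entry is taken from the text; the rest of FIG. 5 is not
# reproduced, so every other combination falls back to a placeholder.
EXAMPLE_GUIDE_MAP = {
    (False, True, True): "none",
}
PLACEHOLDER_GUIDE = "first vocal sound"  # assumed fallback, not from FIG. 5


def select_guide(avg_volume, likelihood, avg_pitch_diff, thresholds,
                 guide_map=EXAMPLE_GUIDE_MAP, default=PLACEHOLDER_GUIDE):
    """Look up the guide melody for one set of measurements."""
    key = (
        avg_volume >= thresholds["volume"],
        likelihood >= thresholds["likelihood"],
        avg_pitch_diff >= thresholds["pitch_diff"],
    )
    return guide_map.get(key, default)
```

Keeping the table as data rather than as nested conditions makes it straightforward to swap in a different mapping without changing the lookup logic.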

In the first embodiment, the singing situation determination unit 20 determines the degree of singing. Instead, the degree of singing may be determined using a scoring function provided in the karaoke device 200. For example, the criterion may be whether the real-time score given by the scoring function exceeds a threshold.

The present invention is not limited to the embodiments described above and can be realized in various configurations without departing from its spirit. For example, the technical features in the embodiments that correspond to the technical features in the aspects described in the Summary of the Invention may be replaced or combined as appropriate in order to solve the problems described above or to achieve some or all of the effects described above. Technical features that are not described as essential in this specification may be deleted as appropriate.

DESCRIPTION OF SYMBOLS
10 ... guide melody synthesis unit
15 ... storage unit
20 ... singing situation determination unit
25 ... control unit
30 ... guide melody output unit
35 ... music output unit
40 ... voice input unit
50 ... mixing amplifier
60 ... speaker
70, 70A ... operation unit
71, 72, 73 ... button
100 ... singing support device
105 ... music performance device
200 ... karaoke device
NW ... network

The present invention has been made to solve the problem described above and can be realized in the following forms. According to one aspect of the present invention, a singing support device that supports a singer's singing is provided. This singing support device comprises: a guide melody synthesis unit that generates by voice synthesis, for the same melody, two or more types of guide melodies that correspond to the song being sung and guide the singer's singing; and a guide melody output unit that switches between the two or more types of guide melodies and outputs them in time with the song. According to this singing support device, multiple types of guide melodies can be switched and output, so support can be given in a form suited to the singer's singing, for example with a melody sound that does not overlap the singer's voice. Singing support can therefore be provided without interfering with the singer's singing.

Claims

A singing support device for supporting singing by a singer, comprising:
a guide melody synthesis unit that generates, by voice synthesis, two or more types of guide melodies that correspond to the song to be sung and guide the singer's singing; and
a guide melody output unit that switches between the two or more types of guide melodies and outputs them in time with the song.

The singing support device according to claim 1, wherein
the guide melody synthesis unit performs the voice synthesis of the guide melodies using an acoustic model whose acoustic parameters have been learned by a statistical method.

The singing support device according to claim 1 or 2, wherein
the guide melody synthesis unit generates, by the voice synthesis, at least a melody sound without lyrics and a vocal sound with lyrics as the two or more types of guide melodies.

The singing support device according to claim 3, wherein
the melody sound is synthesized based on a first acoustic model obtained by learning singing sounds of the song without lyrics, such as humming or scat, and the vocal sound is synthesized based on a second acoustic model obtained by learning singing sounds that use the lyrics of the song.

The singing support device according to any one of claims 1 to 4, further comprising
a singing situation determination unit that determines the degree of the singing by the singer, wherein
the guide melody output unit switches the guide melody according to the determined degree of the singing.

A karaoke device comprising the singing support device according to any one of claims 1 to 5.