JP2015087617A

JP2015087617A - Device and method for generating guide vocal of karaoke

Info

Publication number: JP2015087617A
Application number: JP2013227101A
Authority: JP
Inventors: Yasutaka Wada (和田 康孝); Takeshi Watanabe (渡邉 毅)
Original assignee: Daiichikosho Co Ltd
Current assignee: Daiichikosho Co Ltd
Priority date: 2013-10-31
Filing date: 2013-10-31
Publication date: 2015-05-07
Anticipated expiration: 2033-10-31
Also published as: JP6184296B2

Abstract

PROBLEM TO BE SOLVED: To provide a guide vocal generation device that generates a guide vocal which sounds more human and still earns a high score when scored. SOLUTION: Parameter correction means 26 corrects a reference voice synthesis parameter according to settings made through an operation part and a display part, and adds technique data, to the degree set, to a specified section of the reference voice synthesis parameter, yielding a correction voice synthesis parameter. Voice synthesis means 23 generates reference voice synthesis data 42 from lyric character information and the reference voice synthesis parameter, and correction voice synthesis data 43 from the lyric character information and the correction voice synthesis parameter, and scoring means 24 scores on the basis of the two. When a setting is changed through the operation part and the display part in view of the scoring result, the correction voice synthesis parameter is corrected according to the changed setting, the voice synthesis means 23 regenerates the correction voice synthesis data 43, and the scoring means 24 scores it again.

Description

The present invention relates to a guide vocal generating apparatus and a guide vocal generating method for generating a guide vocal that is played back during singing in a karaoke system to guide the singer.

In recent years karaoke systems have become more sophisticated, and a singer can listen to an instructional performance that serves as a model for a song the singer wants to practice. In such a so-called "singing support system," an instructional performance by a model singer who can sing the practice song ideally is recorded in advance as a guide vocal, and the singer can listen to it while singing or at any chosen time.

Various techniques related to this singing instruction function of karaoke systems have been devised. For example, there is a known technique that plays, in time with the karaoke accompaniment, a vocal track recorded to resemble the original artist's singing as closely as possible so that the singer can sing the song more easily, and, as in Patent Document 1, a technique that generates a guide vocal by sampling the singer's own voice.

Meanwhile, it has also become common for karaoke systems to provide a scoring function, which, for example, detects the pitch of the singer's voice signal and compares it with model pitch information (reference data). Because the reference data and the guide vocal are created for different purposes, however, a singer who is supported by the guide vocal and sings exactly like it will not necessarily receive a high score.

There is also a known technique that synthesizes speech and produces artificial singing by controlling how a large number of speech waveform data (phoneme data) are played back, using parameters such as the lyric display character information attached to the song data and the score information for singing, namely pitch information, timing information, and volume information. For example, as in Patent Document 2, the speech synthesis parameters are obtained from lyric information and MIDI (registered trademark) format data containing pitch information, timing information, and volume information, and speech synthesis is performed with them.

Patent Document 1: JP 2009-244789 A
Patent Document 2: JP-A-10-240264

However, a guide vocal generated by a speech synthesis technique such as that of Patent Document 2 is generated from the reference data, such as the pitch information, and scoring is performed by comparison with that same reference data, so singing exactly like the guide vocal yields a high score. Such a guide vocal is nevertheless an artificial (mechanical), bland singing voice. Although it scores highly, the score diverges from the impression a human listener receives.

The present invention has been made in view of the above problem, and its object is to provide a guide vocal generating apparatus and a guide vocal generating method that generate a guide vocal which sounds more human and still receives a high score.

To solve the above problem, the invention of claim 1 is a guide vocal generating apparatus that generates a guide vocal based on performance data which has, for each karaoke song, lyric character information for displaying lyrics to a singer and reference data including at least pitch information, timing information, and volume information that the singer should follow, the apparatus comprising: a performance database that stores the performance data for each song; a singing technique database that stores technique data in which various singing techniques are converted into data; external input means for at least selecting, from the performance data, a song for which a guide vocal is to be generated, selecting information in the reference data and setting its degree, and selecting technique data and setting its degree and the specific section to which it is added; data acquisition means for acquiring, based on the song selected through the external input means, the lyric character information from the performance database and reference speech synthesis parameters consisting of the pitch information, timing information, and volume information from the reference data; parameter correction means for correcting the reference speech synthesis parameters acquired by the data acquisition means according to the settings from the external input means, and adding the technique data selected through the external input means, to the degree set, to the set specific section of the reference speech synthesis parameters to obtain modified speech synthesis parameters; speech synthesis means for generating reference speech synthesis data from the lyric character information and the reference speech synthesis parameters acquired by the data acquisition means, and generating modified speech synthesis data from the lyric character information acquired by the data acquisition means and the modified speech synthesis parameters produced by the parameter correction means; and scoring means for comparing the reference speech synthesis data and the modified speech synthesis data generated by the speech synthesis means and scoring the result, wherein, when a setting is changed through the external input means based on the scoring result from the scoring means, the parameter correction means corrects the modified speech synthesis parameters according to the changed setting, causes the speech synthesis means to generate modified speech synthesis data, and causes the scoring means to score it.

In the invention of claim 2, display means is provided, and the scoring means displays scoring information for each section on the display means when the scoring result is not a perfect score.

The invention of claim 3 is a guide vocal generating apparatus that generates a guide vocal for supporting the singing of an arbitrary karaoke song, based on lyric character information for displaying the lyrics of the song to a singer and reference data including pitch information, timing information, and volume information for scoring the singer's performance of the song for each predetermined singing section, the apparatus comprising: data input means for inputting the lyric character information and the reference data; speech synthesis means for generating speech synthesis data based on the lyric character information input by the data input means and a plurality of speech synthesis parameters including the pitch information, timing information, and volume information contained in the reference data input by the data input means; playback means for audibly playing back the speech synthesis data generated by the speech synthesis means; parameter correction means for correcting the speech synthesis parameters; and scoring means for scoring, for each predetermined singing section, modified speech synthesis data generated by the speech synthesis means from the plurality of speech synthesis parameters, including those corrected by the parameter correction means, based on the reference data input by the data input means and a predetermined scoring rule.

In the invention of claim 4, display means is provided, and when the scoring result is below a predetermined value, the scoring means displays on the display means the scoring information for the singing section whose speech synthesis parameters were corrected by the parameter correction means.

The invention of claim 5 is a guide vocal generating method that generates a guide vocal for supporting the singing of an arbitrary karaoke song, based on lyric character information for displaying the lyrics of the song to a singer and reference data including pitch information, timing information, and volume information for scoring the singer's performance of the song for each predetermined singing section, the method comprising: a data input step of inputting the lyric character information and the reference data; a speech synthesis step of generating speech synthesis data based on the lyric character information and a plurality of speech synthesis parameters including the pitch information, the timing information, and the volume information; a playback step of playing back the speech synthesis data; a parameter correction step of correcting the speech synthesis parameters; a modified speech synthesis step of generating modified speech synthesis data based on the plurality of speech synthesis parameters, including the corrected speech synthesis parameters; a scoring step of scoring the modified speech synthesis data for each predetermined singing section based on the reference data and a predetermined scoring rule; and a display step of displaying, when the scoring result is below a predetermined value, the scoring information for the singing section whose speech synthesis parameters were corrected.

According to the invention of claim 1, the parameter correction means corrects the reference speech synthesis parameters according to the settings from the external input means and adds the technique data, to the degree set, to the set specific section of the reference speech synthesis parameters to obtain modified speech synthesis parameters. The scoring means scores on the basis of the reference speech synthesis data, which the speech synthesis means generates from the lyric character information and the reference speech synthesis parameters, and the modified speech synthesis data, which it generates from the lyric character information and the modified speech synthesis parameters. When a setting is changed through the external input means based on the scoring result, the modified speech synthesis parameters are corrected according to the changed setting, the speech synthesis means generates modified speech synthesis data, and the scoring means scores it. With this configuration, the modified speech synthesis data obtained by correcting the speech synthesis parameters can be generated as a guide vocal that sounds more human, within the range in which the scoring result remains high.

According to the invention of claim 2, display means is provided, and the scoring information for each section is displayed on the display means when the scoring result of the scoring means is not a perfect score. This makes it easier to use the parameter correction means to make corrections that lead to a perfect score.

According to the inventions of claims 3 and 5, speech synthesis data generated by the speech synthesis means from the lyric character information input by the data input means and a plurality of speech synthesis parameters, including the pitch information, timing information, and volume information contained in the input reference data, is played back audibly, and modified speech synthesis data generated by the speech synthesis means from the plurality of speech synthesis parameters, including those corrected by the parameter correction means, is scored for each predetermined singing section based on the input reference data and a predetermined scoring rule. As a result, even in a configuration where the lyric character information and the reference data are input, the modified speech synthesis data obtained by correcting the speech synthesis parameters can be generated as a guide vocal that sounds more human, within the range in which the scoring result remains high.

According to the invention of claim 4, display means is provided, and when the scoring result of the scoring means is below a predetermined value, the scoring information for the singing section whose speech synthesis parameters were corrected by the parameter correction means is displayed on the display means. This makes it easier to use the parameter correction means to make corrections that lead to a high score.

FIG. 1 is a block diagram of a first embodiment of the karaoke guide vocal generating apparatus according to the present invention.
FIG. 2 is an explanatory diagram of the performance data and singing technique data of FIG. 1.
FIG. 3 is a flowchart explaining the operation of the guide vocal generating apparatus of the present invention.
FIG. 4 is an explanatory diagram of the processing of the speech synthesis means in FIG. 1.
FIG. 5 is an explanatory diagram of the processing of the scoring means in FIG. 1.
FIG. 6 is a block diagram of a second embodiment of the karaoke guide vocal generating apparatus according to the present invention.

Embodiments of the present invention are described below with reference to the drawings.
FIG. 1 shows a block diagram of the first embodiment of the karaoke guide vocal generating apparatus according to the present invention, and FIG. 2 shows an explanatory diagram of the performance data and singing technique data of FIG. 1. The guide vocal generating apparatus 11 shown in FIG. 1 includes an operation unit 13 and a display unit 14, which are externally connected to the apparatus main body 12 by wire or wirelessly. The operation unit 13 and the display unit 14 together constitute the external input means.

The operation unit 13 is used to input various signals to the apparatus main body 12 and consists of a keyboard, a mouse, and the like. Here, working together with the display unit 14, it serves as input means for at least selecting from the performance data the song for which a guide vocal is to be generated, selecting information in the reference data (described later) and setting its degree, and selecting singing technique data (described later) and setting its degree and the specific section to which it is added.

The display unit 14 displays the screens needed for guide vocal generation and sends the setting values entered through the operation unit 13 to the apparatus main body 12. The operation unit 13 and the display unit 14 may also be integrated into an input/output device in which a liquid crystal display (LCD) and a touch sensor are stacked, providing a GUI through which data such as a song selection can be entered with the touch sensor in response to displayed icons and the like.

The apparatus main body 12 includes, as appropriate, a bus 21, a control unit 22, speech synthesis means 23, scoring means 24, data acquisition means 25, parameter correction means 26, a display control unit 27, a performance data database (performance DB) 28, a singing technique database (DB) 29, and a guide vocal database (DB) 30. A storage area serving as an element storage unit 41 is formed in the speech synthesis means 23. An area for storing the reference speech synthesis data 42 to be scored and an area for storing the modified speech synthesis data 43 are formed in the scoring means 24.

The control unit 22 is a physical CPU that controls the processing of the guide vocal generating apparatus as a whole, and it executes algorithms based on a program stored in a ROM (not shown). The speech synthesis means 23 is a program that performs speech synthesis based on the speech synthesis parameters (or modified speech synthesis parameters), lyric character data, and phoneme data stored in the element storage unit 41. The method described in Patent Document 2 above can be used for the actual speech synthesis processing.

The scoring means 24 is a program that compares speech synthesis data with each other and scores them; it compares the reference speech synthesis data 42 generated by the speech synthesis means 23 with the modified speech synthesis data 43. Specifically, the method described in, for example, Japanese Patent No. 4222915 can be used. Basically, a correction to a speech synthesis parameter is evaluated as a deduction, while an added singing technique increases expressiveness and is evaluated as a bonus. When the scoring result is not a perfect score, the scoring information for each section may be displayed on the display unit 14 via the display control unit 27.
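The deduction-and-bonus idea in the paragraph above can be illustrated with a minimal sketch. It does not reproduce the method of Japanese Patent No. 4222915; the function name `score_section`, the pitch tolerance, and the bonus value are all assumptions chosen only for illustration.

```python
def score_section(reference, modified, has_technique, tolerance=0.5, bonus=5.0):
    """Score one singing section on a 0-100 scale (hypothetical rule).

    reference, modified: per-frame pitch values of the reference and the
    modified speech synthesis data for the same section.
    has_technique: True if a singing technique was recognized in the section.
    """
    n = min(len(reference), len(modified))
    if n == 0:
        return 0.0
    # Frames that drift beyond the tolerance lower the score (deduction).
    hits = sum(1 for r, m in zip(reference, modified) if abs(r - m) <= tolerance)
    score = 100.0 * hits / n
    # A recognized technique raises the score (bonus), capped at 100.
    if has_technique:
        score += bonus
    return min(100.0, score)


ref = [60.0, 60.0, 60.0, 60.0]
print(score_section(ref, [60.0, 60.0, 61.0, 60.0], False))
print(score_section(ref, [60.0, 60.0, 61.0, 60.0], True))
```

Under these assumed values, one off-pitch frame out of four costs 25 points, and a recognized technique recovers 5 of them.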

The data acquisition means 25 is a program that, based on the song selected through the operation unit 13 and the display unit 14, acquires the lyric character information from the performance DB 28 (described later) and acquires the reference speech synthesis parameters, consisting of pitch information, timing information, and volume information, from the reference data (described later).

The parameter correction means 26, described in detail later, is a program that corrects the reference speech synthesis parameters acquired by the data acquisition means 25 according to the settings from the operation unit 13 and the display unit 14, and adds the selected technique data, to the degree set, to the specific section of the reference speech synthesis parameters to obtain modified speech synthesis parameters. When a setting is changed through the operation unit 13 and the display unit 14 based on the scoring result from the scoring means 24, it also corrects the modified speech synthesis parameters according to the changed setting. The display control unit 27 is a program or an electronic circuit that creates the display data for the screens shown on the display unit 14, sends it to the display unit 14, and receives input data from the display unit 14.

As shown in FIG. 2(A), the performance DB 28 is a content database that stores, for each karaoke song, performance data such as note information together with lyric character information and the like. Specifically, it has a song table in which a song ID, a song title, and an artist ID (artist name) are associated with one another. For each song, it stores the note information and the like, managed by song ID in a predetermined data format (for example, MIDI (registered trademark) format), as karaoke performance data, and it stores at least the lyric character data and the reference data for scoring the singing, which includes pitch information, timing information, and volume information.
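The record layout described above can be pictured with a small in-memory sketch. The class and field names (`ReferenceNote`, `SongRecord`, `PerformanceDB`) and the MIDI-style value ranges are assumptions for illustration; the document does not specify a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReferenceNote:
    pitch: int        # pitch information (assumed MIDI note number)
    start_ms: int     # timing information: onset
    duration_ms: int  # timing information: length
    volume: int       # volume information (assumed 0-127)


@dataclass
class SongRecord:
    song_id: str
    title: str
    artist_id: str
    lyrics: str                                   # lyric character data
    reference: List[ReferenceNote] = field(default_factory=list)


class PerformanceDB:
    """Hypothetical stand-in for the performance DB 28, keyed by song ID."""

    def __init__(self) -> None:
        self._songs: Dict[str, SongRecord] = {}

    def add(self, record: SongRecord) -> None:
        self._songs[record.song_id] = record

    def get(self, song_id: str) -> SongRecord:
        return self._songs[song_id]


db = PerformanceDB()
db.add(SongRecord("S001", "Example Song", "A01", "la la",
                  [ReferenceNote(60, 0, 500, 100), ReferenceNote(64, 500, 500, 90)]))
print(db.get("S001").reference[1].pitch)
```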

The singing technique DB 29 is a database that stores technique data in which various singing techniques are expressed as data, for example in terms of frequency. As shown in FIG. 2(B), the singing technique data includes vibrato information, intonation information, long tone information, rushing (hashiri) information, delay (tame) information, kobushi information, shakuri (scoop) information, fall information, shout information, and the like.

The guide vocal DB 30 is a database that stores the guide vocal (modified guide vocal), that is, the speech synthesis data (modified speech synthesis data) finally generated for each song by the speech synthesis means 23, in association with the song ID. The performance DB 28, the singing technique DB 29, and the guide vocal DB 30 may be built, for example, on a hard disk in the apparatus main body 12 or on an external storage device connected to the apparatus main body 12.

FIG. 3 shows a flowchart explaining the operation of the guide vocal generating apparatus of the present invention, FIG. 4 shows an explanatory diagram of the processing of the speech synthesis means in FIG. 1, and FIG. 5 shows an explanatory diagram of the processing of the scoring means in FIG. 1. In FIG. 3, the display unit 14 first shows an input screen for specifying the song for which a guide vocal is to be generated, and the creator uses the operation unit 13 to specify that song, for example by entering its song ID, which is sent to the apparatus main body 12 (step (S) 1).

When a song ID is entered through the operation unit 13 and the display unit 14, the data acquisition means 25 refers to the performance DB 28 to identify the performance data for that song ID and acquires the lyric character data in the performance data. It also acquires the speech synthesis parameters (pitch information, timing information, volume information, and section information) from the reference data in the performance data and sends them to the speech synthesis means 23 and to the parameter correction means 26 (S2).
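Step S2 can be sketched as a lookup followed by splitting the reference data into separate parameter tracks. The dictionary layout, the `DEMO_DB` contents, and the function name `acquire` are hypothetical; only the four parameter kinds (pitch, timing, volume, section) come from the text.

```python
# Hypothetical in-memory stand-in for the performance DB 28.
DEMO_DB = {
    "S001": {
        "lyrics": "la la la",
        "reference": [
            {"pitch": 60, "start_ms": 0, "duration_ms": 500, "volume": 100, "section": 1},
            {"pitch": 62, "start_ms": 500, "duration_ms": 500, "volume": 95, "section": 1},
            {"pitch": 64, "start_ms": 1000, "duration_ms": 1000, "volume": 90, "section": 2},
        ],
    }
}


def acquire(performance_db, song_id):
    """Return (lyrics, reference speech synthesis parameters) for a song ID."""
    record = performance_db[song_id]
    params = {"pitch": [], "timing": [], "volume": [], "section": []}
    for note in record["reference"]:
        params["pitch"].append(note["pitch"])
        params["timing"].append((note["start_ms"], note["duration_ms"]))
        params["volume"].append(note["volume"])
        params["section"].append(note["section"])
    return record["lyrics"], params


lyrics, base_params = acquire(DEMO_DB, "S001")
print(lyrics, base_params["pitch"])
```

In this sketch the same `base_params` object would be handed both to synthesis and to parameter correction, mirroring the two destinations named in S2.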

As shown in FIG. 4(A), the speech synthesis means 23 generates the reference speech synthesis data (reference guide vocal) from the phoneme data and from the speech synthesis parameters and lyric character information sent by the data acquisition means 25, and sends it to the scoring means 24 (S3). The scoring means 24 stores the reference speech synthesis data 42 received from the speech synthesis means 23.

Meanwhile, the parameter correction means 26 sends to the display unit 14, via the display control unit 27, a screen for setting which kind of speech synthesis parameter of the song is to be corrected, the singing section to be corrected, and the kind of technique data, and has the screen displayed (S4). On this screen the creator sets the kind of speech synthesis parameter to correct, the singing section to correct, and the kind of technique data, and these settings are sent to the apparatus main body 12 (S5). In this example the speech synthesis parameter to be corrected is the pitch information and the technique data is vibrato. The other speech synthesis parameters and the other singing technique data mentioned above can be corrected in the same way as pitch information and vibrato, simply by using setting values appropriate to each.

The apparatus main body 12 receives the selected parameter (pitch information), the selected singing section, and the selection of vibrato from the technique data (S6). The parameter correction means 26 acquires the vibrato information from the singing technique DB 29 and, via the display control unit 27, has the display unit 14 show a screen on which the pitch information can be changed and the degree of vibrato can be set as a numerical value (S7).

On this screen, the creator makes the following settings, for example. In a section where notes of different pitches follow one another, the pitch information is mechanical, with the pitch changing at exactly the note boundary, so the creator sets the pitch information so that the pitch changes smoothly between notes, as it does when a person actually sings. In a section containing a note of at least a predetermined length, the pitch information is mechanical, with the pitch held unchanged for the length of the note, so the creator sets the degree of vibrato as a numerical value. The settings are then sent to the apparatus main body 12 (S8). Although not illustrated, provisional speech synthesis may be performed for a candidate setting so that the creator can listen while making the settings.
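The smooth pitch change between consecutive notes can be sketched as a linear glide applied to a frame-based pitch contour. The frame representation, the function name `build_contour`, and the use of linear interpolation are assumptions; the text only says that the pitch should change gently between notes.

```python
def build_contour(notes, glide=0):
    """Build a per-frame pitch contour from (pitch, n_frames) notes.

    With glide == 0 the pitch jumps exactly at each note boundary (the
    mechanical case). With glide > 0 the last `glide` frames of each note
    ramp linearly toward the pitch of the following note.
    """
    contour = []
    for pitch, n_frames in notes:
        contour.extend([float(pitch)] * n_frames)
    if glide <= 0:
        return contour

    pos = 0
    for i, (pitch, n_frames) in enumerate(notes[:-1]):
        pos += n_frames                 # index of the next note's first frame
        next_pitch = notes[i + 1][0]
        g = min(glide, n_frames)        # never glide longer than the note
        for k in range(g):
            frac = (k + 1) / (g + 1)
            contour[pos - g + k] = pitch + (next_pitch - pitch) * frac
    return contour


print(build_contour([(60, 4), (64, 4)]))
print(build_contour([(60, 4), (64, 4)], glide=3))
```

The `glide` argument plays the role of the "degree" that the creator would adjust while listening to provisional synthesis.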

Vibrato is now described with reference to FIG. 5. FIG. 5 plots, as a time series, the pitch of the voice frequency sampled every 30 ms. As shown in FIG. 5(A), when the period and the pitch width are constant and this pattern repeats, the scoring means evaluates the passage as vibrato and awards points. Even when the period and pitch width are constant, however, the passage is not evaluated as vibrato if one period is at or above one predetermined number of samples, or at or below another predetermined number of samples. Nor is it evaluated as vibrato, and hence it earns no points, when the pitch width is constant but the variation among the periods is at or above a predetermined number of samples, as shown in FIG. 5(B), or when the period is constant but the variation among the pitch widths is at or above a predetermined pitch, as shown in FIG. 5(C).
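The acceptance conditions described for FIG. 5 can be sketched in Python as follows. The sketch assumes the pitch contour has already been split into cycles, each given as a (period in 30 ms samples, pitch width in cents) pair. The specification only says the limits are "predetermined", so every threshold value, the `min_cycles` requirement, and the name `is_vibrato` are assumptions made for illustration.

```python
def is_vibrato(cycles, min_period=3, max_period=12,
               max_period_spread=2, max_width_spread=30.0, min_cycles=2):
    """Return True when a run of cycles would be evaluated as vibrato.

    cycles -- list of (period_in_samples, pitch_width_in_cents) pairs.
    All numeric limits are hypothetical placeholder values.
    """
    if len(cycles) < min_cycles:
        return False  # the pattern has to repeat
    periods = [period for period, _ in cycles]
    widths = [width for _, width in cycles]
    # A single period at or beyond either limit disqualifies the passage.
    if any(p <= min_period or p >= max_period for p in periods):
        return False
    # Case of FIG. 5(B): the periods vary too much.
    if max(periods) - min(periods) >= max_period_spread:
        return False
    # Case of FIG. 5(C): the pitch widths vary too much.
    if max(widths) - min(widths) >= max_width_spread:
        return False
    return True


if __name__ == "__main__":
    print(is_vibrato([(6, 100.0)] * 4))                         # regular, FIG. 5(A)
    print(is_vibrato([(5, 100.0), (9, 100.0), (6, 100.0)]))     # uneven periods
    print(is_vibrato([(6, 60.0), (6, 140.0), (6, 90.0)]))       # uneven widths
```

Measuring spread as the difference between the largest and smallest value is one possible reading of "variation"; a standard deviation would serve equally well.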

The vibrato of FIG. 5(A) is heard as extremely artificial singing, whereas shifting it away from this regularity makes the singing sound more human. To obtain such singing, the operator can set the individual periods and pitch widths to differ from one another, as shown for example in FIG. 5(D), provided that each period stays within the range of sample counts that is evaluated as vibrato and that the variation among the periods and among the pitch widths stays within the range that is evaluated as vibrato.
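One way to picture a FIG. 5(D) style vibrato is to vary each cycle by a small bounded amount, as in the Python sketch below. The base period, base width, and jitter bounds are assumed values chosen only so that the variation stays small; the functions `humanized_cycles` and `render_vibrato` and the sinusoidal shape are hypothetical and are not taken from the specification.

```python
import math
import random


def humanized_cycles(n_cycles, base_period=6, base_width=100.0,
                     period_jitter=1, width_jitter=10.0, seed=0):
    """Return (period_in_samples, pitch_width_in_cents) pairs with bounded jitter.

    Periods differ by at most `period_jitter` samples and widths by at most
    2 * `width_jitter` cents, so the variation can be kept under whatever
    limits the scoring side applies (all values here are assumptions).
    """
    rng = random.Random(seed)  # seeded so that a setting is reproducible
    cycles = []
    for _ in range(n_cycles):
        period = base_period + rng.randint(0, period_jitter)
        width = base_width + rng.uniform(-width_jitter, width_jitter)
        cycles.append((period, width))
    return cycles


def render_vibrato(cycles):
    """Turn cycles into per-sample pitch offsets (cents) around the note pitch."""
    offsets = []
    for period, width in cycles:
        for k in range(period):
            # The pitch width is peak to peak, so the amplitude is half of it.
            offsets.append((width / 2.0) * math.sin(2.0 * math.pi * k / period))
    return offsets


if __name__ == "__main__":
    cycles = humanized_cycles(4)
    print(cycles)
    print(len(render_vibrato(cycles)), "samples of 30 ms each")
```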

Returning to FIG. 4, when the pitch information correction settings and the vibrato settings are input (S8), the parameter correction means 26 corrects the speech synthesis parameters acquired by the data acquisition means 25 with the input pitch information values in the respective designated singing sections, adds vibrato data of the set vibrato value, and sends the result to the speech synthesis means 23 as the modified speech synthesis parameters (S9).
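Step S9 can be sketched as a small Python function that copies the reference parameters, overwrites the pitch values in one section, and attaches a vibrato entry to another. The dictionary layout, the key names, and the function `apply_corrections` are assumptions made for illustration; the specification does not define a data format.

```python
from copy import deepcopy


def apply_corrections(reference_params, pitch_edit, vibrato_edit):
    """Return modified parameters; the reference parameters are left untouched.

    pitch_edit   -- ((start, end), values): new pitch values for frames start..end-1
    vibrato_edit -- ((start, end), depth): vibrato of the given degree for that section
    (Hypothetical structure, not the format used by the apparatus.)
    """
    (p_start, p_end), values = pitch_edit
    if len(values) != p_end - p_start:
        raise ValueError("pitch values must cover the section exactly")

    modified = deepcopy(reference_params)
    modified["pitch"][p_start:p_end] = values

    section, depth = vibrato_edit
    modified.setdefault("techniques", []).append(
        {"type": "vibrato", "section": section, "depth": depth}
    )
    return modified


if __name__ == "__main__":
    reference = {"pitch": [60.0] * 4 + [62.0] * 4}
    modified = apply_corrections(reference, ((2, 4), [60.7, 61.4]), ((4, 8), 3))
    print(modified)
```

Keeping the reference parameters unchanged matters in the first embodiment, because the reference speech synthesis data generated from them is what the modified data is later compared against.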

As shown in FIG. 4(B), the speech synthesis means 23 generates modified speech synthesis data from the modified speech synthesis parameters, the lyric character information, and the phoneme data, and sends the modified speech synthesis data to the scoring means 24 (S10). The scoring means 24 acquires and stores the modified speech synthesis data 43 from the speech synthesis means 23, performs scoring by comparing the stored reference speech synthesis data 42 with the modified speech synthesis data 43, and sends the scoring result, together with a display screen for choosing between confirmation and resetting, to the display unit 14 via the display control unit 27 (S11).

For example, when the scoring result is not a perfect score, the creator selects resetting (S12), the process returns to step S4, and the settings are made again. When the scoring result is not a perfect score, the scoring means 24 may also cause the display unit 14 to display the scoring information for each section. This makes it easier to make corrections with the parameter correction means 26 that lead to a perfect score.

When the scoring result is a perfect score and the creator is satisfied, the creator sends a confirmation signal to the apparatus main body 12 (S12), and the speech synthesis means 23 stores the modified speech synthesis data in the guide vocal DB 30 as the guide vocal data of the song, in association with the song ID (S13). The creator may also be allowed to listen to the modified guide vocal that achieved the perfect score, so that even with a perfect score the settings can be made again to bring out a somewhat more human quality. Other singing sections, other speech synthesis parameters, and other singing techniques can also be set.
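The cycle of steps S9 to S13 (correct, synthesize, score, then reset or confirm) can be summarized in a short Python sketch. The callables `synthesize`, `score`, and `revise` are hypothetical stand-ins for the speech synthesis means, the scoring means, and the creator's resetting. The `max_rounds` limit is an added assumption so that the sketch always terminates; in the specification the creator decides when to stop.

```python
def refine_guide_vocal(params, synthesize, score, revise,
                       full_score=100, max_rounds=10):
    """Repeat synthesis and scoring until the score reaches `full_score`.

    Returns (params, score, rounds_used). The three callables are
    placeholders supplied by the caller, not APIs of the apparatus.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    for round_no in range(1, max_rounds + 1):
        result = score(synthesize(params))        # S10 and S11
        if result >= full_score or round_no == max_rounds:
            return params, result, round_no       # S12 confirm, then S13
        params = revise(params, result)           # S12 reset, back to S4


if __name__ == "__main__":
    # Toy stand-ins: the "audio" is just the vibrato depth, and any depth
    # above 110 loses one point per unit.
    final = refine_guide_vocal(
        {"depth": 130},
        synthesize=lambda p: p["depth"],
        score=lambda audio: 100 - max(0, audio - 110),
        revise=lambda p, result: {"depth": p["depth"] - 10},
    )
    print(final)
```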

In this way, by correcting the speech synthesis parameters, which are one element of guide vocal generation, and by adding singing techniques, a guide vocal can be generated that sounds more human while remaining within the range that earns a high score.

Next, FIG. 6 shows a block diagram of a second embodiment of the karaoke guide vocal generating apparatus according to the present invention. In the first embodiment described above, the guide vocal generating apparatus includes the performance DB 28, the singing technique DB 29, and the guide vocal DB 30, and the data acquisition means 25 acquires data from these databases. As shown in FIG. 6, the apparatus may instead omit these databases and use the data input means 31 to input from outside only the data needed to generate the guide vocal of an arbitrary karaoke song. More specifically, the data files may be downloaded from a server via a transmission path or obtained via a storage medium.

Furthermore, the speech synthesis means 23 may first generate speech synthesis data using the lyric character information input by the data input means 31, a plurality of speech synthesis parameters including the pitch information, timing information, and volume information contained in the reference data input in the same way, and the phoneme data. While the generated speech synthesis data is played back by the reproduction means 32 and checked, the speech synthesis parameters may be corrected by the parameter correction means 26 so that the result sounds human, and the speech synthesis means 23 may then generate the modified speech synthesis data 46.

The reproduction means 32 converts the speech synthesis data into an analog signal, amplifies it as appropriate, and emits it from the sound emission means 34, such as a speaker, so that the operator of the apparatus can listen to the speech synthesis data.

In the first embodiment, the scoring means 24 compares the reference speech synthesis data 42 with the modified speech synthesis data 43. In the second embodiment, the scoring means 33 may instead be configured to score the modified speech synthesis data 46 for each predetermined singing section on the basis of the scoring rules 44 and the reference data 45 input by the data input means 31, and to report a score value that totals the scoring results of the singing sections.
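Scoring section by section and then totaling can be sketched in Python as below. The sketch assumes a deliberately simple rule: a frame counts as correct when the synthesized pitch is within a tolerance of the reference pitch, and the overall value is the mean of the section scores. Both the rule and the averaging are assumptions; the actual scoring rules 44 are not disclosed in this form.

```python
def section_score(reference, synthesized, tolerance=50.0):
    """Score one section as the percentage of frames within `tolerance` cents."""
    hits = sum(1 for r, s in zip(reference, synthesized) if abs(r - s) <= tolerance)
    return 100.0 * hits / len(reference)


def total_score(reference, synthesized, sections, tolerance=50.0):
    """Return (per-section scores, overall score) for the given (start, end) sections."""
    per_section = [
        section_score(reference[a:b], synthesized[a:b], tolerance)
        for a, b in sections
    ]
    return per_section, sum(per_section) / len(per_section)


if __name__ == "__main__":
    # Pitches in cents, one value per 30 ms frame (illustrative numbers).
    reference = [6000.0] * 4 + [6200.0] * 4
    synthesized = [6000.0, 6010.0, 6100.0, 6190.0, 6200.0, 6200.0, 6230.0, 6200.0]
    print(total_score(reference, synthesized, [(0, 4), (4, 8)]))
```

Keeping the per-section scores alongside the total makes it possible to point the operator at the section that caused a deduction.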

The scoring rules 44 are the same as those installed in conventional karaoke apparatuses that have a singing scoring function. More specifically, they define scoring criteria that are applied by analyzing the singer's singing voice signal; for example, if the change in pitch within a singing section being scored satisfies predetermined conditions, points are added on the ground that good vibrato has been expressed. The scoring rules 44 provide such a scoring criterion for each of various singing expressions. In the present invention, they are used to analyze and score the modified speech synthesis data, which has been corrected to sound more human, in place of a singer's singing voice signal.

In this way, even when the speech synthesis parameters of a mechanical guide vocal generated from the lyric character information and the reference data are corrected so that it sounds human, scoring the modified speech synthesis data confirms that it has not departed from the range defined as proper singing. A guide vocal can therefore be generated that earns a high score and sounds more human.

The second embodiment may further include display means configured so that, when the modified speech synthesis data 46 is scored by the scoring means 33 and the score value is less than a predetermined value (for example, a perfect score), the scoring information of the singing section to which the corrected speech synthesis parameters relate is displayed. In addition to the score value of that singing section, the scoring information may include the names of the scoring items for the singing expressions that lost points and the degree of deviation from their proper values.
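The scoring information described here (item names and their deviation from the proper values) might be assembled as in the following Python sketch. The item names, the allowed ranges, and the function `deduction_report` are hypothetical; they only illustrate reporting how far a measured value lies outside its accepted range.

```python
def deduction_report(section_id, measured, allowed):
    """List the scoring items of one section that fall outside their allowed range.

    measured -- {item_name: value}
    allowed  -- {item_name: (low, high)} inclusive proper range (assumed values)
    Each deduction is (item_name, signed deviation from the nearest limit).
    """
    deductions = []
    for item, value in measured.items():
        low, high = allowed[item]
        if value < low:
            deductions.append((item, value - low))
        elif value > high:
            deductions.append((item, value - high))
    return {"section": section_id, "deductions": deductions}


if __name__ == "__main__":
    measured = {"vibrato_period": 13, "vibrato_width_spread": 20.0}
    allowed = {"vibrato_period": (4, 11), "vibrato_width_spread": (0.0, 30.0)}
    print(deduction_report(3, measured, allowed))
```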

As a result, when a correction of the speech synthesis parameters made in order to sound human has departed from the range that the reference data and the scoring rules define as proper singing, the speech synthesis parameter that caused the deduction can be corrected again on the basis of the scoring information, and the score of the modified speech synthesis data can be raised. The display unit 14 and the display control unit 27 correspond to the display means.

When the data input means 31, the speech synthesis means 23, the reproduction means 32, the parameter correction means 26, the scoring means 33, and the display means are controlled, the operation unit 13 and the display unit 14 function as the user interface, as in the first embodiment, and the control unit 22 controls the apparatus as a whole.

The guide vocal generating apparatus of the present invention can be used in the industrial field of karaoke apparatuses that have basic karaoke functions.

Description of Symbols
11 Guide vocal generating apparatus
12 Apparatus main body
13 Operation unit
14 Display unit
21 Bus
22 Control unit
23 Speech synthesis means
24, 33 Scoring means
25 Data acquisition means
26 Parameter correction means
27 Display control unit
28 Performance data
29 Technique data
30 Guide vocal database (DB)
31 Data input means
32 Reproduction means
41 Element storage unit
42 Reference speech synthesis data
43, 46 Modified speech synthesis data
44 Scoring rules
45 Reference data

Claims

A guide vocal generating device that generates a guide vocal for supporting the singing of a karaoke song on the basis of performance data that has, for each karaoke song, lyric character information for displaying lyrics to a singer and reference data including pitch information, timing information, and volume information for scoring the singer's singing of the karaoke song for each predetermined singing section, the device comprising:
a performance database that stores the performance data for each song;
a singing technique database that stores technique data obtained by converting various singing techniques used in singing into data;
external input means for at least selecting, from the performance data, the song for which a guide vocal is to be generated, selecting information in the reference data and setting its degree, and selecting the technique data and setting its degree and the specific section to which it is to be added;
data acquisition means for acquiring the lyric character information from the performance database on the basis of the song selected by the external input means, and for acquiring, from the reference data, reference speech synthesis parameters for the pitch information, timing information, and volume information;
parameter correction means for correcting the reference speech synthesis parameters acquired by the data acquisition means in accordance with the settings from the external input means, and for adding the technique data selected by the external input means, in accordance with the degree setting, to the specific section that has been set, to produce modified speech synthesis parameters;
speech synthesis means for generating reference speech synthesis data from the lyric character information acquired by the data acquisition means and the reference speech synthesis parameters, and for generating modified speech synthesis data from the lyric character information acquired by the data acquisition means and the modified speech synthesis parameters produced by the parameter correction means; and
scoring means for performing scoring by comparing the reference speech synthesis data and the modified speech synthesis data generated by the speech synthesis means,
wherein, when the settings are changed from the external input means on the basis of the scoring result of the scoring means, the parameter correction means corrects the modified speech synthesis parameters in accordance with the changed settings, the speech synthesis means generates the modified speech synthesis data, and the scoring means performs the scoring.

The guide vocal generating device according to claim 1,
further comprising display means,
wherein the scoring means causes the display means to display the scoring information of each section when the scoring result is not a perfect score.

A guide vocal generating device that generates a guide vocal for supporting the singing of an arbitrary karaoke song on the basis of lyric character information, corresponding to the karaoke song, for displaying lyrics to a singer, and reference data including pitch information, timing information, and volume information for scoring the singer's singing of the karaoke song for each predetermined singing section, the device comprising:
data input means for inputting the lyric character information and the reference data;
speech synthesis means for generating speech synthesis data on the basis of the lyric character information input by the data input means and a plurality of speech synthesis parameters including the pitch information, timing information, and volume information contained in the reference data input by the data input means;
reproduction means for audibly reproducing the speech synthesis data generated by the speech synthesis means;
parameter correction means for correcting the speech synthesis parameters; and
scoring means for scoring, for each predetermined singing section, the modified speech synthesis data generated by the speech synthesis means from the plurality of speech synthesis parameters including the speech synthesis parameter corrected by the parameter correction means, on the basis of the reference data input by the data input means and predetermined scoring rules.

The guide vocal generating device according to claim 3,
further comprising display means,
wherein, when the scoring result is less than a predetermined value, the scoring means causes the display means to display the scoring information of the singing section to which the speech synthesis parameter corrected by the parameter correction means relates.

A guide vocal generating method for generating a guide vocal for supporting the singing of an arbitrary karaoke song on the basis of lyric character information, corresponding to the karaoke song, for displaying lyrics to a singer, and reference data including pitch information, timing information, and volume information for scoring the singer's singing of the karaoke song for each predetermined singing section, the method comprising:
a data input step of inputting the lyric character information and the reference data;
a speech synthesis step of generating speech synthesis data on the basis of the lyric character information and a plurality of speech synthesis parameters including the pitch information, the timing information, and the volume information;
a reproduction step of reproducing the speech synthesis data;
a parameter correction step of correcting the speech synthesis parameters;
a modified speech synthesis step of generating modified speech synthesis data on the basis of the plurality of speech synthesis parameters including the corrected speech synthesis parameter;
a scoring step of scoring the modified speech synthesis data for each predetermined singing section on the basis of the reference data and predetermined scoring rules; and
a display step of displaying, when the scoring result is less than a predetermined value, the scoring information of the singing section to which the corrected speech synthesis parameter relates.