JP5122598B2

JP5122598B2 - Speech input evaluation system, control method for speech input evaluation system, and program

Info

Publication number: JP5122598B2
Application number: JP2010079805A
Authority: JP
Inventors: 記央花輪
Original assignee: Konami Digital Entertainment Co Ltd
Current assignee: Konami Digital Entertainment Co Ltd
Priority date: 2010-03-30
Filing date: 2010-03-30
Publication date: 2013-01-16
Anticipated expiration: 2030-03-30
Also published as: JP2011209654A

Description

The present invention relates to a speech input evaluation system, a control method for a speech input evaluation system, and a program.

Speech input evaluation systems that evaluate voice input performed by a user in time with music are known. For example, karaoke systems that evaluate a user's singing performed along with accompaniment music are known (see, for example, Patent Document 1).

Patent Document 1: JP-A-2005-208196

In a karaoke system such as the one described above, the user's singing voice is compared with a standard voice to determine the evaluation of the user's singing. In general, the smaller the difference between the user's singing voice and the standard voice, the higher the evaluation given to the user. In conventional karaoke systems, however, a user who is not good at singing cannot obtain a high evaluation and may feel dissatisfied. For this reason, it is strongly desired that such a karaoke system provide, for example, a function that assists a user who is not good at singing in obtaining a relatively high evaluation.

The present invention has been made in view of the above problem. Its object is to provide a speech input evaluation system, a control method for a speech input evaluation system, and a program that, in a speech input evaluation system that evaluates voice input performed by a user in time with music, can assist the user, for example, in obtaining a relatively high evaluation.

To solve the above problem, a speech input evaluation system according to the present invention is a speech input evaluation system that evaluates voice input performed by a user in time with music, and includes: voice input means for the user to input voice; means for acquiring model data stored in means that stores the model data, the model data indicating model voice input to be performed by the user in time with the music; acquisition means for acquiring an evaluation criterion that associates a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with the voice input performed via the voice input means, with evaluation information relating to an evaluation of the user's voice input; comparison means for comparing the model voice input indicated by the model data with the voice input performed via the voice input means; and evaluation means for determining an evaluation of the user's voice input based on the comparison result of the comparison means and the evaluation criterion, wherein the acquisition means changes the evaluation criterion based on feature information of the user's voice input via the voice input means.

A control method for a speech input evaluation system according to the present invention is a control method for a speech input evaluation system that evaluates voice input performed by a user in time with music, and includes: a step of acquiring model data stored in means that stores the model data, the model data indicating model voice input to be performed by the user in time with the music; an acquisition step of acquiring an evaluation criterion that associates a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with voice input performed via voice input means for the user to input voice, with evaluation information relating to an evaluation of the user's voice input; a comparison step of comparing the model voice input indicated by the model data with the voice input performed via the voice input means; and an evaluation step of determining an evaluation of the user's voice input based on the comparison result of the comparison step and the evaluation criterion, wherein the acquisition step includes a step of changing the evaluation criterion based on feature information of the user's voice input via the voice input means.

A program according to the present invention is a program for causing a computer to function as a speech input evaluation system that evaluates voice input performed by a user in time with music. The program causes the computer to function as: means for acquiring model data stored in means that stores the model data, the model data indicating model voice input to be performed by the user in time with the music; acquisition means for acquiring an evaluation criterion that associates a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with voice input performed via voice input means for the user to input voice, with evaluation information relating to an evaluation of the user's voice input; comparison means for comparing the model voice input indicated by the model data with the voice input performed via the voice input means; and evaluation means for determining an evaluation of the user's voice input based on the comparison result of the comparison means and the evaluation criterion, wherein the acquisition means changes the evaluation criterion based on feature information of the user's voice input via the voice input means.

An information storage medium according to the present invention is a computer-readable information storage medium on which the above program is recorded.

According to the present invention, in a speech input evaluation system that evaluates voice input performed by a user in time with music, it becomes possible, for example, to assist the user in obtaining a relatively high evaluation.

In one aspect of the present invention, the acquisition means may acquire, from among evaluation criteria stored in means that stores an evaluation criterion in association with each of a plurality of feature conditions relating to voice feature information, the evaluation criterion associated with the feature condition that is satisfied by the feature information of the user's voice input via the voice input means.

In one aspect of the present invention, the acquisition means may include means for acquiring a basic evaluation criterion stored in means that stores the basic evaluation criterion, and may acquire the evaluation criterion by changing the basic evaluation criterion based on the feature information of the user's voice input via the voice input means.
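The aspect in which a basic evaluation criterion is changed according to the feature information of the user's voice can be sketched as follows. This is only an illustrative sketch: the threshold values, the `VoiceFeatures` fields, the `adjust_criteria` function, and the choice to relax the criterion for a loud voice or a voice with vibrato are all assumptions introduced here, not details stated in the text.

```python
from dataclasses import dataclass

# Hypothetical basic criterion: (maximum pitch deviation, evaluation).
# The numeric limits are placeholders, not values from the patent.
BASE_CRITERIA = [(0.5, "EXCELLENT"), (1.0, "GREAT"), (2.0, "GOOD"), (3.0, "ALMOST")]


@dataclass
class VoiceFeatures:
    mean_volume: float  # assumed to be normalized to the range 0.0-1.0
    has_vibrato: bool


def adjust_criteria(base, features, loud_threshold=0.7, widen=1.5):
    """Return a copy of `base` whose limits are widened for each feature met.

    Assumption: a loud voice and vibrato each make the criterion more lenient.
    """
    factor = 1.0
    if features.mean_volume >= loud_threshold:
        factor *= widen
    if features.has_vibrato:
        factor *= widen
    return [(limit * factor, rating) for limit, rating in base]


if __name__ == "__main__":
    print(adjust_criteria(BASE_CRITERIA, VoiceFeatures(0.8, False)))
```

The basic criterion itself is left untouched, so a different adjusted copy can be derived for each user or each song.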

In one aspect of the present invention, the model data may indicate at least a model voice to be input by the user in time with the music. The evaluation criterion may be information that associates a comparison result condition, relating to a result of comparing the pitch of the model voice indicated by the model data with the pitch of the user's voice input via the voice input means, with the evaluation information. The comparison means may compare the pitch of the model voice indicated by the model data with the pitch of the user's voice input via the voice input means.

In one aspect of the present invention, the model data may indicate at least a model timing at which the user should input voice. The evaluation criterion may be information that associates a comparison result condition, relating to a result of comparing the model timing indicated by the model data with the timing at which the user's voice is input via the voice input means, with the evaluation information. The comparison means may compare the model timing indicated by the model data with the timing at which the user's voice is input via the voice input means.

In one aspect of the present invention, the feature information may include at least one of information on the volume of the voice and information on whether vibrato is applied to the voice.

In one aspect of the present invention, the speech input evaluation system may be a karaoke system in which the user sings along with the music, or a game system that executes a game in which the user performs voice input in time with the music.

A diagram showing the hardware configuration of a karaoke system (speech input evaluation system) according to an embodiment of the present invention.
A diagram showing an example of a karaoke screen.
A diagram for explaining a piano roll image.
A diagram showing another example of the karaoke screen.
A diagram showing an example of an evaluation criterion.
A diagram showing another example of the evaluation criterion.
A diagram showing another example of the karaoke screen.
A diagram showing another example of the karaoke screen.
A diagram showing another example of the evaluation criterion.
A diagram showing another example of the evaluation criterion.
A functional block diagram of the karaoke system.
A diagram showing an example of music data.
A diagram showing an example of the contents stored in the storage unit.
A flowchart showing an example of processing executed in the karaoke system.
A diagram showing another example of the evaluation criterion.
A diagram showing another example of the karaoke screen.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. Here, a case is described in which the present invention is applied to a karaoke system, which is one form of a speech input evaluation system that evaluates voice input performed by a user in time with music. In the following, the karaoke system according to the embodiment of the present invention is realized using a consumer game machine (stationary game machine). Note that the karaoke system according to the embodiment may instead be realized using, for example, a portable game machine, an arcade game machine, a mobile phone, a portable information terminal, or a personal computer. The karaoke system according to the embodiment may also be realized as a dedicated karaoke device manufactured for the purpose of providing a karaoke function.

FIG. 1 shows the hardware configuration of a karaoke system (speech input evaluation system) according to an embodiment of the present invention. As shown in FIG. 1, the karaoke system 10 includes a consumer game machine 11, a display unit 30, a voice input unit 31, an audio output unit 32, an optical disc 33 (information storage medium), and a memory card 34 (information storage medium).

The display unit 30, the voice input unit 31, and the audio output unit 32 are connected to the consumer game machine 11. The display unit 30 is, for example, a display device such as a liquid crystal display or a plasma display. The voice input unit 31 is used by the user to input voice. For example, the voice input unit 31 is a voice input device such as a microphone, and converts the input voice into an electrical signal. The audio output unit 32 is, for example, a speaker provided in the display device, or headphones.

The consumer game machine 11 is a computer system and includes a bus 12, a control unit 13, a main memory 14, an image processing unit 15, an audio processing unit 16, an optical disc drive 17, a memory card slot 18, a communication interface (I/F) 19, and an operation unit 20.

The bus 12 is used to exchange addresses and data among the units of the consumer game machine 11. The control unit 13, the main memory 14, the image processing unit 15, the audio processing unit 16, the optical disc drive 17, the memory card slot 18, the communication interface 19, and the operation unit 20 are connected via the bus 12 so that they can exchange data with one another.

The control unit 13 includes, for example, one or more microprocessors, and executes control processing for each unit of the consumer game machine 11 and various kinds of information processing based on, for example, a program read from the optical disc 33. The main memory 14 includes, for example, a RAM, and programs and data read from the optical disc 33 or the memory card 34 are written to it as necessary. The main memory 14 is also used as working memory for the control unit 13.

The image processing unit 15 includes a VRAM, draws a screen on the VRAM based on image data sent from the control unit 13, and displays that screen on the display unit 30. The audio processing unit 16 includes a sound buffer, and outputs from the audio output unit 32 various audio data (music, sound effects, messages, and so on) read into the sound buffer from the optical disc 33 or the memory card 34. The audio processing unit 16 also supplies the audio signal output from the voice input unit 31 to the control unit 13.

The optical disc drive 17 reads programs and data recorded on the optical disc 33. Here the optical disc 33 is used to supply programs and data to the consumer game machine 11, but any other information storage medium, such as the memory card 34, may be used. Programs and data may also be supplied to the consumer game machine 11 from a remote location via a data communication network such as the Internet.

The communication interface 19 is an interface for connecting to a data communication network such as the Internet. The memory card slot 18 is an interface into which the memory card 34 is inserted. The memory card 34 includes a nonvolatile memory (for example, an EEPROM) and stores various data. The karaoke system 10 may also include a hard disk device (auxiliary storage device). Programs and data described here as being stored on the optical disc 33 or the memory card 34 may instead be stored on the hard disk device.

The operation unit 20 is used by the user to perform operations and includes a plurality of operation members. The state of each operation member of the operation unit 20 is scanned at a fixed interval (for example, every 1/60 second), and an operation signal representing the scan result is supplied to the control unit 13. The control unit 13 determines the user's operation based on the operation signal.

In the karaoke system 10, the user sings along with accompaniment music. The accompaniment music and the user's singing voice are mixed and output from the audio output unit 32. The karaoke system 10 also gives an evaluation of the user's singing.

FIG. 2 shows an example of a karaoke screen displayed on the display unit 30. The karaoke screen 40 displays lyrics 41 to be sung by the user. On the karaoke screen 40 shown in FIG. 2, "ABCDEFGHIJKL" is displayed as the lyrics 41.

The karaoke screen 40 also displays a piano roll image 42 that guides the user as to both the timing (period) at which the lyrics 41 should be sung and the pitch at which the user should sing. The piano roll image 42 imitates a piano roll, which is commonly used as a way of notating pitch.

A T-axis, which is a time axis, and a P-axis, which is an axis relating to pitch, are set for the piano roll image 42. In the example shown in FIG. 2, the horizontal direction of the piano roll image 42 is the T-axis direction and the vertical direction is the P-axis direction.

FIG. 3 is a diagram for explaining the piano roll image 42. The piano roll image 42 shown in FIG. 3 includes a plurality of pitch images 43a to 43w corresponding to a plurality of pitches. In the following, the pitch images 43a to 43w may be referred to collectively as the "pitch images 43".

Each pitch image 43 is a rectangular image that is longer in the horizontal direction (T-axis direction) than in the vertical direction (P-axis direction). The white pitch images 43 correspond to the pitches of the white keys of a piano, and the hatched pitch images 43 correspond to the pitches of the black keys.

The pitch images 43a to 43w are arranged in the vertical direction (P-axis direction). A pitch image 43 corresponding to a higher pitch is displayed higher up in the piano roll image 42. That is, the pitch image 43a displayed at the bottom of the piano roll image 42 corresponds to the lowest of the plurality of pitches, and the pitch image 43w displayed at the top corresponds to the highest.

The piano roll image 42 is divided vertically into five regions (a first region 42a, second regions 42b and 42d, and third regions 42c and 42e). The first region 42a is located at the center of the piano roll image 42. In the first region 42a are arranged a pitch image 43l corresponding to the basic pitch, three pitch images 43m, 43n, and 43o corresponding to three pitches higher than the basic pitch, and three pitch images 43i, 43j, and 43k corresponding to three pitches lower than the basic pitch. In other words, seven pitch images 43i to 43o, corresponding to the seven pitches of one octave centered on the basic pitch, are arranged in the first region 42a.

The second region 42b is adjacent to the upper side of the first region 42a, and the third region 42c is adjacent to the upper side of the second region 42b. In the second region 42b are arranged four pitch images 43p, 43q, 43r, and 43s corresponding to four pitches higher than the pitch corresponding to the pitch image 43o. In the third region 42c are arranged four pitch images 43t, 43u, 43v, and 43w corresponding to four pitches higher still than the pitch corresponding to the pitch image 43s.

The second region 42d, on the other hand, is adjacent to the lower side of the first region 42a, and the third region 42e is adjacent to the lower side of the second region 42d. In the second region 42d are arranged four pitch images 43e, 43f, 43g, and 43h corresponding to four pitches lower than the pitch corresponding to the pitch image 43i. In the third region 42e are arranged four pitch images 43a, 43b, 43c, and 43d corresponding to four pitches lower still than the pitch corresponding to the pitch image 43e.

The piano roll image 42 differs from an ordinary piano roll in that the width of the pitch images 43 in the P-axis direction is not constant. The width in the P-axis direction of each of the pitch images 43a to 43w is set according to the difference between the pitch corresponding to that pitch image and the basic pitch. For example, the width in the P-axis direction of the pitch image 43s, which corresponds to a pitch relatively far from the basic pitch, is set smaller than that of the pitch image 43o, which corresponds to a pitch relatively close to the basic pitch.

That is, the pitch images 43e to 43h and 43p to 43s arranged in the second regions 42b and 42d are narrower in the P-axis direction than the pitch images 43i to 43o arranged in the first region 42a. Likewise, the pitch images 43a to 43d and 43t to 43w arranged in the third regions 42c and 42e are narrower still in the P-axis direction than the pitch images 43e to 43h and 43p to 43s arranged in the second regions 42b and 42d. This makes it possible to display pitch images 43 for as many pitches as possible within a relatively limited screen area.
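The region-dependent widths can be illustrated with a short sketch. The region boundaries follow the description (the basic pitch plus or minus three steps for the first region, then four steps for each second and third region); the pixel values in `ROW_HEIGHT` and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical heights (P-axis widths) in pixels for each region.
ROW_HEIGHT = {1: 12, 2: 8, 3: 5}


def region_of(offset):
    """Map a pitch offset from the basic pitch (in steps) to region 1, 2, or 3."""
    distance = abs(offset)
    if distance <= 3:
        return 1  # first region: pitch images 43i to 43o
    if distance <= 7:
        return 2  # second regions: 43e to 43h and 43p to 43s
    if distance <= 11:
        return 3  # third regions: 43a to 43d and 43t to 43w
    raise ValueError("pitch is outside the 23 displayed pitch images")


def row_height(offset):
    """P-axis width of the pitch image at the given offset from the basic pitch."""
    return ROW_HEIGHT[region_of(offset)]


def total_height():
    """Total P-axis extent of all 23 pitch images (offsets -11 to +11)."""
    return sum(row_height(offset) for offset in range(-11, 12))


if __name__ == "__main__":
    print(total_height())
```

With these placeholder values the 23 pitch images occupy 188 pixels, compared with 276 pixels if every image used the first-region height.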

As shown in FIG. 2, a reference line 44 and a singing icon 45 are displayed on the piano roll image 42. The reference line 44 is displayed at the position on the piano roll image 42 that corresponds to the present time. That is, the reference line 44 indicates the present time.

The singing icon 45, on the other hand, is displayed on the piano roll image 42 (on the reference line 44) at the position corresponding to the pitch of the user's voice input via the voice input unit 31. That is, the singing icon 45 is displayed on the pitch image 43 corresponding to the pitch of the user's voice, and thereby shows the user the pitch of his or her own voice.

In this embodiment, when the singing icon 45 is about to move out of the first region 42a (for example, in the case shown in FIG. 2, when the pitch of the user's voice becomes a pitch that does not correspond to any of the pitch images 43i to 43o in the first region 42a), the piano roll image 42 scrolls so that the singing icon 45 is displayed at the center of the piano roll image 42 in the P-axis direction. The singing icon 45 is therefore always displayed within the first region 42a.
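One way to picture this scrolling rule is to treat the pitch at the center of the piano roll image as a value that is updated whenever the user's pitch leaves the first region. The sketch below assumes pitches are represented as integer step numbers and that the roll recenters exactly on the user's pitch; both the representation and the function name `update_center_pitch` are assumptions made for illustration.

```python
def update_center_pitch(center_pitch, user_pitch, half_span=3):
    """Return the pitch to show at the center of the piano roll.

    While the user's pitch stays within `half_span` steps of the center
    (the first region), the roll does not scroll. Otherwise the roll
    scrolls so that the user's pitch becomes the new center.
    """
    if abs(user_pitch - center_pitch) > half_span:
        return user_pitch
    return center_pitch


if __name__ == "__main__":
    center = 60
    for sung in (61, 63, 65, 64):
        center = update_center_pitch(center, sung)
        print(sung, center)
```

Under this rule the singing icon is never more than three steps from the center, which matches the statement that it always stays within the first region.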

A model voice guidance image 46 is also displayed on the piano roll image 42. The model voice guidance image 46 guides the user as to both the timing (period) at which the lyrics 41 should be sung and the pitch of the voice the user should produce (the model voice). The display position of the model voice guidance image 46 is set to a position corresponding to both the time remaining until the timing at which the user should sing the lyrics 41 and the pitch of the model voice.

For example, the P-axis coordinate of the display position of the model voice guidance image 46 is set to the coordinate corresponding to the pitch of the model voice. The T-axis coordinate of the display position is set to the coordinate corresponding to the time remaining until the timing at which the user should sing the lyrics 41. The display position of the model voice guidance image 46 is therefore set so that the less time remains until that timing, the shorter the distance from the reference line 44. As a result, the model voice guidance image 46 moves from right to left as time passes and approaches the reference line 44, and it overlaps the reference line 44 at the timing at which the user should sing. For example, the leading end 46a of the model voice guidance image 46 reaches the reference line 44 at the timing at which the user should start singing the lyrics 41.
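The relationship between remaining time and T-axis position can be written as a simple linear mapping. This is a minimal sketch: the text only states that the distance from the reference line shrinks as the remaining time shrinks, so the linear form, the `pixels_per_second` scroll speed, and the function name `guide_x` are assumptions.

```python
def guide_x(reference_x, note_start, now, pixels_per_second=100.0):
    """T-axis position of the leading end of the model voice guidance image.

    `note_start` and `now` are times in seconds. The image lies to the
    right of the reference line while time remains, reaches the line when
    the remaining time is zero, and continues to the left afterwards.
    """
    remaining = note_start - now
    return reference_x + remaining * pixels_per_second


if __name__ == "__main__":
    for t in (3.0, 4.0, 5.0, 6.0):
        print(t, guide_x(200.0, 5.0, t))
```

Evaluating the function once per frame with the current playback time produces the right-to-left motion described above.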

FIG. 4 shows an example of the karaoke screen 40 at a point when a predetermined time has elapsed from the point shown in FIG. 2. On the karaoke screen 40 shown in FIG. 4, part of the model voice guidance image 46 has passed the reference line 44. That is, FIG. 4 shows a state in which the first part of the lyrics 41 has already been sung.

On the karaoke screen 40 shown in FIG. 4, the lyrics 41 include a black character portion 41a and a white character portion 41b. The black character portion 41a indicates the part whose singing timing has already passed, that is, the part the user should already have finished singing. The white character portion 41b indicates the part whose singing timing is yet to come, that is, the part the user is about to sing. The first character of the white character portion 41b corresponds to the sound the user should produce next.

歌詞４１の色の変化と模範音声案内画像４６の移動とは同期している。このため、ユーザは歌詞４１と模範音声案内画像４６との両方を参照することによって、歌うべき歌詞と、その歌詞を歌うべきタイミング（期間）と、その歌詞をどの音高で歌うべきかと、を把握することができる。 The change in the color of the lyrics 41 and the movement of the model voice guidance image 46 are synchronized. For this reason, by referring to both the lyrics 41 and the model voice guidance image 46, the user can grasp the lyrics to be sung, the timing (period) at which those lyrics should be sung, and the pitch at which they should be sung.

また、ユーザは歌唱アイコン４５と模範音声案内画像４６とを参照することによって、ユーザが発すべき音声（模範音声）の音高と、ユーザが実際に発している音声の音高と、が一致しているか否かも把握することができる。 Further, by referring to the singing icon 45 and the model voice guidance image 46, the user can also grasp whether or not the pitch of the voice to be uttered (the model voice) matches the pitch of the voice the user is actually uttering.

カラオケシステム１０ではユーザの歌唱に評価が与えられる。例えば、ユーザが歌唱すべき音声（模範音声）の音高と、ユーザが実際に歌唱した音声の音高との間のずれ（Δｐ）に基づいて、「ＥＸＣＥＬＬＥＮＴ」、「ＧＲＥＡＴ」、「ＧＯＯＤ」、「ＡＬＭＯＳＴ」、及び「ＢＯＯ」のうちのいずれかの評価が与えられる。「ＥＸＣＥＬＬＥＮＴ」が最も高い評価であり、「ＢＯＯ」が最も低い評価である。図５は評価判断基準の一例を示す図であり、音高のずれ（Δｐ）と評価との関係の一例を示す。音高のずれ（Δｐ）が小さいほど、ユーザに与えられる評価が高くなる。 In the karaoke system 10, an evaluation is given to the user's song. For example, “EXCELLENT”, “GREAT”, “GOOD” based on the difference (Δp) between the pitch of the voice that the user should sing (model voice) and the pitch of the voice that the user actually sang. , “ALMOST”, and “BOO” are given. “EXCELLENT” is the highest evaluation, and “BOO” is the lowest evaluation. FIG. 5 is a diagram showing an example of evaluation criteria, and shows an example of the relationship between pitch deviation (Δp) and evaluation. The smaller the pitch deviation (Δp), the higher the evaluation given to the user.

また、例えば、ユーザが歌唱すべきタイミング（模範タイミング）と、ユーザが実際に歌唱したタイミングとの間のずれ（Δｔ）に基づいて、「ＥＸＣＥＬＬＥＮＴ」、「ＧＲＥＡＴ」、「ＧＯＯＤ」、「ＡＬＭＯＳＴ」、及び「ＢＯＯ」のうちのいずれかの評価が与えられる。図６は評価判断基準の一例を示す図であり、タイミングのずれ（Δｔ）と評価との関係の一例を示す。タイミングのずれ（Δｔ）が小さいほど、ユーザに与えられる評価が高くなる。 Also, for example, one of the evaluations “EXCELLENT”, “GREAT”, “GOOD”, “ALMOST”, and “BOO” is given based on the deviation (Δt) between the timing at which the user should sing (the model timing) and the timing at which the user actually sang. FIG. 6 is a diagram showing an example of the evaluation criteria, and shows an example of the relationship between the timing deviation (Δt) and the evaluation. The smaller the timing deviation (Δt), the higher the evaluation given to the user.

なお、例えば、音高のずれ（Δｐ）に関する評価とタイミングのずれ（Δｔ）に関する評価とのうちの低い方がユーザの歌唱に対する評価になる。例えば、音高のずれ（Δｐ）に関する評価が「ＧＲＥＡＴ」であり、かつ、タイミングのずれ（Δｔ）に関する評価が「ＧＯＯＤ」である場合、ユーザの歌唱に対する評価は「ＧＯＯＤ」になる。 Note that, for example, the lower of the evaluation regarding the pitch shift (Δp) and the evaluation regarding the timing shift (Δt) is the evaluation of the user's song. For example, when the evaluation regarding the pitch shift (Δp) is “GREAT” and the evaluation regarding the timing shift (Δt) is “GOOD”, the evaluation of the user's song is “GOOD”.
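The rule that the lower of the two evaluations becomes the evaluation of the user's singing can be sketched as follows. The function and list names are hypothetical; only the five evaluation labels and their order come from the description.

```python
# Evaluations ordered from lowest to highest, as stated in the description.
RATINGS = ["BOO", "ALMOST", "GOOD", "GREAT", "EXCELLENT"]

def overall_rating(pitch_rating, timing_rating):
    """Return the lower of the pitch evaluation and the timing evaluation."""
    return min(pitch_rating, timing_rating, key=RATINGS.index)
```

For the example given above, `overall_rating("GREAT", "GOOD")` returns `"GOOD"`.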

ところで、歌唱アイコン４５の大きさ（本実施形態の場合、歌唱アイコン４５の半径）は、所定の評価（ここでは「ＧＲＥＡＴ」とする。）以上の評価を得ることができるような音高のずれ（Δｐ）及びタイミングのずれ（Δｔ）の大きさに対応する大きさに設定されている。このため、歌唱アイコン４５と模範音声案内画像４６とが重なっている場合には、音高のずれ（Δｐ）及びタイミングのずれ（Δｔ）の大きさが、「ＧＲＥＡＴ」以上の評価を得ることができるような範囲内におさまっていることになる。このため、ユーザは歌唱アイコン４５と模範音声案内画像４６とが重なるような音高及びタイミングで歌唱すれば、「ＧＲＥＡＴ」以上の評価を得ることができることになる。 Incidentally, the size of the singing icon 45 (in the present embodiment, the radius of the singing icon 45) is set to a size corresponding to the magnitudes of the pitch deviation (Δp) and the timing deviation (Δt) for which an evaluation equal to or higher than a predetermined evaluation (here, “GREAT”) can be obtained. Consequently, when the singing icon 45 and the model voice guidance image 46 overlap, the pitch deviation (Δp) and the timing deviation (Δt) fall within the range for which an evaluation of “GREAT” or higher can be obtained. The user can therefore obtain an evaluation of “GREAT” or higher by singing at a pitch and timing such that the singing icon 45 and the model voice guidance image 46 overlap.

例えば、図４に示すカラオケ画面４０では、歌唱アイコン４５と模範音声案内画像４６とが重なっている。この場合、「ＧＲＥＡＴ」以上の評価がユーザに与えられる。一方、例えば、図７に示すカラオケ画面４０では、歌唱アイコン４５と模範音声案内画像４６とが重なっていない。この場合、「ＧＲＥＡＴ」よりも低い評価がユーザに与えられる。 For example, on the karaoke screen 40 shown in FIG. 4, the singing icon 45 and the model voice guidance image 46 overlap. In this case, an evaluation of “GREAT” or higher is given to the user. On the other hand, for example, in the karaoke screen 40 shown in FIG. 7, the singing icon 45 and the model voice guidance image 46 do not overlap. In this case, the user is given a lower evaluation than “GREAT”.
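The overlap between the circular singing icon 45 and the model voice guidance image 46 can be checked with an ordinary circle-versus-rectangle test, sketched below. Treating the guidance image as an axis-aligned bar, as well as every name and coordinate convention used here, is an assumption made for illustration; the description does not specify how the overlap is computed.

```python
def icon_overlaps_guide(icon_t, icon_p, radius,
                        bar_t_start, bar_t_end, bar_p, bar_half_height):
    """Return True if a circle (the icon) intersects an axis-aligned bar (the guide)."""
    # Closest point of the bar to the icon centre, found by clamping each coordinate.
    nearest_t = min(max(icon_t, bar_t_start), bar_t_end)
    nearest_p = min(max(icon_p, bar_p - bar_half_height), bar_p + bar_half_height)
    dt = icon_t - nearest_t
    dp = icon_p - nearest_p
    # The shapes overlap when that closest point lies within the icon's radius.
    return dt * dt + dp * dp <= radius * radius
```

Under this sketch a larger radius tolerates larger deviations along both axes, which is consistent with the icon size corresponding to the tolerated Δp and Δt.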

図４及び図７に示すように、カラオケ画面４０には、ユーザに与えられた評価を示すメッセージ５０が表示される。また、カラオケ画面４０には得点４７が表示される。ユーザに与える評価が決定された場合、その評価に対応する評価点がユーザの得点に加算される。評価と評価点との関係は図５及び図６に示す通りである。図５及び図６に示すように、評価が高いほど、評価点も高くなる。 As shown in FIGS. 4 and 7, the karaoke screen 40 displays a message 50 indicating the evaluation given to the user. A score 47 is also displayed on the karaoke screen 40. When the evaluation to be given to the user is determined, the evaluation points corresponding to that evaluation are added to the user's score. The relationship between evaluations and evaluation points is as shown in FIGS. 5 and 6. As shown in FIGS. 5 and 6, the higher the evaluation, the higher the evaluation points.

さらに、カラオケ画面４０にはコンボ数４８が表示される。コンボ数４８は、ユーザが比較的高い評価（例えば「ＥＸＣＥＬＬＥＮＴ」又は「ＧＲＥＡＴ」）を連続して得た回数である。また、カラオケ画面４０にはゲージ４９が表示される。ゲージ４９の長さはユーザに与えられた評価に基づいて変化する。例えば、ユーザに与えられた評価が比較的高い「ＥＸＣＥＬＬＥＮＴ」、「ＧＲＥＡＴ」、又は「ＧＯＯＤ」であった場合にゲージ４９は伸張し、評価が比較的低い「ＡＬＭＯＳＴ」又は「ＢＯＯ」であった場合にゲージ４９は収縮する。 Further, a combo number 48 is displayed on the karaoke screen 40. The combo number 48 is the number of times the user has consecutively obtained a relatively high evaluation (for example, “EXCELLENT” or “GREAT”). A gauge 49 is also displayed on the karaoke screen 40. The length of the gauge 49 changes based on the evaluation given to the user. For example, the gauge 49 extends when the evaluation given to the user is a relatively high one (“EXCELLENT”, “GREAT”, or “GOOD”), and contracts when the evaluation is a relatively low one (“ALMOST” or “BOO”).
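The combo count and gauge behaviour described above can be sketched as follows. The step size, the gauge limits, and the reset of the combo count to zero on any other evaluation are assumptions; the description states only which evaluations count toward the combo and which extend or contract the gauge.

```python
COMBO_RATINGS = {"EXCELLENT", "GREAT"}          # count toward the combo number
GAUGE_UP_RATINGS = {"EXCELLENT", "GREAT", "GOOD"}  # extend the gauge

def update_combo_and_gauge(combo, gauge, rating, step=5, gauge_max=100):
    """Return the updated (combo, gauge) pair after one evaluation."""
    # A consecutive count restarts when the run of high evaluations is broken (assumed).
    combo = combo + 1 if rating in COMBO_RATINGS else 0
    if rating in GAUGE_UP_RATINGS:
        gauge = min(gauge_max, gauge + step)   # extend, clamped to an assumed maximum
    else:
        gauge = max(0, gauge - step)           # contract, clamped at zero
    return combo, gauge
```

Note that under the description a “GOOD” evaluation extends the gauge but does not count toward the combo number.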

ところで、カラオケシステム１０では、歌を歌うことが得意でないユーザであっても、比較的高い評価を得ることができるように補助するための機能を有している。以下、この補助機能について説明する。 Incidentally, the karaoke system 10 has a function for assisting even a user who is not good at singing to obtain a relatively high evaluation. This assist function is described below.

カラオケシステム１０では、ユーザの歌唱音声の音量に基づいて、評価判断基準が変化するようになっている。具体的には、ユーザの歌唱音声の音量が大きくなると、評価判断基準が緩くなり、比較的高い評価を得やすくなるようになっている。このため、歌を歌うことが得意でないユーザであっても、大きい声で歌うことによって、比較的高い評価を得やすくなるようになっている。 In the karaoke system 10, the evaluation criteria are changed based on the volume of the user's singing voice. Specifically, when the volume of the user's singing voice is increased, the evaluation criterion is relaxed, and a relatively high evaluation is easily obtained. For this reason, even a user who is not good at singing can easily obtain a relatively high evaluation by singing with a loud voice.

図８は、ユーザの歌唱音声の音量が大きい場合のカラオケ画面４０の一例を示す。ユーザの歌唱音声の音量が大きい場合、図８に示すように、歌唱アイコン４５の大きさが大きくなる。上述したように、歌唱アイコン４５の大きさは「ＧＲＥＡＴ」以上の評価を得ることができるような音高のずれ（Δｐ）及びタイミングのずれ（Δｔ）の大きさに対応している。このため、この場合、音高のずれ（Δｐ）及びタイミングのずれ（Δｔ）の大きさがある程度大きくても、「ＧＲＥＡＴ」以上の評価をユーザが得ることができるようになる。 FIG. 8 shows an example of the karaoke screen 40 when the volume of the user's singing voice is high. When the volume of the user's singing voice is large, the size of the singing icon 45 increases as shown in FIG. As described above, the size of the singing icon 45 corresponds to the pitch deviation (Δp) and timing deviation (Δt) that can obtain an evaluation equal to or higher than “GREAT”. Therefore, in this case, even if the pitch deviation (Δp) and the timing deviation (Δt) are large to some extent, the user can obtain an evaluation of “GREAT” or higher.

上述したように，図８に示すようなカラオケ画面４０が表示される場合（すなわち、ユーザの歌唱音声の音量が大きい場合）には評価判断基準が緩くなる。図９及び図１０は、図８に示すようなカラオケ画面４０が表示される場合（すなわち、ユーザの歌唱音声の音量が大きい場合）の評価判断基準の一例を示している。 As described above, when the karaoke screen 40 as shown in FIG. 8 is displayed (that is, when the volume of the user's singing voice is high), the evaluation criterion is relaxed. FIGS. 9 and 10 show an example of evaluation criteria when the karaoke screen 40 as shown in FIG. 8 is displayed (that is, when the volume of the user's singing voice is high).

図９に示す評価判断基準は図５に示す評価判断基準に比べて緩くなっている。例えば、図５に示す評価判断基準では、模範音声とユーザの歌唱音声との間の音高のずれ（Δｐ）がＰ１以上であってかつＰ２未満である場合に「ＧＲＥＡＴ」の評価がユーザに与えられるようになっている。これに対し、図９に示す評価判断基準では、音高のずれ（Δｐ）がＰ１以上であってかつＰ２未満である場合に、「ＧＲＥＡＴ」よりも高い評価である「ＥＸＣＥＬＬＥＮＴ」がユーザに与えられるようになっている。 The evaluation criteria shown in FIG. 9 are looser than the evaluation criteria shown in FIG. For example, in the evaluation criteria shown in FIG. 5, if the pitch deviation (Δp) between the model voice and the user's singing voice is equal to or greater than P1 and less than P2, the evaluation of “GREAT” is given to the user. It has come to be given. On the other hand, in the evaluation criteria shown in FIG. 9, when the pitch deviation (Δp) is equal to or greater than P1 and less than P2, “EXCELLENT”, which is higher than “GREAT”, is given to the user. It is supposed to be.

また、図５に示す評価判断基準では、音高のずれ（Δｐ）が「Ｐ２」以上であってかつ「Ｐ３」未満である場合に「ＧＯＯＤ」の評価がユーザに与えられるようになっている。これに対し、図９に示す評価判断基準では、音高のずれ（Δｐ）が「Ｐ２」以上であってかつ「Ｐ３」未満である場合に、「ＧＯＯＤ」よりも高い評価である「ＧＲＥＡＴ」がユーザに与えられるようになっている。 Further, under the evaluation criteria shown in FIG. 5, the evaluation “GOOD” is given to the user when the pitch deviation (Δp) is equal to or greater than “P2” and less than “P3”. In contrast, under the evaluation criteria shown in FIG. 9, when the pitch deviation (Δp) is equal to or greater than “P2” and less than “P3”, the user is given “GREAT”, which is a higher evaluation than “GOOD”.

以上のように、ユーザの歌唱音声の音量が大きくない場合、音高のずれ（Δｐ）が「Ｐ２」未満でなければ、「ＧＲＥＡＴ」以上の評価をユーザが得ることができないが、ユーザの歌唱音声の音量が大きくなると、音高のずれ（Δｐ）が「Ｐ２」以上であっても、音高のずれ（Δｐ）が「Ｐ３」未満であれば、「ＧＲＥＡＴ」以上の評価をユーザが得ることができる。つまり、ユーザの歌唱音声の音量が大きい場合の評価判断基準（図９参照）では、ユーザの歌唱音声の音量が大きくない場合の評価判断基準（図５参照）に比べて、「ＧＲＥＡＴ」以上の評価を得るためにユーザが満たすべき音高のずれ（Δｐ）が大きくなっている。その結果、ユーザが「ＧＲＥＡＴ」以上の評価を得やすくなっている。 As described above, when the volume of the user's singing voice is not high, the user cannot obtain an evaluation of “GREAT” or higher unless the pitch deviation (Δp) is less than “P2”. When the volume of the user's singing voice becomes high, however, the user can obtain an evaluation of “GREAT” or higher even if the pitch deviation (Δp) is equal to or greater than “P2”, as long as it is less than “P3”. In other words, under the evaluation criteria for a high singing volume (see FIG. 9), the pitch deviation (Δp) permitted for an evaluation of “GREAT” or higher is larger than under the evaluation criteria for a singing volume that is not high (see FIG. 5). As a result, it is easier for the user to obtain an evaluation of “GREAT” or higher.

同様に、図１０に示す評価判断基準は図６に示す評価判断基準に比べて緩くなっている。例えば、図６に示す評価判断基準では、ユーザが歌唱すべきタイミング（模範タイミング）と、ユーザが実際に歌唱したタイミングとの間のずれ（Δｔ）が「Ｔ１」以上であってかつ「Ｔ２」未満である場合に「ＧＲＥＡＴ」の評価がユーザに与えられるようになっている。これに対し、図１０に示す評価判断基準では、タイミングのずれ（Δｔ）が「Ｔ１」以上であってかつ「Ｔ２」未満である場合に、「ＧＲＥＡＴ」よりも高い評価である「ＥＸＣＥＬＬＥＮＴ」がユーザに与えられるようになっている。 Similarly, the evaluation criteria shown in FIG. 10 are looser than the evaluation criteria shown in FIG. For example, in the evaluation criteria shown in FIG. 6, the difference (Δt) between the timing when the user should sing (exemplary timing) and the timing when the user actually sings is “T1” or more and “T2”. If it is less, the evaluation of “GREAT” is given to the user. On the other hand, in the evaluation criteria shown in FIG. 10, when the timing shift (Δt) is equal to or greater than “T1” and less than “T2”, “EXCELLENT”, which is higher than “GREAT”, is determined. It is given to the user.

また、図６に示す評価判断基準では、タイミングのずれ（Δｔ）が「Ｔ２」以上であってかつ「Ｔ３」未満である場合に「ＧＯＯＤ」の評価がユーザに与えられるようになっている。これに対し、図１０に示す評価判断基準では、タイミングのずれ（Δｔ）が「Ｔ２」以上であってかつ「Ｔ３」未満である場合に、「ＧＯＯＤ」よりも高い評価である「ＧＲＥＡＴ」がユーザに与えられるようになっている。 Further, according to the evaluation criteria shown in FIG. 6, the evaluation of “GOOD” is given to the user when the timing shift (Δt) is “T2” or more and less than “T3”. On the other hand, in the evaluation criteria shown in FIG. 10, when the timing shift (Δt) is equal to or greater than “T2” and less than “T3”, “GREAT” that is higher than “GOOD” is determined. It is given to the user.

以上のように、ユーザの歌唱音声の音量が小さい場合、タイミングのずれ（Δｔ）が「Ｔ２」未満でなければ、「ＧＲＥＡＴ」以上の評価をユーザが得ることができないが、ユーザの歌唱音声の音量が大きくなると、タイミングのずれ（Δｔ）が「Ｔ２」以上であっても、タイミングのずれ（Δｔ）が「Ｔ３」未満であれば、「ＧＲＥＡＴ」以上の評価をユーザが得ることができる。つまり、ユーザの歌唱音声の音量が大きい場合の評価判断基準（図１０参照）では、ユーザの歌唱音声の音量が大きくない場合の評価判断基準（図６参照）に比べて、「ＧＲＥＡＴ」以上の評価を得るためにユーザが満たすべきタイミングのずれ（Δｔ）が大きくなっている。その結果、ユーザが「ＧＲＥＡＴ」以上の評価を得やすくなっている。 As described above, when the volume of the user's singing voice is low, the user cannot obtain an evaluation of “GREAT” or higher unless the timing deviation (Δt) is less than “T2”. When the volume of the user's singing voice becomes high, however, the user can obtain an evaluation of “GREAT” or higher even if the timing deviation (Δt) is equal to or greater than “T2”, as long as it is less than “T3”. In other words, under the evaluation criteria for a high singing volume (see FIG. 10), the timing deviation (Δt) permitted for an evaluation of “GREAT” or higher is larger than under the evaluation criteria for a singing volume that is not high (see FIG. 6). As a result, it is easier for the user to obtain an evaluation of “GREAT” or higher.
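The two sets of evaluation criteria can be pictured as the same deviation bands mapped to different evaluations, as in the sketch below. The numeric limits are placeholders for P1, P2, P3 (or T1, T2, T3), and a fourth limit and the evaluations assigned beyond the third limit are assumptions, since those bands are not described in the text. Only the bands below the third limit follow the description.

```python
from bisect import bisect_right

# Hypothetical band limits standing in for P1..P4 (or T1..T4).
LIMITS = [1.0, 2.0, 3.0, 4.0]

# Evaluation for each band: [0, L1), [L1, L2), [L2, L3), [L3, L4), [L4, inf).
CRITERIA_NORMAL = ["EXCELLENT", "GREAT", "GOOD", "ALMOST", "BOO"]
# Relaxed criteria: each described band is shifted one evaluation higher.
CRITERIA_RELAXED = ["EXCELLENT", "EXCELLENT", "GREAT", "GOOD", "ALMOST"]

def rate(deviation, criteria, limits=LIMITS):
    """Map a deviation to an evaluation; a band includes its lower limit."""
    return criteria[bisect_right(limits, abs(deviation))]
```

For a deviation between the second and third limits, `rate(2.5, CRITERIA_NORMAL)` gives `"GOOD"` while `rate(2.5, CRITERIA_RELAXED)` gives `"GREAT"`, mirroring the comparison between FIGS. 5 and 9 and between FIGS. 6 and 10.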

以下、上記のような機能を実現するための構成について説明する。図１１は、カラオケシステム１０で実現される機能を示す機能ブロック図である。図１１に示すように、カラオケシステム１０は、記憶部６０、音声出力制御部６１、評価判断基準取得部６２、比較部６３、及び評価部６４を含む。 Hereinafter, a configuration for realizing the above functions will be described. FIG. 11 is a functional block diagram showing functions realized by the karaoke system 10. As shown in FIG. 11, the karaoke system 10 includes a storage unit 60, an audio output control unit 61, an evaluation determination criterion acquisition unit 62, a comparison unit 63, and an evaluation unit 64.

記憶部６０は例えば光ディスク３３、メモリカード３４、及び主記憶１４によって実現される。なお、記憶部６０は、家庭用ゲーム機１１と通信ネットワークを介してデータ授受可能な装置に備えられる補助記憶装置（例えばハードディスク装置）によって実現されるようにしてもよい。すなわち、記憶部６０に記憶されることとして説明するデータの全部又は一部は上記の補助記憶装置に記憶されるようにしてもよい。一方、記憶部６０以外の機能ブロックは、例えば制御部１３が光ディスク３３に記憶されたプログラムを実行することによって実現される。 The storage unit 60 is realized by the optical disc 33, the memory card 34, and the main memory 14, for example. The storage unit 60 may be realized by an auxiliary storage device (for example, a hard disk device) provided in a device that can exchange data with the consumer game machine 11 via a communication network. That is, all or part of the data described as being stored in the storage unit 60 may be stored in the auxiliary storage device. On the other hand, functional blocks other than the storage unit 60 are realized, for example, when the control unit 13 executes a program stored on the optical disc 33.

記憶部６０は各種データを記憶する。本実施形態の場合、記憶部６０は複数の楽曲データを記憶する。図１２は、一の楽曲に対応する楽曲データの一例を示す図である。図１２に示すように、楽曲データは、伴奏音楽データ、歌詞データ、模範データ、及び背景画像データを含む。伴奏音楽データは楽曲の伴奏パートを所定のデータ形式で保存したものである。伴奏音楽データは例えばＭＩＤＩデータ等である。歌詞データは楽曲の歌詞を示すデータである。背景画像データは、カラオケ画面４０の背景として表示される画像を表すものである。 The storage unit 60 stores various data. In the present embodiment, the storage unit 60 stores a plurality of pieces of music data. FIG. 12 is a diagram illustrating an example of the music data corresponding to one piece of music. As shown in FIG. 12, the music data includes accompaniment music data, lyrics data, model data, and background image data. The accompaniment music data is the accompaniment part of the music stored in a predetermined data format, for example MIDI data. The lyrics data is data indicating the lyrics of the music. The background image data represents an image displayed as the background of the karaoke screen 40.

模範データは、音楽に合わせてユーザが行うべき模範の音声入力を示すデータである。すなわち、模範データは、音楽に合わせてユーザが音声を入力すべきタイミング（期間）と、音楽に合わせてユーザが入力すべき音声（模範音声）とを示す。具体的には、模範音声データは、伴奏音楽に合わせてユーザが歌詞を歌うべきタイミング（期間）と、伴奏音楽に合わせて歌を歌うユーザが模範とすべき音声（模範音声）とを示す。 The model data is data indicating a model voice input to be performed by the user in accordance with music. That is, the model data indicates the timing (period) when the user should input the voice in accordance with the music and the voice (model voice) that the user should input in accordance with the music. Specifically, the model voice data indicates a timing (period) when the user should sing lyrics in accordance with the accompaniment music and a voice (model voice) that the user who sings along with the accompaniment music should model.

歌詞データと模範データとは関連づけられている。このため、歌詞の各部分をどのタイミング（期間）において歌うべきかと、歌詞の各部分をどの音高で歌うべきかとの両方が歌詞データと模範データとに基づいて特定されるようになっている。 The lyrics data and the model data are associated with each other. For this reason, both the timing (period) at which each part of the lyrics should be sung and the pitch at which each part of the lyrics should be sung can be identified based on the lyrics data and the model data.
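One possible in-memory layout for music data in which the lyrics and the model data are associated is sketched below. The class names, field names, and the choice to store one record per lyric fragment are hypothetical; the description states only which kinds of data the music data contains and that lyrics and model data are linked.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ModelNote:
    start_sec: float   # start of the period in which the fragment should be sung
    end_sec: float     # end of that period
    pitch: int         # pitch of the model voice (e.g. a MIDI note number, assumed)
    lyric: str         # lyric fragment associated with this note

@dataclass
class SongData:
    accompaniment: bytes                      # e.g. MIDI data
    lyrics: str
    model_notes: List[ModelNote] = field(default_factory=list)
    background_image: bytes = b""

def model_note_at(song: SongData, time_sec: float) -> Optional[ModelNote]:
    """Return the note that should be sung at time_sec, or None during a rest."""
    for note in song.model_notes:
        if note.start_sec <= time_sec < note.end_sec:
            return note
    return None
```

Looking up the note for the current playback time yields both the lyric fragment and the pitch expected at that moment.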

また、記憶部６０は、模範データが示す模範の音声入力と音声入力部３１を介して行われた音声入力との比較結果に基づいて、ユーザの音声入力に対する評価を判断するための評価判断基準を記憶する。評価判断基準は、上記比較結果に関する比較結果条件と、ユーザの音声入力に対する評価に関する評価情報と、を関連づけてなる情報である。本実施形態の場合、図５、図６、図９、及び図１０に示すような評価判断基準が記憶部６０に記憶される。 The storage unit 60 also stores evaluation criteria for determining the evaluation of the user's voice input based on the result of comparing the model voice input indicated by the model data with the voice input performed via the voice input unit 31. The evaluation criteria are information that associates a comparison result condition concerning the comparison result with evaluation information concerning the evaluation of the user's voice input. In the present embodiment, evaluation criteria such as those shown in FIGS. 5, 6, 9, and 10 are stored in the storage unit 60.

図５及び図９に示す評価判断基準は、模範データが示す模範音声の音高と、音声入力部３１を介して入力されたユーザの音声の音高と、の比較結果に関する比較結果条件と、評価情報とを対応づけてなる情報である。図５及び図９に示す評価判断基準の「音高のずれ（ｐ）」フィールドが「比較結果条件」に相当する。また、「評価」及び「評価値」フィールドが「評価情報」に相当する。 The evaluation criteria shown in FIG. 5 and FIG. 9 are the comparison result condition regarding the comparison result between the pitch of the model voice indicated by the model data and the pitch of the voice of the user input through the voice input unit 31. This is information that is associated with evaluation information. The “pitch deviation (p)” field of the evaluation criteria shown in FIGS. 5 and 9 corresponds to “comparison result condition”. The “evaluation” and “evaluation value” fields correspond to “evaluation information”.

図６及び図１０に示す評価判断基準は、模範データが示す模範タイミング（ユーザが音声を入力すべきタイミング）と、音声入力部３１を介してユーザの音声が入力されたタイミングと、の比較結果に関する比較結果条件と、評価情報とを対応づけてなる情報である。図６及び図１０に示す評価判断基準の「タイミングのずれ（ｔ）」フィールドが「比較結果条件」に相当する。また、「評価」及び「評価値」フィールドが「評価情報」に相当する。 The evaluation criteria shown in FIGS. 6 and 10 are comparison results between the model timing indicated by the model data (the timing at which the user should input voice) and the timing at which the user's voice is input via the voice input unit 31. The comparison result condition and the evaluation information are associated with each other. The “timing deviation (t)” field of the evaluation criteria shown in FIGS. 6 and 10 corresponds to the “comparison result condition”. The “evaluation” and “evaluation value” fields correspond to “evaluation information”.

本実施形態の場合、記憶部６０は、音声の特徴情報に関する複数の特徴条件の各々に対応づけて、評価判断基準を記憶する。「音声の特徴情報」は、例えば音声の音量に関する情報である。なお、「音声の特徴情報」は、例えば音声の周波数成分に関する情報であってもよい。 In the case of the present embodiment, the storage unit 60 stores an evaluation criterion in association with each of a plurality of feature conditions related to voice feature information. “Speech feature information” is, for example, information related to sound volume. The “speech feature information” may be, for example, information related to the frequency component of the sound.

図１３は記憶部６０の記憶内容の一例を示す。図１３に示すように、記憶部６０は、音量（ｖ）に関する音量条件と、評価判断基準と、を対応づけて記憶する。図１３において「評価判断基準Ａ」は図５及び図６に示す評価判断基準であり、「評価判断基準Ｂ」は図９及び図１０に示す評価判断基準である。上述したように、図９及び図１０に示す評価判断基準は図５及び図６に示す評価判断基準よりも緩い。このため、図１３に示す例では、比較的小さい音量（ｖ）を示す音量範囲「ｖ＜Ｖｒ」には、比較的厳しい評価判断基準が関連づけられており、比較的大きい音量（ｖ）を示す音量範囲「Ｖｒ≦ｖ」に関しては、比較的緩い評価判断基準が関連づけられている。 FIG. 13 shows an example of the contents stored in the storage unit 60. As illustrated in FIG. 13, the storage unit 60 stores a volume condition regarding the volume (v) and an evaluation determination criterion in association with each other. In FIG. 13, “evaluation judgment standard A” is the evaluation judgment standard shown in FIGS. 5 and 6, and “evaluation judgment standard B” is the evaluation judgment standard shown in FIGS. 9 and 10. As described above, the evaluation criteria shown in FIGS. 9 and 10 are looser than the evaluation criteria shown in FIGS. Therefore, in the example shown in FIG. 13, a relatively strict evaluation criterion is associated with the volume range “v <Vr” indicating a relatively low volume (v), indicating a relatively large volume (v). A relatively loose evaluation criterion is associated with the volume range “Vr ≦ v”.
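The association between volume conditions and evaluation criteria can be sketched as a small lookup table. The labels "A" and "B" follow FIG. 13 as described above; the table layout and function name are assumptions made for illustration.

```python
# Each entry pairs a volume condition with the label of the criteria it selects.
VOLUME_TABLE = [
    (lambda v, vr: v < vr, "A"),    # lower volume: stricter criteria (FIGS. 5 and 6)
    (lambda v, vr: vr <= v, "B"),   # higher volume: looser criteria (FIGS. 9 and 10)
]

def select_criteria(volume, reference_volume):
    """Return the label of the criteria whose volume condition is satisfied."""
    for condition, label in VOLUME_TABLE:
        if condition(volume, reference_volume):
            return label
    raise ValueError("no volume condition matched")
```

A table of conditions rather than a single comparison leaves room for more than two volume ranges, although the description gives only these two.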

音声出力制御部６１は、伴奏音楽データに基づいて伴奏音楽を音声出力部３２から出力する。例えば、音声出力制御部６１は音源を有し、この音源と伴奏音楽データ（ＭＩＤＩデータ）とに基づいて、伴奏音楽を再生する。なお、音声出力制御部６１は、伴奏音楽と、音声入力部３１を介して入力されたユーザの歌唱音声と、を合成して音声出力部３２から出力させる。 The audio output control unit 61 outputs accompaniment music from the audio output unit 32 based on the accompaniment music data. For example, the audio output control unit 61 has a sound source, and reproduces accompaniment music based on the sound source and accompaniment music data (MIDI data). The audio output control unit 61 synthesizes the accompaniment music and the user's singing voice input via the audio input unit 31 and outputs the synthesized music from the audio output unit 32.

評価判断基準取得部６２は評価判断基準を取得する。また、評価判断基準取得部６２は、音声入力部３１を介して入力されたユーザの音声の特徴情報を取得し、ユーザの音声の特徴情報に基づいて評価判断基準を変える。すなわち、評価判断基準取得部６２は、模範データが示す模範の音声入力と音声入力部３１を介して行われた音声入力との比較結果に関する比較結果条件と、ユーザの音声入力に対する評価に関する評価情報との対応関係を、ユーザの音声の特徴情報に基づいて変える。なお、上述したように、「ユーザの音声の特徴情報」は、例えば、ユーザの音声の音量又は周波数成分等である。 The evaluation criterion acquisition unit 62 acquires the evaluation criteria. The evaluation criterion acquisition unit 62 also acquires feature information of the user's voice input via the voice input unit 31, and changes the evaluation criteria based on that feature information. That is, based on the feature information of the user's voice, the evaluation criterion acquisition unit 62 changes the correspondence between the comparison result condition, which concerns the result of comparing the model voice input indicated by the model data with the voice input performed via the voice input unit 31, and the evaluation information concerning the evaluation of the user's voice input. As described above, the “feature information of the user's voice” is, for example, the volume or frequency components of the user's voice.

本実施形態の場合、評価判断基準取得部６２は、記憶部６０に記憶される評価判断基準のうちの、ユーザの音声の音量（ｖ）が満足する音量条件に対応づけられた評価判断基準を取得する（図１３参照）。 In the present embodiment, the evaluation criterion acquisition unit 62 acquires, from among the evaluation criteria stored in the storage unit 60, the evaluation criteria associated with the volume condition that the volume (v) of the user's voice satisfies (see FIG. 13).

比較部６３は、模範データが示す模範の音声入力と、音声入力部３１を介して行われた音声入力と、を比較する。例えば、比較部６３は、音声入力部３１を介して入力されたユーザの歌唱音声の音高を解析し、模範データが示す模範音声の音高と、ユーザの歌唱音声の音高とを比較する。すなわち、比較部６３は、模範データが示す模範音声の音高と、ユーザの歌唱音声の音高との間のずれ（Δｐ）を取得する。 The comparison unit 63 compares the model voice input indicated by the model data with the voice input performed via the voice input unit 31. For example, the comparison unit 63 analyzes the pitch of the user's singing voice input via the voice input unit 31 and compares the pitch of the model voice indicated by the model data with the pitch of the user's singing voice. That is, the comparison unit 63 acquires the deviation (Δp) between the pitch of the model voice indicated by the model data and the pitch of the user's singing voice.

また、例えば、比較部６３は、模範データが示す模範タイミング（ユーザが歌唱すべきタイミング）と、音声入力部３１を介してユーザの歌唱音声が入力されたタイミングとを比較する。すなわち、比較部６３は、模範データが示す模範タイミングと、音声入力部３１を介してユーザの歌唱音声が入力されたタイミングとの間のずれ（Δｔ）を取得する。 Further, for example, the comparison unit 63 compares the model timing indicated by the model data (the timing at which the user should sing) with the timing at which the user's singing voice is input via the voice input unit 31. That is, the comparison unit 63 acquires a deviation (Δt) between the model timing indicated by the model data and the timing when the user's singing voice is input via the voice input unit 31.

評価部６４は、比較部６３の比較結果と、評価判断基準取得部６２によって取得された評価判断基準とに基づいて、ユーザの音声入力に対する評価を判断する。本実施形態の場合、評価部６４は、模範データが示す模範音声の音高とユーザの歌唱音声の音高との間のずれ（Δｐ）と、評価判断基準取得部６２によって取得された評価判断基準（図５，９参照）とに基づいて、ユーザの歌唱に対する評価を判断する。また、評価部６４は、模範データが示す模範タイミングと音声入力部３１を介してユーザの歌唱音声が入力されたタイミングとの間のずれ（Δｔ）と、評価判断基準取得部６２によって取得された評価判断基準（図６，１０参照）とに基づいて、ユーザの歌唱に対する評価を判断する。 The evaluation unit 64 determines the evaluation of the user's voice input based on the comparison result of the comparison unit 63 and the evaluation criteria acquired by the evaluation criterion acquisition unit 62. In the present embodiment, the evaluation unit 64 determines the evaluation of the user's singing based on the deviation (Δp) between the pitch of the model voice indicated by the model data and the pitch of the user's singing voice, and on the evaluation criteria acquired by the evaluation criterion acquisition unit 62 (see FIGS. 5 and 9). The evaluation unit 64 also determines the evaluation of the user's singing based on the deviation (Δt) between the model timing indicated by the model data and the timing at which the user's singing voice was input via the voice input unit 31, and on the evaluation criteria acquired by the evaluation criterion acquisition unit 62 (see FIGS. 6 and 10).

次に、カラオケシステム１０で実行される処理について説明する。図１４はカラオケシステム１０で実行される処理のうちの、本発明に関連する処理を主に示すフロー図である。制御部１３は光ディスク３３に記憶されるプログラムに従って、図１４に示す処理を実行する。制御部１３が図１４に示す処理を実行することによって、図１１に示す機能ブロックが実現される。 Next, the processing executed by the karaoke system 10 will be described. FIG. 14 is a flowchart mainly showing, among the processing executed by the karaoke system 10, the processing related to the present invention. The control unit 13 executes the processing shown in FIG. 14 in accordance with the program stored on the optical disc 33. The functional blocks shown in FIG. 11 are realized by the control unit 13 executing the processing shown in FIG. 14.

図１４に示すように、まず制御部１３は、カラオケ画面４０の表示と伴奏音楽の再生とを開始する（Ｓ１０１）。以降、制御部１３（音声出力制御部６１）は、伴奏音楽と、音声入力部３１を介して入力されるユーザの歌唱音声とを合成して音声出力部３２から出力させる。また、制御部１３は、伴奏音楽の再生が終了するまでの間、ステップＳ１０２〜Ｓ１０９の処理を所定時間（例えば１／６０秒）ごとに繰り返し実行する。 As shown in FIG. 14, first, the control unit 13 starts displaying the karaoke screen 40 and reproducing the accompaniment music (S101). Thereafter, the control unit 13 (audio output control unit 61) synthesizes the accompaniment music and the user's singing voice input via the audio input unit 31 and outputs the synthesized music from the audio output unit 32. Further, the control unit 13 repeatedly executes the processes of steps S102 to S109 every predetermined time (for example, 1/60 seconds) until the accompaniment music is finished playing.

まず、制御部１３（比較部６３）は、模範データと、音声入力部３１を介して入力されたユーザの歌唱音声とに基づいて、歌唱音声と模範音声との比較を実行する（Ｓ１０２）。 First, the control unit 13 (comparison unit 63) performs comparison between the singing voice and the model voice based on the model data and the user's singing voice input via the voice input unit 31 (S102).

ステップＳ１０２において、例えば、制御部１３は模範データに基づいて模範音声の音高を取得する。また、制御部１３は音声入力部３１を介して入力されたユーザの歌唱音声の音高を判断する。そして、制御部１３は歌唱音声の音高と模範音声の音高との間のずれ（Δｐ）を取得する。 In step S102, for example, the control unit 13 acquires the pitch of the model voice based on the model data. Further, the control unit 13 determines the pitch of the user's singing voice input via the voice input unit 31. Then, the control unit 13 acquires a deviation (Δp) between the pitch of the singing voice and the pitch of the model voice.

また、ステップＳ１０２において、例えば、制御部１３は、模範データが示す模範タイミングと、音声入力部３１を介してユーザの歌唱音声が入力されたタイミングとの間のずれ（Δｔ）を取得する。なお、タイミングのずれ（Δｔ）は、個々のタイミングに関して取得するのではなく、個々のパートの歌い始めの部分に関してのみ取得するようにしてもよい。例えば、図２に示す歌詞４１の場合、歌い始めの文字である「Ａ」を発声すべき標準タイミングと、ユーザが実際に「Ａ」を発声したタイミングとの間のずれ（Δｔ）を取得し、他の文字（例えば「Ｂ」等）に関しては、タイミングのずれ（Δｔ）を取得しないようにしてもよい。 In step S102, for example, the control unit 13 acquires a deviation (Δt) between the model timing indicated by the model data and the timing when the user's singing voice is input via the voice input unit 31. Note that the timing shift (Δt) may not be acquired for each timing, but may be acquired only for the beginning part of each part. For example, in the case of the lyrics 41 shown in FIG. 2, the deviation (Δt) between the standard timing at which “A”, which is the first character of the singing, should be uttered, and the timing at which the user actually uttered “A” is acquired. For other characters (for example, “B”, etc.), the timing shift (Δt) may not be acquired.

歌唱音声と模範音声との比較が実行された後、制御部１３（評価判断基準取得部６２）は音声入力部３１を介して入力されたユーザの歌唱音声を分析することによって、ユーザの歌唱音声の音量（ｖ）を取得する（Ｓ１０３）。そして、制御部１３（評価判断基準取得部６２）は、ステップＳ１０３で取得された音量（ｖ）に対応づけられた評価判断基準を記憶部６０から取得する（Ｓ１０４）。例えば、音量（ｖ）が基準音量（Ｖｒ）よりも小さい場合、図５及び図６に示す評価判断基準が取得される。一方、音量（ｖ）が基準音量（Ｖｒ）以上である場合、図９及び図１０に示す評価判断基準が取得される。 After the comparison between the singing voice and the model voice has been executed, the control unit 13 (evaluation criterion acquisition unit 62) acquires the volume (v) of the user's singing voice by analyzing the user's singing voice input via the voice input unit 31 (S103). The control unit 13 (evaluation criterion acquisition unit 62) then acquires, from the storage unit 60, the evaluation criteria associated with the volume (v) acquired in step S103 (S104). For example, when the volume (v) is lower than the reference volume (Vr), the evaluation criteria shown in FIGS. 5 and 6 are acquired. On the other hand, when the volume (v) is equal to or higher than the reference volume (Vr), the evaluation criteria shown in FIGS. 9 and 10 are acquired.

評価判断基準が取得された後、制御部１３（評価部６４）は、ステップＳ１０２における比較結果と、ステップＳ１０４で取得された評価判断基準とに基づいて評価を決定する（Ｓ１０５）。その後、制御部１３は、ステップＳ１０５において決定された評価に基づいて、主記憶１４に記憶される得点を更新する（Ｓ１０６）。例えば、制御部１３は、評価に対応づけられた評価点を、主記憶１４に記憶される得点に加算する。 After the evaluation criteria have been acquired, the control unit 13 (evaluation unit 64) determines the evaluation based on the comparison result of step S102 and the evaluation criteria acquired in step S104 (S105). The control unit 13 then updates the score stored in the main memory 14 based on the evaluation determined in step S105 (S106). For example, the control unit 13 adds the evaluation points associated with the evaluation to the score stored in the main memory 14.

その後、制御部１３はカラオケ画面４０を更新する（Ｓ１０７）。ステップＳ１０７では、例えば下記に説明するような処理が実行される。 Thereafter, the control unit 13 updates the karaoke screen 40 (S107). In step S107, for example, processing as described below is executed.

例えば、ステップＳ１０７ではピアノロール画像４２をスクロールするための処理が実行される。具体的には、まず、音声入力部３１を介して入力されているユーザの歌唱音声の音高が基本範囲に含まれているか否かが判定される。ここで、「基本範囲」とは、例えば、ピアノロール画像４２における基本音高を中心とする１オクターブの範囲である。例えば図２に示すピアノロール画像４２の場合であれば、「基本範囲」とは、ピアノロール画像４２の第１領域４２ａに含まれる音高画像４３ｉに対応する音高から、第１領域４２ａに含まれる音高画像４３ｏに対応する音高までの範囲である。 For example, in step S107, a process for scrolling the piano roll image 42 is executed. Specifically, first, it is determined whether or not the pitch of the user's singing voice input via the voice input unit 31 is included in the basic range. Here, the “basic range” is, for example, a range of one octave centered on the basic pitch in the piano roll image 42. For example, in the case of the piano roll image 42 shown in FIG. 2, the “basic range” refers to the first area 42 a from the pitch corresponding to the pitch image 43 i included in the first area 42 a of the piano roll image 42. This is a range up to the pitch corresponding to the pitch image 43o included.

ユーザの歌唱音声の音高が基本範囲に含まれていない場合とは、ユーザの音声の音高が、第１領域４２ａに含まれる音高画像４３に対応する音高でなくなった場合である。この場合、制御部１３は、ユーザの歌唱音声の音高に基づいて、ピアノロール画像４２における基本音高を変更する。すなわち、ユーザの歌唱音声の音高がピアノロール画像４２における基本音高として設定される。 The case where the pitch of the user's singing voice is not included in the basic range is a case where the pitch of the user's voice is no longer the pitch corresponding to the pitch image 43 included in the first region 42a. In this case, the control unit 13 changes the basic pitch in the piano roll image 42 based on the pitch of the user's singing voice. That is, the pitch of the user's singing voice is set as the basic pitch in the piano roll image 42.

When the basic pitch of the piano roll image 42 is changed, the piano roll image 42 is updated based on the new basic pitch. Because the basic pitch rises or falls, the piano roll image 42 scrolls as a result.
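The scrolling rule above can be sketched as follows. This is a minimal sketch under stated assumptions: pitches are represented as semitone numbers, the basic range is taken as the basic pitch plus or minus six semitones, and the function name and parameters are hypothetical.

```python
def update_base_pitch(base_pitch, sung_pitch, half_range=6):
    """Return the basic pitch to use for the next screen update.

    The basic range is modeled as base_pitch +/- half_range. Counting
    pitches in semitones (an assumption), half_range=6 gives one octave
    centered on the basic pitch.
    """
    if abs(sung_pitch - base_pitch) <= half_range:
        # The sung pitch is still inside the basic range: no scrolling.
        return base_pitch
    # Outside the basic range: the sung pitch becomes the new basic pitch,
    # which makes the piano roll scroll up or down.
    return sung_pitch
```

For example, with a basic pitch of 60, a sung pitch of 64 leaves the basic pitch unchanged, while a sung pitch of 70 makes 70 the new basic pitch.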

Also in step S107, the singing icon 45 is displayed based on the pitch of the user's singing voice input via the voice input unit 31. That is, the singing icon 45 is displayed on the pitch image 43 corresponding to the pitch of the user's singing voice.

Note that the size of the singing icon 45 is set based on the volume (v) of the user's singing voice input via the voice input unit 31. When the volume (v) of the user's singing voice is lower than the reference volume (Vr), the singing icon 45 is set to its normal size (the size shown in FIG. 2). When the volume (v) is equal to or higher than the reference volume (Vr), the singing icon 45 is set to a size larger than normal (the size shown in FIG. 8).

Also in step S107, the lyrics 41 and the model voice guidance image 46 are displayed based on the lyrics data and the model data. The score 47, the combo number 48, and the gauge 49 are also updated in step S107. In addition, a message 50 indicating the evaluation determined in step S107 is displayed.

After the karaoke screen 40 is updated, the control unit 13 determines whether playback of the accompaniment music has finished (S108). If playback has not finished, the control unit 13 executes the processing of step S102 again. If playback has finished, the control unit 13 displays a results screen on the display unit 30 (S109), and this processing ends.

According to the karaoke system 10 described above, the evaluation criterion used to judge the user's singing can be changed based on, for example, the volume of the user's singing voice. For example, when the volume of the user's singing voice is relatively high, the evaluation criterion can be made more lenient than when it is not.

As a result, even a user who is not good at singing can more easily obtain a relatively high evaluation by singing in a relatively loud voice. That is, the system can assist even a user who is not good at singing in obtaining a relatively high evaluation.

Note that the present invention is not limited to the embodiment described above.

(1) For example, in the embodiment described above, the evaluation criterion for the case where the volume (v) is less than the reference volume (Vr) (see FIGS. 5 and 6) and the evaluation criterion for the case where the volume (v) is equal to or higher than the reference volume (Vr) (see FIGS. 9 and 10) are both stored in the storage unit 60 in advance.

However, the storage unit 60 may, for example, store only one evaluation criterion in advance as a basic evaluation criterion. The evaluation criterion acquisition unit 62 may then generate another evaluation criterion based on the basic evaluation criterion. That is, the evaluation criterion acquisition unit 62 may generate another evaluation criterion by changing the basic evaluation criterion.

In this case, the storage unit 60 may store change information, which indicates how the basic evaluation criterion should be changed, in association with a volume condition (a feature condition relating to feature information of the voice). The evaluation criterion acquisition unit 62 may then generate another evaluation criterion by changing the basic evaluation criterion based on the change information associated with the volume condition satisfied by the volume (feature information) of the user's voice input via the voice input unit 31.

Specifically, for example, the storage unit 60 may store in advance, as the basic evaluation criterion, the evaluation criterion for the case where the volume (v) is less than the reference volume (Vr) (see FIGS. 5 and 6). The evaluation criterion acquisition unit 62 may then generate the evaluation criterion for the case where the volume (v) is equal to or higher than the reference volume (Vr) (see FIGS. 9 and 10) by changing the evaluation criterion for the case where the volume (v) is less than the reference volume (Vr) (see FIGS. 5 and 6).

In this case, for example, the storage unit 60 stores change information indicating how the evaluation criterion for the case where the volume (v) is less than the reference volume (Vr) (see FIGS. 5 and 6) should be changed when the volume (v) is equal to or higher than the reference volume (Vr). For example, the change information indicates that the evaluation and evaluation value for a pitch deviation (Δp) that is "P1" or more and less than "P2" are changed from "GREAT" and "7" to "EXCELLENT" and "10". Likewise, the change information indicates that the evaluation and evaluation value for a timing deviation (Δt) that is "T1" or more and less than "T2" are changed from "GREAT" and "7" to "EXCELLENT" and "10".

Then, when the volume (v) of the user's singing voice input via the voice input unit 31 is equal to or higher than the reference volume (Vr), the evaluation criterion acquisition unit 62 generates the evaluation criterion for the case where the volume (v) is equal to or higher than the reference volume (Vr) (see FIGS. 9 and 10) by changing, based on the change information, the evaluation criterion for the case where the volume (v) is less than the reference volume (Vr) (see FIGS. 5 and 6).
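One way to picture modification (1) is as a base table plus a set of overrides. The sketch below is illustrative only: the dictionary layout, the names BASE_CRITERIA, CHANGE_INFO_LOUD, and acquire_criteria, and the use of interval labels as keys are assumptions. Only the intervals and values named in the text are included; the remaining rows of FIGS. 5, 6, 9, and 10 are omitted.

```python
# Basic evaluation criterion (case v < Vr), keyed by the deviation
# interval [lower, upper) that each entry covers.
BASE_CRITERIA = {
    "pitch": {("P1", "P2"): ("GREAT", 7)},
    "timing": {("T1", "T2"): ("GREAT", 7)},
}

# Change information for the case v >= Vr, taken from the example in the text.
CHANGE_INFO_LOUD = {
    "pitch": {("P1", "P2"): ("EXCELLENT", 10)},
    "timing": {("T1", "T2"): ("EXCELLENT", 10)},
}


def acquire_criteria(volume, reference_volume,
                     base=BASE_CRITERIA, change=CHANGE_INFO_LOUD):
    """Return the criterion to use for the given volume."""
    if volume < reference_volume:
        return base
    # Copy each table and overwrite only the entries named in the change
    # information, leaving the stored basic criterion untouched.
    return {
        axis: {**table, **change.get(axis, {})}
        for axis, table in base.items()
    }
```

Storing only the differences keeps the basic criterion as the single source of truth, at the cost of building the derived table each time it is needed.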

(2) Also, for example, in the embodiment described above, an evaluation criterion is stored in association with each of two volume conditions (volume ranges) (see FIG. 13), but an evaluation criterion may instead be stored in association with each of three or more volume conditions (volume ranges).
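With three or more volume ranges, selecting the criterion amounts to finding the range that contains the measured volume. The following sketch assumes two hypothetical boundary values (giving three ranges) and uses placeholder strings in place of full criterion tables; none of these names or values come from the description.

```python
from bisect import bisect_right

# Hypothetical boundaries between volume ranges; two boundaries give
# three ranges: [0, 0.3), [0.3, 0.6), [0.6, inf).
VOLUME_BOUNDS = [0.3, 0.6]

# Placeholder identifiers standing in for the criterion stored for each range.
CRITERIA_BY_RANGE = ["strict", "normal", "lenient"]


def criteria_for_volume(volume):
    """Return the criterion associated with the range containing the volume."""
    # A volume equal to a boundary belongs to the upper range, matching the
    # "equal to or higher than" convention used for the reference volume.
    return CRITERIA_BY_RANGE[bisect_right(VOLUME_BOUNDS, volume)]
```

Adding a further range only requires one more boundary and one more table entry.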

(3) Also, for example, the evaluation criterion acquisition unit 62 may change the evaluation criterion based on whether vibrato, a periodic frequency fluctuation in singing, is applied to the user's singing voice. In this case, the information on whether vibrato is applied to the user's singing voice corresponds to the "feature information of the voice".

For example, when vibrato is applied to the user's singing voice, the evaluation criterion acquisition unit 62 may change the evaluation criterion relating to the pitch deviation (Δp). FIG. 15 shows an example of the evaluation criterion used when vibrato is applied to the user's singing voice.

The evaluation criterion shown in FIG. 15 is more lenient than the one shown in FIG. 5. For example, under the criterion shown in FIG. 5, the user is given "GREAT" or "GOOD" when the pitch deviation (Δp) between the model voice and the user's singing voice is "P1" or more and less than "P3". Under the criterion shown in FIG. 15, by contrast, the user is given "EXCELLENT", which is higher than both "GREAT" and "GOOD", when the pitch deviation (Δp) is "P1" or more and less than "P3".

Similarly, under the criterion shown in FIG. 5, the user is given "ALMOST" when the pitch deviation (Δp) is "P3" or more and less than "P4". Under the criterion shown in FIG. 15, by contrast, the user is given "GREAT", which is higher than "ALMOST", when the pitch deviation (Δp) is "P3" or more and less than "P4".

When vibrato is not applied to the user's singing voice, the user cannot obtain an evaluation of "GREAT" or higher unless the pitch deviation (Δp) is less than "P2". When vibrato is applied, however, the user can obtain "GREAT" or higher even if the pitch deviation (Δp) is "P2" or more, as long as it is less than "P4". In other words, under the criterion for singing with vibrato (see FIG. 15), the pitch deviation (Δp) permitted for an evaluation of "GREAT" or higher is larger than under the criterion for singing without vibrato (see FIG. 5). As a result, it is easier for the user to obtain "GREAT" or higher.
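The comparison between FIG. 5 and FIG. 15 can be checked with a small sketch. The two tables below contain only the intervals and ratings stated in the text; the dictionary layout and the function names are hypothetical, and how vibrato is detected is outside the scope of this sketch.

```python
# Ratings from lowest to highest, as ordered in the text.
RANK = ["ALMOST", "GOOD", "GREAT", "EXCELLENT"]

# Rating for each pitch-deviation interval [lower, upper).
NO_VIBRATO = {("P1", "P2"): "GREAT", ("P2", "P3"): "GOOD", ("P3", "P4"): "ALMOST"}
VIBRATO = {("P1", "P2"): "EXCELLENT", ("P2", "P3"): "EXCELLENT", ("P3", "P4"): "GREAT"}


def pitch_criteria(has_vibrato):
    """Select the pitch criterion according to the vibrato flag."""
    return VIBRATO if has_vibrato else NO_VIBRATO


def great_limit(table):
    """Return the upper bound of the last interval rated GREAT or better."""
    uppers = [
        upper
        for (_lower, upper), label in table.items()
        if RANK.index(label) >= RANK.index("GREAT")
    ]
    # The labels "P1".."P4" sort in threshold order, so max() works here.
    return max(uppers)
```

Here great_limit(pitch_criteria(False)) gives "P2" and great_limit(pitch_criteria(True)) gives "P4", which matches the statement that vibrato widens the deviation allowed for "GREAT" or higher.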

FIG. 16 shows an example of the karaoke screen 40 in this case. Here, only the evaluation criterion relating to the pitch deviation (Δp) is relaxed, and the criterion relating to the timing deviation (Δt) is not, so the singing icon 45 takes a shape elongated in the P-axis direction, as shown in FIG. 16.

According to modification (3), the evaluation criterion used to judge the user's singing can be changed based on whether vibrato is applied to the user's singing voice. This makes singing with vibrato more enjoyable.

(4) For example, the present invention can also be applied to evaluating a user's voice input in a game system (voice input evaluation system) that executes a game in which the user performs game operations and voice input in time with music. For example, the present invention can be applied to a game system that executes a game in which the user sings while dancing to music, or a game in which the user hums while dancing to music. The present invention can also be applied to a game system that executes a game in which the user sings while performing, in time with music, game operations that simulate playing a musical instrument (for example, drums or a guitar).

DESCRIPTION OF SYMBOLS: 10 karaoke system, 11 home game machine, 12 bus, 13 control unit, 14 main memory, 15 image processing unit, 16 audio processing unit, 17 optical disc drive, 18 memory card slot, 19 communication interface, 20 operation unit, 30 display unit, 31 voice input unit, 32 audio output unit, 33 optical disc, 34 memory card, 40 karaoke screen, 41 lyrics, 42 piano roll image, 42a first region, 42b, 42d second region, 42c, 42e third region, 43a to 43w pitch image, 44 reference line, 45 singing icon, 46 model voice guidance image, 47 score, 48 combo number, 49 gauge, 50 message, 60 storage unit, 61 audio output control unit, 62 evaluation criterion acquisition unit, 63 comparison unit, 64 evaluation unit.

Claims

In a voice input evaluation system for evaluating voice input performed by a user according to music,
Voice input means for the user to input voice;
Means for acquiring the model data stored in the means for storing model data indicating a model voice input to be performed by the user in accordance with the music;
Acquisition means for acquiring an evaluation criterion in which a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with the voice input performed through the voice input means, is associated with evaluation information relating to the evaluation of the user's voice input;
Comparison means for comparing the voice input of the model indicated by the model data and the voice input performed through the voice input means;
Evaluation means for determining an evaluation of the user's voice input based on the comparison result of the comparison means and the evaluation criterion acquired by the acquisition means;
Including
The acquisition means changes the evaluation criterion used in determining the evaluation of the user's voice input, based on the volume of the user's voice input through the voice input means, by acquiring, from among the evaluation criteria stored in means for storing an evaluation criterion in association with each of a plurality of volume ranges, the evaluation criterion associated with the volume range to which the volume of the user's voice input through the voice input means belongs,
A speech input evaluation system characterized by that.

In a voice input evaluation system for evaluating voice input performed by a user according to music,
Voice input means for the user to input voice;
Means for acquiring the model data stored in the means for storing model data indicating a model voice input to be performed by the user in accordance with the music;
Acquisition means for acquiring an evaluation criterion in which a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with the voice input performed through the voice input means, is associated with evaluation information relating to the evaluation of the user's voice input;
Comparison means for comparing the voice input of the model indicated by the model data and the voice input performed through the voice input means;
Evaluation means for determining an evaluation of the user's voice input based on the comparison result of the comparison means and the evaluation determination criteria acquired by the acquisition means;
Including
The acquisition means includes:
Means for acquiring a basic evaluation criterion stored in means for storing the basic evaluation criterion, and
Means for changing the evaluation criterion used in determining the evaluation of the user's voice input, based on the volume of the user's voice input through the voice input means, by acquiring the evaluation criterion through changing the basic evaluation criterion in accordance with the change information that, among the change information stored in means for storing change information indicating how to change the basic evaluation criterion in association with each of a plurality of volume ranges, is associated with the volume range to which the volume of the user's voice input through the voice input means belongs,
A speech input evaluation system characterized by that.

The speech input evaluation system according to claim 1 or 2,
The model data indicates at least a model voice to be input by the user in accordance with the music,
The evaluation criterion is information that associates the evaluation information with a comparison result condition relating to a result of comparing the pitch of the model voice indicated by the model data with the pitch of the user's voice input through the voice input means,
The comparison means compares the pitch of the model voice indicated by the model data with the pitch of the user's voice input through the voice input means.
A speech input evaluation system characterized by that.

The speech input evaluation system according to any one of claims 1 to 3,
The model data indicates at least a model timing at which the user should input voice,
The evaluation criterion is information that associates the evaluation information with a comparison result condition relating to a result of comparing the model timing indicated by the model data with the timing at which the user's voice is input through the voice input means,
The comparing means compares the model timing indicated by the model data with the timing when the user's voice is input via the voice input means.
A speech input evaluation system characterized by that.

The speech input evaluation system according to any one of claims 1 to 4,
The voice input evaluation system is a karaoke system in which a user sings according to the music or a game system that executes a game in which the user performs voice input according to the music.

In a control method of a voice input evaluation system for evaluating voice input performed by a user according to music,
Obtaining the model data stored in a means for storing model data indicating a model voice input to be performed by the user according to the music;
An acquisition step of acquiring an evaluation criterion in which a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with the voice input performed through voice input means for the user to input voice, is associated with evaluation information relating to the evaluation of the user's voice input;
A comparison step of comparing the voice input of the model indicated by the model data with the voice input made through the voice input means;
An evaluation step of determining an evaluation of the user's voice input based on the comparison result in the comparison step and the evaluation criterion acquired in the acquisition step;
Including
The acquisition step includes a step of changing the evaluation criterion used in determining the evaluation of the user's voice input, based on the volume of the user's voice input through the voice input means, by acquiring, from among the evaluation criteria stored in means for storing an evaluation criterion in association with each of a plurality of volume ranges, the evaluation criterion associated with the volume range to which the volume of the user's voice input through the voice input means belongs,
A control method for a speech input evaluation system.

A program for causing a computer to function as a voice input evaluation system for evaluating voice input performed by a user in accordance with music,
Means for acquiring the model data stored in means for storing model data indicating a model voice input to be performed by the user in accordance with the music;
Acquisition means for acquiring an evaluation criterion in which a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with the voice input performed through voice input means for the user to input voice, is associated with evaluation information relating to the evaluation of the user's voice input;
Comparison means for comparing the voice input of the model indicated by the model data with the voice input made through the voice input means; and
Evaluation means for determining an evaluation of the user's voice input based on the comparison result of the comparison means and the evaluation criterion acquired by the acquisition means,
The program causes the computer to function as each of the above means,
The acquisition means changes the evaluation criterion used in determining the evaluation of the user's voice input, based on the volume of the user's voice input through the voice input means, by acquiring, from among the evaluation criteria stored in means for storing an evaluation criterion in association with each of a plurality of volume ranges, the evaluation criterion associated with the volume range to which the volume of the user's voice input through the voice input means belongs,
A program characterized by that.

In a control method of a voice input evaluation system for evaluating voice input performed by a user according to music,
Obtaining the model data stored in a means for storing model data indicating a model voice input to be performed by the user according to the music;
An acquisition step of acquiring an evaluation criterion in which a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with the voice input performed through voice input means for the user to input voice, is associated with evaluation information relating to the evaluation of the user's voice input;
A comparison step of comparing the voice input of the model indicated by the model data with the voice input made through the voice input means;
An evaluation step of determining an evaluation of the user's voice input based on the comparison result in the comparison step and the evaluation determination criterion acquired in the acquisition step;
Including
The acquisition step includes:
A step of acquiring a basic evaluation criterion stored in means for storing the basic evaluation criterion; and
A step of changing the evaluation criterion used in determining the evaluation of the user's voice input, based on the volume of the user's voice input through the voice input means, by acquiring the evaluation criterion through changing the basic evaluation criterion in accordance with the change information that, among the change information stored in means for storing change information indicating how to change the basic evaluation criterion in association with each of a plurality of volume ranges, is associated with the volume range to which the volume of the user's voice input through the voice input means belongs,
A control method for a speech input evaluation system.

A program for causing a computer to function as a voice input evaluation system for evaluating voice input performed by a user in accordance with music,
Means for acquiring the model data stored in means for storing model data indicating a model voice input to be performed by the user in accordance with the music;
Acquisition means for acquiring an evaluation criterion in which a comparison result condition, relating to a result of comparing the model voice input indicated by the model data with the voice input performed through voice input means for the user to input voice, is associated with evaluation information relating to the evaluation of the user's voice input;
Comparison means for comparing the voice input of the model indicated by the model data with the voice input made through the voice input means; and
Evaluation means for determining an evaluation of the user's voice input based on the comparison result of the comparison means and the evaluation criterion acquired by the acquisition means,
The program causes the computer to function as each of the above means,
The acquisition means includes:
Means for acquiring a basic evaluation criterion stored in means for storing the basic evaluation criterion, and
Means for changing the evaluation criterion used in determining the evaluation of the user's voice input, based on the volume of the user's voice input through the voice input means, by acquiring the evaluation criterion through changing the basic evaluation criterion in accordance with the change information that, among the change information stored in means for storing change information indicating how to change the basic evaluation criterion in association with each of a plurality of volume ranges, is associated with the volume range to which the volume of the user's voice input through the voice input means belongs,
A program characterized by that.