JP2011203383A

JP2011203383A - Karaoke system

Info

Publication number: JP2011203383A
Application number: JP2010068815A
Authority: JP
Inventors: Yoshihiro Kito; 芳弘鬼頭
Original assignee: Xing Inc
Current assignee: Xing Inc
Priority date: 2010-03-24
Filing date: 2010-03-24
Publication date: 2011-10-13

Abstract

PROBLEM TO BE SOLVED: To provide a karaoke system that achieves performance evaluation close to the impression a user actually has. SOLUTION: The karaoke system includes: a pitch and sound volume detector 86 that detects pitch and sound volume per unit time from voice information input through a microphone 40; a multivariate analyzer 90 that, for the voice information of each of a plurality of users, performs multivariate analysis on 2N kinds of data corresponding to the pitch and sound volume per unit time detected N consecutive times by the pitch and sound volume detector 86; and a performance evaluator 100 that, at a pitch switching part in the evaluation reference data, evaluates the performance of a target user's voice information based on the analysis results obtained by the multivariate analyzer 90 for the plurality of users' voice information. As a result, the part of a performance that connects notes of different pitch is evaluated appropriately.

Description

The present invention relates to a karaoke system using a karaoke device that outputs a song selected from a large number of songs, and more particularly to an improvement for realizing performance evaluation close to the impression a user actually has.

Music playback devices that output a song selected from a large number of songs are known. One example is the karaoke device used in karaoke boxes and similar venues. Such a karaoke device outputs the music information of a karaoke song selected from a large number of karaoke songs stored in advance in a storage device and, in synchronization with that output, displays on a screen a video containing the song's lyrics, so that users can enjoy a karaoke performance of the desired song.

In recent years, widely used karaoke devices have come to include not only the music playback function described above but also a performance evaluation function that evaluates (scores) singing ability and the like based on voice information input from a voice input device (microphone). The karaoke device described in Patent Document 1 is one example. Such a karaoke device compares the voice information input from the voice input device with the performance information of the karaoke song in terms of pitch, tempo, volume, and the like, and scores the singing according to the input voice information.

Patent Document 1: JP-A-9-101794

However, performance evaluation of the kind used in the prior art described above is normally carried out mechanically, following a predetermined algorithm and using performance information such as MIDI (Musical Instrument Digital Interface) data as reference data. As a result, the evaluation does not always match how users perceive a performance: a performance that users actually find skillful may not receive a fair score, while, depending on the singing style, a poor performance may receive a high score.

In the course of continued intensive research aimed at developing a karaoke system whose performance evaluation is close to the impression users actually have, the present inventor noted that the user's impression and the evaluation result tend to diverge most strongly in the parts of a performance where the pitch switches in the reference data. The inventor attributed this divergence to the fact that the pitch in the reference data switches in steps, whereas the user's singing voice changes continuously over time, so that even a singer who connects notes of different pitch skillfully cannot be evaluated correctly. In other words, how a singer handles the part of a performance that connects notes of different pitch is an important factor in the impression users actually form and cannot be ignored, yet no karaoke system capable of suitably evaluating such parts has been developed to date.

The present invention has been made against the background of the above circumstances, and its object is to provide a karaoke system that realizes performance evaluation close to the impression a user actually has.

To achieve this object, the gist of the present invention is a karaoke system using a karaoke device that outputs a song selected from a large number of songs and amplifies and outputs voice input through a voice input device, the karaoke system comprising: pitch/volume detection means for detecting pitch and volume per unit time from the voice information input through the voice input device; multivariate analysis means for performing, for the voice information of each of a plurality of users, multivariate analysis on 2N kinds of data corresponding to the pitch and volume per unit time detected N consecutive times by the pitch/volume detection means; and performance evaluation means for evaluating, at a pitch switching part in the evaluation reference data, the performance of a target user's voice information based on the analysis results obtained by the multivariate analysis means for the voice information of the plurality of users.

With this configuration, the karaoke system comprises the pitch/volume detection means for detecting pitch and volume per unit time from the voice information input through the voice input device, the multivariate analysis means for performing, for the voice information of each of a plurality of users, multivariate analysis on 2N kinds of data corresponding to the pitch and volume per unit time detected N consecutive times by the pitch/volume detection means, and the performance evaluation means for evaluating, at a pitch switching part in the evaluation reference data, the performance of a target user's voice information based on the analysis results obtained by the multivariate analysis means for the voice information of the plurality of users. The part of a performance corresponding to a connection between notes of different pitch can therefore be suitably evaluated. That is, a karaoke system that realizes performance evaluation close to the impression a user actually has can be provided.

Preferably, the multivariate analysis means includes: principal component analysis means for performing principal component analysis on the 2N kinds of data from n = 1 to n = N, where n = 0 is the starting point of the change in pitch detected by the pitch/volume detection means; eigenvalue/eigenvector calculation means for calculating 2N pairs of eigenvalues and eigenvectors from the result of that principal component analysis; variance calculation means for calculating, for the 2N pairs of eigenvalues and eigenvectors, the variance of the eigenvalues with each eigenvector as an axis; and ranking means for ranking the 2N pairs of eigenvalues and eigenvectors in descending order of the variance calculated by the variance calculation means. The performance evaluation means preferentially uses the higher-ranked eigenvalues and eigenvectors, as determined by the ranking means, as parameters for the performance evaluation. In this way, the part of a performance corresponding to a connection between notes of different pitch can be suitably evaluated in a practical manner.
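The preferred form described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `rank_principal_components`, the layout of the input matrix (one row per user, N pitch columns followed by N volume columns), and the use of the sample covariance matrix are all assumptions. The sketch also reads "the variance with each eigenvector as an axis" as the variance of the samples projected onto that eigenvector, which for principal component analysis equals the corresponding eigenvalue.

```python
import numpy as np

def rank_principal_components(samples):
    """Rank the 2N eigenvalue/eigenvector pairs by projected variance.

    samples: array of shape (num_users, 2N). Each row is assumed to hold
    N pitch values followed by N volume values for one user.
    Returns (eigenvalues, eigenvectors, variances), all ordered from the
    largest variance to the smallest; eigenvectors are stored as columns.
    """
    x = np.asarray(samples, dtype=float)
    centered = x - x.mean(axis=0)

    # Principal component analysis via the sample covariance matrix.
    cov = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # ascending order

    # Ranking step: largest variance first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Variance of the samples along each eigenvector axis.
    scores = centered @ eigenvectors
    variances = scores.var(axis=0, ddof=1)
    return eigenvalues, eigenvectors, variances
```

An evaluator built on this sketch would take the first few columns of `eigenvectors` as its evaluation parameters. At least two users are needed for the covariance matrix to be defined.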

FIG. 1 is a schematic diagram illustrating a karaoke system to which the present invention is suitably applied.
FIG. 2 is a block diagram illustrating the configuration of a karaoke device provided in the karaoke system of FIG. 1.
FIG. 3 is a block diagram illustrating the configuration of a server device provided in the karaoke system of FIG. 1.
FIG. 4 is a functional block diagram explaining the main control functions provided in the CPU of the karaoke device of FIG. 2 and the CPU of the server device of FIG. 3.
FIG. 5 is a diagram explaining the reference data used in general karaoke performance evaluation control, including that of this embodiment; measured values of the input voice information are shown by a solid line, and the reference data serving as the evaluation standard is shown by regions enclosed by broken lines.
FIG. 6 is a diagram explaining the problem with performance evaluation control based on reference data such as that shown in FIG. 5.
FIG. 7 is a diagram explaining how the karaoke device of this embodiment shown in FIG. 2 detects pitch and volume at a pitch switching part in the reference data.
FIG. 8 illustrates a pattern space showing two of the eigenvectors from the 2N pairs of eigenvalues and eigenvectors calculated by the karaoke device of this embodiment shown in FIG. 2, with the value of each sample plotted as a point.
FIG. 9 is a graph comparing the results of multivariate analysis control through performance evaluation control performed on the pitch data and volume data for n = 1 to 8 of the voice information of each of a plurality of users, in order to verify the effect of the performance evaluation control by the karaoke system of this embodiment shown in FIG. 1.
FIG. 10 is a graph, given as a comparative example for FIG. 9, comparing results obtained without considering velocity, that is, the results of multivariate analysis control through performance evaluation control performed only on the pitch data for n = 1 to 8 of the voice information of each of a plurality of users.
FIG. 11 is a flowchart explaining the main part of the karaoke performance evaluation control of this embodiment executed by the CPU of the karaoke device of FIG. 2.
FIG. 12 is a flowchart explaining the main part of the multivariate analysis control, which forms part of the karaoke performance evaluation control of this embodiment, executed by the CPU of the server device of FIG. 3.

Preferred embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a schematic diagram illustrating a karaoke system 10 to which the present invention is suitably applied. As shown in FIG. 1, in the karaoke system 10 of this embodiment, one or more karaoke devices 16a, 16b, 16c, ... (hereinafter simply "karaoke device 16" unless they need to be distinguished) are installed in each of a plurality of private rooms 14a, 14b, 14c, ... (hereinafter simply "private room 14" unless they need to be distinguished) in a store 12 such as a karaoke box, snack bar, or inn; FIG. 1 shows one karaoke device per room. The karaoke devices 16 are connected through a router 28 to a communication line 18 such as a public telephone line, and can exchange information over the communication line 18 with a server device (center device) 20 of a karaoke service provider that is also connected to the communication line 18.

The server device 20 is a server that, in addition to basic control such as the storage and input/output management of digital contents including karaoke information (song data), background video information, and inter-song information, performs the multivariate analysis control described later. It periodically distributes content to the karaoke devices 16 over the communication line 18 and, in response to a request from a karaoke device 16, transmits a predetermined function control program. The karaoke system 10 also includes a plurality of electronic quick-reference devices 22a, 22b, 22c, ... (hereinafter simply "electronic quick-reference device 22" unless they need to be distinguished). When the karaoke devices 16 are used, one electronic quick-reference device 22 is lent to each user (group) and is used in each private room 14 as a remote operation device for the karaoke device 16, as described later. A LAN 24 interconnecting the karaoke devices 16 is installed in the store 12, and input from an electronic quick-reference device 22 to a karaoke device 16 is carried out by LAN communication or the like through a predetermined access point 26 and the LAN 24.

FIG. 2 is a block diagram illustrating the configuration of the karaoke device 16. As shown in FIG. 2, the karaoke device 16 includes: a video display device 30 such as a TFT (Thin Film Transistor Liquid Crystal) display; a video output control unit 32; a video information decoder 34; a video mixer 36; a synthesizer 38 serving as a sound source; a microphone 40 serving as a voice input device; an A/D converter 41; an amplifier mixer 42; a speaker 44; an operation panel 46; an input/output interface 48 that processes input signals from the operation panel 46 and the like; a CPU 50 serving as a central processing unit; a ROM 52 serving as read-only memory; a RAM 54 serving as random-access read/write memory; a hard disk 56 serving as a storage device; a modem 58; a LAN port 60; and a remote control receiving unit 62 for receiving remote control signals from input devices such as the electronic quick-reference device 22 and a remote control device 64.

The video output control unit 32 functions as a character video output device that outputs character video (telops), such as the lyric character video generated by the CPU 50, and is also a display control device that controls the various video displays on the video display device 30. The video information decoder 34 is a background video playback device that plays back (decodes) a predetermined background video, based on background video information stored on the hard disk 56, while a user sings with reference to the lyrics. This background video information is, for example, data in MPEG (Moving Picture Experts Group) format, and the background video played back by the video information decoder 34 from the MPEG data is sent to the video mixer 36. The video mixer 36 is a video compositing device that combines the character video generated by the CPU 50 and output from the video output control unit 32 with the background video played back by the video information decoder 34, and displays the result on the video display device 30.

The synthesizer 38 is a sound source that generates music signals, such as instrument performance signals, based on the performance information of a karaoke song read from the hard disk 56 and sent to it. This performance information is, for example, data in MIDI (Musical Instrument Digital Interface) format, and the music signal generated by the synthesizer 38 from the MIDI data is converted into an analog signal and sent to the amplifier mixer 42. The amplifier mixer 42 mixes the incoming music signal with the user's singing voice input through the microphone 40, and the mixed signal is electrically amplified and output from the speaker 44. The A/D converter 41 converts the voice information input as an analog signal from the microphone 40, which is the voice input device, into a digital signal and supplies it to the CPU 50 and other components.

The operation panel 46 is an input device provided with operation buttons (switches) and knobs that allow a user of the karaoke device 16 to select the karaoke song to be sung, adjust the key of the song, adjust the volume balance between the accompaniment and the singing, and make other adjustments such as echo, volume, and tone. The karaoke device 16 is also provided with a remote control device 64 that functions as an input device for executing some of the functions of the operation panel 46 remotely, and the remote control receiving unit 62 receives the remote control signals transmitted from the remote control device 64 and supplies them to the CPU 50. The association (pairing) of the karaoke device 16 with an electronic quick-reference device 22 is also performed through the remote control receiving unit 62, and an electronic quick-reference device 22 associated with the karaoke device 16 in this way likewise functions as an input device.

The CPU 50 is a so-called microcomputer that processes and controls electronic information according to predetermined programs stored in advance in the ROM 52 while using the temporary storage function of the RAM 54. When a karaoke song is selected with the electronic quick-reference device 22, the remote control device 64, or the like, the CPU 50 registers the selected song in a reserved-song table provided in the RAM 54; reads the performance information, lyric information, and other data of the selected song from the hard disk 56 into the RAM 54 in the playback order of the reserved-song table; sends the performance information from the RAM 54 to the synthesizer 38 as the karaoke performance progresses; generates lyric character video from the lyric information and sends it to the video output control unit 32; generates song title character video at the time of selection and sends it to the video output control unit 32; controls the video information decoder 34 to play back a predetermined background video; outputs inter-song information such as new release information, song selection rankings, and store advertisements while no karaoke performance is in progress, that is, between songs; and controls information communication with the server device 20 over the communication line 18. In addition to these basic controls, the CPU 50 executes the various controls related to the performance evaluation control described later.

The modem 58 is a device for connecting the karaoke device 16 to the communication line 18, such as a public telephone line. It converts the digital signals output from the CPU 50 into analog signals and sends them to the communication line 18, and converts the analog signals transmitted over the communication line 18 into digital signals and supplies them to the CPU 50. A configuration is also conceivable in which one of the karaoke devices 16 in the store 12 has the function of the router 28 and is connected to the communication line 18 as a master commander. In that case, the modem 58 is required in the karaoke device 16 that functions as the master commander, but need not be provided in the other karaoke devices 16, which communicate with the server device 20 through the master commander.

The LAN port 60 is a connector for connecting the karaoke device 16 through the LAN 24 to other equipment such as other karaoke devices 16 and the electronic quick-reference devices 22. Connected through the LAN 24 in this way, the karaoke device 16 can send and receive information to and from that equipment. For example, a song selection input from an electronic quick-reference device 22 is received through the access point 26 and stored in the reserved-song table provided in the RAM 54, and predetermined information is sent from the karaoke device 16 to the electronic quick-reference device 22 through the access point 26. Information is thus exchanged between the karaoke device 16 and the electronic quick-reference device 22 by radio.

The hard disk 56 holds various databases, including a karaoke database that stores a large amount of karaoke information (song data) for outputting karaoke songs. Of the karaoke devices 16 installed in a store such as a karaoke box, a predetermined karaoke device 16, for example the karaoke device 16a, is connected to the communication line 18 through the modem 58. So that new songs can always be played on the karaoke devices 16, new song data and the like are distributed as needed from the server device 20 over the communication line 18 and stored in the karaoke database and other databases on the hard disk 56. The karaoke device 16a, having obtained information from the server device 20 in this way, then communicates with the other karaoke devices 16 through the LAN 24, so that the information stored on the hard disk 56 of each karaoke device 16 is shared and the contents of the karaoke database and other databases are kept equivalent.

The karaoke information (karaoke data) stored in the karaoke database consists of performance information for generating the performance sound and lyric information for generating the lyric character video (lyric telop), and is identified by a song selection number unique to each song, which serves as the content ID. The performance information contained in the karaoke information is, for example, MIDI data for causing the synthesizer 38 to output predetermined performance sounds, and includes the types of performance sound (instruments) to be output and the score information corresponding to each performance sound. The lyric information is data for outputting the lyric character video of the song through the video output control unit 32 and related components, and includes the text of the lyrics corresponding to the lyric character video, switching timing information for switching the display of the lyric character video in step with the performance output, and color change timing information for successively changing the color of the lyric character video in step with the performance output. Reference data such as that shown in FIG. 5, which serves as the standard for the performance evaluation control described later, is also defined. This reference data may instead be stored separately from the karaoke information in the karaoke database, in association with the song selection number of each song.
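The structure of the karaoke information described above can be pictured with a small Python sketch. Every class name, field name, and the dictionary-based lookup here is hypothetical and chosen only for illustration; the source specifies what the data contains, not how it is laid out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class LyricLine:
    """One displayed lyric line (hypothetical layout)."""
    text: str                       # lyric text
    switch_time_s: float            # when this line replaces the previous one
    color_change_times_s: List[float] = field(default_factory=list)

@dataclass
class KaraokeInfo:
    """Karaoke information for one song (hypothetical layout)."""
    selection_number: str           # content ID unique to the song
    midi_data: bytes                # performance information for the synthesizer
    lyrics: List[LyricLine] = field(default_factory=list)
    # Evaluation reference data: (start_s, low_pitch, high_pitch) bands.
    reference_bands: List[Tuple[float, float, float]] = field(default_factory=list)

def lookup(database: Dict[str, KaraokeInfo], selection_number: str) -> Optional[KaraokeInfo]:
    """Return the song registered under the selection number, if any."""
    return database.get(selection_number)
```

Keeping `reference_bands` as an optional field mirrors the remark that the reference data may be stored either with the karaoke information or separately under the same selection number.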

FIG. 3 is a block diagram illustrating the configuration of the server device 20. As shown in FIG. 3, the server device 20 includes a so-called microcomputer system in which a CPU 66 serving as a central processing unit performs signal processing according to programs stored in advance in a ROM 68 while using the temporary storage function of a RAM 70. In addition to basic control, including control of the distribution of content such as karaoke information to the karaoke devices 16, it executes various controls such as the multivariate analysis control described later. The server device 20 also includes a video display device 74 such as a TFT display controlled by a video board 72, an input device 76 such as a keyboard connected through an interface 78, and a modem 80 for connecting the CPU 66 to the communication line 18. In addition, various databases are provided on mass storage devices such as hard disks, including a karaoke database 82 that stores a large amount of karaoke information for distribution to the karaoke devices 16 and a multivariate analysis database 84 that stores various information related to the multivariate analysis control described later.

FIG. 4 is a functional block diagram explaining the main control functions provided in the CPU 50 of the karaoke device 16 and the CPU 66 of the server device 20. Of the control means shown in FIG. 4, the pitch/volume detection means 86 and the detection result transmission means 88 are preferably provided in the CPU 50 of the karaoke device 16, and the multivariate analysis means 90, the performance evaluation means 100, and the evaluation result transmission means 102 are preferably provided in the CPU 66 of the server device 20. A configuration in which the pitch/volume detection means 86, the multivariate analysis means 90, and the performance evaluation means 100 are all provided together in the karaoke device 16 is also conceivable.

The pitch/volume detection means 86 detects the pitch and volume per unit time of the voice information input through the microphone 40. For example, for voice information input through the microphone 40 and converted into a digital signal by the A/D converter 41, it detects the pitch and the volume (velocity) as elements of that voice information at every very short unit time of about 0.025 seconds. Specifically, the waveform corresponding to the voice information input via the A/D converter 41 is stored in the RAM 54 or the like, and that waveform is read out and analyzed as needed to numerically calculate the pitch and volume for the unit time in question.
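The per-unit-time detection described above can be sketched as follows. This is only an illustration: the text does not say how the pitch is extracted from the waveform, so the autocorrelation estimate, the RMS volume measure, the function name `detect_pitch_and_volume`, and the `fmin`/`fmax` search limits are all assumptions made for the example.

```python
import math

def detect_pitch_and_volume(samples, sample_rate, unit_time=0.025,
                            fmin=80.0, fmax=1000.0):
    """Return one (pitch_hz, rms_volume) pair per unit time (hypothetical helper)."""
    frame_len = round(sample_rate * unit_time)
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), frame_len - 1)
    results = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Volume: root mean square of the frame (an assumed measure).
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        # Pitch: lag with the largest unnormalized autocorrelation.
        # The unnormalized sum favors the shortest matching period.
        best_lag, best_corr = 0, 0.0
        for lag in range(min_lag, max_lag + 1):
            corr = sum(frame[i] * frame[i + lag] for i in range(frame_len - lag))
            if corr > best_corr:
                best_lag, best_corr = lag, corr
        pitch = sample_rate / best_lag if best_lag else 0.0
        results.append((pitch, rms))
    return results

# Usage: a 200 Hz test tone sampled at 8 kHz, 0.1 s long (four unit times).
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * i / rate) for i in range(rate // 10)]
frames = detect_pitch_and_volume(tone, rate)
```

A production detector would normally add windowing and octave-error handling; the sketch only shows how one pitch value and one volume value arise for each 0.025-second unit.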

For the performance evaluation control of this embodiment, the pitch/volume detection means 86 performs this detection at the pitch switching portions in the reference data for the performance evaluation. FIG. 5 is a diagram explaining the reference data used in general karaoke performance evaluation control, including that of this embodiment. The measured values of the voice information input through the microphone 40 are shown by a solid line, and the reference data (model data) serving as the evaluation standard are shown by regions enclosed by broken lines. As shown in FIG. 5, the reference data used to evaluate a karaoke performance on the karaoke device 16 define a predetermined pitch band (for example, a fixed pitch range centered on the guide melody) in a stepwise manner along the progress (elapsed time) of the target song. In other words, at a pitch switching portion where the pitch level differs, the reference data are defined so as to move instantaneously from the lower pitch band to the higher pitch band, or from the higher pitch band to the lower pitch band, without allowing for any connecting transition.

In performance evaluation control based on reference data defined as shown in FIG. 5, it is determined, for example, whether the measured value of the voice information input through the microphone 40 falls within the range of the reference data. That is, when the measured value of the target voice information falls within the range of the reference data, the singer is judged to be singing at the correct pitch; when it deviates from that range, the singer is judged not to be singing at the correct pitch. For the reference data defined stepwise along the progress of the target song as shown in FIG. 5, the determination of whether the pitch detected at each predetermined time for the voice information input through the microphone 40 falls within the reference data is executed continuously, and the performance evaluation for the song as a whole (overall evaluation) is obtained in this way.

FIG. 6 is a diagram explaining a problem with performance evaluation control based on reference data such as that shown in FIG. 5. As described above, when the reference data for evaluation are defined stepwise along the progress of the song, evaluation at pitch switching portions where the pitch level differs becomes a problem. Because the reference data at such a portion are defined so as to move instantaneously from the lower pitch band to the higher pitch band, or from the higher pitch band to the lower pitch band, without allowing for any connecting transition, the measured values of the voice information during the transition fall within neither the reference data before the switch nor the reference data after it, and the portion is not evaluated correctly even if it is sung well. The pitch in the reference data switches stepwise, whereas the user's singing voice detected by the microphone 40 changes continuously over time; at a pitch switching portion where the pitch level differs, singing sounds smooth and natural precisely because the pitch increases or decreases gradually.
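The band test of FIG. 5 and the difficulty at a switching portion described for FIG. 6 can be illustrated with a small sketch. The function name `in_band_ratio`, the use of MIDI-style note numbers, and the example band limits are assumptions for illustration only.

```python
def in_band_ratio(measured_pitches, reference_bands):
    """Fraction of unit times whose measured pitch lies inside the reference band."""
    if not measured_pitches:
        return 0.0
    hits = 0
    for pitch, (low, high) in zip(measured_pitches, reference_bands):
        if low <= pitch <= high:
            hits += 1
    return hits / len(measured_pitches)

# Reference data step instantly from a low band to a high band after two frames.
bands = [(58.0, 62.0)] * 2 + [(65.0, 69.0)] * 4
# A natural glide from the low note to the high note.
glide = [60.0, 60.0, 61.5, 63.5, 66.0, 67.0]
ratio = in_band_ratio(glide, bands)
```

In this example the two gliding frames (61.5 and 63.5) fall in neither band, so four of the six frames count as correct even though the phrase is sung smoothly, which is the situation the text identifies as a problem.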

FIG. 7 is a diagram explaining the detection of pitch and volume by the pitch/volume detection means 86 at a pitch switching portion in the reference data. As shown in FIG. 7, at a pitch switching portion in the reference data for evaluation, the pitch/volume detection means 86 preferably takes the starting point of the pitch change in the voice information detected by the microphone 40 as n = 0 and detects the pitch and volume from n = 1 to n = N. Here, n is an index assigned to the detection result for each unit time, the unit time being the detection unit of the pitch/volume detection means 86, and N is preferably an integer from 6 to 10, most preferably 8. That is, the pitch/volume detection means 86 preferably detects eight units of pitch and volume over the 0.2 seconds (unit time 0.025 seconds × 8) from n = 1 to n = 8, with the starting point of the pitch change in the voice information detected by the microphone 40 taken as n = 0. As shown in FIG. 7, the starting point of the pitch change is preferably set so that the moment at which the pitch begins to change immediately after the switch in the target reference data (rising at a switch from the lower side to the higher side, falling at a switch from the higher side to the lower side) is n = 1, with the point immediately before the change being n = 0. Alternatively, the starting point may be predetermined from the reference data, for example by taking the moment at which the reference data switch as n = 0.
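A minimal sketch of this window selection, assuming the frames are already available as (pitch, volume) pairs per unit time. The function name `transition_window` and the `min_step` threshold used to decide that the pitch "has begun to change" are hypothetical; the text does not give a numerical criterion.

```python
def transition_window(frames, switch_index, n_frames=8, min_step=0.1):
    """Locate n = 0 after a reference switch and return the frames n = 1..N.

    frames: list of (pitch, volume) pairs, one per unit time.
    switch_index: frame index at which the reference data switch.
    Returns (origin_index, window) or None if no change is found.
    """
    first = max(switch_index, 1)
    last = len(frames) - n_frames
    for i in range(first, last + 1):
        # The first frame whose pitch differs from the previous one is n = 1.
        if abs(frames[i][0] - frames[i - 1][0]) >= min_step:
            return i - 1, frames[i:i + n_frames]
    return None

# Usage: flat pitch for five frames, then a steady rise.
frames = [(60.0, 1.0)] * 5 + [(60.0 + 0.5 * k, 1.0) for k in range(1, 11)]
found = transition_window(frames, switch_index=4)
```

With N = 8 and a 0.025-second unit time, the returned window covers the 0.2 seconds mentioned in the text.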

The detection result transmission means 88 transmits the detection results of the pitch/volume detection means 86 via the communication line 18 to the multivariate analysis database 84 in the server device 20, where they are accumulated. That is, the pitch data and volume data from n = 1 to n = N detected by the pitch/volume detection means 86 are stored in the multivariate analysis database 84 in association with the identification information (song selection number) of the target song, the time of detection of n = 1 to N (for example, the elapsed time from the start of the song), and the identification information of the user who is the performer (the source of the voice information). In this way, the multivariate analysis database 84 accumulates, for each pitch switching portion in the reference data of each song, the pitch data and volume data for n = 1 to N detected by the plurality of karaoke devices 16. In other words, for each pitch switching portion in each set of reference data, the 2N kinds of data (N items of pitch data + N items of volume data = 2N items of data) corresponding to the pitch and volume per unit time detected N times in succession by the pitch/volume detection means 86 are accumulated as samples for the voice information of each of a plurality of users.
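One way to picture this accumulation is an in-memory store keyed by song and switching portion. The class name `TransitionStore`, the use of an elapsed time in milliseconds as part of the key, and the record layout are illustrative assumptions; the text only states which items are associated with each other.

```python
from collections import defaultdict

class TransitionStore:
    """Accumulates 2N-value samples per (song, switching portion). Hypothetical."""

    def __init__(self, n_frames=8):
        self.n_frames = n_frames
        self._records = defaultdict(list)

    def add(self, song_id, elapsed_ms, user_id, pitches, volumes):
        if len(pitches) != self.n_frames or len(volumes) != self.n_frames:
            raise ValueError("expected N pitch values and N volume values")
        # One sample = N pitch values followed by N volume values (2N in total).
        sample = list(pitches) + list(volumes)
        self._records[(song_id, elapsed_ms)].append((user_id, sample))

    def samples(self, song_id, elapsed_ms):
        """All accumulated samples for one switching portion, across users."""
        return [sample for _, sample in self._records.get((song_id, elapsed_ms), [])]

# Usage with N = 2 to keep the example short.
store = TransitionStore(n_frames=2)
store.add("song-0001", 12500, "user-a", [1.0, 2.0], [3.0, 4.0])
store.add("song-0001", 12500, "user-b", [1.5, 2.5], [3.5, 4.5])
```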

The multivariate analysis means 90 performs multivariate analysis on the data accumulated in the multivariate analysis database 84 as described above, that is, on the 2N kinds of data corresponding to the pitch and volume per unit time detected N times in succession by the pitch/volume detection means 86 for the voice information of each of a plurality of users. To perform this control, the multivariate analysis means 90 includes, as shown in FIG. 4, a principal component analysis means 92, an eigenvalue/eigenvector calculation means 94, a variance calculation means 96, and a ranking means 98. The control performed by each of these means is described in turn below.

The principal component analysis means 92 performs principal component analysis, using the well-known method of that name (Principal Component Analysis), on the 2N kinds of data accumulated in the multivariate analysis database 84 for each pitch switching portion in each set of reference data. That is, for the voice information of each user (the 16 kinds of data for n = 1 to 8 at the corresponding switching portion), it calculates a feature vector x as shown in Formula 1 below. In Formula 1, P(n) is the value Pi(n) − Pm(n), where Pi(n) is the n-th pitch (input pitch) and Pm(n) is the n-th model pitch, and V(n) is the value Ve(n) − Ve(n−1), where Ve(n) is the n-th volume (velocity). As noted above, N = 8 is preferred. The principal component analysis means 92 also standardizes the feature vectors x calculated in this way for the voice information of the plurality of users; for example, it standardizes those feature vectors x so that they have a variance of 1 and a mean of 1. It then performs principal component analysis by obtaining the variance-covariance matrix from the standardized feature vectors x.

[Formula 1]
x = {P(n), P(n+1), ..., P(n+N), V(n), V(n+1), ..., V(n+N)}
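The construction and standardization of the feature vector can be sketched as below. The helper names are hypothetical, and the sketch assumes the indexing n = 1 to N used in the surrounding text (N pitch differences and N volume differences, 2N values in all), with the volume list also carrying Ve(0) so that V(1) can be formed. The `target_mean` parameter reflects the mean of 1 stated in the text; shifting the mean does not change the variance-covariance matrix computed afterwards.

```python
import math

def feature_vector(input_pitch, model_pitch, volume):
    """Build x from P(n) = Pi(n) - Pm(n) and V(n) = Ve(n) - Ve(n-1).

    input_pitch, model_pitch: N values each (n = 1..N).
    volume: N + 1 values, Ve(0)..Ve(N).
    """
    p = [pi - pm for pi, pm in zip(input_pitch, model_pitch)]
    v = [volume[k] - volume[k - 1] for k in range(1, len(volume))]
    return p + v

def standardize(vectors, target_mean=1.0):
    """Scale each component across users to variance 1 and the given mean."""
    count, dims = len(vectors), len(vectors[0])
    out = [[0.0] * dims for _ in vectors]
    for j in range(dims):
        column = [vec[j] for vec in vectors]
        mean = sum(column) / count
        variance = sum((c - mean) ** 2 for c in column) / count
        std = math.sqrt(variance) or 1.0  # leave constant components unscaled
        for i, c in enumerate(column):
            out[i][j] = (c - mean) / std + target_mean
    return out

# Usage with N = 2 for brevity.
x = feature_vector([61.0, 63.0], [60.0, 60.0], [10.0, 12.0, 11.0])
z = standardize([[1.0, 2.0], [3.0, 2.0]])
```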

The eigenvalue/eigenvector calculation means 94 calculates 2N pairs of eigenvalues and eigenvectors from the analysis result of the principal component analysis means 92. When N = 8 as described above, it calculates 16 pairs of eigenvalues and eigenvectors from that analysis result. The eigenvalues and eigenvectors calculated by the eigenvalue/eigenvector calculation means 94 are used as parameters in the control described below: each eigenvector is used as an axis of the pattern space, and the corresponding eigenvalue is used as the unit of that axis.

The variance calculation means 96 calculates, for the 2N pairs of eigenvalues and eigenvectors calculated by the eigenvalue/eigenvector calculation means 94, the variance along the axis defined by each eigenvector. The ranking means 98 determines the rank of the 2N pairs of eigenvalues and eigenvectors in descending order of the variance calculated by the variance calculation means 96. FIG. 8 illustrates a pattern space showing two eigenvectors v1 and v2 out of the 2N pairs of eigenvalues and eigenvectors calculated by the eigenvalue/eigenvector calculation means 94, with the value of each sample shown as a dot. In the example of FIG. 8, the variance along the axis of eigenvector v1 is larger than the variance along the axis of eigenvector v2. By comparing in this way the variances calculated by the variance calculation means 96 for each of the 2N pairs of eigenvalues and eigenvectors calculated by the eigenvalue/eigenvector calculation means 94, the ranking means 98 ranks those pairs from first to 2N-th in descending order of variance.
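The chain from variance-covariance matrix to ranked eigenvectors can be sketched in plain Python. The text does not name an eigenvalue algorithm; the cyclic Jacobi rotation method used here is a substitute chosen because it is compact and suits small symmetric matrices, and all function names are hypothetical.

```python
import math

def covariance_matrix(vectors):
    """Variance-covariance matrix of the sample vectors (population form)."""
    count, dims = len(vectors), len(vectors[0])
    means = [sum(v[j] for v in vectors) / count for j in range(dims)]
    return [[sum((v[a] - means[a]) * (v[b] - means[b]) for v in vectors) / count
             for b in range(dims)] for a in range(dims)]

def jacobi_eigen(matrix, sweeps=50, tol=1e-18):
    """Eigenpairs of a symmetric matrix as a list of (eigenvalue, eigenvector)."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    vecs = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, vecs):  # rotate columns p, q of A and of V
                    for k in range(n):
                        mkp, mkq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mkp - s * mkq, s * mkp + c * mkq
                for k in range(n):  # rotate rows p, q of A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [(a[i][i], [vecs[k][i] for k in range(n)]) for i in range(n)]

def rank_components(vectors, eigenpairs):
    """Sort eigenpairs by the variance of the samples projected on each axis."""
    count = len(vectors)
    ranked = []
    for value, axis in eigenpairs:
        proj = [sum(x * e for x, e in zip(v, axis)) for v in vectors]
        mean = sum(proj) / count
        variance = sum((t - mean) ** 2 for t in proj) / count
        ranked.append((variance, value, axis))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked

# Usage: six two-dimensional samples spread mainly along the diagonal.
data = [[-1.0, -1.0], [1.0, 1.0], [-2.0, -2.0], [2.0, 2.0], [0.5, -0.5], [-0.5, 0.5]]
ranked = rank_components(data, jacobi_eigen(covariance_matrix(data)))
```

For eigenvectors of the covariance matrix, the projected variance equals the eigenvalue, so ranking by variance and ranking by eigenvalue give the same order; the explicit projection is kept only to mirror the separate variance calculation step in the text.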

At a pitch switching portion in the reference data for evaluation, the performance evaluation means 100 evaluates the performance represented by the target user's voice information on the basis of the analysis result obtained by the multivariate analysis means 90 for the voice information of the plurality of users. That is, by using an eigenvector calculated by the eigenvalue/eigenvector calculation means 94 as an axis of the pattern space and the corresponding eigenvalue as the unit of that axis, it obtains the deviation of the value for each user from a reference value (for example, the mean along the eigenvector in question) and calculates an evaluation result corresponding to that deviation. Preferably, it performs this performance evaluation from the 2N kinds of data for each user's voice information, using the eigenvalues and eigenvectors calculated for each pitch switching portion in each set of reference data as parameters, according to a predetermined relationship in which a smaller deviation from the reference value gives a higher evaluation (and a larger deviation gives a lower evaluation).

The performance evaluation means 100 preferably uses the eigenvalues and eigenvectors as parameters of the performance evaluation in order of priority, starting from those ranked highest by the ranking means 98. For example, of the 2N pairs of eigenvalues and eigenvectors calculated by the eigenvalue/eigenvector calculation means 94 for a pitch switching portion in each set of reference data, it uses the eigenvector ranked highest (first) by the ranking means 98 as the axis of the pattern space and performs the evaluation with the corresponding eigenvalue as the unit of that axis. That is, it performs the performance evaluation from the 2N kinds of data for each user's voice information according to a predetermined relationship in which a smaller deviation from the reference value for the first-ranked eigenvalue and eigenvector gives a higher evaluation (and a larger deviation gives a lower evaluation).
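A hedged sketch of the deviation-based evaluation on the first-ranked axis follows. The text leaves the "predetermined relationship" open, so the linear mapping from deviation to score, the parameters `max_score` and `falloff`, and the reading of "eigenvalue as the unit of the axis" as dividing by the square root of the eigenvalue are all assumptions.

```python
import math

def deviation_score(sample, population_mean, axis, eigenvalue,
                    max_score=100.0, falloff=25.0):
    """Return (score, deviation) for one user's 2N-value sample.

    axis: unit eigenvector ranked first; eigenvalue: its variance.
    The deviation is measured from the population mean along the axis.
    """
    centered = [x - m for x, m in zip(sample, population_mean)]
    projection = sum(c * e for c, e in zip(centered, axis))
    unit = math.sqrt(eigenvalue) if eigenvalue > 0.0 else 1.0
    deviation = abs(projection) / unit
    # Smaller deviation gives a higher score; the linear form is illustrative.
    score = max(0.0, max_score - falloff * deviation)
    return score, deviation

# Usage with a two-dimensional axis for brevity.
axis = [math.sqrt(0.5), math.sqrt(0.5)]
near = deviation_score([0.5, 0.5], [0.0, 0.0], axis, eigenvalue=4.0)
far = deviation_score([2.0, 2.0], [0.0, 0.0], axis, eigenvalue=4.0)
```

Lower-ranked axes could be added in the same way, for example by combining per-axis deviations into a distance, but how they would be weighted is not stated in the text.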

FIG. 9 is a graph comparing the results of applying the multivariate analysis control and performance evaluation control described above to the pitch data and volume data for n = 1 to 8 of the voice information of each of several users, in order to verify the effect of the performance evaluation control of this embodiment. In FIG. 9, the values for the voice information of a first singer (guide), who sings best according to the impression listeners actually have, are shown by a solid line; those for a second singer, who sings relatively poorly, by a one-dot chain line; those for a third singer, who sings relatively well, by a two-dot chain line; and those for a fourth singer, who sings neither well nor poorly, by a broken line. The vertical axis shows the distance (Euclidean distance) of each value (sample number) from the reference value. In the example of FIG. 9, the mean distance is 3.37 for the first singer, who sings best; 4.07 for the third singer, who sings relatively well; 4.34 for the fourth singer, who sings neither well nor poorly; and 4.40 for the second singer, who sings relatively poorly. That is, the smaller the distance shown in FIG. 9, the better the singing, which shows that a performance evaluation close to the impression a user actually has can be achieved.

FIG. 10, by contrast, is a graph showing as a comparative example the results obtained without considering velocity, that is, the results of applying the multivariate analysis control and performance evaluation control described above to only the pitch data for n = 1 to 8 of the voice information of each of several users. In FIG. 10, the values for the voice information of the first singer (guide), who sings best according to the impression listeners actually have, are shown by a solid line; those for the second singer, who sings relatively poorly, by a one-dot chain line; and those for the third singer, who sings relatively well, by a broken line. The vertical axis shows the distance (Euclidean distance) of each value (sample number) from the reference value. In the example of FIG. 10, the mean distance is 2.69 for the first singer, who sings best; 2.80 for the third singer, who sings relatively well; and 2.06 for the second singer, who sings relatively poorly. The evaluations of the best singer and the relatively good singer are thus reversed, and in addition the gap relative to the evaluation of the relatively poor singer is narrower than in the example of FIG. 9, so the result is difficult to use for evaluation. In multivariate analysis, the choice of what to analyze is generally important. As the comparison between FIG. 9 and FIG. 10 shows, for the performance evaluation of pitch switching portions in the reference data as in this embodiment, analyzing the 2N kinds of data corresponding to the pitch and volume per unit time detected N times in succession by the pitch/volume detection means 86 makes it possible to achieve a performance evaluation close to the impression a user actually has.

The evaluation result transmission means 102 transmits the evaluation result of the performance evaluation means 100 via the communication line 18 to the corresponding karaoke device 16. That is, it transmits (returns) the evaluation result of the performance evaluation means 100 for the data in question to the karaoke device 16 that sent the 2N kinds of data being evaluated. Preferably, the karaoke device 16 is provided, separately from the performance evaluation means 100, with a second performance evaluation means for evaluating the portions other than the pitch switching portions in the reference data for evaluation. This second performance evaluation means evaluates the performance according to the voice input from the microphone 40 as the song is output. For example, it compares the voice information input through the microphone 40 and converted into a digital signal by the A/D converter 41 with the reference data defined as shown in FIG. 5, and performs the evaluation on the basis of criteria such as the relative deviation between the base pitch of the melody or the like and the pitch extracted from the input voice information, and the absolute volume of the voice. Then, preferably, when the performance of the target song has finished, an overall evaluation of the whole performance of that song is calculated from the evaluation results received from the evaluation result transmission means 102 for each pitch switching portion (the evaluation results of the performance evaluation means 100) and the evaluation results calculated by the second performance evaluation means for the other portions.
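The text does not specify how the two kinds of evaluation result are combined into the overall evaluation. Purely as an illustration, the sketch below assumes a weighted average of the mean switching-portion score and the score from the second performance evaluation means; the function name and the `other_weight` parameter are hypothetical.

```python
def overall_score(transition_scores, other_score, other_weight=0.7):
    """Combine switching-portion scores with the score for the other portions.

    transition_scores: scores returned by the server for each switching portion.
    other_score: score from the second performance evaluation means.
    other_weight: assumed weight of the other portions, between 0 and 1.
    """
    if not transition_scores:
        # No switching-portion result was received; fall back to the other score.
        return other_score
    transition_mean = sum(transition_scores) / len(transition_scores)
    return other_weight * other_score + (1.0 - other_weight) * transition_mean

# Usage: equal weighting of the two parts.
total = overall_score([80.0, 90.0], 70.0, other_weight=0.5)
```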

FIG. 11 is a flowchart explaining the main part of the karaoke performance evaluation control of this embodiment performed by the CPU 50 of the karaoke device 16; the routine is executed repeatedly at a predetermined cycle.

First, in step SA1 (the word "step" is omitted below), it is determined whether the karaoke performance of a given song has started. If the determination in SA1 is negative, the routine ends. If the determination in SA1 is affirmative, then in SA2 the karaoke data of the target song are read from the karaoke database on the hard disk 56, and the corresponding performance information, lyric information, and the like are loaded into the RAM 54. Next, in SA3, the karaoke performance control for the target song is started, that is, the output control of the performance sound by the synthesizer 38 and the display control of the lyric text video that accompanies the progress of the performance. Next, in SA4, it is determined whether it is the timing at which the pitch switches in the reference data. If the determination in SA4 is negative, the processing from SA7 onward is executed. If the determination in SA4 is affirmative, then in SA5, which corresponds to the operation of the pitch/volume detection means 86, the pitch and volume per unit time of the voice information input through the microphone 40 are detected N times in succession. Next, in SA6, which corresponds to the operation of the detection result transmission means 88, the N sets of pitch data and volume data detected in SA5 are transmitted to the server device 20 via the communication line 18 together with the song selection number and the time information of the relevant portion. Next, in SA7, it is determined whether an evaluation result has been received (returned) from the server device 20 in response to the transmission in SA6. If the determination in SA7 is negative, the processing from SA9 onward is executed. If the determination in SA7 is affirmative, then in SA8 the received evaluation result is stored in the RAM 54 and reflected in the subsequent performance evaluation, after which it is determined in SA9 whether the karaoke performance has ended. If the determination in SA9 is negative, the processing from SA4 onward is executed again. If the determination in SA9 is affirmative, then in SA10 the overall evaluation of the entire karaoke performance is calculated from the performance evaluations of the pitch switching portions stored in the RAM 54 or the like in SA8 and the performance evaluations of the other portions calculated by another process, after which the routine ends.
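The loop from SA4 to SA9 can be sketched as below. This is a simplification under stated assumptions: the frames are taken as already detected (pitch, volume) pairs, the network exchange is replaced by a caller-supplied function `send_to_server` that returns the evaluation result or `None`, and both names are hypothetical.

```python
def run_evaluation_loop(frames, switch_indices, send_to_server, n_frames=8):
    """Walk through a song and collect server results for switching portions.

    frames: list of (pitch, volume) pairs, one per unit time.
    switch_indices: frame indices at which the reference pitch switches.
    send_to_server: callable(index, pitches, volumes) -> result or None.
    """
    stored_results = []
    switches = set(switch_indices)
    for index in range(len(frames)):          # repeat until the song ends (SA9)
        if index not in switches:             # SA4: not a switching timing
            continue
        window = frames[index:index + n_frames]   # SA5: N consecutive detections
        if len(window) < n_frames:
            continue
        pitches = [pitch for pitch, _ in window]
        volumes = [volume for _, volume in window]
        reply = send_to_server(index, pitches, volumes)  # SA6: transmit
        if reply is not None:                 # SA7: was a result returned?
            stored_results.append(reply)      # SA8: keep it for the total
    return stored_results                     # input to the SA10 calculation

# Usage with a stand-in server that skips one portion.
def fake_server(index, pitches, volumes):
    return None if index == 10 else sum(pitches) / len(pitches)

frames = [(60.0 + i, 1.0) for i in range(20)]
results = run_evaluation_loop(frames, [2, 10, 15], fake_server)
```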

FIG. 12 is a flowchart explaining the main part of the multivariate analysis control for the performance evaluation control of this embodiment performed by the CPU 66 of the server device 20; the routine is executed repeatedly at a predetermined cycle.

First, in SB1, it is determined whether N sets of pitch data and volume data have been received from a karaoke device 16 via the communication line 18 together with the song selection number and the time information of the relevant portion. If the determination in SB1 is negative, the routine ends. If the determination in SB1 is affirmative, then in SB2 the received N sets of pitch data and volume data are stored (accumulated) in the multivariate analysis database 84 in association with the song selection number and the time information of the relevant portion. Next, in SB3, a feature vector x as shown in Formula 1 above is calculated from the received N sets of pitch data and volume data. Next, in SB4, the feature vectors calculated in SB3 are standardized so that they have a variance of 1 and a mean of 1. Next, in SB5, which corresponds to the operation of the principal component analysis means 92, the variance-covariance matrix is calculated from the feature vectors standardized in SB4 and principal component analysis is performed. Next, in SB6, which corresponds to the operation of the eigenvalue/eigenvector calculation means 94, 2N pairs of eigenvalues and eigenvectors are calculated from the analysis result of SB5. Next, in SB7, which corresponds to the operation of the variance calculation means 96, the variance corresponding to each of the eigenvalue/eigenvector pairs calculated in SB6 is calculated. Next, in SB8, which corresponds to the operation of the ranking means 98, the eigenvalue/eigenvector pairs calculated in SB6 are ranked from first to 2N-th in descending order of the variance calculated in SB7. Next, in SB9, which corresponds to the operation of the performance evaluation means 100, the performance represented by the received N sets of pitch data and volume data is evaluated by using the eigenvector ranked first in SB8 as the axis of the pattern space and the corresponding eigenvalue as the unit of that axis. Next, in SB10, which corresponds to the operation of the evaluation result transmission means 102, the evaluation result of SB9 is transmitted (returned) to the karaoke device 16 that sent the data, after which the routine ends. In the above control, SB3 to SB8 correspond to the operation of the multivariate analysis means 90.
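Because SB9 uses only the first-ranked eigenvector, a compact server-side sketch can obtain that single axis by power iteration instead of a full decomposition. This is a substitute technique, not something the text prescribes. The sketch also assumes that feature vectors arrive already built and omits the SB4 standardization for brevity; `first_component`, `handle_upload`, and the dictionary used as a database are hypothetical.

```python
import math

def first_component(vectors, iterations=200):
    """Largest-variance axis by power iteration: (eigenvalue, axis, mean)."""
    count, dims = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / count for j in range(dims)]
    centered = [[v[j] - mean[j] for j in range(dims)] for v in vectors]
    # Deterministic, slightly uneven start vector; a start exactly orthogonal
    # to the leading axis would not converge (rare in practice).
    axis = [1.0 + 0.1 * j for j in range(dims)]
    norm = math.sqrt(sum(a * a for a in axis))
    axis = [a / norm for a in axis]
    value = 0.0
    for _ in range(iterations):
        # C w = X^T (X w) / count, without forming the covariance matrix C.
        proj = [sum(c * a for c, a in zip(row, axis)) for row in centered]
        nxt = [sum(row[j] * t for row, t in zip(centered, proj)) / count
               for j in range(dims)]
        value = math.sqrt(sum(x * x for x in nxt))
        if value == 0.0:
            break
        axis = [x / value for x in nxt]
    return value, axis, mean

def handle_upload(database, key, sample):
    """SB2 to SB9 in miniature: store the sample, then return its deviation."""
    database.setdefault(key, []).append(list(sample))     # SB2: accumulate
    samples = database[key]
    if len(samples) < 2:
        return None                                       # nothing to compare with
    value, axis, mean = first_component(samples)          # SB5 to SB8, first axis only
    if value == 0.0:
        return None
    projection = sum((x - m) * a for x, m, a in zip(sample, mean, axis))
    return abs(projection) / math.sqrt(value)             # SB9: deviation on the axis

# Usage: three uploads for the same switching portion.
db = {}
key = ("song-0001", 12500)
first = handle_upload(db, key, [0.0, 0.0])
second = handle_upload(db, key, [4.0, 4.0])
third = handle_upload(db, key, [2.0, 2.0])
```

In this usage the first upload has no population to compare against, the second lies one unit of standard deviation from the mean along the axis, and the third sits on the mean.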

As described above, the present embodiment includes: the pitch/volume detection means 86 (SA5), which detects the pitch and volume per unit time of the voice information input by the microphone 40 serving as the voice input device; the multivariate analysis means 90 (SB3 to SB8), which performs, on the voice information of each of a plurality of users, multivariate analysis of the 2N types of data corresponding to the pitch and volume per unit time detected N consecutive times by the pitch/volume detection means 86; and the performance evaluation means 100 (SB9), which performs, at a pitch switching portion in the evaluation reference data, performance evaluation of the voice information of a target user based on the analysis results obtained by the multivariate analysis means 90 for the voice information of the plurality of users. Consequently, a performance portion corresponding to a transition between pitches of different heights can be suitably evaluated. That is, it is possible to provide the karaoke system 10 that realizes performance evaluation close to the impression that a user actually feels.
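One way to picture the pitch switching portions is the sketch below, which locates frames where the reference pitch changes and collects the N detected frames that follow each change. The data layout (one reference pitch per unit-time frame, aligned with the detected pitch and volume series) and the function names are assumptions made for illustration and are not taken from the embodiment.

```python
def pitch_switch_frames(reference_pitches):
    """Frame indices where the reference pitch differs from the previous frame."""
    return [i for i in range(1, len(reference_pitches))
            if reference_pitches[i] != reference_pitches[i - 1]]

def windows_after_switch(reference_pitches, sung_pitches, sung_volumes, n):
    """For each switch at frame s (treated as n = 0), collect detected frames s+1 .. s+n.

    Switches too close to the end of the data to yield n full frames are skipped.
    """
    windows = []
    for s in pitch_switch_frames(reference_pitches):
        if s + n < len(sung_pitches) and s + n < len(sung_volumes):
            windows.append((s,
                            sung_pitches[s + 1:s + n + 1],
                            sung_volumes[s + 1:s + n + 1]))
    return windows
```

Each returned window holds N pitch values and N volume values, that is, the 2N types of data on which the multivariate analysis operates.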

Further, the multivariate analysis means 90 includes: the principal component analysis means 92 (SB5), which performs principal component analysis on the 2N types of data from n = 1 to n = N, where n = 0 is the starting point of the change in pitch detected by the pitch/volume detection means 86; the eigenvalue/eigenvector calculation means 94 (SB6), which calculates 2N sets of eigenvalues and eigenvectors from the analysis result of the principal component analysis means 92; the variance calculation means 96 (SB7), which calculates, for the 2N sets of eigenvalues and eigenvectors calculated by the eigenvalue/eigenvector calculation means 94, the variance of the eigenvalues with each eigenvector as an axis; and the ranking means 98 (SB8), which ranks the 2N sets of eigenvalues and eigenvectors in descending order of the variance calculated by the variance calculation means 96. The performance evaluation means 100 preferentially uses, as parameters for the performance evaluation, the eigenvalues and eigenvectors ranked higher by the ranking means 98. Consequently, a performance portion corresponding to a transition between pitches of different heights can be suitably evaluated in a practical manner.
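The eigenvalue calculation, ranking, and use of the first-ranked axis (SB6 to SB9) could be sketched as below. This is a minimal illustration, not the embodiment's implementation: the cyclic Jacobi method is simply one standard way to diagonalize a symmetric matrix, the variance is taken as the variance of the stored samples projected onto each eigenvector, and dividing the projection by the eigenvalue is only one reading of "using the eigenvalue as the unit of the axis". How that coordinate is turned into a score is not specified in this passage. All function names are hypothetical.

```python
import math

def jacobi_eigen(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues and eigenvectors of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J (columns of V become eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], [[v[k][i] for k in range(n)] for i in range(n)]

def project(x, axis):
    """Coordinate of vector x along a unit-length axis."""
    return sum(xi * ai for xi, ai in zip(x, axis))

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((val - m) ** 2 for val in values) / len(values)

def rank_components(cov, samples):
    """SB6 to SB8: (variance, eigenvalue, eigenvector) tuples, largest variance first."""
    eigvals, eigvecs = jacobi_eigen(cov)
    ranked = [(variance([project(s, vec) for s in samples]), val, vec)
              for val, vec in zip(eigvals, eigvecs)]
    ranked.sort(key=lambda t: t[0], reverse=True)
    return ranked

def axis_coordinate(x, eigval, eigvec):
    """SB9 (assumed reading): coordinate of x on the axis, in units of the eigenvalue."""
    return project(x, eigvec) / eigval
```

When `cov` is the variance-covariance matrix of `samples` itself, the variance along each eigenvector equals the corresponding eigenvalue, so ranking by variance and ranking by eigenvalue give the same order.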

A preferred embodiment of the present invention has been described above in detail with reference to the drawings. However, the present invention is not limited to this embodiment and may also be implemented in other modes.

For example, in the embodiment described above, the pitch/volume detection means 86 and the detection result transmission means 88 are provided in the CPU 50 of the karaoke apparatus 16, and the multivariate analysis means 90, the performance evaluation means 100, and the evaluation result transmission means 102 are provided in the CPU 66 of the server device 30. However, the present invention is not limited to this arrangement. For example, the pitch/volume detection means 86, the multivariate analysis means 90, and the performance evaluation means 100 may all be provided in the karaoke apparatus 16, with the processing by those control means executed on the karaoke apparatus 16 side. In this case, information corresponding to the multivariate analysis database 84 provided in the server device 20 is preferably stored on the hard disk 56 or the like of the karaoke apparatus 16, although control equivalent to that of the embodiment is also possible if the karaoke apparatus 16 reads the information stored in the multivariate analysis database 84 of the server device 20 each time it is needed. In such an arrangement, the detection result transmission means 88 and the evaluation result transmission means 102 need not be provided.

Further, in the embodiment described above, data corresponding to each user's voice information is accumulated in the multivariate analysis database 84 every time it is detected by the pitch/volume detection means 86. However, when sufficient sample data has already been accumulated in the multivariate analysis database 84, the detection results need not be newly accumulated every time.
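The accumulation policy described here can be pictured with the short sketch below. The class is a hypothetical in-memory stand-in for the multivariate analysis database 84, keyed by song selection number and time information, and the threshold of 1000 samples is an arbitrary placeholder, since the text does not state what amount of data counts as sufficient.

```python
from collections import defaultdict

class AnalysisStore:
    """Hypothetical in-memory stand-in for the multivariate analysis database 84."""

    def __init__(self, enough_samples=1000):
        # enough_samples is an assumed threshold; the source gives no figure.
        self.enough_samples = enough_samples
        self._samples = defaultdict(list)

    def maybe_store(self, song_number, time_info, feature):
        """Store the feature vector unless enough samples already exist for this portion."""
        bucket = self._samples[(song_number, time_info)]
        if len(bucket) >= self.enough_samples:
            return False
        bucket.append(list(feature))
        return True

    def samples(self, song_number, time_info):
        """Return a copy of the samples accumulated for this song portion."""
        return list(self._samples.get((song_number, time_info), []))
```

With this arrangement the evaluation of a new take can still use the stored samples even when the take itself is not added to them.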

In addition, although not illustrated one by one, the present invention may be implemented with various modifications within a range that does not depart from its gist.

10: Karaoke system
16: Karaoke device
40: Microphone (voice input device)
86: Pitch/volume detection means
90: Multivariate analysis means
92: Principal component analysis means
94: Eigenvalue/eigenvector calculation means
96: Variance calculation means
98: Ranking means
100: Performance evaluation means

Claims

1. A karaoke system using a karaoke device that outputs a performance tune selected from a large number of performance tunes and amplifies and outputs sound input by a voice input device, the karaoke system comprising:
pitch/volume detection means for detecting pitch and volume per unit time for voice information input by the voice input device;
multivariate analysis means for performing, on the voice information of each of a plurality of users, multivariate analysis of 2N types of data corresponding to the pitch and volume per unit time detected N consecutive times by the pitch/volume detection means; and
performance evaluation means for performing, at a pitch switching portion in evaluation reference data, performance evaluation of the voice information of a target user based on analysis results obtained by the multivariate analysis means for the voice information of the plurality of users.

2. The karaoke system according to claim 1, wherein the multivariate analysis means includes:
principal component analysis means for performing principal component analysis on the 2N types of data from n = 1 to n = N, where n = 0 is the starting point of the change in pitch detected by the pitch/volume detection means;
eigenvalue/eigenvector calculation means for calculating 2N sets of eigenvalues and eigenvectors from the analysis result of the principal component analysis means;
variance calculation means for calculating, for the 2N sets of eigenvalues and eigenvectors calculated by the eigenvalue/eigenvector calculation means, the variance of the eigenvalues with each eigenvector as an axis; and
ranking means for ranking the 2N sets of eigenvalues and eigenvectors in descending order of the variance calculated by the variance calculation means,
and wherein the performance evaluation means preferentially uses, as parameters for the performance evaluation, the eigenvalues and eigenvectors ranked higher by the ranking means.