JP2015097639A

JP2015097639A - Karaoke device, dance scoring method, and program

Info

Publication number: JP2015097639A
Application number: JP2013238951A
Authority: JP
Inventors: 大樹清水; Daiki Shimizu; 洋平鳥越; Yohei Torigoe
Original assignee: Nippon Control System Corp
Current assignee: Nippon Control System Corp
Priority date: 2013-11-19
Filing date: 2013-11-19
Publication date: 2015-05-28
Anticipated expiration: 2033-11-19
Also published as: JP6431259B2

Abstract

PROBLEM TO BE SOLVED: To provide a karaoke apparatus capable of scoring a dance with high accuracy, which has not been possible conventionally. SOLUTION: A karaoke apparatus that scores a dance with high accuracy includes: a photographing unit that photographs a singer and acquires a singer image, which is an image in which the singer appears; a skeleton information acquisition unit that acquires skeleton information, which is information indicating the singer's movement, using information contained in the singer image; and a score output unit that outputs a score, which is the result of scoring the singer's movement using the skeleton information.

Description

The present invention relates to a karaoke apparatus and the like.

Karaoke apparatuses have conventionally been developed (see Non-Patent Document 1).

"JOYSOUND f1 JS-F1", [online], XING Inc., [retrieved September 10, 2013], Internet [URL: http://joysound.biz/product/online/f1/index.html]

Conventionally, it has not been possible to provide a karaoke apparatus that scores a dance with high accuracy.

The karaoke apparatus of the first invention includes: a photographing unit that photographs a singer and acquires a singer image, which is an image in which the singer appears; a skeleton information acquisition unit that acquires skeleton information, which is information indicating the singer's movement, using information contained in the singer image; and a score output unit that outputs a score, which is the result of scoring the singer's movement using the skeleton information.

With this configuration, it is possible to provide a karaoke apparatus that scores a dance with high accuracy.

In the karaoke apparatus of the second invention, relative to the first invention, the singer image is an image containing distance information, and the skeleton information acquisition unit acquires the skeleton information using the distance information contained in the singer image.

In the karaoke apparatus of the third invention, relative to the second invention, the information contained in the singer image is, of the distance information, the distance information corresponding to the region of the singer.

The karaoke apparatus of the fourth invention, relative to any one of the first to third inventions, further includes: a movement determination information storage unit that stores movement determination information, which is information for determining the singer's movement; and a score calculation unit that determines the singer's movement indicated by the skeleton information using the movement determination information and the skeleton information, and calculates a score using the result of the determination. The score output unit outputs the score calculated by the score calculation unit.

In the karaoke apparatus of the fifth invention, relative to the fourth invention, the movement determination information storage unit stores one or more pieces of movement determination information for determining the singer's movement at each of one or more predetermined timings; the skeleton information acquisition unit acquires, using information contained in the singer image, one or more pieces of skeleton information indicating the singer's movement at each of the one or more predetermined timings; and the score calculation unit determines the singer's movement at the predetermined timing indicated by each piece of skeleton information acquired by the skeleton information acquisition unit, using the movement determination information corresponding to that timing, and calculates a score using the result of the determination.

With this configuration, it is possible to provide a karaoke apparatus that scores a dance with higher accuracy.

In the karaoke apparatus of the sixth invention, relative to the fifth invention, the score calculation unit scores the singer's movement at each of the one or more predetermined timings, calculates one or more scores that are the results of the scoring at the respective timings, and calculates a score that is the weighted average of the one or more scores.
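The weighted average described for the sixth invention can be sketched as follows. This is a minimal illustration only: the function name `weighted_average_score` and the way the weights are supplied are assumptions, since the text does not specify how the weights are chosen.

```python
def weighted_average_score(scores, weights):
    """Combine per-timing scores into one overall score by weighted average.

    scores  -- one score per predetermined timing
    weights -- one non-negative weight per timing (how weights are chosen
               is an assumption; the source text does not specify it)
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight
```

For instance, giving a later timing three times the weight of an earlier one makes the later score dominate the overall result.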

In the karaoke apparatus of the seventh invention, relative to any one of the fourth to sixth inventions, the movement determination information is one or more joint angle conditions, which are conditions on the angle of each of one or more joints of the singer; and the score calculation unit acquires the angle of each of the one or more joints of the singer using the skeleton information acquired by the skeleton information acquisition unit, determines the angle of each joint using the joint angle condition corresponding to that joint, and calculates a score using the result of the determination.

In the karaoke apparatus of the eighth invention, relative to any one of the fourth to seventh inventions, the movement determination information is one or more inter-joint angle conditions, which are conditions on the angle between joints for each of one or more sets of the singer's joints; and the score calculation unit acquires each of the one or more inter-joint angles of the singer using the skeleton information acquired by the skeleton information acquisition unit, determines each inter-joint angle using the corresponding inter-joint angle condition, and calculates a score using the result of the determination.

In the karaoke apparatus of the ninth invention, relative to any one of the fourth to eighth inventions, the movement determination information is one or more joint coordinate conditions, which are conditions on the coordinates of each of one or more joints of the singer; and the score calculation unit acquires the coordinates of each of the one or more joints of the singer using the skeleton information acquired by the skeleton information acquisition unit, determines the coordinates of each joint using the joint coordinate condition corresponding to that joint, and calculates a score using the result of the determination.

In the karaoke apparatus of the tenth invention, relative to any one of the fourth to ninth inventions, the movement determination information is one or more joint movement amount conditions, which are conditions on the movement amount of each of one or more joints of the singer; and the score calculation unit acquires the movement amount of each of the one or more joints of the singer using the skeleton information acquired by the skeleton information acquisition unit, determines the movement amount of each joint using the joint movement amount condition corresponding to that joint, and calculates a score using the result of the determination.

In the karaoke apparatus of the eleventh invention, relative to any one of the first to tenth inventions, the information contained in the singer image is the singer image after pixels satisfying a predetermined condition have been deleted from it.

With this configuration, the real-time performance of the dance scoring can be improved.
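As a rough illustration of deleting pixels before skeleton extraction, the sketch below drops distance-image pixels that are invalid or farther than a threshold. The specific condition (a distance threshold in millimetres), the function name `drop_far_pixels`, and the use of `None` to mark deleted pixels are all assumptions; the text only says that pixels satisfying "a predetermined condition" are deleted.

```python
def drop_far_pixels(distance_image, max_distance_mm):
    """Return a copy of a distance image with unwanted pixels deleted.

    distance_image  -- list of rows, each a list of distances (mm) or None
    max_distance_mm -- hypothetical threshold; pixels beyond it are treated
                       as background and replaced with None
    Pixels with a non-positive distance are treated as invalid and also dropped.
    """
    return [
        [d if (d is not None and 0 < d <= max_distance_mm) else None for d in row]
        for row in distance_image
    ]
```

Reducing the number of pixels passed to later stages is one plausible way such a step could shorten per-frame processing time.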

The karaoke apparatus of the twelfth invention, relative to any one of the first to eleventh inventions, further includes: an example image storage unit that stores an example image, which is an image showing an example of a dance; and an image output unit that outputs the example image.

With this configuration, an example of the dance can be displayed.

The karaoke apparatus of the thirteenth invention, relative to the twelfth invention, further includes a model image construction unit that acquires one or more attribute values of the singer using information contained in the singer image and constructs a model image, which is an image showing a model, using at least in part the one or more attribute values of the singer. The image output unit outputs the example image and the model image constructed by the model image construction unit.

With this configuration, a model that moves in real time in accordance with the singer's movement can be displayed.

According to the karaoke apparatus and the like of the present invention, it is possible to provide a karaoke apparatus that scores a dance with high accuracy.

Block diagram of the karaoke apparatus 1 in Embodiment 1
Flowchart explaining the overall operation of the karaoke apparatus 1
Flowchart explaining the skeleton information acquisition process
Flowchart explaining the score calculation and output process
Flowchart explaining the model image construction and output process
Diagram showing an example of the movement determination information
Diagram showing an example of the distance image
Diagram showing an example of the three-dimensional skeleton information
Diagram showing an example of an image rendered from the three-dimensional skeleton information
Diagram showing an example of the singer joint angles
Diagram showing an example of the parameters
Diagram showing an example of the movement determination information
Diagram showing an example of information indicating the result of the determination
Diagram showing an example of the singer model image
Diagram showing an example of the example image
Diagram showing an output example of the score, the model image, and the example image
Block diagram of the karaoke apparatus 2
Overview of the computer system in the above embodiments
Block diagram of the computer system in the above embodiments

Hereinafter, embodiments of the karaoke apparatus and the like according to the present invention will be described with reference to the drawings. Components given the same reference numerals in the embodiments operate in the same way, so repeated descriptions may be omitted. The format, content, and the like of each piece of information described in the embodiments are merely examples; any format or content may be used as long as it can express the meaning of that information.

(Embodiment 1)
This embodiment describes a karaoke apparatus 1 that photographs a singer, scores the singer's movement, and outputs a score that is the result of the scoring.

FIG. 1 is a block diagram of the karaoke apparatus 1 in this embodiment. The karaoke apparatus 1 includes a movement determination information storage unit 101, an example image storage unit 102, a music data storage unit 103, a reception unit 104, a photographing unit 105, a skeleton information acquisition unit 106, a score calculation unit 107, a score output unit 108, a model image construction unit 109, an image output unit 110, and a music playback unit 111.

The movement determination information storage unit 101 stores movement determination information, which is information for determining the singer's movement. The "movement determination information" is also information indicating the example movement. The "example movement" is a movement that serves as a model for the body movements in a dance. The term "movement" is interpreted broadly and includes motion, stance, posture, and the like. The "movement determination information" is usually time-series information. That is, the "movement determination information" is, for example, information corresponding to each of one or more predetermined timings. The "one or more predetermined timings" are, for example, frames (still images) constituting a moving image, sections of frames constituting a moving image, or times from the start of a moving image (time stamps).

Specifically, the "movement determination information" is, for example, one or more of the following types.
(A) Joint angle condition
(B) Joint coordinate condition
(C) Joint movement amount condition
(D) Inter-joint angle condition

(A) Joint angle condition: A "joint angle condition" is a condition on the angle of a joint (elbow, knee, wrist, ankle, neck, etc.), hereinafter referred to as a joint angle where appropriate. The "angle" is usually the interior angle, but may be the exterior angle. When the "movement determination information" consists of joint angle conditions, it is usually information composed of one or more joint angle conditions on the angles of one or more joints of the singer. A "joint angle condition" usually has information identifying the joint (hereinafter, joint identification information). The "joint identification information" is, for example, a joint name or an ID. A "joint angle condition" is, for example, "right elbow = 90°" or "120° ≦ left knee ≦ 130°".
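A joint angle condition of the kind described above could be represented and checked as in the following sketch. The class `JointAngleCondition`, the helper `exact`, the function `judge_joint_angles`, and the tolerance parameter are hypothetical names introduced for illustration; the text does not prescribe a data structure or a tolerance for equality conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JointAngleCondition:
    joint_id: str    # joint identification information, e.g. "right_elbow"
    min_deg: float   # lower bound of the allowed angle (degrees)
    max_deg: float   # upper bound of the allowed angle (degrees)

    def is_satisfied(self, angle_deg: float) -> bool:
        return self.min_deg <= angle_deg <= self.max_deg


def exact(joint_id: str, deg: float, tolerance_deg: float = 0.0) -> JointAngleCondition:
    """Build an equality condition such as "right elbow = 90 deg".

    The tolerance is an assumption: measured angles are rarely exactly equal.
    """
    return JointAngleCondition(joint_id, deg - tolerance_deg, deg + tolerance_deg)


def judge_joint_angles(conditions, singer_angles):
    """Return {joint_id: bool}; a joint missing from singer_angles fails."""
    return {
        c.joint_id: c.joint_id in singer_angles
        and c.is_satisfied(singer_angles[c.joint_id])
        for c in conditions
    }
```

With the two example conditions from the text, a singer whose right elbow is at 92° (with a 5° tolerance) passes, while a left knee at 135° falls outside the 120° to 130° range.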

(B) Joint coordinate condition: A "joint coordinate condition" is a condition on the coordinates of a joint point (hereinafter, joint coordinates). Here, a "joint point" (hereinafter simply "joint" in terms such as "joint coordinates") is a point corresponding to an anatomical joint or to a body part that moves about a joint (hand, foot, head, etc.). This point is usually an end point of the joint or body part. It may also be, for example, a point included in the joint or body part, such as its midpoint, or a point adjacent to the joint or body part. In short, it may be any point that corresponds to the joint or body part. A "joint point" may also be called a "node". The "coordinates" are usually three-dimensional, but may be, for example, two-dimensional. The "coordinates" may be absolute or relative. "Absolute coordinates" are coordinates in the singer image. "Relative coordinates" are coordinates indicating a position relative to another joint point. In other words, "relative coordinates" are the coordinates obtained when the coordinates of the reference joint point are taken as "(x, y) = (0, 0)".

When the "movement determination information" consists of joint coordinate conditions, it is usually information composed of one or more joint coordinate conditions on the coordinates of one or more joints of the singer. A "joint coordinate condition" usually has information identifying the joint (hereinafter, joint identification information). A "joint coordinate condition" is, for example, "right elbow = (x1, y1, z1)" or "(x2, y2, z2) ≦ left knee ≦ (x3, y3, z3)".
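The relative coordinates and range-type coordinate conditions above might be handled as in this sketch. Reading "(x2, y2, z2) ≦ left knee ≦ (x3, y3, z3)" as a component-wise bound on each axis is an assumption, as are the function names `to_relative` and `within_box`.

```python
def to_relative(joint, reference):
    """Express joint coordinates relative to a reference joint.

    The reference joint itself maps to the origin, matching the idea of
    treating the reference joint's coordinates as (0, 0) or (0, 0, 0).
    """
    return tuple(j - r for j, r in zip(joint, reference))


def within_box(point, lower, upper):
    """Check lower <= point <= upper on every axis (assumed interpretation)."""
    return all(lo <= p <= hi for p, lo, hi in zip(point, lower, upper))
```

Using relative coordinates makes a condition independent of where the singer stands in the image, at the cost of depending on the accuracy of the reference joint.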

(C) Joint movement amount condition: A "joint movement amount condition" is a condition on the movement amount of a joint (hereinafter, joint movement amount). The "joint movement amount" usually consists of the magnitude and the direction of the joint's movement. It may also be, for example, the magnitude of the movement alone. The "joint movement amount" may be regarded as a so-called motion vector (a vector quantity).

The content and format of the "joint movement amount" do not matter as long as it can indicate the magnitude of the joint's movement (length, number of pixels, etc.) and the direction of the joint's movement (coordinates, angle, etc.). The "magnitude of movement" may be, for example, a relative magnitude, such as a magnitude relative to the size (physique) of the singer. The "direction of movement" may be, for example, a relative direction, such as a direction relative to a coordinate axis or to the direction the singer is facing.

A "movement amount" is, for example, "(x: 10 px, y: 20 px)/sec" or "(25 px, 30°)/frame". The former means a movement of 10 px in the x-axis direction and 20 px in the y-axis direction per second. The latter means a movement of 25 px per frame in a direction 30° from the horizontal.

When the "movement determination information" consists of joint movement amount conditions, it is usually information composed of one or more joint movement amount conditions on the movement amounts of one or more joints of the singer. A "joint movement amount condition" usually has joint identification information. A "joint movement amount condition" is, for example, "right elbow = ((x: 10 px, y: 20 px)/sec)" or "((25 px, 30°)/frame) ≦ left knee ≦ ((30 px, 30°)/frame)".
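A per-frame joint movement amount in the "(magnitude, direction)" form can be derived from two successive joint positions, as in the sketch below. The function name `joint_movement` and the choice of two-dimensional pixel coordinates with the angle measured from the positive x axis are assumptions made for illustration.

```python
import math


def joint_movement(prev_xy, curr_xy):
    """Return (magnitude_px, direction_deg) of a joint's movement between frames.

    The direction is measured from the positive x axis via atan2. In image
    coordinates where y grows downward, the sign convention of the angle
    would need to be adjusted accordingly.
    """
    dx = curr_xy[0] - prev_xy[0]
    dy = curr_xy[1] - prev_xy[1]
    magnitude = math.hypot(dx, dy)
    direction_deg = math.degrees(math.atan2(dy, dx))
    return magnitude, direction_deg
```

Dividing the magnitude by the elapsed time between the two frames would give a per-second amount instead of a per-frame one.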

(D) Inter-joint angle condition: An "inter-joint angle condition" is a condition on the angle between joints. The "angle between joints" (hereinafter, inter-joint angle) is usually the angle formed by two line segments, each connecting two of three joints. The "inter-joint angle" is usually the interior angle, but may be the exterior angle. That is, "between joints" usually refers to a combination or permutation of three joints.

The "inter-joint angle" may also be, for example, the angle of a line segment (hereinafter, the target line segment) relative to a normalized coordinate system. The normalized coordinate system is, for example, a line segment (hereinafter, the reference line segment) or a plane (hereinafter, the reference plane). The reference line segment is, for example, a coordinate axis, the center line of the singer's body, or a line segment passing through the center of gravity of the singer's body. The reference plane is, for example, a coordinate plane, a plane horizontal to the singer's body, a plane perpendicular to the singer's body, a plane horizontal to the direction the singer's body is facing, or a plane perpendicular to the direction the singer's body is facing. The target line segment is usually a line segment connecting two joints.

When the "movement determination information" consists of inter-joint angle conditions, it is usually information composed of one or more inter-joint angle conditions on one or more inter-joint angles of the singer. An "inter-joint angle condition" usually has joint identification information. An "inter-joint angle condition" is, for example, "left shoulder - right shoulder - right wrist = 120°" or "45° ≦ right hand - head - waist ≦ 60°". The former means that the angle formed by the line segment connecting the left shoulder and the right shoulder and the line segment connecting the right shoulder and the right wrist is 120°. The latter means that the angle formed by the line segment connecting the right hand and the head and the line segment connecting the head and the waist is 45° or more and 60° or less.
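The inter-joint angle for three joints, such as "left shoulder - right shoulder - right wrist", can be computed from their coordinates with a dot product. The sketch below takes the middle joint as the vertex, which follows from the two line segments named in the text; the function name `inter_joint_angle` is hypothetical.

```python
import math


def inter_joint_angle(a, b, c):
    """Interior angle in degrees at joint b, formed by segments b-a and b-c.

    a, b, c are coordinate tuples of equal dimension (2D or 3D).
    """
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("joints must not coincide with the vertex joint")
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))
```

The result lies between 0° and 180°, so it corresponds to the interior angle; an exterior angle, if needed, could be obtained as 360° minus this value.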

Note that the movement determination information may be, for example, information composed of one or more joint angles. In this case, each of the one or more joint angles is usually associated with joint identification information. A joint angle associated with joint identification information is, for example, "right elbow: 90°" or "left knee: 130°". Such movement determination information is usually regarded as falling under the joint angle condition.

The movement determination information may also be, for example, information composed of one or more joint coordinates. In this case, each of the one or more joint coordinates is usually associated with joint identification information. Joint coordinates associated with joint identification information are, for example, "right elbow: (x1, y1, z1)" or "left knee: (x3, y3, z3)". Such movement determination information is usually regarded as falling under the joint coordinate condition.

The movement determination information may also be, for example, information composed of one or more joint movement amounts. In this case, each of the one or more joint movement amounts is usually associated with joint identification information. A joint movement amount associated with joint identification information is, for example, "right elbow: ((x: 10 px, y: 20 px)/sec)" or "left knee: ((30 px, 30°)/frame)". Such movement determination information is usually regarded as falling under the joint movement amount condition.

The movement determination information may also be, for example, information composed of one or more inter-joint angles. In this case, each of the one or more inter-joint angles is usually associated with joint identification information. An inter-joint angle associated with joint identification information is, for example, "left shoulder - right shoulder - right wrist: 120°" or "right hand - head - waist: 60°". Such movement determination information is usually regarded as falling under the inter-joint angle condition.

The example image storage unit 102 stores an example image, which is an image showing an example of the dance corresponding to a musical piece. The "musical piece" is usually the piece output when the music playback unit 111, described later, plays back music data. The "example image" is usually a moving image. The "example image" is, for example, an image showing a person, a character, or the like dancing. In other words, the "example image" is, for example, an image obtained by photographing a person, a character, or the like dancing. The "example image" may be any image that shows the body movements in the dance. For example, the "example image" may be a distance image, described later.

A "character" is usually a so-called avatar. A "character" may also be, for example, a character appearing in an animation or a so-called mascot character. A "character" may also be, for example, the singer.

In this embodiment, an "image" is a still image or a moving image. A moving image has two or more still images. The two or more still images of a moving image are usually the frames constituting the moving image.

The music data storage unit 103 stores one or more pieces of music data. "Music data" is information for handling a musical piece electronically. A "musical piece" is what is commonly called "music", a "song", or the like. The format of the "music data" does not matter as long as it is information for handling a musical piece electronically. The "music data" is, for example, in a format such as MIDI, WAV, or MP3. The "music data" usually has identification information of the music data (hereinafter, music data identification information). The "music data identification information" is, for example, a singer name and a song title. The identification information may also be, for example, a file name. The music data storage unit 103 usually accumulates music data received from a network. The music data is usually received by a receiving unit (not shown).

The reception unit 104 receives instructions. An instruction is, for example, a music selection instruction, which is an instruction to select music data, a power-on instruction, or a power-off instruction. When the reception unit 104 receives a power-on instruction, a power-off instruction, or the like, the karaoke apparatus 1 usually performs processing according to the received instruction.

また、受け付けとは、タッチパネルや、リモコン、キーボードなどの入力デバイスから入力された情報の取得、光ディスクや磁気ディスク、半導体メモリなどの記録媒体に格納されている情報の取得、有線もしくは無線の通信回線を介して送信された情報の受信などを含む概念である。 Here, "accepting" is a concept that includes acquiring information input from an input device such as a touch panel, a remote controller, or a keyboard; acquiring information stored on a recording medium such as an optical disk, a magnetic disk, or a semiconductor memory; and receiving information transmitted via a wired or wireless communication line.

受付部１０４における情報や指示などの入力手段は、メニュー画面によるものや、キーボードなど、何でもよい。受付部１０４は、メニュー画面の制御ソフトウェアや、キーボード等の入力手段のデバイスドライバなどで実現され得る。 The input means such as information and instructions in the reception unit 104 may be anything such as a menu screen or a keyboard. The accepting unit 104 can be realized by control software for a menu screen, a device driver for input means such as a keyboard, and the like.

撮影部１０５は、歌唱者を撮影する。そして、撮影部１０５は、当該歌唱者が写された画像である歌唱者画像を取得する。当該「歌唱者」は、通常、１人である。また、当該「歌唱者」は、例えば、２人以上であってもよい。また、「歌唱者画像」には、少なくとも歌唱者が写されていればよく、その他の物体などが写されているか否かについては、問わない。 The photographing unit 105 photographs a singer and acquires a singer image, which is an image showing the singer. The "singer" is usually one person, but may be, for example, two or more people. The "singer image" need only show at least the singer; it does not matter whether other objects also appear in it.

また、「歌唱者画像」は、通常、距離情報、輝度情報、またはＲＧＢ情報のうちの１以上の情報を含む画像である。また、「歌唱者画像」は、例えば、距離画像である。「距離画像」とは、１以上の距離情報を有する画像である。また、「距離情報」とは、通常、カラオケ装置１から歌唱者までの距離を示す情報である。また、「歌唱者までの距離」とは、具体的に、歌唱者の頭や目、鼻、口、肩、胸、腰、膝などの各部位や、歌唱者の体の表面上の任意の点までの距離である。また、「距離情報」は、例えば、画像を構成する１以上の各画素に対応付いている。また、「距離画像」は、距離情報のみで構成されていてもよいし、ＲＧＢ情報や輝度情報などをも有していてもよい。なお、歌唱者画像のデータ構造は問わない。 The "singer image" is usually an image including one or more of distance information, luminance information, and RGB information. The "singer image" is, for example, a distance image. A "distance image" is an image having one or more pieces of distance information. "Distance information" is usually information indicating the distance from the karaoke apparatus 1 to the singer. Specifically, the "distance to the singer" is the distance to a part of the singer such as the head, eyes, nose, mouth, shoulders, chest, waist, or knees, or to an arbitrary point on the surface of the singer's body. The "distance information" is associated with, for example, each of the one or more pixels constituting the image. The "distance image" may consist of distance information only, or may also have RGB information, luminance information, and the like. The data structure of the singer image does not matter.

また、「歌唱者画像」は、例えば、撮影画像であってもよい。「撮影画像」とは、１以上の距離情報を有さない画像である。また、「撮影画像」は、通常、いわゆるカラー画像であるが、いわゆるグレースケール画像であってもよい。 The "singer image" may also be, for example, a photographed image. A "photographed image" is an image that has no distance information. The "photographed image" is usually a so-called color image, but may be a so-called grayscale image.

以上より、「歌唱者画像」は、例えば、ＲＧＢ情報と輝度情報のいずれか一方または両方を有していてもよいし、両方を有していなくてもよい。「ＲＧＢ情報」とは、画像を構成する１以上の各画素の色を示す情報である。また、「輝度情報」とは、画像を構成する１以上の各画素の輝度を示す情報である。また、「輝度」には、明暗や、濃淡なども含み、広く解する。また、「画素」は、通常、画像中の座標（ｘ，ｙ）により特定される。 As described above, the “singer image” may have, for example, one or both of RGB information and luminance information, or may not have both. “RGB information” is information indicating the color of one or more pixels constituting an image. “Luminance information” is information indicating the luminance of one or more pixels constituting an image. In addition, “brightness” is widely understood, including brightness and darkness and shading. The “pixel” is usually specified by coordinates (x, y) in the image.

また、撮影部１０５は、例えば、距離画像と撮影画像のいずれか一方のみを取得してもよいし、両方を取得してもよい。距離画像および撮影画像を取得する場合、当該距離画像および当該撮影画像において、歌唱者が写されている領域は、通常、同様の位置である。つまり、距離画像および撮影画像を取得する場合、撮影部１０５は、通常、歌唱者が同様の位置に配置された距離画像および撮影画像を取得する。また、「歌唱者画像の取得」には、歌唱者の撮影を開始し、歌唱者画像の取得を開始することも含まれる。 The photographing unit 105 may acquire, for example, only one of the distance image and the photographed image, or both. When both are acquired, the region in which the singer appears is usually at the same position in the distance image and in the photographed image. That is, when acquiring a distance image and a photographed image, the photographing unit 105 usually acquires a distance image and a photographed image in which the singer is located at the same position. "Acquiring a singer image" also includes starting to photograph the singer and starting to acquire singer images.

なお、撮影部１０５は、通常、いわゆる距離画像カメラから、距離画像を取得する。また、撮影部１０５は、例えば、いわゆるステレオカメラから、距離画像を取得してもよい。また、撮影部１０５は、例えば、ＣＭＯＳやＣＣＤなどのイメージセンサ（固体撮像素子）や、イメージセンサを用いたカメラ（デジタルスチルカメラ、デジタルビデオカメラ）などから、撮影画像を取得する。また、撮影部１０５は、例えば、これらの装置を有していてもよいし、これらの装置で実現され得てもよい。また、撮影部１０５の処理手順は、通常、ソフトウェアで実現され、当該ソフトウェアはＲＯＭ等の記録媒体に記録されている。 Note that the photographing unit 105 usually acquires a distance image from a so-called distance image camera. Further, the photographing unit 105 may acquire a distance image from a so-called stereo camera, for example. The photographing unit 105 acquires a photographed image from an image sensor (solid-state imaging device) such as a CMOS or CCD, or a camera (digital still camera or digital video camera) using the image sensor, for example. In addition, the imaging unit 105 may include, for example, these devices or may be realized by these devices. The processing procedure of the photographing unit 105 is usually realized by software, and the software is recorded on a recording medium such as a ROM.

また、距離画像カメラには、例えば、３次元距離画像カメラＺＣシリーズ（http://www.optex.co.jp/product/3d.html）や、ＴＯＦ方式距離画像カメラＤＩＳＴＡＮＺＡシリーズ「http://www.brainvision.co.jp/xoops/modules/tinyd4/index.php?id=15」、KINECT for Windows（http://www.microsoft.com/en-us/kinectforwindows/）などがある。 Examples of the distance image camera include the three-dimensional distance image camera ZC series (http://www.optex.co.jp/product/3d.html), the TOF distance image camera DISTANZA series (http://www.brainvision.co.jp/xoops/modules/tinyd4/index.php?id=15), and KINECT for Windows (http://www.microsoft.com/en-us/kinectforwindows/).

スケルトン情報取得部１０６は、歌唱者画像を用いてスケルトン情報を取得する。当該歌唱者画像は、撮影部１０５が取得した歌唱者画像である。また、「スケルトン情報」とは、歌唱者の動きを示す情報である。また、「スケルトン情報を取得する」とは、スケルトン情報を構成することであってもよい。 The skeleton information acquisition unit 106 acquires skeleton information using the singer image. The singer image is a singer image acquired by the photographing unit 105. The “skeleton information” is information indicating the movement of the singer. Further, “acquiring skeleton information” may be constituting skeleton information.

具体的に、「スケルトン情報」とは、いわゆるジョイントの位置を示す１以上の座標の集合である。言い換えると、「スケルトン情報」とは、いわゆるジョイントの位置を示す１以上の座標から構成される情報である。また、当該「座標」は、通常、３次元の座標である。また、当該「座標」は、例えば、２次元の座標であってもよい。また、当該３次元の座標は、例えば、３次元画像を用いて取得された３次元の座標であってもよいし、２次元画像を用いて取得された３次元の座標であってもよい。また、当該２次元の座標は、例えば、３次元画像を用いて取得された２次元の座標であってもよいし、２次元画像を用いて取得された２次元の座標であってもよい。また、当該３次元画像および２次元画像は、撮影部１０５が取得した画像である。また、「スケルトン情報」は、例えば、歌唱者の体に対応するすべてのジョイントの位置を示す１以上の座標から構成されてもよいし、歌唱者が有する一部の部位（手、足、頭など）のみに対応するジョイントの位置を示す１以上の座標から構成されてもよい。 Specifically, "skeleton information" is a set of one or more coordinates indicating the positions of so-called joints. In other words, "skeleton information" is information composed of one or more coordinates indicating the positions of so-called joints. The "coordinates" are usually three-dimensional coordinates, but may be, for example, two-dimensional coordinates. The three-dimensional coordinates may be, for example, three-dimensional coordinates acquired using a three-dimensional image or three-dimensional coordinates acquired using a two-dimensional image. Likewise, the two-dimensional coordinates may be, for example, two-dimensional coordinates acquired using a three-dimensional image or two-dimensional coordinates acquired using a two-dimensional image. The three-dimensional image and the two-dimensional image are images acquired by the photographing unit 105. The "skeleton information" may be composed of, for example, one or more coordinates indicating the positions of all joints of the singer's body, or one or more coordinates indicating the positions of only those joints that correspond to some parts of the singer (hands, feet, head, and the like).

また、「スケルトン情報」が有する座標は、通常、連結しているジョイントごとに、対応付いている。つまり、例えば、肘と手首とは骨により連結されている。従って、「スケルトン情報」において、肘の座標と手首の座標とは、対応付いている。また、スケルトン情報が有する１以上の各座標には、通常、当該座標に対応するジョイントを識別するジョイント識別情報が対応付いている。また、当該１以上の各座標のうち、関節に対応する座標には、例えば、当該関節の角度が対応付いていてもよい。 In addition, the coordinates of “skeleton information” are usually associated with each connected joint. That is, for example, the elbow and the wrist are connected by a bone. Therefore, in the “skeleton information”, the coordinates of the elbow and the coordinates of the wrist are associated with each other. Further, one or more coordinates included in the skeleton information are usually associated with joint identification information for identifying a joint corresponding to the coordinates. Of the one or more coordinates, a coordinate corresponding to a joint may be associated with an angle of the joint, for example.
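As a non-limiting illustration of the structure described above, the following Python sketch shows one way skeleton information could be held in memory. The names `Joint`, `Skeleton`, `links`, and `connected_to` are hypothetical and do not appear in the embodiment; the sample coordinates are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Coord = Tuple[float, ...]  # (x, y) or (x, y, z)


@dataclass
class Joint:
    joint_id: str                      # joint identification information
    coord: Coord                       # position of the joint
    angle_deg: Optional[float] = None  # optional angle associated with the joint


@dataclass
class Skeleton:
    joints: Dict[str, Joint] = field(default_factory=dict)
    links: List[Tuple[str, str]] = field(default_factory=list)  # connected joint pairs

    def add(self, joint: Joint) -> None:
        self.joints[joint.joint_id] = joint

    def connect(self, a: str, b: str) -> None:
        """Associate two joints that are connected by a bone."""
        if a not in self.joints or b not in self.joints:
            raise KeyError("both joints must be added before they are connected")
        self.links.append((a, b))

    def connected_to(self, joint_id: str) -> List[str]:
        """Return the identifiers of the joints linked to joint_id."""
        return [b if a == joint_id else a
                for a, b in self.links if joint_id in (a, b)]


# Illustrative values only: an elbow and a wrist connected by a bone.
skeleton = Skeleton()
skeleton.add(Joint("right_elbow", (0.4, 1.2, 2.0), angle_deg=95.0))
skeleton.add(Joint("right_wrist", (0.6, 1.0, 1.9)))
skeleton.connect("right_elbow", "right_wrist")
```

Keying the coordinates by joint identification information makes it straightforward to look up the condition that applies to a given joint later on.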

また、スケルトン情報取得部１０６は、例えば、以下のいずれかの場合に応じて、スケルトン情報を取得する。
（Ａ）歌唱者画像が距離画像である場合
（Ｂ）歌唱者画像が撮影画像である場合
In addition, the skeleton information acquisition unit 106 acquires skeleton information according to, for example, one of the following cases.
(A) When the singer image is a distance image
(B) When the singer image is a photographed image

（Ａ）の場合：この場合は、撮影部１０５が距離画像を取得した場合である。この場合、スケルトン情報取得部１０６は、通常、距離画像を用いて、３次元の座標から構成されるスケルトン情報を取得する。当該スケルトン情報の取得の手順は、例えば、以下のとおりである。
（１）距離画像が有する距離情報を用いて、当該距離情報が示す距離が予め決められた条件を満たすほど近い画素の座標（ｘ，ｙ）を取得する。これにより、距離画像中に写された歌唱者の領域（以下、適宜、歌唱者領域とする）が検出される。
（２）（１）で検出した歌唱者領域の輪郭を検出する。
（３）（２）で検出した輪郭に対してパターン認識を行い、距離画像における歌唱者のジョイントを検出する。これにより、距離画像中に写された歌唱者の各ジョイントに対応する２次元の座標（ｘ，ｙ）が取得される。
（４）（３）で取得した２次元の座標（ｘ，ｙ）を、連結しているジョイントごとに対応付ける。
（５）（４）で対応付けた２次元の座標（ｘ，ｙ）で示される各画素に対応する距離情報と、予め決められた算出式（以下、適宜、ジョイント座標値算出式とする）とを用いて、当該距離情報に対応する座標値（ｚ）を算出する。
（６）（４）で対応付けた２次元の座標（ｘ，ｙ）と、（５）で算出した座標値（ｚ）とを対応付け、３次元の座標（ｘ，ｙ，ｚ）を取得する。これにより、各ジョイントの位置を示す３次元の座標の集合（スケルトン情報）が取得される。
Case (A): This is the case where the photographing unit 105 has acquired a distance image. In this case, the skeleton information acquisition unit 106 usually uses the distance image to acquire skeleton information composed of three-dimensional coordinates. The procedure for acquiring the skeleton information is, for example, as follows.
(1) Using the distance information of the distance image, acquire the coordinates (x, y) of pixels whose distance indicated by the distance information is close enough to satisfy a predetermined condition. This detects the region of the singer shown in the distance image (hereinafter referred to as the singer region, as appropriate).
(2) Detect the contour of the singer region detected in (1).
(3) Perform pattern recognition on the contour detected in (2) to detect the singer's joints in the distance image. This yields the two-dimensional coordinates (x, y) corresponding to each joint of the singer shown in the distance image.
(4) Associate the two-dimensional coordinates (x, y) acquired in (3) with one another for each pair of connected joints.
(5) Using the distance information corresponding to each pixel indicated by the two-dimensional coordinates (x, y) associated in (4) and a predetermined calculation formula (hereinafter referred to as the joint coordinate value calculation formula, as appropriate), calculate the coordinate value (z) corresponding to the distance information.
(6) Associate the two-dimensional coordinates (x, y) associated in (4) with the coordinate value (z) calculated in (5) to acquire three-dimensional coordinates (x, y, z). This yields a set of three-dimensional coordinates indicating the position of each joint (skeleton information).
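Steps (1), (5), and (6) of the procedure for case (A) can be sketched in Python as follows. This is only an illustration under stated assumptions: the distance image is a list of rows of distances, the joints are assumed to have already been located in two dimensions (steps (2) to (4) are not implemented), and `z_formula` is a hypothetical stand-in for the joint coordinate value calculation formula, whose actual form is not given here.

```python
def singer_region(depth, max_distance):
    """Step (1): coordinates (x, y) of pixels close enough to the camera."""
    return {(x, y)
            for y, row in enumerate(depth)
            for x, d in enumerate(row)
            if d <= max_distance}


def lift_joints(depth, joints_2d, z_formula=float):
    """Steps (5)-(6): attach a z value to each 2D joint coordinate.

    z_formula stands in for the joint coordinate value calculation formula;
    the default (z equals the raw distance) is an assumption.
    """
    return {joint_id: (x, y, z_formula(depth[y][x]))
            for joint_id, (x, y) in joints_2d.items()}


# Toy 3x3 distance image: small values are near, 9 is background.
depth = [
    [9, 9, 9],
    [9, 2, 9],
    [9, 3, 9],
]
region = singer_region(depth, max_distance=5)
joints_3d = lift_joints(depth, {"head": (1, 1), "waist": (1, 2)})
```

In practice the threshold `max_distance` and the formula would depend on the distance image camera in use.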

また、（Ａ）の場合、スケルトン情報の取得の手順は、例えば、以下のとおりであってもよい。
（１’）距離画像が有する距離情報を用いて、当該距離情報が示す距離が予め決められた条件を満たすほど近い画素の座標（ｘ，ｙ）を取得する。これにより、歌唱者領域が検出される。
（２’）（１’）で検出した歌唱者領域内の各画素に対応する距離情報と、予め決められた算出式（以下、適宜、歌唱者座標値算出式とする）とを用いて、当該距離情報に対応する座標値（ｚ）を算出する。
（３’）（１’）で取得した２次元の座標（ｘ，ｙ）と、（２’）で算出した座標値（ｚ）とを対応付け、３次元の座標（ｘ，ｙ，ｚ）を取得する。これにより、距離画像内に写された歌唱者の形状が検出される。
（４’）予め保持している基準スケルトン情報に、（３’）で取得した歌唱者の形状を適用する。当該「基準スケルトン情報」とは、基準の動きを示すスケルトン情報である。また、当該「適用する」とは、当該スケルトン情報が有する３次元の座標を、歌唱者の形状に合わせて変更することである。また、当該歌唱者の形状の適用には、通常、逆運動学を用いる。これにより、各ジョイントの位置を示す３次元の座標の集合（スケルトン情報）が取得される。
In case (A), the procedure for acquiring skeleton information may also be, for example, as follows.
(1′) Using the distance information of the distance image, acquire the coordinates (x, y) of pixels whose distance indicated by the distance information is close enough to satisfy a predetermined condition. This detects the singer region.
(2′) Using the distance information corresponding to each pixel in the singer region detected in (1′) and a predetermined calculation formula (hereinafter referred to as the singer coordinate value calculation formula, as appropriate), calculate the coordinate value (z) corresponding to the distance information.
(3′) Associate the two-dimensional coordinates (x, y) acquired in (1′) with the coordinate value (z) calculated in (2′) to acquire three-dimensional coordinates (x, y, z). This detects the shape of the singer shown in the distance image.
(4′) Apply the shape of the singer acquired in (3′) to reference skeleton information held in advance. The "reference skeleton information" is skeleton information indicating a reference movement. "Applying" here means changing the three-dimensional coordinates of that skeleton information to fit the shape of the singer. Inverse kinematics is usually used for this. This yields a set of three-dimensional coordinates indicating the position of each joint (skeleton information).

なお、上記（１）、（１’）において、スケルトン情報取得部１０６は、例えば、距離情報が示す距離が予め決められた条件を満たす１以上の画素が隣接して形成される領域を、歌唱者領域として検出してもよい。この場合、スケルトン情報取得部１０６は、通常、当該１以上の画素が隣接して形成される領域の面積が、予め決められた条件を満たすほど大きい場合に、当該領域を歌唱者領域として検出する。 In (1) and (1′) above, the skeleton information acquisition unit 106 may, for example, detect as the singer region a region formed by one or more adjacent pixels whose distance indicated by the distance information satisfies a predetermined condition. In this case, the skeleton information acquisition unit 106 usually detects the region as the singer region when the area of the region formed by those adjacent pixels is large enough to satisfy a predetermined condition.

また、上記（１）、（１’）において、スケルトン情報取得部１０６は、例えば、距離画像に対して二値化やラベリングなどの画像処理を施し、歌唱者領域を検出してもよい。 Further, in the above (1) and (1 ′), the skeleton information acquisition unit 106 may detect the singer area by performing image processing such as binarization and labeling on the distance image, for example.
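The detection of the singer region by binarization, labeling, and an area condition might look like the following Python sketch. It is an illustration only: the 4-neighbour connectivity, the choice of the largest qualifying region, and the names `label_regions`, `detect_singer_region`, `max_distance`, and `min_area` are assumptions that the embodiment does not specify.

```python
from collections import deque


def label_regions(depth, max_distance):
    """Binarize the distance image and group adjacent near pixels into regions."""
    h, w = len(depth), len(depth[0])
    near = [[depth[y][x] <= max_distance for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if not near[y][x] or seen[y][x]:
                continue
            # Breadth-first search over 4-connected neighbours.
            queue = deque([(x, y)])
            seen[y][x] = True
            region = []
            while queue:
                cx, cy = queue.popleft()
                region.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and near[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


def detect_singer_region(depth, max_distance, min_area):
    """Return the largest region whose area is at least min_area, or None."""
    candidates = [r for r in label_regions(depth, max_distance) if len(r) >= min_area]
    return max(candidates, key=len) if candidates else None


# One isolated near pixel (noise) and one 2x2 block of near pixels.
depth = [
    [1, 9, 9, 9],
    [9, 9, 2, 2],
    [9, 9, 2, 2],
]
singer = detect_singer_region(depth, max_distance=5, min_area=3)
```

The area condition is what rejects the isolated pixel in the top-left corner of the toy image.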

また、上記（５）におけるジョイント座標値算出式、および上記（２’）における歌唱者座標値算出式は、例えば、距離情報を代入するための変数を有する。また、当該ジョイント座標値算出式および歌唱者座標値算出式は、例えば、いわゆる関数（プログラム）であってもよい。また、スケルトン情報取得部１０６は、通常、ジョイント座標値算出式および歌唱者座標値算出式を予め保持している。また、ジョイント座標値算出式と歌唱者座標値算出式とは、通常、異なる。 Further, the joint coordinate value calculation formula in (5) and the singer coordinate value calculation formula in (2 ′) have, for example, variables for substituting distance information. The joint coordinate value calculation formula and the singer coordinate value calculation formula may be, for example, a so-called function (program). Also, the skeleton information acquisition unit 106 normally holds in advance a joint coordinate value calculation formula and a singer coordinate value calculation formula. Further, the joint coordinate value calculation formula and the singer coordinate value calculation formula are usually different.

また、上記（Ａ）の場合、スケルトン情報取得部１０６は、例えば、２次元の座標から構成されるスケルトン情報を取得してもよい。当該スケルトン情報の取得の手順は、例えば、上記の（１）から（４）までの処理である。また、当該スケルトン情報の取得の手順は、例えば、上記の（１’）から（４’）までの処理と、その後に続く以下の（５’）の処理である。
（５’）（４’）で取得した１以上の各３次元の座標から、座標値（ｚ）を削除する。
In case (A) above, the skeleton information acquisition unit 106 may also acquire, for example, skeleton information composed of two-dimensional coordinates. The procedure for acquiring that skeleton information is, for example, processes (1) to (4) above. Alternatively, the procedure is, for example, processes (1′) to (4′) above followed by process (5′) below.
(5′) Delete the coordinate value (z) from each of the one or more three-dimensional coordinates acquired in (4′).

（Ｂ）の場合：この場合は、撮影部１０５が撮影画像を取得した場合である。この場合、スケルトン情報取得部１０６は、通常、撮影画像を用いて、２次元の座標から構成されるスケルトン情報を取得する。当該スケルトン情報の取得の手順は、例えば、以下のとおりである。
（１）撮影画像に対して二値化やラベリングなどの画像処理を施し、撮影画像中に写された歌唱者の輪郭を検出する。
（２）（１）で検出した輪郭に対してパターン認識を行い、撮影画像における歌唱者のジョイントを検出する。これにより、撮影画像中に写された歌唱者の各ジョイントに対応する２次元の座標（ｘ，ｙ）が取得される。
（３）（２）で取得した２次元の座標を、連結しているジョイントごとに対応付ける。これにより、各ジョイントの位置を示す２次元の座標の集合（スケルトン情報）が取得される。
Case (B): This is the case where the photographing unit 105 has acquired a photographed image. In this case, the skeleton information acquisition unit 106 usually uses the photographed image to acquire skeleton information composed of two-dimensional coordinates. The procedure for acquiring the skeleton information is, for example, as follows.
(1) Apply image processing such as binarization and labeling to the photographed image to detect the contour of the singer shown in the photographed image.
(2) Perform pattern recognition on the contour detected in (1) to detect the singer's joints in the photographed image. This yields the two-dimensional coordinates (x, y) corresponding to each joint of the singer shown in the photographed image.
(3) Associate the two-dimensional coordinates acquired in (2) with one another for each pair of connected joints. This yields a set of two-dimensional coordinates indicating the position of each joint (skeleton information).

また、上記（Ｂ）の場合において、スケルトン情報の取得の手順は、例えば、以下のとおりであってもよい。
（１’）撮影画像に対して二値化やラベリングなどの画像処理を施し、撮影画像中の歌唱者領域を検出する。
（２’）（１’）で検出した歌唱者領域に対して細線化の処理を施し、撮影画像中に写された歌唱者の中心線を取得する。
（３’）（２’）で取得した中心線の交点や、端点、交点と端点の中点（中心線上の点）などの２次元の座標（ｘ，ｙ）を取得する。これにより、撮影画像中に写された歌唱者の各ジョイントに対応する２次元の座標（ｘ，ｙ）が取得される。
（４’）（３’）で取得した座標を、中心線により連結されている座標ごとに対応付ける。これにより、各ジョイントの位置を示す２次元の座標の集合が取得される。
In case (B) above, the procedure for acquiring skeleton information may also be, for example, as follows.
(1′) Apply image processing such as binarization and labeling to the photographed image to detect the singer region in the photographed image.
(2′) Apply thinning to the singer region detected in (1′) to acquire the center line of the singer shown in the photographed image.
(3′) Acquire two-dimensional coordinates (x, y) such as the intersections and end points of the center line acquired in (2′) and the midpoints between an intersection and an end point (points on the center line). This yields the two-dimensional coordinates (x, y) corresponding to each joint of the singer shown in the photographed image.
(4′) Associate the coordinates acquired in (3′) with one another for each pair of coordinates connected by the center line. This yields a set of two-dimensional coordinates indicating the position of each joint.
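The extraction of end points and intersections in step (3′) can be illustrated with the Python sketch below. It assumes that thinning has already produced a one-pixel-wide, 4-connected center line given as a set of (x, y) pixels; the thinning itself is not implemented, and the function name `classify_centerline` is hypothetical. A pixel with exactly one neighbour on the line is treated as an end point, and a pixel with three or more neighbours as an intersection.

```python
def classify_centerline(pixels):
    """Split center line pixels into end points and intersections.

    pixels is a set of (x, y) coordinates forming a 4-connected,
    one-pixel-wide center line, as produced by a thinning step.
    """
    offsets = ((1, 0), (-1, 0), (0, 1), (0, -1))
    end_points, intersections = set(), set()
    for x, y in pixels:
        # Count how many of the four direct neighbours lie on the center line.
        degree = sum((x + dx, y + dy) in pixels for dx, dy in offsets)
        if degree == 1:
            end_points.add((x, y))
        elif degree >= 3:
            intersections.add((x, y))
    return end_points, intersections


# A T-shaped center line: a horizontal bar with a vertical stem below its middle.
centerline = {(x, 0) for x in range(5)} | {(2, y) for y in range(1, 4)}
end_points, intersections = classify_centerline(centerline)
```

With an 8-connected center line, simple neighbour counting can report spurious intersections next to a true branch, so a real implementation would likely need a more careful test.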

また、スケルトン情報取得部１０６は、取得した各ジョイントの座標に対して、通常、当該各ジョイントを識別するジョイント識別情報を対応付ける。例えば、右肘の座標を取得した場合、スケルトン情報取得部１０６は、当該右肘の座標に、例えば、ジョイント識別情報「右肘」を対応付ける。なお、ジョイント識別情報は、通常、予め決められた記憶領域に格納されている。 Also, the skeleton information acquisition unit 106 normally associates joint identification information for identifying each joint with the acquired coordinates of each joint. For example, when the coordinates of the right elbow are acquired, the skeleton information acquisition unit 106 associates, for example, joint identification information “right elbow” with the coordinates of the right elbow. The joint identification information is usually stored in a predetermined storage area.

また、３次元の座標から構成されるスケルトン情報を、以下、適宜、３次元スケルトン情報とする。また、２次元の座標から構成されるスケルトン情報を、以下、適宜、２次元スケルトン情報とする。 Hereinafter, the skeleton information composed of the three-dimensional coordinates is appropriately referred to as three-dimensional skeleton information. Hereinafter, the skeleton information composed of the two-dimensional coordinates is appropriately referred to as two-dimensional skeleton information.

なお、人物が写された画像（ここでは、歌唱者画像）を用いてスケルトン情報を取得する方法や手順などは、公知であるので、詳細な説明を省略する。なお、距離画像を用いてのスケルトン情報の取得は、例えば、市販のソフトウェアや、市販の距離画像カメラに付属のＳＤＫ（ＳｏｆｔｗａｒｅＤｅｖｅｌｏｐｍｅｎｔＫｉｔ）などを用いることにより行うことが可能である。 Methods and procedures for acquiring skeleton information from an image showing a person (here, the singer image) are well known, so a detailed description is omitted. Skeleton information can be acquired from a distance image by using, for example, commercially available software or the SDK (Software Development Kit) supplied with a commercially available distance image camera.

また、スケルトン情報の取得に際して、スケルトン情報取得部１０６は、例えば、歌唱者画像から予め決められた条件（以下、適宜、画素削除条件とする）を満たす画素を削除してもよい。この場合、スケルトン情報取得部１０６は、当該削除後の歌唱者画像を用いて、スケルトン情報を取得する。また、これにより、歌唱者画像からより高速にスケルトン情報を取得することができる。なお、当該「画素の削除」とは、いわゆる画素の間引きである。また、当該「画素削除条件」とは、例えば、削除するか否かが判断される対象となる画素に対応する距離情報と、当該画素の周囲の４個（４近傍）または８個（８近傍）の各画素に対応する距離情報との差の平均が予め決められた閾値以下であることや、当該画素の２次元の座標（ｘ，ｙ）の座標値が共に偶数であること、当該画素の２次元の座標（ｘ，ｙ）の座標値が共に奇数であることなどである。 When acquiring skeleton information, the skeleton information acquisition unit 106 may, for example, delete from the singer image the pixels that satisfy a predetermined condition (hereinafter referred to as the pixel deletion condition, as appropriate). In this case, the skeleton information acquisition unit 106 acquires the skeleton information using the singer image after the deletion, which allows skeleton information to be acquired from the singer image at higher speed. This "deletion of pixels" is so-called pixel thinning (decimation). The "pixel deletion condition" is, for example, that the average of the differences between the distance information of the pixel under consideration and the distance information of each of the four (4-neighbourhood) or eight (8-neighbourhood) surrounding pixels is equal to or less than a predetermined threshold, that the x and y coordinate values of the pixel are both even, or that the x and y coordinate values of the pixel are both odd.
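Two of the pixel deletion conditions mentioned above can be sketched in Python as follows. The sketch is illustrative only: the image is assumed to be a dictionary mapping (x, y) to a distance value, neighbours that fall outside the image are simply ignored, and the names `both_even`, `flat_neighbourhood`, and `delete_pixels` are hypothetical.

```python
def both_even(pixel, image):
    """Condition: the x and y coordinate values are both even."""
    x, y = pixel
    return x % 2 == 0 and y % 2 == 0


def flat_neighbourhood(threshold, offsets=((1, 0), (-1, 0), (0, 1), (0, -1))):
    """Condition: mean distance difference to the neighbours <= threshold.

    The default offsets describe the 4-neighbourhood; pass eight offsets
    to use the 8-neighbourhood instead.
    """
    def condition(pixel, image):
        x, y = pixel
        diffs = [abs(image[pixel] - image[(x + dx, y + dy)])
                 for dx, dy in offsets if (x + dx, y + dy) in image]
        return bool(diffs) and sum(diffs) / len(diffs) <= threshold
    return condition


def delete_pixels(image, condition):
    """Return a copy of image without the pixels that satisfy condition."""
    return {p: d for p, d in image.items() if not condition(p, image)}


# Toy one-row image: two near pixels followed by a far one.
image = {(0, 0): 10, (1, 0): 10, (2, 0): 50}
thinned_by_parity = delete_pixels(image, both_even)
thinned_by_flatness = delete_pixels(image, flat_neighbourhood(threshold=5))
```

The flatness condition removes pixels in uniform areas and keeps those near a depth edge, which is where the singer's outline would be expected to lie.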

スコア算出部１０７は、歌唱者の動きを採点し、当該採点の結果であるスコアを算出する。言い換えると、スコア算出部１０７は、歌唱者のダンスのスコアを算出する。なお、以下において、「採点を行う」ことは、「スコアを算出する」ことであるものとする。 The score calculation unit 107 scores the movement of the singer and calculates a score that is a result of the scoring. In other words, the score calculation unit 107 calculates the dance score of the singer. In the following, “scoring” means “score calculation”.

また、スコア算出部１０７は、通常、スケルトン情報を用いて、当該スケルトン情報が示す歌唱者の動きを採点し、スコアを算出する。当該スケルトン情報は、スケルトン情報取得部１０６が取得したスケルトン情報である。具体的に、スコア算出部１０７は、例えば、動き判定情報を用いて、スケルトン情報が示す歌唱者の動きを判定し、当該判定の結果を用いてスコアを算出する。また、スコア算出部１０７は、例えば、スケルトン情報と、動き判定情報とを比較し、当該比較の結果を用いてスコアを算出する。当該「比較する」とは、差分や一致度などを算出することである。つまり、当該「比較の結果」とは、差分や一致度などである。 Further, the score calculation unit 107 usually scores the movement of the singer indicated by the skeleton information using the skeleton information, and calculates the score. The skeleton information is skeleton information acquired by the skeleton information acquisition unit 106. Specifically, the score calculation unit 107 determines, for example, the movement of the singer indicated by the skeleton information using the movement determination information, and calculates the score using the determination result. In addition, the score calculation unit 107 compares, for example, skeleton information and motion determination information, and calculates a score using the comparison result. The “compare” is to calculate a difference, a degree of coincidence, and the like. That is, the “comparison result” is a difference, a degree of coincidence, and the like.

また、スコア算出部１０７は、例えば、以下のいずれかの場合に応じて、スコアを算出する。
（Ａ）動き判定情報が関節角度条件である場合
（Ｂ）動き判定情報がジョイント座標条件である場合
（Ｃ）動き判定情報がジョイント移動量条件である場合
（Ｄ）動き判定情報がジョイント間角度条件である場合
（Ｅ）動き判定情報が関節角度である場合
（Ｆ）動き判定情報がジョイント座標である場合
（Ｇ）動き判定情報がジョイント移動量である場合
（Ｈ）動き判定情報がジョイント間角度である場合
In addition, the score calculation unit 107 calculates a score according to, for example, one of the following cases.
(A) When the motion determination information is a joint angle condition
(B) When the motion determination information is a joint coordinate condition
(C) When the motion determination information is a joint movement amount condition
(D) When the motion determination information is an inter-joint angle condition
(E) When the motion determination information is a joint angle
(F) When the motion determination information is joint coordinates
(G) When the motion determination information is a joint movement amount
(H) When the motion determination information is an inter-joint angle

（Ａ）の場合：この場合、スコア算出部１０７は、通常、１以上の各関節に対応する関節角度条件を用いて、１以上の各関節の角度を判定し、当該判定の結果を用いてスコアを算出する。当該（Ａ）の場合のスコアを算出する具体的な手順は、例えば、以下のとおりである。
（１）スケルトン情報取得部１０６が取得したスケルトン情報から、当該スケルトン情報が有する１以上の関節角度を取得する。このとき、関節角度に対応付いている関節識別情報と共に関節角度を取得する。
（２）（１）で取得した１以上の各関節角度に対応付いている関節識別情報と同一の関節識別情報を有する関節角度条件を、動き判定情報から取得する。
（３）（１）で取得した１以上の各関節角度が、当該関節角度に対応する関節角度条件（（２）で取得した関節角度条件）を満たすか否かを判断する。そして、当該判断の結果を示すパラメータを算出する。
（４）（３）で算出したパラメータと、予め決められた算出式（以下、適宜、スコア算出式とする）とを用いて、スコアを算出する。
Case (A): In this case, the score calculation unit 107 usually uses the joint angle condition corresponding to each of one or more joints to judge the angle of each of those joints, and calculates the score using the result of the judgment. A specific procedure for calculating the score in case (A) is, for example, as follows.
(1) From the skeleton information acquired by the skeleton information acquisition unit 106, acquire the one or more joint angles that the skeleton information has. Each joint angle is acquired together with the joint identification information associated with it.
(2) From the motion determination information, acquire the joint angle condition that has the same joint identification information as that associated with each of the one or more joint angles acquired in (1).
(3) Determine whether each of the one or more joint angles acquired in (1) satisfies the corresponding joint angle condition acquired in (2). Then calculate a parameter indicating the result of the determination.
(4) Calculate the score using the parameter calculated in (3) and a predetermined calculation formula (hereinafter referred to as the score calculation formula, as appropriate).

なお、上記（１）において、スケルトン情報が１以上の関節角度を有していない場合、スコア算出部１０７は、通常、スケルトン情報が有する１以上の座標を用いて、関節角度を算出する。なお、３個の座標により形成される角の角度を算出する方法や手順などは、公知であるので、詳細な説明を省略する。また、スコア算出部１０７は、関節角度の算出に用いた３個の座標のうち、関節に対応する座標に対応付いているジョイント識別情報を、関節識別情報として算出した関節角度に対応付ける。 In (1) above, when the skeleton information does not have any joint angles, the score calculation unit 107 usually calculates the joint angles using the one or more coordinates that the skeleton information has. Methods and procedures for calculating the angle formed by three coordinates are well known, so a detailed description is omitted. Of the three coordinates used to calculate a joint angle, the score calculation unit 107 takes the joint identification information associated with the coordinate corresponding to the joint and associates it, as the joint identification information of the angle, with the calculated joint angle.
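For reference, the well-known calculation of the angle formed by three coordinates can be written in Python as below. This is a sketch of the standard dot-product method, not a statement of how the score calculation unit 107 is implemented; the function name `joint_angle` is hypothetical, and the multi-argument form of `math.hypot` requires Python 3.8 or later.

```python
import math


def joint_angle(a, joint, b):
    """Angle in degrees at `joint` formed by the points a and b.

    Works for two-dimensional and three-dimensional coordinates alike.
    """
    v1 = [p - q for p, q in zip(a, joint)]
    v2 = [p - q for p, q in zip(b, joint)]
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("a and b must differ from the joint position")
    cos = sum(p * q for p, q in zip(v1, v2)) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


# Shoulder, elbow, and wrist forming a right angle at the elbow (made-up values).
elbow_angle = joint_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```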

また、上記（３）において、「パラメータ」とは、例えば、関節角度条件を満たす関節角度の数や、関節角度条件を満たさない関節角度の数、関節角度の数に対する当該関節角度条件を満たす関節角度の数の割合、関節角度の数に対する当該関節角度条件を満たさない関節角度の数の割合などである。 In (3) above, the "parameter" is, for example, the number of joint angles that satisfy the joint angle condition, the number of joint angles that do not satisfy the joint angle condition, the ratio of the number of joint angles that satisfy the joint angle condition to the total number of joint angles, or the ratio of the number of joint angles that do not satisfy the joint angle condition to the total number of joint angles.

また、上記（４）において、「スコア算出式」は、通常、上記パラメータを代入するための変数を有する。そして、スコア算出部１０７は、当該スコア算出式が有する変数に算出したパラメータを代入し、当該算出式を計算することにより、スコアを算出する。また、当該「スコア算出式」は、いわゆる関数（プログラム）であってもよい。また、スコア算出部１０７は、通常、スコア算出式を予め保持している。また、「スコア算出式」は、以下においても、上記と同様の意味を持つものとする。 In (4) above, the “score calculation formula” usually has variables for substituting the parameters. And the score calculation part 107 calculates a score by substituting the calculated parameter to the variable which the said score calculation formula has, and calculating the said calculation formula. The “score calculation formula” may be a so-called function (program). The score calculation unit 107 normally holds a score calculation formula in advance. Also, the “score calculation formula” has the same meaning as described above.
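Steps (1) to (4) for case (A) can be illustrated with the following Python sketch. Everything concrete in it is an assumption made for the example: each joint angle condition is modelled as an allowed range in degrees, the parameter is the ratio of joint angles that satisfy their condition, and the default score calculation formula simply scales that ratio to 100 points. The text does not define the actual form of the conditions or of the formula.

```python
def score_joint_angles(angles, conditions,
                       score_formula=lambda ratio: round(100 * ratio)):
    """Judge each joint angle against its condition and compute a score.

    angles:     joint identifier -> measured angle in degrees
    conditions: joint identifier -> (lowest allowed, highest allowed)
    """
    # Steps (2)-(3): look up the condition with the same identifier and judge.
    judged = [conditions[j][0] <= angle <= conditions[j][1]
              for j, angle in angles.items() if j in conditions]
    if not judged:
        return score_formula(0.0)
    # Parameter: ratio of joint angles that satisfy their condition.
    ratio = sum(judged) / len(judged)
    # Step (4): substitute the parameter into the score calculation formula.
    return score_formula(ratio)


# Made-up sample values for illustration.
angles = {"right_elbow": 90.0, "left_knee": 170.0}
conditions = {"right_elbow": (80.0, 100.0), "left_knee": (90.0, 120.0)}
score = score_joint_angles(angles, conditions)
```

Passing the formula as a function mirrors the remark that the score calculation formula may itself be a function (program).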

（Ｂ）の場合：この場合、スコア算出部１０７は、通常、１以上の各ジョイントに対応するジョイント座標条件を用いて、１以上の各ジョイントの座標を判定し、当該判定の結果を用いてスコアを算出する。当該（Ｂ）の場合のスコアを算出する具体的な手順は、例えば、以下のとおりである。
（１）スケルトン情報取得部１０６が取得したスケルトン情報から、当該スケルトン情報が有する１以上のジョイント座標を取得する。このとき、ジョイント座標に対応付いているジョイント識別情報と共にジョイント座標を取得する。
（２）（１）で取得した１以上の各ジョイント座標に対応付いているジョイント識別情報と同一のジョイント識別情報を有するジョイント座標条件を、動き判定情報から取得する。
（３）（１）で取得した１以上の各ジョイント座標が、当該ジョイント座標に対応するジョイント座標条件（（２）で取得したジョイント座標条件）を満たすか否かを判断する。そして、当該判断の結果を示すパラメータを算出する。
（４）（３）で算出したパラメータと、スコア算出式とを用いて、スコアを算出する。
Case (B): In this case, the score calculation unit 107 usually uses the joint coordinate condition corresponding to each of one or more joints to judge the coordinates of each of those joints, and calculates the score using the result of the judgment. A specific procedure for calculating the score in case (B) is, for example, as follows.
(1) From the skeleton information acquired by the skeleton information acquisition unit 106, acquire the one or more joint coordinates that the skeleton information has. Each joint coordinate is acquired together with the joint identification information associated with it.
(2) From the motion determination information, acquire the joint coordinate condition that has the same joint identification information as that associated with each of the one or more joint coordinates acquired in (1).
(3) Determine whether each of the one or more joint coordinates acquired in (1) satisfies the corresponding joint coordinate condition acquired in (2). Then calculate a parameter indicating the result of the determination.
(4) Calculate the score using the parameter calculated in (3) and the score calculation formula.

なお、上記（３）において、「パラメータ」とは、例えば、ジョイント座標条件を満たすジョイント座標の数や、ジョイント座標条件を満たさないジョイント座標の数、ジョイント座標の数に対する当該ジョイント座標条件を満たすジョイント座標の数の割合、ジョイント座標の数に対する当該ジョイント座標条件を満たさないジョイント座標の数の割合などである。 In (3) above, the "parameter" is, for example, the number of joint coordinates that satisfy the joint coordinate condition, the number of joint coordinates that do not satisfy the joint coordinate condition, the ratio of the number of joint coordinates that satisfy the joint coordinate condition to the total number of joint coordinates, or the ratio of the number of joint coordinates that do not satisfy the joint coordinate condition to the total number of joint coordinates.
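The judgment in case (B) could be sketched in Python as follows. The form of a joint coordinate condition is not defined here, so the sketch assumes one possibility: a target coordinate together with a tolerance, satisfied when the joint lies within that Euclidean distance of the target. The name `count_satisfied` and the sample values are hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math


def count_satisfied(joints, conditions):
    """Return (number of joints satisfying their condition, number judged).

    joints:     joint identifier -> coordinate
    conditions: joint identifier -> (target coordinate, tolerance)
    """
    satisfied = judged = 0
    for joint_id, coord in joints.items():
        if joint_id not in conditions:
            continue  # no condition with the same joint identification information
        target, tolerance = conditions[joint_id]
        judged += 1
        if math.dist(coord, target) <= tolerance:
            satisfied += 1
    return satisfied, judged


# Made-up sample values for illustration.
joints = {"right_hand": (1.0, 2.0, 0.0), "left_hand": (0.0, 0.0, 0.0)}
conditions = {
    "right_hand": ((1.0, 2.0, 0.1), 0.2),  # close to the target: satisfied
    "left_hand": ((1.0, 0.0, 0.0), 0.5),   # too far from the target: not satisfied
}
satisfied, judged = count_satisfied(joints, conditions)
```

Either count, or the ratio `satisfied / judged`, can then serve as the parameter substituted into the score calculation formula.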

（Ｃ）の場合：この場合、スコア算出部１０７は、通常、１以上の各ジョイントに対応するジョイント移動量条件を用いて、１以上の各ジョイントの移動量を判定し、当該判定の結果を用いてスコアを算出する。当該（Ｃ）の場合のスコアを算出する具体的な手順は、例えば、以下のとおりである。
（１）スケルトン情報取得部１０６が取得したスケルトン情報が有するジョイント座標を用いて、１以上の各ジョイント移動量を算出する。このとき、算出に用いたジョイント座標に対応付いているジョイント識別情報を、算出したジョイント移動量に対応付ける。
（２）（１）で算出した１以上の各ジョイント移動量に対応付いているジョイント識別情報と同一のジョイント識別情報を有するジョイント移動量条件を、動き判定情報から取得する。
（３）（１）で取得した１以上の各ジョイント移動量が、当該ジョイント移動量に対応するジョイント移動量条件（（２）で取得したジョイント移動量条件）を満たすか否かを判断する。そして、当該判断の結果を示すパラメータを算出する。
（４）（３）で算出したパラメータと、スコア算出式とを用いて、スコアを算出する。
Case (C): In this case, the score calculation unit 107 usually uses the joint movement amount condition corresponding to each of one or more joints to judge the movement amount of each of those joints, and calculates the score using the result of the judgment. A specific procedure for calculating the score in case (C) is, for example, as follows.
(1) Calculate one or more joint movement amounts using the joint coordinates of the skeleton information acquired by the skeleton information acquisition unit 106. The joint identification information associated with the joint coordinates used in the calculation is associated with each calculated joint movement amount.
(2) From the motion determination information, acquire the joint movement amount condition that has the same joint identification information as that associated with each of the one or more joint movement amounts calculated in (1).
(3) Determine whether each of the one or more joint movement amounts obtained in (1) satisfies the corresponding joint movement amount condition acquired in (2). Then calculate a parameter indicating the result of the determination.
(4) Calculate the score using the parameter calculated in (3) and the score calculation formula.

In (1) above, methods and procedures for calculating a joint movement amount, for example by using two or more still images that make up a moving image, are publicly known. A detailed description of them is therefore omitted.

In (3) above, the "parameter" is, for example, the number of joint movement amounts that satisfy the joint movement amount condition, the number of joint movement amounts that do not satisfy the joint movement amount condition, the ratio of the number of joint movement amounts that satisfy the condition to the total number of joint movement amounts, or the ratio of the number of joint movement amounts that do not satisfy the condition to the total number of joint movement amounts.
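As an illustration of step (1) of case (C), the following sketch computes a joint movement amount as the Euclidean distance travelled by each joint between two frames, and then counts how many joints meet a minimum-movement condition. The choice of Euclidean distance, the "minimum movement" form of the condition, and all names are assumptions made for this example; the description leaves the calculation method open.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def joint_movements(prev: Dict[str, Coord], curr: Dict[str, Coord]) -> Dict[str, float]:
    """Movement amount per joint ID between two frames (Euclidean distance)."""
    return {j: math.dist(prev[j], curr[j]) for j in curr if j in prev}


def count_satisfied(movements: Dict[str, float], min_move: Dict[str, float]) -> int:
    """Number of joints whose movement amount reaches the assumed minimum."""
    return sum(1 for j, m in movements.items() if j in min_move and m >= min_move[j])
```

The returned dictionary keeps the joint identification information attached to each movement amount, which is what step (2) needs in order to look up the matching condition.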

In the case of (D): In this case, the score calculation unit 107 typically evaluates each of one or more inter-joint angles using the inter-joint angle condition corresponding to those joints, and calculates the score using the result of the evaluation. A specific procedure for calculating the score in the case of (D) is, for example, as follows.
(1) One or more inter-joint angles are calculated using the joint coordinates included in the skeleton information acquired by the skeleton information acquisition unit 106. At this time, the joint identification information associated with the joint coordinates used for the calculation is associated with the calculated inter-joint angle.
(2) An inter-joint angle condition having the same inter-joint identification information as the inter-joint identification information associated with each of the one or more inter-joint angles calculated in (1) is acquired from the motion determination information.
(3) It is determined whether each of the one or more inter-joint angles calculated in (1) satisfies the inter-joint angle condition corresponding to that inter-joint angle (the inter-joint angle condition acquired in (2)). Then, a parameter indicating the result of the determination is calculated.
(4) The score is calculated using the parameter calculated in (3) and the score calculation formula.

In (1) above, the joints between which an angle is calculated may or may not be determined in advance. If they are not, the score calculation unit 107 may, for example, take all combinations or permutations of three joint coordinates from among the joint coordinates included in the skeleton information and calculate an inter-joint angle for each combination or permutation.

In (3) above, the "parameter" is, for example, the number of inter-joint angles that satisfy the inter-joint angle condition, the number of inter-joint angles that do not satisfy the inter-joint angle condition, the ratio of the number of inter-joint angles that satisfy the condition to the total number of inter-joint angles, or the ratio of the number of inter-joint angles that do not satisfy the condition to the total number of inter-joint angles.
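One possible way to obtain inter-joint angles from three joint coordinates is sketched below. It assumes that an inter-joint angle is the angle formed at a middle (vertex) joint by the segments to two other joints, and that the tuple of three joint IDs serves as the inter-joint identification information; both are assumptions for illustration, as are the function names.

```python
import itertools
import math
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def angle_at(a: Coord, vertex: Coord, c: Coord) -> float:
    """Angle in degrees at `vertex` between the segments vertex-a and vertex-c."""
    u = [p - q for p, q in zip(a, vertex)]
    v = [p - q for p, q in zip(c, vertex)]
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0.0:
        raise ValueError("coincident joints: angle is undefined")
    cos = sum(p * q for p, q in zip(u, v)) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def all_angles(joints: Dict[str, Coord]) -> Dict[Tuple[str, str, str], float]:
    """Angles for every combination of three joints, once per choice of vertex."""
    angles = {}
    for ids in itertools.combinations(sorted(joints), 3):
        for vertex in ids:
            a, c = [i for i in ids if i != vertex]
            angles[(a, vertex, c)] = angle_at(joints[a], joints[vertex], joints[c])
    return angles
```

Enumerating every triple grows quickly with the number of joints, so a predetermined list of joint triples (for example shoulder, elbow, wrist) is likely to be preferable in practice.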

In the case of (E): In this case, the score calculation unit 107 typically compares the angle of each of one or more joints of the singer with the joint angle that corresponds to that joint and is held as motion determination information, and calculates the score using the result of the comparison. A specific procedure for calculating the score in the case of (E) is, for example, as follows.
(1) One or more joint angles included in the skeleton information acquired by the skeleton information acquisition unit 106 are acquired from that skeleton information. At this time, each joint angle is acquired together with the joint identification information associated with it. Hereinafter, an acquired joint angle is referred to as a singer joint angle as appropriate.
(2) A joint angle associated with the same joint identification information as the joint identification information associated with each of the one or more joint angles acquired in (1) is acquired from the motion determination information. Hereinafter, such a joint angle is referred to as a model joint angle as appropriate.
(3) Each of the one or more singer joint angles acquired in (1) is compared with the model joint angle corresponding to that singer joint angle (the model joint angle acquired in (2)). Then, a parameter indicating the result of the comparison is calculated.
(4) The score is calculated using the parameter calculated in (3) and the score calculation formula.

In (3) above, the "parameter" is, for example, the absolute value of the difference between the singer joint angle and the model joint angle, the ratio of that absolute value to the model joint angle, or the ratio of the singer joint angle to the model joint angle.
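A minimal sketch of case (E) follows, using the absolute difference between the singer joint angle and the model joint angle as the parameter. The score calculation formula here (a linear mapping of the mean difference onto 0 to 100, with 180 degrees treated as the worst case) is an assumption chosen only to make the example concrete.

```python
from typing import Dict


def angle_score(singer: Dict[str, float], model: Dict[str, float]) -> float:
    """Score from 0 to 100 based on the mean absolute angle difference in degrees."""
    # Steps (2) and (3): match angles by joint ID and take |singer - model|.
    diffs = [abs(singer[j] - model[j]) for j in singer if j in model]
    if not diffs:
        return 0.0
    mean_diff = sum(diffs) / len(diffs)
    # Step (4): assumed formula; a larger difference gives a lower score.
    return max(0.0, 100.0 * (1.0 - mean_diff / 180.0))
```

The same structure would apply to cases (F) to (H) by replacing the joint angle with a joint coordinate distance, a joint movement amount, or an inter-joint angle.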

In the case of (F): In this case, the score calculation unit 107 typically compares the coordinates of each of one or more joints of the singer with the joint coordinates that correspond to that joint and are held as motion determination information, and calculates the score using the result of the comparison. A specific procedure for calculating the score in the case of (F) is, for example, as follows.
(1) One or more joint coordinates included in the skeleton information acquired by the skeleton information acquisition unit 106 are acquired from that skeleton information. At this time, each joint coordinate is acquired together with the joint identification information associated with it. Hereinafter, an acquired joint coordinate is referred to as a singer joint coordinate as appropriate.
(2) A joint coordinate associated with the same joint identification information as the joint identification information associated with each of the one or more joint coordinates acquired in (1) is acquired from the motion determination information. Hereinafter, such a joint coordinate is referred to as a model joint coordinate as appropriate.
(3) Each of the one or more singer joint coordinates acquired in (1) is compared with the model joint coordinate corresponding to that singer joint coordinate (the model joint coordinate acquired in (2)). Then, a parameter indicating the result of the comparison is calculated.
(4) The score is calculated using the parameter calculated in (3) and the score calculation formula.

In (3) above, the "parameter" is, for example, the absolute value of the difference between the singer joint coordinate and the model joint coordinate, the ratio of that absolute value to the model joint coordinate, or the ratio of the singer joint coordinate to the model joint coordinate.

In the case of (G): In this case, the score calculation unit 107 typically compares the movement amount of each of one or more joints of the singer with the joint movement amount that corresponds to that joint and is held as motion determination information, and calculates the score using the result of the comparison. A specific procedure for calculating the score in the case of (G) is, for example, as follows.
(1) One or more joint movement amounts are calculated using the joint coordinates included in the skeleton information acquired by the skeleton information acquisition unit 106. At this time, the joint identification information associated with the joint coordinates used for the calculation is associated with the calculated joint movement amount. The calculated joint movement amount is referred to as a singer joint movement amount.
(2) A joint movement amount associated with the same joint identification information as the joint identification information associated with each of the one or more joint movement amounts calculated in (1) is acquired from the motion determination information. Hereinafter, such a joint movement amount is referred to as a model joint movement amount as appropriate.
(3) Each of the one or more singer joint movement amounts calculated in (1) is compared with the model joint movement amount corresponding to that singer joint movement amount (the model joint movement amount acquired in (2)). Then, a parameter indicating the result of the comparison is calculated.
(4) The score is calculated using the parameter calculated in (3) and the score calculation formula.

In (3) above, the "parameter" is, for example, the absolute value of the difference between the singer joint movement amount and the model joint movement amount, the ratio of that absolute value to the model joint movement amount, or the ratio of the singer joint movement amount to the model joint movement amount.

In the case of (H): In this case, the score calculation unit 107 typically compares each of one or more inter-joint angles of the singer with the inter-joint angle that corresponds to those joints and is held as motion determination information, and calculates the score using the result of the comparison. A specific procedure for calculating the score in the case of (H) is, for example, as follows.
(1) One or more inter-joint angles are calculated using the joint coordinates included in the skeleton information acquired by the skeleton information acquisition unit 106. At this time, the joint identification information associated with the joint coordinates used for the calculation is associated with the calculated inter-joint angle. The calculated inter-joint angle is referred to as a singer inter-joint angle.
(2) An inter-joint angle associated with the same inter-joint identification information as the inter-joint identification information associated with each of the one or more inter-joint angles calculated in (1) is acquired from the motion determination information. Hereinafter, such an inter-joint angle is referred to as a model inter-joint angle as appropriate.
(3) Each of the one or more singer inter-joint angles calculated in (1) is compared with the model inter-joint angle corresponding to that singer inter-joint angle (the model inter-joint angle acquired in (2)). Then, a parameter indicating the result of the comparison is calculated.
(4) The score is calculated using the parameter calculated in (3) and the score calculation formula.

In (3) above, the "parameter" is, for example, the absolute value of the difference between the singer inter-joint angle and the model inter-joint angle, the ratio of that absolute value to the model inter-joint angle, or the ratio of the singer inter-joint angle to the model inter-joint angle.

In cases (A) to (H) above, the "score calculation formula" is preferably a formula under which, for example, the score becomes higher the better the dance is (the more closely the singer's movement matches the model movement) and lower the worse the dance is (the less closely the singer's movement matches the model movement).

When scoring is performed by a so-called point-addition method, for example, the "score calculation formula" may be a formula for calculating the points to be added. Likewise, when scoring is performed by a so-called point-deduction method, the "score calculation formula" may be a formula for calculating the points to be deducted.
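The difference between the point-addition and point-deduction methods can be illustrated with the following sketch. The per-judgment point values, the 0 to 100 range, and the function names are hypothetical; the description only states that the formula may compute the points to add or to deduct.

```python
from typing import Iterable


def additive_score(hits: Iterable[bool], points_per_hit: float = 10.0,
                   max_score: float = 100.0) -> float:
    """Start from 0 and add points for every judgment that was satisfied."""
    gained = points_per_hit * sum(1 for ok in hits if ok)
    return min(max_score, gained)


def deduction_score(hits: Iterable[bool], penalty_per_miss: float = 10.0,
                    max_score: float = 100.0) -> float:
    """Start from the maximum and deduct points for every judgment that failed."""
    lost = penalty_per_miss * sum(1 for ok in hits if not ok)
    return max(0.0, max_score - lost)
```

Both variants satisfy the preferred property stated above: a performance that matches the model more often never receives a lower score.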

In the case of (B) above, the score calculation unit 107 may, for example, perform so-called normalization on both the coordinates of the joint coordinate condition and the coordinates included in the skeleton information. The score calculation unit 107 may also perform the normalization on only one of the two, that is, on either the coordinates of the joint coordinate condition or the coordinates included in the skeleton information. Normalization here is processing for absorbing the difference between the coordinates of the joint coordinate condition and the coordinates included in the skeleton information. In other words, it is one or more of the following three processes: processing for absorbing the difference between the size of the singer indicated by the skeleton information and the size of the model indicated by the coordinates of the joint coordinate condition; processing for absorbing the difference between the position of the singer indicated by the skeleton information and the position of the model indicated by the coordinates of the joint coordinate condition; and processing for absorbing the difference between the orientation of the singer indicated by the skeleton information and the orientation of the model indicated by the coordinates of the joint coordinate condition.

In the case of (F) above, the score calculation unit 107 may, for example, perform so-called normalization on both the model joint coordinates and the coordinates included in the skeleton information. The score calculation unit 107 may also perform the normalization on only one of the two, that is, on either the model joint coordinates or the coordinates included in the skeleton information. Normalization here is processing for absorbing the difference between the model joint coordinates and the coordinates included in the skeleton information. In other words, it is one or more of the following three processes: processing for absorbing the difference between the size of the singer indicated by the skeleton information and the size of the model indicated by the model joint coordinates; processing for absorbing the difference between the position of the singer indicated by the skeleton information and the position of the model indicated by the model joint coordinates; and processing for absorbing the difference between the orientation of the singer indicated by the skeleton information and the orientation of the model indicated by the model joint coordinates.
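The following sketch shows one way the position and size differences could be absorbed before comparing coordinates. It assumes a root joint ("hip") used as the origin and a reference segment ("hip" to "head") used as the unit length; these joint names and the function name are hypothetical. Orientation normalization, the third process mentioned above, is not covered by this sketch.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def normalize(joints: Dict[str, Coord], root: str = "hip",
              ref: Tuple[str, str] = ("hip", "head")) -> Dict[str, Coord]:
    """Translate so `root` is the origin, then scale so the `ref` segment has length 1."""
    ox, oy, oz = joints[root]
    scale = math.dist(joints[ref[0]], joints[ref[1]])
    if scale == 0.0:
        raise ValueError("reference segment has zero length")
    return {
        j: ((x - ox) / scale, (y - oy) / scale, (z - oz) / scale)
        for j, (x, y, z) in joints.items()
    }
```

Applying the same function to both the singer joint coordinates and the model joint coordinates places a tall singer standing off-centre and a small, centred model in the same coordinate frame, so that the comparison in step (3) reflects the pose rather than the body size or standing position.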

The score calculation unit 107 may also calculate the score using the skeleton information without comparing the skeleton information with the motion determination information. In this case, the score calculation unit 107 calculates a score corresponding to the movement indicated by the skeleton information using, for example, information included in the skeleton information or information calculated from the skeleton information, together with a score calculation formula. At this time, the score calculation unit 107 uses the information included in the skeleton information as a parameter to be substituted into the score calculation formula. The "information calculated from the skeleton information" is, for example, a joint angle, a joint movement amount, or an inter-joint angle.

The parameter used for calculating the score may also be regarded as, for example, a coefficient (weight) applied to the score of each joint.

In view of the above, it is sufficient that the score calculation unit 107 can calculate a score, which is the result of scoring the singer's movement, using at least the skeleton information; the method and procedure for doing so are not limited.

The score output unit 108 outputs the score calculated by the score calculation unit 107. Output is a concept that includes display on a display, projection using a projector, printing by a printer, sound output, transmission to an external device, storage on a recording medium, and delivery of a processing result to another processing device or another program. For transmission, storage, and delivery of a processing result, it is assumed that the output target is ultimately presented to the user.

The manner in which the score output unit 108 outputs the score is not limited. For example, the score output unit 108 may output the score each time the score calculation unit 107 calculates one. The score output unit 108 may also perform statistical processing on the scores calculated by the score calculation unit 107 and output the result of the statistical processing. The statistical processing is, for example, summation or calculation of an average (a simple average or a weighted average).
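The statistical processing mentioned above (total, simple average, weighted average) can be sketched as follows. The function name, the dictionary keys, and the idea of weighting each score by, for example, the length or difficulty of the corresponding song section are assumptions for illustration.

```python
from typing import Dict, Optional, Sequence


def summarize(scores: Sequence[float],
              weights: Optional[Sequence[float]] = None) -> Dict[str, float]:
    """Return the total, simple average, and weighted average of the scores."""
    if not scores:
        raise ValueError("no scores to summarize")
    total = sum(scores)
    mean = total / len(scores)
    if weights is None:
        weighted = mean  # without weights the two averages coincide
    else:
        if len(weights) != len(scores) or sum(weights) == 0:
            raise ValueError("weights must match scores and not sum to zero")
        weighted = sum(s * w for s, w in zip(scores, weights)) / sum(weights)
    return {"total": total, "mean": mean, "weighted_mean": weighted}
```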

The score output unit 108 may or may not be considered to include an output device such as a display or a speaker. The score output unit 108 can be realized by driver software for an output device, or by driver software for an output device together with the output device itself.

The model image construction unit 109 constructs a model image. Specifically, the model image construction unit 109 acquires one or more attribute values of the singer using information included in the singer image, and constructs a model image, which is an image showing a model, at least a part of which uses the one or more attribute values of the singer. The "information included in the singer image" is, for example, the portion of one or more of distance information, luminance information, and RGB information that corresponds to the singer's region. The "part" is a temporal part, a spatial part, or a part that is both temporal and spatial. A "temporal part" refers to a part of the time span when the model image is a moving image. A "spatial part" refers to a part of the one or more body parts of the singer. The "attribute value of the singer" is information about the singer and may also be called a feature of the singer. The "one or more attribute values" are, for example, one or more of the singer's movement, the singer's shape, and the singer's color. The "one or more attribute values" usually means one or more types of attribute values.

The "model image" is, for example, an image at least a part of which moves in accordance with the singer's movement, or an image that reflects the singer's shape or color. "Movement in accordance with the singer's movement" is to be interpreted broadly and includes, for example, movement linked to the singer's movement, movement identical (or nearly identical) to the singer's movement, movement that reflects the singer's movement, and movement that corresponds to the singer's movement.

The "model image" is also, for example, an image imitating the singer, an image in which the singer's face is combined with a character's body or the like, or an image showing a character. "Constructing a model image imitating the singer" means constructing a model image that imitates the singer's color, the singer's shape, and so on. More specifically, it means, for example, acquiring attribute values of the singer such as the singer's color and shape, and constructing the model image using those attribute values. The "singer's color" is, for example, the singer's hair color, skin color, or clothing color. The "color" here usually also includes luminance. The "singer's shape" is, for example, the singer's outline or the relief of the surface of the singer's body.

The "model image" is usually an image showing a three-dimensional model (hereinafter referred to as a three-dimensional model image as appropriate). A "three-dimensional model image" is an image in which the model is rendered three-dimensionally (with depth). The "model image" may also be, for example, an image showing a two-dimensional model (hereinafter referred to as a two-dimensional model image as appropriate). A "two-dimensional model image" is an image in which the model is rendered two-dimensionally (flat). The "model image" is strictly an image and not a hologram.

The model image construction unit 109 constructs a model image by, for example, any one of the following methods.
(A) Method using a distance image (part 1)
(B) Method using a distance image (part 2)
(C) Method using a distance image (part 3)
(D) Method using a captured image
(E) Method using skeleton information

Method (A): This method uses a distance image and a captured image to construct a three-dimensional model image showing a three-dimensional model that imitates the color and shape of the singer. That is, it is a method of constructing the three-dimensional model image when the photographing unit 105 has acquired both a distance image and a captured image. It is also a method that performs so-called surface modeling. A specific procedure of the method is, for example, as follows.
(1) Using the distance information included in the distance image, the coordinates (x, y) of the pixels whose distance, as indicated by the distance information, is near enough to satisfy a predetermined condition are acquired. The singer region is thereby detected.
(2) Using the distance information associated with each pixel whose coordinates were acquired in (1) and the singer coordinate value calculation formula, the coordinate value (z) corresponding to that distance information is calculated.
(3) The coordinate value (z) calculated in (2) is associated with the coordinates (x, y) of the pixel used to calculate it, and three-dimensional coordinates (x, y, z) are acquired. One or more three-dimensional coordinates are thereby acquired. In other words, the shape of the singer captured in the distance image is detected. This processing may be regarded as, for example, polygon modeling, which is processing for creating 3DCG.
(4) An image corresponding to the singer region detected in (1) (hereinafter referred to as a singer region image as appropriate) is cut out from the captured image.
(5) The singer region image cut out in (4) is applied to the one or more three-dimensional coordinates (x, y, z) acquired in (3), and a three-dimensional model image imitating the color and shape of the singer is constructed. "Applying an image to three-dimensional coordinates" means, for example, superimposing the image on the shape of the singer indicated by the one or more three-dimensional coordinates (x, y, z). This processing may be regarded as, for example, texture mapping, which is processing for creating 3DCG.

By the above processing, when the captured image is a color image, for example, a three-dimensional model image imitating the color and shape of the singer is constructed. When the captured image is a grayscale image, for example, a three-dimensional model image imitating the luminance and shape of the singer is constructed.
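Steps (1) to (5) of method (A) can be illustrated in a highly simplified form. The sketch below assumes that the distance image and the captured image are pixel-aligned nested lists of the same size, that the "predetermined condition" is a simple maximum-distance threshold, and that the singer coordinate value calculation formula is the identity (z equals the measured distance); none of this is specified in the description. It produces a set of colored 3D points rather than a polygon mesh with a mapped texture, so it stands in for polygon modeling and texture mapping only conceptually.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]
Point = Tuple[int, int, float, Color]


def singer_points(depth: List[List[int]], color: List[List[Color]],
                  max_depth_mm: int = 1500) -> List[Point]:
    """Return (x, y, z, color) for every pixel that is close enough to the camera."""
    points = []
    for y, row in enumerate(depth):
        for x, d in enumerate(row):
            # Step (1): a pixel belongs to the singer region if it is near enough.
            if 0 < d <= max_depth_mm:
                z = float(d)  # Step (2): assumed formula, z = measured distance.
                # Steps (3) to (5): attach the captured-image color to the 3D point.
                points.append((x, y, z, color[y][x]))
    return points
```

A depth value of 0 is treated as "no measurement" and excluded, which is a common convention for depth sensors but is also an assumption here.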

Method (B): This method uses a distance image and a captured image to construct a two-dimensional model image showing a two-dimensional model that imitates the color and shape of the singer. That is, it is a method of constructing the two-dimensional model image when the photographing unit 105 has acquired both a distance image and a captured image. A specific procedure of the method is, for example, as follows.
(1) Using the distance information included in the distance image, the coordinates (x, y) of the pixels whose distance, as indicated by the distance information, is near enough to satisfy a predetermined condition are acquired. The singer region is thereby detected.
(2) A singer region image corresponding to the singer region detected in (1) is cut out from the captured image. The singer region image obtained by this cutting out is the two-dimensional model image.

By the above processing, when the photographed image is a color image, for example, a two-dimensional model image imitating the color and shape of the singer is constructed. When the photographed image is a grayscale image, for example, a two-dimensional model image imitating the brightness and shape of the singer is constructed.
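The two steps of method (B) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the “predetermined condition” is taken to be a simple distance threshold, the distance image and the photographed image are assumed to have the same size and to be pixel-aligned, and the function names `detect_singer_region` and `cut_out_singer` are hypothetical and do not appear in the description.

```python
def detect_singer_region(distance_image, max_distance):
    """Step (1): collect the (x, y) coordinates of pixels that are near enough.

    distance_image is a list of rows; each value is the distance for that pixel.
    The threshold test is an assumed form of the "predetermined condition".
    """
    return {
        (x, y)
        for y, row in enumerate(distance_image)
        for x, distance in enumerate(row)
        if distance <= max_distance
    }


def cut_out_singer(photographed_image, singer_region, background=None):
    """Step (2): keep only the pixels inside the singer area.

    Pixels outside the singer area are replaced by `background`
    (an assumption; the description only says the area is cut out).
    """
    return [
        [pixel if (x, y) in singer_region else background
         for x, pixel in enumerate(row)]
        for y, row in enumerate(photographed_image)
    ]


if __name__ == "__main__":
    distance_image = [[900, 120], [110, 950]]   # hypothetical distances
    photographed_image = [["a", "b"], ["c", "d"]]  # stand-ins for pixel colors
    region = detect_singer_region(distance_image, max_distance=500)
    print(cut_out_singer(photographed_image, region))
```

The same code works for a grayscale photographed image, since the pixel values are copied without being interpreted.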

Method (C): This method constructs a three-dimensional or two-dimensional model image showing a three-dimensional or two-dimensional model that imitates the shape of the singer, using a distance image. That is, it constructs the three-dimensional model or the two-dimensional model when the photographing unit 105 has acquired a distance image. The specific procedure for constructing the three-dimensional model image is, for example, as follows.
(1) Using the distance information contained in the distance image, acquire the coordinates (x, y) of the pixels whose distance, as indicated by the distance information, is near enough to satisfy a predetermined condition. The singer area is thereby detected.
(2) Using the distance information associated with each pixel whose coordinates were acquired in (1) and the singer coordinate value calculation formula, calculate the coordinate value (z) corresponding to that distance information.
(3) Associate the coordinate value (z) calculated in (2) with the coordinates (x, y) of the pixel used in that calculation to obtain a three-dimensional coordinate (x, y, z). One or more three-dimensional coordinates are thereby acquired. In other words, the shape of the singer captured in the distance image is detected. This process may be regarded as, for example, polygon modeling.
(4) Apply an image showing a pattern held in advance to the one or more three-dimensional coordinates (x, y, z) acquired in (3) to construct a three-dimensional model image imitating the shape of the singer. The “pattern” is, for example, clay, water, or foam. The “image showing a pattern” may be regarded as, for example, a texture used for texture mapping, and this process may be regarded as, for example, texture mapping.
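Steps (1) to (3) of the three-dimensional procedure can be sketched as follows. This is a rough sketch, not the described implementation: this passage does not give the singer coordinate value calculation formula, so it is passed in as the hypothetical parameter `z_from_distance` (defaulting to using the distance itself as z), and the “predetermined condition” is again assumed to be a distance threshold.

```python
def singer_points_3d(distance_image, max_distance, z_from_distance=float):
    """Return the 3D coordinates (x, y, z) of the pixels in the singer area.

    distance_image: list of rows of distance values.
    max_distance: assumed threshold standing in for the predetermined condition.
    z_from_distance: stand-in for the singer coordinate value calculation
        formula, which this passage does not specify.
    """
    points = []
    for y, row in enumerate(distance_image):
        for x, distance in enumerate(row):
            if distance <= max_distance:                        # step (1)
                z = z_from_distance(distance)                   # step (2)
                points.append((x, y, z))                        # step (3)
    return points


if __name__ == "__main__":
    distance_image = [[2000, 800], [750, 2100]]  # hypothetical distances in mm
    # Convert millimetres to metres as one possible calculation formula.
    print(singer_points_3d(distance_image, 1000, lambda d: d / 1000))
```

Step (4), applying a pattern image to these coordinates, would normally be left to a rendering library and is not sketched here.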

In method (C), the specific procedure for constructing the two-dimensional model image is, for example, as follows.
(1') Using the distance information contained in the distance image, acquire the coordinates (x, y) of the pixels whose distance, as indicated by the distance information, is near enough to satisfy a predetermined condition. The singer area is thereby detected.
(2') Apply an image showing a pattern held in advance to the one or more two-dimensional coordinates acquired in (1') to construct a two-dimensional model image imitating the shape of the singer. “Applying an image to two-dimensional coordinates” means, for example, superimposing the image on the shape of the singer indicated by the one or more two-dimensional coordinates (x, y). This process may be regarded as, for example, texture mapping.

In methods (A) to (C) above, the model image construction unit 109 may detect the singer area by, for example, applying image processing such as binarization and labeling to the distance image.

Method (D): This method constructs a two-dimensional model image showing a two-dimensional model that imitates the color and shape of the singer, using a photographed image. That is, it constructs the two-dimensional model when the photographing unit 105 has acquired a photographed image. The specific procedure is, for example, as follows.
(1) Apply image processing such as binarization and labeling to the photographed image to detect the singer area.
(2) Cut out, from the photographed image, the singer area image corresponding to the singer area detected in (1). The singer area image obtained by this cutout is the two-dimensional model image.

Method (E): This method constructs a model image showing a character, using character information and skeleton information. In other words, it constructs a model image showing a character by so-called skeletal animation.

“Character information” is information indicating a character. It is normally a set of one or more coordinates indicating the shape of the character together with information indicating the color of each part of the character. The coordinates are normally three-dimensional (x, y, z), but may be, for example, two-dimensional (x, y). The character information may also be, for example, an image showing a character. The “image showing a character” is normally an image in which the character is expressed three-dimensionally, but may be, for example, an image in which the character is expressed two-dimensionally. The “image showing a character” is normally a still image. The character information is normally stored in a predetermined storage area.

As described above, the character information may be any information that can indicate the shape, color, and the like of a character, regardless of its data structure. Hereinafter, character information indicating a three-dimensional character is referred to as three-dimensional character information where appropriate, and character information indicating a two-dimensional character is referred to as two-dimensional character information where appropriate.

Method (E) can be further classified into, for example, the following two methods.
(E1) A method using three-dimensional skeleton information
(E2) A method using two-dimensional skeleton information

Method (E1): This method constructs a three-dimensional model image of the character indicated by the character information, using three-dimensional skeleton information. The specific procedure is, for example, as follows.
(1) Acquire three-dimensional character information from a predetermined storage area.
(2) Apply the three-dimensional skeleton information to the three-dimensional character information acquired in (1). “Applying the three-dimensional skeleton information” means associating the joints of the character indicated by the three-dimensional character information with the coordinates contained in the three-dimensional skeleton information, and moving the character according to the movement indicated by the three-dimensional skeleton information.
(3) Using the three-dimensional character information to which the three-dimensional skeleton information was applied in (2), perform processing such as rendering and texture mapping to construct a three-dimensional model image.

Method (E2): This method constructs a two-dimensional model image of the character indicated by the character information, using two-dimensional skeleton information. The specific procedure is, for example, as follows.
(1) Acquire two-dimensional character information from a predetermined storage area.
(2) Apply the two-dimensional skeleton information to the two-dimensional character information acquired in (1). “Applying the two-dimensional skeleton information” means associating the joints of the character indicated by the two-dimensional character information with the coordinates contained in the two-dimensional skeleton information, and moving the character according to the movement indicated by the two-dimensional skeleton information.
(3) Convert the two-dimensional character information to which the two-dimensional skeleton information was applied in (2) into an image to construct a two-dimensional model image.
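Step (2) of method (E2) can be sketched as follows. This is a deliberately simplified sketch under assumptions that are not in the description: the two-dimensional character information is assumed to store, for each joint name, the shape points of the attached body part as offsets from that joint, and “moving the character” is reduced to translating each part to the joint position given by the skeleton information (no rotation or scaling). The name `apply_skeleton_2d` and the data layout are hypothetical.

```python
def apply_skeleton_2d(character_parts, skeleton_joints):
    """Pose a 2D character by matching its joints to skeleton coordinates.

    character_parts: dict mapping a joint name to a list of (dx, dy) offsets
        describing the part attached to that joint (assumed layout).
    skeleton_joints: dict mapping a joint name to its (x, y) coordinate taken
        from the two-dimensional skeleton information.
    Returns a dict mapping each matched joint name to absolute (x, y) points.
    """
    posed = {}
    for joint_name, offsets in character_parts.items():
        if joint_name not in skeleton_joints:
            continue  # no matching joint was detected for this part
        joint_x, joint_y = skeleton_joints[joint_name]
        posed[joint_name] = [(joint_x + dx, joint_y + dy) for dx, dy in offsets]
    return posed


if __name__ == "__main__":
    character = {"head": [(0, -1), (1, 0)], "right_hand": [(0, 0)]}
    skeleton = {"head": (10, 5)}  # hypothetical detected joint position
    print(apply_skeleton_2d(character, skeleton))
```

Calling the function once per acquired skeleton frame would make the character follow the singer; step (3) would then draw the returned points as an image.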

The model image construction unit 109 may also, for example, apply a character image to the one or more two-dimensional coordinates (x, y) indicating the singer area, or to the one or more three-dimensional coordinates (x, y, z) indicating the shape of the singer captured in the distance image, to construct a two-dimensional or three-dimensional model image showing a character that imitates the shape of the singer.

The methods and procedures for constructing a three-dimensional model image from one or more three-dimensional coordinates, for constructing a two-dimensional model image from one or more two-dimensional coordinates, and for constructing a three-dimensional or two-dimensional model image from skeleton information and character information are well known, so a detailed description is omitted. For example, the following software can be used to construct an image (here, a model image) showing a three-dimensional model from one or more three-dimensional coordinates.
・Maya (http://www.autodesk.co.jp/products/autodesk-maya/overview)
・3dsMAX (http://www.autodesk.co.jp/products/autodesk-3ds-max/overview)
・Shade (http://shade.e-frontier.co.jp/)
・OpenGL (http://www.opengl.org/)
・DirectX (http://www.microsoft.com/ja-jp/directx/default.aspx)

Among the model images created by the above methods and the like, a model image that imitates the singer is hereinafter referred to as a singer model image where appropriate, and a model image that shows a character is hereinafter referred to as a character model image where appropriate.

When creating a three-dimensional model image from a distance image, the model image construction unit 109 may, for example, delete from the distance image the pixels that satisfy the pixel deletion condition. In this case, the model image construction unit 109 constructs the three-dimensional model image using the distance image after the deletion. The model image construction unit 109 may also perform, for example, so-called culling when constructing the three-dimensional model image from the distance image. These processes allow the three-dimensional model image to be constructed from the distance image at higher speed.

The image output unit 110 outputs an example image, that is, the image showing the dance to be imitated. When the model image construction unit 109 has constructed a model image, the image output unit 110 normally outputs that model image. In this case, the image output unit 110 may output, for example, both the example image and the model image. The image output unit 110 may also output, for example, a composite image, which is an image obtained by combining two or more images. The composite image is, for example, an image in which the example image is placed on a background image. A “background image” is an image of a background, for example an image showing a stage, a landscape, or a geometric pattern. In other words, any image that can serve as a background qualifies, regardless of its content. The composite image may also be, for example, an image in which the example image and the model image are placed on the background image. In this case, the example image and the model image are normally placed at positions on the background image where they do not overlap. The composite image is normally constructed by a composite image construction unit (not shown).

When outputting an example image (the image showing the dance to be imitated), the image output unit 110 normally outputs it so that the tempo (rhythm) of the example dance shown in the example image is synchronized with (matches) the tempo (rhythm) of the music output when the music playback unit 111, described later, plays back the music data. “Outputting an example image” also includes starting the output of a composite image.

The image output unit 110 may or may not be considered to include an output device such as a display. The score output unit 108 can be realized by the driver software of an output device, or by the driver software of an output device together with the output device.

The music playback unit 111 plays back the music data stored in the music data storage unit 103. “Playing back music data” is to be interpreted broadly and includes, for example, transmitting the music data to a device capable of playing it back, outputting the music data as sound, and transmitting the music data to a device capable of outputting it as sound. Hereinafter, “playback of music data” is also referred to as “playback of music” where appropriate. “Playback of music data” also includes starting the playback of music data.

The music data played back by the music playback unit 111 is normally the music data selected by the music selection instruction received by the reception unit 104. In this case, for example, identification information of the music data is presented to the user, and the user selects the music data he or she prefers from among one or more pieces of music data.

The music playback unit 111 may or may not be considered to include an output device such as a speaker. The music playback unit 111 can be realized by the driver software of an output device, or by the driver software of an output device together with the output device.

The motion determination information storage unit 101, the example image storage unit 102, and the music data storage unit 103 are preferably non-volatile recording media, but can also be realized by volatile recording media. The process by which the predetermined information comes to be stored in the motion determination information storage unit 101 and the like does not matter. For example, the predetermined information may be stored in the motion determination information storage unit 101 and the like via a recording medium, a communication line, an input device, or the like.

The skeleton information acquisition unit 106, the score calculation unit 107, and the model image construction unit 109 can normally be realized by an MPU, memory, and the like. The processing procedures of the skeleton information acquisition unit 106 and the like are normally realized by software, and the software is recorded on a recording medium such as a ROM. The skeleton information acquisition unit 106 and the like may instead be realized by hardware (dedicated circuits).

Next, the overall operation of the karaoke apparatus 1 will be described using flowcharts. The i-th item of a given kind of information is written as “information [i]”. FIG. 2 is a flowchart showing the overall operation of the karaoke apparatus 1.

(Step S201) The karaoke apparatus 1 determines whether the reception unit 104 has received a power-on instruction. If it has, the process proceeds to step S202; otherwise, the process returns to step S201.

(Step S202) The karaoke apparatus 1 performs power-on processing.

(Step S203) A receiving unit (not shown) determines whether music data has been received. If it has, the process proceeds to step S204; otherwise, the process proceeds to step S205.

(Step S204) An accumulation unit (not shown) stores the music data received in step S203 in the music data storage unit 103.

(Step S205) The music playback unit 111 determines whether the reception unit 104 has received a music selection instruction. If it has, the process proceeds to step S206; otherwise, the process proceeds to step S207.

(Step S206) The music playback unit 111 registers the music data identification information of the music data selected by the music selection instruction received in step S205 in the playback list, which is a list that stores music data identification information.

(Step S207) The music playback unit 111 determines whether it is time to play back music data. The timing is, for example, after playback of music data has finished or after the reception unit 104 has received a music selection instruction. If it is time to play back music data, the process proceeds to step S208; otherwise, the process proceeds to step S219.

(Step S208) The music playback unit 111 determines whether music data identification information is registered in the playback list. If it is, the process proceeds to step S209; otherwise, the process proceeds to step S219.

(Step S209) The music playback unit 111 acquires, from the playback list, the music data identification information of the music data to be played back next.

(Step S210) The music playback unit 111 acquires, from the music data storage unit 103, the music data identified by the music data identification information acquired in step S209.

(Step S211) The music playback unit 111 starts playback of the music data acquired in step S210.

(Step S212) The image output unit 110 determines whether to output an example image (the image showing the dance to be imitated). The determination is made using, for example, information that indicates whether to output an example image and that is stored in a predetermined storage area. If an example image is to be output, the process proceeds to step S213; otherwise, the process proceeds to step S214.

(Step S213) The image output unit 110 acquires an example image from the example image storage unit 102 and starts outputting the acquired example image.

(Step S214) The photographing unit 105 starts photographing the singer. In other words, the photographing unit 105 starts acquiring distance images.

(Step S215) The skeleton information acquisition unit 106 determines whether the photographing unit 105 has acquired a distance image. If it has, the process proceeds to step S216; otherwise, the process proceeds to step S219.

(Step S216) The skeleton information acquisition unit 106 acquires skeleton information using the distance image acquired in step S215. The details of this process are described using the flowchart of FIG. 3.

(Step S217) The score calculation unit 107 calculates a score using the skeleton information acquired in step S216, and the score output unit 108 outputs the calculated score. The details of this process are described using the flowchart of FIG. 4.

(Step S218) The model image construction unit 109 constructs a model image, and the image output unit 110 outputs the constructed model image. The details of this process are described using the flowchart of FIG. 5.

(Step S219) The karaoke apparatus 1 determines whether the reception unit 104 has received a power-off instruction. If it has, the process proceeds to step S220; otherwise, the process returns to step S203.

(Step S220) The karaoke apparatus 1 performs power-off processing. The process then returns to step S201.

In the flowchart of FIG. 2, the process may be terminated by a power-off or by a process termination interrupt.

Also in the flowchart of FIG. 2, photographing by the photographing unit 105 ends, for example, when playback of the music data by the music playback unit 111 ends. Playback of the music data ends, for example, when the reception unit 104 receives an instruction to end playback of the music data or when playback of the music data is completed.
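The handling of the playback list in steps S206 and S208 to S210 can be sketched as follows. This is an illustrative sketch only: the description does not state the order in which registered entries are played or whether an entry is removed once acquired, so first-in, first-out order with removal is assumed here, and the class `PlaybackList`, the function `next_music_data`, and the dictionary standing in for the music data storage unit 103 are hypothetical.

```python
from collections import deque


class PlaybackList:
    """List of music data identification information (assumed to be FIFO)."""

    def __init__(self):
        self._ids = deque()

    def register(self, music_id):
        """Step S206: register the identifier of the selected music data."""
        self._ids.append(music_id)

    def has_entries(self):
        """Step S208: is any identifier registered?"""
        return bool(self._ids)

    def next_id(self):
        """Step S209: acquire (and, by assumption, remove) the next identifier."""
        return self._ids.popleft()


def next_music_data(playback_list, music_data_storage):
    """Steps S208 to S210: return the next music data, or None if nothing is queued."""
    if not playback_list.has_entries():
        return None
    return music_data_storage[playback_list.next_id()]


if __name__ == "__main__":
    storage = {"song-1": b"data A", "song-2": b"data B"}  # hypothetical contents
    playlist = PlaybackList()
    playlist.register("song-2")
    playlist.register("song-1")
    print(next_music_data(playlist, storage))
```

Returning None corresponds to the branch in which the process skips playback and proceeds to step S219.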

FIG. 3 is a flowchart showing the skeleton information acquisition process of step S216 in the flowchart of FIG. 2.

(Step S301) The skeleton information acquisition unit 106 detects the singer area using the distance information contained in the distance image.

(Step S302) The skeleton information acquisition unit 106 performs pattern recognition on the outline of the singer area detected in step S301 and detects the joints of the singer in the distance image.

(Step S303) Using the distance information associated with each pixel indicating a joint detected in step S302 and the joint coordinate value calculation formula, the skeleton information acquisition unit 106 calculates the coordinate value corresponding to that distance information.

(Step S304) The skeleton information acquisition unit 106 associates the two-dimensional coordinates of each pixel indicating a joint detected in step S302 with the coordinate value calculated in step S303 to form three-dimensional coordinates. The skeleton information acquisition unit 106 then associates these three-dimensional coordinates with one another for each pair of connected joints to form the skeleton information. The process then returns to the upper-level process.
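Steps S303 and S304 can be sketched as follows, taking the joint pixels detected in step S302 as given. This is a sketch under assumptions: the joint coordinate value calculation formula is not given in this passage and is passed in as the hypothetical parameter `z_from_distance`, and the names `build_skeleton_info`, `joint_pixels`, and `connections` as well as the returned dictionary layout are illustrative choices, not the described data structure.

```python
def build_skeleton_info(joint_pixels, distance_image, connections,
                        z_from_distance=float):
    """Form 3D joint coordinates and link connected joints.

    joint_pixels: dict mapping a joint name to its pixel coordinate (x, y),
        as detected by pattern recognition in step S302.
    distance_image: list of rows of distance values.
    connections: list of (joint_a, joint_b) pairs that are physically linked.
    z_from_distance: stand-in for the joint coordinate value calculation formula.
    """
    joints = {}
    for name, (x, y) in joint_pixels.items():
        z = z_from_distance(distance_image[y][x])   # step S303
        joints[name] = (x, y, z)                    # step S304, first half
    # Step S304, second half: keep only links whose joints were both detected.
    bones = [(a, b) for a, b in connections if a in joints and b in joints]
    return {"joints": joints, "bones": bones}


if __name__ == "__main__":
    distance_image = [[2000, 800], [750, 2100]]            # hypothetical values
    joint_pixels = {"shoulder": (1, 0), "elbow": (0, 1)}
    connections = [("shoulder", "elbow"), ("elbow", "wrist")]
    print(build_skeleton_info(joint_pixels, distance_image, connections))
```

Keeping the links separate from the coordinates makes it straightforward to derive quantities such as joint angles from pairs of connected joints later.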

FIG. 4 is a flowchart showing the score calculation and output process of step S217 in the flowchart of FIG. 2.

(Step S401) The score calculation unit 107 acquires determination target information using the skeleton information acquired by the skeleton information acquisition unit 106. The determination target information is, for example, a singer joint angle, singer joint coordinates, or a singer joint movement amount. Here, it is assumed that the score calculation unit 107 has acquired m pieces of determination target information.

(Step S402) The score calculation unit 107 determines whether the motion determination information stored in the motion determination information storage unit 101 is a motion determination condition or a motion determination value. A motion determination condition is, for example, a joint angle condition, a joint coordinate condition, or a joint movement amount condition. A motion determination value is, for example, an example joint angle, example joint coordinates, or an example joint movement amount, that is, a value taken from the dance to be imitated. If the information is a motion determination condition, the process proceeds to step S403; otherwise, the process proceeds to step S414.

(Step S403) The score calculation unit 107 sets the variable true to 0.

(Step S404) The score calculation unit 107 sets the variable false to 0.

(Step S405) The score calculation unit 107 sets the counter i to 1.

(Step S406) The score calculation unit 107 acquires, from the motion determination information storage unit 101, the motion determination condition associated with the same identification information as that associated with determination target information [i].

(Step S407) The score calculation unit 107 determines whether determination target information [i] satisfies the motion determination condition acquired in step S406. If it does, the process proceeds to step S408; otherwise, the process proceeds to step S409.

(Step S408) The score calculation unit 107 increments true by 1.

(Step S409) The score calculation unit 107 increments false by 1.

(Step S410) The score calculation unit 107 determines whether i equals m. If it does, the process proceeds to step S412; otherwise, the process proceeds to step S411.

(Step S411) The score calculation unit 107 increments i by 1. The process then returns to step S406.

(Step S412) The score calculation unit 107 calculates a parameter of a predetermined type using true and false.

(Step S413) The score calculation unit 107 calculates a score using the parameter calculated in step S412 and a score calculation formula held in advance.

（ステップＳ４１４）スコア算出部１０７は、カウンタｉに１をセットする。 (Step S414) The score calculation unit 107 sets 1 to the counter i.

（ステップＳ４１５）スコア算出部１０７は、判定対象情報［ｉ］に対応付いている識別情報と同一の識別情報が対応付いている動き判定値を、動き判定情報格納部１０１から取得する。 (Step S415) The score calculation unit 107 acquires, from the motion determination information storage unit 101, a motion determination value associated with the same identification information as the identification information associated with the determination target information [i].

（ステップＳ４１６）スコア算出部１０７は、判定対象情報［ｉ］とステップＳ４１５で取得した動き判定値とを比較し、当該比較の結果を示すパラメータであるパラメータ［ｉ］を算出する。 (Step S416) The score calculation unit 107 compares the determination target information [i] with the motion determination value acquired in Step S415, and calculates a parameter [i] that is a parameter indicating the result of the comparison.

(Step S417) The score calculation unit 107 determines whether i equals m. If so, the process proceeds to step S419; otherwise, it proceeds to step S418.

（ステップＳ４１８）スコア算出部１０７は、ｉを１インクリメントする。そして、ステップＳ４１５に戻る。 (Step S418) The score calculation unit 107 increments i by 1. Then, the process returns to step S415.

（ステップＳ４１９）スコア算出部１０７は、ステップＳ４１６で算出したｍ個のパラメータと、予め保持しているスコア算出式とを用いて、スコアを算出する。 (Step S419) The score calculation unit 107 calculates a score using the m parameters calculated in step S416 and a score calculation formula stored in advance.

（ステップＳ４２０）スコア出力部１０８は、スコアを出力するタイミングであるか否かを判断する。当該タイミングは、例えば、楽曲再生部１１１が楽曲データを再生している間の定期的や、楽曲再生部１１１が楽曲データの再生を終了した後などである。そして、スコアを出力するタイミングである場合は、ステップＳ４２１に進み、そうでない場合は、上位処理にリターンする。 (Step S420) The score output unit 108 determines whether it is time to output a score. The timing is, for example, periodically while the music reproducing unit 111 is reproducing the music data, or after the music reproducing unit 111 finishes reproducing the music data. If it is time to output the score, the process proceeds to step S421, and if not, the process returns to the upper process.

（ステップＳ４２１）スコア出力部１０８は、ステップＳ４１３またはステップＳ４１９で算出したスコアを出力する。そして、上位処理にリターンする。 (Step S421) The score output unit 108 outputs the score calculated in step S413 or step S419. Then, the process returns to the upper process.

Note that, in the flowchart of FIG. 4, the type of the determination target information and the type of the motion determination information are the same. That is, when the determination target information is a singer joint angle, the motion determination information is a joint angle condition or a model joint angle. When the determination target information is singer joint coordinates, the motion determination information is a joint coordinate condition or model joint coordinates. When the determination target information is a singer joint movement amount, the motion determination information is a joint movement amount condition or a model joint movement amount.

FIG. 5 is a flowchart showing the model image construction and output processing in step S218 of the flowchart of FIG. 2.

(Step S501) The image output unit 110 determines whether to output a model image. This determination is made, for example, using information that indicates whether to output a model image and is stored in a predetermined storage area. If a model image is to be output, the process proceeds to step S502; otherwise, the process returns to the upper-level process.

(Step S502) The model image construction unit 109 determines whether the type of model image to be constructed is a singer model image or a character model image. This determination is made, for example, using information that indicates the type of model image to be constructed and is stored in a predetermined storage area. If a singer model image is to be constructed, the process proceeds to step S503; if a character model image is to be constructed, the process proceeds to step S506.

(Step S503) The model image construction unit 109 uses the distance information in the distance image to detect the singer area, that is, the area of the distance image in which the singer appears.

(Step S504) The model image construction unit 109 uses the distance information associated with the pixels in the singer area detected in step S503, together with the singer coordinate value calculation formula, to calculate the coordinate value corresponding to that distance information.

(Step S505) The model image construction unit 109 associates the two-dimensional coordinates of the pixels in the singer area with the coordinate values calculated in step S504 to form three-dimensional coordinates. The model image construction unit 109 then constructs a singer model image using these three-dimensional coordinates.
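Steps S503 to S505 can be illustrated with a minimal sketch. The source does not disclose the singer coordinate value calculation formula or how the singer area is detected, so `depth_to_z` (a simple scaling) and the depth threshold `max_depth` are assumptions made only for illustration, as are all names below.

```python
from typing import List, Optional, Tuple

Point3D = Tuple[int, int, float]


def depth_to_z(depth: float, scale: float = 1.0) -> float:
    """Hypothetical stand-in for the singer coordinate value calculation formula."""
    return depth * scale


def singer_points(depth_image: List[List[Optional[float]]],
                  max_depth: float) -> List[Point3D]:
    """Assumed region detection: pixels closer than max_depth belong to the singer.

    Each selected pixel's (x, y) is paired with the z value computed from its
    distance information, giving one three-dimensional coordinate per pixel.
    """
    points: List[Point3D] = []
    for y, row in enumerate(depth_image):
        for x, depth in enumerate(row):
            if depth is not None and depth < max_depth:
                points.append((x, y, depth_to_z(depth)))
    return points
```

A real device would likely segment the singer more robustly than a single threshold; the sketch only shows how (x, y) and z are combined into three-dimensional coordinates.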

（ステップＳ５０６）モデル画像構成部１０９は、スケルトン情報取得部１０６が取得したスケルトン情報を取得する。 (Step S506) The model image construction unit 109 acquires the skeleton information acquired by the skeleton information acquisition unit 106.

（ステップＳ５０７）モデル画像構成部１０９は、予め決められた記憶領域に格納されているキャラクタ情報を取得する。 (Step S507) The model image construction unit 109 acquires character information stored in a predetermined storage area.

（ステップＳ５０８）モデル画像構成部１０９は、ステップＳ５０６で取得したスケルトン情報と、ステップＳ５０７で取得したキャラクタ情報とを用いて、キャラクタモデル画像を構成する。 (Step S508) The model image construction unit 109 constructs a character model image using the skeleton information acquired in step S506 and the character information acquired in step S507.

（ステップＳ５０９）画像出力部１１０は、ステップＳ５０５で取得した歌唱者モデル画像またはステップＳ５０８で取得したキャラクタモデル画像を出力する。そして、上位処理にリターンする。 (Step S509) The image output unit 110 outputs the singer model image acquired in step S505 or the character model image acquired in step S508. Then, the process returns to the upper process.

Note that the overall operation of the karaoke apparatus 1 described above is merely an example. That is, the overall operation of the karaoke apparatus 1 is not limited to the above description.

(Specific example)
Next, a specific example of the operation of the karaoke apparatus 1 will be described. In this specific example, it is assumed that one or more pieces of music data are stored in advance in the music data storage unit 103, and that the photographing unit 105 acquires distance images.

(Example 1)
This example describes scoring the singer's movement and calculating a score by comparing the skeleton information with the motion determination information. In this example, the motion determination information storage unit 101 is assumed to store the motion determination information shown in FIG. 6. This motion determination information (item name: model joint angle) is a model joint angle. Each model joint angle is associated with an ID that uniquely identifies a record and with joint identification information (item name: joint name).

まず、ユーザが、カラオケ装置１のリモコンを操作し、自身が歌いたい楽曲を選択するための操作を行ったとする。すると、受付部１０４は、楽曲選択指示を受け付ける。 First, it is assumed that the user operates the remote controller of the karaoke apparatus 1 and performs an operation for selecting a song that the user wants to sing. Then, the reception unit 104 receives a music selection instruction.

次に、楽曲再生部１１１は、楽曲選択指示により選択された楽曲データの楽曲データ識別情報を、再生リストに登録する。ここで、再生リストには、当該登録された楽曲データ識別情報のみが登録されているとする。すると、楽曲再生部１１１は、当該楽曲データ識別情報により識別される楽曲データを、楽曲データ格納部１０３から取得する。そして、楽曲再生部１１１は、当該取得した楽曲データの再生を開始する。 Next, the music reproducing unit 111 registers the music data identification information of the music data selected by the music selection instruction in the reproduction list. Here, it is assumed that only the registered music data identification information is registered in the reproduction list. Then, the music reproducing unit 111 acquires the music data identified by the music data identification information from the music data storage unit 103. Then, the music playback unit 111 starts playback of the acquired music data.

次に、撮影部１０５は、歌唱者の撮影を開始する。つまり、撮影部１０５は、距離画像の取得を開始する。ここで、撮影部１０５が、あるタイミングで取得した距離画像が、図７に示す静止画であるものとする。当該距離画像は、カラオケ装置１（厳密には、撮影部１０５を構成する距離画像カメラ）から歌唱者までの距離を、濃淡により表現している画像である。 Next, the photographing unit 105 starts photographing the singer. That is, the imaging unit 105 starts acquiring a distance image. Here, it is assumed that the distance image acquired by the photographing unit 105 at a certain timing is the still image shown in FIG. The distance image is an image expressing the distance from the karaoke apparatus 1 (strictly, a distance image camera constituting the photographing unit 105) to the singer by shading.

Next, the skeleton information acquisition unit 106 detects the singer area from the distance image acquired by the photographing unit 105. Using the contour of the singer area, it detects one or more joints of the singer shown in the distance image. The skeleton information acquisition unit 106 thereby acquires the coordinates (x, y) of the pixels representing each joint, and associates the acquired coordinates (x, y) with one another for each pair of connected joints.

Next, the skeleton information acquisition unit 106 uses the distance information associated with the pixels indicated by the one or more acquired coordinates (x, y), together with the joint coordinate value calculation formula, to calculate the coordinate value (z) corresponding to each piece of distance information. The skeleton information acquisition unit 106 then associates, pixel by pixel, the coordinates (x, y) of each joint pixel with the calculated coordinate value (z), forming one or more three-dimensional coordinates (x, y, z). In this way, the skeleton information acquisition unit 106 acquires three-dimensional skeleton information composed of one or more three-dimensional coordinates (x, y, z). The three-dimensional skeleton information is shown, for example, in FIG. 8. In FIG. 8, the three-dimensional skeleton information has an ID that uniquely identifies a record, joint identification information (item names: joint name 1, joint name 2), and three-dimensional coordinates (item names: coordinate 1, coordinate 2). An image rendering of this skeleton information is shown, for example, in FIG. 9.

Next, the score calculation unit 107 calculates the singer joint angles using the skeleton information of FIG. 8. To do so, the score calculation unit 107 acquires, from the skeleton information of FIG. 8, the joint identification information that identifies a bodily joint. For one acquired piece of joint identification information, the score calculation unit 107 acquires the two records that contain it. It then calculates the angle of the joint identified by that joint identification information, using the three distinct coordinates found in the two records.

For example, when calculating the joint angle for the joint identification information "right elbow", the score calculation unit 107 first acquires the records "ID = 011" and "ID = 012", which contain the joint identification information "right elbow". It then calculates the angle of the "right elbow" using the three distinct coordinates in those two records: "(30, 30, 30)", "(25, 40, 30)", and "(20, 50, 35)".
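The source does not state how the angle is computed from the three coordinates. One common choice, shown here purely as an assumption, is the angle between the two vectors that run from the shared joint to the two neighbouring joints. The record layout, the neighbouring joint names ("right shoulder", "right wrist"), and the pairing of coordinates with joint names are hypothetical; only the three coordinates and the two IDs come from the example above.

```python
import math

# Hypothetical records shaped like FIG. 8: each record links two joints.
records = [
    {"id": "011", "joints": ("right shoulder", "right elbow"),
     "coords": ((30, 30, 30), (25, 40, 30))},
    {"id": "012", "joints": ("right elbow", "right wrist"),
     "coords": ((25, 40, 30), (20, 50, 35))},
]


def joint_angle(records, joint):
    """Angle in degrees at `joint`, taken between its two connected neighbours."""
    vertex = None
    ends = []
    for rec in records:
        if joint not in rec["joints"]:
            continue
        for name, coord in zip(rec["joints"], rec["coords"]):
            if name == joint:
                vertex = coord          # coordinate shared by both records
            else:
                ends.append(coord)      # coordinate of a neighbouring joint
    if vertex is None or len(ends) != 2:
        raise ValueError("expected exactly two records containing the joint")
    a = [e - v for e, v in zip(ends[0], vertex)]
    b = [e - v for e, v in zip(ends[1], vertex)]
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))
```

Under these assumptions, `joint_angle(records, "right elbow")` yields an angle between the upper arm and the forearm; the value depends on the assumed formula and need not match the figures in the source.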

The singer joint angles calculated as described above are shown, for example, in FIG. 10. In FIG. 10, each singer joint angle is associated with an ID that uniquely identifies a record and with joint identification information (item name: joint name).

Next, for the singer joint angles in FIG. 10 and the model joint angles in FIG. 6, the score calculation unit 107 compares each singer joint angle with the model joint angle associated with the same joint identification information, and calculates a parameter indicating the result of the comparison. The model joint angles in FIG. 6 are assumed to correspond to the timing at which the distance image in FIG. 7 was acquired.

For example, the score calculation unit 107 compares the singer joint angle "25°" of "ID = 012" in FIG. 10 with the model joint angle "30°" of "ID = 012" in FIG. 6, both of which are associated with the same joint identification information "left shoulder", and calculates a parameter. The formula for the parameter is "abs(singer joint angle − model joint angle) / model joint angle", that is, the absolute value of the difference between the singer joint angle and the model joint angle, divided by the model joint angle. For these values the parameter is abs(25 − 30) / 30 ≈ 0.17.

The parameters calculated as described above are shown, for example, in FIG. 11. In FIG. 11, each parameter is associated with an ID that uniquely identifies a record and with joint identification information (item name: joint name).

Next, the score calculation unit 107 calculates a score using the parameters of FIG. 11 and the score calculation formula. The score calculation formula is "100 − (average of parameters × 100)", where "average of parameters" is the average of the parameters in FIG. 11. The score calculation unit 107 then calculates, for example, the score "85".

次に、スコア出力部１０８は、スコア「８５」を出力する。 Next, the score output unit 108 outputs the score “85”.
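The two formulas of Example 1 can be written out as a short sketch. The function names are hypothetical, and the parameter values passed in the usage note are illustrative rather than taken from FIG. 11, whose contents are not reproduced here.

```python
def comparison_parameter(singer_angle: float, model_angle: float) -> float:
    """abs(singer joint angle - model joint angle) / model joint angle."""
    # Note: a model joint angle of 0 would divide by zero; the source
    # does not say how that case is handled.
    return abs(singer_angle - model_angle) / model_angle


def comparison_score(parameters: list) -> float:
    """100 - (average of parameters x 100)."""
    average = sum(parameters) / len(parameters)
    return 100 - average * 100
```

For the left shoulder values in the text, `comparison_parameter(25, 30)` is about 0.17. A set of parameters averaging 0.15 would give a score of about 85, which is consistent with the example score. Because a parameter can exceed 1 when the singer's angle is far from the model, a practical implementation might clamp the result to the range 0 to 100; that is not stated in the source.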

(Example 2)
This example describes scoring the singer's movement and calculating a score by judging the skeleton information against the motion determination information. In this example, the motion determination information storage unit 101 is assumed to store the motion determination information shown in FIG. 12. This motion determination information (item name: joint angle condition) is a joint angle condition. Each joint angle condition is associated with an ID that uniquely identifies a record.

まず、例１と同様に、ユーザの指示により楽曲再生部１１１が楽曲データの再生を開始した結果、スケルトン情報取得部１０６が図８のスケルトン情報を取得したものとする。 First, as in Example 1, it is assumed that the skeleton information acquisition unit 106 has acquired the skeleton information of FIG. 8 as a result of the music playback unit 111 starting playback of music data in accordance with a user instruction.

Next, as in Example 1, the score calculation unit 107 calculates the singer joint angles of FIG. 10.

Next, for the singer joint angles in FIG. 10 and the joint angle conditions in FIG. 12, the score calculation unit 107 determines, for each singer joint angle and joint angle condition associated with the same joint identification information, whether the singer joint angle satisfies the joint angle condition. The score calculation unit 107 then acquires information indicating the result of the determination.

For example, for the singer joint angle "25°" of "ID = 012" in FIG. 10 and the joint angle condition "25° ≦ left shoulder ≦ 35°" of "ID = 012" in FIG. 12, both of which are associated with the same joint identification information "left shoulder", the score calculation unit 107 determines that the singer joint angle "25°" satisfies the joint angle condition. The score calculation unit 107 then acquires the information "1" indicating the result of the determination.

Similarly, for the singer joint angle "100°" of "ID = 013" in FIG. 10 and the joint angle condition "115° ≦ right elbow ≦ 125°" of "ID = 013" in FIG. 12, both of which are associated with the same joint identification information "right elbow", the score calculation unit 107 determines that the singer joint angle "100°" does not satisfy the joint angle condition. The score calculation unit 107 then acquires the information "0" indicating the result of the determination.

The information indicating the determination results acquired as described above is shown, for example, in FIG. 13. In FIG. 13, the information indicating each determination result (item name: determination result) is associated with an ID that uniquely identifies a record and with joint identification information (item name: joint name).

次に、スコア算出部１０７は、図１３の判断結果を示す情報を用いてパラメータを算出する。ここで、当該パラメータは、判断の結果を示す情報における「１」の数であるものとする。すると、スコア算出部１０７は、図１３の判断結果を示す情報における「１」の数をカウントし、パラメータを算出する。この結果、スコア算出部１０７は、パラメータ「１０」を算出したものとする。 Next, the score calculation unit 107 calculates a parameter using information indicating the determination result of FIG. Here, it is assumed that the parameter is the number of “1” in the information indicating the determination result. Then, the score calculation unit 107 counts the number of “1” in the information indicating the determination result in FIG. 13 and calculates a parameter. As a result, it is assumed that the score calculation unit 107 calculates the parameter “10”.

Next, the score calculation unit 107 calculates a score using the calculated parameter "10" and the score calculation formula. The score calculation formula is "100 × (parameter / 14)". The score calculation unit 107 then calculates, for example, the score "71".

次に、スコア出力部１０８は、スコア「７１」を出力する。 Next, the score output unit 108 outputs the score “71”.
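The condition-based scoring of Example 2 can be sketched as follows. Representing a joint angle condition as a (lower, upper) pair and the function names are assumptions for illustration; the two conditions and the formula "100 × (parameter / 14)" come from the text, with the divisor passed in as `total`.

```python
# Joint angle conditions from the text, stored as inclusive (lower, upper) bounds.
conditions = {
    "left shoulder": (25, 35),
    "right elbow": (115, 125),
}


def judge(angle: float, bounds: tuple) -> int:
    """Return 1 if the angle satisfies the condition, otherwise 0."""
    lower, upper = bounds
    return 1 if lower <= angle <= upper else 0


def condition_score(results: list, total: int) -> float:
    """100 x (number of satisfied conditions / total)."""
    satisfied = sum(results)  # the parameter: count of "1" results
    return 100 * (satisfied / total)
```

With ten satisfied conditions and a divisor of 14, `condition_score` gives roughly 71.4, which corresponds to the score "71" in the example. Whether the apparatus rounds or truncates is not stated in the source.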

(Example 3)
Next, an example will be described in which the model image (built from the photographed singer) and the example image (the dance the singer is meant to follow) are displayed on screen together with the score.

まず、例１と同様に、スコア算出部１０７が、スコア「８５」を算出したものとする。 First, similarly to Example 1, it is assumed that the score calculation unit 107 calculates the score “85”.

Next, the model image construction unit 109 detects the singer area from the distance image of FIG. 7. Using the distance information associated with each pixel in the singer area, together with the singer coordinate value calculation formula, it calculates the coordinate value (z) corresponding to each piece of distance information. The model image construction unit 109 then associates, pixel by pixel, the coordinates (x, y) of each pixel in the singer area with the calculated coordinate value (z), forming one or more three-dimensional coordinates (x, y, z).

Next, the model image construction unit 109 uses the one or more three-dimensional coordinates (x, y, z) to perform the processing needed to create 3DCG, such as modeling, rendering, and texture mapping, and thereby constructs the singer model image. The singer model image is, for example, the image shown in FIG. 14.

Next, the image output unit 110 acquires an example image from the example image storage unit 102. The example image is assumed to be the image shown in FIG. 15.

Next, the score output unit 108 outputs the score "85", and the image output unit 110 outputs the model image of FIG. 14 and the example image of FIG. 15. The score output unit 108 outputs the score at a position that does not overlap the example image. The image output unit 110 outputs the model image and the example image to separate areas of the screen. The result is shown, for example, in FIG. 16, in which the model image is output on the left side of the screen, the example image on the right side, and the score above the model image.

As described above, the karaoke apparatus 1 according to the present embodiment makes it possible to provide a karaoke apparatus that scores dancing accurately. The user (singer) can therefore enjoy not only song scoring but also dance scoring, and can enjoy karaoke while dancing as well as singing. In addition, because the example image is output, even a user who does not know the dance for the song being sung can enjoy dancing while singing. In short, the karaoke apparatus 1 according to the present embodiment can make karaoke more entertaining and offer users a karaoke experience they can enjoy more.

また、以上より、従来のカラオケ装置は、歌唱者の歌のみを採点する個別採点型カラオケを提供するカラオケ装置であり、本実施の形態によるカラオケ装置１は、歌唱者の歌のみではなくダンスも採点できる総合採点型カラオケを提供するカラオケ装置であると言える。言い換えると、本実施の形態によるカラオケ装置１は、次世代のカラオケを提供するカラオケ装置であると言える。 In addition, as described above, the conventional karaoke apparatus is a karaoke apparatus that provides an individual scoring type karaoke for scoring only the song of the singer. It can be said that it is a karaoke device that provides a comprehensive scoring karaoke that can be scored. In other words, it can be said that the karaoke apparatus 1 according to the present embodiment is a karaoke apparatus that provides the next generation of karaoke.

In the present embodiment, example images are normally stored in advance in the example image storage unit 102. An example image may also be, for example, one constructed according to the music data played by the music playback unit 111. In that case, the karaoke apparatus 1 further includes, for example, a character information storage unit that stores character information, which is information representing a character; an example information storage unit that stores example information, which is information representing the example movement; and an example image construction unit that uses the character information and the example information to construct an example image, which is an image of the character moving in accordance with the example movement indicated by the example information. The example image constructed by the example image construction unit is then accumulated in the example image storage unit 102.

The data structure of the "example information" above is the same as that of the skeleton information, so its description is omitted. The method and procedure by which the example image construction unit constructs an example image are the same as those by which the model image construction unit 109 constructs a model image from character information and skeleton information, so their description is also omitted.

また、本実施の形態において、動き判定情報格納部１０１には、通常、１以上の動き判定情報が格納される。当該１以上の動き判定情報は、通常、予め決められた１以上の各タイミングに対応する動き判定情報である。また、当該１以上の各動き判定情報は、予め決められた１以上のタイミングごとの歌唱者の動きを判定するための動き判定情報である。また、当該タイミングとは、例えば、当該楽曲データにおけるフレーム番号である。また、当該タイミングは、例えば、楽曲データの再生開始からの経過時間や、いわゆるタイムスタンプなどであってもよい。つまり、当該１以上の各動き判定情報には、通常、楽曲データ、および、当該楽曲データにおけるタイミングを示す情報が対応付いている。 In the present embodiment, the motion determination information storage unit 101 normally stores one or more pieces of motion determination information. The one or more pieces of movement determination information are usually movement determination information corresponding to one or more predetermined timings. The one or more pieces of movement determination information are movement determination information for determining the movement of the singer at each of one or more predetermined timings. The timing is, for example, a frame number in the music data. The timing may be, for example, an elapsed time from the start of reproduction of music data, a so-called time stamp, or the like. That is, the one or more pieces of motion determination information are normally associated with music data and information indicating timing in the music data.

また、本実施の形態において、撮影部１０５が取得した１以上の各歌唱者画像（静止画）には、通常、撮影開始からのタイミングを示す情報が対応付いている。当該タイミングとは、例えば、動画におけるフレーム番号である。また、当該タイミングは、例えば、撮影開始からの経過時間や、いわゆるタイムスタンプなどであってもよい。つまり、撮影部１０５は、取得した１以上の各歌唱者画像に、通常、撮影開始からのタイミングを示す情報を対応付ける。 In the present embodiment, one or more singer images (still images) acquired by the photographing unit 105 are usually associated with information indicating timing from the start of photographing. The said timing is a frame number in a moving image, for example. The timing may be, for example, an elapsed time from the start of shooting, a so-called time stamp, or the like. That is, the photographing unit 105 normally associates information indicating the timing from the start of photographing with each acquired one or more singer images.

In the present embodiment, the photographing unit 105 normally starts acquiring singer images (photographing the singer) at the same time as playback of the music data starts. Here, "at the same time" includes "at almost the same time". That is, when the music playback unit 111 starts playing the music data, the photographing unit 105 starts acquiring singer images.

From the above, in the present embodiment, the score calculation unit 107 normally acquires, from the motion determination information storage unit 101, the motion determination information that corresponds to the music data being played by the music playback unit 111 and to the current timing within that music data. The score calculation unit 107 then uses the skeleton information acquired from the singer image corresponding to that timing (for example, the frame number), together with the acquired motion determination information, to score the singer's movement indicated by the skeleton information and calculate a score. If, for example, no motion determination information is associated with the music data played by the music playback unit 111, the score calculation unit 107 normally does not calculate a score.

Specifically, suppose for example that the music playback unit 111 is playing the music data identified by the music data identification information "music01", and that the photographing unit 105 has acquired the singer image of the 10th frame. In this case, the score calculation unit 107 first acquires, from the motion determination information storage unit 101, the motion determination information corresponding to the music data identification information "music01" and the frame number "10th frame". The score calculation unit 107 then uses the skeleton information acquired from the singer image of the 10th frame, together with the acquired motion determination information, to score the singer's movement indicated by the skeleton information for the 10th frame (that is, the movement of the singer shown in the singer image of the 10th frame).

また、例えば、楽曲再生部１１１が、楽曲データ識別情報「ｍｕｓｉｃ０１」で識別される楽曲データを再生しているとする。撮影部１０５が、撮影開始から１０秒後に歌唱者画像を取得したとする。この様な場合、スコア算出部１０７は、まず、楽曲データ識別情報「ｍｕｓｉｃ０１」と再生開始からの経過時間「１０秒」に対応する動き判定情報を、動き判定情報格納部１０１から取得する。そして、スコア算出部１０７は、当該撮影開始から１０秒後の歌唱者画像を用いて取得されたスケルトン情報と、当該取得した動き判定情報とを用いて、当該撮影開始から１０秒後に対応するスケルトン情報により示される歌唱者の動き（当該撮影開始から１０秒後の歌唱者画像に写されている歌唱者の動き）を採点する。 Further, for example, it is assumed that the music reproducing unit 111 is reproducing music data identified by the music data identification information “music01”. Assume that the photographing unit 105 acquires a singer image 10 seconds after the start of photographing. In such a case, the score calculation unit 107 first acquires, from the motion determination information storage unit 101, motion determination information corresponding to the music data identification information “music01” and the elapsed time “10 seconds” from the start of reproduction. Then, the score calculation unit 107 uses the skeleton information acquired using the singer image 10 seconds after the start of shooting and the acquired motion determination information, and the corresponding skeleton 10 seconds after the start of shooting. The movement of the singer indicated by the information (the movement of the singer shown in the singer image 10 seconds after the start of the shooting) is scored.
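The lookup described in the two cases above can be sketched as a table keyed by music data identification information and timing. The dictionary layout, the function name, and the stored angle values are assumptions for illustration; the source specifies only that motion determination information is associated with music data and a timing such as a frame number or elapsed time.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, int]          # (music data identification information, timing)
MotionInfo = Dict[str, float]  # e.g. joint name -> model joint angle

# Hypothetical contents of the motion determination information storage unit.
store: Dict[Key, MotionInfo] = {
    ("music01", 10): {"left shoulder": 30.0, "right elbow": 120.0},
}


def find_motion_info(store: Dict[Key, MotionInfo],
                     music_id: str, timing: int) -> Optional[MotionInfo]:
    """Return the motion determination information for this music and timing.

    Returns None when nothing is registered, in which case no score
    would be calculated.
    """
    return store.get((music_id, timing))
```

The same shape works whether the timing is a frame number or an elapsed time in seconds, as long as the stored keys and the lookup use the same unit.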

In the present embodiment, the skeleton information acquisition unit 106 may, for example, detect the region of a microphone (hereinafter referred to as the microphone region where appropriate) in the singer image, and detect a region adjacent to the microphone region, or a region including the microphone region, as the singer region. The microphone region is detected, for example, by image processing such as contour extraction or pattern recognition. As a result, when two or more persons who could be the singer appear in the singer image acquired by the photographing unit 105, for example, the skeleton information acquisition unit 106 can detect only the region of the user who is actually singing (the singer) as the singer region.

In the present embodiment, when the singer image is, for example, a moving image having one or more still images (a moving image composed of one or more frames), the skeleton information acquisition unit 106 uses each of the one or more still images to acquire the skeleton information corresponding to that still image. In this case, the skeleton information acquisition unit 106 may, for example, use each of the one or more still images that satisfy a predetermined condition to acquire the skeleton information corresponding to that still image. When the singer image is a still image (one of the still images of a moving image, or one of the frames constituting a moving image), the skeleton information acquisition unit 106 uses the still image to acquire the skeleton information corresponding to the singer image. In this case as well, the skeleton information acquisition unit 106 may, for example, use each of the one or more still images that satisfy a predetermined condition to acquire the skeleton information corresponding to each of the one or more singer images. The "predetermined condition" is, for example, a condition on the timing of the still image within the moving image (frame number, elapsed time, and so on), or a condition on whether the singer shown in the still image is moving or stationary.

The score calculation unit 107 may, for example, calculate a score each time the skeleton information acquisition unit 106 acquires skeleton information, using that skeleton information to calculate the score corresponding to the still image (frame) from which the skeleton information was acquired. Alternatively, the score calculation unit 107 may, for example, use only the skeleton information that satisfies a predetermined condition, out of the one or more pieces of skeleton information acquired by the skeleton information acquisition unit 106, to calculate the score corresponding to the still image from which that skeleton information was acquired. The "predetermined condition" is, for example, a condition on the timing of the still image within the moving image (frame number, elapsed time, and so on), or a condition on whether the singer shown in the still image is moving or stationary.

The score calculation unit 107 may also, for example, apply statistical processing to two or more calculated scores to calculate a new score. The statistical processing is, for example, calculating a total or calculating an average. The average may be a simple average or a weighted average. When a weighted average is calculated, the weights for the two or more scores (that is, for the two or more timings corresponding to those scores) are normally determined in advance and are normally held in advance by the score calculation unit 107. The score calculation unit 107 may further apply statistical processing to two or more of the newly calculated scores to calculate yet another new score.

For example, when calculating a new score as the weighted average of the five scores corresponding to the 5th through 9th frames, the score calculation unit 107 acquires the five weights corresponding to the 5th through 9th frames. The score calculation unit 107 then calculates the weighted average from the five scores and the five weights, and takes the result as the new score.
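The weighted average over the 5th through 9th frames can be sketched as follows. The score and weight values are hypothetical and chosen only for illustration; the embodiment does not specify them, and the function name is likewise an assumption.

```python
from typing import Sequence


def weighted_average(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Return sum(score * weight) / sum(weight) for per-frame scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Hypothetical per-frame scores and predetermined weights for frames 5 to 9.
frame_scores = [60, 70, 80, 90, 100]
frame_weights = [1, 1, 2, 3, 3]
new_score = weighted_average(frame_scores, frame_weights)  # 860 / 10 = 86.0
```

Setting all weights equal reduces this to the simple average mentioned above.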

In the present embodiment, the score calculation unit 107 may also score the movement of the singer while taking into account, for example, the delay of the singer's movement relative to the example movement. Specifically, in this case the score calculation unit 107 uses, for example, the motion determination information stored in the motion determination information storage unit 101 and the skeleton information acquired by the skeleton information acquisition unit 106 to calculate a delay degree, which indicates how far the singer's movement indicated by the skeleton information lags behind the example movement indicated by the motion determination information. The score calculation unit 107 then scores the movement of the singer according to the delay degree and calculates a score that is the result of the scoring. The "delay degree" is, for example, the number of frames or the number of seconds by which the movement is delayed. Scoring "according to the delay degree" means, for example, regarding the delay as absent, or reflecting the delay degree in the score.

To "regard the delay as absent" is, for example, to score the movement of the singer shown in a singer image using the motion determination information corresponding to a frame number different from the frame number of that singer image. In other words, it is, for example, to score the movement of the singer shown in one singer image using the motion determination information that indicates the example movement most similar to the singer's movement indicated by the skeleton information acquired from that singer image. Using "the motion determination information that indicates the example movement most similar to the singer's movement" means, for example, taking the highest of the scores obtained by scoring against two or more pieces of motion determination information. To "reflect the delay degree in the score" is, for example, to multiply the calculated score by a coefficient that depends on the delay degree.

For example, suppose it is determined in advance that the movement of the singer is scored using the motion determination information corresponding to frame numbers whose difference from the frame number of the singer image (still image) acquired by the photographing unit 105 satisfies a predetermined condition. The "predetermined condition" is assumed to be "+5 frames" from the target frame number. Suppose further that the photographing unit 105 has acquired the singer image of the 10th frame. In this case, the score calculation unit 107 scores, for example, the movement of the singer indicated by the skeleton information acquired from the singer image of the 10th frame against the motion determination information corresponding to each of the frames from the 5th frame to the 10th frame. The score calculation unit 107 then takes the highest of the six calculated scores as the score corresponding to the singer image of the 10th frame.

Suppose also, for example, that of the six calculated scores, the score calculation unit 107 has taken the score calculated using the motion determination information of the 8th frame as the score corresponding to the singer image of the 10th frame. In this case, the score calculation unit 107 calculates, for example, the delay degree "10 − 8 = 2". The score calculation unit 107 then calculates a coefficient according to the delay degree; the method of calculating the coefficient is not limited. Finally, the score calculation unit 107 multiplies the score corresponding to the singer image of the 10th frame by the calculated coefficient to calculate a new score.
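The delay-tolerant scoring in the example above can be sketched as follows. This is a minimal sketch under stated assumptions: the per-frame scores are hypothetical, the function and parameter names are invented for the example, and the linear coefficient `1.0 - 0.05 * delay` is only one possible choice, since the text leaves the coefficient calculation open.

```python
from typing import Callable, Dict, Tuple


def score_with_delay(
    frame: int,
    score_against: Callable[[int], float],
    max_delay: int = 5,
    coefficient: Callable[[int], float] = lambda delay: 1.0 - 0.05 * delay,
) -> Tuple[int, int, float]:
    """Score one singer frame against the example frames frame-max_delay..frame.

    Returns (best example frame, delay degree, score after applying the coefficient).
    """
    # Iterate from the smallest delay upward so that ties favour the smaller delay.
    candidates: Dict[int, float] = {
        ref: score_against(ref)
        for ref in range(frame, frame - max_delay - 1, -1)
        if ref >= 0
    }
    best_ref = max(candidates, key=candidates.get)  # highest score wins
    delay = frame - best_ref                        # delay degree in frames
    return best_ref, delay, candidates[best_ref] * coefficient(delay)


# Hypothetical scores of the 10th singer frame against example frames 5 to 10.
scores_by_example_frame = {5: 40.0, 6: 50.0, 7: 60.0, 8: 90.0, 9: 70.0, 10: 55.0}
best_ref, delay, adjusted = score_with_delay(10, scores_by_example_frame.__getitem__)
```

With these values the best match is the 8th frame, the delay degree is 10 − 8 = 2, and the raw score of 90 is scaled by the assumed coefficient 0.9. Passing `coefficient=lambda delay: 1.0` corresponds to regarding the delay as absent.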

Needless to say, the frame numbers in the above description may instead be elapsed times.

As described above, in the present embodiment the score calculation unit 107 scores the movement of the singer using, for example, the degree of match between the singer's movement and the example movement and the delay degree of the singer's movement relative to the example movement, and calculates a score that is the result of the scoring. In this case, the score calculation unit 107 calculates the score using, for example, a score calculation formula. Specifically, the score calculation unit 107 substitutes the degree of match and the delay degree into the score calculation formula as parameters.

In the present embodiment, the karaoke apparatus 1 need not include the score calculation unit 107. In this case, the karaoke apparatus 1 receives, for example, a score calculated by an apparatus that calculates the above-described score (hereinafter referred to as the score calculation apparatus where appropriate) from that score calculation apparatus. The score output unit 108 then outputs the received score. The processing and operations performed by the score calculation apparatus are the same as those performed by the score calculation unit 107, and their description is therefore omitted.

In the present embodiment, the karaoke apparatus 1 need not include the model image construction unit 109. In this case, the image output unit 110 outputs only the example image.

In the present embodiment, the example image output by the image output unit 110 is normally associated with the music data played back by the music playback unit 111. That is, when outputting an example image, the image output unit 110 normally acquires, from the example image storage unit 102, the example image associated with the music data being played back by the music playback unit 111, and outputs the acquired example image. When no example image is associated with the music data played back by the music playback unit 111, for example, the image output unit 110 normally does not output an example image.

In the present embodiment, the image output unit 110 normally also outputs a lyrics image, which is an image showing lyrics. The lyrics image is normally stored in a predetermined storage area, which may be, for example, the music data storage unit 103. The lyrics image to be output is normally associated with the music data played back by the music playback unit 111. That is, the image output unit 110 normally acquires, from the predetermined storage area, the lyrics image associated with the music data being played back by the music playback unit 111, and outputs the acquired lyrics image. When a lyrics image is output, the composite image construction unit may, for example, also use the lyrics image to construct the composite image.

In the present embodiment, when an example image is output, output of the example image normally starts at the same time as playback of the music data starts. "At the same time" here includes "at almost the same time". That is, when the music playback unit 111 starts playing back the music data, the image output unit 110 starts outputting the example image.

In the present embodiment, when a model image is output, output of the model image normally starts at the same time as playback of the music data starts. "At the same time" here includes "at almost the same time". That is, when the music playback unit 111 starts playing back the music data, the photographing unit 105 starts photographing the singer. In response to the start of photographing by the photographing unit 105, the model image construction unit 109 constructs the model image, and the image output unit 110 starts outputting the model image.

When outputting both the model image and the example image, the image output unit 110 may, for example, output them superimposed on each other. In this case, the image output unit 110 outputs them so that, for example, the model image is overlaid on the example image. Alternatively, the image output unit 110 may output them so that the example image is overlaid on the model image. In either case, the image output unit 110 may, for example, apply so-called semi-transparency processing to the image that is overlaid on the other. This makes it easier for the user to see whether he or she is dancing as shown in the example.
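One common way to realize such semi-transparency is alpha blending, sketched below on plain RGB tuples. The embodiment does not specify the blending method, so the blend formula, the opacity value, and the function names are all assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component 0 to 255


def blend_pixel(top: Pixel, bottom: Pixel, alpha: float) -> Pixel:
    """Blend one pixel: alpha is the opacity of the overlaid (top) image, 0.0 to 1.0."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    # int(x + 0.5) rounds the non-negative blended value half up.
    return tuple(int(alpha * t + (1.0 - alpha) * b + 0.5) for t, b in zip(top, bottom))


def overlay(top: List[List[Pixel]], bottom: List[List[Pixel]],
            alpha: float = 0.5) -> List[List[Pixel]]:
    """Overlay two images of the same size, row by row and pixel by pixel."""
    return [
        [blend_pixel(t, b, alpha) for t, b in zip(top_row, bottom_row)]
        for top_row, bottom_row in zip(top, bottom)
    ]


# A 1x1 red "model image" overlaid on a 1x1 blue "example image" at 50% opacity.
blended = overlay([[(255, 0, 0)]], [[(0, 0, 255)]], alpha=0.5)
```

Swapping the two arguments of `overlay` corresponds to overlaying the example image on the model image instead.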

In the present embodiment, the karaoke apparatus 1 may also include, for example, a singing scoring unit 212 that scores the song sung by the singer. The singing scoring unit 212 normally scores the song sung by the singer and calculates a score that is the result of the scoring (hereinafter referred to as the song score where appropriate). Methods and procedures for scoring a song sung by a singer (that is, for calculating the song score) are well known, and their detailed description is therefore omitted.

When the singing scoring unit 212 is provided, the karaoke apparatus 1 may normally include, in place of the score calculation unit 107, a score calculation unit 207, for example. The score calculation unit 207 uses the score that is the result of scoring the movement of the singer (hereinafter referred to as the dance score where appropriate) and the song score to calculate a score that is the result of scoring both the singer's song and the singer's movement (hereinafter referred to as the total score where appropriate). In this case, the score output unit 108 normally outputs the total score calculated by the score calculation unit 207. The score output unit 108 may also output, for example, one or more of the total score or the dance score calculated by the score calculation unit 207 and the song score acquired by the singing scoring unit 212.

The total score is calculated, for example, as follows. The score calculation unit 207 holds in advance, for example, a formula for calculating the total score (hereinafter referred to as the total score calculation formula where appropriate). The total score calculation formula normally has variables into which the dance score and the song score are substituted. The score calculation unit 207 substitutes, for example, the calculated dance score and the song score calculated by the singing scoring unit 212 into the total score calculation formula, and evaluates the formula after substitution to calculate the total score. The total score calculation formula may be a so-called function (program). The total score calculation formula is normally preferably an increasing function of the dance score and of the song score.
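A total score calculation formula of this kind can be sketched as follows. The weighted sum and the default weight of 0.5 are assumptions chosen only because they give a function that increases in both the dance score and the song score; the embodiment does not prescribe a particular formula.

```python
def total_score(dance_score: float, song_score: float, dance_weight: float = 0.5) -> float:
    """Hypothetical total score formula: a weighted sum of the dance and song scores.

    With 0 < dance_weight < 1, the result strictly increases in both inputs.
    """
    if not 0.0 < dance_weight < 1.0:
        raise ValueError("dance_weight must be strictly between 0 and 1")
    return dance_weight * dance_score + (1.0 - dance_weight) * song_score


# Hypothetical dance score of 80 and song score of 90.
performance = total_score(80, 90)  # 0.5 * 80 + 0.5 * 90 = 85.0
```

Raising either input while holding the other fixed raises the result, which is the increasing-function property described above.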

The "total score" can also be described as a performance score, that is, a score for the singer's performance (song and dance).

FIG. 17 is a block diagram of a karaoke apparatus 2, which is a karaoke apparatus including the singing scoring unit 212 and the score calculation unit 207. In FIG. 17, the karaoke apparatus 2 includes the motion determination information storage unit 101, the example image storage unit 102, the music data storage unit 103, the reception unit 104, the photographing unit 105, the skeleton information acquisition unit 106, the score calculation unit 207, the score output unit 108, the model image construction unit 109, the image output unit 110, the music playback unit 111, and the singing scoring unit 212.

In each of the above embodiments, each process or function may be realized by centralized processing in a single apparatus or a single system, or by distributed processing across a plurality of apparatuses or a plurality of systems.

In each of the above embodiments, each component may be configured by dedicated hardware, or a component that can be realized by software may be realized by executing a program. For example, each component can be realized by a program execution unit such as a CPU reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory.

The software that realizes the karaoke apparatus in each of the above embodiments is, for example, the following program. That is, the program causes a computer to function as: a photographing unit that photographs a singer and acquires a singer image, which is an image showing the singer; a skeleton information acquisition unit that acquires, using information included in the singer image, skeleton information, which is information indicating the movement of the singer; a score calculation unit that scores, using the skeleton information acquired by the skeleton information acquisition unit, the movement of the singer indicated by the skeleton information and calculates a score that is the result of the scoring; and a score output unit that outputs the score calculated by the score calculation unit.

The functions realized by the above program do not include functions that can be realized only by hardware.

The above program may be executed after being downloaded from a server or the like, or may be executed by reading a program recorded on a predetermined recording medium (for example, an optical disc such as a CD-ROM, a magnetic disk, or a semiconductor memory). The program may also be used as a program constituting a program product.

The above program may be executed by a single computer or by a plurality of computers. That is, either centralized processing or distributed processing may be performed.

FIG. 18 is an external view of a computer system 9 that executes the above-described program to realize the karaoke apparatus and the like of the above-described embodiments. The above-described embodiments can be realized by computer hardware and a computer program executed on it.

In FIG. 18, the computer system 9 includes a computer 901 that includes a CD-ROM drive 9011 and an FD drive 9012, a keyboard 902, a mouse 903, and a monitor 904.

FIG. 19 is a block diagram of the computer system 9. In FIG. 19, the computer 901 includes, in addition to the CD-ROM drive 9011 and the FD drive 9012, an MPU 9013; a ROM 9014 for storing programs such as a boot-up program; a RAM 9015 that is connected to the MPU 9013, temporarily stores instructions of application programs, and provides a temporary storage space; a hard disk 9016 for storing application programs, system programs, and data; and a bus 9017 that interconnects the CD-ROM drive 9011, the FD drive 9012, the MPU 9013, and the other components. Although not shown, the computer 901 may further include a network card that provides a connection to a LAN.

The program that causes the computer system 9 to execute the functions of the karaoke apparatus and the like of the above-described embodiments may be stored on a CD-ROM 9101 or an FD 9102, inserted into the CD-ROM drive 9011 or the FD drive 9012, and then transferred to the hard disk 9016. Alternatively, the program may be transmitted to the computer 901 via a network (not shown) and stored on the hard disk 9016. The program is loaded into the RAM 9015 at execution time. The program may also be loaded directly from the CD-ROM 9101, the FD 9102, or the network.

The program need not necessarily include an operating system (OS), third-party programs, or the like that cause the computer 901 to execute the functions of the karaoke apparatus and the like of the above-described embodiments. The program only needs to include those instructions that call the appropriate functions (modules) in a controlled manner so that the desired results are obtained. How the computer system 9 operates is well known, and its detailed description is omitted.

The present invention is not limited to the above embodiments. Various modifications are possible, and needless to say, they are also included within the scope of the present invention.

As described above, the karaoke apparatus according to the present invention has the effect of making it possible to provide a karaoke apparatus that scores dance accurately, and is useful as a so-called online karaoke system and the like.

DESCRIPTION OF SYMBOLS
1 Karaoke apparatus
101 Motion determination information storage unit
102 Example image storage unit
103 Music data storage unit
104 Reception unit
105 Photographing unit
106 Skeleton information acquisition unit
107 Score calculation unit
108 Score output unit
109 Model image construction unit
110 Image output unit
111 Music playback unit

Claims

A karaoke apparatus comprising:
a photographing unit that photographs a singer and acquires a singer image, which is an image showing the singer;
a skeleton information acquisition unit that acquires, using information included in the singer image, skeleton information, which is information indicating the movement of the singer; and
a score output unit that outputs a score that is the result of scoring the movement of the singer using the skeleton information.

The karaoke apparatus according to claim 1, wherein
the singer image is an image including distance information, and
the skeleton information acquisition unit acquires the skeleton information using the distance information included in the singer image.

The karaoke apparatus according to claim 2, wherein the information included in the singer image is, of the distance information, the distance information corresponding to the region of the singer.

The karaoke apparatus according to any one of claims 1 to 3, further comprising:
a motion determination information storage unit that stores motion determination information, which is information for determining the movement of the singer; and
a score calculation unit that determines, using the motion determination information and the skeleton information, the movement of the singer indicated by the skeleton information, and calculates the score using a result of the determination,
wherein the score output unit outputs the score calculated by the score calculation unit.

The karaoke apparatus according to claim 4, wherein
the motion determination information storage unit stores one or more pieces of motion determination information for determining the movement of the singer at each predetermined timing,
the skeleton information acquisition unit acquires, using the information included in the singer image, one or more pieces of skeleton information indicating the movement of the singer at one or more of the predetermined timings, and
the score calculation unit determines the movement of the singer indicated by each of the one or more pieces of skeleton information acquired by the skeleton information acquisition unit using the motion determination information corresponding to each of the one or more timings, and calculates the score using a result of the determination.

The karaoke apparatus according to claim 5, wherein the score calculation unit scores the movement of the singer at each of the predetermined timings, calculates one or more scores that are the results of the scoring at the one or more timings, and calculates a score that is a weighted average of the one or more scores.

The karaoke apparatus according to any one of claims 4 to 6, wherein
the motion determination information is one or more joint angle conditions, which are conditions on the angles of one or more joints of the singer, and
the score calculation unit acquires the angles of the one or more joints of the singer using the skeleton information acquired by the skeleton information acquisition unit, determines the angles of the one or more joints using the joint angle conditions corresponding to the one or more joints, and calculates the score using a result of the determination.

The karaoke apparatus according to any one of claims 4 to 7, wherein
the motion determination information is one or more inter-joint angle conditions, which are conditions on the angles between one or more joints of the singer, and
the score calculation unit acquires the angles between the one or more joints of the singer using the skeleton information acquired by the skeleton information acquisition unit, determines the angles between the one or more joints using the inter-joint angle conditions corresponding to the one or more joints, and calculates the score using a result of the determination.

The karaoke apparatus according to any one of claims 4 to 8, wherein
the movement determination information is one or more joint coordinate conditions, each being a condition on the coordinates of one or more joints of the singer, and
the score calculation unit
acquires the coordinates of the one or more joints of the singer using the skeleton information acquired by the skeleton information acquisition unit, determines the coordinates of the one or more joints using the joint coordinate conditions corresponding to the one or more joints, and calculates the score using a result of the determination.
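
A joint coordinate condition could take many forms. As an illustration only, the sketch below assumes each condition is an axis-aligned rectangle that the joint must lie inside; the rectangle format and the joint names are hypothetical and do not come from the claims.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]
# Hypothetical condition format: (x_min, y_min, x_max, y_max).
Region = Tuple[float, float, float, float]


def joints_in_regions(skeleton: Dict[str, Point],
                      conditions: Dict[str, Region]) -> Dict[str, bool]:
    """For each joint with a condition, report whether it lies in its region."""
    results = {}
    for joint, (x_min, y_min, x_max, y_max) in conditions.items():
        if joint not in skeleton:
            # A joint that was not detected cannot satisfy its condition.
            results[joint] = False
            continue
        x, y = skeleton[joint]
        results[joint] = x_min <= x <= x_max and y_min <= y <= y_max
    return results
```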

The karaoke apparatus according to any one of claims 4 to 9, wherein
the movement determination information is one or more joint movement amount conditions, each being a condition on the movement amount of one of one or more joints of the singer, and
the score calculation unit
acquires the movement amounts of the one or more joints of the singer using the skeleton information acquired by the skeleton information acquisition unit, determines the movement amount of each joint using the one or more joint movement amount conditions corresponding to the one or more joints, and calculates the score using a result of the determination.
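
The movement amount of a joint can be sketched as the distance the joint travels between two pieces of skeleton information. The example assumes Euclidean distance between consecutive 2D positions and a minimum-movement condition per joint; the claim itself specifies neither.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def movement_amounts(previous: Dict[str, Point],
                     current: Dict[str, Point]) -> Dict[str, float]:
    """Euclidean distance each joint moved between two skeletons."""
    return {joint: math.dist(previous[joint], current[joint])
            for joint in current if joint in previous}


def satisfies_movement_conditions(amounts: Dict[str, float],
                                  minimums: Dict[str, float]) -> Dict[str, bool]:
    """Assumed condition: each joint must move at least its minimum amount."""
    return {joint: amounts.get(joint, 0.0) >= minimum
            for joint, minimum in minimums.items()}
```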

The karaoke apparatus according to any one of claims 1 to 10, wherein the information included in the singer image is the singer image from which pixels satisfying a predetermined condition have been deleted.
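
The claim leaves the predetermined condition open. The sketch below assumes, purely for illustration, that the singer image is a grid of depth values in millimetres and that pixels farther than a threshold are treated as background and deleted by overwriting them with a blank value.

```python
from typing import Callable, List

Image = List[List[int]]


def delete_pixels(image: Image,
                  should_delete: Callable[[int], bool],
                  blank: int = 0) -> Image:
    """Return a copy of the image with matching pixels replaced by `blank`."""
    return [[blank if should_delete(pixel) else pixel for pixel in row]
            for row in image]


# Hypothetical condition: anything farther than 3000 mm is background.
def is_background(depth_mm: int) -> bool:
    return depth_mm > 3000
```

Removing such pixels before skeleton acquisition leaves less irrelevant data to process, although the claim does not state the purpose of the deletion.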

The karaoke apparatus according to any one of claims 1 to 11, further comprising:
a model image storage unit in which a model image, which is an image showing a model of the dance, is stored; and
an image output unit that outputs the model image.

The karaoke apparatus according to claim 12, further comprising
a model image construction unit that acquires one or more attribute values of the singer using the information included in the singer image, and constructs a model image in which at least a part of the image shows the model using the one or more attribute values of the singer, wherein
the image output unit
outputs the model image stored in the model image storage unit and the model image constructed by the model image construction unit.

A dance scoring method performed using a photographing unit, a skeleton information acquisition unit, and a score output unit, the method comprising:
a photographing step in which the photographing unit photographs a singer and acquires a singer image, which is an image of the singer;
a skeleton information acquisition step in which the skeleton information acquisition unit acquires, using the information included in the singer image, skeleton information, which is information indicating the movement of the singer; and
a score output step in which the score output unit outputs a score that is a result of scoring the movement of the singer using the skeleton information.
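
The three steps of the method run in sequence, each consuming the output of the previous one. A minimal sketch of that data flow follows; the three callables stand in for the photographing unit, the skeleton information acquisition unit, and the scoring logic, and their signatures are assumptions made for illustration.

```python
from typing import Any, Callable


def score_dance(photograph: Callable[[], Any],
                acquire_skeleton: Callable[[Any], Any],
                score_movement: Callable[[Any], float]) -> float:
    """Run the photographing, skeleton acquisition, and score output steps."""
    singer_image = photograph()                   # photographing step
    skeleton = acquire_skeleton(singer_image)     # skeleton information acquisition step
    return score_movement(skeleton)               # score output step
```

Passing the units in as functions keeps the sketch independent of any particular camera or skeleton-tracking library.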

A program for causing a computer to function as:
a photographing unit that photographs a singer and acquires a singer image, which is an image of the singer;
a skeleton information acquisition unit that acquires, using the information included in the singer image, skeleton information, which is information indicating the movement of the singer; and
a score output unit that outputs a score that is a result of scoring the movement of the singer using the skeleton information.