JP2011215968A

JP2011215968A - Program, information storage medium and object recognition system

Info

Publication number: JP2011215968A
Application number: JP2010084610A
Authority: JP
Inventors: Shoji Nakajima; 正二中島
Original assignee: Namco Bandai Games Inc
Current assignee: Bandai Namco Entertainment Inc
Priority date: 2010-03-31
Filing date: 2010-03-31
Publication date: 2011-10-27

Abstract

PROBLEM TO BE SOLVED: To provide a program for object recognition processing, an information storage medium, and an object recognition system capable of efficiently and accurately recognizing an object while taking depth relations into account. SOLUTION: An input image having a depth value for each pixel is acquired by irradiating an object with light and receiving the light reflected from the object; a specific region is set in the input image based on the depth value of each pixel of the input image; and object recognition processing is performed in the specific region.

Description

The present invention relates to a program, an information storage medium, and an object recognition system.

Conventionally, there are apparatuses that analyze an input image by obtaining a motion vector for each pixel of the input image and recognize an object in the input image (Patent Document 1).

However, the conventional technique shown in Patent Document 1 was built on the assumption that the moving subject is a person. As a result, the movement of a non-human subject (for example, a bird) was sometimes mistaken for the movement of a person.

There is also a conventional technique for determining whether or not a moving subject is a person. However, because it performs recognition processing with the recognition pattern for a person over the entire input image, it is very inefficient and imposes a high processing load.

Furthermore, the conventional techniques perform object recognition processing while ignoring the depth relations of objects, so the accuracy of object recognition is lacking.

Patent Document 1: JP 2005-13750 A

The present invention has been made in view of the problems described above. Its object is to provide a program for object recognition processing, an information storage medium, and an object recognition system that can efficiently and accurately recognize an object while taking depth relations into account.

(1) The present invention relates to a program for performing processing to recognize an object, the program causing a computer to function as: a region setting unit that acquires an input image having a depth value for each pixel by irradiating an object with light and receiving the light reflected from the object, and that sets a specific region in the input image based on the depth value of each pixel of the input image; and an object recognition processing unit that performs object recognition processing for recognizing an object, wherein the object recognition processing unit performs the object recognition processing in the specific region. The present invention also relates to a computer-readable information storage medium that stores (records) a program causing a computer to function as each of the above units. The present invention also relates to an object recognition system including each of the above units.

According to the present invention, object recognition processing is concentrated on the specific region, so an object can be recognized more accurately than before. In addition, because the object recognition processing is performed in the specific region rather than over the full screen, wasted processing can be omitted and the object recognition processing can be performed more efficiently than before. Furthermore, because the specific region is set in the input image based on the depth value of each pixel and the object recognition processing is performed in that specific region, the object can be recognized more accurately with the depth relations of objects taken into account. For example, if a region of the input image regarded as being at short range is set as the specific region, an object at short range can be recognized more efficiently and accurately. Moreover, because the object recognition processing is performed in the specific region, when another object appears in a region other than the specific region, that other object can be prevented from being recognized by mistake.
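As a rough illustration of the idea above, the following Python sketch sets a specific region from per-pixel depth values and then runs recognition only inside it. It is not taken from the patent: the function names `set_specific_region` and `recognize_in_region`, the use of a rectangular bounding box, the convention that smaller depth values mean nearer objects, and the `near_limit` parameter are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

# (left, top, right, bottom), all inclusive pixel coordinates (assumed layout)
Box = Tuple[int, int, int, int]


def set_specific_region(depth: List[List[float]], near_limit: float) -> Optional[Box]:
    """Return the bounding box of all pixels regarded as near.

    Assumption: a pixel is "near" when its depth value is <= near_limit.
    Returns None when no pixel qualifies.
    """
    xs, ys = [], []
    for y, row in enumerate(depth):
        for x, d in enumerate(row):
            if d <= near_limit:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def recognize_in_region(image: List[List[float]], box: Box,
                        recognizer: Callable[[List[List[float]]], object]) -> object:
    """Crop the image to the specific region and run the recognizer on the crop only."""
    left, top, right, bottom = box
    crop = [row[left:right + 1] for row in image[top:bottom + 1]]
    return recognizer(crop)
```

With a bounding box the recognizer never sees pixels outside the specific region, which is one simple way to avoid both the wasted work and the false recognitions described above. A per-pixel mask would be an equally valid choice.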

(2) In the program, information storage medium, and object recognition system of the present invention, the object recognition processing unit may perform the object recognition processing in the specific region with the accuracy of the object recognition processing in the specific region raised above that in regions other than the specific region. Because the recognition accuracy in the specific region is higher than in the other regions, the object can be recognized more accurately.

(3) In the program, information storage medium, and object recognition system of the present invention, the object recognition processing unit may perform the object recognition processing in the specific region with a cycle shorter than the cycle of the object recognition processing in regions other than the specific region. Because the cycle in the specific region is shorter than in the other regions, the object can be recognized more accurately.

(4) In the program, information storage medium, and object recognition system of the present invention, the object recognition processing unit may perform the object recognition processing in the specific region with the image accuracy of the specific region raised above the image accuracy of regions other than the specific region. Because the image accuracy of the specific region is higher than that of the other regions, the object can be recognized more accurately.

(5) In the program, information storage medium, and object recognition system of the present invention, the object recognition processing unit may also perform object recognition processing in regions other than the specific region, and may make the cycle of the object recognition processing in those regions longer than the cycle in the specific region. According to the present invention, object recognition processing can also be performed in regions other than the specific region. In addition, because the cycle in regions other than the specific region is longer than the cycle in the specific region, machine power (the overall processing capacity of the computer) can be concentrated mainly on the object recognition processing of the specific region, which can therefore be performed more accurately.

(6) In the program, information storage medium, and object recognition system of the present invention, the object recognition processing unit may also perform object recognition processing in regions other than the specific region, and may make the image accuracy of those regions lower than the image accuracy of the specific region. According to the present invention, object recognition processing can also be performed in regions other than the specific region. In addition, because the image accuracy of regions other than the specific region is lower than that of the specific region, machine power can be concentrated mainly on the object recognition processing of the specific region, which can therefore be performed more accurately.

(7) In the program, information storage medium, and object recognition system of the present invention, when the region setting unit sets a plurality of specific regions, the object recognition processing unit may perform the object recognition processing for at least one of the specific regions. According to the present invention, when a plurality of specific regions are set, the object recognition processing is performed for at least one specific region, so the object can be recognized more accurately in at least that specific region.

(8) In the program, information storage medium, and object recognition system of the present invention, the region setting unit may assign a priority to each specific region, and the object recognition processing unit may perform the object recognition processing in each specific region while making the cycle of the object recognition processing in a low-priority specific region longer than the cycle in a high-priority specific region. According to the present invention, when objects are recognized in a plurality of specific regions, the cycle in a low-priority specific region is longer than the cycle in a high-priority specific region, so more machine power can be devoted to specific regions with higher priority and the object recognition processing can be performed efficiently.
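The priority-dependent cycle described above could be sketched as follows. This is only one possible reading: the rule that the region ranked n-th is processed every n × `base_cycle` frames, the convention that a smaller priority number means a higher priority, and the names `assign_cycles` and `regions_due` are all hypothetical and do not come from the patent text.

```python
from typing import Dict, List


def assign_cycles(priorities: Dict[str, int], base_cycle: int = 1) -> Dict[str, int]:
    """Map each specific region to a recognition cycle measured in frames.

    Assumptions: a smaller priority number means a higher priority, and the
    region ranked n-th (1-based) gets a cycle of n * base_cycle frames, so
    lower-priority regions are processed less often.
    """
    ordered = sorted(priorities, key=lambda name: priorities[name])
    return {name: base_cycle * (rank + 1) for rank, name in enumerate(ordered)}


def regions_due(cycles: Dict[str, int], frame: int) -> List[str]:
    """Return the regions whose recognition processing should run on this frame."""
    return sorted(name for name, cycle in cycles.items() if frame % cycle == 0)
```

For three regions A, B, and C in descending priority and `base_cycle=1`, region A would be processed every frame, B every second frame, and C every third frame, so most of the processing budget goes to A.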

(9) In the program, information storage medium, and object recognition system of the present invention, the region setting unit may assign a priority to each specific region, and the object recognition processing unit may perform the object recognition processing in each specific region while making the image accuracy of a low-priority specific region lower than the image accuracy of a high-priority specific region. According to the present invention, when objects are recognized in a plurality of specific regions, the image accuracy of a low-priority specific region is lower than that of a high-priority specific region, so more machine power can be devoted to specific regions with higher priority and the object recognition processing can be performed efficiently.

(10) In the program, information storage medium, and object recognition system of the present invention, the region setting unit may assign a priority to each specific region, and the object recognition processing unit may perform the object recognition processing in a high-priority specific region without performing it in a low-priority specific region. According to the present invention, because the object recognition processing is skipped in the low-priority specific region and performed in the high-priority specific region, the object recognition processing can be performed more efficiently.

(11) In the program, information storage medium, and object recognition system of the present invention, the region setting unit may set the specific region in the input image based on the color information and the depth value of each pixel of the input image. According to the present invention, because the specific region is set based on both the color information and the depth value of each pixel, the specific region in which the object should be recognized can be set more accurately.

(12) In the program, information storage medium, and object recognition system of the present invention, the region setting unit may set the specific region in the input image based on those depth values, among the depth values of the pixels of the input image, that are equal to or greater than a predetermined value. According to the present invention, because the specific region is set based on the depth values that are equal to or greater than the predetermined value, the specific region in which the object should be recognized can be set more accurately.

(13) In the program, information storage medium, and object recognition system of the present invention, the object recognition processing unit may perform the object recognition processing in the specific region based on bone information. According to the present invention, because the object recognition processing in the specific region is performed based on bone information, the object can be recognized more precisely.

(14) In the program, information storage medium, and object recognition system of the present invention, the object recognition processing unit may perform processing for recognizing a change in the image in the specific region at a predetermined timing. According to the present invention, a change in the image can be recognized with simple processing.

FIG. 1 is an overview diagram of the first object recognition system of the present embodiment.
FIG. 2 is a functional block diagram of the first object recognition system of the present embodiment.
FIG. 3 is an explanatory diagram of motion vectors.
FIGS. 4(A) to 4(G) are explanatory diagrams of motion vectors.
FIG. 5 is a flowchart for calculating the direction of a motion vector.
FIGS. 6(A), 6(B), and 6(C) are explanatory diagrams of motion vectors.
FIG. 7 is a diagram for explaining a method of setting a specific region.
FIG. 8 is an explanatory diagram for explaining a method of performing object recognition processing in a specific region.
FIG. 9 is a diagram for explaining the cycle of object recognition processing.
FIGS. 10(A), 10(B), and 10(C) are explanatory diagrams of motion vectors.
FIG. 11 is a diagram for explaining a method of setting a specific region.
FIG. 12 is an explanatory diagram for explaining a method of performing object recognition processing in a specific region.
FIG. 13 is a diagram for explaining the cycle of object recognition processing.
FIG. 14 is a diagram for explaining a method of setting a specific region.
FIG. 15 is an explanatory diagram regarding the priority of each specific region.
FIG. 16 is an explanatory diagram for explaining a method of performing object recognition processing in a specific region.
FIG. 17 is a diagram for explaining the cycle of object recognition processing.
FIG. 18 is a flowchart of the first embodiment.
FIG. 19 is an overview diagram of the second object recognition system of the present embodiment.
FIG. 20 is a functional block diagram of the second object recognition system of the present embodiment.
FIGS. 21(A) and 21(B) are explanatory diagrams of input images input to the input unit of the present embodiment.
FIG. 22 is an explanatory diagram of the depth sensor of the second embodiment.
FIG. 23 is an explanatory diagram of the depth sensor of the second embodiment.
FIG. 24 is an explanatory diagram showing the positional relationship between the position of an object in real space and the input unit in the second embodiment.
FIGS. 25(A) to 25(D) are explanatory diagrams of object recognition processing.
FIG. 26 is a diagram for explaining a method of setting a specific region.
FIG. 27 is an explanatory diagram for explaining a method of performing object recognition processing in a specific region.
FIGS. 28(A) and 28(B) are explanatory diagrams for explaining a method of performing object recognition processing in a specific region.
FIG. 29 is a diagram for explaining a method of setting a specific region.
FIG. 30 is an explanatory diagram for explaining a method of performing object recognition processing in a specific region.
FIG. 31 is an explanatory diagram for explaining a method of performing object recognition processing in a specific region.
FIG. 32 is a flowchart of the second embodiment.
FIGS. 33(A) and 33(B) are explanatory diagrams of object motion recognition processing.

The present embodiment is described below. The embodiment described below does not unduly limit the content of the present invention as set out in the claims. In addition, not all of the configurations described in the present embodiment are necessarily essential constituent features of the present invention.

1. First Embodiment
1-1. First Object Recognition System
FIG. 1 is a schematic external view of the first object recognition system (first game system, first image generation system) in the first embodiment. The first object recognition system of the present embodiment includes a display unit 90 that displays a game image, an object recognition device 10 (game machine) that performs object recognition processing, game processing, and the like, and an input unit 20. As shown in FIG. 1, the input unit 20 is arranged around the display unit 90 (display screen 91), at a position associated with the display unit 90. For example, the input unit 20 may be arranged below the display unit 90 or above the display unit 90.

The object recognition device 10 analyzes the input image acquired from the input unit 20, which includes a stationary RGB camera (imaging unit) 21, and recognizes an object in the input image. For example, as shown in FIG. 1, the object recognition device 10 compares an input image in which the player P is captured as the subject with the recognition pattern for “person” (an example of an object), and thereby determines whether or not “person” can be recognized in the input image. When “person” has been recognized, the object recognition device 10 performs processing for recognizing the movement and gestures of the “person”. This makes a variety of game processing possible. The first embodiment describes a processing example of the first object recognition system using this input unit 20.

1-2. Configuration
FIG. 2 is an example of a functional block diagram of the first object recognition system. The first object recognition system need not include all of the units in FIG. 2, and some of them may be omitted.

The first object recognition system includes the object recognition device 10, the input unit 20, the display unit 90, and a speaker 92. The input unit 20 includes the RGB camera (imaging unit) 21, a processing unit 22, and a storage unit 23.

The RGB camera (imaging unit) 21 uses an optical system such as a lens to form an image of the light coming from an object on the light-receiving plane of an image sensor, photoelectrically converts the brightness of that image into amounts of electric charge, and sequentially reads these out and converts them into electrical signals. It then outputs the resulting RGB (color) image, which is an example of an input image, to the storage unit 23. The RGB camera 21 performs this output to the storage unit 23 at a predetermined cycle (for example, every 1/60 second). The processing unit 22 performs processing such as transmitting the RGB image captured by the RGB camera 21 to the object recognition device 10. The storage unit 23 sequentially stores the RGB images output by the RGB camera 21.

Next, the object recognition device 10 of the present embodiment is described. The object recognition device 10 of the present embodiment includes a storage unit 170, a processing unit 100, an information storage medium 180, and a communication unit 196.

The storage unit 170 includes a main storage unit 171, a drawing buffer 172, a recognition pattern storage unit 173, an input image storage unit 174, and a difference image storage unit 175. The main storage unit 171 is the work area of the processing unit 100, and the drawing buffer 172 is a storage area for storing images drawn by the image generation unit 120.

The recognition pattern storage unit 173 is a storage area for storing patterns and templates prepared in advance for identifying objects, and stores the patterns in association with the respective objects. For example, visual features or the pixel values themselves are stored in the recognition pattern storage unit 173 as recognition patterns.

The recognition pattern storage unit 173 may be a storage area built as a database. For example, one or more patterns may be stored in association with each object, such as “person”, “hand”, “foot”, and “arm”.

The input image storage unit 174 is a storage area for storing the input images acquired by the input unit 20 at a predetermined cycle, for use in object recognition processing and motion vector calculation processing. The difference image storage unit 175 is a storage area for storing, for use in motion vector calculation processing, the difference pixel values obtained by taking the difference between the pixel values of two images captured at different points in time.
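A minimal sketch of how such difference pixel values might be computed is shown below. The text only says that a difference is taken between the pixel values of two images, so the use of the absolute difference, the single-channel (grayscale) pixel values, and the function name `difference_image` are assumptions for illustration.

```python
from typing import List


def difference_image(prev: List[List[int]], curr: List[List[int]]) -> List[List[int]]:
    """Return the per-pixel absolute difference between two same-sized frames.

    Assumptions: both frames are single-channel images stored as lists of rows,
    and the absolute value of the difference is what gets stored.
    """
    if len(prev) != len(curr) or any(len(p) != len(c) for p, c in zip(prev, curr)):
        raise ValueError("frames must have the same dimensions")
    return [
        [abs(c - p) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]
```

Pixels that did not change between the two frames come out as 0, so non-zero entries mark candidate locations of motion.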

The processing unit 100 performs the various processes of the present embodiment based on data read from the program stored in the information storage medium 180. That is, the information storage medium 180 stores a program for causing a computer to function as each unit of the present embodiment (a program for causing a computer to execute the processing of each unit).

The communication unit 196 can communicate with other game machines via a network (the Internet). Its functions can be realized by hardware such as various processors, a communication ASIC, or a network interface card, or by a program.

The program for causing a computer to function as each unit of the present embodiment may be distributed from a storage unit or an information storage medium of a server to the information storage medium 180 (or the storage unit 170) via a network. The use of such an information storage medium of a server is also within the scope of the present invention.

The processing unit 100 (processor) performs object recognition processing, game processing, image generation processing, or sound control processing based on information acquired from the input unit 20, the program loaded from the information storage medium 180 into the storage unit 170, and the like.

In particular, the processing unit 100 of the first object recognition system functions as an acquisition unit 110, a calculation unit 111, a region setting unit 112, an object recognition processing unit 113, a game calculation unit 114, an image generation unit 120, and a sound control unit 130.

The acquisition unit 110 performs processing for acquiring the input image (RGB image) captured by the RGB camera (imaging unit) 21.

The calculation unit 111 calculates a motion vector for each pixel of the input image based on two input images captured at different points in time.

The region setting unit 112 sets a specific region in the input image based on the motion vector of each pixel of the input image. The region setting unit 112 may set a plurality of specific regions. When a plurality of specific regions are set, a priority (priority order information) is assigned (set) to each specific region, that is, priority is assigned per specific region. The priority is information used so that the object recognition processing of a specific region with a higher priority takes precedence over that of a specific region with a lower priority.

The region setting unit 112 may also set the specific region in the input image based on the motion vector and the color information of each pixel of the input image. In addition, the region setting unit 112 may set the specific region in the input image based on those motion vectors, among the motion vectors of the pixels of the input image, whose magnitude is equal to or greater than a predetermined value.
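One way to picture region setting from motion vectors whose magnitude is at least a predetermined value is the sketch below. The representation of motion vectors as `(dx, dy)` tuples, the rectangular bounding box, and the names `moving_pixel_mask` and `bounding_box` are assumptions for illustration and are not specified in the text.

```python
import math
from typing import List, Optional, Tuple

Vector = Tuple[float, float]      # (dx, dy) motion of one pixel (assumed form)
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), inclusive


def moving_pixel_mask(vectors: List[List[Vector]], min_magnitude: float) -> List[List[bool]]:
    """Mark the pixels whose motion vector magnitude is >= min_magnitude."""
    return [
        [math.hypot(dx, dy) >= min_magnitude for dx, dy in row]
        for row in vectors
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Box]:
    """Return the smallest rectangle containing every marked pixel, or None."""
    points = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

Filtering by magnitude discards small vectors, such as those caused by sensor noise, before the region is formed, so the resulting region tends to cover only the parts of the image that are actually moving.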

The object recognition processing unit 113 performs object recognition processing for recognizing an object. Here, the object recognition processing includes at least one of processing for recognizing the object itself, processing for recognizing the movement of the object, and processing for recognizing a gesture (shape, pose) of the object.

For example, based on pixel information of each pixel of the input image (at least one of the pixel's position coordinates, color information, and motion vector), the object recognition processing unit 113 determines whether a “person” can be recognized using the “person” recognition pattern stored in the recognition pattern storage unit 173.

Specifically, a recognition pattern for the shape (outline, silhouette) of a “person” is prepared, and it is determined whether the shape of the motion region indicated by the motion vectors in the input image is the shape of a “person”. If it is, the “person” is determined to have been recognized. If a “person” cannot be recognized, the recognition pattern of the next object (for example, a “hand”) is used to determine whether that object can be recognized. The input image is collated against successive recognition patterns in this way until an object is recognized.
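
The pattern-by-pattern collation described above can be sketched as a simple loop. The function name `recognize` and the caller-supplied `matches` predicate are hypothetical; the text does not specify how a shape is compared with a recognition pattern.

```python
def recognize(motion_region, patterns, matches):
    """Try each recognition pattern in order and return the name of the first
    one that matches the motion region, or None if none matches.

    patterns: ordered list of (name, pattern) pairs, e.g. "person" then "hand".
    matches:  caller-supplied predicate matches(region, pattern) -> bool;
              the actual shape-matching method is not specified in the text.
    """
    for name, pattern in patterns:
        if matches(motion_region, pattern):
            return name
    return None
```

Ordering the list with “person” first reproduces the order given in the text; any other order is equally possible with this sketch.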

また、物体認識処理部１１３は、「人」を認識したと判定された場合には、入力画像の各画素の画素情報に基づいて、「人」の動きを認識する処理を行う。また、物体認識処理部１１３は、「人」を認識したと判定された場合には、入力画像の各画素の画素情報に基づいて、「人」のジェスチャーを認識する処理を行う。 Further, when it is determined that “person” is recognized, the object recognition processing unit 113 performs processing for recognizing the movement of “person” based on pixel information of each pixel of the input image. Further, when it is determined that the “person” is recognized, the object recognition processing unit 113 performs a process of recognizing the “person” gesture based on the pixel information of each pixel of the input image.

特に、本実施形態の物体認識処理部１１３は、特定領域において物体認識処理を行う。例えば、物体認識処理部１１３は、特定領域における物体認識処理の精度を特定領域以外の領域の物体認識処理の精度よりも上げて、特定領域において物体認識処理を行うようにしてもよい。 In particular, the object recognition processing unit 113 of the present embodiment performs object recognition processing in a specific area. For example, the object recognition processing unit 113 may perform the object recognition process in the specific area by increasing the accuracy of the object recognition process in the specific area higher than the accuracy of the object recognition process in the area other than the specific area.

より具体的に説明すると、物体認識処理部１１３は、特定領域において物体認識処理を行う周期を特定領域以外の領域の物体認識処理の周期よりも短くして、特定領域において物体認識処理を行う。また、物体認識処理部１１３は、特定領域の画像精度（精細度）を特定領域以外の領域の画像精度よりも上げて、特定領域において物体認識処理を行う。 More specifically, the object recognition processing unit 113 performs the object recognition process in the specific region by setting the cycle of performing the object recognition process in the specific region to be shorter than the cycle of the object recognition process in the region other than the specific region. Further, the object recognition processing unit 113 performs object recognition processing in the specific area by increasing the image accuracy (definition) of the specific area higher than the image accuracy of the area other than the specific area.

また、物体認識処理部１１３は、特定領域以外の領域において物体認識処理を行うと共に、特定領域以外の領域において物体認識処理を行う周期を、特定領域において物体認識処理を行う周期よりも長くする。また、物体認識処理部１１３は、特定領域以外の領域において物体認識処理を行うと共に、特定領域以外の領域の画像精度を特定領域の画像精度よりも低くする。 In addition, the object recognition processing unit 113 performs the object recognition process in an area other than the specific area, and makes the period for performing the object recognition process in the area other than the specific area longer than the period for performing the object recognition process in the specific area. In addition, the object recognition processing unit 113 performs object recognition processing in a region other than the specific region, and lowers the image accuracy of the region other than the specific region than the image accuracy of the specific region.

また、物体認識処理部１１３は、複数の（２以上の）特定領域を設定されている場合には、少なくとも１つの特定領域について物体認識処理を行う。 In addition, when a plurality of (two or more) specific areas are set, the object recognition processing unit 113 performs object recognition processing for at least one specific area.

When a plurality of specific areas are set, the object recognition processing unit 113 may perform the object recognition processing in each specific area while making the period of the object recognition processing in a low-priority specific area longer than the period in a high-priority specific area.

When a plurality of specific areas are set, the object recognition processing unit 113 may perform the object recognition processing in each specific area while making the image accuracy of a low-priority specific area lower than that of a high-priority specific area.

When a plurality of specific areas are set, the object recognition processing unit 113 may perform the object recognition processing in a high-priority specific area without performing it in a low-priority specific area.
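
One way to picture the priority-dependent periods is a per-frame schedule. The sketch below is an assumption-laden illustration: the function name `regions_to_process`, the representation of a period as a number of frames, and the use of `None` for a region that is skipped are not taken from the text.

```python
def regions_to_process(frame_index, regions):
    """Return the names of the specific regions to process on this frame.

    regions: list of (name, period) pairs, where period is the number of
    frames between recognition passes. A low-priority region is given a
    longer period than a high-priority one; period None means the region
    is skipped entirely.
    """
    due = []
    for name, period in regions:
        if period is not None and frame_index % period == 0:
            due.append(name)
    return due
```

With a period of 1 for a high-priority area and 4 for a low-priority one, the high-priority area is processed every frame and the low-priority area every fourth frame.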

また、物体認識処理部１１３は、ボーン情報に基づいて、前記特定領域における前記物体認識処理を行うようにしてもよい。 Further, the object recognition processing unit 113 may perform the object recognition processing in the specific region based on bone information.

The game calculation unit 114 performs various game calculations. The game calculations include a process for starting a game when a game start condition is satisfied, a process for advancing the game, a process for placing objects such as characters and maps, a process for displaying objects, a process for calculating game results, and a process for ending the game when a game end condition is satisfied.

For example, the game calculation unit 114 performs game processing based on input data from the input unit 20, a program, and the like. The game calculation unit 114 of the present embodiment performs game calculation processing based on, for example, the recognition result of the object recognition processing unit 113. That is, when the object recognition processing unit 113 recognizes a “person (player)”, it may also recognize the movement and gesture of that “person”, and the game calculation processing may be performed based on the movement (for example, the person moving left or right) or the gesture (a specific pose).

The processing unit 100 may also perform processing for arranging objects in a virtual space, processing for moving objects existing in the virtual space, and the like. For example, the processing unit 100 may arrange objects in a virtual space (a virtual three-dimensional space (object space) or a virtual two-dimensional space). For example, in addition to characters and pointing objects, display objects such as buildings, stadiums, cars, trees, pillars, walls, and maps (terrain) are arranged in the virtual space. Here, the virtual space is a virtual game space; in the case of a virtual three-dimensional space, for example, it is a space in which objects are arranged at three-dimensional coordinates (X, Y, Z), as in a world coordinate system or a virtual camera coordinate system.

For example, the processing unit 100 arranges an object (an object composed of primitives such as polygons, free-form surfaces, or subdivision surfaces) in the world coordinate system. It also determines, for example, the position and rotation angle (synonymous with orientation or direction) of the object in the world coordinate system, and arranges the object at that position (X, Y, Z) with that rotation angle (rotation angles about the X, Y, and Z axes). The processing unit 100 may also arrange a scaled object in the virtual space.

The processing unit 100 may also perform movement and motion calculations for objects in the virtual space. That is, based on input information received from the input unit, a program (movement and motion algorithm), various data (motion data), and the like, it moves an object in the virtual space or makes the object act (motion, animation). Specifically, the movement information of the object (movement speed, movement acceleration, position, orientation, and so on) and its motion information (the position or rotation angle of each part constituting the object) are obtained sequentially every frame (1/60 second). A frame is the unit of time in which object movement and motion processing and image generation processing are performed.
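
As a rough illustration of obtaining movement information once per frame, the sketch below advances position and velocity by one 1/60-second step. The integration scheme (semi-implicit Euler) and the names `FRAME_DT` and `step` are assumptions; the text does not state how the per-frame values are computed.

```python
FRAME_DT = 1.0 / 60.0  # one frame, matching the 1/60 second in the text

def step(position, velocity, acceleration, dt=FRAME_DT):
    """Advance an object by one frame using semi-implicit Euler integration.

    The velocity is updated first, and the updated velocity is then used to
    move the position. This scheme is an assumed choice for illustration.
    """
    new_velocity = tuple(v + a * dt for v, a in zip(velocity, acceleration))
    new_position = tuple(p + v * dt for p, v in zip(position, new_velocity))
    return new_position, new_velocity
```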

画像生成部１２０は、処理部１００で行われる種々の処理の結果に基づいて描画処理を行い、これにより画像を生成し、表示部９０に出力する。例えば、本実施形態の画像生成部１２０は、基準開始タイミングと基準判定期間とを指示する画像を生成する。 The image generation unit 120 performs drawing processing based on the results of various processes performed by the processing unit 100, thereby generating an image and outputting it to the display unit 90. For example, the image generation unit 120 of the present embodiment generates an image that indicates the reference start timing and the reference determination period.

Object data (model data) including the vertex data of each vertex of an object (model), such as vertex position coordinates, texture coordinates, color data, normal vectors, and α values, is input to the image generation unit 120, and vertex processing (shading by a vertex shader) is performed based on the vertex data included in the input object data. When the vertex processing is performed, vertex generation processing (tessellation, curved surface division, polygon division) for subdividing polygons may be performed as necessary.

In the vertex processing, vertex movement processing and geometry processing such as coordinate transformation (for example, world coordinate transformation and view transformation (camera coordinate transformation)), clipping, perspective transformation (projection transformation), and viewport transformation are performed according to a vertex processing program (vertex shader program, first shader program), and the vertex data given to the vertices constituting the object is changed (updated, adjusted) based on the results.

Rasterization (scan conversion) is then performed based on the vertex data after vertex processing, and the surfaces of the polygons (primitives) are associated with pixels. Following rasterization, pixel processing (shading by a pixel shader, fragment processing) is performed to draw the pixels constituting the image (the fragments constituting the display screen). In the pixel processing, various processes such as texture reading (texture mapping), setting or changing of color data, translucent composition, and anti-aliasing are performed according to a pixel processing program (pixel shader program, second shader program) to determine the final drawing color of the pixels constituting the image, and the drawing color of the perspective-transformed object is output (drawn) to the image buffer 172 (a buffer that can store image information per pixel; VRAM, rendering target). That is, the pixel processing is per-pixel processing that sets or changes image information (color, normal, luminance, α value, and so on) pixel by pixel. An image seen from the virtual camera (a given viewpoint) in the object space is thereby generated. When there are a plurality of virtual cameras (viewpoints), the image can be generated so that the images seen from the respective virtual cameras are displayed as divided images on one screen.

The vertex processing and pixel processing are realized by hardware that makes polygon (primitive) drawing programmable through shader programs written in a shading language, that is, by so-called programmable shaders (vertex shaders and pixel shaders). Because per-vertex and per-pixel processing are programmable, programmable shaders allow a high degree of freedom in the drawing process and can greatly improve expressiveness compared with fixed drawing processing by conventional hardware.

そして画像生成部１２０は、オブジェクトを描画する際に、ジオメトリ処理、テクスチャマッピング、隠面消去処理、αブレンディング等を行う。 The image generation unit 120 performs geometry processing, texture mapping, hidden surface removal processing, α blending, and the like when drawing an object.

In the geometry processing, processes such as coordinate transformation, clipping, perspective projection transformation, and light source calculation are performed on the object. The object data after geometry processing (after perspective projection transformation), namely the position coordinates of the object's vertices, texture coordinates, color data (luminance data), normal vectors, α values, and so on, is stored in the storage unit 170.

Texture mapping is processing for mapping a texture (texel values) stored in the storage unit 170 onto an object. Specifically, the texture (surface properties such as color (RGB) and α value) is read from the storage unit 170 using the texture coordinates and the like set (assigned) to the vertices of the object. The texture, which is a two-dimensional image, is then mapped onto the object. In doing so, processing for associating pixels with texels is performed, and bilinear interpolation or the like is performed as texel interpolation.

For hidden surface removal, the Z-buffer method (depth comparison method, Z test) can be used, which employs a Z buffer (depth buffer) storing the Z values (depth information) of drawing pixels. That is, when a drawing pixel corresponding to a primitive of an object is drawn, the Z value stored in the Z buffer is referenced. The referenced Z value is compared with the Z value of the primitive at the drawing pixel, and if the Z value at the drawing pixel is nearer to the virtual camera (for example, a smaller Z value), the pixel is drawn and the Z value in the Z buffer is updated to the new Z value.
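
The Z test described above can be sketched in a few lines. The function name `draw_pixel` and the nested-list buffers are illustrative assumptions, and the sketch follows the text's example in treating a smaller Z value as nearer to the virtual camera.

```python
def draw_pixel(z_buffer, color_buffer, x, y, z, color):
    """Draw the pixel only if it is nearer than what is already stored.

    A smaller Z value is treated as nearer to the camera. Returns True if
    the pixel was drawn and the Z buffer was updated, False otherwise.
    """
    if z < z_buffer[y][x]:
        z_buffer[y][x] = z          # update the stored depth
        color_buffer[y][x] = color  # draw the pixel
        return True
    return False
```

Initializing every Z-buffer entry to `float("inf")` makes the first primitive drawn at each pixel always pass the test.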

α blending (α composition) is translucent composition processing based on an α value (A value), such as normal α blending, additive α blending, or subtractive α blending.

For example, in α blending, the drawing color C1 about to be drawn to the image buffer 172 (the overwriting color) and the drawing color C2 already drawn in the image buffer 172 (rendering target) (the underlying color) are linearly combined based on the α value. That is, the final drawing color C can be obtained by C = C1 * α + C2 * (1 − α).
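
The blending formula can be applied per color channel as in the sketch below. The function name `alpha_blend` and the tuple representation of a color are assumptions for illustration.

```python
def alpha_blend(c1, c2, alpha):
    """Blend the overwriting color c1 onto the underlying color c2.

    Applies C = C1 * alpha + C2 * (1 - alpha) to each channel, with alpha
    in the range 0.0 (only c2 visible) to 1.0 (only c1 visible).
    """
    return tuple(s * alpha + d * (1.0 - alpha) for s, d in zip(c1, c2))
```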

The α value is information that can be stored in association with each pixel (texel, dot), for example additional information other than color information. The α value can be used as mask information, translucency (equivalent to transparency or opacity), bump information, and so on.

音制御部１３０は、処理部１００で行われる種々の処理の結果に基づいて音処理を行い、ＢＧＭ、効果音、又は音声などのゲーム音を生成し、スピーカー９２に出力する。 The sound control unit 130 performs sound processing based on the results of various processes performed by the processing unit 100, generates a game sound such as BGM, sound effect, or sound, and outputs the game sound to the speaker 92.

なお、本実施形態の端末は、１人のプレーヤのみがプレイできるシングルプレーヤモード、或いは、複数のプレーヤがプレイできるマルチプレーヤモードでゲームプレイできるように制御してもよい。例えば、マルチプレーヤモードで制御する場合には、ネットワークを介して他の端末とデータを送受信してゲーム処理を行うようにしてもよいし、１つの端末が、複数の入力部からの入力情報に基づいて処理を行うようにしてもよい。 Note that the terminal according to the present embodiment may be controlled so that the game can be played in a single player mode in which only one player can play or in a multiplayer mode in which a plurality of players can play. For example, in the case of controlling in the multiplayer mode, game processing may be performed by transmitting / receiving data to / from other terminals via a network, or one terminal may receive input information from a plurality of input units. Processing may be performed based on this.

The information storage medium 180 (a computer-readable medium) stores programs, data, and the like, and its function can be realized by hardware such as an optical disc (CD, DVD), a magneto-optical disc (MO), a magnetic disk, a hard disk, a magnetic tape, or a memory (ROM).

表示部９０は、処理部１００により生成された画像を出力するものであり、その機能は、ＣＲＴディスプレイ、ＬＣＤ（液晶ディスプレイ）、ＯＥＬＤ（有機ＥＬディスプレイ）、ＰＤＰ（プラズマディスプレイパネル）、タッチパネル型ディスプレイ、或いはＨＭＤ（ヘッドマウントディスプレイ）などのハードウェアにより実現できる。 The display unit 90 outputs an image generated by the processing unit 100, and functions thereof are a CRT display, LCD (liquid crystal display), OELD (organic EL display), PDP (plasma display panel), touch panel display. Alternatively, it can be realized by hardware such as an HMD (head mounted display).

スピーカー９２は、音制御部１３０により再生する音を出力するものであり、その機能は、スピーカー、或いはヘッドフォンなどのハードウェアにより実現できる。なお、スピーカー９２は、表示部に備えられたスピーカーとしてもよい。例えば、テレビ（家庭用テレビジョン受像機）を表示部としている場合には、テレビのスピーカーとすることができる。 The speaker 92 outputs sound to be reproduced by the sound control unit 130, and its function can be realized by hardware such as a speaker or headphones. Note that the speaker 92 may be a speaker provided in the display unit. For example, when a television (household television receiver) is used as the display unit, a television speaker can be used.

なお、本実施形態は、物体認識装置１０の認識パターン記憶部１７３、入力画像記憶部１７４、差分画像記憶部１７５に記憶されるデータを、入力部２０の記憶部２３に記憶するようにし、本実施形態の算出部１１１、領域設定部１１２、物体認識処理部１１３の処理を、入力部２０の処理部２２が行うようにしてもよい。 In the present embodiment, the data stored in the recognition pattern storage unit 173, the input image storage unit 174, and the difference image storage unit 175 of the object recognition apparatus 10 are stored in the storage unit 23 of the input unit 20, The processing unit 22 of the input unit 20 may perform the processing of the calculation unit 111, the region setting unit 112, and the object recognition processing unit 113 of the embodiment.

1-3. Description of Motion Vector
In the present embodiment, an input image (RGB image) captured by the RGB camera 21 (imaging unit) is acquired. For example, as shown in FIG. 3, digitized input images F1, F2, F3, and F4 are acquired from the RGB camera 21 at a predetermined period (for example, a period of 1/60 second) matched to the drawing frame rate.

In the present embodiment, corresponding pixels are associated between two input images captured at different times, and a motion vector (movement vector, optical flow) indicating the amount of motion (movement amount) and the direction of motion (movement direction) is obtained. The motion vector V in FIG. 3 is the vector from the pixel P1 of the input image F1 toward the corresponding pixel P2 of the input image F2. In the present embodiment, a motion vector is obtained for each pixel of the input image.

In particular, to reduce the processing load, the present embodiment obtains the motion vector from a difference image formed by taking the difference in pixel values (luminance values, color values) between the input images F1 and F2. As an example, consider a region with pixel value “50” moving toward the lower right, as shown in FIG. 4A. In the present embodiment, as shown in FIG. 4B, the previous image (input image F1) and the current image (input image F2) are stored in the input image storage unit 174, the difference between the pixel values of the previous and current images is taken, and the resulting difference image shown in FIG. 4C is stored in the difference image storage unit 175. The motion amount and motion direction of each pixel are then obtained from the difference image. That is, the absolute value of the difference pixel value of each pixel is taken as the magnitude (motion amount) of that pixel's motion vector V, and an estimated direction along each orientation of the pixel (up, down, right, left) is obtained from the difference pixel values.
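
A minimal sketch of the difference image and the motion amount derived from it is given below. The sign convention (current image minus previous image) is an assumption; the text states only that a difference between the two images is taken. The function names and the nested-list image layout are likewise hypothetical.

```python
def difference_image(prev, curr):
    """Signed per-pixel difference between two same-sized grayscale images.

    Computed here as current minus previous, which is an assumed convention.
    """
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def motion_magnitude(diff):
    """Absolute difference pixel value, used as the motion amount per pixel."""
    return [[abs(d) for d in row] for row in diff]
```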

The method for obtaining the estimated direction is described in more detail with reference to the flowchart in FIG. 5. First, the pixel values of the two pixels adjacent to the pixel P(x₁, y₁) of the previous image along a specific orientation are compared (step S1). For example, as shown in FIG. 4D, the pixel value of the pixel (x₀, y₁) and that of the pixel (x₂, y₁), which adjoin the pixel P(x₁, y₁) of the previous image in the left-right direction, are compared.

Next, it is determined whether the compared pixel values are equal (step S2). If they are not equal, the process proceeds to step S3; if they are equal, the pixel P(x₁, y₁) is regarded as not moving and the process ends. In the example, the pixel value of the pixel (x₀, y₁) differs from that of the pixel (x₂, y₁), so the process proceeds to step S3.

The pixel with the smaller pixel value is then designated L, and the pixel with the larger pixel value is designated G (step S3). In the example of FIG. 4D, the pixel (x₀, y₁) is L and the pixel (x₂, y₁) is G.

Next, it is determined whether the difference pixel value of the difference image at the pixel P(x₁, y₁) is 0 (step S4). If it is not 0, the process proceeds to step S5; if it is 0, the pixel P(x₁, y₁) is regarded as not moving and the process ends. In the example of FIG. 4D, the difference pixel value at the pixel P(x₁, y₁) is “−40”, so the process proceeds to step S5.

Next, it is determined whether the difference pixel value is smaller than 0 (step S5). If it is, the process proceeds to step S6 and the direction P→G is selected (step S6). Otherwise, the process proceeds to step S7 and the direction P→L is selected (step S7). In the example of FIG. 4D, the difference pixel value at the pixel P(x₁, y₁) is “−40”, which is smaller than 0, so the process proceeds to step S6 and the rightward direction P→G is selected as the estimated direction. This completes the process.
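
Steps S1 to S7 for one axis can be sketched as a single function. The function name `estimate_direction` and the labels "a" and "b" for the two neighbours are hypothetical, and the neighbour values used in any call are illustrative, since the text does not give the actual pixel values of FIG. 4D.

```python
def estimate_direction(value_a, value_b, diff):
    """Estimate the motion of pixel P along one axis (steps S1 to S7).

    value_a, value_b: previous-image pixel values of the two neighbours of P
                      on the axis (for example, left and right).
    diff:             difference pixel value of the difference image at P.
    Returns "a" or "b" for the neighbour P is estimated to move toward,
    or None when P is regarded as not moving.
    """
    if value_a == value_b:   # S1, S2: neighbours equal, P regarded as still
        return None
    # S3: L is the neighbour with the smaller value, G the one with the larger
    lesser, greater = ("a", "b") if value_a < value_b else ("b", "a")
    if diff == 0:            # S4: no change at P, regarded as still
        return None
    # S5 to S7: a negative difference selects P->G, otherwise P->L
    return greater if diff < 0 else lesser
```

With a left neighbour smaller than the right neighbour and a difference pixel value of −40, the function returns "b", which corresponds to the rightward direction P→G selected in the text's example.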

FIG. 4E shows an example in which the estimated directions along the four orientations (up, down, right, left) have been obtained for all pixels. In the present embodiment, as shown in FIG. 4F, the sum of the estimated directions along the four orientations at each pixel may be used as the motion direction of that pixel P(x₁, y₁). Alternatively, as shown in FIG. 4G, the direction obtained by smoothing the estimated directions of each pixel together with its eight surrounding pixels may be used as the motion direction.
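
Summing the estimated directions at one pixel can be sketched as a vector sum of unit vectors. The mapping `UNIT`, the function name `motion_direction`, and the image-coordinate convention (y increasing downward) are assumptions; the text does not say how the directions are encoded.

```python
# Unit vectors for the four orientations, assuming y increases downward.
UNIT = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}

def motion_direction(estimates):
    """Sum the unit vectors of the estimated directions for one pixel.

    estimates: iterable of direction names from UNIT; None entries (no
    estimate for that orientation) are skipped.
    """
    dx, dy = 0, 0
    for name in estimates:
        if name is None:
            continue
        ux, uy = UNIT[name]
        dx += ux
        dy += uy
    return (dx, dy)
```

For instance, a rightward estimate combined with a downward estimate yields a lower-right motion direction.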

以上のようにして、本実施形態では、入力画像の各画素の動き量、及び動きベクトルを求めているが、いわゆる勾配法やブロックマッチング法によって求めてもよい。なお、本実施形態では、ＲＧＢ画像を入力画像としているが、輝度値を有するグレースケール画像を入力画像として用いてもよい。 As described above, in this embodiment, the motion amount and motion vector of each pixel of the input image are obtained, but may be obtained by a so-called gradient method or block matching method. In the present embodiment, an RGB image is used as an input image, but a grayscale image having a luminance value may be used as an input image.

1-4. Object Recognition Processing
In the present embodiment, object recognition processing for recognizing an object in an image captured by the RGB camera is performed using recognition patterns stored in advance in the storage unit. Here, the object recognition processing includes at least one of processing for recognizing the object itself, processing for recognizing the movement of the object, and processing for recognizing a gesture (shape, pose) of the object.

例えば、入力画像の各画素の画素情報（画素の位置座標、画素の色情報、画素の動きベクトルの少なくとも１つ）に基づいて、認識パターン記憶部１７３に格納されている「人」の認識パターンを用いて、「人」を認識できるか否かを判断する。 For example, a recognition pattern of “person” stored in the recognition pattern storage unit 173 based on pixel information (at least one of pixel position coordinates, pixel color information, and pixel motion vector) of each pixel of the input image. Is used to determine whether or not “person” can be recognized.

具体的には、差分画像において差分画素値が２０以上の画素の領域、或いは、差分画像において差分画素値が０より大きい値の画素で区切られる領域を動き領域として設定し、「人」の形（形状）の認識パターンを用いて、動き領域の形が認識パターンで示される「人」の形と適合（一致）するか否かを判断する。そして、「人」の形と適合する場合には、「人」を認識したと判定する処理を行う。一方、「人」を認識できない場合には、次の物体（例えば「手」）の認識パターンを用いて、次の物体を認識できるか否かを判断する。そして、物体を認識できるまで、入力画像と次の認識パターンとを照合する処理を行う。そして、結果的に動き領域において物体を認識できない場合には、次の動き領域を特定して、次の動き領域において物体を認識する処理を行う。 Specifically, an area of a pixel having a difference pixel value of 20 or more in the difference image or an area delimited by pixels having a difference pixel value greater than 0 in the difference image is set as a motion area, Using the (shape) recognition pattern, it is determined whether or not the shape of the motion region matches (matches) the shape of the “person” indicated by the recognition pattern. If it matches the shape of “person”, a process of determining that “person” has been recognized is performed. On the other hand, when “person” cannot be recognized, it is determined whether or not the next object can be recognized using the recognition pattern of the next object (for example, “hand”). Then, the process of collating the input image with the next recognition pattern is performed until the object can be recognized. As a result, when the object cannot be recognized in the motion region, the next motion region is specified and the object is recognized in the next motion region.
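
The first way of setting the motion region, keeping pixels whose difference pixel value is 20 or more, can be sketched as a boolean mask. Comparing the absolute value is an assumption made here so that negative differences also count as motion; the function name `motion_mask` is hypothetical.

```python
def motion_mask(diff, threshold=20):
    """Mark the pixels belonging to the motion region.

    diff is a difference image as nested lists. A pixel is marked True when
    the absolute difference pixel value is at least the threshold. Using
    the absolute value is an assumed reading of the text.
    """
    return [[abs(d) >= threshold for d in row] for row in diff]
```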

In the present embodiment, the motion region may be specified from the difference pixel values of a difference image between two images taken (A) every frame (1/60-second interval), (B) every 2 frames (1/30-second interval), or (C) every 10 frames (1/6-second interval).

なお、（Ａ）（Ｂ）（Ｃ）で特定した動き領域の平均的な領域を、動き領域として特定してもよい。例えば、１秒間の間に毎フレームでの差分画像の差分画素値の平均値Ａと、同じ１秒間の間に１／３０秒間隔での差分画像の差分画素値の平均値Ｂと、同じ１秒間の間に１／６秒間隔での差分画像の差分画素値の平均値Ｃの合計を、３で割った値を用いて、動き領域を設定するようにしてもよい。 Note that an average of the motion regions identified in (A), (B), and (C) may be used as the motion region. For example, the motion region may be set using the value obtained by dividing by 3 the sum of three averages taken over the same one-second window: the average value A of the difference pixel values of the per-frame difference images, the average value B of the difference pixel values of the difference images taken at 1/30-second intervals, and the average value C of the difference pixel values of the difference images taken at 1/6-second intervals.
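
As a rough sketch of the (A + B + C) / 3 combination, the following computes the three averages from one window of frames. It assumes a 60 fps capture (so steps of 1, 2, and 10 frames correspond to 1/60, 1/30, and 1/6 second), frames flattened into lists of pixel values, and non-overlapping frame pairs; none of these details are stated in the text.

```python
def mean_abs_diff(frames, step):
    """Mean absolute pixel difference between frames that are `step` apart."""
    total = 0
    count = 0
    for i in range(step, len(frames), step):
        earlier, later = frames[i - step], frames[i]
        total += sum(abs(b - a) for a, b in zip(earlier, later))
        count += len(later)
    return total / count if count else 0.0


def combined_motion_value(frames):
    """Average of the three interval-specific averages A, B, and C."""
    a = mean_abs_diff(frames, 1)    # every frame (1/60 s at 60 fps)
    b = mean_abs_diff(frames, 2)    # every 2 frames (1/30 s)
    c = mean_abs_diff(frames, 10)   # every 10 frames (1/6 s)
    return (a + b + c) / 3
```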

１−５．特定領域において物体を認識する処理 1-5. Processing for Recognizing an Object in a Specific Area
本実施形態では、特定領域を設定し、特定領域において物体認識処理を行う。このようにすれば、効率的に、かつ、正確に物体を認識する処理を行うことができる。また、特定領域以外の領域に別の物体が映りこんでいる場合に、その別の物体を誤って認識する事態を防止することができる。 In the present embodiment, a specific area is set, and object recognition processing is performed in the specific area. In this way, an object can be recognized efficiently and accurately. In addition, when another object appears in an area other than the specific area, erroneous recognition of that other object can be prevented.

まず、本実施形態では、入力画像の各画素の動きベクトルに基づいて、入力画像において特定領域を設定する。例えば、図６（Ａ）に示すように入力画像Ｆ１と、図６（Ｂ）に示す入力画像Ｆ２（入力画像Ｆ１から１／６０秒後に取得した入力画像Ｆ２）とに基づいて、図６（Ｃ）に示すような入力画像Ｆ１、Ｆ２間の各画素の動きベクトルが得られた場合、入力画像Ｆ１、Ｆ２間の動きベクトルの方向、大きさに基づいて、特定領域を設定する。言い換えると、入力画像Ｆ１、Ｆ２間の差分画像の差分画素値に基づいて特定領域を設定する。 First, in the present embodiment, a specific area is set in the input image based on the motion vector of each pixel of the input image. For example, when the per-pixel motion vectors between input images F1 and F2 shown in FIG. 6(C) are obtained from the input image F1 shown in FIG. 6(A) and the input image F2 shown in FIG. 6(B) (acquired 1/60 second after the input image F1), the specific area is set based on the direction and magnitude of the motion vectors between F1 and F2. In other words, the specific area is set based on the difference pixel values of the difference image between the input images F1 and F2.

例えば、「所与の期間（２秒間）において、動きベクトルの大きさ（差分画素値）の平均値が２００以上であって、動きベクトルが左又は右方向を向く領域」をルール１とし、図７に示すように、ルール１に基づいて決定される領域を、特定領域Ａ１として設定する。例えば、本実施形態では、ルール１などの規則情報に基づいて決定される領域を包囲する矩形の領域を特定領域Ａ１として設定する。なお、特定領域を設定するルールは記憶部７０に記憶されている。 For example, rule 1 is “a region where the average value of motion vector magnitude (difference pixel value) is 200 or more and the motion vector faces left or right in a given period (2 seconds)”. As shown in FIG. 7, the area determined based on the rule 1 is set as the specific area A1. For example, in the present embodiment, a rectangular area surrounding an area determined based on rule information such as rule 1 is set as the specific area A1. Note that the rules for setting the specific area are stored in the storage unit 70.
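
Rule 1 and the enclosing rectangle can be illustrated with a short sketch. It assumes the per-pixel motion vectors have already been averaged over the 2-second period into `(dx, dy)` pairs, and it interprets "faces left or right" as the horizontal component dominating the vertical one; both are assumptions made for illustration, as are the function names.

```python
import math


def rule1_pixels(mean_vectors, min_magnitude=200.0):
    """Pixels whose averaged motion vector is large and mostly horizontal."""
    hits = []
    for r, row in enumerate(mean_vectors):
        for c, (dx, dy) in enumerate(row):
            if math.hypot(dx, dy) >= min_magnitude and abs(dx) > abs(dy):
                hits.append((r, c))
    return hits


def bounding_rectangle(pixels):
    """Smallest (top, left, bottom, right) rectangle enclosing the pixels."""
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (min(rows), min(cols), max(rows), max(cols))
```

The rectangle returned by `bounding_rectangle(rule1_pixels(...))` plays the role of the specific area A1.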

なお、本実施形態では、入力画像の各画素の動きベクトルと色情報とに基づいて、入力画像において特定領域を設定するようにしてもよい。例えば、「黄色系統のカラー値を有する画素であって、所与の期間（２秒間）において、動きベクトルの大きさ（差分画素値）の平均値が２００以上であって、動きベクトルが左又は右方向を向く領域」をルール１´とし、ルール１´に基づいて、特定領域Ａ１´を設定するようにしてもよい。 In the present embodiment, the specific area may also be set in the input image based on both the motion vector and the color information of each pixel of the input image. For example, rule 1′ may be defined as “a region of pixels having a yellowish color value in which, over a given period (2 seconds), the average motion vector magnitude (difference pixel value) is 200 or more and the motion vector points left or right”, and the specific area A1′ may be set based on rule 1′.

なお、一度、特定領域Ａ１を設定した場合、特定領域Ａ１の物体を認識する必要があるので、特定領域Ａ１を設定した時点から所与の期間（例えば６０秒間）、特定領域Ａ１を固定する。そして、所与の周期（例えば６０秒周期）で特定領域Ａ１を更新（変動、再設定）する。 Note that once the specific area A1 is set, it is necessary to recognize an object in the specific area A1, and thus the specific area A1 is fixed for a given period (for example, 60 seconds) from the time when the specific area A1 is set. Then, the specific area A1 is updated (varied, reset) at a given period (for example, a period of 60 seconds).

そして、本実施形態では、図７に示す特定領域Ａ１において物体認識処理を行う。つまり、設定された特定領域Ａ１において、動き領域Ｓ１を設定し、動き領域の形が認識パターンと一致するか否かを判断すればよい。例えば、特定領域Ａ１の各画素において、動きベクトルの大きさ（差分画素値が）所定値以上である画素の集合を動き領域Ｓ１とする。 In this embodiment, the object recognition process is performed in the specific area A1 shown in FIG. That is, in the set specific area A1, the motion area S1 is set, and it is determined whether or not the shape of the motion area matches the recognition pattern. For example, in each pixel of the specific area A1, a set of pixels having a motion vector magnitude (difference pixel value) equal to or greater than a predetermined value is defined as a motion area S1.

そして、本実施形態では、例えば、特定領域Ａ１の動き領域Ｓ１の形と、「人」の認識パターンとを比較し「人」であるか否かを判断する。図７の例では、「人」の認識パターンと一致しないと判断され、特定領域Ａ１の動き領域Ｓ１の形と「手」の認識パターンとを比較し「手」であるか否かを判断する。図７の例では、「手」の認識パターンと一致すると判断され、特定領域Ａ１において「手」を認識したと判定されることになる。 In the present embodiment, for example, the shape of the motion region S1 in the specific area A1 is compared with the “person” recognition pattern to determine whether it is a “person”. In the example of FIG. 7, the shape is determined not to match the “person” recognition pattern, so the shape of the motion region S1 is then compared with the “hand” recognition pattern to determine whether it is a “hand”. In the example of FIG. 7, the shape is determined to match the “hand” recognition pattern, and it is therefore determined that a “hand” has been recognized in the specific area A1.

また、本実施形態では、入力画像の一部の特定領域Ａ１にマシンパワーを注ぐことができるので、特定領域Ａ１の画像精度を上げて物体を認識するようにしてもよい。画像精度とは、画像の解像度（画像の総画素数）や、画像の量子化レベル（画素が取り得る範囲、階調）であり、解像度が高いほど、より精細に物体を認識することができる。また、量子化レベルが高いほど、画素値（差分画素値）の取り得る値域が広がり、より精細に動き領域を設定する精度を上げることができる。 In the present embodiment, machine power can be concentrated on the specific area A1, which is only part of the input image, so the image accuracy of the specific area A1 may be raised when recognizing an object. Image accuracy here means the resolution of the image (its total number of pixels) and its quantization level (the range of values a pixel can take, that is, its gradation). The higher the resolution, the more finely an object can be recognized. The higher the quantization level, the wider the range of values a pixel value (difference pixel value) can take, and the more precisely the motion region can be set.

例えば、図８に示すように、特定領域Ａ１の解像度を上げて、特定領域Ａ１において各画素の動きベクトルを算出し直し、画像精度を上げた各画素の動きベクトルに基づいて、動き領域Ｓ１´を設定するようにしてもよい。このようにすれば、例えば、「手」を判断された場合に、「手」のジェスチャー（形状）や、「手」の動きをより詳しく認識することができる。図８の例では、特定領域Ａ１の動き領域Ｓ１´の形と、「手」の「グー」、「チョキ」、「パー」の３つの認識パターンそれぞれの一致度を判断し、「パー」の認識パターンに最も一致すると判断される。 For example, as shown in FIG. 8, the resolution of the specific area A1 may be raised, the motion vector of each pixel in the specific area A1 recalculated, and the motion region S1′ set based on these higher-accuracy motion vectors. In this way, for example, when a “hand” has been identified, the gesture (shape) and movement of the “hand” can be recognized in more detail. In the example of FIG. 8, the degree of match between the shape of the motion region S1′ in the specific area A1 and each of three “hand” recognition patterns, “rock” (gū), “scissors” (choki), and “paper” (pā), is evaluated, and the shape is determined to match the “paper” pattern best.
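
Choosing the best of the three “hand” patterns in FIG. 8 (rock, scissors, paper) amounts to scoring the motion region against each pattern and keeping the highest score. The text does not say how the degree of match is computed, so the sketch below substitutes a Jaccard overlap between pixel sets purely as a stand-in; the pattern shapes in the usage example are hypothetical.

```python
def overlap_score(region, pattern):
    """Jaccard overlap between two sets of (row, col) pixels (an assumed metric)."""
    union = region | pattern
    return len(region & pattern) / len(union) if union else 0.0


def best_gesture(region, patterns):
    """Name of the pattern with the highest overlap with the motion region."""
    return max(patterns, key=lambda name: overlap_score(region, patterns[name]))
```

For example, with `patterns = {"rock": {(0, 0)}, "scissors": {(0, 0), (0, 1)}, "paper": {(0, 0), (0, 1), (1, 0), (1, 1)}}`, a region covering three of the four "paper" pixels scores highest against "paper".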

また、本実施形態では、図９に示すように、特定領域Ａ１において物体認識処理を行う周期を短くするようにしてもよい。例えば、特定領域Ａ１を設定する前において、１／６秒周期で取得した２つの入力画像の差分画像を求めていた場合、特定領域Ａ１を設定したｔ１０時点以後は、１／６０秒周期で取得した２つの入力画像の差分画像を求めるようにする。つまり、１／６０秒間隔で入力画像上の特定領域Ａ１の差分画像を求めるようにする。このようにすれば、より詳細に物体の動きを認識することができる。 Further, in the present embodiment, as shown in FIG. 9, the cycle for performing the object recognition process in the specific area A1 may be shortened. For example, if a difference image between two input images acquired with a 1/6 second period is obtained before setting the specific area A1, it is acquired with a 1/60 second period after time t10 when the specific area A1 is set. A difference image between the two input images is obtained. That is, the difference image of the specific area A1 on the input image is obtained at 1/60 second intervals. In this way, the movement of the object can be recognized in more detail.

以上のように、本実施形態では、特定領域Ａ１にマシンパワーを注ぐことができるので、物体について詳細に物体認識処理を行うことができる。また、本実施形態では、特定領域について画像精度を上げ、さらに認識周期を短くすることによって物体の誤認識を軽減することができる、という効果もある。 As described above, in the present embodiment, machine power can be poured into the specific area A1, so that object recognition processing can be performed in detail for an object. In addition, in the present embodiment, there is an effect that the erroneous recognition of the object can be reduced by increasing the image accuracy for the specific region and further shortening the recognition cycle.

例えば、図１０（Ａ）に示す入力画像Ｆ１０と、図１０（Ｂ）に示す入力画像Ｆ１１との差分画像に基づき、図１０（Ｃ）に基づく動きベクトルが得られ、かかる場合において、図１１に示すように、ルール１に基づいて特定領域Ｂ１が設定されたとする。ここで、画像精度を上げない場合や、認識周期を短くしない場合は、特定領域Ｂ１の動き領域Ｓ２の形が「鳥」ではなく「手」であると判断されるおそれがある。 For example, based on the difference image between the input image F10 shown in FIG. 10A and the input image F11 shown in FIG. 10B, a motion vector based on FIG. 10C is obtained. Suppose that the specific area B1 is set based on the rule 1 as shown in FIG. Here, when the image accuracy is not increased or when the recognition cycle is not shortened, it may be determined that the shape of the motion region S2 of the specific region B1 is not “bird” but “hand”.

しかし、特定領域Ｂ１では、図１０（Ａ）、（Ｂ）に示すように、実際は鳥が飛んでいるので、「鳥」と判断される方が自然である。そこで本実施形態では、図１２に示すように、特定領域Ｂ１について画像精度を上げ、また、認識周期を短くし、詳細に正しく物体認識処理を行うようにする。つまり、特定領域Ｂ１の解像度を上げて、動きベクトルを算出しなおし、特定領域Ｂ１において動き領域Ｓ２´を特定する。そして動き領域Ｓ２´の形が、「手」の認識パターンを一致せず、「鳥」の認識パターンと一致すると判断され、結果的に、特定領域Ｂ１において「鳥」を認識することができる。 However, in the specific area B1, as shown in FIGS. 10A and 10B, since a bird actually flies, it is natural to determine that it is a “bird”. Therefore, in the present embodiment, as shown in FIG. 12, the image accuracy is increased for the specific region B1, the recognition cycle is shortened, and the object recognition process is performed correctly in detail. That is, the resolution of the specific area B1 is increased, the motion vector is recalculated, and the motion area S2 ′ is specified in the specific area B1. Then, it is determined that the shape of the motion region S2 ′ does not match the recognition pattern of “hand” but matches the recognition pattern of “bird”, and as a result, “bird” can be recognized in the specific region B1.

１−６．特定領域と特定領域以外の領域との関係 1-6. Relationship between the Specific Area and Areas Other than the Specific Area
本実施形態では、図７の特定領域Ａ１以外の領域において物体認識処理を行うようにしてもよい。特定領域以外の領域において物体認識処理を行う場合には、特定領域以外の領域の画像精度を特定領域の画像精度よりも低くする。このようにすれば、特定領域Ａ１よりも認識精度は劣るが、特定領域Ａ１以外に物体が存在する場合には、その物体を認識することができる。 In the present embodiment, the object recognition process may be performed in an area other than the specific area A1 in FIG. 7. When performing object recognition processing in an area other than the specific area, the image accuracy of that area is set lower than the image accuracy of the specific area. In this way, although the recognition accuracy is lower than in the specific area A1, an object that exists outside the specific area A1 can still be recognized.

本実施形態では、特定領域Ａ１以外の領域の画像精度を特定領域の画像精度よりも低くするようにしてもよい。また、特定領域以外の領域において物体認識処理を行う周期を、特定領域Ａ１において物体認識処理を行う周期よりも長くする。例えば、図１３に示すように、特定領域Ａ１以外の領域においては、１／６秒の周期で物体認識処理を行い、特定領域Ａ１においては、１／６０秒の周期で物体認識処理を行う。このようにすれば、マシンパワー（コンピュータの総合的な処理能力）を主に特定領域Ａ１の物体認識処理に注力することができ、特定領域Ａ１の物体認識処理をより正確に行うことができる。 In the present embodiment, the image accuracy of the region other than the specific region A1 may be made lower than the image accuracy of the specific region. Further, the cycle for performing the object recognition process in the region other than the specific region is set longer than the cycle for performing the object recognition process in the specific region A1. For example, as shown in FIG. 13, the object recognition process is performed at a period of 1/6 second in the area other than the specific area A1, and the object recognition process is performed at a period of 1/60 second in the specific area A1. In this way, the machine power (total processing capability of the computer) can be mainly focused on the object recognition process in the specific area A1, and the object recognition process in the specific area A1 can be performed more accurately.
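
The two recognition periods in FIG. 13 can be expressed as a simple per-frame schedule. The sketch assumes a 60 fps frame counter, so a period of 1 frame corresponds to 1/60 second and a period of 10 frames to 1/6 second; the function and label names are illustrative only.

```python
def regions_to_process(frame_index, inside_period=1, outside_period=10):
    """Which regions get object recognition on this frame.

    inside_period:  frames between recognitions inside the specific area.
    outside_period: frames between recognitions outside the specific area.
    """
    regions = []
    if frame_index % inside_period == 0:
        regions.append("specific_area")
    if frame_index % outside_period == 0:
        regions.append("outside")
    return regions
```

Over one second (60 frames) this runs recognition 60 times inside the specific area and 6 times outside it, which is where the saving in processing load comes from.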

１−７．複数の特定領域 1-7. Multiple Specific Areas
本実施形態では、複数の特定領域を設定するようにしてもよい。例えば、「所与の期間（２秒間）において、動きベクトルの大きさ（差分画素値）の平均値が２００以上であって、動きベクトルが左又は右方向を向く領域」をルール１とし、「所与の期間（２秒間）において、動きベクトルの大きさ（差分画素値）の平均値が１００以上であって、動きベクトルが左又は右方向を向く領域」をルール２とした場合、図７に示すように、ルール１に基づいて決定される領域を、特定領域Ａ１として設定すると共に、図１４に示すように、ルール２に基づいて決定される領域を、特定領域Ａ２として設定する。なお、本実施形態では、３つ以上の特定領域を設定してもよい。 In the present embodiment, multiple specific areas may be set. For example, suppose rule 1 is “a region in which, over a given period (2 seconds), the average motion vector magnitude (difference pixel value) is 200 or more and the motion vector points left or right”, and rule 2 is “a region in which, over a given period (2 seconds), the average motion vector magnitude (difference pixel value) is 100 or more and the motion vector points left or right”. In that case, the region determined based on rule 1 is set as the specific area A1, as shown in FIG. 7, and the region determined based on rule 2 is set as the specific area A2, as shown in FIG. 14. In the present embodiment, three or more specific areas may be set.

そして、少なくとも１つの特定領域について物体認識処理を行う。例えば、ルール１、ルール２に基づいて設定された特定領域Ａ１、Ａ２のいずれか一方について物体認識処理を行うようにしてもよいし、特定領域Ａ１、Ａ２の両方について物体認識処理を行うようにしてもよい。例えば、特定領域Ａ２において物体認識処理を行う場合には、特定領域Ａ２の動き領域Ｓ３の形と、「人」の認識パターンとを比較し「人」であるか否かを判断する。図１４の例では、「人」の認識パターンと一致すると判断される。 Object recognition processing is then performed for at least one specific area. For example, the object recognition process may be performed for only one of the specific areas A1 and A2 set based on rule 1 and rule 2, or for both of them. For example, when the object recognition process is performed in the specific area A2, the shape of the motion region S3 in the specific area A2 is compared with the “person” recognition pattern to determine whether it is a “person”. In the example of FIG. 14, the shape is determined to match the “person” recognition pattern.

特に、本実施形態では、複数の特定領域を設定した場合には、各特定領域に優先度を設定する（付与する）。例えば、図１５に示すように、各ルールのＩＤに対応づけて（各領域に対応づけて）、優先度を設定する。本実施形態では、動きベクトルの大きさに従って優先度を決める。例えば、特定領域上の各画素の動きベクトルの大きさ（差分画素値）の平均値を算出し、平均値が高いほど優位になるように優先度を設定するようにしてもよい。 In particular, in the present embodiment, when a plurality of specific areas are set, a priority is set (given) to each specific area. For example, as shown in FIG. 15, the priority is set in association with the ID of each rule (in association with each area). In this embodiment, the priority is determined according to the magnitude of the motion vector. For example, the average value of the motion vector magnitude (difference pixel value) of each pixel on the specific region may be calculated, and the priority may be set so that the higher the average value, the more dominant.
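
Ranking specific areas by the average motion vector magnitude of their pixels can be sketched as follows. The input layout (a dict from an area ID such as "A1" to a non-empty list of per-pixel magnitudes) and the convention that rank 1 is the highest priority are assumptions made for illustration.

```python
def assign_priorities(area_magnitudes):
    """Map each area ID to a rank, where 1 is the highest priority.

    area_magnitudes: dict of area ID -> non-empty list of per-pixel motion
    magnitudes. A higher mean magnitude gives a higher priority.
    """
    means = {area: sum(values) / len(values)
             for area, values in area_magnitudes.items()}
    ordered = sorted(means, key=means.get, reverse=True)
    return {area: rank for rank, area in enumerate(ordered, start=1)}
```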

そして、本実施形態では、優先度に基づいて、物体認識処理を行う。例えば、優先度の低い特定領域の画像精度を、優先度の高い特定領域の画像精度よりも低くするようにしてもよい。つまり、図１６に示すように、特定領域Ａ２について解像度を上げると共に、優先度の低い特定領域Ａ２の解像度を、優先度の高い特定領域Ａ１の解像度よりも低くする。このようにすれば、優先度の高い特定領域Ａ１について、マシンパワーを主に注力させて詳細に物体を認識することができ、優先度の低いものについて処理を簡素にし、効率よく特定領域Ａ１、Ａ２の物体認識処理を行うことができる。 In this embodiment, object recognition processing is performed based on the priority. For example, the image accuracy of a specific region with a low priority may be set lower than the image accuracy of a specific region with a high priority. That is, as shown in FIG. 16, the resolution of the specific area A2 is increased, and the resolution of the specific area A2 having a low priority is set lower than the resolution of the specific area A1 having a high priority. In this way, it is possible to recognize the object in detail by focusing mainly on the machine power for the specific area A1 with high priority, simplify the processing for the low priority area, and efficiently specify the specific area A1, The object recognition process of A2 can be performed.

なお、図１６に示すように、特定領域Ａ２について解像度を上げた場合には、解像度を上げた特定領域Ａ２において動きベクトルを算出し直し、特定領域Ａ２の動き領域Ｓ３´の形と、「人」の認識パターンとを比較し「人」であるか否かを判断するようにしてもよい。図１６の例では、「人」の認識パターンと一致すると判断される。 As shown in FIG. 16, when the resolution of the specific area A2 is raised, the motion vectors may be recalculated in the higher-resolution specific area A2, and the shape of the motion region S3′ in the specific area A2 may be compared with the “person” recognition pattern to determine whether it is a “person”. In the example of FIG. 16, the shape is determined to match the “person” recognition pattern.

同様に、例えば、優先度の低い特定領域において物体認識処理を行う周期を、優先度の高い特定領域において物体認識処理を行う周期よりも長くする。つまり、図１７に示すように、優先度の低い特定領域Ａ２において物体認識処理を行う周期を、優先度の高い特定領域Ａ１において物体認識処理を行う周期よりも長くする。このようにすれば、優先度の高い特定領域Ａ１についてより詳細に物体を認識することができる。 Similarly, for example, the period for performing the object recognition process in the specific area with low priority is set longer than the period for performing the object recognition process in the specific area with high priority. That is, as shown in FIG. 17, the period for performing the object recognition process in the specific area A2 with low priority is set longer than the period for performing the object recognition process in the specific area A1 with high priority. In this way, the object can be recognized in more detail with respect to the specific area A1 having a high priority.

また、本実施形態では、複数の特定領域を設定されている場合には、優先度の低い特定領域において物体認識処理を行わずに、優先度の高い特定領域において物体認識処理を行うようにしてもよい。例えば、優先度の低い特定領域Ａ２において物体認識処理を行わずに、優先度の高い特定領域Ａ１において物体認識処理を行うようにしてもよい。 In the present embodiment, when multiple specific areas are set, the object recognition process may be performed only in the high-priority specific area and skipped in the low-priority specific area. For example, the object recognition process may be performed in the high-priority specific area A1 without being performed in the low-priority specific area A2.
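
The three priority-based options described in this section (lower resolution, longer period, or skipping recognition for low-priority areas) can be combined into one processing plan. The concrete scale factors and frame periods below are placeholder values, not figures from the text, and the dictionary layout is hypothetical.

```python
def processing_plan(priorities, skip_low_priority=False):
    """Per-area settings derived from priority ranks (1 = highest).

    Returns a dict of area ID -> settings dict, or None for an area whose
    recognition is skipped. The numeric values are illustrative only.
    """
    if not priorities:
        return {}
    top = min(priorities.values())
    plan = {}
    for area, rank in priorities.items():
        if rank == top:
            # Highest priority: finest resolution, shortest period.
            plan[area] = {"resolution_scale": 4, "period_frames": 1}
        elif skip_low_priority:
            plan[area] = None
        else:
            # Lower priority: coarser resolution, longer period.
            plan[area] = {"resolution_scale": 2, "period_frames": 10}
    return plan
```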

１−８．フローチャート 1-8. Flowchart
最後に、本実施形態の処理の流れについて図１８を用いて説明する。まず、２つの入力画像の差分画像に基づいて、各画素の動きベクトルを算出する（ステップＳ１０）。そして、動きベクトルに基づいて、入力画像上の特定領域を設定する（ステップＳ１１）。そして、特定領域の画像精度を上げるとともに、特定領域の物体認識処理を行う周期を短くする（ステップＳ１２）。そして、特定領域において物体を認識する処理を行う（ステップＳ１３）。以上で処理が終了する。 Finally, the processing flow of the present embodiment will be described with reference to FIG. 18. First, the motion vector of each pixel is calculated based on the difference image of two input images (step S10). Next, a specific area on the input image is set based on the motion vectors (step S11). Then, the image accuracy of the specific area is raised and the period at which object recognition processing is performed for the specific area is shortened (step S12). Object recognition processing is then performed in the specific area (step S13). This completes the processing.
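
Steps S10 to S13 can be strung together in a compact sketch. It simplifies heavily: the difference pixel value stands in for the motion vector magnitude, the specific area is the bounding rectangle of the moving pixels, step S12 is represented only by a settings record with placeholder values, and the `matches` callables are hypothetical stand-ins for recognition patterns.

```python
def run_once(prev, curr, patterns, threshold=20):
    """One pass through steps S10-S13 on two grayscale frames (nested lists)."""
    # S10: difference image (used here in place of per-pixel motion vectors).
    diff = [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]

    # S11: specific area = rectangle enclosing pixels with enough motion.
    moving = [(r, c) for r, row in enumerate(diff)
              for c, value in enumerate(row) if value >= threshold]
    if not moving:
        return None
    rows = [r for r, _ in moving]
    cols = [c for _, c in moving]
    area = (min(rows), min(cols), max(rows), max(cols))

    # S12: raise image accuracy and shorten the period (placeholder values).
    settings = {"resolution_scale": 2, "period_frames": 1}

    # S13: match the motion region against each pattern in order.
    region = set(moving)
    recognized = None
    for name, matches in patterns:
        if matches(region):
            recognized = name
            break
    return {"area": area, "settings": settings, "object": recognized}
```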

２．第２の実施形態 2. Second Embodiment
次に、本実施形態の第２の実施形態について説明する。なお、第２の実施形態は、第１の実施形態を応用したものである。第２の実施形態では、第１の実施形態と共通する点について説明を省略し、第１の実施形態と相違する点や第２の実施形態で追加した点等について説明する。 Next, a second embodiment will be described. The second embodiment is an application of the first embodiment. For the second embodiment, points in common with the first embodiment are not described again; only the points that differ from the first embodiment and the points added in the second embodiment are described.

２−１．第２の物体認識システム 2-1. Second Object Recognition System
図１９は、第２の実施形態における第２の物体認識システム（第１のゲームシステム、第１の画像生成システム）の概略外観図である。本実施形態の第２の物体認識システムは、ゲーム画像を表示させる表示部９０と、物体認識処理、ゲーム処理等を行う物体認識装置５０（ゲーム機）と、入力部６０とを含む。そして、図１９に示すように、表示部９０（表示画面９１）の周囲には、表示部９０と関連付けた位置に入力部６０が配置されている。例えば、入力部６０は、表示部９０の下部に配置してもよいし、表示部９０の上部に配置してもよい。 FIG. 19 is a schematic external view of the second object recognition system (first game system, first image generation system) in the second embodiment. The second object recognition system of the present embodiment includes a display unit 90 that displays a game image, an object recognition device 50 (game machine) that performs object recognition processing, game processing, and the like, and an input unit 60. As shown in FIG. 19, the input unit 60 is arranged around the display unit 90 (display screen 91), at a position associated with the display unit 90. For example, the input unit 60 may be placed below the display unit 90 or above it.

第２の物体認識システムは、プレーヤＰの手や体の動きを認識することができる入力部６０（センサの一例）を備えている。この入力部６０は、発光部６１０、深度センサ６２０、ＲＧＢカメラ６３０、音入力部６４０（マルチアレイマイクロフォン）とを備え、プレーヤＰ（物体）と非接触で、実空間におけるプレーヤＰの手や体の３次元の位置や、形の情報をとらえることができる。第２の実施形態では、この入力部６０を用いた第２の物体認識システムの処理例について説明する。 The second object recognition system includes an input unit 60 (an example of a sensor) that can recognize the movement of the player P's hand or body. The input unit 60 includes a light emitting unit 610, a depth sensor 620, an RGB camera 630, and a sound input unit 640 (multi-array microphone). The input unit 60 is non-contact with the player P (object), and the player P's hand and body in real space. 3D position and shape information can be captured. In the second embodiment, a processing example of the second object recognition system using the input unit 60 will be described.

２−２．構成 2-2. Configuration
図２０は、第２の物体認識システムの機能ブロック図の一例である。なお、第１の物体認識システムの構成例との共通する点については説明を省略し、第１の物体認識システムの構成例と相違する点について説明する。なお、第２の物体認識システムでは、図２０の各部を全て含む必要はなく、その一部を省略した構成としてもよい。 FIG. 20 is an example of a functional block diagram of the second object recognition system. Points in common with the configuration example of the first object recognition system are not described again; only the points that differ are described. The second object recognition system need not include all of the units in FIG. 20, and some of them may be omitted.

第２の物体認識システムは、物体認識装置５０と、入力部６０と、表示部９０、スピーカー９２を含む。 The second object recognition system includes an object recognition device 50, an input unit 60, a display unit 90, and a speaker 92.

入力部６０は、発光部６１０、深度センサ６２０、ＲＧＢカメラ（撮像部）６３０、音入力部６４０、処理部６５０、記憶部６６０によって構成されている。 The input unit 60 includes a light emitting unit 610, a depth sensor 620, an RGB camera (imaging unit) 630, a sound input unit 640, a processing unit 650, and a storage unit 660.

発光部６１０は、光を物体（プレーヤ、被写体）に照射する処理を行う。例えば、発光部６１０は、ＬＥＤなどの発光素子による赤外線などの光を対象の物体に照射する。 The light emitting unit 610 performs a process of irradiating an object (player, subject) with light. For example, the light emitting unit 610 irradiates a target object with light such as infrared rays from a light emitting element such as an LED.

深度センサ６２０は、物体から反射光を受光する受光部を有する。深度センサ６２０は、発光部６１０が発光しているときに受光した光量と、発光部６１０が発光していないときに受光した光量の差をとることによって、発光部６１０から照射される物体の反射光を取り出す処理を行う。つまり、深度センサ６２０は、図２１（Ａ）に示すように、発光部６１０から照射される物体の反射光を取り出した反射光画像（入力画像の一例）を、所定単位時間で（例えば、１／６０秒単位で）、記憶部６６０に出力する処理を行う。反射光画像は画素単位で、入力部６０から物体までの距離（深度値）を取得することができる。 The depth sensor 620 includes a light receiving unit that receives reflected light from an object. The depth sensor 620 extracts the light from the light emitting unit 610 that is reflected by the object by taking the difference between the amount of light received while the light emitting unit 610 is emitting and the amount received while it is not. That is, as shown in FIG. 21(A), the depth sensor 620 outputs a reflected light image (an example of an input image), obtained by extracting this reflected light, to the storage unit 660 at a predetermined unit time (for example, every 1/60 second). From the reflected light image, the distance from the input unit 60 to the object (depth value) can be obtained for each pixel.
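
The lit-minus-unlit subtraction that isolates the reflected light can be sketched in a few lines. The frames are assumed to be grids of received light amounts held as nested lists, and clamping negative differences to zero is an added assumption to absorb sensor noise; the conversion from reflected intensity to a depth value is not described in the text and is left out.

```python
def reflected_light_image(lit, unlit):
    """Per-pixel light attributable to the emitter: lit frame minus unlit frame.

    lit:   light amounts received while the light emitting unit is on.
    unlit: light amounts received while it is off (ambient light only).
    Negative differences are clamped to 0 (an assumption, not from the text).
    """
    return [[max(on - off, 0) for on, off in zip(lit_row, unlit_row)]
            for lit_row, unlit_row in zip(lit, unlit)]
```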

ＲＧＢカメラ（撮像部）６３０は、物体（プレーヤＰ）から発した光をレンズなどの光学系によって撮像素子の受光平面に結像させ、その像の光による明暗を電荷の量に光電変換し、それを順次読み出して電気信号に変換する。そして、ＲＧＢ化（カラー化）されたＲＧＢ画像（入力画像の一例）を記憶部６６０に出力する処理を行う。例えば、図２１（Ｂ）に示すようなＲＧＢ画像を生成する。ＲＧＢカメラ６３０は、所定単位時間で（例えば、１／６０秒単位で）、記憶部６６０に出力する処理を行う。 The RGB camera (imaging unit) 630 forms an image of light emitted from an object (player P) on a light receiving plane of an image sensor by an optical system such as a lens, and photoelectrically converts light and darkness of the image light into an amount of charge. It is sequentially read out and converted into an electric signal. Then, RGB (colored) RGB image (an example of an input image) is output to the storage unit 660. For example, an RGB image as shown in FIG. 21B is generated. The RGB camera 630 performs a process of outputting to the storage unit 660 in a predetermined unit time (for example, in units of 1/60 seconds).

なお、深度センサ６２０とＲＧＢカメラ６３０とは、共通の受光部から光を受光するようにしてもよい。かかる場合、２つの受光部を有していてもよい。また、深度センサ６２０用の受光部と、ＲＧＢカメラ６３０用の受光部とをそれぞれ異ならせてもよい。 The depth sensor 620 and the RGB camera 630 may receive light from a common light receiving unit. In this case, you may have two light-receiving parts. The light receiving unit for the depth sensor 620 and the light receiving unit for the RGB camera 630 may be different from each other.

音入力部６４０は、音声認識処理を行うものであり、例えばマルチアレイマイクロフォンとすることができる。 The sound input unit 640 performs voice recognition processing, and can be a multi-array microphone, for example.

処理部６５０は、発光部６１０に発光するタイミングを指示したり、深度センサ６２０によって出力された反射光画像や、ＲＧＢカメラ６３０で撮像されたＲＧＢ画像を、物体認識装置５０に送信する処理などを行う。 The processing unit 650 performs processing such as instructing the light emitting unit 610 when to emit light and transmitting the reflected light image output by the depth sensor 620 and the RGB image captured by the RGB camera 630 to the object recognition device 50.

記憶部６６０は、深度センサ６２０によって出力された反射光画像や、ＲＧＢカメラ６３０によって出力されたＲＧＢ画像を逐次記憶する。 The storage unit 660 sequentially stores the reflected light image output by the depth sensor 620 and the RGB image output by the RGB camera 630.

次に、本実施形態の物体認識装置５０について説明する。本実施形態の物体認識装置５０は、記憶部５７０、処理部５００、情報記憶媒体５８０、通信部５９６によって構成される。 Next, the object recognition device 50 of the present embodiment will be described. The object recognition device 50 of the present embodiment includes a storage unit 570, a processing unit 500, an information storage medium 580, and a communication unit 596.

認識パターン記憶部５７３には、物体を特定するために予め用意されたパターン、テンプレートを格納するための記憶領域であり、物体それぞれに対応づけてパターンが記憶されている。例えば、視覚的特徴や画素値そのものを認識パターンとして認識パターン記憶部５７３に格納される。 The recognition pattern storage unit 573 is a storage area for storing a pattern and a template prepared in advance for specifying an object, and stores a pattern in association with each object. For example, visual features and pixel values themselves are stored in the recognition pattern storage unit 573 as recognition patterns.

なお、認識パターン記憶部５７３は、データベースとして構築される記憶領域でもよい。例えば、「人」、「手」、「足」、「腕」などの各物体に対応づけて、１または複数の認識パターンを関連づけて記憶するようにしてもよい。 Note that the recognition pattern storage unit 573 may be a storage area constructed as a database. For example, one or more recognition patterns may be stored in association with each object such as “person”, “hand”, “foot”, and “arm”.

特に、第２の物体認識システムの認識パターン記憶部５７３には、物体を特定するための深度値情報を認識パターンとして認識パターン記憶部５７３に記憶するようにしてもよい。 In particular, the recognition pattern storage unit 573 of the second object recognition system may store depth value information for specifying an object in the recognition pattern storage unit 573 as a recognition pattern.

また、第２の物体認識システムの認識パターン記憶部５７３は、複数のボーン情報（スケルトン情報、骨格情報）を記憶するようにしてもよい。例えば、人を認識するためのボーン情報や、手のボーン、腕のボーン、足のボーンのように、人物を構成する部位単位でボーン情報を認識パターン記憶部５７３に記憶するようにしてもよい。なお、ボーン情報は、人体の３次元の関節位置及び３次元の関節の回転角度を仮想的に定義したものである。 Further, the recognition pattern storage unit 573 of the second object recognition system may store multiple pieces of bone information (skeleton information). For example, bone information for recognizing a whole person may be stored in the recognition pattern storage unit 573, as may bone information for each body part that makes up a person, such as hand bones, arm bones, and foot bones. Bone information is a virtual definition of the three-dimensional joint positions of the human body and the three-dimensional rotation angles of those joints.
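
One way to picture bone information stored per body part is a small record type holding each joint's 3D position and rotation. The class and field names, the units, and the sample values below are all hypothetical; the text only says that bone information defines 3D joint positions and 3D joint rotation angles.

```python
from dataclasses import dataclass, field


@dataclass
class Joint:
    name: str
    position: tuple  # (x, y, z) in an assumed sensor coordinate system
    rotation: tuple  # (rx, ry, rz) rotation angles, assumed to be in degrees


@dataclass
class BoneInfo:
    part: str                       # e.g. "person", "hand", "arm", "foot"
    joints: list = field(default_factory=list)


def bones_by_part(bone_infos):
    """Index stored bone information by the body part it describes."""
    return {info.part: info for info in bone_infos}
```

A store could then hold, for instance, `BoneInfo("hand", [Joint("wrist", (0.0, 0.0, 1.2), (0.0, 0.0, 0.0))])` alongside entries for other parts and look them up by part name.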

また、入力画像記憶部５７４は、物体認識処理、動きベクトル算出処理を行うために所定周期で入力部２０が取得した入力画像を格納するための記憶領域である。また、差分画像記憶部５７５は、動きベクトル算出処理を行うために、異なる時点で撮像された２つの画像の各画素値の差分をとった差分画素値を格納するための記憶領域である。 The input image storage unit 574 is a storage area for storing an input image acquired by the input unit 20 at a predetermined cycle in order to perform object recognition processing and motion vector calculation processing. Further, the difference image storage unit 575 is a storage area for storing a difference pixel value obtained by taking a difference between pixel values of two images captured at different times in order to perform a motion vector calculation process.

そして、処理部５００は、この情報記憶媒体５８０に格納されるプログラムから読み出されたデータに基づいて本実施形態の種々の処理を行う。即ち、情報記録媒体５８０には、本実施形態の各部としてコンピュータを機能させるためのプログラム（各部の処理をコンピュータに実行させるためのプログラム）が記憶される。 The processing unit 500 performs various processes according to the present embodiment based on data read from the program stored in the information storage medium 580. That is, the information recording medium 580 stores a program for causing a computer to function as each unit of the present embodiment (a program for causing a computer to execute processing of each unit).

通信部５９６は、ネットワーク（インターネット）を介して他のゲーム機と通信することができる。その機能は、各種プロセッサまたは通信用ＡＳＩＣ、ネットワーク・インタフェース・カードなどのハードウェアや、プログラムなどにより実現できる。 The communication unit 596 can communicate with other game machines via a network (Internet). The function can be realized by various processors, hardware such as a communication ASIC, a network interface card, or a program.

なお、本実施形態の各部としてコンピュータを機能させるためのプログラムは、サーバが有する、記憶部、情報記憶媒体からネットワークを介して情報記憶媒体５８０（または、記憶部５７０）に配信するようにしてもよい。このようなサーバの情報記憶媒体の使用も本発明の範囲に含まれる。 Note that a program for causing a computer to function as each unit of the present embodiment may be distributed from a storage unit or an information storage medium included in the server to the information storage medium 580 (or the storage unit 570) via a network. Good. Use of such server information storage media is also within the scope of the present invention.

処理部５００（プロセッサ）は、入力部６０から受信した情報や情報記憶媒体５８０から記憶部５７０に展開されたプログラム等に基づいて、ゲーム処理、画像生成処理、或いは音制御の処理を行う。 The processing unit 500 (processor) performs game processing, image generation processing, or sound control processing based on information received from the input unit 60, a program developed from the information storage medium 580 to the storage unit 570, and the like.

特に、第２の物体認識システムの処理部５００は、取得部５１０、算出部５１１、領域決定部５１２、物体認識処理部５１３、ゲーム演算部５１４、画像生成部５２０、音制御部５３０として機能する。 In particular, the processing unit 500 of the second object recognition system functions as an acquisition unit 510, a calculation unit 511, a region determination unit 512, an object recognition processing unit 513, a game calculation unit 514, an image generation unit 520, and a sound control unit 530. .

取得部５１０は、入力部５０からＲＧＢ画像、反射光画像などの入力画像を取得する処理を行う。つまり、取得部５１０は、深度カメラ６２０によって、発光部から照射された物体の反射光を受光することによって各画素の深度値を有する反射光画像（赤外線の反射結果）を取得する。 The acquisition unit 510 performs processing for acquiring an input image such as an RGB image or a reflected light image from the input unit 50. That is, the acquisition unit 510 acquires the reflected light image (infrared reflection result) having the depth value of each pixel by receiving the reflected light of the object irradiated from the light emitting unit by the depth camera 620.

算出部５１１は、異なる時点で取得した２つの入力画像（ＲＧＢ画像、反射光画像）に基づいて、入力画像の各画素の動きベクトルを算出する。例えば、算出部５１１は、算出部１１１と同じように、ＲＧＢカメラ６３０によって撮像された入力画像を取得し、異なる時点で撮像された２つの入力画像に基づいて、入力画像の各画素の動きベクトルを算出するようにしてもよい。 The calculation unit 511 calculates a motion vector of each pixel of the input image based on two input images (RGB image and reflected light image) acquired at different times. For example, as with the calculation unit 111, the calculation unit 511 acquires an input image captured by the RGB camera 630, and based on two input images captured at different points in time, the motion vector of each pixel of the input image May be calculated.

領域設定部５１２は、入力画像の各画素の深度値に基づいて、入力画像において特定領域を設定する。また、領域設定部５１２は、複数の特定領域を設定するようにしてもよい。複数の特定領域を設定した場合には、各特定領域に優先度を付与（設定）する（特定領域単位で優先度を付与する）。 The region setting unit 512 sets a specific region in the input image based on the depth value of each pixel of the input image. Further, the area setting unit 512 may set a plurality of specific areas. When a plurality of specific areas are set, priority is given (set) to each specific area (priority is given in units of specific areas).

また、領域設定部５１２は、入力画像の各画素の色情報と深度値とに基づいて、入力画像において特定領域を設定するようにしてもよい。また、領域設定部５１２は、入力画像の各画素の深度値のうち所定値以上である深度値に基づいて、入力画像において特定領域を設定するようにしてもよい。 The area setting unit 512 may set a specific area in the input image based on the color information and depth value of each pixel of the input image. The region setting unit 512 may set a specific region in the input image based on a depth value that is a predetermined value or more among the depth values of each pixel of the input image.

また、領域設定部５１２は、入力画像の各画素の動きベクトルと深度値とに基づいて、入力画像において特定領域を設定するようにしてもよい。 The region setting unit 512 may set a specific region in the input image based on the motion vector and depth value of each pixel of the input image.

物体認識処理部５１３は、物体を認識する物体認識処理を行う。ここで、物体を認識する物体認識処理とは、物体自体を認識する処理、物体の動きを認識する処理、物体のジェスチャー（形、ポーズ）を認識する処理の少なくとも１つを含む。 The object recognition processing unit 513 performs object recognition processing for recognizing an object. Here, the object recognition process for recognizing an object includes at least one of a process for recognizing the object itself, a process for recognizing the movement of the object, and a process for recognizing the gesture (shape or pose) of the object.

特に、物体認識処理部５１３は、入力画像の各画素の画素情報（画素の深度値、画素の位置座標、画素の色情報、画素の動きベクトルの少なくとも１つ）に基づいて、認識パターン記憶部５７３に格納されている「人」の認識パターンを用いて、「人」を認識できるか否かを判断する。 In particular, the object recognition processing unit 513 determines whether a “person” can be recognized, based on the pixel information of each pixel of the input image (at least one of the pixel's depth value, position coordinates, color information, and motion vector) and using the “person” recognition pattern stored in the recognition pattern storage unit 573.

具体的には、物体認識処理部５１３は、入力画像（例えば、反射光画像、ＲＧＢ画像）の各画素の画素情報に基づいて３次元の人のシルエット（３次元の形状）を切り出す処理を行う。例えば、反射光画像の各画素の深度値（輝度値）、ＲＧＢ画像のカラー値に基づいて、３次元の人のシルエットを切り出す処理を行う。 Specifically, the object recognition processing unit 513 performs a process of cutting out a three-dimensional human silhouette (three-dimensional shape) based on pixel information of each pixel of the input image (for example, reflected light image, RGB image). . For example, based on the depth value (luminance value) of each pixel of the reflected light image and the color value of the RGB image, a process of cutting out a three-dimensional human silhouette is performed.

そして、物体認識処理部５１３は、ボーン情報に基づいて、前記特定領域における前記物体認識処理を行う。例えば、物体認識処理部５１３は、認識パターン記憶部５７３で記憶されている複数のボーン（スケルトン、骨格）と、シルエットとを照合し、最もシルエットに合致するボーンを設定する。物体認識処理部５１３は、設定されたボーンの動きを演算する処理を行う。つまり、ボーンの動きを、プレーヤＰの動作とみなす処理を行なう。本実施形態では、フレーム毎に、ボーンを特定しプレーヤＰの動作を取得する処理を行っている。なお、本実施形態では、腕のボーン、足のボーンのように、人物を構成する部位単位で処理を行う場合には、部位単位で、抽出されたシルエットが複数のボーンのうちいずれと合致するかを判定し、部位の動きを、プレーヤの部位の動作とする処理を行う。 The object recognition processing unit 513 then performs the object recognition process in the specific region based on bone information. For example, the object recognition processing unit 513 compares the silhouette against a plurality of bones (skeletons) stored in the recognition pattern storage unit 573 and sets the bone that best matches the silhouette. The object recognition processing unit 513 then calculates the motion of the set bone; that is, it treats the motion of the bone as the motion of the player P. In the present embodiment, the bone is identified and the motion of the player P is acquired for every frame. When processing is performed per body part, as with arm bones and leg bones, the unit determines for each part which of the plurality of bones the extracted silhouette matches, and treats the motion of that part as the motion of the corresponding part of the player.

例えば、物体認識処理部５１３は、「人」のボーン情報を用意し、反射光画像から所定輝度値（所定深度値）以上の輝度値（深度値）を有する画素群領域の３次元のシルエットが、「人」のボーン情報に合致するか否かを判断する。そして、シルエットが「人」のボーン情報と合致する場合には、「人」を認識したと判定する処理を行う。一方、「人」を認識できない場合には、次の物体（例えば「手」）のボーン情報を用いて、次の物体を認識できるか否かを判断する。そして、物体（「手」）を認識できるまで、入力画像と次のボーン情報とを照合する処理を行う。 For example, the object recognition processing unit 513 prepares bone information for a “person” and determines whether the three-dimensional silhouette of the pixel group region in the reflected light image whose luminance values (depth values) are equal to or greater than a predetermined luminance value (predetermined depth value) matches the “person” bone information. If the silhouette matches the “person” bone information, the unit determines that a “person” has been recognized. If a “person” cannot be recognized, the unit uses the bone information of the next object (for example, a “hand”) to determine whether that object can be recognized. The unit continues collating the input image with the next bone information until an object (the “hand”) is recognized.

また、物体認識処理部５１３は、フレーム毎にシルエットに合致するボーン情報に基づいて、シルエットの動作を認識する処理を行う。例えば、「人」のボーン情報に基づいて、「人」を認識したと判定された場合には、当該ボーン情報に基づいて「人」の動作、ジェスチャーを認識する処理を行う。 In addition, the object recognition processing unit 513 performs processing for recognizing the motion of the silhouette based on bone information that matches the silhouette for each frame. For example, when it is determined that “person” is recognized based on the bone information of “person”, processing for recognizing the motion and gesture of “person” is performed based on the bone information.

また、本実施形態の物体認識処理部５１３は、特定領域において物体認識処理を行う。例えば、物体認識処理部５１３は、特定領域における物体認識処理の精度を特定領域以外の領域の物体認識処理の精度よりも上げて、特定領域において物体認識処理を行うようにしてもよい。 In addition, the object recognition processing unit 513 of the present embodiment performs object recognition processing in the specific area. For example, the object recognition processing unit 513 may perform the object recognition process in the specific area by increasing the accuracy of the object recognition process in the specific area higher than the accuracy of the object recognition process in the area other than the specific area.

より具体的に説明すると、物体認識処理部５１３は、特定領域において物体認識処理を行う周期を特定領域以外の領域の物体認識処理の周期よりも短くして、特定領域において物体認識処理を行う。また、物体認識処理部５１３は、特定領域の画像精度を特定領域以外の領域の画像精度よりも上げて、特定領域において物体認識処理を行う。 More specifically, the object recognition processing unit 513 performs the object recognition process in the specific region by setting the cycle of performing the object recognition process in the specific region to be shorter than the cycle of the object recognition process in the region other than the specific region. In addition, the object recognition processing unit 513 performs object recognition processing in the specific area by raising the image accuracy of the specific area higher than the image accuracy of the area other than the specific area.

また、物体認識処理部５１３は、特定領域以外の領域において物体認識処理を行うと共に、特定領域以外の領域において物体認識処理を行う周期を、特定領域において物体認識処理を行う周期よりも長くする。また、物体認識処理部５１３は、特定領域以外の領域において物体認識処理を行うと共に、特定領域以外の領域の画像精度を特定領域の画像精度よりも低くする。 In addition, the object recognition processing unit 513 performs the object recognition process in a region other than the specific region, and makes the cycle of performing the object recognition process in the region other than the specific region longer than the cycle of performing the object recognition process in the specific region. The object recognition processing unit 513 performs object recognition processing in a region other than the specific region, and lowers the image accuracy of the region other than the specific region than the image accuracy of the specific region.
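The two levers described above (recognition cycle and image accuracy) can be sketched roughly as follows. This is only an illustrative sketch: the names `downsample`, `is_due`, `SPECIFIC_REGION`, and `OTHER_REGION`, the period values, and the idea of lowering image accuracy by pixel subsampling are assumptions made for the example and do not come from the specification.

```python
def downsample(image, step):
    """Keep every `step`-th pixel in both directions (one crude way to lower image accuracy)."""
    return [row[::step] for row in image[::step]]


def is_due(frame_index, period):
    """True when recognition should run on this frame for a region with the given cycle."""
    return frame_index % period == 0


# Hypothetical settings: the specific region is processed every frame at full
# resolution; the remaining area every 4th frame at half resolution.
SPECIFIC_REGION = {"period": 1, "step": 1}
OTHER_REGION = {"period": 4, "step": 2}

frame = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coarse = downsample(frame, OTHER_REGION["step"])
other_frames = [i for i in range(8) if is_due(i, OTHER_REGION["period"])]
specific_frames = [i for i in range(8) if is_due(i, SPECIFIC_REGION["period"])]
```

With these assumed settings, the area outside the specific region is examined on a quarter of the frames and with a quarter of the pixels, which is where the reduction in processing load would come from.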

また、物体認識処理部５１３は、複数の（２以上の）特定領域を設定されている場合には、少なくとも１つの特定領域について物体認識処理を行う。 Further, when a plurality of (two or more) specific areas are set, the object recognition processing unit 513 performs object recognition processing for at least one specific area.

また、物体認識処理部５１３は、複数の特定領域を設定されている場合には、各特定領域において物体認識処理を行うと共に、優先度の低い特定領域において物体認識処理を行う周期を、優先度の高い特定領域において物体認識処理を行う周期よりも長くするようにしてもよい。 In addition, when a plurality of specific regions are set, the object recognition processing unit 513 may perform the object recognition process in each specific region while making the cycle of the object recognition process in a low-priority specific region longer than the cycle in a high-priority specific region.

また、物体認識処理部５１３は、複数の特定領域を設定されている場合には、各特定領域において物体認識処理を行うと共に、優先度の低い特定領域の画像精度を、優先度の高い特定領域の画像精度よりも低くするようにしてもよい。 In addition, when a plurality of specific areas are set, the object recognition processing unit 513 performs object recognition processing in each specific area, and sets the image accuracy of the specific area having a low priority to the specific area having a high priority. The image accuracy may be lower than the image accuracy.

また、物体認識処理部５１３は、複数の特定領域を設定されている場合には、優先度の低い特定領域において物体認識処理を行わずに、優先度の高い特定領域において物体認識処理を行うようにしてもよい。 Further, when a plurality of specific areas are set, the object recognition processing unit 513 performs the object recognition process in the specific area with a high priority without performing the object recognition process in the specific area with a low priority. It may be.
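The priority-based variations above (longer cycle, lower image accuracy, or no processing at all for low-priority regions) could be combined in a small planning function such as the one below. This is a hedged sketch: `plan_for_regions`, the `skip_below` cutoff, the convention that a larger number means a higher priority, and the specific mappings `2 ** rank` and `rank + 1` are all assumptions chosen for illustration.

```python
def plan_for_regions(regions, skip_below=None):
    """Assign a recognition cycle and sampling step to each specific region.

    regions: list of (name, priority); a larger priority value is assumed to
    mean a more important region. Regions whose priority is below
    `skip_below` are not processed at all.
    """
    ordered = sorted(regions, key=lambda region: region[1], reverse=True)
    plan = {}
    for rank, (name, priority) in enumerate(ordered):
        if skip_below is not None and priority < skip_below:
            continue  # low-priority region: no recognition is performed
        # Lower rank (higher priority) gets a shorter cycle and a finer step.
        plan[name] = {"period": 2 ** rank, "step": rank + 1}
    return plan


full_plan = plan_for_regions([("C2", 1), ("C1", 2)])
high_only = plan_for_regions([("C2", 1), ("C1", 2)], skip_below=2)
```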

また、物体認識処理部５１３は、所定タイミングで前記特定領域における画像の変化を認識する処理を行うようにしてもよい。 Further, the object recognition processing unit 513 may perform processing for recognizing a change in the image in the specific area at a predetermined timing.

ゲーム演算部５１４は、種々のゲーム演算を行う。ここでゲーム演算としては、ゲーム開始条件が満たされた場合にゲームを開始する処理、ゲームを進行させる処理、キャラクタやマップなどのオブジェクトを配置する処理、オブジェクトを表示する処理、ゲーム結果を演算する処理、或いはゲーム終了条件が満たされた場合にゲームを終了する処理などがある。 The game calculation unit 514 performs various game calculations. Here, the game calculation includes a process for starting a game when a game start condition is satisfied, a process for advancing the game, a process for placing an object such as a character or a map, a process for displaying an object, and a game result. There is a process or a process of ending a game when a game end condition is satisfied.

例えば、ゲーム演算部５１４は、入力部６０からの入力データやプログラムなどに基づいて、ゲーム処理を行う。本実施形態のゲーム演算部５１４は、例えば、物体認識処理部５１３の認識結果に基づいてゲーム演算処理を行う。つまり、物体認識処理部５１３において「人（プレーヤ）」を認識した場合には、物体認識処理部５１３が「人」の動きやジェスチャーを認識し、その「人」の動き（左右に人が動く動作）やジェスチャー（特定のポーズ）に基づいてゲーム演算処理を行うようにしてもよい。 For example, the game calculation unit 514 performs a game process based on input data from the input unit 60, a program, and the like. The game calculation unit 514 of the present embodiment performs a game calculation process based on the recognition result of the object recognition processing unit 513, for example. That is, when the object recognition processing unit 513 recognizes “person (player)”, the object recognition processing unit 513 recognizes the movement and gesture of “person” and moves the “person” (the person moves to the left and right). The game calculation process may be performed based on an operation) or a gesture (specific pose).

なお、処理部５００は、処理部１００と同じように、仮想空間にオブジェクトを配置する処理、仮想空間に存在するオブジェクトを移動させる処理などを行うようにしてもよい。 Note that, like the processing unit 100, the processing unit 500 may perform processing for arranging an object in the virtual space, processing for moving an object existing in the virtual space, and the like.

また、処理部５００は、処理部１００と同じように、仮想空間にあるオブジェクトの移動・動作演算を行うようにしてもよい。 Further, like the processing unit 100, the processing unit 500 may perform movement / motion calculation of an object in the virtual space.

画像生成部５２０は、画像生成部１２０と同じように、処理部５００で行われる種々の処理の結果に基づいて描画処理を行い、これにより画像を生成し、表示部９０に出力する。 Similar to the image generation unit 120, the image generation unit 520 performs drawing processing based on the results of various processes performed by the processing unit 500, thereby generating an image and outputting the image to the display unit 90.

音制御部５３０は、音制御部１３０と同じように、処理部５００で行われる種々の処理の結果に基づいて音処理を行い、ＢＧＭ、効果音、又は音声などのゲーム音を生成し、スピーカー９２に出力する。 Like the sound control unit 130, the sound control unit 530 performs sound processing based on the results of the various processes performed by the processing unit 500, generates game sounds such as BGM, sound effects, or voice, and outputs them to the speaker 92.

なお、本実施形態の物体認識システムは、１人のプレーヤのみがプレイできるシングルプレーヤモード、或いは、複数のプレーヤがプレイできるマルチプレーヤモードでゲームプレイできるように制御してもよい。例えば、マルチプレーヤモードで制御する場合には、ネットワークを介して他の端末とデータを送受信してゲーム処理を行うようにしてもよいし、１つの端末が、複数の入力部からの入力情報に基づいて処理を行うようにしてもよい。 Note that the object recognition system of the present embodiment may be controlled so that the game can be played in a single player mode in which only one player can play or in a multiplayer mode in which a plurality of players can play. For example, in the case of controlling in the multiplayer mode, game processing may be performed by transmitting / receiving data to / from other terminals via a network, or one terminal may receive input information from a plurality of input units. Processing may be performed based on this.

情報記憶媒体５８０（コンピュータにより読み取り可能な媒体）は、プログラムやデータなどを格納するものであり、その機能は、光ディスク（ＣＤ、ＤＶＤ）、光磁気ディスク（ＭＯ）、磁気ディスク、ハードディスク、磁気テープ、或いはメモリ（ＲＯＭ）などのハードウェアにより実現できる。 An information storage medium 580 (a computer-readable medium) stores programs, data, and the like, and functions as an optical disk (CD, DVD), a magneto-optical disk (MO), a magnetic disk, a hard disk, and a magnetic tape. Alternatively, it can be realized by hardware such as a memory (ROM).

なお、物体認識装置５０の認識パターン記憶部５７３、入力画像記憶部５７４、差分画像記憶部５７５に記憶されるデータを、入力部６０の記憶部６６０に記憶するようにし、本実施形態の算出部５１１、領域設定部５１２、物体認識処理部５１３の処理を、入力部６０の処理部６５０が行うようにしてもよい。 Note that the data stored in the recognition pattern storage unit 573, the input image storage unit 574, and the difference image storage unit 575 of the object recognition device 50 may instead be stored in the storage unit 660 of the input unit 60, and the processing of the calculation unit 511, the region setting unit 512, and the object recognition processing unit 513 of the present embodiment may be performed by the processing unit 650 of the input unit 60.

２−３．入力部の説明 2-3. Description of Input Unit
第２の物体認識システムの入力部６０は、深度センサ６２０と、ＲＧＢカメラ６３０とを備え、コントローラなどの入力機器を必要とせず、物体（プレーヤ、プレーヤの手など）を画像処理することにより入力を受け付けることができる。これにより、従来にはない様々なゲーム処理を行うことができる。まず、入力部６０の深度センサ６２０、ＲＧＢカメラ６３０について説明する。 The input unit 60 of the second object recognition system includes a depth sensor 620 and an RGB camera 630. It does not require an input device such as a controller and can accept input by performing image processing on an object (the player, the player's hand, and so on). This makes possible a variety of game processes not available before. First, the depth sensor 620 and the RGB camera 630 of the input unit 60 will be described.

２−３−１．深度センサ 2-3-1. Depth Sensor
本実施形態の深度センサ６２０について、図２１を用いて説明する。まず、図２１に示すように、入力部６０が備える発光部６１０は、タイミング信号にしたがって時間的に強度変動する光を発光する。発光される光は、光源の前方に位置するプレーヤＰ（物体の一例）に照射される。 The depth sensor 620 of the present embodiment will be described with reference to FIG. 21. First, as shown in FIG. 21, the light emitting unit 610 included in the input unit 60 emits light whose intensity varies over time in accordance with a timing signal. The emitted light illuminates the player P (an example of an object) located in front of the light source.

そして、深度センサ６２０は、発光部６１０が発光した光の反射光を受光する。つまり、深度センサ６２０は、反射光の空間的な強度分布を抽出した反射光画像を生成する。例えば、深度センサ６２０は、発光部６１０が発光しているときに受光した光量と、発光部６１０が発光していないときに受光した光量の差をとることによって、発光部６１０からの光の物体による反射光を取り出して反射光画像を得る。この反射光画像の各画素の値は、深度センサ６２０の入力部６０の位置ＧＰから物体までの距離（深度値）に対応する。なお、入力部６０の位置ＧＰは、深度センサ６２０の位置、深度センサ６２０が備える受光位置と同義である。 The depth sensor 620 receives the reflection of the light emitted by the light emitting unit 610. That is, the depth sensor 620 generates a reflected light image that captures the spatial intensity distribution of the reflected light. For example, the depth sensor 620 takes the difference between the amount of light received while the light emitting unit 610 is emitting and the amount received while it is not, thereby isolating the light from the light emitting unit 610 that was reflected by the object, and obtains the reflected light image. The value of each pixel of the reflected light image corresponds to the distance (depth value) from the position GP of the input unit 60, which contains the depth sensor 620, to the object. Note that the position GP of the input unit 60 is synonymous with the position of the depth sensor 620 and with the light receiving position of the depth sensor 620.
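The differencing step described in this paragraph can be illustrated with a minimal sketch. It assumes each frame is a plain list of rows of received-light amounts; the function name `reflected_light_image` and the clamping of negative differences to zero are assumptions made for the example, not details taken from the specification.

```python
def reflected_light_image(lit_frame, unlit_frame):
    """Subtract the emitter-off frame from the emitter-on frame, pixel by pixel.

    What remains is (approximately) only the light that originated from the
    emitter and was reflected by the object. Negative results, which can only
    come from noise, are clamped to zero.
    """
    return [
        [max(lit - unlit, 0) for lit, unlit in zip(lit_row, unlit_row)]
        for lit_row, unlit_row in zip(lit_frame, unlit_frame)
    ]


lit = [[120, 250], [90, 60]]    # received light while the emitter is on
unlit = [[100, 40], [95, 60]]   # received light while the emitter is off
image = reflected_light_image(lit, unlit)
```

In this toy input, the top-right pixel keeps a large value because most of its light came from the emitter, while pixels lit mainly by ambient light drop to near zero.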

例えば、図２２の例では、プレーヤＰの手の部分が入力部６０の位置ＧＰに最も近くにあるので、図２１（Ａ）に示すようなプレーヤＰの手を示す領域が、もっとも受光量多い部分（高輝度部分）となる反射光画像を得ることになる。 For example, in the example of FIG. 22, the hand of the player P is the part closest to the position GP of the input unit 60, so a reflected light image such as that shown in FIG. 21(A) is obtained, in which the region showing the hand of the player P is the portion with the largest amount of received light (the high-luminance portion).

本実施形態では、反射光画像の各画素について、輝度値（受光量、画素値）が所定値以上である画素を、入力部６０の位置ＧＰに近い画素として抽出する。例えば、反射光画像の階調が２５６階調であれば、所定値（例えば２００）以上の画素を、高輝度の部分として抽出する。 In the present embodiment, for each pixel of the reflected light image, a pixel having a luminance value (light reception amount, pixel value) that is equal to or greater than a predetermined value is extracted as a pixel close to the position GP of the input unit 60. For example, when the gradation of the reflected light image is 256 gradations, pixels having a predetermined value (for example, 200) or more are extracted as high-luminance portions.
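As a rough illustration of this extraction step, the following sketch collects the coordinates of every pixel at or above the threshold. The value 200 is the example given in the text for a 256-gradation image; the function name `extract_high_luminance` and the list-of-rows image representation are assumptions.

```python
HIGH_LUMINANCE_THRESHOLD = 200  # example value from the text (256 gradations)


def extract_high_luminance(image, threshold=HIGH_LUMINANCE_THRESHOLD):
    """Return (x, y) coordinates of pixels whose luminance is >= threshold."""
    return [
        (x, y)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if value >= threshold
    ]


reflected = [
    [10, 200],
    [255, 199],
]
near_pixels = extract_high_luminance(reflected)
```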

この深度センサで得られる反射光画像は、物体までの距離（深度値）に関係する。つまり、図２３に示すように、入力部６０の位置ＧＰから１メートル離れたところでプレーヤＰが位置する場合は、位置ＧＰから２メートル離れたところでプレーヤＰが位置する場合よりも、反射光画像の手の領域部分が高輝度となる（受光量が多くなる）。また、入力部６０の位置ＧＰから２メートル離れたところでプレーヤＰが位置する場合は、位置ＧＰから３メートル離れたところでプレーヤＰが位置する場合よりも、反射光画像の手の領域部分が高輝度となる（受光量が多くなる）。 The reflected light image obtained by this depth sensor is related to the distance (depth value) to the object. That is, as shown in FIG. 23, when the player P is located 1 meter from the position GP of the input unit 60, the hand region of the reflected light image has a higher luminance (more light is received) than when the player P is located 2 meters from the position GP. Likewise, when the player P is located 2 meters from the position GP, the hand region of the reflected light image has a higher luminance (more light is received) than when the player P is located 3 meters from the position GP.

このような原理に基づき、本実施形態では、反射光画像で高輝度部分として抽出された画素の輝度値に基づいて、実空間におけるプレーヤＰの位置を算出する。例えば、反射光画像のうち、輝度値が最も高い画素を特徴点とし、特徴点の輝度値に基づいて位置ＧＰからプレーヤＰまでの距離を算出する。なお、特徴点は、予め用意された形状パターンや動きベクトル等に基づいて特定される手の領域の重心画素としてもよい。なお、反射光画像において高輝度部分が広い場合には、高輝度部分が狭い場合よりも例えば物体が入力部の近くに存在すると判定することもできる。 Based on such a principle, in the present embodiment, the position of the player P in the real space is calculated based on the luminance value of the pixel extracted as a high luminance part in the reflected light image. For example, the pixel having the highest luminance value in the reflected light image is used as a feature point, and the distance from the position GP to the player P is calculated based on the luminance value of the feature point. Note that the feature point may be a barycentric pixel of a hand region specified based on a shape pattern or a motion vector prepared in advance. Note that when the high-luminance portion is wide in the reflected light image, it can be determined that the object is present near the input unit, for example, compared to when the high-luminance portion is narrow.

また、本実施形態では、反射光画像に基づいて、入力部６０を基準とする実空間における物体の位置を特定することができる。例えば、反射光画像の中心に、特徴点がある場合には、物体が入力部６０の光源の発射方向上に位置しているものと特定できる。また、特徴点が反射光画像の上部にある場合には、入力部６０を基準に物体が上部にあるものと特定できる。また、特徴点が反射光画像の下部にある場合には、入力部６０を基準に物体が下部にあるものと特定できる。また、特徴点が反射光画像の左部にある場合には、入力部６０を基準に（入力部の正面（光源側）からみて）物体が右部にあるものと特定できる。また、特徴点が反射光画像の右部にある場合には、入力部６０を基準に物体が（入力部の正面（光源側）からみて）左部にあるものと特定できる。このように、本実施形態では、反射光画像に基づいて、物体と入力部６０との位置関係を特定できる。 In the present embodiment, the position of the object in real space relative to the input unit 60 can also be specified based on the reflected light image. For example, when the feature point is at the center of the reflected light image, the object can be identified as lying in the emission direction of the light source of the input unit 60. When the feature point is in the upper part of the reflected light image, the object can be identified as being above the input unit 60. When the feature point is in the lower part of the reflected light image, the object can be identified as being below the input unit 60. When the feature point is in the left part of the reflected light image, the object can be identified as being on the right side relative to the input unit 60 (as viewed from the front, that is, the light source side, of the input unit). When the feature point is in the right part of the reflected light image, the object can be identified as being on the left side relative to the input unit 60 (as viewed from the front of the input unit). In this way, the positional relationship between the object and the input unit 60 can be specified based on the reflected light image.

また、本実施形態では、反射光画像に基づいて、実空間における物体の移動方向を特定することができる。例えば、反射光画像の中心に特徴点があり、当該特徴点の輝度値が高くなる場合、物体が入力部６０の光源方向に近づいているものと特定できる。また、特徴点が反射光画像の上部から下部に移動している場合には、入力部６０を基準に物体が上部から下部に移動しているものと特定できる。また、特徴点が反射光画像の左部から右部に移動している場合には、入力部６０を基準に物体が右部から左部に移動しているものと特定できる。このように、本実施形態では、反射光画像に基づいて、入力部６０を基準に物体の移動方向を特定できる。 In the present embodiment, the moving direction of the object in real space can also be specified based on the reflected light image. For example, when the feature point is at the center of the reflected light image and its luminance value is increasing, the object can be identified as approaching the light source of the input unit 60. When the feature point moves from the upper part to the lower part of the reflected light image, the object can be identified as moving from top to bottom relative to the input unit 60. When the feature point moves from the left part to the right part of the reflected light image, the object can be identified as moving from right to left relative to the input unit 60. In this way, the moving direction of the object relative to the input unit 60 can be specified based on the reflected light image.
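The direction rules above can be sketched as a small classifier over two successive feature points. This is only an illustration: the function name `movement_direction`, the `(x, y, luminance)` tuple layout, and the string labels are assumptions, and the opposite cases not spelled out in the text (bottom to top, left to right in real space, moving away) are filled in by symmetry.

```python
def movement_direction(prev, curr):
    """Classify real-space movement from two feature points (x, y, luminance).

    Image x grows to the right and image y grows downward. The horizontal
    direction is mirrored, following the text: a feature point moving from
    left to right in the image means the object moves from right to left.
    """
    px, py, pl = prev
    cx, cy, cl = curr

    def classify(delta, positive, negative):
        if delta > 0:
            return positive
        if delta < 0:
            return negative
        return "none"

    return {
        "horizontal": classify(cx - px, "right_to_left", "left_to_right"),
        "vertical": classify(cy - py, "top_to_bottom", "bottom_to_top"),
        # Rising luminance at the feature point is read as the object approaching.
        "depth": classify(cl - pl, "approaching", "receding"),
    }


swipe = movement_direction((10, 10, 180), (30, 10, 220))
drop = movement_direction((5, 5, 200), (5, 9, 200))
```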

なお、物体の反射光は、入力部６０の位置ＧＰから物体の距離が大きくなるにつれ大幅に減少する。例えば、反射光画像の１画素あたりの受光量は、物体までの距離の２乗に反比例して小さくなる。したがって、プレーヤＰが入力部６０から２０メートル程離れて位置する場合には、プレーヤＰからの反射光はほぼ無視できるくらいに受光量が小さくなり、プレーヤＰを特定できるような高輝度部分を抽出することができない。かかる場合には、入力がないものとして制御してもよい。また、高輝度部分を抽出することができない場合には、スピーカーから警告音を出力するようにしてもよい。 Note that the light reflected by the object decreases sharply as the distance from the position GP of the input unit 60 to the object increases. For example, the amount of light received per pixel of the reflected light image decreases in inverse proportion to the square of the distance to the object. Therefore, when the player P is located about 20 meters from the input unit 60, the amount of reflected light received from the player P becomes almost negligible, and no high-luminance portion that would identify the player P can be extracted. In such a case, the system may be controlled as if there were no input. When no high-luminance portion can be extracted, a warning sound may also be output from the speaker.
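Taken together with the feature-point description earlier in this section, the inverse-square relationship suggests one way a distance could be estimated from luminance. The sketch below is an assumption-laden illustration: it presumes a calibration pair (`ref_luminance` observed at `ref_distance_m`), ignores differences in surface reflectivity, and uses hypothetical function names. It inverts I = I_ref × (d_ref / d)² to get d = d_ref × √(I_ref / I).

```python
import math


def feature_point(image):
    """Return ((x, y), luminance) of the brightest pixel; the first one wins on ties."""
    best = None
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if best is None or value > best[1]:
                best = ((x, y), value)
    return best


def estimate_distance(luminance, ref_luminance=240.0, ref_distance_m=1.0):
    """Estimate distance in meters by inverting the inverse-square law.

    ref_luminance / ref_distance_m are hypothetical calibration values.
    Returns None when there is no usable signal, which a caller could treat
    as "no input".
    """
    if luminance <= 0:
        return None
    return ref_distance_m * math.sqrt(ref_luminance / luminance)


point, value = feature_point([[1, 5], [9, 9]])
near = estimate_distance(240)   # same luminance as the calibration point
far = estimate_distance(60)     # a quarter of the light: twice the distance
```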

２−３−２．ＲＧＢカメラの説明 2-3-2. Description of RGB Camera
本実施形態は、ＲＧＢカメラ（撮像部）６３０によりＲＧＢ画像を入力情報として取得する。ＲＧＢ画像は、反射光画像に対応しているため、物体の動きベクトル（移動ベクトル）や特定領域の設定処理、物体認識処理の精度を高めることができる。 In the present embodiment, an RGB image is acquired as input information by the RGB camera (imaging unit) 630. Because the RGB image corresponds to the reflected light image, it can improve the accuracy of the object's motion vector (movement vector), of the specific region setting process, and of the object recognition process.

第２の本実施形態のＲＧＢカメラ６３０は、ＲＧＢカメラ２１と同様な処理を行うことができる。つまり、第２の実施形態においても、図３に示すように、異なる時点で撮像された２つの入力画像において、同じ画素の対応付けを行い、その動き量（移動量）と動き方向（移動方向）とを示す動きベクトル（移動ベクトル、オプティカルフロー）を求めることができる。第２の実施形態においても、入力画像Ｆ１、Ｆ２との画素値（輝度値、カラー値）の差分をとった差分画像に基づいて動きベクトルを求めているようにしてもよい。 The RGB camera 630 of the second embodiment can perform the same processing as the RGB camera 21. That is, in the second embodiment as well, as shown in FIG. 3, corresponding pixels are matched between two input images captured at different times, and a motion vector (movement vector, optical flow) indicating the amount and direction of their movement can be obtained. In the second embodiment as well, the motion vector may be obtained from a difference image formed by taking the difference in pixel values (luminance values, color values) between the input images F1 and F2.

また、本実施形態では、深度センサ６２０によって、入力部６０からの物体までの距離（深度値）を特定でき、反射光画像やＲＧＢ画像の２次元平面上において高輝度部分の特徴点の位置座標（Ｘ、Ｙ）、特徴点の動きベクトルを抽出できる。したがって、入力部６０から物体までの距離（Ｚ）と、反射光画像及びＲＧＢ画像の位置座標（Ｘ、Ｙ）とに基づいて、実空間における入力部６０を基準とする物体の位置Ｑを特定できる。したがって、本実施形態では、図２４に示すように、反射光画像に基づいて算出された、入力部６０の位置ＧＰから物体の位置Ｑまでの距離Ｌに基づいて、特定領域の設定処理、物体認識処理、ゲーム演算を行うことができる。 In the present embodiment, the depth sensor 620 can specify the distance (depth value) from the input unit 60 to the object, and the position coordinates (X, Y) of the feature point of the high-luminance portion on the two-dimensional plane of the reflected light image or the RGB image, as well as the motion vector of the feature point, can be extracted. Therefore, the position Q of the object in real space, relative to the input unit 60, can be specified based on the distance (Z) from the input unit 60 to the object and the position coordinates (X, Y) in the reflected light image and the RGB image. Accordingly, as shown in FIG. 24, the specific region setting process, the object recognition process, and the game calculation can be performed based on the distance L from the position GP of the input unit 60 to the position Q of the object, calculated from the reflected light image.

２−４．物体認識処理 2-4. Object Recognition Process
本実施形態では、記憶部に予め格納されている認識パターンを用いて、入力画像（反射光画像、ＲＧＢ画像）上の物体を認識する物体認識処理を行う。ここで、物体を認識する物体認識処理とは、物体自体を認識する処理、物体の動きを認識する処理、物体のジェスチャー（形、ポーズ）を認識する処理の少なくとも１つを含む。 In the present embodiment, an object recognition process for recognizing an object in an input image (reflected light image, RGB image) is performed using recognition patterns stored in advance in the storage unit. Here, the object recognition process includes at least one of a process for recognizing the object itself, a process for recognizing the movement of the object, and a process for recognizing a gesture (shape, pose) of the object.

第２の物体認識システムの物体認識処理について詳細に説明すると、例えば、図２５（Ａ）に示すように、深度センサ６２０によって、発光部から照射された物体の反射光を受光した反射光画像（赤外線の反射結果）を取得する。 The object recognition process of the second object recognition system will now be described in detail. For example, as shown in FIG. 25(A), the depth sensor 620 acquires a reflected light image (the infrared reflection result) by receiving light emitted from the light emitting unit and reflected by the object.

そして、図２５（Ｂ）に示すように、反射光画像の各画素の輝度値に基づいて、シルエット（形状）を切り出す処理を行う。つまり、反射光画像の各画素の輝度値のうち所定輝度値（２００）以上である画素値の領域をシルエットＳＴ１として切り出す処理（抽出する処理）を行う。 Then, as shown in FIG. 25(B), a silhouette (shape) is cut out based on the luminance value of each pixel of the reflected light image. That is, the region of pixels whose luminance value is equal to or greater than the predetermined luminance value (200) is cut out (extracted) as the silhouette ST1.

そして、認識パターン記憶部５７３に記憶されている複数のボーン情報（スケルトン、骨格）それぞれと、シルエットＳＴ１とを照合し、最もシルエットＳＴ１に合致するボーン情報を設定する。例えば、図２５（Ｃ）に示すように、「人」に関するボーン情報ＢＯ１、ＢＯ２、ＢＯ３が認識パターン記憶部５７３に記憶されており、シルエットＳＴ１が、ボーン情報ＢＯ１に最も合致すると判断されると、「人」を認識したと判定する処理を行う。 Then, the silhouette ST1 is compared with each of the plurality of pieces of bone information (skeletons) stored in the recognition pattern storage unit 573, and the bone information that best matches the silhouette ST1 is set. For example, as shown in FIG. 25(C), when bone information BO1, BO2, and BO3 relating to a “person” is stored in the recognition pattern storage unit 573 and the silhouette ST1 is judged to match the bone information BO1 most closely, it is determined that a “person” has been recognized.

一方、「人」を認識できない場合には、次の物体（例えば「手」）のボーン情報を用いて、次の物体（「手」）を認識できるか否かを判断する。そして、物体を認識できるまで、入力画像と次のボーン情報とを照合する処理を行う。図２５（Ｄ）の例では、シルエットＳＴ１が、ボーン情報ＢＯ１、ＢＯ２、ＢＯ３のうち、「人」のボーン情報ＢＯ１と最も合致すると判断され、「人」を認識したと判定する。 On the other hand, when “person” cannot be recognized, it is determined whether or not the next object (“hand”) can be recognized using bone information of the next object (for example, “hand”). Then, the process of collating the input image with the next bone information is performed until the object can be recognized. In the example of FIG. 25D, it is determined that the silhouette ST1 most closely matches the bone information BO1 of “person” among the bone information BO1, BO2, and BO3, and it is determined that “person” has been recognized.
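The matching flow described above (pick the best-matching bone information for one object type, and fall back to the next object type if nothing matches) might look roughly like the sketch below. The specification does not state how a match is scored, so everything concrete here is an assumption: silhouettes and bone templates are represented as sets of pixel coordinates, the score is intersection-over-union, and `min_score` is a hypothetical acceptance threshold.

```python
def overlap_score(silhouette, template):
    """Intersection-over-union of two pixel sets (an assumed similarity measure)."""
    union = silhouette | template
    return len(silhouette & template) / len(union) if union else 0.0


def recognize(silhouette, patterns, min_score=0.5):
    """Try each object type in order and return (label, bone_id) or None.

    patterns: ordered list of (label, {bone_id: pixel_set}), for example the
    "person" bone information first and the "hand" bone information next.
    """
    for label, bones in patterns:
        best_id, best = max(
            ((bone_id, overlap_score(silhouette, t)) for bone_id, t in bones.items()),
            key=lambda item: item[1],
            default=(None, 0.0),
        )
        if best >= min_score:
            return label, best_id
    return None  # nothing recognized in any stored pattern


patterns = [
    ("person", {"BO1": {(0, 0), (0, 1), (0, 2), (1, 1)}, "BO2": {(5, 5)}}),
    ("hand", {"H1": {(9, 9)}}),
]
person_result = recognize({(0, 0), (0, 1), (0, 2), (1, 1)}, patterns)
hand_result = recognize({(9, 9)}, patterns)
```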

また、本実施形態では、フレーム毎にシルエットに合致するボーン情報に基づいて、物体の動作を認識する処理、物体のジェスチャーを認識する処理を行う。例えば、ｓ０時点において、「人」のボーン情報ＢＯ１に基づいて、「人」を認識したと判定され、ｓ０時点から１０フレーム後のｓ１０時点において、「人」のボーン情報ＢＯ２に基づいて、「人」を認識したと判定され、ｓ０時点から２０フレーム後のｓ２０時点において「人」のボーン情報ＢＯ３に基づいて、「人」を認識したと判定された場合には、各ボーン情報ＢＯ１、ＢＯ２、ＢＯ３をキーフレームとし、キーフレーム間の動きを補間することによって、ｓ０時点からｓ２０時点までの「人」の動きを認識することができる。また、例えば、ｓ２０時点と、ｓ２０時点から１０フレーム後のｓ３０時点とにおいて、「人」のボーン情報ＢＯ３に基づいて、「人」を認識した場合には、ｓ２０時点からｓ３０時点までの間において、物体（人）が、ボーン情報ＢＯ３に基づくジェスチャーを行っていると認識する。 In the present embodiment, the motion of the object and the gesture of the object are recognized based on the bone information that matches the silhouette in each frame. For example, suppose that a “person” is recognized based on the bone information BO1 at time s0, based on the bone information BO2 at time s10 (10 frames after s0), and based on the bone information BO3 at time s20 (20 frames after s0). In that case, the bone information BO1, BO2, and BO3 is treated as key frames, and the movement of the “person” from time s0 to time s20 can be recognized by interpolating the movement between the key frames. Also, for example, when a “person” is recognized based on the bone information BO3 both at time s20 and at time s30 (10 frames after s20), the object (person) is recognized as performing the gesture corresponding to the bone information BO3 from time s20 to time s30.
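The key-frame interpolation mentioned here can be sketched as follows. The specification does not say how bone information is represented or which interpolation is used; this example assumes each key frame is a mapping from a joint name to an `(x, y, z)` position and uses simple linear interpolation. The name `interpolate_pose` and the sample joint values are hypothetical.

```python
def interpolate_pose(keyframes, frame):
    """Linearly interpolate joint positions between recognized key frames.

    keyframes: list of (frame_index, {joint: (x, y, z)}) sorted by frame_index.
    Frames before the first or after the last key frame return the nearest pose.
    """
    if frame <= keyframes[0][0]:
        return dict(keyframes[0][1])
    for (f0, p0), (f1, p1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)
            return {
                joint: tuple(a + (b - a) * t for a, b in zip(p0[joint], p1[joint]))
                for joint in p0
            }
    return dict(keyframes[-1][1])


# Hypothetical key frames standing in for BO1 (s0), BO2 (s10), and BO3 (s20).
keyframes = [
    (0, {"hand": (0.0, 0.0, 0.0)}),
    (10, {"hand": (10.0, 0.0, 2.0)}),
    (20, {"hand": (10.0, 10.0, 2.0)}),
]
pose_s5 = interpolate_pose(keyframes, 5)
pose_s15 = interpolate_pose(keyframes, 15)
```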

また、本実施形態では、３次元のボーン情報と、３次元に抽出したシルエットとが一致するかを判断することによって、物体を認識しているので、３次元の物体、物体の動きやジェスチャーを認識することができる。 In this embodiment, since the object is recognized by determining whether the three-dimensional bone information matches the silhouette extracted in the three-dimensional manner, the three-dimensional object, the movement of the object, and the gesture are determined. Can be recognized.

なお、本実施形態では、人物を構成する部位単位（手、腕、顔、足）で物体認識処理を行うこともできる。かかる場合は、部位単位で予め複数のボーンを認識パターン記憶部５７３に格納し、抽出されたシルエットが複数のボーンのうちいずれと合致するかを判定してもよい。 In the present embodiment, the object recognition process can also be performed on a part basis (hand, arm, face, foot) constituting a person. In such a case, a plurality of bones may be stored in advance in the recognition pattern storage unit 573 for each part, and it may be determined which of the plurality of bones the extracted silhouette matches.

また、第２の物体認識システムは、第１の物体認識システムのように、差分画像に基づいて動き領域を設定し、「人」の形状の認識パターンを用いて、動き領域のシルエットが認識パターンで示される「人」の形状と適合（一致）するか否かを判断するようにしてもよい。 Like the first object recognition system, the second object recognition system may also set a motion region based on a difference image and use a recognition pattern for the shape of a “person” to determine whether the silhouette of the motion region fits (matches) the “person” shape indicated by the recognition pattern.

2-5. Processing for Recognizing an Object in a Specific Area
In this embodiment, a specific area is set, and object recognition processing is performed in the specific area. In this way, an object can be recognized efficiently and accurately. In addition, when another object appears in an area other than the specific area, that other object can be prevented from being recognized by mistake.

That is, in this embodiment, a specific area is set in the input image based on the depth value of each pixel of the reflected light image (input image) shown in FIG. 21A. For example, when a reflected light image such as that shown in FIG. 21A is obtained, a high-luminance portion is set as the specific area based on the depth values.

例えば、「所与の期間（２秒間）において、反射光画像の各画素の輝度値の平均値がしきい値以上（例えば２２０以上）である領域」をルール１０とし、図２６に示すように、ルール１０に基づいて決定される領域を、特定領域Ｃ１として設定する。なお、特定領域を設定するルールは記憶部５７０に記憶されている。 For example, the rule 10 is “a region where the average value of the luminance values of the pixels of the reflected light image is not less than a threshold value (eg, not less than 220) in a given period (2 seconds)”, as shown in FIG. The area determined based on the rule 10 is set as the specific area C1. Note that a rule for setting the specific area is stored in the storage unit 570.
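
As a rough illustration of rule 10, the following sketch averages each pixel's luminance over the frames captured in the given period and keeps the pixels whose average is at or above the threshold. The frame layout (nested lists of 0 to 255 luminance values) and the function name `specific_area` are assumptions made for this example only.

```python
from typing import List, Set, Tuple

Frame = List[List[int]]  # luminance per pixel, frame[y][x]; a 0-255 range is assumed


def specific_area(frames: List[Frame], threshold: int = 220) -> Set[Tuple[int, int]]:
    """Return the (x, y) pixels whose mean luminance over `frames` is >= threshold."""
    height, width = len(frames[0]), len(frames[0][0])
    area = set()
    for y in range(height):
        for x in range(width):
            mean = sum(frame[y][x] for frame in frames) / len(frames)
            if mean >= threshold:
                area.add((x, y))
    return area


# Two tiny 2x2 frames standing in for the images captured during the 2-second period.
frames = [
    [[230, 100], [210, 255]],
    [[210, 100], [210, 255]],
]
c1 = specific_area(frames, threshold=220)
```

Here the top-left pixel averages exactly 220 and the bottom-right pixel averages 255, so both fall inside the area, while the other two pixels do not.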

In this embodiment, the specific area may instead be set in the input image based on both the depth value and the color information of each pixel of the input image. For example, rule 10′ may be defined as “a region of pixels that have a yellowish color value and for which the average luminance value of the reflected light image over a given period (2 seconds) is at or above a threshold (for example, 220 or more)”, and the specific area C1′ may be set based on rule 10′.

なお、一度、特定領域Ｃ１を設定した場合、特定領域Ｃ１の物体を認識する必要があるので、特定領域Ｃ１を設定した時点から所与の期間（例えば６０秒間）、特定領域Ｃ１を固定する。そして、所与の周期（例えば６０秒周期）で特定領域Ｃ１を更新（変動、再設定）する。 Note that once the specific area C1 is set, it is necessary to recognize an object in the specific area C1, and thus the specific area C1 is fixed for a given period (for example, 60 seconds) from the time when the specific area C1 is set. Then, the specific area C1 is updated (changed or reset) at a given period (for example, a period of 60 seconds).

Then, in this embodiment, object recognition processing is performed in the specific area C1 shown in FIG. 26. That is, within the specific area C1 that has been set, the silhouette ST2 of the region of pixels whose luminance value in the reflected light image is at or above a threshold (for example, 220 or more) is cut out, and it is determined which bone information, if any, the silhouette matches. The threshold used to cut out the silhouette in the specific area may be the same as the threshold of the rule used to set the specific area.

例えば、特定領域Ｃ１のシルエットＳＴ２と、「人」のボーン情報とを比較し、シルエットＳＴ２と「人」のボーン情報とが一致するか否かを判断する。図２６の例では、シルエットＳＴ２が、「人」のボーン情報と一致しないと判断される。そして、特定領域Ｃ１のシルエットＳＴ２と、次のボーン情報である例えば「手」のボーン情報とを比較し、シルエットＳＴ２と「手」のボーン情報とが一致するか否かを判断する。図２６の例では、シルエットＳＴ２が「手」のボーン情報と一致すると判断され、特定領域Ｃ１において「手」を認識したと判定されることになる。 For example, the silhouette ST2 of the specific area C1 is compared with the bone information of “person”, and it is determined whether or not the silhouette ST2 and the bone information of “person” match. In the example of FIG. 26, it is determined that the silhouette ST2 does not match the bone information of “person”. Then, the silhouette ST2 of the specific area C1 is compared with the next bone information, for example, “hand” bone information, and it is determined whether or not the silhouette ST2 and the “hand” bone information match. In the example of FIG. 26, it is determined that the silhouette ST2 matches the bone information of “hand”, and it is determined that “hand” has been recognized in the specific region C1.
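
The try-one-candidate-then-the-next procedure above can be sketched as follows. The description does not say how a silhouette is tested against bone information, so this example substitutes a simple pixel-overlap (Jaccard) score against a template pixel set per object; the score, the 0.8 cutoff, and the names `overlap` and `recognize` are all assumptions.

```python
from typing import Dict, Optional, Set, Tuple

Pixels = Set[Tuple[int, int]]


def overlap(a: Pixels, b: Pixels) -> float:
    """Jaccard overlap of two pixel sets (a stand-in for the unspecified matching test)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recognize(silhouette: Pixels, templates: Dict[str, Pixels],
              min_overlap: float = 0.8) -> Optional[str]:
    """Try each candidate in insertion order and return the first label that matches."""
    for label, template in templates.items():
        if overlap(silhouette, template) >= min_overlap:
            return label
    return None


# "person" is tried first, then "hand", mirroring the order in the text.
templates = {
    "person": {(0, 0), (0, 1), (0, 2), (0, 3)},
    "hand": {(0, 0), (1, 0)},
}
st2 = {(0, 0), (1, 0)}
result = recognize(st2, templates)
```

With these toy templates the silhouette fails the “person” comparison and matches “hand” on the second attempt.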

In this embodiment, machine power can be concentrated on the specific area C1, which is only part of the input image, so the object may be recognized with the image accuracy of the specific area C1 increased. Here, image accuracy means the resolution of the image (its total number of pixels), the quantization level of the image (the range of values a pixel can take, that is, the gradation), and the number of joints in the bone information. The higher the resolution, the more finely the object can be recognized. The higher the quantization level, the wider the range of values that the pixel value (difference pixel value) can take, and the more finely the silhouette can be cut out.

For example, as shown in FIG. 27, the resolution of the specific area C1 is increased, the luminance value (depth value) of each pixel in the specific area C1 is recalculated, and the silhouette ST2′ is cut out based on those luminance values. For example, the region of pixels with a luminance value of 220 or more is cut out as the silhouette ST2′. In this way, when the object is determined to be a “hand”, for example, the gesture (shape) and the motion of the “hand” can be recognized in more detail.

また、特定領域Ｃ１のボーン情報の関節数を上げて、より詳細に物体を認識するようにしてもよい。より具体的に説明すると、図２７で抽出したシルエットＳＴ２´と関節数を多くしたボーン情報とを比較して、物体を認識するようにしてもよい。例えば、図２８（Ｂ）に示すように、シルエットＳＴ２´が「手」のボーン情報ＢＯＨ１と一致すると判断されると、特定領域Ｃ１において「手」を認識したと判定される。 Further, the number of joints in the bone information of the specific area C1 may be increased to recognize the object in more detail. More specifically, the object may be recognized by comparing the silhouette ST2 ′ extracted in FIG. 27 with bone information with an increased number of joints. For example, as shown in FIG. 28B, when it is determined that the silhouette ST2 ′ matches the bone information BOH1 of “hand”, it is determined that “hand” is recognized in the specific region C1.

In this embodiment, a plurality of bone information groups with different numbers of joints are prepared in advance for each object. For example, low, medium, and high levels are provided, and the bone information groups are prepared so that the number of joints increases as the level rises. For a “person”, for example, the bone information group with 13 joints is the low level, the group with 28 joints is the medium level, and the group with 56 joints is the high level. For a “hand”, the bone information group with 1 joint is the low level, the group with 5 joints is the medium level, and the group with 15 joints is the high level.

In this embodiment, when object recognition processing is performed in a specific area, it is performed using a bone information group of at least the medium level. For example, when object recognition processing is performed in the specific area C1, the silhouette of the specific area C1 is compared with each piece of bone information in the medium-level “person” bone information group to determine whether the object is a “person”.
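
The level scheme can be written down as a small lookup table. The joint counts are the ones given in the text; the table layout and the helper names `allowed_levels` and `joint_count` are hypothetical.

```python
# Joint counts per object and level, as stated in the description.
JOINT_COUNTS = {
    "person": {"low": 13, "medium": 28, "high": 56},
    "hand": {"low": 1, "medium": 5, "high": 15},
}
LEVELS = ["low", "medium", "high"]


def allowed_levels(in_specific_area: bool) -> list:
    """Inside a specific area only the medium level and above may be used."""
    return LEVELS[1:] if in_specific_area else list(LEVELS)


def joint_count(obj: str, level: str, in_specific_area: bool) -> int:
    """Look up the joint count, rejecting levels that are too coarse for a specific area."""
    if level not in allowed_levels(in_specific_area):
        raise ValueError(f"level {level!r} is not allowed here")
    return JOINT_COUNTS[obj][level]
```

For instance, `joint_count("hand", "high", True)` selects the 15-joint group, while requesting the low level inside a specific area raises an error.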

図２７の例で示す特定領域Ｃ１のシルエットＳＴ２´は、「人」のボーン情報と一致しないと判断され、次に、特定領域Ｃ１のシルエットＳＴ２´と高レベルの「手」のボーン情報群の各ボーン情報とを比較し、特定領域Ｃ１のシルエットＳＴ２´と、ボーン情報ＢＯＨ１とが一致すると判断され、特定領域Ｃ１にある物体は「手」であることを認識する。 It is determined that the silhouette ST2 ′ of the specific area C1 shown in the example of FIG. 27 does not match the “person” bone information. Next, the silhouette ST2 ′ of the specific area C1 and the high-level “hand” bone information group. Each bone information is compared, and it is determined that the silhouette ST2 ′ of the specific area C1 matches the bone information BOH1, and it is recognized that the object in the specific area C1 is a “hand”.

In this embodiment, when the object itself has been recognized in the specific area C1, the gesture and motion of the object in the specific area C1 may be recognized based on the bone information group of the same joint-count level at which the object itself was recognized. For example, when the “hand” itself has been recognized in the specific area C1 based on the high-level “hand” bone information, the gesture and motion of the “hand” in the specific area C1 may be recognized based on the high-level “hand” bone information group.

In this embodiment, as shown in FIG. 9, the period at which object recognition processing is performed in the specific area C1 may also be shortened. For example, if the reflected light image was acquired at a period of 1/6 second before the specific area C1 was set, it may be acquired at a period of 1/60 second from time t10, when the specific area C1 is set, onward. In this way, the motion of the object can be recognized in more detail.

以上のように、本実施形態では、特定領域Ｃ１にマシンパワーを注ぐことができるので、物体について詳細に物体認識処理を行うことができる。また、本実施形態では、特定領域について画像精度を上げ、さらに認識周期を短くすることによって物体の誤認識を軽減することができる、という効果もある。 As described above, in the present embodiment, machine power can be poured into the specific area C1, and therefore object recognition processing can be performed in detail on an object. In addition, in the present embodiment, there is an effect that the erroneous recognition of the object can be reduced by increasing the image accuracy for the specific region and further shortening the recognition cycle.

2-6. Relationship Between the Specific Area and Areas Other Than the Specific Area
In this embodiment, object recognition processing may also be performed in the area other than the specific area C1 in FIG. 26. When object recognition processing is performed in an area other than the specific area, the image accuracy of that area is set lower than the image accuracy of the specific area. In this way, although the recognition accuracy is lower than in the specific area C1, an object located outside the specific area C1 can still be recognized.

本実施形態では、特定領域Ｃ１以外の領域の画像精度を特定領域の画像精度よりも低くするようにしてもよい。また、特定領域以外の領域において物体認識処理を行う周期を、特定領域において物体認識処理を行う周期よりも長くするようにしてもよい。例えば、図１３に示すように、特定領域Ｃ１以外の領域においては、１／６秒の周期で物体認識処理を行い、特定領域Ｃ１においては、１／６０秒の周期で物体認識処理を行う。このようにすれば、マシンパワー（コンピュータの総合的な処理能力）を主に特定領域Ｃ１の物体認識処理に注力することができ、特定領域Ｃ１の物体認識処理をより正確に行うことができる。 In the present embodiment, the image accuracy of the region other than the specific region C1 may be lower than the image accuracy of the specific region. In addition, the period for performing the object recognition process in an area other than the specific area may be longer than the period for performing the object recognition process in the specific area. For example, as shown in FIG. 13, the object recognition process is performed at a period of 1/6 second in the area other than the specific area C1, and the object recognition process is performed at a period of 1/60 second in the specific area C1. In this way, the machine power (the overall processing capability of the computer) can be mainly focused on the object recognition process in the specific area C1, and the object recognition process in the specific area C1 can be performed more accurately.
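
The two recognition periods can be sketched as a per-frame schedule. This example assumes a 60 Hz frame clock, so a 1/60-second period is every frame and a 1/6-second period is every 10th frame; the clock rate and the names `PERIOD_FRAMES` and `regions_due` are assumptions.

```python
# Recognition period per region, in frames of an assumed 60 Hz clock.
PERIOD_FRAMES = {
    "C1": 1,            # 1/60 second: specific area C1
    "outside_C1": 10,   # 1/6 second: everything else
}


def regions_due(frame_index: int, period_frames: dict = PERIOD_FRAMES) -> list:
    """Return the regions whose recognition processing should run on this frame."""
    return [name for name, period in period_frames.items()
            if frame_index % period == 0]


# Count how often each region is processed during one second (60 frames).
runs = {name: 0 for name in PERIOD_FRAMES}
for frame in range(60):
    for name in regions_due(frame):
        runs[name] += 1
```

Over one second the specific area is processed 60 times and the remaining area 6 times, which is where the saving in machine power comes from.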

2-7. Multiple Specific Areas
In this embodiment, a plurality of specific areas may be set. For example, let rule 10 be “a region in which the average luminance value of the reflected light image over a given period (2 seconds) is at or above a threshold (for example, 220 or more)” and let rule 20 be “a region in which the average luminance value of the reflected light image over a given period (2 seconds) is at or above a threshold (for example, 200 or more)”. The region determined by rule 10 is then set as the specific area C1, as shown in FIG. 26, and the region determined by rule 20 is set as the specific area C2, as shown in FIG. 29. Three or more specific areas may also be set. In other words, because the luminance value of the reflected light image is related to depth (the distance to the object), this embodiment sets specific areas according to the distance relationships of objects.

Object recognition processing is then performed for at least one specific area. For example, object recognition processing may be performed for only one of the specific areas C1 and C2 set based on rule 10 and rule 20, or for both of them.

例えば、図２９に示すように、特定領域Ｃ２について物体認識処理を行う場合には、特定領域Ｃ２において輝度値が２００以上の画素の領域をシルエットＳＴ３として切り出す。そして、シルエットＳＴ３とボーン情報とを対比して物体を特定する。図３１に示すように、例えば、シルエットＳＴ３と「人」のボーンとが一致する場合には、特定領域Ｃ２において「人」を認識したと判定する。 For example, as shown in FIG. 29, when the object recognition process is performed for the specific area C2, a pixel area having a luminance value of 200 or more in the specific area C2 is cut out as a silhouette ST3. Then, the object is specified by comparing the silhouette ST3 with the bone information. As shown in FIG. 31, for example, when the silhouette ST3 and the “person” bone match, it is determined that “person” has been recognized in the specific region C2.

また、本実施形態では、第１の物体認識システム同じように、各特定領域に優先度を設定する。本実施形態では、深度値に基づいて優先度を決める。例えば、特定領域上の各画素の深度値の平均値を算出し、平均値が高いほど優位になるように優先度を設定するようにしてもよい。このようにすれば、入力部６０に近い物体ほど優先的に物体認識処理を行うことができる。 In the present embodiment, the priority is set for each specific area in the same manner as in the first object recognition system. In the present embodiment, the priority is determined based on the depth value. For example, the average value of the depth value of each pixel on the specific region may be calculated, and the priority may be set so that the higher the average value, the more dominant. In this way, the object recognition process can be preferentially performed on an object closer to the input unit 60.
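
One possible reading of the priority rule is sketched below: the mean depth value of each specific area is computed, and areas are ranked so that a higher mean gets a higher priority (rank 1 first). Representing an area as a flat list of per-pixel depth values and the name `assign_priorities` are assumptions for illustration.

```python
from typing import Dict, List


def assign_priorities(regions: Dict[str, List[int]]) -> Dict[str, int]:
    """Rank regions by mean depth value; the highest mean receives priority 1."""
    means = {name: sum(values) / len(values) for name, values in regions.items()}
    ranked = sorted(means, key=means.get, reverse=True)
    return {name: rank for rank, name in enumerate(ranked, start=1)}


# Illustrative depth (luminance) values for the pixels of each specific area.
priorities = assign_priorities({
    "C2": [205, 210, 215],
    "C1": [230, 240, 250],
})
```

With these values C1 has the higher mean, so it receives priority 1 and C2 receives priority 2, consistent with the nearer object being handled first.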

そして、本実施形態では、優先度に基づいて、物体認識処理を行う。例えば、優先度の低い特定領域の画像精度を、優先度の高い特定領域の画像精度よりも低くするようにしてもよい。 In this embodiment, object recognition processing is performed based on the priority. For example, the image accuracy of a specific region with a low priority may be set lower than the image accuracy of a specific region with a high priority.

For example, if the priority of the specific area C1 is 1 and the priority of the specific area C2 is 2, then, as shown in FIG. 30, the resolution of the specific area C2 is increased, but the resolution of the lower-priority specific area C2 is kept lower than the resolution at which object recognition processing is performed in the higher-priority specific area C1.

また、優先度の高い特定領域Ｃ１については、図２８（Ｂ）に示すように、関節数を低レベルから高レベルに上げ、高レベルのボーン情報群に基づいて物体認識処理を行い、優先度が特定領域Ｃ１よりも低い特定領域Ｃ２については、図３１に示すように、関節数を低レベルから中レベルに上げて中レベルのボーン情報群に基づいて物体認識処理を行う。 For the specific area C1 with high priority, as shown in FIG. 28B, the number of joints is increased from a low level to a high level, and object recognition processing is performed based on the high-level bone information group. For the specific area C2 that is lower than the specific area C1, as shown in FIG. 31, the number of joints is increased from the low level to the medium level, and the object recognition process is performed based on the medium level bone information group.

For example, as shown in FIG. 29, when object recognition processing is performed for the specific area C2, the resolution of the specific area C2 is increased, the luminance value (depth value) of each pixel in the specific area C2 is recalculated, and, for example, the region of pixels with a luminance value of 200 or more is cut out as the silhouette ST3′. The object is then identified by comparing the silhouette ST3′ with the bone information. As shown in FIG. 31, for example, when the silhouette ST3′ matches the “person” bone BO5, a “person” can be recognized in the specific area C2.

In this way, more machine power can be devoted to the high-priority specific area C1 so that its object is recognized in detail, while processing for lower-priority areas is simplified, so object recognition processing for the specific areas C1 and C2 can be performed efficiently.

同様に、例えば、優先度の低い特定領域において物体認識処理を行う周期を、優先度の高い特定領域において物体認識処理を行う周期よりも長くする。つまり、図１７に示すように、優先度の低い特定領域Ｃ２において物体認識処理を行う周期を、優先度の高い特定領域Ｃ１において物体認識処理を行う周期よりも長くする。このようにすれば、優先度の高い特定領域Ｃ１についてより詳細に物体を認識することができる。 Similarly, for example, the period for performing the object recognition process in the specific area with low priority is set longer than the period for performing the object recognition process in the specific area with high priority. That is, as shown in FIG. 17, the period for performing the object recognition process in the specific area C2 with low priority is set longer than the period for performing the object recognition process in the specific area C1 with high priority. In this way, the object can be recognized in more detail with respect to the specific area C1 having a high priority.

In this embodiment, when a plurality of specific areas are set, object recognition processing may be performed in a high-priority specific area without being performed in a low-priority specific area. That is, object recognition processing may be performed in the high-priority specific area C1 and skipped in the low-priority specific area C2.

2-8. Flowchart
Finally, the processing flow of this embodiment is described with reference to FIG. 32. First, a specific area is set on the input image based on the depth value of each pixel of the input image (step S20). Next, the image accuracy of the specific area is increased and the period at which object recognition processing is performed for the specific area is shortened (step S21). Then, processing for recognizing an object in the specific area is performed (step S22). This completes the processing.
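
Steps S20 to S22 can be laid out as a skeleton. The area-setting and recognition steps are passed in as callables because their internals are described elsewhere; the `RegionSettings` fields, the doubling of resolution, and the function name `run_once` are hypothetical choices for this sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RegionSettings:
    resolution_scale: int = 1      # hypothetical multiplier for image accuracy
    period_s: float = 1 / 6        # recognition period before a specific area is set


def run_once(depth_image, set_area: Callable, recognize: Callable):
    """One pass through steps S20 to S22."""
    # S20: set the specific area from the per-pixel depth values.
    area = set_area(depth_image)
    # S21: raise the image accuracy and shorten the recognition period.
    settings = RegionSettings(resolution_scale=2, period_s=1 / 60)
    # S22: recognize the object inside the specific area.
    label = recognize(depth_image, area, settings)
    return area, settings, label


# Stub callables standing in for the real steps.
area, settings, label = run_once(
    [[230]],
    set_area=lambda image: {(0, 0)},
    recognize=lambda image, area, settings: "hand",
)
```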

3. Game Calculation Processing Examples
In this embodiment, various games can be played by performing the object recognition processing described above.

3-1. Example of a Baseball Game
3-1-1. Example of Performing the Game Calculation of a Baseball Game with the First Object Recognition System
An example of performing the game calculation of a baseball game with the first object recognition system of this embodiment is described below. First, a player character is placed in the object space. Then, the player's motions and gestures are recognized, and, based on the recognized motions and gestures, motion processing is performed in which the player character throws a ball as the pitcher.

For example, at the timing when the player character stands on the mound (a predetermined area) in the object space, a motion region S3 indicated by motion vectors is extracted in the specific area A2 of the input image, which is determined based on rule 2 (a region in which, over a given period (2 seconds), the average magnitude of the motion vectors (difference pixel value) is 100 or more and the motion vectors point left or right). It is then determined whether the shape of the motion region S3 matches the “person” pattern (the shape of a “person” and the color information of a “person”). When recognition processing for a person is performed, a motion region S3′ is set in the specific area A2 based on the motion vector of each pixel at increased image accuracy, and the degree of match between the shape of the motion region S3′ and the “person” recognition pattern is determined.

本実施形態では、特定領域Ａ２において「人」を認識した場合に、次に、入力画像に基づいて「手」を認識できるか否かを判断する。例えば、入力画像においてルール１（所与の期間（２秒間）において、動きベクトルの大きさ（差分画素値）の平均値が２００以上であって、動きベクトルが左又は右方向を向く領域）に基づいて決定される特定領域Ａ１の動き領域Ｓ１の形が、「手」のパターン（「手」の形状、「手」の色情報）と一致するか否かを判断する。例えば、手の認識処理を行う場合には、特定領域Ａ１において、画像精度を上げた各画素の動きベクトルに基づいて、動き領域Ｓ１´を設定し、動き領域Ｓ１´の形と、「手」の認識パターンの一致度を判断する。 In the present embodiment, when “person” is recognized in the specific area A2, it is next determined whether or not “hand” can be recognized based on the input image. For example, in the input image, rule 1 (region in which the average value of the motion vector (difference pixel value) is 200 or more and the motion vector faces left or right in a given period (2 seconds)) It is determined whether or not the shape of the motion area S1 of the specific area A1 determined based on the pattern matches the “hand” pattern (“hand” shape, “hand” color information). For example, when hand recognition processing is performed, a motion region S1 ′ is set based on the motion vector of each pixel whose image accuracy is increased in the specific region A1, and the shape of the motion region S1 ′ and “hand” The degree of coincidence of the recognition pattern is determined.

When a “hand” is recognized in the specific area A1, the gesture of the “hand” is recognized next. In this embodiment, the pitching form with which the player character throws the ball is determined based on the recognized “hand” gesture. For example, the degree of match is determined between the shape of the motion region S1′ of the specific area A1 and each of three “hand” recognition patterns: “rock” (gū), “scissors” (choki), and “paper” (pā). If the shape of the motion region S1′ matches the “rock” recognition pattern, the player character's pitching form is set to “straight”; if it matches the “scissors” recognition pattern, the pitching form is set to “fork”; and if it matches the “paper” recognition pattern, the pitching form is set to “slider”.
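
The gesture-to-pitch decision can be sketched as a lookup over the three hand shapes (rock, scissors, and paper; “gū”, “choki”, and “pā” in the original). Choosing the best-scoring pattern and requiring a minimum score of 0.8 are assumptions, since the text only says that the degree of match with each pattern is determined; `choose_pitch` is a hypothetical name.

```python
from typing import Dict, Optional

# Hand gesture -> pitching form, as listed in the description.
PITCH_BY_GESTURE = {
    "rock": "straight",
    "scissors": "fork",
    "paper": "slider",
}


def choose_pitch(match_scores: Dict[str, float], min_score: float = 0.8) -> Optional[str]:
    """Pick the pitching form for the best-matching gesture, or None if nothing matches."""
    gesture, score = max(match_scores.items(), key=lambda item: item[1])
    if score < min_score:
        return None
    return PITCH_BY_GESTURE[gesture]


pitch = choose_pitch({"rock": 0.30, "scissors": 0.92, "paper": 0.10})
```

In this example the “scissors” pattern matches best, so the pitching form becomes “fork”.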

そして、特定領域Ａ１（或いは、動き領域Ｓ１´）の各画素の動きベクトルの方向に基づいて「手」の動きを認識し、「手」の動きに基づいてオブジェクト空間のプレーヤキャラクタの動作演算を行う。例えば、「手」を認識した特定領域Ａ１において、動きベクトルの方向が画面下方向であって、その動きベクトルの大きさが所定値以上（例えば、差分画素値が２５０以上）であることを検出した場合に、オブジェクト空間に存在するプレーヤキャラクタが、決定された投球フォームでボールを投げる動作を行う。 Then, the movement of the “hand” is recognized based on the direction of the motion vector of each pixel in the specific area A1 (or the movement area S1 ′), and the motion calculation of the player character in the object space is performed based on the movement of the “hand”. Do. For example, in the specific area A1 in which “hand” is recognized, it is detected that the direction of the motion vector is the downward direction on the screen and the magnitude of the motion vector is greater than or equal to a predetermined value (eg, the difference pixel value is greater than or equal to 250). In this case, the player character existing in the object space performs an action of throwing the ball with the determined throwing form.
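
The throw trigger described above can be expressed as a simple predicate. This sketch assumes that a motion vector is given as screen-space components `(dx, dy)` with `dy` increasing toward the bottom of the screen, that "downward" means the vertical component dominates, and that the magnitude is supplied separately as the difference pixel value; none of these conventions is stated in the text, and `is_throw` is a hypothetical name.

```python
def is_throw(dx: float, dy: float, magnitude: float, min_magnitude: float = 250) -> bool:
    """True when the motion points down the screen and is at least `min_magnitude`."""
    points_down = dy > 0 and abs(dy) >= abs(dx)   # assumed definition of "downward"
    return points_down and magnitude >= min_magnitude


# A strong downward swing triggers the throw; a sideways or weak motion does not.
throw_now = is_throw(dx=0.0, dy=5.0, magnitude=250)
```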

As described above, when the game calculation of the baseball game is performed with the first object recognition system, the processing for recognizing the “person” and “hand” objects and their gestures and motions is performed in specific areas rather than over the full screen, so unnecessary processing can be omitted and processing can be performed efficiently.

3-1-2. Example of Performing the Game Calculation of a Baseball Game with the Second Object Recognition System
Next, an example of performing the game calculation of a baseball game with the second object recognition system of this embodiment is described. First, a player character is placed in the object space, the player's motions and gestures are recognized, and, based on the recognized motions and gestures, motion processing is performed in which the player character throws a ball as the pitcher.

For example, at the timing when the player character stands on the mound in the object space, the silhouette in the input image determined based on rule 20 (a region in which the average luminance value of the reflected light image over a given period (2 seconds) is at or above a threshold (for example, 200 or more)) is set as the specific area C2, and it is determined whether the silhouette ST3 of the specific area C2 matches the “person” bone information. For example, when a “person” is to be recognized, the image accuracy of the specific area C2 is increased, the silhouette ST3′ is set based on the information of each pixel, and the degree of match between the shape of the silhouette ST3′ and the “person” bone information is determined.

そして、特定領域Ｃ２において「人」を認識した場合には、次に、入力画像に基づいて「手」を認識可能か否かを判断する。例えば、入力画像においてルール１０（所与の期間（２秒間）において、反射光画像の輝度値の平均値がしきい値以上（例えば２２０以上）である領域）に基づいて決定される領域を特定領域Ｃ１として設定し、特定領域Ｃ１のシルエットＳＴ２の形が「手」のボーン情報と一致するか否かを判断する。例えば、特定領域Ｃ１の画像精度を上げて各画素情報に基づいて、シルエットＳＴ２´を設定し、シルエットＳＴ２´の形と、「手」のボーン情報の一致度を判断する。 When “person” is recognized in the specific area C2, it is next determined whether or not “hand” can be recognized based on the input image. For example, in the input image, an area determined based on rule 10 (an area where the average value of the luminance value of the reflected light image is not less than a threshold value (for example, not less than 220) in a given period (2 seconds)) is specified. It is set as the region C1, and it is determined whether or not the shape of the silhouette ST2 of the specific region C1 matches the bone information of “hand”. For example, the silhouette ST2 ′ is set based on each pixel information by increasing the image accuracy of the specific area C1, and the degree of coincidence between the shape of the silhouette ST2 ′ and the bone information of “hand” is determined.

When a “hand” is recognized in the specific area C1, the gesture of the “hand” is recognized next. For example, in this embodiment, the pitching form with which the player character throws the ball is determined based on the recognized gesture. For example, the degree of match is determined between the shape of the silhouette ST2′ of the specific area C1 and each of three pieces of “hand” bone information: “rock” (gū), “scissors” (choki), and “paper” (pā). If the shape of the silhouette ST2′ matches the “rock” bone information, the player character's pitching form is set to “straight”; if it matches the “scissors” bone information, the pitching form is set to “fork”; and if it matches the “paper” bone information, the pitching form is set to “slider”.

In this embodiment, the motion of the “person” is then recognized based on the bone information that matches the silhouette ST3′. For example, the “person” bone information matching the silhouette ST3′, which is updated at a predetermined period, is used as key frames, the motion between the key frames is interpolated to recognize the motion of the “person”, and motion processing of the player character in the object space is performed based on the recognized motion. For example, when it is recognized in the specific area C2, where the “person” was recognized, that the “person” is making a ball-throwing motion, the player character in the object space throws the ball with the determined pitching form.

As described above, when the game calculation of the baseball game is performed by the second object recognition system, the processing for recognizing the "person" and "hand" objects, their gestures, and their movements is performed in the specific areas rather than over the full screen, so wasteful processing can be omitted and the processing can be performed efficiently.

3-2. Example of a battle game
3-2-1. Example of performing the game calculation of a battle game with the first object recognition system
An example of performing the game calculation of a battle game with the first object recognition system will be described. First, in this embodiment, a player character is placed in the object space. Then, the player's movements and gestures are recognized, and the attack technique, defense technique, and the like that the player character uses against the enemy character are determined based on the recognized movements and gestures.

For example, when a battle game between the player character and an enemy character starts in the object space, it is determined whether the shape of the motion region S3, indicated by the motion vectors in the specific area A2 that is determined in the input image based on rule 2, matches the "person" pattern (the shape of a "person" and the color information of a "person") stored in the recognition pattern storage unit 173. For example, a motion region S3' is set based on the motion vector of each pixel in the specific area A2 at increased image accuracy, and the degree of matching between the shape of the motion region S3' and the "person" recognition pattern is determined.
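
One way to picture how a motion region might be set from per-pixel motion vectors inside a specific area is sketched below. The rectangle representation of the area, the magnitude threshold, and the function name are assumptions made for illustration; the text does not define them.

```python
import math
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]
Vector = Tuple[float, float]


def motion_region(
    vectors: Dict[Pixel, Vector],
    area: Tuple[int, int, int, int],  # hypothetical (x0, y0, x1, y1), end-exclusive
    min_magnitude: float,             # hypothetical threshold for "moving" pixels
) -> Set[Pixel]:
    """Collect pixels inside the area whose motion vector is long enough."""
    x0, y0, x1, y1 = area
    return {
        (x, y)
        for (x, y), (dx, dy) in vectors.items()
        if x0 <= x < x1 and y0 <= y < y1 and math.hypot(dx, dy) >= min_magnitude
    }
```

Pixels outside the area are never examined, which reflects the point that recognition is confined to the specific area rather than the full screen.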

When a "person" is recognized in the specific area A2, it is next determined whether a "hand" can be recognized based on the input image. For example, the specific area A1 is set in the input image based on rule 1, and it is determined whether the shape of the motion region S1 in the specific area A1 matches the "hand" pattern (the shape of a "hand" and the color information of a "hand"). For example, a motion region S1' is set based on the motion vector of each pixel in the specific area A1 at increased image accuracy, and the degree of matching between the shape of the motion region S1' and the "hand" recognition pattern is determined.

When a "hand" is recognized in the specific area A1, the gesture of the "hand" is recognized next. For example, in this embodiment, the technique of the player character is determined based on the recognized gesture. Specifically, the degree of matching is determined between the shape of the motion region S1' in the specific area A1 and each of three "hand" recognition patterns: "goo" (rock), "choki" (scissors), and "par" (paper). For example, if the shape of the motion region S1' matches the rock recognition pattern, the technique of the player character is set to "punch"; if it matches the scissors recognition pattern, the technique is set to "tackle"; and if it matches the paper recognition pattern, the technique is set to "throw".

Then, the movement of the "hand" is recognized based on the direction of the motion vector of each pixel in the specific area A1 (or the motion region S1'). For example, when it is detected in the specific area A1, in which the "hand" was recognized, that the motion vector points downward on the screen and that its magnitude is at least a predetermined value (for example, a difference pixel value of 250 or more), the player character in the object space attacks the enemy character using the determined technique.
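
The downward-motion check can be sketched as follows. It assumes screen coordinates in which y increases downward and averages the per-pixel vectors before testing them; the averaging step, the function name, and the units of `min_magnitude` are assumptions and are not taken from the text.

```python
import math
from typing import Sequence, Tuple


def is_downward_swing(
    vectors: Sequence[Tuple[float, float]],  # per-pixel (dx, dy), y grows downward
    min_magnitude: float,                    # hypothetical threshold
) -> bool:
    """True if the mean motion vector points mainly downward and is long enough."""
    if not vectors:
        return False
    mean_dx = sum(dx for dx, _ in vectors) / len(vectors)
    mean_dy = sum(dy for _, dy in vectors) / len(vectors)
    points_down = mean_dy > 0 and mean_dy >= abs(mean_dx)
    return points_down and math.hypot(mean_dx, mean_dy) >= min_magnitude
```

When this returns true, the game calculation would trigger the attack with whichever technique was chosen from the hand gesture.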

In this embodiment, processing for moving the player character may also be performed based on the recognized movement of the player's body or hand (that is, motion calculation may be performed). In this way, for example, a punch thrown by the player can be reflected in the player character. For example, when the movement of the "hand" is recognized based on the direction of the motion vector of each pixel in the specific area A1 (or the motion region S1'), motion processing of the hand (arm) of the player character in the object space is performed based on that movement. For example, when it is detected in the specific area A1, in which the "hand" was recognized, that the motion vector points downward on the screen and that its magnitude is at least a predetermined value, it is determined that the player's "hand" has moved from top to bottom, and motion processing in which the player character swings its hand (arm) down from top to bottom is performed.

As described above, when the game calculation of the battle game is performed by the first object recognition system, the processing for recognizing the "person" and "hand" objects, their gestures, and their movements is performed in the specific areas rather than over the full screen, so wasteful processing can be omitted and the processing can be performed efficiently.

3-2-2. Example of performing the game calculation of a battle game with the second object recognition system
Next, an example of performing the game calculation of a battle game with the second object recognition system of this embodiment will be described. First, in this embodiment, the player's movements and gestures are recognized, and the attack technique, defense technique, and the like that the player character uses against the enemy character are determined based on the recognized movements and gestures.

For example, when a battle game between the player character and an enemy character starts in the object space, it is determined whether the silhouette ST3, cut out in the specific area C2 that is determined in the input image based on rule 20, matches the "person" bone information. For example, when a "person" is to be recognized, the image accuracy of the specific area C2 is increased, a silhouette ST3' is set based on the information of each pixel, and the degree of matching between the shape of the silhouette ST3' and the "person" bone information is determined.

When the silhouette ST3 matches the "person" bone information and a "person" is thus recognized, it is next determined whether a "hand" can be recognized based on the input image. For example, the specific area C1 is set in the input image based on rule 10, and it is determined whether the shape of the silhouette ST2 in the specific area C1 matches the "hand" bone information. For example, the image accuracy of the specific area C1 is increased, a silhouette ST2' is set based on the information of each pixel, and the degree of matching between the shape of the silhouette ST2' and the "hand" bone information is determined.

When a "hand" is recognized in the specific area C1, the gesture of the "hand" is recognized next. For example, in this embodiment, the technique of the player character is determined based on the recognized gesture. For example, if the silhouette ST2' matches the "goo" (rock) bone information, the technique of the player character is set to "punch"; if it matches the "choki" (scissors) bone information, the technique is set to "tackle"; and if it matches the "par" (paper) bone information, the technique is set to "throw".

Then, the movement of the "hand" is recognized based on the bone information of the specific area C1 (or the silhouette ST2'). For example, when it is detected in the specific area C1, in which the "hand" was recognized, that the "hand" bone is thrusting forward toward the depth sensor (that is, the hand is being extended from the far side toward the near side), the player character in the object space attacks the enemy character using the determined technique.
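
A thrust toward the depth sensor could be detected along the following lines. This sketch assumes that the hand bone yields a distance to the sensor for each frame, with smaller values meaning closer; the text does not state how its depth values are scaled or oriented, and the function name and `min_travel` parameter are hypothetical.

```python
from typing import Sequence


def is_forward_thrust(distances: Sequence[float], min_travel: float) -> bool:
    """True if the hand moved steadily toward the sensor by at least min_travel.

    distances: hand-bone distance from the sensor per frame, oldest first
    (assumed convention: a smaller value means closer to the sensor).
    """
    if len(distances) < 2:
        return False
    # Require that the hand never moves away between consecutive frames.
    approaching = all(b <= a for a, b in zip(distances, distances[1:]))
    return approaching and (distances[0] - distances[-1]) >= min_travel
```

If the sensor's depth values instead grow as the object approaches, the comparisons would simply be reversed.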

In this embodiment, the player's movements and gestures may also be recognized and processing for moving the player character may be performed based on them (that is, motion calculation may be performed).

For example, when the movement of the "person" is recognized based on the bone information of the specific area C2 (or the silhouette ST3'), motion processing of the player character in the object space is performed based on that movement. For example, in this embodiment, the "person" bone information matching the silhouette ST3', which is updated at a predetermined cycle, is used as keyframes, the motion between keyframes is interpolated to recognize the action of the "person", and motion processing of the player character in the object space is performed based on the recognized action. In this way, for example, a punch thrown by the player can be reflected in the player character.

As described above, when the game calculation of the battle game is performed by the second object recognition system, the processing for recognizing the "person" and "hand" objects, their gestures, and their movements is performed in the specific areas rather than over the full screen, so wasteful processing can be omitted and the processing can be performed efficiently.

3-3. Other games
This embodiment can be applied to various games, such as action games, racing games, shooting games, sports games, competitive games, role-playing games, simulation games, music performance games, and dance games.

For example, when the embodiment is applied to any of these games, the object recognition processing may be performed in the specific area at a predetermined timing according to the progress of the game. For example, the object recognition processing may be performed in the specific area at the timing when the player object operated by the player encounters an enemy object in the object space (that is, the timing when the position of the player object and the position of the enemy object come into a predetermined positional relationship).
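
As a small illustration of such a trigger, the sketch below treats the "predetermined positional relationship" as the two objects coming within a fixed distance of each other. That interpretation, along with the function name and the `encounter_radius` parameter, is an assumption; the text leaves the relationship unspecified.

```python
import math
from typing import Sequence


def should_run_recognition(
    player_pos: Sequence[float],
    enemy_pos: Sequence[float],
    encounter_radius: float,  # hypothetical distance defining an "encounter"
) -> bool:
    """True when the player object and enemy object are close enough to count as an encounter."""
    return math.dist(player_pos, enemy_pos) <= encounter_radius
```

The game loop would call this once per frame and start object recognition in the specific area only when it returns true.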

In racing games, shooting games, and the like, movements in which the player appears to be operating a steering wheel, a control stick, or the like may be recognized. For example, in a racing game, a specific area for recognizing a hand (or an arm or the like) is set, and when a hand is recognized in that specific area, processing for recognizing the movement and gesture of the hand is performed in that specific area.

In music games and dance games, the player's movement may be recognized and it may be determined whether the movement matches a predetermined movement. For example, a specific area for recognizing a person is set, and when a person is recognized in that specific area, processing for recognizing the movement and gesture of the person is performed in that specific area.

This embodiment can also be applied to various systems (terminals), such as arcade game systems, home game systems, large attraction systems in which many players participate, simulators, multimedia terminals, system boards that generate game images, and mobile phones.

4. Application examples
In this embodiment, the processing of the processing unit 100 of the first object recognition system may be applied to the processing unit 500 of the second object recognition system, and the processing of the processing unit 500 of the second object recognition system may be applied to the processing unit 100 of the first object recognition system. For example, the first object recognition system may perform the object recognition processing using bone information. Specifically, bone information relating to a "person" is stored in the recognition pattern storage unit 173 of the first object recognition system. Then, the plurality of bones (skeletons) stored in the recognition pattern storage unit 173 are compared with the motion region (a two-dimensional silhouette) set based on the pixel information of each pixel of the input image, and the bone that best matches the motion region is selected.

In this embodiment, processing for calculating the movement of the selected bone is then performed. That is, the movement of the bone is treated as the movement of the player P. In this embodiment, the bone is identified and the movement of the player P is acquired at predetermined intervals (for example, every frame).

In this embodiment, the object recognition processing can also be performed for each body part of a person (hand, arm, face, foot). In that case, a plurality of bones may be stored in advance in the recognition pattern storage unit 173 for each body part, and it may be determined which of the plurality of bones the extracted motion region matches.

5. Processing for recognizing the movement of an object
In this embodiment, processing for recognizing a change in the image may be performed at a predetermined timing. That is, processing for recognizing the movement of an object may be performed based on input images acquired at two different times. For example, the movement of an object may be recognized simply by taking the difference between the two input images. In this way, the movement of an object can be recognized with simple processing.

More specifically, as shown in FIGS. 33(A) and 33(B), movement may be recognized by taking the difference between the input image F20 acquired at time T20 and the input image F21 acquired at time T21, a predetermined period after T20 (for example, one second after T20). That is, the difference between the sum of the pixel values of the input image F20 and the sum of the pixel values of the input image F21 is calculated, it is determined whether the difference is equal to or greater than a threshold for motion recognition (a predetermined value), and if so, it is recognized that movement has occurred.
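
The comparison of pixel-value sums can be sketched as follows, with images represented as nested lists of numeric pixel values. The function names are hypothetical, and taking the absolute value of the difference is an assumption, since the text only speaks of a "difference value".

```python
from typing import Sequence


def frame_sum(image: Sequence[Sequence[float]]) -> float:
    """Sum of all pixel values in a 2D image given as rows of numbers."""
    return sum(sum(row) for row in image)


def motion_detected(
    prev_image: Sequence[Sequence[float]],
    curr_image: Sequence[Sequence[float]],
    threshold: float,
) -> bool:
    """True if the totals of the two images differ by at least the threshold."""
    return abs(frame_sum(curr_image) - frame_sum(prev_image)) >= threshold
```

Because only the totals are compared, a movement that leaves the overall sum unchanged would go unnoticed; that is the cost of keeping the check this simple.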

The input image may be an RGB image or a reflected light image. For example, the difference between the sum of the color values of the pixels of the RGB image at time T20 and the sum of the color values of the pixels of the RGB image at time T21 is calculated, and it is determined whether the difference is equal to or greater than the threshold. If the difference is equal to or greater than the threshold, it is recognized that movement has occurred.

Alternatively, the difference between the sum of the depth values of the pixels of the reflected light image at time T20 and the sum of the depth values of the pixels of the reflected light image at time T21 is calculated, and it is determined whether the difference is equal to or greater than the threshold. If the difference is equal to or greater than the threshold, it is recognized that movement has occurred. In particular, using the reflected light image makes it possible to recognize that an object has moved in the depth direction.

Furthermore, because a specific area is set in the input image in this embodiment, processing for recognizing a change in the image within the specific area may be performed at a predetermined timing. For example, as shown in FIGS. 33(A) and 33(B), the difference between the sum of the pixel values of the specific area A20 of the input image F20 acquired at time T20 and the sum of the pixel values of the specific area A21 of the input image F21 acquired at time T21 may be calculated, and movement may be recognized as having occurred when the difference is equal to or greater than the threshold. In this way, the movement of an object can be recognized efficiently, and because machine power can be concentrated on the specific area, the movement can also be recognized more accurately.

For example, the difference between the sum of the color values of the pixels in the specific area A20 and the sum of the color values of the pixels in the specific area A21 is calculated, and it is determined whether the difference is equal to or greater than the threshold. If so, it is recognized that movement has occurred. Likewise, the difference between the sum of the depth values of the pixels in the specific area A20 and the sum of the depth values of the pixels in the specific area A21 is calculated, and it is determined whether the difference is equal to or greater than the threshold. If so, it is recognized that movement has occurred.
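
Restricting the same check to a specific area might look like the sketch below. Representing each specific area as an axis-aligned rectangle `(x0, y0, x1, y1)` with exclusive end coordinates is an assumption, as are the function names; the pixel values may be color values or depth values.

```python
from typing import Sequence, Tuple

Rect = Tuple[int, int, int, int]  # hypothetical (x0, y0, x1, y1), end-exclusive


def region_sum(image: Sequence[Sequence[float]], area: Rect) -> float:
    """Sum of the pixel values inside the rectangle only."""
    x0, y0, x1, y1 = area
    return sum(sum(row[x0:x1]) for row in image[y0:y1])


def region_motion_detected(
    prev_image: Sequence[Sequence[float]],
    curr_image: Sequence[Sequence[float]],
    prev_area: Rect,
    curr_area: Rect,
    threshold: float,
) -> bool:
    """Compare the totals of area A20 in the earlier image and area A21 in the later one."""
    diff = region_sum(curr_image, curr_area) - region_sum(prev_image, prev_area)
    return abs(diff) >= threshold
```

Changes outside the rectangles never enter the sums, so they neither cost processing time nor trigger a false detection.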

Object recognition device 10, processing unit 100, acquisition unit 110, calculation unit 111,
area setting unit 112, object recognition processing unit 113, game calculation unit 114,
image generation unit 120, sound control unit 130, storage unit 170, main storage unit 171,
drawing buffer 172, recognition pattern storage unit 173, input image storage unit 174,
difference image storage unit 175, information storage medium 180, communication unit 196,
input unit 20, RGB camera (imaging unit) 21, processing unit 22, storage unit 23,
object recognition device 50, processing unit 500, acquisition unit 510, calculation unit 511,
area setting unit 512, object recognition processing unit 513, game calculation unit 514,
image generation unit 520, sound control unit 530, storage unit 570, main storage unit 571,
drawing buffer 572, recognition pattern storage unit 573, input image storage unit 574,
difference image storage unit 575, information storage medium 580, communication unit 596,
input unit 60, light emitting unit 610, light source 611, depth sensor 620,
RGB camera (imaging unit) 630, sound input unit 640, processing unit 650,
storage unit 660, display unit 90, speaker 92, P: player,
F1 to F4, F10, F11: input images, P1, P2: pixels, V: motion vector,
A1, A2, B1, C1, C2: specific areas, S1, S1', S2, S2': motion regions,
GP: position of the input unit, Q: position of the object, L: distance from the object to the input unit,
BO1, BO2, BO3, BO4, BO5, BOH1: bones,
ST1, ST2, ST2', ST3, ST3': silhouettes

Claims

A program for performing object recognition processing, the program causing a computer to function as:
an area setting unit that acquires an input image having a depth value for each pixel, obtained by irradiating an object with light and receiving the light reflected from the object, and sets a specific area in the input image based on the depth value of each pixel of the input image; and
an object recognition processing unit that performs object recognition processing for recognizing an object,
wherein the object recognition processing unit
performs the object recognition processing in the specific area.

The program according to claim 1,
wherein the object recognition processing unit
performs the object recognition processing in the specific area with the accuracy of the object recognition processing in the specific area set higher than the accuracy of the object recognition processing in areas other than the specific area.

The program according to claim 1 or 2,
wherein the object recognition processing unit
performs the object recognition processing in the specific area with the period of the object recognition processing in the specific area set shorter than the period of the object recognition processing in areas other than the specific area.

The program according to any one of claims 1 to 3,
wherein the object recognition processing unit
performs the object recognition processing in the specific area with the image accuracy of the specific area set higher than the image accuracy of areas other than the specific area.

The program according to any one of claims 1 to 4,
wherein the object recognition processing unit
also performs the object recognition processing in areas other than the specific area, and
makes the period of the object recognition processing in the areas other than the specific area longer than the period of the object recognition processing in the specific area.

The program according to any one of claims 1 to 5,
wherein the object recognition processing unit
also performs the object recognition processing in areas other than the specific area, and
makes the image accuracy of the areas other than the specific area lower than the image accuracy of the specific area.

The program according to any one of claims 1 to 6,
wherein, when the area setting unit sets a plurality of specific areas,
the object recognition processing unit
performs the object recognition processing in at least one of the specific areas.

The program according to claim 7,
wherein the area setting unit assigns a priority to each specific area, and
the object recognition processing unit
performs the object recognition processing in each specific area, and
makes the period of the object recognition processing in a specific area with a low priority longer than the period of the object recognition processing in a specific area with a high priority.

The program according to claim 7 or 8,
wherein the area setting unit assigns a priority to each specific area, and
the object recognition processing unit
performs the object recognition processing in each specific area, and
makes the image accuracy of a specific area with a low priority lower than the image accuracy of a specific area with a high priority.

The program according to claim 7,
wherein the area setting unit assigns a priority to each specific area, and
the object recognition processing unit
performs the object recognition processing in a specific area with a high priority without performing the object recognition processing in a specific area with a low priority.

The program according to any one of claims 1 to 10,
wherein the area setting unit
sets the specific area in the input image based on the color information and the depth value of each pixel of the input image.

The program according to any one of claims 1 to 11,
wherein the area setting unit
sets the specific area in the input image based on those depth values, among the depth values of the pixels of the input image, that are equal to or greater than a predetermined value.

The program according to any one of claims 1 to 12,
wherein the object recognition processing unit
performs the object recognition processing in the specific area based on bone information.

The program according to any one of claims 1 to 13,
wherein the object recognition processing unit
performs processing for recognizing a change in the image in the specific area at a predetermined timing.

An information storage medium readable by a computer, wherein the program according to any one of claims 1 to 14 is stored.

An object recognition system that performs processing for recognizing an object, comprising:
an area setting unit that acquires an input image having a depth value for each pixel, obtained by irradiating an object with light and receiving the light reflected from the object, and sets a specific area in the input image based on the depth value of each pixel of the input image; and
an object recognition processing unit that performs object recognition processing for recognizing an object,
wherein the object recognition processing unit
performs the object recognition processing in the specific area.