JP2015219868A

JP2015219868A - Information processor, information processing method and program

Info

Publication number: JP2015219868A
Application number: JP2014105288A
Authority: JP
Inventors: 貴之岩本; Takayuki Iwamoto; 優和真継; Masakazu Matsugi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2014-05-21
Filing date: 2014-05-21
Publication date: 2015-12-07

Abstract

PROBLEM TO BE SOLVED: To create a model unique to an object, without requiring the object to take a specific posture, when creating an object shape model for estimating the position and orientation of a deformable object in an image. SOLUTION: An information processor inputs an image, detects an object on the basis of the input image, evaluates the relation between parts of the detected object on the basis of an object shape model for estimating the position and orientation of the detected object, determines whether to calculate a model parameter on the basis of the evaluation result, calculates the model parameter on the basis of the determination result, and updates the object shape model held by holding means on the basis of the calculated model parameter.

Description

The present invention relates to a technique for creating an object shape model for estimating the position and orientation of a deformable object in an image.

Conventionally, methods are known for creating a model unique to a target object in order to estimate the position and orientation of a deformable object from image information without using markers. For example, in Patent Document 1, in order to estimate the posture of a human body using depth images, a model specific to the subject is created by estimating the size of each body part of a subject who takes a specific posture in a distance image, such as standing upright or raising both arms.

Patent Document 1: US Patent Application Publication No. 2011/0052006

Non-Patent Document 1: N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, USA, June 2005. Vol. II, pp. 886-893.
Non-Patent Document 2: P. Felzenszwalb and D. Huttenlocher, "Pictorial Structures for Object Recognition," International Journal of Computer Vision, Vol. 61, No. 1, January 2005.
Non-Patent Document 3: M. Andriluka, S. Roth and B. Schiele, "Pictorial Structures Revisited: People Detection and Articulated Pose Estimation," IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09), Miami, USA, June 2009.
Non-Patent Document 4: Lubomir Bourdev and Jitendra Malik, "Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations," International Conference on Computer Vision 2009.
Non-Patent Document 5: R. Okada and S. Soatto, "Relevant Feature Selection for Human Pose Estimation and Localization in Cluttered Images," In Proceedings of the European Conference on Computer Vision, 2008.

However, the method described in Patent Document 1 requires the subject to take a specific posture, which makes it inconvenient.

The present invention has been made in view of this problem, and its object is to create a model specific to a target object without requiring the target object to take a specific posture.

An information processing apparatus according to the present invention includes, for example: image input means for inputting an image; object detection means for detecting an object based on the image input by the image input means; holding means for holding an object shape model for estimating the position and orientation of the object; part relation evaluation means for evaluating a relation between parts of the detected object based on the object shape model; determination means for determining, based on the evaluation result obtained from the part relation evaluation means, whether to calculate a model parameter; model parameter calculation means for calculating the model parameter based on the result of the determination by the determination means; and updating means for updating the object shape model held in the holding means based on the model parameter calculated by the model parameter calculation means.

According to the present invention, a model specific to a target object can be created without requiring the target object to take a specific posture.

FIG. 1 is a diagram showing an example of an input image according to the first embodiment of the present invention.
FIG. 2 is a diagram showing the configuration of the information processing apparatus according to the first, second, and third embodiments of the present invention.
FIG. 3 is a diagram showing an example of a target object of the information processing apparatus according to the first embodiment of the present invention and its model.
FIG. 4 is a diagram showing an example of the structure of the model of the information processing apparatus according to the first embodiment of the present invention.
FIG. 5 is a diagram showing an example of the model of the information processing apparatus according to the first embodiment of the present invention.
FIG. 6 is a diagram showing an example of a processing stage of the information processing apparatus according to the first embodiment of the present invention.
FIG. 7 is a diagram showing the processing flow of the information processing apparatus according to the first embodiment of the present invention.
FIG. 8 is a diagram showing an example of the model of the information processing apparatus according to the second embodiment of the present invention.
FIG. 9 is a diagram showing the processing flow of the information processing apparatus according to the second embodiment of the present invention.
FIG. 10 is a diagram showing the processing flow of the information processing apparatus according to the third embodiment of the present invention.
FIG. 11 is a diagram showing an example of the hardware configuration of the information processing apparatus of the present invention.

Before describing each embodiment of the present invention, the hardware configuration on which the information processing apparatus of each embodiment is implemented will be described with reference to FIG. 11.

FIG. 11 is a hardware configuration diagram of the information processing apparatus according to the present embodiment. In the figure, a CPU 1110 performs overall control of the devices connected via a bus 1100. The CPU 1110 reads and executes processing steps and programs stored in a read-only memory (ROM) 1120. The operating system (OS), the processing programs according to the present embodiment, device drivers, and the like are stored in the ROM 1120, temporarily stored in a random access memory (RAM) 1130, and executed by the CPU 1110 as appropriate. An input I/F 1140 receives input signals from external devices (such as a display device or an operation device) in a format that the information processing apparatus 1 can process. An output I/F 1150 outputs signals to an external device (a display device) in a format that the display device can process.

Each of the functional units described here is realized by the CPU 1110 loading a program stored in the ROM 1120 into the RAM 1130 and executing processing according to the flowcharts described later. Alternatively, when hardware is used in place of software processing on the CPU 1110, arithmetic units and circuits corresponding to the processing of each functional unit described here may be configured.

(First Embodiment)
The information processing apparatus according to the present embodiment determines whether the current situation is suitable for calculating the model parameters of a human body model, which is used to calculate the on-image position and orientation of each part of a pedestrian in a 2D image, and stores the model when the situation is suitable. A "model" in this proposal may be anything that expresses the shape and posture of an object composed of multiple parts using multiple parameters, for example a joint model of connected cylinders or a three-dimensional CAD model.

An example is described below with reference to the drawings.

In FIG. 1, a person 110 is walking while facing sideways, and a person 120 is standing facing the front. The center line through the right lower leg of the person 110 is the dotted line 111, and the center line through the right upper leg is the dotted line 112. The intersection of the dotted lines 111 and 112 corresponds to the knee joint point. The length of the lower leg of the person 110 can be calculated as the distance from this intersection to the sole of the right foot. Alternatively, a corner point such as the point 113 can be regarded as the knee joint point, and the length can be calculated as the distance from the corner point to the sole of the right foot. For the person 120, on the other hand, the center lines of the right upper leg and the right lower leg both coincide with the dotted line 121, so the point corresponding to the knee joint cannot be localized. Nor is there an image feature such as a corner point, so the knee joint point cannot be localized from a corner point either. In other words, the sizes of the right upper leg and right lower leg of the person 120 cannot be estimated accurately, and it is therefore inappropriate to update the models of the right upper leg and right lower leg of the person 120.
The information processing apparatus according to the present embodiment determines, based on the estimated relative positional relationship between the right upper leg and the right lower leg, whether the situation is appropriate for a model update, and updates the model accordingly. The present embodiment is described using the right upper leg and the right lower leg as an example, but it can of course be applied to other parts as well. Accordingly, it is also possible to determine for each part, by the method described in this embodiment, whether model parameter calculation is appropriate, and to store the model parameters of only those parts that qualify.
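The geometric idea behind FIG. 1 can be sketched as follows. This is a minimal illustration, not the apparatus's actual procedure: the function names and the point-plus-direction representation of the center lines are assumptions introduced here. The knee is taken as the intersection of the two center lines, and the lower leg length as the distance from the knee to the sole of the foot. When the lines are (nearly) parallel, as for the person 120, no knee point is returned.

```python
import math

def line_intersection(p1, d1, p2, d2, eps=1e-9):
    """Intersect two 2D lines, each given as (point, direction).

    Returns the intersection point, or None when the lines are
    (nearly) parallel and the knee joint cannot be localized.
    """
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < eps:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve p1 + t * d1 = p2 + s * d2 for t.
    t = (dx * d2[1] - dy * d2[0]) / cross
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

def lower_leg_length(knee, sole):
    """Distance from the knee joint point to the sole of the foot."""
    return math.dist(knee, sole)

# Hypothetical coordinates: upper-leg line through (0, 0), lower-leg line through the sole (4, 18).
knee = line_intersection((0.0, 0.0), (1.0, 1.0), (4.0, 18.0), (3.0, -4.0))
length = lower_leg_length(knee, (4.0, 18.0)) if knee is not None else None
```

With collinear center lines, such as two vertical lines for a person seen from the front, `line_intersection` returns `None`, which corresponds to the case where the model should not be updated.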

As shown in FIG. 2, the information processing apparatus 200 according to the present embodiment includes an image input unit 201, an object detection unit 202, a posture estimation unit 203, a part relationship evaluation unit 204, a model parameter calculation unit 205, and a model parameter storage unit 206.

The image input unit 201 receives the image of each frame (real space image) sequentially sent from the imaging apparatus 100 and transfers it to the object detection unit 202 in the subsequent stage. If the output of the imaging apparatus 100 is an analog output such as NTSC, the image input unit 201 is realized by an analog video capture board. If the output of the imaging apparatus 100 is a digital output such as IEEE1394, it is realized by, for example, an IEEE1394 interface board. The image input from the image input unit 201 may be an RGB image or a grayscale image. It may be an image captured in real time or an image captured in advance.

The object detection unit 202 detects the target object in the image received from the image input unit 201.

The posture estimation unit 203 estimates the position and orientation, on the two-dimensional image, of each part of the person in the image obtained from the image input unit.

The part relationship evaluation unit 204 evaluates the relative relationship between parts from the information obtained from the posture estimation unit 203. Here, the evaluation value refers to, for example, the part likelihood, the relative angle, the cornerness of an image feature, or the similarity to a specific pattern; details are given later.

The model parameter calculation unit 205 selects, based on the evaluation value obtained by the part relationship evaluation unit 204, the parts for which model parameters are to be calculated, and calculates the respective parameters.

The model data storage unit 206 holds the human body model (object shape model) used by the posture estimation unit 203.

The processing of the information processing apparatus 200 in the present embodiment is described below with reference to the flowchart of FIG. 7.

(Step S101)
First, in step S101, one image frame is input from the image input unit 201.

(Step S102)
Next, in step S102, the object detection unit 202 detects whether a person is present in the image input in step S101. If a person is detected, the process proceeds to step S103. If not, the process returns to step S101. Because the target in the present embodiment is a person, the method described in Non-Patent Document 1 can be used as the detection method, for example. The detection method is of course not limited to this.

(Step S103)
Next, in step S103, the posture estimation unit 203 estimates the on-image position and orientation of each part of the human body detected in step S102. The estimation method is described in detail below.

In the present embodiment, the model used to estimate position and orientation is the Pictorial Structures model described in Non-Patent Document 2 and Non-Patent Document 3. FIG. 3 illustrates estimation of the position and orientation of each part of a person walking sideways using the Pictorial Structures model. The image 301 shows the sideways-walking person 110 with the bounding box of each part superimposed as dotted lines. The image 302 shows only the bounding boxes of the image 301. The bounding boxes 311, 312, 313, 314, 315, and 316 correspond to the trunk, the head, the right upper leg, the right lower leg, the left upper leg, and the left lower leg, respectively. The parts corresponding to the right arm and the left arm are omitted for simplicity, but the model can of course be extended to include them.

Each bounding box in the image 302 has a width parameter and a length parameter. When posture estimation starts, these parameters are given predetermined initial values. While the subject is walking, the information processing apparatus evaluates the relative arrangement of the lower leg and the upper leg, and when the arrangement falls within a predetermined range, it updates the width and length of the bounding boxes corresponding to the lower leg and the upper leg.

In the Pictorial Structures model, the posterior probability of the positions and orientations L = {l_i} of the parts, given an input image I, is expressed by equation (1).

Here, Ψ is the part connection score determined by the positions and orientations l_i and l_j of parts i and j, and Φ is the part likelihood, which evaluates how well part i resembles the part. E is the set of edges connecting the parts. As shown in FIG. 4, the parts in the present embodiment are connected so that the trunk is connected to the head, the right upper leg, and the left upper leg, and in addition the right upper leg is connected to the right lower leg and the left upper leg to the left lower leg. l_i consists of the center position (x_i, y_i) and the rotation angle θ_i of the bounding box corresponding to part i. The L = {l_i} that maximizes the posterior probability P(L|I) of equation (1) gives the part positions and orientations to be obtained.
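The image of equation (1) is not reproduced in this text. Assuming the standard pictorial structures factorization of Non-Patent Documents 2 and 3, and assuming that Φ and Ψ enter as multiplicative factors (the original may instead use a sum in the log domain), a form consistent with the definitions above would be:

```latex
P(L \mid I) \;\propto\; \prod_{i} \Phi(l_i) \prod_{(i,j) \in E} \Psi(l_i, l_j)
```

Under this reading, maximizing P(L|I) balances how well each bounding box matches its part against how plausibly connected parts are arranged relative to each other.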

FIG. 5 shows an example of the relative position and orientation of two adjacent bounding boxes. The point 501 belongs to the bounding box 313 corresponding to the right upper leg and is the joint point that connects to the right lower leg. The point 502 belongs to the bounding box 314 corresponding to the right lower leg and is the joint point that connects to the right upper leg. The coordinates of the points 501 and 502 are written (x_34, y_34) and (x_43, y_43), respectively. The angle 503 formed by the bounding boxes 313 and 314 is written θ_34.

The part connection score Ψ is calculated, for example, by equation (2).

In equation (2), (x_ij, y_ij) are the image coordinates of the joint point of the bounding box corresponding to part i that connects to part j, and θ_ij is the angle formed by the bounding boxes corresponding to part i and part j. κ and μ in the third term are constants.

The part likelihood Φ(l_i) of part i is calculated, for example, as follows. A bounding box of width w_i and length h_i is placed at the center position (x_i, y_i), rotated by the angle θ_i. The image inside the bounding box is cropped and resized to width w_i' and length h_i', and a feature vector is prepared by extracting the Shape Context features described in Non-Patent Document 3. Using a classifier trained in advance with a Support Vector Machine on a group of sample images of width w_i' and length h_i' corresponding to part i, the score of the feature vector is calculated and used as the part likelihood Φ(l_i).

The width w_i and length h_i of the bounding box corresponding to part i are given suitable initial values at the start of processing and, as described later, are updated while the subject is walking.

(Step S104)
Next, in step S104, the part relationship evaluation unit 204 evaluates whether the relative positional relationship between the right upper leg and the right lower leg is in an appropriate state. The evaluation method is described in detail below.

Assume that the posture estimation unit 203 has calculated the L = {l_i} that maximizes the posterior probability of equation (1). First, the part relationship evaluation unit 204 determines whether the part likelihoods Φ(l_3) and Φ(l_4) of the target parts each exceed a predetermined threshold. If both Φ(l_3) and Φ(l_4) exceed the threshold, the part relationship evaluation unit 204 further evaluates the relative angle θ_34 between the target parts. For example, when the relative angle θ_34 is at least 45 degrees and less than 135 degrees, the relative angle is judged to be appropriate in step S105, described later. When the relative angle is judged to be appropriate, the model parameters are calculated in step S106, and the width and length of the bounding boxes of the right upper leg and the right lower leg are stored in step S107.
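The two-stage check described above can be sketched as a small predicate. This is an illustrative sketch only: the function name and the default likelihood threshold of 0.5 are assumptions introduced here, while the 45 to 135 degree range is the example given in the text.

```python
def is_suitable_for_update(phi_upper, phi_lower, theta_deg,
                           phi_threshold=0.5,
                           theta_min=45.0, theta_max=135.0):
    """Return True when both part likelihoods exceed the threshold
    and the relative angle lies in [theta_min, theta_max).

    phi_threshold is a hypothetical value; the text only says the
    threshold is predetermined.
    """
    if phi_upper <= phi_threshold or phi_lower <= phi_threshold:
        return False
    return theta_min <= theta_deg < theta_max
```

A bent knee seen from the side (for example a relative angle of 90 degrees with both likelihoods high) passes the check, whereas a low likelihood for either part or a relative angle outside the range does not.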

The part relationship evaluation unit 204 can also evaluate the part relationship using image features of a region that includes the target parts. An example is shown with reference to FIG. 6. The partial image 601 in FIG. 6 shows the area around the right knee. The center of the frame used to crop the partial image can be set, for example, to the midpoint between two positions estimated by the posture estimation unit 203: the joint point of the right upper leg bounding box that connects to the right lower leg, and the joint point of the right lower leg bounding box that connects to the right upper leg. The size of the cropped partial image can be set, for example, in proportion to the size of the right upper leg bounding box. The partial image 601 is evaluated with the evaluation value Λ expressed by the following equation (3).
Λ = det(A) − k·trace(A)^2    (3)
Here, A is the second moment matrix represented by equation (4).
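The image of equation (4) is not reproduced in this text. Assuming the usual definition of the second moment matrix used in Harris-style corner measures, with the weights and luminance gradients named in the surrounding description, it would read:

```latex
A = \sum_{(u,v)} w(u,v)
\begin{pmatrix}
I_x^2 & I_x I_y \\
I_x I_y & I_y^2
\end{pmatrix}
```

where the sum runs over the pixels (u, v) of the partial image and I_x, I_y are evaluated at (u, v).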

In equation (4), w(u, v) is the weight at (u, v), and I_x and I_y are the luminance gradient values in the x and y directions at (u, v), respectively. When the value of Λ is within a predetermined range, the width w_RU and length h_RU of the right upper leg bounding box are stored. The width w_RL and length h_RL of the right lower leg bounding box are stored in the same way. The evaluation value Λ of equation (3) is, in general, an image feature for evaluating the luminance gradient directions in the partial image. Other image features may be used in place of Λ, for example the value of an image moment in the partial image or the determinant of the Hessian matrix; the evaluation is not limited to the one in this embodiment.
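The evaluation value Λ of equation (3) can be sketched in plain Python as follows. Several details are assumptions made for illustration: uniform weights w(u, v) = 1, central differences for the luminance gradients, k = 0.04 (a commonly used value, not stated in the text), and a partial image given as a nested list of grayscale values.

```python
def corner_response(patch, k=0.04):
    """Compute det(A) - k * trace(A)**2 for a grayscale patch.

    patch is a 2D list of numbers. A is accumulated over interior
    pixels with uniform weights (assumption) and central-difference
    gradients (assumption).
    """
    height, width = len(patch), len(patch[0])
    sxx = sxy = syy = 0.0
    for v in range(1, height - 1):
        for u in range(1, width - 1):
            ix = (patch[v][u + 1] - patch[v][u - 1]) / 2.0
            iy = (patch[v + 1][u] - patch[v - 1][u]) / 2.0
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# A bright lower-right quadrant forms a corner; identical rows form a straight edge.
corner = [[1 if (v >= 2 and u >= 2) else 0 for u in range(5)] for v in range(5)]
edge = [[0, 0, 1, 1, 1] for _ in range(5)]
```

In this sketch a corner-like patch (gradients in two directions, as around a bent knee) gives a positive response, while a straight edge (gradients in one direction only, as for a straight leg) gives a negative one.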

The part relationship evaluation unit 204 can also perform the evaluation by comparing the partial image 601 of FIG. 6 with a specific pattern. As described in Non-Patent Document 4, patterns of partial images that contain parts in a specific arrangement may be learned in advance, and the similarity between such a pattern and the partial image 601 may be calculated and used as the evaluation value. In that case, the model is updated in step S106 when the similarity is at or above a predetermined level.

(Step S105)
Next, in step S105, based on the evaluation result of step S104, the process proceeds to step S106 if the model parameters need to be calculated. If they do not need to be calculated, the process returns to step S101.

The method of deciding, based on the evaluation result of step S104, whether to calculate the model parameters is as follows. One approach is to calculate the model parameters when the evaluation value exceeds a predetermined threshold. As noted above, various evaluation values are possible, such as the relative angle, the cornerness of the partial image, or an image pattern, and they are not limited to those listed here. The threshold used in step S105 therefore depends on the evaluation value. Another approach is for the model update unit 205 to compare the current evaluation value of the part relationship evaluation unit 204 with past evaluation values and to calculate the model parameters when the current value is higher. The model can then be updated when the part length can be calculated more accurately than in the past history. Furthermore, when the similarity to a specific pattern is used as the evaluation value, the past history of that similarity can be stored, and the model parameters can be calculated when the similarity is higher than in the past history.
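The two decision rules described above can be sketched as follows. The function names are introduced here for illustration, and the interpretation of "higher than the past evaluation values" as "higher than the best past value" is an assumption.

```python
def exceeds_threshold(value, threshold):
    """Rule 1: calculate model parameters when the evaluation value
    exceeds a predetermined threshold."""
    return value > threshold

def improves_on_history(value, history):
    """Rule 2: calculate model parameters when the current evaluation
    value is higher than every past value (assumed interpretation).
    An empty history always allows the calculation."""
    return not history or value > max(history)

# Hypothetical similarity values observed over successive frames.
history = []
accepted = []
for similarity in [0.4, 0.7, 0.6, 0.9]:
    if improves_on_history(similarity, history):
        accepted.append(similarity)
    history.append(similarity)
```

In this example only the frames whose similarity beats all earlier frames trigger a parameter calculation.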

(Step S106)
Next, in step S106, the model update unit 205 calculates the model parameters.

In the present embodiment, the width and length of the bounding box corresponding to the right upper leg and the width and length of the bounding box corresponding to the right lower leg are calculated. For example, the model of the right upper leg is updated as follows. With the position and orientation l_RU of the bounding box corresponding to the right upper leg held fixed, the width w_RU and length h_RU are varied, and the width w_RU and length h_RU that maximize the part likelihood Φ(l_RU) are calculated. The calculated width w_RU and length h_RU become the new width and length of the right upper leg.
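The search described above can be sketched as an exhaustive search over candidate sizes. The text does not say how the width and length are varied, so the grid search, the function name `fit_box_size`, and the `likelihood(pose, width, length)` callable are all assumptions made for illustration; the toy likelihood stands in for the trained part classifier.

```python
def fit_box_size(likelihood, pose, widths, lengths):
    """Return the (width, length) pair that maximizes the part
    likelihood while the pose (position and orientation) stays fixed.

    likelihood is a hypothetical callable standing in for Phi.
    """
    best = None
    for w in widths:
        for h in lengths:
            score = likelihood(pose, w, h)
            if best is None or score > best[0]:
                best = (score, w, h)
    if best is None:
        raise ValueError("no candidate sizes given")
    return best[1], best[2]

# Toy likelihood peaking at width 12 and length 40 (illustrative only).
def toy_likelihood(pose, w, h):
    return -((w - 12) ** 2 + (h - 40) ** 2)

pose_ru = (100.0, 200.0, 30.0)  # hypothetical (x, y, theta) of the right upper leg box
new_w, new_h = fit_box_size(toy_likelihood, pose_ru, range(8, 17, 2), range(30, 51, 5))
```

A finer grid or a local optimizer could replace the nested loops; the point is only that the pose is fixed and the size alone is optimized.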

(Step S107)
Next, in step S107, the model update unit 205 stores the model parameters calculated in step S106. After the model parameters are stored, the process returns to step S101.

More specifically, for example, the value stored in the model storage unit 206 is replaced with the parameter value newly calculated in step S106. The model update unit 205 can also use the history of past evaluation values and the corresponding model parameter values. For example, let the evaluation values (pattern similarities) at the past four updates be α_{t−1}, α_{t−2}, α_{t−3}, and α_{t−4}, and suppose that a new evaluation value α_t is obtained. Let the model parameters that maximize the part likelihood Φ(l_RU) at each update be p_t, p_{t−1}, p_{t−2}, p_{t−3}, and p_{t−4}, where p = (w_RU, h_RU). The parameter value is then updated to p as given by the update expression. This has the effect that, without depending on the past history, only information obtained when the accuracy is higher is used for the update. In addition, because the model update unit 205 updates the model with weights assigned according to the evaluation values of the part relationship evaluation unit 204, the model can be updated in accordance with the reliability of the calculated part parameters.
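The update expression itself is not reproduced in this text. One form consistent with the description, in which each parameter set is weighted by its evaluation value, is a normalized weighted average. This is an assumption, not the confirmed formula; the function name and the sample values are hypothetical.

```python
def weighted_update(alphas, params):
    """Weighted average of (width, length) pairs, weighted by the
    evaluation values alphas (assumed form of the update expression).
    """
    total = sum(alphas)
    if total <= 0:
        raise ValueError("evaluation values must have a positive sum")
    width = sum(a * p[0] for a, p in zip(alphas, params)) / total
    length = sum(a * p[1] for a, p in zip(alphas, params)) / total
    return (width, length)

# Hypothetical history: three updates with evaluation values 1, 1, 2.
p_new = weighted_update([1.0, 1.0, 2.0],
                        [(10.0, 40.0), (12.0, 44.0), (14.0, 48.0)])
```

Parameter sets obtained with a higher evaluation value pull the stored width and length more strongly toward themselves.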

In this specification, a "parameter relating to a part" or "part parameter" may be any parameter that defines the shape of a part of the object, for example the length of a line segment, the length or diameter of a cylinder, the length of each axis of an ellipsoid, or the length of each side of a rectangular parallelepiped.

As described above, in the present embodiment, evaluating the relative position and orientation of the right upper leg and the right lower leg makes it possible to determine whether the timing is suitable for accurately calculating the lengths of the parts.

(Second Embodiment)
The information processing apparatus according to the present embodiment takes a distance image and an RGB image as input, and updates the model for estimating the posture of a person in the image while the posture of the target person is being estimated.

The information processing apparatus according to the present embodiment has the same configuration as the information processing apparatus of the first embodiment.

The images input from the image input unit 201 are distance image data and RGB data. The distance image data may be obtained by any distance image acquisition method, such as stereo, pattern light projection, or range scanning. The distance images may be captured in real time or captured in advance. The input distance image has a size of, for example, 320 × 240 pixels, and each pixel stores the distance from the imaging device. The RGB data has a resolution equal to or higher than that of the distance image data and is assumed to have been corrected so that it corresponds to the distance image.

The object detection unit 202 extracts person candidates from the distance image obtained from the image input unit 201.

The posture estimation unit 203 uses the distance image and the human body model of FIG. 8 to sequentially estimate the posture of the human body in the human body candidate region of the distance image obtained by the object detection unit.

The model update unit 205 updates, according to the evaluation by the part relationship evaluation unit 204, the parameters of the model used by the posture estimation unit 203, in which the human body is approximated by cylinders. The model storage unit 206 stores the human body model used by the posture estimation unit 203.

The processing of the information processing apparatus 200 in the present embodiment is described below with reference to the flowchart of FIG. 9.

(Step S201)
First, in step S201, the image input unit 201 inputs one frame of an image.

(Step S202)
Subsequently, in step S202, the object detection unit 202 detects whether a person is present in the image input in step S201. If a person is detected, the process proceeds to step S203. If not, the process returns to step S201.

First, the object detection unit 202 detects a moving object in the distance image input from the image input unit 201. The moving object is detected by background difference and inter-frame difference. Next, for the region of the RGB image corresponding to the detected moving object region, the orientation of the human body is identified by, for example, the method described in Non-Patent Document 5, and the three-dimensional position and orientation of each part are then estimated with a regression function learned in advance. The detected position and the estimated posture are used as the initial position and initial posture of the target human body. If the position and orientation of the human body were calculated in the previous frame, they can be used as the initial position and initial posture instead.
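The moving-object detection by background difference and inter-frame difference can be sketched as follows. The text does not state how the two differences are combined or which threshold is used, so the logical OR, the threshold of 50, and the function name `moving_mask` are assumptions made for illustration.

```python
def moving_mask(depth, prev_depth, background, thresh=50):
    """Return a boolean mask of pixels regarded as moving.

    A pixel is marked when its depth differs from the background model OR
    from the previous frame by more than `thresh` (same unit as the depth
    values). Combining the two cues with OR is an assumption of this sketch.
    """
    mask = []
    for row, prev_row, bg_row in zip(depth, prev_depth, background):
        mask.append([
            abs(d - b) > thresh or abs(d - p) > thresh
            for d, p, b in zip(row, prev_row, bg_row)
        ])
    return mask

# One-row toy frames: only the last pixel moved closer to the camera.
background = [[1000, 1000, 1000]]
prev_depth = [[1000, 990, 1000]]
depth = [[1000, 1000, 600]]
mask = moving_mask(depth, prev_depth, background)
```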

(Step S203)
Subsequently, in step S203, it is checked whether a position and orientation estimation result exists for the previous frame. If a result exists, the process proceeds to step S205. If not, the process proceeds to step S204.

(Step S204)
In step S204, the posture estimation unit 203 estimates the initial posture of the human body detected in step S202. After step S204 is executed, the process returns to step S201.

(Step S205)
In step S205, the posture estimation unit 203 estimates the current position and orientation of the human body using the current human body model and the position and orientation estimation result obtained in the previous frame.

The human body model is a three-dimensional extension of the Pictorial Structures model. The parts are a torso 711, a head 712, a right upper arm 713, a right forearm 714, a left upper arm 715, a left forearm 716, a right upper leg 717, a right lower leg 718, a left upper leg 719, and a left lower leg 720. The model of each part is a cylindrical mesh with a radius and a height as parameters. Each cylinder can be rotated and translated. The elements of the position and orientation l_i of part i are the position (x_i, y_i, z_i) and the rotation (α_i, β_i, γ_i).

First, the posture estimation unit 203 places the model, set to the initial posture obtained by the processing of the object detection unit 202, in the coordinate space of the point cloud data so that the central axis of the torso part of the model coincides with the central axis of the point cloud data that forms the human body candidate region in the depth image.

The position and orientation of each part are obtained by finding L = {l_i} that maximizes the posterior probability of Expression (1). Here, however, the part connection score Ψ is extended to three dimensions, as in the following Expression (6).

Here, φ_ij is the angle formed by the central axis of the cylinder corresponding to part i and the central axis of the cylinder corresponding to part j. The part likelihood Φ(l_i) is calculated using the following Expression (7).

Here, m_i(k) is the k-th point of the cylindrical mesh of the model corresponding to part i, and p_k is a point coordinate in three-dimensional space obtained from the distance image. The translation vector t_i and the rotation matrix R_i correspond to the position (x_i, y_i, z_i) and the orientation (α_i, β_i, γ_i) of part i.
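Expression (7) is not reproduced in this text, so the following is only a sketch of how a part likelihood could be built from the quantities just defined. It assumes that each mesh point m_i(k) is transformed by R_i and t_i, matched to its nearest observed point, and that the likelihood decays exponentially with the mean squared distance. The nearest-point matching, the exponential form, and the scale `sigma` are all assumptions.

```python
import math

def transform(R, t, m):
    """Apply rotation matrix R (3x3 nested lists) and translation t to point m."""
    return tuple(sum(R[r][c] * m[c] for c in range(3)) + t[r] for r in range(3))

def part_likelihood(mesh, cloud, R, t, sigma=0.05):
    """Score in (0, 1]; 1 means every transformed mesh point hits an observed point."""
    total = 0.0
    for m in mesh:
        q = transform(R, t, m)
        # Squared distance from the transformed mesh point to its nearest observed point.
        total += min(sum((a - b) ** 2 for a, b in zip(q, p)) for p in cloud)
    return math.exp(-total / (len(mesh) * sigma ** 2))

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
mesh = [(0, 0, 0), (0, 0, 1)]
cloud = [(0, 0, 0), (0, 0, 1)]
aligned = part_likelihood(mesh, cloud, identity, (0, 0, 0))
shifted = part_likelihood(mesh, cloud, identity, (0.1, 0, 0))
```

A perfectly aligned part scores 1, and the score drops as the part is moved away from the observed points.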

The part relationship evaluation unit 204 evaluates the relative position and orientation of the parts obtained by the posture estimation unit 203 in the same manner as in the first embodiment. For example, when the evaluation value for the right upper leg and the right lower leg is appropriate, the model parameter calculation unit 205, described later, calculates the model parameters of the right upper leg and of the right lower leg.

(Step S206)
Subsequently, in step S206, the part relationship evaluation unit 204 evaluates whether the relative positional relationship between the right upper leg and the right lower leg is in an appropriate state.
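One way to evaluate the relative positional relationship is to look at the angle between the central axes of the two cylinders. The sketch below assumes that a state is "appropriate" when that angle is close to a target value. The 90° target, the 10° tolerance, and the function names are hypothetical choices for illustration.

```python
import math

def axis_angle_deg(u, v):
    """Angle in degrees between two 3D axis vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

def relation_ok(upper_axis, lower_axis, target_deg=90.0, tol_deg=10.0):
    """True when the angle between the two axes is within tol_deg of target_deg."""
    return abs(axis_angle_deg(upper_axis, lower_axis) - target_deg) <= tol_deg

bent = relation_ok((0, 0, 1), (0, 1, 0))       # axes at right angles
straight = relation_ok((0, 0, 1), (0, 0, 1))   # axes parallel
```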

The part relationship evaluation unit 204 may also evaluate image features of a local distance image that includes the right upper leg and the right lower leg. For example, a distance image of the local region corresponding to the right knee is cut out, the distance values are treated as luminance values, and the cornerness is evaluated by Expression (3). If the cornerness is at or above a certain value, the model parameters of the right upper leg and the right lower leg are calculated. When the knee is fully extended, the object in the local image is straight, and the cornerness index, which is based on the second moment matrix of the gradient of the distance image, is low.
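Expression (3) is not shown in this part of the text. As an illustration of a cornerness index based on the second moment matrix of the gradient, the sketch below uses the Harris response det(M) − k·trace(M)², which is a swapped-in, commonly used measure and not necessarily the one in Expression (3). The constant k = 0.04 and the unweighted sum over the patch are assumptions.

```python
def cornerness(patch, k=0.04):
    """Harris-style response of a small depth patch (list of equal-length rows)."""
    sxx = syy = sxy = 0.0
    for r in range(1, len(patch) - 1):
        for c in range(1, len(patch[0]) - 1):
            ix = (patch[r][c + 1] - patch[r][c - 1]) / 2.0  # horizontal gradient
            iy = (patch[r + 1][c] - patch[r - 1][c]) / 2.0  # vertical gradient
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# A bent shape (corner) versus a straight edge, as 5x5 toy patches.
corner = [[1 if r >= 2 and c >= 2 else 0 for c in range(5)] for r in range(5)]
edge = [[1 if c >= 2 else 0 for c in range(5)] for r in range(5)]
corner_score = cornerness(corner)
edge_score = cornerness(edge)
```

The straight edge has gradients in one direction only, so its response is negative, while the bent shape yields a positive response. This mirrors the remark that a fully extended knee gives a low cornerness.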

As another method, the pattern of the distance image of the local region corresponding to the right knee may be learned in advance, and the model parameters of the right upper leg and the right lower leg calculated when the pattern matches. First, a three-dimensional curved surface of the knee at a knee joint angle of 90° is created in advance. The surface can be created, for example, from a CG character, or obtained by capturing a distance image of the corresponding part of an actual human body and then calculating a surface that approximates the point cloud of that distance image. In the region that includes the lower end of the right upper leg and the upper end of the right lower leg, the three-dimensional knee surface for a knee joint angle of 90° is fitted, and if the error after fitting is lower than a threshold, the model parameters of the right upper leg and the right lower leg are stored. Similarly, for the right upper arm and the right forearm, a three-dimensional elbow surface for an elbow joint angle of 90° can be fitted, and if the error after fitting is lower than a threshold, the model parameters of the right upper arm and the right forearm can be stored.

(Step S207)
Subsequently, in step S207, based on the evaluation result of step S206, the process proceeds to step S208 if the model parameters need to be calculated. If they do not, the process returns to step S201.

(Step S208)
Subsequently, in step S208, the model parameter calculation unit 205 calculates the radius and length of the cylinder model corresponding to the right upper leg and the radius and length of the cylinder model corresponding to the right lower leg.

For example, for the model of the right upper leg, the position and orientation parameters are held fixed, the part likelihood given by Expression (7) is calculated for a plurality of cylinder models with different radii and lengths, and the cylinder whose radius and length give the largest part likelihood is adopted as the new model.
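The selection among candidate cylinders can be sketched as below. Since Expression (7) is not reproduced here, the sketch substitutes a symmetric mean squared nearest-point distance and picks the cylinder that minimizes it, which corresponds to maximizing any likelihood that decreases with that distance. The sampling of the cylinder surface, the candidate lists, and all function names are assumptions.

```python
import math
from itertools import product

def cylinder_points(radius, height, n_ring=8, n_rows=5):
    """Sample points on the side surface of a cylinder aligned with the z axis."""
    pts = []
    for j in range(n_rows):
        z = height * j / (n_rows - 1)
        for i in range(n_ring):
            a = 2 * math.pi * i / n_ring
            pts.append((radius * math.cos(a), radius * math.sin(a), z))
    return pts

def mean_sq_dist(model, cloud):
    """Symmetric mean squared nearest-point distance between two point sets."""
    def one_way(src, dst):
        return sum(
            min(sum((a - b) ** 2 for a, b in zip(s, d)) for d in dst) for s in src
        ) / len(src)
    return one_way(model, cloud) + one_way(cloud, model)

def fit_cylinder(cloud, radii, heights):
    """Return the (radius, height) whose cylinder best matches the cloud; pose is fixed."""
    return min(
        product(radii, heights),
        key=lambda rh: mean_sq_dist(cylinder_points(*rh), cloud),
    )

# Synthetic observation: points expressed in the part's own (fixed) frame.
observed = cylinder_points(0.08, 0.40)
best = fit_cylinder(observed, [0.06, 0.08, 0.10], [0.30, 0.40, 0.50])
```

The distance is made symmetric so that a cylinder that is too short or too thin is penalized as well as one that is too large.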

(Step S209)
Subsequently, in step S209, the model parameter calculation unit 205 updates the model stored in the model parameter storage unit 206 using the model parameters calculated in step S208. After the model is updated, the process returns to step S201.
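The flow of steps S201 to S209 for a single frame can be summarized in code. The sketch below only mirrors the branching of the flowchart. The dictionary of callables, the `state` layout, and the returned step labels are hypothetical devices for illustration, and the actual detection, tracking, and fitting routines are stubbed out.

```python
def run_frame(frame, state, steps):
    """Process one input frame (S201) and return the label of the last step reached."""
    if not steps["detect"](frame):                      # S202: no person detected
        return "S201"
    if state.get("pose") is None:                       # S203: no previous result
        state["pose"] = steps["init_pose"](frame)       # S204: initial posture
        return "S204"
    state["pose"] = steps["track"](frame, state["pose"], state["model"])  # S205
    if not steps["relation_ok"](state["pose"]):         # S206, S207
        return "S207"
    state["model"] = steps["fit"](frame, state["pose"])  # S208, S209
    return "S209"

# Stub routines: the "pose" is just a counter that becomes suitable at 2.
steps = {
    "detect": lambda frame: frame is not None,
    "init_pose": lambda frame: 0,
    "track": lambda frame, pose, model: pose + 1,
    "relation_ok": lambda pose: pose >= 2,
    "fit": lambda frame, pose: {"radius": 0.08, "length": 0.40},
}
state = {"pose": None, "model": {"radius": 0.10, "length": 0.50}}
trace = [run_frame(f, state, steps) for f in ["f0", "f1", "f2"]]
```

In this toy run the model is left untouched until the evaluated relationship becomes suitable, and only then are the stored parameters replaced.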

As described above, the second embodiment shows that the same effect as described in the first embodiment is obtained when three-dimensional human body position and orientation estimation is performed using a distance image and an RGB image (two-dimensional image).

(Third Embodiment)
The first and second embodiments target the human body, but the information processing apparatus according to the present invention can also target objects other than the human body. The present embodiment targets quadrupedal animals such as horses and dogs. These animals share the same part configuration and the same connection relationships between parts, but parameters such as the length of each part vary greatly with the type of animal. Therefore, the object detection unit determines the type of the target object, and the posture of the target object is estimated and the model parameters are stored using a model that corresponds to the determination result. In the present embodiment, as in the first and second embodiments, the model parameters of the right upper leg and the right lower leg are stored.

The present embodiment is realized with the same configuration as the second embodiment. Only the modules that differ from the second embodiment are described below.

The object detection unit 202 in the present embodiment detects the target animal. The target animals are, for example, horses, dogs, and cats. The model also differs for each type within each animal.

The model storage unit 206 stores a model for each target animal and each of its types. Each part of a model is represented by a cylinder whose parameters are a radius and a length.

The processing of the information processing apparatus 200 in the present embodiment is described below with reference to the flowchart of FIG. 10.

(Step S301)
First, in step S301, the image input unit 201 inputs one frame of an image.

(Step S302)
Subsequently, in step S302, the object detection unit 202 detects whether a target object is present in the image input in step S301. If one is detected, the process proceeds to step S303. If not, the process returns to step S301.

The object detection unit 202 first performs moving object extraction, as in the second embodiment, and extracts a candidate region of the target animal from the distance image input from the image input unit 201. Next, it extracts the partial image of the RGB image that corresponds to the extracted candidate region. It then determines which type of which animal the extracted partial image matches. The determination uses a multi-class detector, prepared in advance, in which each class is assigned to a different type. Furthermore, as in the second embodiment, the orientation of the target and the position of each part in image coordinates are calculated, and the initial position and orientation of the target are calculated from them.

(Step S303)
Subsequently, in step S303, it is checked whether a position and orientation estimation result exists for the previous frame. If a result exists, the process proceeds to step S305. If not, the process proceeds to step S304.

(Step S304)
In step S304, the posture estimation unit 203 estimates the initial posture of the object detected in step S302 and selects, from the model storage unit 206, the model for three-dimensional posture estimation that corresponds to the object. After step S304 is executed, the process returns to step S301.

The posture estimation unit 203 estimates the posture of the target in the same manner as in the second embodiment. The model used for the posture estimation is selected from the model storage unit 206 according to the type determined by the object detection unit 202.
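The per-type model selection can be pictured as a lookup keyed by animal and type. The sketch below is purely illustrative: the keys, part names, and numeric values are hypothetical placeholders, not data from the specification.

```python
from dataclasses import dataclass

@dataclass
class Cylinder:
    radius: float  # metres (hypothetical unit)
    length: float

# Hypothetical contents of the model store: one entry per (animal, type).
MODEL_STORE = {
    ("dog", "small"): {
        "right_upper_leg": Cylinder(0.03, 0.15),
        "right_lower_leg": Cylinder(0.02, 0.15),
    },
    ("horse", "large"): {
        "right_upper_leg": Cylinder(0.12, 0.60),
        "right_lower_leg": Cylinder(0.06, 0.70),
    },
}

def select_model(animal, kind):
    """Return the part models stored for the detected animal and type."""
    return MODEL_STORE[(animal, kind)]

model = select_model("dog", "small")
```

Keeping one entry per type means that a later parameter update only changes the model of the animal actually observed.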

(Step S305)
In step S305, the posture estimation unit 203 estimates the current position and orientation of the object using the current object model and the position and orientation estimation result obtained in the previous frame.

(Step S306)
Subsequently, in step S306, the part relationship evaluation unit 204 evaluates whether the relative positional relationship between the right upper leg and the right lower leg is in an appropriate state.

(Step S307)
Subsequently, in step S307, based on the evaluation result of step S306, the process proceeds to step S308 if the model parameters need to be calculated. If they do not, the process returns to step S301.

(Step S308)
Subsequently, in step S308, the model parameter calculation unit 205 calculates the radius and length of the cylinder model corresponding to the right upper leg and the radius and length of the cylinder model corresponding to the right lower leg.

(Step S309)
Subsequently, in step S309, the model parameter calculation unit 205 uses the model parameters obtained in step S308 to update the model parameters of the corresponding object stored in the model parameter storage unit 206. After the model parameters are stored, the process returns to step S301.

As described above, in the third embodiment the present invention is applied to objects other than the human body, such as horses, dogs, and cats. In this way, the same method as shown in the first and second embodiments is applicable to any deformable object composed of one or more parts.

(Other Embodiments)
The present invention can also be realized by executing the following processing: software (a program) that implements the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads and executes the program.

DESCRIPTION OF SYMBOLS
200 Information processing apparatus
201 Image input unit
202 Object detection unit
203 Posture estimation unit
204 Part relationship evaluation unit
205 Model update unit
206 Model storage unit

Claims

An information processing apparatus comprising:
image input means for inputting an image;
object detection means for detecting an object based on the image input by the image input means;
holding means for holding an object shape model for estimating the position and orientation of the object;
part relationship evaluation means for evaluating a relationship between parts of the detected object based on the object shape model;
determination means for determining, based on the evaluation result obtained from the part relationship evaluation means, whether to calculate a model parameter;
model parameter calculation means for calculating the model parameter based on the result of the determination by the determination means; and
updating means for updating the object shape model held by the holding means based on the model parameter calculated by the model parameter calculation means.

The information processing apparatus according to claim 1, wherein the image input to the image input means is at least one of a distance image and a two-dimensional image.

The information processing apparatus according to claim 1, wherein the part relationship evaluation means evaluates a relative position and orientation of a first part and a second part of the object.

The information processing apparatus according to claim 3, wherein the part relationship evaluation means evaluates an angle formed by the first part and the second part.

The information processing apparatus according to claim 1, wherein the part relationship evaluation means evaluates image features of an image including a first part and an image including a second part of the object.

The information processing apparatus, wherein the part relationship evaluation means evaluates the degree of similarity between an image learned in advance and the image including the first part and the image including the second part of the object.

The information processing apparatus according to claim 1, wherein the updating means compares the evaluation of the part relationship evaluation means with a past evaluation.

The information processing apparatus according to claim 1, wherein the updating means compares the evaluation of the part relationship evaluation means with a predetermined level.

The information processing apparatus according to claim 1, wherein the updating means updates the object shape model by applying, to the value calculated by the part relationship evaluation means, a weight according to the evaluation of the part relationship evaluation means.

An information processing method comprising:
an image input step of inputting an image;
an object detection step of detecting an object based on the input image;
a part relationship evaluation step of evaluating a relationship between parts of the detected object based on an object shape model, held by holding means, for estimating the position and orientation of the object;
a determination step of determining, based on the evaluation result obtained in the part relationship evaluation step, whether to calculate a model parameter;
a model parameter calculation step of calculating the model parameter based on the determination result of the determination step; and
an updating step of updating the object shape model held by the holding means based on the model parameter calculated in the model parameter calculation step.

A program for causing a computer to execute the information processing method according to claim 10.