JP2013120497A

JP2013120497A - Method for extracting silhouette and silhouette extraction system

Info

Publication number: JP2013120497A
Application number: JP2011268286A
Authority: JP
Inventors: Katsuhiko Ueda; 勝彦植田; Yoshiaki Shirai; 良明白井; Nobutaka Shimada; 伸敬島田
Original assignee: Sumitomo Rubber Industries Ltd; Dunlop Sports Co Ltd
Current assignee: Sumitomo Rubber Industries Ltd; Dunlop Sports Co Ltd
Priority date: 2011-12-07
Filing date: 2011-12-07
Publication date: 2013-06-17
Anticipated expiration: 2031-12-07
Also published as: JP5816068B2

Abstract

PROBLEM TO BE SOLVED: To provide a silhouette extraction method capable of extracting the silhouette of a person in each frame of a moving image. SOLUTION: The silhouette extraction method extracts frames from a moving image in which a person performing a swing action is photographed together with the background, classifies each pixel as either the person or the background, and extracts the silhouette of the person. The method includes: a feature point extraction step S102 of performing image processing on an address frame and extracting a plurality of feature points of the person and a golf club; a step S103 of setting a mask region that contains the person and the golf club together with a small margin of the surrounding background; a step S104 of setting, among the pixels included in the mask region, a human body zone that has a high probability of constituting the silhouette of the person; a step S105 of determining a human body section, which is the range of histogram classes representing the human body, on the basis of the histograms of the pixels of the human body zone and the color information of the pixels of the human body zone in the address frame; and a step S106 of determining the human body section in the histograms of the pixels outside the human body zone on the basis of the determined human body section.

Description

The present invention relates to a silhouette extraction method and a silhouette extraction system that extract the silhouette of a person by classifying each pixel of a frame as either the person or the background, the frame being extracted from a moving image in which a person performing a swing motion with a golf club is photographed together with the background.

Acquiring a proper golf swing is important for improving golf skill. To this end, a golf swing is recorded on video and the swing is diagnosed from that video. In classic swing diagnosis, a teaching professional or the like points out problems in the swing on the basis of the video. Attempts have also been made to diagnose the swing by applying image processing to the video.

With image processing, pixels showing the person are distinguished from pixels showing the background, so that the silhouette of the person can be extracted. The present applicant has already proposed Patent Document 1 below as a technique for performing such processing accurately.

Japanese Unexamined Patent Application Publication No. 2011-78069 (JP 2011-78069 A)

Although the above patent document achieves a reasonable degree of accuracy, some problems have been pointed out. Specifically, when the person and the background have similar colors, the two cannot be distinguished well.

The present invention has been devised in view of the above problem. Its main object is to provide a silhouette extraction method and a silhouette extraction system that can distinguish the person from the background well even when the two have similar colors.

The invention according to claim 1 is a method of extracting frames, each an aggregate of pixels, from a moving image in which a person performing a swing motion with a golf club is photographed together with a background, and of extracting the silhouette of the person by identifying each pixel of the extracted frames as either the person or the background, the method comprising: a step of creating, for each pixel, an all-frame set consisting of all the frames; a step of creating, for each all-frame set, a histogram whose frequency is the number of frames and whose class is color information; a step of extracting from the moving image an address frame in which the address state of the person is captured; a feature point extraction step of performing image processing on the address frame and extracting feature points at a plurality of predetermined locations on the person and the golf club; a step of setting, on the basis of the feature points, a mask region that contains the person and the golf club and also includes part of the surrounding background; a step of setting, on the basis of the feature points, a human body zone that is a region, among the pixels included in the mask region, with a high probability of constituting the silhouette of the person; a human body section determination step of determining a human body section, which is the range of classes in the histogram representing the human body, on the basis of the histograms of the pixels of the human body zone and the color information of the pixels of the human body zone in the address frame; a human body section propagation step of determining, on the basis of the determined human body section, a human body section in the histograms of pixels outside the human body zone; a background section determination step of determining a background section, which is the range of classes in the histogram representing the background, on the basis of the histograms of the pixels of a non-mask region outside the mask region and the color information of the pixels of the non-mask region in the address frame; and a background section propagation step of determining, on the basis of the determined background section, a background section in the histograms of pixels inside the mask region.

In the invention according to claim 2, the histograms include a first histogram that covers all pixels and whose class is luminance, a second histogram that covers chromatic pixels and whose class is hue, and a third histogram that covers achromatic pixels and whose class is luminance.

In the invention according to claim 3, the feature points include positions representing the top of the head, forehead, back, waist, back of the knee, heel, toe, thigh, and hands of the human body in the address state, the head of the golf club, and the shoulder.

In the invention according to claim 4, the step of setting the mask region includes a step of setting an initial mask region by connecting the feature points, and a step of setting the mask region by expanding the initial mask region by a predetermined thickness.

In the invention according to claim 5, the step of setting the human body zone includes a step of setting, on the basis of the feature points, reference points at positions that are inside the initial mask region and clearly belong to the human body, and a step of determining the human body zone using the reference points.

The invention according to claim 6 further includes a step of identifying, among the pixels of the frame, the pixels that show a support pillar of a golf practice range, and a step of determining a pillar section, which is the range of classes showing the pillar, on the basis of the pixels showing the pillar and their histograms, wherein the background section propagation step skips the pillar section.

The invention according to claim 7 further includes a step of checking the consistency of the background section and the human body section after both have been determined.

The invention according to claim 8 is a system that extracts frames, each an aggregate of pixels, from a moving image in which a person performing a swing motion with a golf club is photographed together with a background, and extracts the silhouette of the person by identifying each pixel of the extracted frames as either the person or the background. The system comprises a camera that captures the moving image, a memory that stores the captured moving image, and a calculation unit. The calculation unit includes a set creation unit that creates, for each pixel, an all-frame set consisting of all the frames, and a histogram creation unit that creates, for each all-frame set, a histogram whose frequency is the number of frames and whose class is color information. The calculation unit further has a determination unit that includes: an address frame extraction unit that extracts from the moving image an address frame in which the address state of the person is captured; a feature point extraction unit that performs image processing on the address frame and extracts feature points at a plurality of predetermined locations on the person and the golf club; a mask region setting unit that sets, on the basis of the feature points, a mask region that contains the human body and the golf club together with a small margin of the surrounding background; a human body zone setting unit that sets, on the basis of the feature points, a human body zone that is a region, among the pixels included in the mask region, with a high probability of constituting the silhouette of the person; a human body section determination unit that determines a human body section, which is the range of classes in the histogram representing the human body, on the basis of the histograms of the pixels of the human body zone and the color information of the pixels of the human body zone in the address frame; a human body section propagation unit that determines, on the basis of the determined human body section, a human body section in the histograms of pixels outside the human body zone; a background section determination unit that determines a background section, which is the range of classes in the histogram representing the background, on the basis of the histograms of the pixels of a non-mask region outside the mask region and the color information of the pixels of the non-mask region in the address frame; and a background section propagation unit that determines, on the basis of the determined background section, a background section in the histograms of pixels inside the mask region.

The present invention focuses on the address state of the swing motion. Individual differences are smaller in the address state than in the other phases of the swing, and nearly everyone takes much the same posture. Based on this empirical rule, the present invention performs image processing on the address frame, that is, the frame showing the address state, and extracts feature points at a plurality of predetermined locations on the person and the golf club. From these feature points, a mask region containing the silhouette of the person is set, together with a human body zone, which is the part of the mask region with an even higher probability of belonging to the human body. Then, on the basis of the histograms of the pixels of the human body zone and the color information of those pixels in the address frame, the human body section, that is, the range of classes in the histogram representing the human body, is determined. On that basis, a human body section is determined in the histograms of the pixels outside the human body zone; this is the so-called propagation of the human body section.

In the address frame, the pixels of the non-mask region outside the mask region have a high probability of belonging to the background. In the present invention, therefore, the background section, that is, the range of classes in the histogram representing the background, is determined on the basis of the histograms of the pixels of the non-mask region and the color information of those pixels in the address frame. On that basis, a background section is determined in the histograms of the pixels inside the mask region; this is the so-called propagation of the background section.

With the method of the present invention, which includes these two propagation processes, the person and the background can be distinguished well even when they have similar colors. Highly accurate silhouette extraction therefore becomes possible.

A conceptual diagram of the silhouette extraction system of this embodiment.
A conceptual diagram showing details of the calculation unit of the system of FIG. 1.
A flowchart of the silhouette extraction method of this embodiment.
An explanatory diagram showing the camera angle.
(a) to (c) are luminance histograms for pixels.
(a) to (c) are color histograms for pixels.
A flowchart of the determination processing.
A conceptual diagram explaining the feature points of the person.
A difference image between the address frame and the top frame.
An image explaining the feature points of the heel and toe.
An image explaining the feature point of the thigh.
An image explaining the feature point of the top of the head.
An image explaining the feature point of the forehead.
An image explaining the feature point of the waist.
An image explaining the feature point of the back of the knee.
An image explaining the feature points near the back and shoulder.
An image explaining the mask region.
An image explaining the human body zone.
A flowchart showing an example of the human body section determination step.
A graph explaining positions in a histogram.
A flowchart showing an example of the human body section propagation step.
A pixel array diagram explaining the distance image.
A graph explaining positions in a histogram.
A diagram explaining the concept of the present invention.
A flowchart showing an example of the consistency check step.
An explanatory diagram using histograms.
The silhouette of the person in the address state obtained by this embodiment.
The silhouette of the person in the top state obtained by this embodiment.
The silhouette of the person in the finish state obtained by this embodiment.
An address frame in which a support pillar is photographed.
An enlarged pixel array diagram explaining the processing of the background section propagation step.
Silhouette images showing the processing results of Example 1 and Comparative Example 1.
Silhouette images showing the processing results of Example 2 and Comparative Example 2.
Silhouette images showing the processing results of Example 3 and Comparative Example 3.
Silhouette images showing the processing results of Example 4 and Comparative Example 4.

An embodiment of the present invention will now be described with reference to the drawings.
FIG. 1 shows the configuration of a silhouette extraction system 1 for carrying out the present invention. The system of this embodiment includes a mobile phone 2 and a server 3 connected to the mobile phone 2 via communication means. The communication means need not be purely wireless and may include a wired segment along the way.

The mobile phone 2 includes a camera 4, a memory 5, and a transmission/reception unit 6. The camera 4 can capture moving images. Examples of the memory 5 include RAM, SD cards (including miniSD and microSD), and other storage media.

The server 3 includes a calculation unit 7, a memory 8, and a transmission/reception unit 9.

The calculation unit 7 is, for example, a CPU and, as shown in FIG. 2, includes a frame extraction unit 7a, a first set creation unit 7b, a second set creation unit 7c, a luminance histogram creation unit 7d, a color histogram creation unit 7e, and a determination unit 7f.

FIG. 3 shows the processing procedure of the silhouette extraction method performed by the system 1 of this embodiment. First, a golf swing is photographed with the camera 4 (step S1).

A golf swing is a series of movements performed by a person holding a golf club. In this specification, a golf swing comprises: the address, in which the person stands still in the ready position; the takeback, in which the golf club is gradually swung up; the top, at which the golf club is raised highest; the downswing, in which the golf club is swung down from the top; the impact, at which the head of the golf club strikes the golf ball; and the finish, in which, after impact, the golf club is swung through from the front upward and comes to rest.

FIG. 4 shows the screen displayed when shooting with the camera 4 of the mobile phone 2 begins. The screen is displayed on a monitor (not shown) of the mobile phone 2. It shows the person 11, photographed from behind, standing still in the address state while holding the golf club 10.

In a preferred embodiment, a first guide frame 12 and a second guide frame 13 are displayed on the mobile phone 2. These frames 12 and 13 are drawn by software running on a calculation unit (not shown) of the mobile phone 2. The photographer sets the angle of the camera 4 so that the grip 10a of the golf club 10 falls within the first guide frame 12 and the head 10b falls within the second guide frame 13. The frames 12 and 13 thus help fix both the distance between the camera 4 and the person 11 and the camera angle.

In this embodiment, shooting starts from the address state of FIG. 4. Once shooting has started, the person 11 begins the golf swing, and shooting continues until the finish. Shooting yields moving image data, which is stored in the memory 5 (step S2). The moving image has, for example, 640 pixels vertically by 320 pixels horizontally.

When the photographer or the person 11 operates the mobile phone 2, the moving image data is transmitted to the server 3 (step S3). The data is sent from the transmission/reception unit 6 of the mobile phone 2 to the transmission/reception unit 9 of the server 3 and stored in the memory 8 (step S4).

Next, the frame extraction unit 7a extracts from the moving image data a large number of frames (that is, still image data), each an aggregate of pixels (step S5). The number of frames extracted per second is 30 or 60. For example, for 3 seconds of moving image data at 30 frames per second, 30 × 3 = 90 frames are extracted. Correction processing, such as camera shake correction, is applied to each extracted frame as necessary.

Next, the first set creation unit 7b creates, for each pixel, an all-frame set consisting of all the frames (step S6). For the above moving image of 640 pixels vertically by 320 pixels horizontally, the sets together hold 640 × 320 × 90 = 18,432,000 pixel samples.
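
The frame count of step S5 and the per-pixel regrouping of step S6 can be sketched as follows. This is only an illustration: the function names and the list-of-lists frame layout are assumptions, since the text does not prescribe a data structure.

```python
# Hypothetical helper names; the data layout is an assumption for illustration.

def extracted_frame_count(frames_per_second, duration_seconds):
    """Number of frames extracted from a clip of the given length."""
    return frames_per_second * duration_seconds


def build_all_frame_sets(frames):
    """Regroup a list of frames into one time series per pixel.

    `frames` is indexed as frames[t][y][x]. The result maps (x, y) to the
    list of that pixel's values over all frames (its "all-frame set").
    """
    height = len(frames[0])
    width = len(frames[0][0])
    return {
        (x, y): [frame[y][x] for frame in frames]
        for y in range(height)
        for x in range(width)
    }


# Toy clip: 3 frames of 2 x 2 pixels (values stand in for color information).
clip = [
    [[10, 20], [30, 40]],
    [[11, 21], [31, 41]],
    [[12, 22], [32, 42]],
]
sets = build_all_frame_sets(clip)
total_samples = sum(len(series) for series in sets.values())
```

Each entry of `sets` is the raw material for one per-pixel histogram; the total number of samples is width × height × number of frames.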

The second set creation unit 7c determines whether each pixel of each frame is achromatic or chromatic, and creates, for each pixel, a chromatic frame set and an achromatic frame set (step S7).
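
A minimal sketch of step S7 for a single pixel is given below. The passage does not state how chromatic and achromatic pixels are told apart, so the HSV saturation test and the threshold value used here are assumptions, as are the function names.

```python
import colorsys

# Assumed criterion: a pixel counts as chromatic when its HSV saturation
# reaches this threshold. The value is illustrative, not from the source.
SATURATION_THRESHOLD = 0.15


def is_chromatic(rgb, threshold=SATURATION_THRESHOLD):
    """Return True when the 8-bit (R, G, B) triple is treated as chromatic."""
    r, g, b = (channel / 255.0 for channel in rgb)
    _, saturation, _ = colorsys.rgb_to_hsv(r, g, b)
    return saturation >= threshold


def split_frame_sets(pixel_series):
    """Split one pixel's frames into chromatic and achromatic frame sets.

    `pixel_series` holds the (R, G, B) value of a single pixel in every
    frame. Returns two lists of frame indices.
    """
    chromatic, achromatic = [], []
    for frame_index, rgb in enumerate(pixel_series):
        (chromatic if is_chromatic(rgb) else achromatic).append(frame_index)
    return chromatic, achromatic


series = [(200, 30, 30), (128, 128, 128), (250, 250, 250), (20, 200, 40)]
chromatic_frames, achromatic_frames = split_frame_sets(series)
```

Keeping frame indices rather than colors makes it easy to look up the same frames again when the histograms are built.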

The luminance histogram creation unit 7d then creates a luminance histogram (first histogram) for each all-frame set (step S8). In this luminance histogram, as shown for example in FIGS. 5(a) to 5(c), the frequency is the number of frames and the class is luminance (color information). The luminance histogram may instead be created on the basis of other color information, and it may be smoothed.
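
The per-pixel luminance histogram of step S8, with optional smoothing, might look like the sketch below. The luminance formula (Rec. 601 luma weights), the 256 bins, and the moving-average smoothing are assumptions chosen for illustration; the text only says that the class is luminance and that smoothing is allowed.

```python
def luminance(rgb):
    """Approximate 8-bit luminance using Rec. 601 weights (an assumption)."""
    r, g, b = rgb
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def luminance_histogram(pixel_series, bins=256):
    """Histogram for one pixel: class = luminance, frequency = frame count."""
    hist = [0] * bins
    for rgb in pixel_series:
        hist[min(luminance(rgb), bins - 1)] += 1
    return hist


def smooth(hist, radius=1):
    """Moving-average smoothing; windows are truncated at the ends."""
    smoothed = []
    for i in range(len(hist)):
        lo, hi = max(0, i - radius), min(len(hist), i + radius + 1)
        smoothed.append(sum(hist[lo:hi]) / (hi - lo))
    return smoothed


# One pixel that is black in three frames and white in two.
series = [(0, 0, 0)] * 3 + [(255, 255, 255)] * 2
hist = luminance_histogram(series)
```

A pixel that shows background for most of the swing and the person only briefly would produce one tall peak and one small one in such a histogram.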

The color histogram creation unit 7e creates color histograms (second histograms) for the chromatic frame set and the achromatic frame set (step S9). In these color histograms, as shown for example in FIGS. 6(a) to 6(c), the frequency is the number of frames, the class for the chromatic frame set is hue (color information), and the class for the achromatic frame set is luminance (color information). The class for the chromatic frame set may be color information other than hue, and the class for the achromatic frame set may be color information other than luminance.
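
The two color histograms of step S9 can be sketched for one pixel as follows. The saturation threshold, the 360 one-degree hue bins, and the luma formula are all assumptions made for this example and do not come from the source.

```python
import colorsys


def color_histograms(pixel_series, saturation_threshold=0.15):
    """Build the hue and luminance histograms for one pixel.

    Chromatic frames are counted by hue (360 one-degree bins); achromatic
    frames are counted by luminance (256 bins). Returns (hue_hist, lum_hist).
    """
    hue_hist = [0] * 360
    lum_hist = [0] * 256
    for rgb in pixel_series:
        r, g, b = (channel / 255.0 for channel in rgb)
        hue, saturation, _ = colorsys.rgb_to_hsv(r, g, b)
        if saturation >= saturation_threshold:
            hue_hist[int(round(hue * 360)) % 360] += 1
        else:
            # Rec. 601 luma weights, assumed here as the luminance measure.
            luma = 0.299 * rgb[0] + 0.587 * rgb[1] + 0.114 * rgb[2]
            lum_hist[min(int(round(luma)), 255)] += 1
    return hue_hist, lum_hist


series = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (128, 128, 128)]
hue_hist, lum_hist = color_histograms(series)
```

Every frame of the pixel lands in exactly one of the two histograms, so their totals add up to the number of frames.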

Using this information, the determination unit 7f performs determination processing (labeling) that decides, for each pixel in each frame, whether it shows the background or the person (step S10). The procedure of the determination processing is shown in FIG. 7.

In the determination processing of this embodiment, the address frame extraction unit of the determination unit 7f first extracts, from the frames of the moving image, the address frame in which the address state of the person 11 is captured (step S101). For example, when shooting started from the address state, the first frame on the time axis is extracted. The address frame shows the person 11 and the golf club 10 as in FIG. 4.

[Feature point extraction step]
Next, the feature point extraction unit of the determination unit 7f performs the feature point extraction step, in which it applies image processing to the address frame and extracts feature points at a plurality of predetermined locations on the person 11 and the golf club 10 (step S102).

In this embodiment, as shown in FIG. 8, the feature points include feature points P1 to P12, which are positions representing the top of the head 11a, forehead 11b, back 11c, waist 11d, back of the knee 11e, heel 11f, toe 11g, thigh 11h, and hands 11i of the human body 11 in the address state, the head 10b of the golf club, and the shoulder 11j. These feature points P1 to P12 can be computed from data obtained by image processing by taking into account the nearly constant posture that a person (golfer) assumes in the address state. Examples are described below.

[Feature point P1 (x1, y1) of the hands]
First, the determination unit 7f calculates the difference between the address frame and the frame in the top state. FIG. 9 shows a binary image D1 of the resulting difference image; the parts displayed in black are pixels that changed. As FIG. 9 makes clear, because these two frames are at opposite extremes of the swing, taking their difference makes the hands and the golf club 10 in the address state stand out clearly. Next, the determination unit 7f sets a detection area F1 in the difference image D1. The detection area F1 is shown in white and is set to the pixel coordinate range 200 ≦ x ≦ 480, 250 ≦ y ≦ 530. This range is predetermined, on the basis of the camera angle and the like, as an area that reliably contains the hands and the head. The range of the detection area F1 can, however, be changed for other setups. Next, the determination unit 7f applies a Hough transform to the detection area F1 of the difference image D1 and extracts a straight line La from the shaft 10c of the golf club. Both ends La1 and La2 of the detected straight line La lie on the boundary of the detection area F1. The determination unit 7f then defines a rectangular area at the midpoint La3 of the straight line La, moves it toward the upper left along the straight line La, and extracts, as the feature point P1 of the hands, the position at which the number of difference pixels within the rectangular area reaches or exceeds a predetermined value. This processing is based on the empirical rule that, in the address posture, the upper end of the shaft 10c has a bulge formed by the person's hands.
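
The first two operations of this paragraph, forming a binary difference image and restricting it to a detection area, can be sketched as below. The change threshold and the use of plain luminance frames are assumptions; the text specifies only the coordinate range of the detection area F1.

```python
def difference_image(address_frame, top_frame, threshold=30):
    """Binary difference of two luminance frames indexed as frame[y][x].

    A pixel is 1 when its absolute change exceeds `threshold` (an assumed
    value) and 0 otherwise.
    """
    return [
        [1 if abs(a - t) > threshold else 0 for a, t in zip(row_a, row_t)]
        for row_a, row_t in zip(address_frame, top_frame)
    ]


def crop(image, x_min, x_max, y_min, y_max):
    """Cut out a detection area; all bounds are inclusive."""
    return [row[x_min:x_max + 1] for row in image[y_min:y_max + 1]]


# Toy 3 x 3 frames: one pixel changes strongly in each frame.
address = [[0, 0, 0], [0, 200, 0], [0, 0, 0]]
top = [[0, 0, 0], [0, 0, 0], [0, 0, 100]]
diff = difference_image(address, top)
area = crop(diff, 1, 2, 1, 2)
```

For the frame size in the text, the detection area F1 would correspond to `crop(diff, 200, 480, 250, 530)`.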

[Feature point P2 (x2, y2) of the head]
In FIG. 9, the determination unit 7f moves the rectangular area toward the lower right along the straight line La and extracts, as the feature point P2 of the golf club head, the position at which the number of difference pixels within the rectangular area reaches or exceeds a predetermined value. This processing is based on the empirical rule that, in the address posture, the large head sits at the lower end of the shaft 10c.
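
The sliding-rectangle search used for both the hands (toward the upper left) and the head (toward the lower right) can be sketched as one routine. The square window, the step count, and the function names are assumptions; the line endpoints are taken as given rather than computed by a Hough transform.

```python
def count_in_window(image, cx, cy, half):
    """Count 1-pixels in the square window centered on (cx, cy)."""
    height, width = len(image), len(image[0])
    total = 0
    for y in range(max(0, cy - half), min(height, cy + half + 1)):
        for x in range(max(0, cx - half), min(width, cx + half + 1)):
            total += image[y][x]
    return total


def slide_along_line(image, start, end, half, min_count, steps=100):
    """Move a window from `start` toward `end` along a straight line.

    Returns the first window center (x, y) whose window holds at least
    `min_count` difference pixels, or None if no position qualifies.
    """
    (x0, y0), (x1, y1) = start, end
    for i in range(steps + 1):
        t = i / steps
        cx = int(round(x0 + t * (x1 - x0)))
        cy = int(round(y0 + t * (y1 - y0)))
        if count_in_window(image, cx, cy, half) >= min_count:
            return (cx, cy)
    return None


# Toy difference image: a thin diagonal "shaft" with a 3 x 3 blob near each end.
size = 20
image = [[0] * size for _ in range(size)]
for i in range(2, 19):
    image[i][i] = 1                      # shaft pixels on the line
for cy, cx in ((4, 4), (15, 15)):        # blobs standing in for hands and head
    for y in range(cy - 1, cy + 2):
        for x in range(cx - 1, cx + 2):
            image[y][x] = 1

midpoint = (10, 10)
hand = slide_along_line(image, midpoint, (2, 2), half=1, min_count=6, steps=8)
head = slide_along_line(image, midpoint, (18, 18), half=1, min_count=6, steps=8)
```

With image coordinates where y grows downward, moving toward smaller x and y is the "upper left" direction of the text, so the same routine serves both feature points by swapping the target endpoint.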

[Right foot toe feature point P3 (x3, y3), right foot heel feature point P4 (x4, y4)]
FIG. 10 shows a simplified binary image of the address frame, in which the approximate silhouette of the person 11 is displayed in black. In this example, the determination unit 7f sets a detection area F2 in the binary image. The detection area F2 is determined in advance, based on the camera angle and the like, as an area that includes the heel and toe of the right foot. The determination unit 7f extracts the black pixel closest to the lower left corner F2L of the detection area F2 as the feature point P4 of the right foot heel. Similarly, it extracts the black pixel closest to the lower right corner F2R of the detection area F2 as the feature point P3 of the right foot toe. This processing is based on the empirical rule that, when the address state is photographed from behind, the right end of the person's feet is the toe and the left end is the heel.
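The corner-based search for the heel and toe can be illustrated with a short Python sketch. The function name `nearest_foreground_pixel`, the inclusive region tuple, and the use of Euclidean distance are assumptions for illustration; the text does not specify the distance metric.

```python
from typing import List, Optional, Tuple


def nearest_foreground_pixel(
    binary: List[List[int]],
    region: Tuple[int, int, int, int],
    corner: Tuple[int, int],
) -> Optional[Tuple[int, int]]:
    """Return the foreground (black, value 1) pixel inside `region`
    (x_min, y_min, x_max, y_max, inclusive) closest to `corner`.
    Squared Euclidean distance is an assumed metric."""
    x_min, y_min, x_max, y_max = region
    best = None
    best_d2 = None
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            if binary[y][x]:
                d2 = (x - corner[0]) ** 2 + (y - corner[1]) ** 2
                if best_d2 is None or d2 < best_d2:
                    best, best_d2 = (x, y), d2
    return best
```

Calling it with the lower left corner F2L would give a candidate for P4 (heel), and with the lower right corner F2R a candidate for P3 (toe).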

[Left foot toe feature point P5 (x5, y5)]
The feature point P5 of the left foot toe is extracted using the result for the feature point P3 of the right foot toe. That is, as shown in FIG. 10, the determination unit 7f sets a reference region F3 surrounding the feature point P3 and, using the reference region F3 as a template, finds the best-matching region. This processing is based on the empirical rule that, when the address state is photographed from behind, the person's left and right feet have substantially the same shape. Because the left foot toe lies just above the right foot toe in the address frame, the search area for the matching can also be restricted accordingly.

[Thigh feature point P6 (x6, y6)]
FIG. 11 shows a simplified binary image of the address frame. The determination unit 7f extracts, as the feature point P6 of the thigh, the intersection of a straight line Lb that passes through the feature point P5 of the left foot toe parallel to the y axis and a straight line Lc that passes through the hand feature point P1 parallel to the x axis. This processing is likewise based on the empirical rule that, when the address state is photographed from behind, the front of the person's thigh lies near this position.

[Top-of-head feature point P7 (x7, y7)]
The determination unit 7f calculates the difference between the address frame and the frame in the impact state. FIG. 12 shows a binary image D2 of the calculated difference. Next, the determination unit 7f sets a detection area F4. The detection area F4 is shown as a white outline, is determined in advance based on the camera angle and the like, and covers the range of x and y coordinates given below. The determination unit 7f then extracts, as the feature point P7 of the top of the head, the position with the smallest y coordinate among the difference pixels in the detection area F4. This processing is based on the empirical rule that the head is at almost the same position in the two states, so the position with the smallest y coordinate in either state can be regarded as the top of the head.
x5 − 15 pixels ≤ x ≤ x1 + 10 pixels
10 ≤ y ≤ y1
Here, the symbols are as follows.
x1: x coordinate of the hand feature point P1
x5: x coordinate of the left foot toe feature point P5
y1: y coordinate of the hand feature point P1
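As an illustration of the search for P7, the following Python sketch builds the detection area F4 from P1 and P5 and scans it for the difference pixel with the smallest y coordinate. The function names, the clamping of the region to the image, and the left-to-right tie-break among pixels on the same row are assumptions not stated in the text.

```python
from typing import List, Optional, Tuple


def head_top_region(p1: Tuple[int, int], p5: Tuple[int, int]):
    """Detection area F4 as ((x_min, x_max), (y_min, y_max)), inclusive:
    x5 - 15 <= x <= x1 + 10 and 10 <= y <= y1."""
    return (p5[0] - 15, p1[0] + 10), (10, p1[1])


def topmost_difference_pixel(
    diff: List[List[int]],
    x_range: Tuple[int, int],
    y_range: Tuple[int, int],
) -> Optional[Tuple[int, int]]:
    """Return the difference pixel (value 1) with the smallest y inside the
    region; ties are broken by the smallest x (an assumed convention)."""
    height, width = len(diff), len(diff[0])
    x_lo, x_hi = max(0, x_range[0]), min(width - 1, x_range[1])
    y_lo, y_hi = max(0, y_range[0]), min(height - 1, y_range[1])
    for y in range(y_lo, y_hi + 1):  # scan rows from the top of the image
        for x in range(x_lo, x_hi + 1):
            if diff[y][x]:
                return (x, y)
    return None
```

The same scan pattern, with rows and columns exchanged or the scan direction reversed, would cover the minimum-x and maximum-x searches used for the other difference-based feature points.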

[Forehead feature point P8 (x8, y8)]
The determination unit 7f calculates the difference between the address frame and the frame in the finish state. FIG. 13 shows a binary image D3 of the calculated difference. Next, the determination unit 7f sets a detection area F5. The detection area F5 is shown as a white outline, is determined in advance based on the camera angle and the like, and covers the range of x and y coordinates given below. The determination unit 7f then extracts, as the forehead feature point P8, the position with the smallest x coordinate among the difference pixels in the detection area F5. This processing is based on the empirical rule that the head is at almost the same position in the two states, so the position with the smallest x coordinate in either state can be regarded as the forehead.
x7 ≤ x ≤ x7 + 70 pixels
y7 ≤ y ≤ y7 + 40 pixels
Here, the symbols are as follows.
x7: x coordinate of the top-of-head feature point P7
y7: y coordinate of the top-of-head feature point P7

[Waist feature point P9 (x9, y9)]
The determination unit 7f calculates the difference between the address frame and the frame in the finish state. FIG. 14 shows a binary image D4 of the calculated difference. Next, the determination unit 7f sets a detection area F6. The detection area F6 is shown as a white outline, is determined in advance based on the camera angle and the like, and covers the range of x and y coordinates given below. The determination unit 7f then extracts, as the waist feature point P9, the position with the smallest x coordinate among the difference pixels in the detection area F6. This processing is based on the empirical rule that the waist protrudes the most in the two states, so the position with the smallest x coordinate in either state can be regarded as the waist.
x4 − 70 pixels ≤ x ≤ x4 + 10 pixels
y1 − 50 pixels ≤ y ≤ y1 + 50 pixels
Here, the symbols are as follows.
x4: x coordinate of the right foot heel feature point P4
y1: y coordinate of the hand feature point P1

[Back-of-knee feature point P10 (x10, y10)]
The determination unit 7f calculates the difference between the address frame and the frame in the impact state. FIG. 15 shows a binary image D5 of the calculated difference. Next, the determination unit 7f sets a detection area F7 in the binary image D5. The detection area F7 is shown as a white outline, is determined in advance based on the camera angle and the like, and covers the range of x and y coordinates given below. The determination unit 7f then extracts, as the feature point P10 of the back of the knee, the position with the largest x coordinate among the difference pixels in the detection area F7. This processing is based on the empirical rule that the back of the knee is the most recessed point in the two states, so the position with the largest x coordinate in either state can be regarded as the back of the knee.
x9 ≤ x ≤ x9 + 10 pixels
0.5 * (y9 + y4) − 10 ≤ y ≤ 0.5 * (y9 + y4) + 10
Here, the symbols are as follows.
x9: x coordinate of the waist feature point P9
y4: y coordinate of the right foot heel feature point P4
y9: y coordinate of the waist feature point P9

[Back feature point P11 (x11, y11)]
As shown in FIG. 16, the determination unit 7f extracts, as the back feature point P11, the position 25 pixels outward in the normal direction from the midpoint Lc1 of the straight line Lc connecting the top-of-head feature point P7 and the waist feature point P9. This offset is derived empirically from the average human body shape.

[Feature point P12 near the shoulder (x12, y12)]
As shown in FIG. 16, the determination unit 7f extracts, as the feature point P12 near the shoulder, the position 30 pixels inward in the normal direction from the midpoint Ld1 of the straight line Ld connecting the forehead feature point P8 and the hand feature point P1. This offset is derived empirically from the average human body shape.
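The construction shared by P11 and P12 (midpoint of a segment, then an offset along its normal) can be sketched in Python as follows. The function name `offset_from_midpoint` and the `toward` hint point used to choose between the two normal directions are hypothetical; the text only says "outward" or "inward" and does not describe how the side is selected.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def offset_from_midpoint(p: Point, q: Point, distance: float, toward: Point) -> Point:
    """Return the point `distance` pixels from the midpoint of segment p-q
    along the segment normal, on the side where `toward` lies.
    `toward` is an assumed helper input for picking the side."""
    mx, my = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    dx, dy = q[0] - p[0], q[1] - p[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("p and q must be distinct points")
    nx, ny = -dy / length, dx / length  # unit normal of the segment
    if (toward[0] - mx) * nx + (toward[1] - my) * ny < 0:
        nx, ny = -nx, -ny  # flip so the normal points to the requested side
    return (mx + distance * nx, my + distance * ny)
```

For example, P11 would use P7 and P9 with a distance of 25 and a hint point behind the person, while P12 would use P8 and P1 with a distance of 30 and a hint point inside the body.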

[Mask region determination step]
Next, the determination unit 7f sets a mask region based on the feature points P1 to P12 (step S103).

As shown in FIG. 17, the determination unit 7f connects the feature points P1-P6-P5-P3-P4-P10-P9-P11-P7-P8-P12-P1 to form a closed polygon. Next, the determination unit 7f draws parallel lines Le at a distance d of several pixels outside these connecting lines and sets the region they enclose as a first mask region M1.

The determination unit 7f also dilates the straight line connecting P1 and P2, which contains the shaft of the golf club, to set a second mask region M2. Further, the determination unit 7f sets a third mask region M3, which is a predetermined rectangular area that contains the club head feature point P2 and is larger than the head.

Further, the determination unit 7f determines the logical sum (union) of the first to third mask regions as the mask region M and stores it in the memory 8. The distance d is set in advance so that the mask region M, which is derived from the feature points P1 to P12, contains the person 11 and the golf club 10 together with only a narrow band of the surrounding background. As an example, the distance d is preferably about 10 pixels, and the rectangular area is preferably a rectangle of about 30 pixels in x and 25 pixels in y. These settings may be changed as appropriate according to the camera angle and the like.

[Human body zone determination step]
Next, based on the feature points P1 to P12, the human body zone setting unit of the determination unit 7f sets a human body zone Z, which is the region among the pixels included in the mask region M that has a high probability of forming the silhouette of the person 11 (step S104).

In this processing, as shown in FIG. 18, the following reference points Z1 to Z6 for setting the zone are extracted first.
Reference point Z1: intersection of the line through the top-of-head feature point P7 parallel to the y axis and the line through the forehead feature point P8 parallel to the x axis
Reference point Z2: midpoint of the straight line connecting the top-of-head feature point P7 and the waist feature point P9
Reference point Z3: the point 20 pixels to the right of the waist feature point P9
Reference point Z4: the point 20 pixels to the right of the back-of-knee feature point P10
Reference point Z5: the point 20 pixels to the right of and 20 pixels above the right foot heel feature point P4
Reference point Z6: midpoint of the straight line connecting the feature point P12 near the shoulder and the reference point Z2
These reference points Z1 to Z6 are chosen, in consideration of empirical rules and the camera angle, so that they lie inside the respective feature points (on the person 11 side).
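The reference point definitions translate directly into coordinates. The following Python sketch is illustrative only: the function name `zone_reference_points` and the `inset` parameter are hypothetical, and it assumes image coordinates in which "right" is the +x direction and "above" is the -y direction (consistent with the top of the head being the pixel with the smallest y).

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def midpoint(a: Point, b: Point) -> Point:
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def zone_reference_points(
    p4: Point, p7: Point, p8: Point, p9: Point, p10: Point, p12: Point,
    inset: float = 20.0,
) -> Dict[str, Point]:
    """Compute Z1..Z6 from the feature points (x grows rightward, y downward)."""
    z2 = midpoint(p7, p9)
    return {
        "Z1": (p7[0], p8[1]),                  # x of head top, y of forehead
        "Z2": z2,                              # midpoint of P7-P9
        "Z3": (p9[0] + inset, p9[1]),          # 20 px right of the waist
        "Z4": (p10[0] + inset, p10[1]),        # 20 px right of the knee back
        "Z5": (p4[0] + inset, p4[1] - inset),  # 20 px right and 20 px up from the heel
        "Z6": midpoint(p12, z2),               # midpoint of P12-Z2
    }
```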

Next, the determination unit 7f determines the human body zone Za using the reference points Z1 to Z6 and stores it in the memory 8. In this embodiment, the human body zone Za is the region enclosed by the reference points Z1-Z2-Z3-Z4-Z5-Z4-Z6-Z1. To improve accuracy, however, the segment Z4-Z6 is formed as a bent line, concave toward the back, consisting of the extension of the straight line through Z5 and Z4 running upward from Z4 and a straight line running downward from Z6 parallel to the straight line through Z2 and Z3. Likewise, the segment Z1-Z6 does not connect the reference points Z1 and Z6 with a single straight line but is formed from a line through Z6 parallel to the y axis and the straight line connecting Z1 and Z2. Pixels lying on the straight boundary lines of the human body zone Za are included in the zone. The human body zone Za determined in this way is, as a rule of thumb, a set of pixels that are very likely to belong to the person 11 in the address state.

[Human body section determination step]
Next, the human body section determination unit of the determination unit 7f performs a human body section determination step, in which a human body section is determined in the histogram based on the histogram of the pixels in the human body zone and the color information of the pixels in the human body zone of the address frame (step S105).

Here, the human body section is the range of classes in the histogram (luminance and hue in this example) that represents the human body. As a rule of thumb, the pixels in the human body zone Za of the address frame are very likely to show the person. Using the luminance and hue of these pixels therefore allows the human body section to be determined accurately in the histogram.

FIG. 19 shows an example of the processing procedure of the human body section determination step.
First, the determination unit 7f refers to the memory 8 and selects one pixel in the human body zone Za of the address frame that has not yet been processed (step S1051).

Next, the determination unit 7f determines the left and right ends L_left and L_right of the histogram (H_left and H_right for hue; the same applies hereinafter) based on the luminance (or hue) of the pixel (step S1052). FIG. 20 shows an example of the smoothed luminance histogram of the pixel. When the luminance of the pixel is denoted L_middle (H_middle for hue; the same applies hereinafter), the left end L_left is the position reached by following the histogram curve to the left from L_middle until the number of frames becomes zero. Similarly, the right end L_right is the position reached by following the histogram curve to the right from L_middle until the number of frames becomes zero. To avoid noise, however, a position counts as zero only where three consecutive classes are zero. In this way, for any pixel in the human body zone Za of the address frame, both ends of the histogram peak that contains its luminance (or hue) are found.
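The search for the two ends of the histogram peak can be sketched in Python as follows. This is an illustrative sketch rather than the embodiment itself: the function name `peak_edges` is hypothetical, the input is assumed to be an already smoothed histogram, and the edge is taken to be the last non-zero class before a run of three zero classes (the text does not say which class of the run marks the end). If the histogram boundary is reached first, the last non-zero class seen is used.

```python
from typing import List, Tuple


def peak_edges(hist: List[int], middle: int, zero_run: int = 3) -> Tuple[int, int]:
    """Return (left, right) class indices of the peak containing `middle`.
    Walking stops when `zero_run` consecutive classes have a count of zero."""

    def walk(step: int) -> int:
        i = middle
        edge = middle
        zeros = 0
        while 0 <= i + step < len(hist):
            i += step
            if hist[i] == 0:
                zeros += 1
                if zeros == zero_run:
                    break  # an isolated zero is treated as noise; a run ends the peak
            else:
                zeros = 0
                edge = i
        return edge

    return walk(-1), walk(+1)
```

The width check of step S1053 would then compare `right - left` against the range of 1 to 14 classes.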

Next, the determination unit 7f determines whether the absolute value |L_right − L_left| of the difference between the left end L_left and the right end L_right of the histogram (|H_right − H_left| for hue; the same applies hereinafter) falls within a predetermined number of classes (step S1053). In this embodiment, it is determined whether |L_right − L_left| is at least 1 and at most 14 classes. This class width can be changed as required.

If the determination unit 7f determines that |L_right − L_left| does not satisfy the condition of step S1053, that is, if the width of the histogram peak is zero or larger than the predetermined width, the pixel is determined to be a pixel that displays “background” in all frames (step S1057). Because no human body section can be determined for such a pixel, it is treated as “background” in every frame.

On the other hand, if |L_right − L_left| is determined to satisfy the condition in step S1053, the color distance D is calculated (step S1054). For example, when the color information of the pixels is HSV, let C₁ be the color vector of one pixel with hue H₁, saturation S₁ and value V₁, and let C₂ be the color vector of the other pixel with hue H₂, saturation S₂ and value V₂; their color distance D(C₁, C₂) is then calculated by the formula below. In this step, the color distance is calculated between the HSV average of the pixels in the same class as the luminance (or hue) of the pixel concerned and the HSV average of the pixels in each class from L_left to L_right.

$$
D(C_1, C_2) = a\,\Delta H' + b\,\Delta S' + c\,\Delta V'
$$
In the above formula, the symbols are as follows.
a, b, c: constants; in this embodiment, a = 5.1, b = 2.25, and c = 2.65
$$
\begin{aligned}
\Delta H' &= \Delta H / 4.0 \\
\Delta S' &= \Delta S / 2.0 \\
\Delta V' &= \Delta V / 2.0 \\
\Delta H &= \sqrt{(X_1 - X_2)^2 + (Y_1 - Y_2)^2} \\
\Delta S &= \left| S_1/100 - S_2/100 \right| \\
\Delta V &= \left| V_1/100 - V_2/100 \right| \\
X_1 &= S'_{\mathrm{avg}} \cos(H_1 \times 3.6) \\
Y_1 &= S'_{\mathrm{avg}} \sin(H_1 \times 3.6) \\
X_2 &= S'_{\mathrm{avg}} \cos(H_2 \times 3.6) \\
Y_2 &= S'_{\mathrm{avg}} \sin(H_2 \times 3.6) \\
S'_{\mathrm{avg}} &= (S'_1 + S'_2) / 2 \\
S'_1 &= \log_{10}(S_1/100 \times 99 + 1.0) \\
S'_2 &= \log_{10}(S_2/100 \times 99 + 1.0)
\end{aligned}
$$
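The color distance can be written out as a short Python function. This sketch follows the formulas above term by term, with two assumptions that the text does not state explicitly: H, S and V are each scaled to the range 0 to 100, and the product H × 3.6 is an angle in degrees (so it is converted to radians before calling the trigonometric functions). The function name `color_distance` is hypothetical.

```python
import math
from typing import Tuple

HSV = Tuple[float, float, float]  # (H, S, V), each assumed to lie in 0..100


def color_distance(c1: HSV, c2: HSV, a: float = 5.1, b: float = 2.25, c: float = 2.65) -> float:
    """D(C1, C2) = a*dH' + b*dS' + c*dV' with the constants of the embodiment."""
    h1, s1, v1 = c1
    h2, s2, v2 = c2
    # Log-compressed saturations and their mean.
    s1p = math.log10(s1 / 100.0 * 99.0 + 1.0)
    s2p = math.log10(s2 / 100.0 * 99.0 + 1.0)
    s_avg = (s1p + s2p) / 2.0
    # Hue mapped onto a circle of radius s_avg (H * 3.6 assumed to be degrees).
    x1 = s_avg * math.cos(math.radians(h1 * 3.6))
    y1 = s_avg * math.sin(math.radians(h1 * 3.6))
    x2 = s_avg * math.cos(math.radians(h2 * 3.6))
    y2 = s_avg * math.sin(math.radians(h2 * 3.6))
    d_h = math.hypot(x1 - x2, y1 - y2)
    d_s = abs(s1 / 100.0 - s2 / 100.0)
    d_v = abs(v1 / 100.0 - v2 / 100.0)
    return a * (d_h / 4.0) + b * (d_s / 2.0) + c * (d_v / 2.0)
```

Under these assumptions, two fully saturated colors with opposite hues and equal value have a distance of 5.1, far above the threshold of 0.25 used in step S1055, while two colors that differ only by 20 in value have a distance of 0.265.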

Next, the determination unit 7f determines whether the color distance is 0.25 or less for all combinations (step S1055). If the determination result in step S1055 is true, the classes between the left and right ends L_left and L_right of the pixel's histogram are set as the range of the human body section (step S1056). The color distance threshold of 0.25 can be changed as appropriate.

If the determination result in step S1055 is false, the pixel is determined to be a pixel that displays “background” in all frames (step S1057).

Next, the determination unit 7f checks whether the determination has been made for both the luminance histogram and the hue histogram. If only the luminance histogram has been processed, the processing from step S1052 onward is performed in the same way on the hue histogram. In that case, luminance is read as hue, and the left and right ends L_left and L_right of the histogram are read as the left and right ends H_left and H_right of the hue histogram.

Next, the determination unit 7f determines whether the human body section has been determined for all pixels in the human body zone Za (step S1059). If the result is false, the next pixel is selected (step S1051) and step S1052 and the subsequent steps are performed again. If the result of step S1059 is true, the process returns to the main routine.

In this way, the human body section is defined using pixels of the address frame that clearly belong to the human body, on the conditions that the histogram peak containing the luminance L_middle (or hue H_middle) of such a pixel spans a section of limited width and that the colors are nearly equal, that is, the color distance is at or below a fixed value for all combinations. The human body section determination step of this embodiment can therefore determine the human body section with high accuracy.

[Human body section propagation step]
Next, based on the human body sections determined in step S105, the human body section propagation unit of the determination unit 7f performs a human body section propagation step, in which human body sections are determined in the histograms of the pixels outside the human body zone Za (step S106).

In general, adjacent pixels that represent the human body often have nearly the same color. For two such pixels, the ranges of the human body sections in their histograms are nearly equal, and the colors of the corresponding pixels are also nearly equal. Using these properties, the human body section propagation step determines the human body section of pixels for which it has not yet been decided (undetermined pixels) when they lie in the 8-neighborhood of pixels for which the human body section has already been set with high accuracy (determination-completed pixels).

An example of a specific processing procedure for the human body section propagation step is shown in FIG. 21. First, the determination unit 7f generates a distance image d_ij (step S1061). As shown in the pixel arrangement diagram of FIG. 22, the distance image d_ij represents the distance from the human body zone Za, which is drawn in gray: the value is 1 at the pixels closest to the human body zone Za and increases by 1 with each step away from it.
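One way to build such a distance image is a breadth-first search outward from the zone, sketched below in Python. The choice of 8-connected steps (chessboard distance) is an assumption made to match the 8-neighborhood used later in the propagation; the text does not state which metric FIG. 22 uses. The function name `distance_image` and the convention of storing 0 inside the zone are also hypothetical.

```python
from collections import deque
from typing import List, Optional


def distance_image(zone: List[List[bool]]) -> List[List[Optional[int]]]:
    """Return d[y][x]: 0 inside the zone, 1 for pixels touching the zone,
    increasing by 1 per 8-connected step away from it (None if no zone)."""
    height, width = len(zone), len(zone[0])
    dist: List[List[Optional[int]]] = [
        [0 if zone[y][x] else None for x in range(width)] for y in range(height)
    ]
    queue = deque((x, y) for y in range(height) for x in range(width) if zone[y][x])
    while queue:
        x, y = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height and dist[ny][nx] is None:
                    dist[ny][nx] = dist[y][x] + 1  # first visit gives the shortest distance
                    queue.append((nx, ny))
    return dist
```

With this image, the parameter t of the propagation step selects the ring of undetermined pixels to examine.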

Next, the determination unit 7f sets the parameter t, which defines the propagation range, to an initial value of 1 (step S1062).

Next, the determination unit 7f refers to the distance image and determines whether the 8-neighborhood of an undetermined pixel with t ≥ d_ij contains a determination-completed pixel for which the human body section has been determined (step S1063). Here, the “8-neighborhood” means the eight pixels adjacent to the undetermined pixel on its left, upper left, top, upper right, right, lower right, bottom and lower left.

If the 8-neighborhood of the undetermined pixel contains no determination-completed pixel at all (N in step S1063), the undetermined pixel is not adjacent to (contiguous with) any pixel for which the human body section has been determined. Because the properties described above are not satisfied in this case, the pixel is determined to be a “background section” in all frames (step S1070).

If a determination-completed pixel exists in the 8-neighborhood (Y in step S1063), the determination unit 7f sets the luminance and hue of the undetermined pixel as LB_middle and HB_middle (step S1064).

Next, the determination unit 7f determines whether propagation condition 1 is satisfied (step S1065). Propagation condition 1 is that the luminance and hue of the undetermined pixel under examination are close to the most frequent luminance and hue in the human body section of the determination-completed pixel. Specifically, the following conditions are evaluated.

[Propagation condition 1]
(For luminance)
LA_max − β ≤ LB_middle ≤ LA_max + β
D(CLB_middle, CLA_max) ≤ ρ
Here, the symbols are as follows.
LA_max: most frequent luminance in the human body section [L_left, L_right] of the determination-completed pixel
β: constant, 3 in this embodiment
ρ: constant, 0.25 in this embodiment
D: color distance
CLB_middle: color vector of the undetermined pixel in the address frame
CLA_max: color vector of the mode of the determination-completed pixel
(For hue)
HA_max − β ≤ HB_middle ≤ HA_max + β
D(CHB_middle, CHA_max) ≤ ρ
Here, the symbols are as follows.
HA_max: most frequent hue in the human body section [H_left, H_right] of the determination-completed pixel
β: constant, 3 in this embodiment
ρ: constant, 0.25 in this embodiment
D: color distance
CHB_middle: color vector of the undetermined pixel in the address frame
CHA_max: color vector of the mode of the determination-completed pixel
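Propagation condition 1 amounts to a window test on the class value plus a color distance test. The Python sketch below is illustrative: the function name and argument names are hypothetical, and the color distance is passed in as a callable so that the sketch stays self-contained. The same function serves for luminance and for hue by passing the corresponding values.

```python
from typing import Callable, Tuple

HSV = Tuple[float, float, float]


def satisfies_condition_1(
    b_middle: float,
    a_max: float,
    color_b: HSV,
    color_a_max: HSV,
    color_distance: Callable[[HSV, HSV], float],
    beta: float = 3,
    rho: float = 0.25,
) -> bool:
    """True if A_max - beta <= B_middle <= A_max + beta and D(C_B, C_Amax) <= rho."""
    close_in_class = a_max - beta <= b_middle <= a_max + beta
    close_in_color = color_distance(color_b, color_a_max) <= rho
    return close_in_class and close_in_color
```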

Next, if the undetermined pixel satisfies propagation condition 1 (Y in step S1065), the determination unit 7f determines the left and right ends LB_left, LB_right, HB_left and HB_right of the histograms of the undetermined pixel (step S1066).

FIG. 23 shows an example of the smoothed luminance and hue histograms of the undetermined pixel. When the luminance (or hue) of the pixel is denoted LB_middle (HB_middle), the left end LB_left (HB_left) is, as before, the position reached by following the histogram curve to the left from LB_middle (HB_middle) until the number of frames becomes zero. Similarly, the right end LB_right (HB_right) is the position reached by following the histogram curve to the right from LB_middle (HB_middle) until the number of frames becomes zero. To avoid noise, however, a position counts as zero only where three consecutive classes are zero. In this way, both ends of the peaks that contain the pixel's luminance and hue are found in the two histograms.

Next, the determination unit 7f determines whether the undetermined pixel satisfies propagation condition 2 (step S1067). Propagation condition 2 is that the peaks of the luminance histogram and the hue histogram of the undetermined pixel under examination lie within a section of predetermined small width. Specifically, it is evaluated using the following expressions.

[Propagation condition 2]
(For luminance)
1 ≤ |LB_right − LB_left| ≤ 14
LA_left − γ ≤ LB_left ≤ LA_left + γ
LA_right − γ ≤ LB_right ≤ LA_right + γ
Here, the symbols are as follows.
LA_left: left end of the human body section of the determined pixel
LA_right: right end of the human body section of the determined pixel
γ: a constant, set to 4 in this embodiment
(For hue)
1 ≤ |HB_right − HB_left| ≤ 14
HA_left − γ ≤ HB_left ≤ HA_left + γ
HA_right − γ ≤ HB_right ≤ HA_right + γ
Here, the symbols are as follows.
HA_left: left end of the human body section of the determined pixel
HA_right: right end of the human body section of the determined pixel
γ: a constant, set to 4 in this embodiment
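As a rough illustration, the three inequalities of propagation condition 2 can be checked per channel as sketched below. The function names and argument order are hypothetical. Requiring both the luminance and the hue channel to pass is a reading of the text rather than something the formulas state outright.

```python
GAMMA = 4        # the constant γ of this embodiment
MAX_WIDTH = 14   # upper bound on the peak width

def channel_ok(b_left, b_right, a_left, a_right,
               gamma=GAMMA, max_width=MAX_WIDTH):
    """Check the three inequalities for one channel (luminance or hue).

    b_left, b_right : peak ends of the undetermined pixel (LB_* or HB_*)
    a_left, a_right : human body section ends of the determined pixel (LA_* or HA_*)
    """
    width_ok = 1 <= abs(b_right - b_left) <= max_width
    left_ok = a_left - gamma <= b_left <= a_left + gamma
    right_ok = a_right - gamma <= b_right <= a_right + gamma
    return width_ok and left_ok and right_ok

def satisfies_condition2(lum_b, lum_a, hue_b, hue_a):
    """Each argument is a (left, right) pair; both channels must pass (assumption)."""
    return channel_ok(*lum_b, *lum_a) and channel_ok(*hue_b, *hue_a)


if __name__ == "__main__":
    print(satisfies_condition2((100, 110), (98, 112), (20, 25), (21, 27)))
```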

If propagation condition 2 is satisfied (Y in step S1067), the determination unit 7f determines whether the undetermined pixel satisfies propagation condition 3 (step S1068).

[Propagation condition 3]
For luminance, propagation condition 3 is defined as follows: the color distance D between the color average p of the pixels in the same class of the luminance histogram (where p ∈ [LB_left, LB_right]) and the average color of all pixels at LB_middle is at or below a fixed value (0.25 in this embodiment). For hue, likewise, the color distance D between the HSV average m of the pixels in the same class of the hue histogram (where m ∈ [HB_left, HB_right]) and the average color of all pixels at LB_middle is at or below a fixed value (0.25 in this embodiment). Verifying propagation condition 3 confirms that, for both luminance and hue, the colors within the histogram follow a similar tendency.

If the determination unit 7f determines that the undetermined pixel satisfies propagation condition 3, then in the histogram of this pixel it sets the section that satisfies all of propagation conditions 1 to 3 as the human body section and every other section as the background section (step S1069).

If the undetermined pixel fails any one of propagation conditions 1, 2, and 3 (N in step S1065, S1067, or S1068), the determination unit 7f determines that the pixel is background in all frames (step S1070).

Next, the determination unit 7f determines whether all undetermined pixels have been processed (step S1071). If not (N in step S1071), it selects another undetermined pixel and repeats step S1063 onward.

If the result in step S1071 is true (Y in step S1071), the determination unit 7f adds 1 to the parameter t that specifies the propagation range (step S1072). This widens the propagation range.

Next, the determination unit 7f determines whether the parameter t is at or below the maximum value of the propagation range (100 in this embodiment) (step S1073). If the result is true, processing returns to the main routine (return). If the result of step S1073 is false, the propagation range is widened by one and step S1063 onward is repeated.

[Background section determination step]
Next, the background section determination unit of the determination unit 7f determines the background section based on the histograms of the pixels in the non-mask region, which lies outside the mask region M, and on the color information of the non-mask-region pixels in the address frame (step S107). Here, the background section is the range of classes (luminance and hue in this example) that represents the background in the histogram.

Pixels in the non-mask region of the address frame are highly likely to show the background. Using the luminance and hue information of these pixels therefore allows the background section to be set accurately in the histogram.

The procedure of the background section determination step is basically the same as that of the human body section determination step shown in FIG. 19, with "human body section" read as "background section" and "background section" read as "human body section". A detailed flowchart is therefore omitted, but the determination unit 7f broadly performs the following processing.

a) Select a pixel in the non-mask region.
b) Based on the luminance (hue) of the pixel, determine the left and right ends L_left and L_right (H_left and H_right for hue) of the pixel's histogram.
c) Determine whether the absolute difference |L_right − L_left| (|H_right − H_left|) between the left end L_left (H_left) and the right end L_right (H_right) of the histogram is within a predetermined number of classes.
d) If the absolute difference |L_right − L_left| (|H_right − H_left|) is larger than the predetermined value, determine the pixel to be one that shows the "person" in all frames.
e) If the absolute difference |L_right − L_left| (|H_right − H_left|) is at or below the predetermined value, calculate the color distance D.
f) Determine whether the color distance D is within a predetermined range for all combinations.
g) If the result of step f is within the predetermined range, set the classes between the left and right ends L_left (H_left) and L_right (H_right) of the pixel's histogram as the range of the background section.
h) If the result of step f is not within the predetermined range, determine the pixel to be one that shows the "person" in all frames.
i) Perform the above steps for all pixels in the non-mask region.
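Steps c) to h) amount to a two-stage test on each non-mask pixel, which might be sketched as below. The function name and return format are hypothetical, the color distances are assumed to have been computed elsewhere, and the two thresholds are left as required arguments because this passage does not give their values.

```python
def classify_non_mask_pixel(left, right, color_distances,
                            max_width, max_distance):
    """Decide one channel of one non-mask pixel.

    left, right     : ends of the histogram peak (L_left/L_right or H_left/H_right)
    color_distances : iterable of color distances D for all combinations
    max_width       : predetermined width limit (value not given in this passage)
    max_distance    : predetermined distance limit (value not given in this passage)

    Returns ("background", (left, right)) or ("person", None).
    """
    # Steps c) and d): a wide peak means the pixel shows the person in all frames.
    if abs(right - left) > max_width:
        return ("person", None)
    # Steps e) to g): every color distance must stay within the limit.
    if all(d <= max_distance for d in color_distances):
        return ("background", (left, right))
    # Step h): otherwise the pixel shows the person in all frames.
    return ("person", None)


if __name__ == "__main__":
    print(classify_non_mask_pixel(10, 20, [0.10, 0.20], 14, 0.25))
```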

In this way, the background section is set using pixels that clearly belong to the background in the address frame, on the condition that the histogram peak containing the pixel's luminance L_middle (or hue H_middle) lies in a narrow interval and that the colors are nearly equal, with the color distance at or below a fixed value for all combinations. The background section determination step of this embodiment can therefore also determine the background section accurately.

[Background section propagation step]
Next, the background section propagation unit of the determination unit 7f performs a background section propagation step, which determines the background section in the histograms of the pixels inside the mask region based on the background section determined above (step S108).

The procedure of the background section propagation step is basically the same as that of the human body section propagation step shown in FIG. 21, with "human body section" read as "background section" and "background section" read as "human body section". A detailed flowchart is therefore omitted, but the determination unit 7f broadly performs the following processing.

a) Generate the distance image d_ij.
b) Set the parameter t, which defines the propagation range, to 1.
c) For an undetermined pixel with t ≥ d_ij, determine whether any of its 8 neighbors is a determined pixel whose background section has been determined.
d) If none of the 8 neighbors of the undetermined pixel is a determined pixel whose background section has been determined, determine the pixel to be "human body section" in all frames.
e) If a determined pixel exists among the 8 neighbors of the undetermined pixel, set the luminance and hue of the undetermined pixel as LB_middle and HB_middle.
f) Determine whether the undetermined pixel satisfies propagation condition 1. If it does, determine the left and right ends LB_left, LB_right, HB_left, and HB_right of its histograms and determine whether propagation condition 2 is satisfied.
g) If propagation condition 2 is satisfied, determine whether the undetermined pixel satisfies propagation condition 3.
h) If the undetermined pixel satisfies propagation condition 3, set the section of this pixel's histogram that satisfies all of propagation conditions 1 to 3 as the background section and every other section as the human body section.
i) If the undetermined pixel fails any one of propagation conditions 1, 2, and 3, determine the pixel to be human body in all frames.
j) After performing the above processing for all undetermined pixels, add 1 to the parameter t.
k) Continue the processing until the parameter t exceeds the maximum value of the propagation range.

[Consistency determination step]
Next, after the background section and the human body section have been determined, the consistency check unit of the determination unit 7f determines whether they are consistent (step S109).

FIG. 24 schematically shows the processing performed so far. The human body section propagation step uses the pixels of the human body zone to propagate the human body section outward to the surrounding pixels. The background section propagation step uses the pixels of the non-mask region to propagate the background section inward. As a result, in each of the following regions the human body section and the background section are determined by two different methods.
Human body zone: human body section determination step and background section propagation step
Outside the human body zone but inside the mask region: human body section propagation step and background section propagation step
Non-mask region: background section determination step and human body section propagation step

Because the methods differ, however, the resulting sections may contradict one another. In this embodiment, a consistency determination step is performed to resolve such contradictions. FIG. 25 shows an example of its procedure.

The determination unit 7f first determines whether consistency condition 1 is satisfied (step S1091). In this embodiment, consistency condition 1 is as follows.

[Consistency condition 1]
D(C_b; C_h) < 0.3
For a pixel for which both a background section and a human body section have been determined in any of the human body section propagation step, the background section propagation step, the human body section determination step, or the background section determination step, this condition asks whether the color distance D(C_b; C_h) between the average C_b of the pixel values (luminance and hue) of the background section and the average C_h of the pixel values (luminance and hue) of the human body section is below a predetermined value (below 0.3 in this embodiment).

If the result for consistency condition 1 is false (N in step S1091), the determination unit 7f judges that the determined human body section and background section are both correct, and sets every other section as a background section (step S1095). Consistency condition 1 checks whether the average colors of the pixel values in the human body section and the background section are too close. When the condition is not satisfied, the determination unit 7f judges that the colors of the two sections differ sufficiently and treats the earlier results as correct.

If, on the other hand, the determination unit 7f determines that consistency condition 1 is satisfied (Y in step S1091), it determines whether consistency condition 2 is satisfied (step S1092). In this embodiment, consistency condition 2 is as follows.

[Consistency condition 2]
D(C_b; C_back) < D(C_b; C_human) … (1)
A > B … (2)
Here, the symbols are as follows.
D(C_b; C_back): color distance between the average C_b of the pixel values of the target pixel and the color vector C_back of the pixel values in the background section of the non-mask-region pixel from which the section was propagated
D(C_b; C_human): color distance between the average C_b of the pixel values and the color vector C_human of the pixel values in the human body section of the human-body-zone pixel from which the section was propagated
A: (number of frames common to F_b and F_back) / (number of frames in F_back)
B: (number of frames common to F_b and F_human) / (number of frames in F_human)
F_b: set of frames corresponding to the background section at the position satisfying D(C_b; C_h) < 0.3
F_back: set of frames corresponding to the background section of C_back
F_human: set of frames corresponding to the human body section of C_human

FIG. 26 shows the histogram of a pixel as an example. When the color distances from the average color of the pixels in a pixel's background section to the background color of the propagation source and to the human body color of the propagation source are calculated, the distance to the background color may be the smaller of the two. Expression (1) of consistency condition 2 checks this point. In addition, a section that contains more background frames is more reliable as a background section. Expression (2) of consistency condition 2 checks this point.

If the determination unit 7f determines that consistency condition 2 is also satisfied (Y in step S1092), it sets the section as a background section (step S1094). If it determines that consistency condition 2 is not satisfied, it sets the section as a human body section (step S1093). Performing the consistency check in this way enables more accurate determination.
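The decision flow of steps S1091 to S1095 can be illustrated with the short Python sketch below. It is a simplification under assumptions: the color distances are passed in precomputed, frame sets are modeled as Python sets of frame indices, both expressions (1) and (2) are taken to be required for consistency condition 2, and the function names and return labels are hypothetical.

```python
def overlap_ratio(f_b, f_ref):
    """Share of the frames in f_ref that also appear in f_b (A or B in the text)."""
    if not f_ref:
        return 0.0
    return len(f_b & f_ref) / len(f_ref)

def resolve_section(d_b_h, d_b_back, d_b_human,
                    f_b, f_back, f_human, threshold=0.3):
    """Resolve a pixel that has both a background and a human body section.

    d_b_h     : D(C_b; C_h)
    d_b_back  : D(C_b; C_back)
    d_b_human : D(C_b; C_human)
    f_b, f_back, f_human : sets of frame indices (F_b, F_back, F_human)
    """
    # Consistency condition 1 false: the colors differ enough,
    # so the earlier results are kept (step S1095).
    if not d_b_h < threshold:
        return "keep"
    a = overlap_ratio(f_b, f_back)    # A
    b = overlap_ratio(f_b, f_human)   # B
    # Consistency condition 2: expressions (1) and (2) both hold (assumption).
    if d_b_back < d_b_human and a > b:
        return "background"           # step S1094
    return "human"                    # step S1093


if __name__ == "__main__":
    print(resolve_section(0.1, 0.05, 0.2, {1, 2, 3, 4}, {1, 2, 3}, {4, 5, 6, 7}))
```

In the sample call, A is 3/3 and B is 1/4, and the section is closer in color to the propagation-source background, so it resolves to background.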

Through the above processing, every pixel of every frame is classified into either the "background section" or the "person section". For each frame, whether a pixel shows the background or the person can then be determined using the luminance histogram and the hue histogram.

FIGS. 27 to 29 show the silhouettes of the person in the address, top, and finish frames obtained by the silhouette extraction method of this embodiment. The method reproduces the person's silhouette almost faithfully. Such silhouettes are used to diagnose a golfer's swing by image processing.

Various improvements can be made to the present invention.
For example, when the camera 4 captures the swing at a golf driving range, the background of each frame may include one or more poles 16 that support the net of the driving range, as shown in FIG. 30. Such poles 16 extend vertically in the camera view. The poles 16 therefore interrupt the horizontal continuity of color in the otherwise similar background scenery between them, which tends to lower the accuracy of the background section propagation step. If the poles 16 are recognized separately in advance, the background section propagation step can skip (step over) the pixels representing the poles 16 when propagating, which raises its accuracy.

The poles have, for example, the following regular features.
A) They are located in the upper area of the address frame (for example, 0 ≤ y ≤ 250).
B) They are nearly achromatic.
C) They are within ±2 degrees of the vertical.
D) Their width in the x direction is within a predetermined range of pixels (1 to 11 pixels in this embodiment).
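The four features can be combined into a simple candidate test, sketched below. This is an illustration only: the function name, the representation of a candidate by its top and bottom end points, the requirement that both end points lie in the upper area, and the saturation threshold `max_saturation` used to stand in for "nearly achromatic" are all assumptions not given in the text.

```python
import math

def looks_like_pole(x_top, y_top, x_bottom, y_bottom, width_px, saturation,
                    y_max=250, max_tilt_deg=2.0, max_width=11,
                    max_saturation=0.1):
    """Check a candidate line segment against features A) to D).

    saturation     : mean saturation of the candidate in [0, 1]
    max_saturation : hypothetical cutoff for "nearly achromatic"
    """
    # A) located in the upper area of the address frame
    in_upper_area = 0 <= y_top <= y_max and 0 <= y_bottom <= y_max
    # B) nearly achromatic
    achromatic = saturation <= max_saturation
    # C) within +/-2 degrees of the vertical
    dy = abs(y_bottom - y_top)
    if dy == 0:
        return False  # a horizontal or degenerate segment cannot be a pole
    tilt_deg = math.degrees(math.atan2(abs(x_bottom - x_top), dy))
    near_vertical = tilt_deg <= max_tilt_deg
    # D) width in the x direction of 1 to 11 pixels
    width_ok = 1 <= width_px <= max_width
    return in_upper_area and achromatic and near_vertical and width_ok


if __name__ == "__main__":
    print(looks_like_pole(100, 0, 103, 200, width_px=5, saturation=0.05))
```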

In this embodiment, these regular features are used, together with the information of each pixel of the address frame, to search for the set of pixels that shows the poles. This identifies the pole section, that is, the class that shows the pole 16 in the address frame. Because the poles 16 are considered not to change during the golf swing, the same pixels can be regarded as showing the poles 16 in the other frames as well.

It is then desirable to skip the pixels of the pole section in the background section propagation step. FIG. 31 is an enlarged plan view of pixels, in which the pixels shown in gray show the pole 16. Suppose that pixel 9 is being processed in the background section propagation step. The 8 neighbors of pixel 9 would normally be pixels 1 to 8. Here, however, pixels 3, 5, and 8 show the pole, so they are skipped and the 8 neighbors are taken to be pixels 1, 2, 4, 6, 7, 3', 5', and 8'. Thus, even if pixels 1, 2, 4, 6, and 7 are undetermined, the background section of pixel 9 can be determined by referring to pixels 3', 5', and 8' if any of them is a determined pixel.
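A pole-skipping 8-neighborhood of this kind might be computed as in the sketch below. FIG. 31 is not reproduced here, so the direction of the skip is an assumption: because the poles are nearly vertical, the sketch steps over pole pixels along the x direction only, and it drops a neighbor directly above or below that is itself a pole pixel. The function name and the boolean mask `is_pole` are hypothetical.

```python
# Offsets (row, column) of the ordinary 8-neighborhood.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]

def neighbors_skipping_pole(row, col, is_pole):
    """Return the 8 neighbors of (row, col), stepping over pole pixels.

    is_pole : 2D list of booleans, True where the pixel shows a pole
    """
    height, width = len(is_pole), len(is_pole[0])
    result = []
    for dr, dc in OFFSETS:
        r, c = row + dr, col + dc
        # Step over pole pixels horizontally (assumption: poles are vertical).
        while (dc != 0 and 0 <= r < height and 0 <= c < width
               and is_pole[r][c]):
            c += dc
        if 0 <= r < height and 0 <= c < width and not is_pole[r][c]:
            result.append((r, c))
    return result


if __name__ == "__main__":
    # A 3x5 patch with a one-pixel-wide pole in column 3.
    mask = [[False, False, False, True, False] for _ in range(3)]
    print(neighbors_skipping_pole(1, 2, mask))
```

For the sample patch, the three neighbors in column 3 are replaced by the pixels in column 4, which play the role of pixels 3', 5', and 8' in the description.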

Although the present invention has been described in various ways above, the silhouette extraction system of the present invention can also be realized using only the CPU of the mobile phone 2. In that case the invention can be carried out with the mobile phone alone, without any connection to a server. A photographer who brings only the mobile phone 2 can diagnose the swing on the spot.

[Processing example 1: FIG. 32]
In Example 1, the silhouette extraction processing according to the present invention was applied to the original image. In Comparative Example 1, the silhouette was extracted by applying the background section propagation step plus fixed mask processing to the original image, as in Patent Document 1 cited above. In Comparative Example 1, the person's chest, whose color closely resembles the background, was recognized as background, as indicated by the rectangular frame. Example 1 shows no such defect and has high recognition accuracy.

[Processing example 2: FIG. 33]
In Example 2, the silhouette extraction processing according to the present invention was applied to the original image. In Comparative Example 2, the silhouette was extracted by the method of Patent Document 1 cited above. In Comparative Example 2, the front of the person's waist, whose color closely resembles the background, was recognized as background, as indicated by the rectangular frame. Example 2 shows no such defect and has high recognition accuracy.

[Processing example 3: FIG. 34]
In Example 3, the silhouette extraction processing according to the present invention was applied to the original image, with pole recognition and with the pole pixels skipped in the background section propagation step. In Comparative Example 3, the silhouette was extracted by the method of Patent Document 1 cited above. In Comparative Example 3, part of a pole was recognized as the person, as indicated by the rectangular frame. This is thought to be because the background section could not be propagated across the pole, so the pole pixels were included in the person. Example 3 shows no such defect and has high recognition accuracy.

[Processing example 4: FIG. 35]
In Example 4, the silhouette extraction processing according to the present invention was applied to the original image, with pole recognition and with the pole pixels skipped in the background section propagation step. In Comparative Example 4, the silhouette was extracted by the method of Patent Document 1 cited above. In Comparative Example 4, part of a pole was recognized as the person, as indicated by the rectangular frame. Example 4 shows no such defect and has high recognition accuracy.

1 Silhouette extraction system
2 Mobile phone
3 Server
4 Camera
5, 8 Memory
7 Calculation unit

Claims

Extract a frame, which is a collection of pixels, from a moving image of a person who performs a swing motion with a golf club and the background, and specify each pixel of the extracted frame as either the person or the background A method for extracting the silhouette of the person,
Creating a complete frame set of all frames for each pixel;
Creating a histogram in which the frequency is the number of frames and the class is color information for the entire frame set;
Extracting an address frame in which the address state of the person is imaged from the moving image;
A feature point extracting step of performing image processing on the address frame and extracting feature points at a plurality of predetermined locations for the person and the golf club;
Based on the feature points, setting a mask area that includes the person and the golf club while partially including the surrounding background;
Setting a human body zone that is a region having a high probability of forming the silhouette of the person from pixels included in the mask region based on the feature points;
A human body section determining step for determining a human body section that is a class representing a human body in the histogram based on a histogram of pixels of the human body zone and color information of pixels of the human body zone of the address frame;
A human body segment propagation step for determining a human body segment in a histogram of pixels outside the human body zone based on the determined human body segment;
Based on the histogram of the pixels in the non-mask area outside the mask area and the color information of the pixels in the non-mask area in the address frame, a background section that is a class representing the background in the histogram is determined. A background segment determination step;
A silhouette extraction method comprising: a background interval propagation step of determining a background interval in a histogram of pixels in the mask region based on the determined background interval.

The silhouette extraction method according to claim 1, wherein the histogram comprises a first histogram for all pixels whose class is luminance, a second histogram for chromatic pixels whose class is hue, and a third histogram for achromatic pixels whose class is luminance.

3. The feature points include positions representing the top of the human body in the addressed state, forehead, back, waist, back of knees, heels, toes, thighs, hands, golf club heads and shoulders. Silhouette extraction method.

The step of setting the mask region includes the step of connecting the feature points to set an initial mask region;
4. The silhouette extraction method according to claim 1, wherein the mask area is set by expanding the initial mask area to a predetermined thickness.

Setting the human body zone based on the feature points, setting a reference point at a position that is clearly human body inside the initial mask region; and
The silhouette extraction method according to claim 4, wherein the human body zone is determined using the reference point.

The silhouette extraction method according to claim 1, further comprising:
identifying, among the pixels of the frame, pixels that show a pole of a driving range; and
determining, based on the pixels showing the pole and their histograms, a pole section that is a class showing the pole,
wherein the background section propagation step skips the pole section.

The silhouette extraction method according to claim 1, further comprising a step of determining consistency between the background section and the human body section after the background section and the human body section are determined.

Extract a frame, which is a collection of pixels, from a moving image of a person who performs a swing motion with a golf club and the background, and specify each pixel of the extracted frame as either the person or the background A system for extracting the silhouette of the person,
A camera for capturing the video, a memory for storing the captured video, and a calculation unit;
The computing unit is
For each pixel, a set creation unit that creates a set of all frames including all frames;
A histogram creation unit that creates a histogram in which the frequency is the number of frames and the class is color information for the entire frame set;
An address frame extraction unit that extracts an address frame in which the address state of the person is imaged from the moving image;
A feature point extraction unit that performs image processing on the address frame and extracts feature points at a plurality of predetermined positions for the person and the golf club;
Based on the feature points, a mask area setting unit that sets a mask area that includes the human body and the golf club and includes a background around the human body and the golf club;
Based on the feature points, a human body zone setting unit that sets a human body zone that is a region having a high probability of forming the silhouette of the person from the pixels included in the mask region;
A human body section determination unit that determines a human body section that is a class representing a human body in the histogram based on a histogram of pixels of the human body zone and color information of pixels of the human body zone of the address frame;
Based on the determined human body section, a human body section propagation unit that determines a human body section in a histogram of pixels outside the human body zone,
Based on the histogram of the pixels in the non-mask area outside the mask area and the color information of the pixels in the non-mask area in the address frame, a background section that is a class representing the background in the histogram is determined. A silhouette extraction system comprising: a background section determination section; and a determination section including a background section propagation section that determines a background section in a histogram of pixels in the mask region based on the determined background section.