JP6187261B2

JP6187261B2 - Person feature amount extraction system, server, and person feature amount extraction method

Info

Publication number: JP6187261B2
Application number: JP2013552342A
Authority: JP
Inventors: Yusuke Takahashi (祐介 髙橋)
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2012-01-05
Filing date: 2012-12-11
Publication date: 2017-08-30
Anticipated expiration: 2032-12-11
Also published as: JPWO2013102972A1; WO2013102972A1

Description

The present invention relates to a system, an imaging terminal, a server, and a method for extracting feature amounts of a person from an image.

For applications such as monitoring, tracking, and searching for people, there is a demand for identifying the same person (person identification) within the video of a single camera and across the videos of multiple cameras. To track or search for a person, feature amounts describing the person's appearance, such as the face and clothing, are extracted from an image. Extracting these feature amounts requires extracting each person shown in the image and calculating feature amounts for each extracted person.

For example, surveillance cameras generally have to cover a wide area, so they are often installed at a high position and fitted with a wide-angle lens or the like. As a result, a person occupies only a small part of the image and is often captured from an angle looking down obliquely from above. Extracting a person's head or face from camera video shot under such conditions requires a large amount of computation. Because this processing is difficult for a small image processing device installed in the camera, the captured image is usually transferred to a server with high computing power, and person extraction and feature amount calculation are performed on the server.

However, sending the image itself when transferring camera video to the image-processing server raises a privacy problem. A countermeasure is therefore needed, such as performing image processing on the camera side to convert the image into information from which the original image cannot be restored, and only then transferring the data (feature amounts) to the server.

Among existing techniques, Patent Document 1 proposes a method for analyzing a person's behavior without infringing on that person's privacy.

In the method of Patent Document 1, numerical values representing the features of a person shown in captured video are extracted as feature amount data and encrypted. The imaging device then transmits the encrypted feature amount data, predetermined related information, and an imaging device identifier uniquely assigned to that imaging device to a collation device. The collation device stores the encrypted feature amount data, the related information, and the imaging device identifier in a feature amount database, matches the feature amount data in its encrypted form at an arbitrary point in time, and issues a person identifier for each unique person based on the matching. The collation device then creates a behavior database containing, for each person identifier, pairs of related information and imaging device identifier.

Patent Document 2 proposes a method in which the feature data extraction processing is divided between two devices.

In this method, part of the feature data extraction processing for the input information is executed by a first information processing apparatus and the remainder by a second information processing apparatus. That is, the feature data extraction processing is executed sequentially across a plurality of information processing apparatuses. The first information processing apparatus is characterized in that it performs an operation from which the final feature data cannot be inferred even if the algorithm is deciphered.

Patent Document 3 proposes a method in which an image acquired by a camera is transmitted to a person tracking device, the person tracking device detects a head region, and person information is extracted as a feature amount using the detected position and size.

Non-Patent Document 1 proposes a method for detecting a human face region based on gradient information in video.

Non-Patent Document 2 describes MPEG-7, a standard for representing feature data.

Patent Document 1: JP 2011-014059 A
Patent Document 2: JP 2005-222352 A
Patent Document 3: JP 2010-273112 A

Non-Patent Document 1: Toshinori Hosoi, Tetsuaki Suzuki, Atsushi Sato, "Face Detection by Generalized Learning Vector Quantization", IEICE Technical Report, PRMU (Pattern Recognition and Media Understanding), 102(651), pp. 47-52, 2003-02-13
Non-Patent Document 2: B. S. Manjunath, Philippe Salembier, Thomas Sikora (eds.), Introduction to MPEG-7: Multimedia Content Description Interface, John Wiley & Sons, Ltd., Baffins Lane, Chichester, West Sussex PO19 1UD, England (ISBN 0 471 48678 7), pp. 208-212

However, merely encrypting the transmitted information as in Patent Document 1 leaves open the possibility that, if the transmitted information is obtained by some means and successfully decrypted, the original image can be reconstructed. In that case, a privacy problem may arise.

Likewise, if the captured image itself is transmitted to the device that extracts person feature amounts in order to reduce the load on the imaging device, as in Patent Document 3, the person's face and the like are transmitted as they are. This may also give rise to a privacy problem.

In the technique described in Patent Document 2, the first information processing apparatus performs an operation from which the final feature data cannot be inferred even if the algorithm is deciphered, but that operation is computationally expensive. Consequently, if the operation is performed for every person in the image, an imaging terminal with low processing capability may be unable to complete the processing.

The present invention has been made in view of the above problems. Its object is to provide a person feature amount extraction system, an imaging terminal, a server, and a person feature amount extraction method that allow the imaging terminal to transmit privacy-protected information to the server even when the processing capability of the imaging terminal is low.

According to the present invention, there is provided a person feature amount extraction system comprising at least one imaging terminal and at least one server, wherein
the imaging terminal includes:
first feature amount extraction means for extracting a first feature amount from an image showing a person; and
feature amount transmitting means for transmitting the extracted first feature amount, and
the server includes:
feature amount receiving means for receiving the transmitted first feature amount;
person position specifying means for specifying the position of a person present in the image based on the first feature amount; and
second feature amount extraction means for extracting a second feature amount for each person based on the first feature amount and the position of the person.

According to the present invention, there is also provided a server including:
feature amount receiving means for receiving a first feature amount transmitted from an imaging terminal;
person position specifying means for specifying the position of a person present in the image based on the first feature amount; and
second feature amount extraction means for extracting a second feature amount for each person based on the first feature amount and the position of the person.

According to the present invention, there is also provided a person feature amount extraction method using at least one imaging terminal and at least one server, wherein
the imaging terminal executes:
first feature amount extraction means for extracting a first feature amount from an image showing a person; and
feature amount transmitting means for transmitting the extracted first feature amount, and
the server executes:
feature amount receiving means for receiving the transmitted first feature amount;
person position specifying means for specifying the position of a person present in the image based on the first feature amount; and
second feature amount extraction means for extracting a second feature amount for each person based on the first feature amount and the position of the person.

According to the present invention, even when the processing capability of the imaging terminal is low, the imaging terminal can transmit privacy-protected information to the server.

The above object and other objects, features, and advantages will become more apparent from the preferred embodiments described below and the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of a system according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing the flow of processing of the system according to the first embodiment of the present invention.
FIG. 3 is a diagram showing an example of the grid size used when extracting the first feature amount in the first embodiment of the present invention.
FIG. 4 is a diagram showing an example of creating a person region.
FIG. 5 is a diagram showing an example of a method for extracting an edge histogram feature amount.
FIG. 6 is a diagram showing a histogram of the appearance frequency of luminance gradient directions.
FIG. 7 is a diagram showing examples of a method and algorithm for classifying luminance gradient directions.
FIG. 8 is a diagram showing an example of a method for extracting a color layout feature amount.
FIG. 9 is a diagram showing an example of the grid size used when extracting the first feature amount in the second embodiment of the present invention.
FIG. 10 is a diagram showing an example of the grid size used when extracting the first feature amount in the third embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings. In all drawings, like components are denoted by like reference numerals, and their description is omitted as appropriate.

(First embodiment)
FIG. 1 is a block diagram showing the configuration of a system according to the first embodiment of the present invention. The person feature amount extraction system of this embodiment includes an imaging terminal 10 and a server 20. The imaging terminal 10 includes image acquisition means 102, first feature amount extraction means 104, and feature amount transmitting means 106. The server 20 includes feature amount receiving means 202, person position specifying means 204, and second feature amount extraction means 206.

The imaging terminal 10 transmits the first feature amount extracted from the acquired image to the server 20. Based on the first feature amount received from the imaging terminal 10, the server 20 specifies the position of each person and extracts a second feature amount used to identify the person.

Note that the components of the imaging terminal 10 and the server 20 shown in FIG. 1 represent functional blocks, not hardware units. Each component of the imaging terminal 10 and the server 20 is realized by an arbitrary combination of hardware and software, centered on the CPU of an arbitrary computer, memory, a program loaded into the memory that implements the components shown in the figure, a storage medium such as a hard disk that stores the program, and a network connection interface. There are various modifications of the implementation method and apparatus.

The flow of processing in this embodiment is described with reference to FIG. 2.

The image acquisition means 102 acquires an image captured by a video input device, for example a CCD digital camera or video camera such as a surveillance camera, and stores it in a storage area such as memory or storage (S102). If the information input from the video input device is a moving image, the image acquisition means 102 cuts the moving image into arbitrary frame units and treats each as an image. Because images are usually acquired as RGB values, the image acquisition means 102 converts the RGB values of the acquired image into YCbCr values, for example using Equation 1.
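Equation 1 itself does not appear in this text, so the coefficients the patent uses are unknown. As a rough illustration only, the sketch below applies the full-range ITU-R BT.601 conversion used by JFIF/JPEG, which is an assumption; the function name `rgb_to_ycbcr` is likewise hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    Assumption: BT.601 coefficients as used by JFIF; the patent's own
    Equation 1 may use different coefficients or offsets.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```

With these coefficients, any neutral gray maps to Cb = Cr = 128, so only the Y channel carries information for colorless pixels.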

The first feature amount extraction means 104 calculates the first feature amount from the acquired image (S104). In this embodiment, as an example, the first feature amount extraction means 104 calculates color distribution information and luminance gradient information as the first feature amount.

The following shows examples in which the first feature amount extraction means 104 extracts the color distribution information and the luminance gradient information.

First, an example in which the first feature amount extraction means 104 extracts color distribution information is described. As shown in FIG. 3, the first feature amount extraction means 104 first divides the entire acquired image into grid cells of a uniform grid size 108 (here, M wide by N high). The grid size 108 is, for example, set in the first feature amount extraction means 104 in advance. The first feature amount extraction means 104 then calculates, for each grid cell, the color expressed by the average values of Y, Cb, and Cr over the pixels (m, n) in that cell (hereinafter, the average color) using Equation 2. By calculating the average color of every grid cell of the acquired image with Equation 2, the first feature amount extraction means 104 extracts color distribution information from the acquired image.
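Equation 2 is not reproduced in this text, but the averaging it describes can be sketched as follows. This is a minimal illustration, assuming the image is held as nested Python lists of (Y, Cb, Cr) tuples; the function name and the handling of partial cells at the image border are assumptions, not taken from the patent.

```python
def grid_average_colors(image, cell_w, cell_h):
    """Average (Y, Cb, Cr) per grid cell.

    image: list of rows, each row a list of (Y, Cb, Cr) tuples.
    Returns a 2-D list with one averaged tuple per cell. Cells at the
    right and bottom borders may be smaller than cell_w x cell_h
    (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    grid = []
    for top in range(0, height, cell_h):
        row = []
        for left in range(0, width, cell_w):
            pixels = [image[y][x]
                      for y in range(top, min(top + cell_h, height))
                      for x in range(left, min(left + cell_w, width))]
            n = len(pixels)
            row.append(tuple(sum(p[c] for p in pixels) / n for c in range(3)))
        grid.append(row)
    return grid
```

Because each cell keeps only three averages, the per-pixel detail of the original image is discarded before anything leaves the terminal.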

Next, an example in which the first feature amount extraction means 104 extracts luminance gradient information is described. If the acquired image is a color image, the first feature amount extraction means 104 first converts it to grayscale. It then uses a Sobel filter or the like to calculate the edge gradient strength in the horizontal and vertical directions from the grayscale image. Finally, it quantizes the luminance gradient direction into an arbitrary number K of directions and calculates it together with the gradient strength. Equation 3 gives the gradient strength strength(i, j) at pixel (i, j) of the grayscale (Y component) image, and Equation 4 gives the edge direction dir(i, j) at the same pixel. Using Equations 3 and 4, the first feature amount extraction means 104 calculates the edge direction and gradient strength at each pixel of the image.
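Equations 3 and 4 are not reproduced in this text. The sketch below shows one common way to obtain a gradient strength and a quantized direction with 3 × 3 Sobel kernels. The Euclidean magnitude, the use of `atan2`, and the binning over 360 degrees are all assumptions; the patent's own equations may differ (for example, by using a sum of absolute values or a 180-degree range).

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_gradient(gray, i, j, k=8):
    """Return (strength, direction bin) at interior pixel (row i, column j).

    gray is a list of rows of luminance (Y) values. The direction is
    quantized into k equal bins over 360 degrees, with bin 0 centred on
    0 degrees (an assumption of this sketch).
    """
    gx = gy = 0.0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            value = gray[i + di][j + dj]
            gx += SOBEL_X[di + 1][dj + 1] * value
            gy += SOBEL_Y[di + 1][dj + 1] * value
    strength = math.hypot(gx, gy)
    angle = math.degrees(math.atan2(gy, gx)) % 360.0
    direction = int((angle + 180.0 / k) // (360.0 / k)) % k
    return strength, direction
```

A vertical dark-to-bright edge yields a purely horizontal gradient (bin 0), while a horizontal edge that brightens downward falls into the 90-degree bin.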

As described above, the first feature amount extraction means 104 only averages pixel values or extracts luminance gradient information with a filter. The computational cost is therefore lower than that of processing based on the discrete cosine transform.

The feature amount transmitting means 106 transmits the first feature amount that the first feature amount extraction means 104 extracted from the original image to the server 20 over a wired connection such as a LAN or a wireless connection such as Wi-Fi (Wireless Fidelity) (S106). In this embodiment, the feature amount transmitting means 106 transmits the luminance gradient information and the color distribution information. When the amount of data to be transmitted is large, the feature amount transmitting means 106 may compress the data using, for example, run-length compression. In the color distribution information extracted by the first feature amount extraction means 104, the same color information appears consecutively in the divided grid, so run-length compression compresses the transmitted information efficiently. Similarly, in the luminance gradient information extracted by the first feature amount extraction means 104, the per-direction gradient strength is often 0 in regions of the image other than edges. Applying run-length compression to the pixels whose per-direction gradient strength is 0 therefore also compresses the transmitted information efficiently. The feature amount transmitting means 106 may instead use a compression method other than run-length compression, such as arithmetic coding that takes the appearance frequency of each symbol into account.
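The run-length compression mentioned here can be sketched as below. The function names and the `[value, run_length]` pair layout are illustrative assumptions; the patent does not specify a wire format.

```python
def rle_encode(values):
    """Collapse consecutive equal values into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def rle_decode(runs):
    """Inverse of rle_encode: expand [value, run_length] pairs."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out
```

For a row of gradient strengths such as `[0, 0, 0, 0, 7, 7, 0, 0, 0]`, the nine values collapse into three runs. The decoder corresponds to the decompression that the receiving side performs before using the data.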

The feature amount transmitting means 106 transmits the first feature amount without transmitting the original image. Consequently, even if a third party intercepts the information transmitted from the feature amount transmitting means 106, the original image cannot be obtained from it. The feature amount transmitting means 106 may also be configured to transmit no information other than the first feature amount, which protects privacy more reliably.

Based on the first feature amount transmitted from the feature amount transmitting means 106, the server 20 detects a specific body part of a person, estimates the person's position, and extracts a second feature amount for identifying that person. The specific body part may be, for example, the upper body, head, face, or shoulders. The following describes an example in which the head or face is detected as the specific part.

The feature amount receiving means 202 receives the first feature amount transmitted from the feature amount transmitting means 106 and stores it in a storage area such as memory or storage. If the received first feature amount has been compressed by run-length compression or the like, the feature amount receiving means 202 decompresses the data before storing it in memory or the like.

The person position specifying means 204 detects, from the first feature amount stored in memory, a region in which a person is presumed to appear (hereinafter, a person region) (S108). This embodiment describes an example in which the person position specifying means 204 detects the person region using the color distribution information and the luminance gradient information.

First, a method by which the person position specifying means 204 detects a person region using the luminance gradient information is described.

The person position specifying means 204 stores in advance, in a storage unit (not shown), luminance gradient information that serves as a reference for a person's head and face regions. It compares this stored luminance gradient information with the luminance gradient information received as the first feature amount and detects a region whose similarity is at or above a predetermined threshold as a person's head or face region. A face region can be detected from luminance gradient information quantized into K directions by the method described in Non-Patent Document 1. A head can be detected in the same way by using head images as the training data in the method of Non-Patent Document 1.
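The threshold comparison against stored reference information can be illustrated with a toy sliding-window matcher. Note that this sketch does not implement the learned classifier of Non-Patent Document 1 (generalized learning vector quantization). A simple fraction-of-matching-cells similarity stands in for it, and the function name and data layout are hypothetical.

```python
def detect_by_template(direction_map, template, threshold):
    """Slide `template` over `direction_map` and report similar windows.

    Both arguments are 2-D lists of quantized direction bins. Returns
    (top, left, similarity) for every window whose fraction of matching
    cells is at least `threshold`. The match-ratio similarity is a
    stand-in chosen for this sketch.
    """
    th, tw = len(template), len(template[0])
    hits = []
    for top in range(len(direction_map) - th + 1):
        for left in range(len(direction_map[0]) - tw + 1):
            matches = sum(
                direction_map[top + y][left + x] == template[y][x]
                for y in range(th) for x in range(tw))
            similarity = matches / (th * tw)
            if similarity >= threshold:
                hits.append((top, left, similarity))
    return hits
```

Each hit gives the position of a candidate head or face region, and the template size gives its extent, which is the information the next step uses to place the body regions.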

The person position specifying means 204 then estimates the person's torso region based on the position and size at which the head or face region was detected. The person position specifying means 204 further includes model creation means (not shown) for creating a model of the person region, and this model creation means determines the upper body region 304 and the lower body region 306 based on the position and size of the head or face region 302. FIG. 4 shows an example of creating a person region. In FIG. 4, a region twice the height and twice the width of the head or face region 302 is created below the head or face region 302 as the upper body region 304, and another region twice the height and width of the head or face region 302 is created below the upper body region 304 as the lower body region 306.
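The FIG. 4 construction can be expressed as simple box arithmetic. In the sketch below, boxes are (x, y, width, height) with y increasing downward; centring the body boxes horizontally on the head is an assumption, since the text only states that the regions lie below the head and are twice its size.

```python
def person_regions(head_x, head_y, head_w, head_h):
    """Derive upper- and lower-body boxes from a head or face box.

    Each body box is twice the head's width and height. The upper body
    starts directly below the head and the lower body directly below the
    upper body. Horizontal centring on the head is an assumption.
    """
    body_w, body_h = 2 * head_w, 2 * head_h
    body_x = head_x + head_w / 2 - body_w / 2
    upper = (body_x, head_y + head_h, body_w, body_h)
    lower = (body_x, head_y + head_h + body_h, body_w, body_h)
    return upper, lower
```

For a 4 × 4 head box at (10, 5), this yields an 8 × 8 upper-body box at (8, 9) and an 8 × 8 lower-body box at (8, 17).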

Next, a method by which the person position specifying means 204 detects a person region using the color distribution information is described.

The person position specifying means 204 further includes shape storage means (not shown). To allow a person region to be specified, the shape storage means stores in advance the shapes of clusters of colors that characterize a person (hereinafter, person colors) and of colors that characterize clothing (hereinafter, clothing region colors). From the received color distribution information, the person position specifying means 204 extracts cluster regions formed by colors similar to the person colors and clothing region colors stored in the shape storage means. It then compares each extracted cluster region with the cluster shapes stored in the shape storage means and can specify a cluster region whose similarity is at or above a certain threshold as a person region. For example, for the head, the person position specifying means 204 can take hair color as the person color and regard a cluster region formed in a circle by that color as the head or face region 302. For a person's face region, it can take skin color as the person color and regard a cluster region formed in a circle by that color as the head or face region 302. It can also take a specific color, such as that of a uniform, as the clothing region color and regard a cluster region formed in a rectangle by that color as a clothing region. When the head or face region 302 is extracted, the person position specifying means 204 can specify the upper body region 304 from the position and size of the extracted head or face region 302, as in the case of the luminance gradient information. When a clothing region is extracted, the person position specifying means 204 may specify the extracted clothing region as the upper body region 304.
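One way to picture the colour-cluster step is a flood fill over the grid of average colours, followed by a crude shape score. Everything in this sketch is an illustrative assumption: the per-channel tolerance test, 4-connectivity, and the bounding-box fill ratio as a stand-in for the shape comparison, which the text does not define in detail.

```python
def color_clusters(grid, target, tolerance):
    """Group 4-connected grid cells whose colour is close to `target`.

    grid: 2-D list of (Y, Cb, Cr) tuples. A cell is "similar" when every
    channel differs from `target` by at most `tolerance` (an assumption).
    Returns a list of clusters, each a list of (row, col) cells.
    """
    rows, cols = len(grid), len(grid[0])

    def similar(cell):
        return all(abs(a - b) <= tolerance for a, b in zip(cell, target))

    seen = set()
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or not similar(grid[r][c]):
                continue
            stack, cluster = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                cluster.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and similar(grid[ny][nx])):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            clusters.append(cluster)
    return clusters

def fill_ratio(cluster):
    """Cluster area divided by its bounding-box area (1.0 for a filled rectangle)."""
    ys = [y for y, _ in cluster]
    xs = [x for _, x in cluster]
    box = (max(ys) - min(ys) + 1) * (max(xs) - min(xs) + 1)
    return len(cluster) / box
```

Under this stand-in score, a filled rectangle (such as a uniform-coloured torso) gives a ratio of 1.0, whereas a roughly circular cluster gives a lower ratio, so a threshold on the ratio can separate the two shapes.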

The second feature amount extraction means 206 calculates a second feature amount representing, for example, the pattern or color distribution within the upper body region 304 specified by the person position specifying means 204 (S110). This embodiment describes an example in which the second feature amount extraction means 206 extracts, as second feature amounts, a color layout feature amount representing the color distribution within the upper body region 304 and an edge histogram feature amount representing the pattern within the upper body region 304.

First, the flow by which the second feature amount extraction means 206 extracts the edge histogram feature amount from the luminance gradient information is described.

The luminance gradient information has already been quantized into an arbitrary number K of directions by the first feature amount extraction unit 104. As shown in FIG. 5, the second feature amount extraction unit 206 therefore divides the extracted upper body region 304 into an arbitrary M × N grid and calculates, for each grid cell, the appearance frequency of the luminance gradient directions of the pixels in that cell. The appearance frequency is calculated by classifying the luminance gradient directions in the cell by angle and counting the number of occurrences for each angle. Pixels whose luminance intensity is at or below a given threshold are classified as having no edge. FIG. 6 shows an example in which the luminance gradient directions are classified into four directions A, B, C, and D plus E (no edge), and the appearance frequencies are calculated. FIG. 7 shows examples of methods for classifying the luminance gradient direction and the algorithm for calculating the appearance frequency under each method. Both the left and right diagrams in FIG. 7 show classification into eight directions: the left diagram divides the full 360 degrees into 45-degree units, and the right diagram divides 180 degrees into 22.5-degree units. The second feature amount extraction unit 206 then divides the calculated appearance frequencies by the total number of pixels in the upper body region 304, and the normalized result is used as the edge histogram feature amount.
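The histogram computation described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function names `quantize_direction` and `edge_histogram`, the `NO_EDGE` label, the default threshold, and the choice of M as the row count and N as the column count are all assumptions, as is applying the threshold to the gradient magnitude. In the described system the quantization step would run on the imaging terminal (unit 104) and the histogram step on the server (unit 206).

```python
import math

NO_EDGE = -1  # hypothetical label for pixels classified as "no edge"


def quantize_direction(gx, gy, k=8, signed=True, threshold=1.0):
    """Map a gradient vector (gx, gy) to one of k direction bins, or NO_EDGE.

    signed=True  splits 360 degrees into k bins (45 degrees each for k=8).
    signed=False splits 180 degrees into k bins (22.5 degrees each for k=8).
    """
    if math.hypot(gx, gy) <= threshold:
        return NO_EDGE
    span = 360.0 if signed else 180.0
    angle = math.degrees(math.atan2(gy, gx)) % span
    # The final "% k" guards against a float result equal to span.
    return int(angle // (span / k)) % k


def edge_histogram(labels, m, n, k=8):
    """Build an m x n grid of direction histograms from quantized labels.

    labels is a 2-D list of bin labels for the upper body region. Each cell
    holds k + 1 values (k directions followed by "no edge"), normalized by
    the total number of pixels in the whole region.
    """
    height, width = len(labels), len(labels[0])
    total = height * width
    counts = [[[0] * (k + 1) for _ in range(n)] for _ in range(m)]
    for y in range(height):
        row = y * m // height
        for x in range(width):
            col = x * n // width
            label = labels[y][x]
            counts[row][col][k if label == NO_EDGE else label] += 1
    return [[[c / total for c in cell] for cell in grid_row] for grid_row in counts]
```

Because every pixel contributes to exactly one bin and the counts are divided by the region's total pixel count, the values across all cells sum to 1, which makes histograms from regions of different sizes comparable.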

Next, the flow by which the second feature amount extraction unit 206 extracts the color layout feature amount from the color distribution information is described.

In the process of extracting the color layout feature amount, the second feature amount extraction unit 206 divides the extracted upper body region 304 into an 8 × 8 grid, for example as shown in FIG. 8, and calculates the average color in each grid cell. For the extraction of the color layout feature amount, the method described in pp. 208 to 212 of Non-Patent Document 2 can be used, for example.
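As a minimal sketch of the grid-averaging step only, the following Python function divides a region into an 8 × 8 grid and averages the color in each cell. The function name `grid_average_colors` and the representation of the region as a 2-D list of RGB tuples are assumptions for illustration; any further processing defined by the method in Non-Patent Document 2 is not covered here.

```python
def grid_average_colors(region, rows=8, cols=8):
    """Return a rows x cols grid of average (r, g, b) colors.

    region is a 2-D list of (r, g, b) tuples and must be at least
    rows pixels high and cols pixels wide so that no cell is empty.
    """
    height, width = len(region), len(region[0])
    if height < rows or width < cols:
        raise ValueError("region is smaller than the requested grid")
    sums = [[[0, 0, 0] for _ in range(cols)] for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for y in range(height):
        gy = y * rows // height
        for x in range(width):
            gx = x * cols // width
            for c in range(3):
                sums[gy][gx][c] += region[y][x][c]
            counts[gy][gx] += 1
    return [
        [tuple(s / counts[gy][gx] for s in sums[gy][gx]) for gx in range(cols)]
        for gy in range(rows)
    ]
```

For example, a 16 × 16 region whose left half is pure red and whose right half is pure blue yields red averages in grid columns 0 to 3 and blue averages in columns 4 to 7.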

The above process of extracting the second feature amount is repeated until it has been completed for every person specified by the person position specifying unit 204 (S112).

As described above, according to this embodiment, the imaging terminal 10 transmits only information from which an individual cannot be identified, and it performs no computationally expensive operations. Therefore, even in a system in which the imaging terminal 10 has low processing capability and a server 20 capable of advanced processing is required, person identification can be performed while protecting privacy.

(Second Embodiment)
This embodiment is the same as the first embodiment except for the following points.

FIG. 9 is a diagram showing an example of the grid size 108 used when extracting the first feature amount in the second embodiment of the present invention. In this embodiment, the grid size 108 is set so that the grid cells become larger from the top of the image toward the bottom.
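One way to realize a grid that grows from the top of the image toward the bottom is sketched below in Python. The geometric growth rule, the function name `growing_row_bands`, and the parameters `top_cell` and `growth` are assumptions chosen for illustration; the embodiment only requires that the cells become larger toward the bottom.

```python
def growing_row_bands(image_height, top_cell=8, growth=1.5):
    """Return (y_start, cell_size) bands covering the image height.

    The cell size starts at top_cell pixels at the top of the image and
    is multiplied by growth for each band further down. The last band is
    clipped to the image, so it may be smaller than the nominal size.
    """
    if top_cell < 1 or growth < 1.0:
        raise ValueError("top_cell must be >= 1 and growth must be >= 1.0")
    bands = []
    y = 0
    size = float(top_cell)
    while y < image_height:
        cell = min(int(round(size)), image_height - y)
        bands.append((y, cell))
        y += cell
        size *= growth
    return bands
```

Each band's `cell_size` would be used as both the height and the width of the cells in that band, so persons nearer the camera, who tend to appear lower and larger in a downward-looking view, are averaged over larger cells.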

With this configuration, the grid size 108 is set larger for persons on the near side, who appear larger in the image. Consequently, compared with calculating average colors using a uniform grid size 108, the image of a person who appears large becomes coarser. Even if someone views the information that the feature amount transmission unit 106 sends to the server, it is therefore more difficult to identify a person who appears large. Furthermore, enlarging the grid size 108 increases the regions in which the same color information continues. Compared with calculating average colors using a uniform grid size 108, this raises the image compression efficiency when the feature amount transmission unit 106 uses run-length compression or the like.
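The compression effect can be illustrated with a small Python sketch that averages one row of grayscale pixels over cells of a given width and then run-length encodes the result. The helper names `mosaic_row` and `run_length_encode`, the integer averaging, and the use of a single grayscale row are simplifying assumptions; the source does not specify the encoding used by the feature amount transmission unit 106.

```python
def mosaic_row(pixels, cell):
    """Replace each run of `cell` pixels with their integer average."""
    out = []
    for start in range(0, len(pixels), cell):
        block = pixels[start:start + cell]
        out.extend([sum(block) // len(block)] * len(block))
    return out


def run_length_encode(values):
    """Encode a sequence as a list of (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


row = list(range(16))                                   # 16 distinct gray levels
small_cells = run_length_encode(mosaic_row(row, 2))     # 8 runs
large_cells = run_length_encode(mosaic_row(row, 8))     # 2 runs
```

With cells of width 2 the row encodes to eight runs, while cells of width 8 reduce it to two runs, which is the effect the larger grid size has on run-length compression.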

As described above, this embodiment also provides the same effects as the first embodiment. In addition, because the grid size 108 increases from the top of the image toward the bottom, even a person who appears large in the image is harder to identify as an individual than when the image is divided with a uniform grid size 108, so privacy is protected more robustly. Furthermore, the image compression efficiency improves when the feature amount transmission unit 106 uses run-length compression or the like, which reduces communication cost.

(Third Embodiment)
This embodiment is the same as the second embodiment except for the following points.

FIG. 10 is a diagram showing an example of the grid size 108 used when extracting the first feature amount in the third embodiment of the present invention. In this embodiment, the imaging terminal 10 holds three-dimensional position information for the image, calculated from camera calibration information. Based on the held three-dimensional position information, the first feature amount extraction unit 104 estimates the size at which a person appears at each position in the image (hereinafter, the assumed person size). The first feature amount extraction unit 104 then determines the grid size 108 based on the estimated assumed person size.

Here, the assumed person size can be estimated based on a reference (hereinafter, the reference person size) such as an average person model based on average height, or average models based on information collected for each age and gender. A plurality of these models may also be adopted as the reference person size. By scaling the reference person size against the three-dimensional position information for the image, the first feature amount extraction unit 104 can estimate how large a person appears at any position in the image. If a grid size 108 is associated with the reference person size in advance, the first feature amount extraction unit 104 can dynamically scale the grid size 108 based on this estimate.
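The scaling described above can be sketched in Python under a simple pinhole-camera assumption, in which the apparent height in pixels is the focal length in pixels times the physical height divided by the distance. The pinhole model, the function names, the 1.7 m reference height, and the `cells_per_person` ratio that ties the grid size to the person size are all hypothetical values chosen for illustration; the source states only that the reference person size is scaled according to the three-dimensional position information.

```python
def assumed_person_height_px(distance_m, focal_length_px, reference_height_m=1.7):
    """Estimate how tall a reference person appears, in pixels.

    distance_m would come from the held 3-D position information for the
    image position in question (assumption: distance along the optical axis).
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return focal_length_px * reference_height_m / distance_m


def grid_size_for(assumed_height_px, cells_per_person=16, min_cell=2):
    """Scale the grid cell size in proportion to the assumed person size."""
    return max(min_cell, round(assumed_height_px / cells_per_person))


near = grid_size_for(assumed_person_height_px(10.0, 1000.0))  # person 10 m away
far = grid_size_for(assumed_person_height_px(40.0, 1000.0))   # person 40 m away
```

Under these assumed values a person 10 m from the camera gets a larger cell size than a person 40 m away, so the coarseness of the averaged image tracks the size at which each person appears.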

As described above, this embodiment also provides the same effects as the first and second embodiments. In this embodiment, because the imaging terminal 10 additionally has the three-dimensional position information and the assumed person size, the grid size 108 can be changed dynamically according to the size at which a person appears. An optimal grid size 108 can therefore be selected automatically.

According to the embodiments described above, the following inventions are disclosed.
(Appendix 1)
A person feature amount extraction system comprising at least one imaging terminal and at least one server, wherein
the imaging terminal includes:
first feature amount extraction means for extracting a first feature amount from an image showing a person; and
feature amount transmission means for transmitting the extracted first feature amount, and
the server includes:
feature amount reception means for receiving the transmitted first feature amount;
person position specifying means for specifying, based on the first feature amount, the position of a person present in the image; and
second feature amount extraction means for extracting a second feature amount for each person based on the first feature amount and the position of the person.
(Appendix 2)
The person feature amount extraction system according to appendix 1, wherein
the imaging terminal does not transmit the image.
(Appendix 3)
The person feature amount extraction system according to appendix 1 or 2, wherein
the first feature amount includes at least one of color distribution information and luminance gradient information.
(Appendix 4)
The person feature amount extraction system according to any one of appendices 1 to 3, wherein
the first feature amount extraction means divides the image using a variable grid size whose vertical and horizontal dimensions increase from the top of the image toward the bottom, and extracts the first feature amount using a feature amount of each divided image.
(Appendix 5)
The person feature amount extraction system according to appendix 4, wherein
the imaging terminal
holds three-dimensional position information indicating three-dimensional positions on the image, and
the first feature amount extraction means
calculates, based on the three-dimensional position information, an assumed person size, which is the size at which a person is assumed to appear in the image, and
determines the grid size according to the assumed person size.
(Appendix 6)
The person feature amount extraction system according to appendix 5, wherein
the assumed person size is calculated based on a reference person size that includes an average person model based on average height, or average person models for each gender and age.
(Appendix 7)
The person feature amount extraction system according to any one of appendices 4 to 6, wherein
the first feature amount includes color distribution information, and
the feature amount of each divided image based on the color distribution information is an average color calculated from the colors present in the divided image.
(Appendix 8)
An imaging terminal comprising:
first feature amount extraction means for extracting a first feature amount from an image showing a person; and
feature amount transmission means for transmitting the extracted first feature amount without transmitting the image.
(Appendix 9)
A server comprising:
feature amount reception means for receiving a first feature amount transmitted from an imaging terminal;
person position specifying means for specifying, based on the first feature amount, the position of a person present in an image; and
second feature amount extraction means for extracting a second feature amount for each person based on the first feature amount and the position of the person.
(Appendix 10)
A person feature amount extraction method using at least one imaging terminal and at least one server, the method comprising:
executing, in the imaging terminal,
a first feature amount extraction process of extracting a first feature amount from an image showing a person, and
a feature amount transmission process of transmitting the extracted first feature amount; and
executing, in the server,
a feature amount reception process of receiving the transmitted first feature amount,
a person position specifying process of specifying, based on the first feature amount, the position of a person present in the image, and
a second feature amount extraction process of extracting a second feature amount for each person based on the first feature amount and the position of the person.
(Appendix 11)
The imaging terminal according to appendix 8, wherein
the first feature amount includes at least one of color distribution information and luminance gradient information.
(Appendix 12)
The imaging terminal according to appendix 8 or 11, wherein
the first feature amount extraction means divides the image using a variable grid size whose vertical and horizontal dimensions increase from the top of the image toward the bottom, and extracts the first feature amount using a feature amount of each divided image.
(Appendix 13)
The imaging terminal according to appendix 12, wherein
the imaging terminal
holds three-dimensional position information indicating three-dimensional positions on the image, and
the first feature amount extraction means
calculates, based on the three-dimensional position information, an assumed person size, which is the size at which a person is assumed to appear in the image, and
determines the grid size according to the assumed person size.
(Appendix 14)
The imaging terminal according to appendix 13, wherein
the assumed person size is calculated based on a reference person size that includes an average person model based on average height, or average person models for each gender and age.
(Appendix 15)
The imaging terminal according to any one of appendices 12 to 14, wherein
the first feature amount includes color distribution information, and
the feature amount of each divided image based on the color distribution information is an average color calculated from the colors present in the divided image.
(Appendix 16)
The person feature amount extraction method according to appendix 10, wherein
the imaging terminal does not transmit the image.
(Appendix 17)
The person feature amount extraction method according to appendix 10 or 16, wherein
the first feature amount includes at least one of color distribution information and luminance gradient information.
(Appendix 18)
The person feature amount extraction method according to any one of appendices 10, 16, and 17, wherein
the first feature amount extraction process divides the image using a variable grid size whose vertical and horizontal dimensions increase from the top of the image toward the bottom, and extracts the first feature amount using a feature amount of each divided image.
(Appendix 19)
The person feature amount extraction method according to appendix 18, wherein
the imaging terminal
holds three-dimensional position information indicating three-dimensional positions on the image, and
the first feature amount extraction process
calculates, based on the three-dimensional position information, an assumed person size, which is the size at which a person is assumed to appear in the image, and
determines the grid size according to the assumed person size.
(Appendix 20)
The person feature amount extraction method according to appendix 19, wherein
the assumed person size is calculated based on a reference person size that includes an average person model based on average height, or average person models for each gender and age.
(Appendix 21)
The person feature amount extraction method according to any one of appendices 18 to 20, wherein
the first feature amount includes color distribution information, and
the feature amount of each divided image based on the color distribution information is an average color calculated from the colors present in the divided image.

Embodiments of the present invention have been described above with reference to the drawings, but these are examples of the present invention, and various configurations other than those described above may also be adopted. For convenience of explanation, each embodiment has been described with one imaging terminal 10 and one server 20, but a configuration with a plurality of imaging terminals 10 or servers 20 is also possible.

This application claims priority based on Japanese Patent Application No. 2012-000388, filed on January 5, 2012, the entire disclosure of which is incorporated herein.

Claims

A person feature amount extraction system comprising at least one imaging terminal and at least one server, wherein
the imaging terminal includes:
first feature amount extraction means for extracting a first feature amount from an image showing a person; and
feature amount transmission means for transmitting the extracted first feature amount, and
the server includes:
feature amount reception means for receiving the transmitted first feature amount;
person position specifying means for specifying, based on the first feature amount, the position of a person present in the image; and
second feature amount extraction means for extracting a second feature amount for each person based on the first feature amount and the position of the person.

The person feature amount extraction system according to claim 1, wherein
the imaging terminal does not transmit the image.

The person feature amount extraction system according to claim 1 or 2, wherein
the first feature amount includes at least one of color distribution information and luminance gradient information.

The person feature amount extraction system according to any one of claims 1 to 3, wherein
the first feature amount extraction means divides the image using a variable grid size whose vertical and horizontal dimensions increase from the top of the image toward the bottom, and extracts the first feature amount using a feature amount of each divided image.

The person feature amount extraction system according to claim 4, wherein
the imaging terminal
holds three-dimensional position information indicating three-dimensional positions on the image, and
the first feature amount extraction means
calculates, based on the three-dimensional position information, an assumed person size, which is the size at which a person is assumed to appear in the image, and
determines the grid size according to the assumed person size.

The person feature amount extraction system according to claim 5, wherein
the assumed person size is calculated based on a reference person size that includes an average person model based on average height, or average person models for each gender and age.

The person feature amount extraction system according to any one of claims 4 to 6, wherein
the first feature amount includes color distribution information, and
the feature amount of each divided image based on the color distribution information is an average color calculated from the colors present in the divided image.

A server comprising:
feature amount reception means for receiving a first feature amount transmitted from an imaging terminal;
person position specifying means for specifying, based on the first feature amount, the position of a person present in an image; and
second feature amount extraction means for extracting a second feature amount for each person based on the first feature amount and the position of the person.

A person feature amount extraction method using at least one imaging terminal and at least one server, the method comprising:
executing, in the imaging terminal,
a first feature amount extraction process of extracting a first feature amount from an image showing a person, and
a feature amount transmission process of transmitting the extracted first feature amount; and
executing, in the server,
a feature amount reception process of receiving the transmitted first feature amount,
a person position specifying process of specifying, based on the first feature amount, the position of a person present in the image, and
a second feature amount extraction process of extracting a second feature amount for each person based on the first feature amount and the position of the person.