JP4967045B2 - Background discriminating apparatus, method and program

Info

Publication number: JP4967045B2
Application number: JP2010135643A
Authority: JP
Inventors: 雅二郎岩崎
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2010-06-15
Filing date: 2010-06-15
Publication date: 2012-07-04
Anticipated expiration: 2030-06-15
Also published as: JP2012003358A

Description

The present invention relates to image processing using a computer.

In recent years, so-called similar image search services have been proposed, for example on Internet websites, in which a computer searches for images similar to an image designated by a user and displays the results. Similar image search uses information representing features such as the colors and shapes contained in an image (referred to as "feature amounts"). However, when the search focuses on a target such as a physical object appearing in the image (also referred to as an "object"), the feature amounts of the background region degrade the search accuracy.

For this reason, techniques have been proposed that suppress the influence of the background, and thereby improve search accuracy, by discriminating and removing the background region from the image. In this type of technique, the feature amounts of the individual parts of an image are clustered so that ranges with similar feature amounts are each extracted as independent regions. As examples of methods for determining from its feature amounts whether a part or region is background, Patent Document 1 determines parts having a specific background color, such as the blue of the sky or the green of a lawn, to be background (paragraph 0067), and determines, for each region into which the image has been divided, whether the region is background based on feature amounts such as line and shading patterns (paragraph 0068).

Patent Document 1: JP-A-9-138470

However, when clustering is used as described above, multiple independent regions are extracted depending on the types of objects and background in the image and on how they are captured, and in many cases it is difficult to recognize the main object among them.

For example, Patent Document 1 determines whether a part or region is background from feature amounts such as its color and shape, but the color and shape of an object are not necessarily far removed from those of the background. Conversely, other objects and the background surrounding an object do not necessarily appear in a specific color or texture. Furthermore, something may be a physical object other than sky or vegetation and yet, in relation to the main object the image is intended to show, amount to no more than background.

As described above, for an image from which clustering extracts multiple regions such as objects, it has been difficult to appropriately distinguish the main object from the background and to perform processing such as extraction.

In view of the above problem, an object of the present invention is to accurately discriminate between the main object and the background.

In light of the above object, one aspect (1) of the present invention is a background discriminating apparatus that discriminates the background in an image, comprising: feature amount extraction means for extracting a feature amount for each block corresponding to all pixels of an original image to be discriminated; dividing means for dividing the image into a plurality of regions by clustering the blocks based on the extracted feature amounts of the blocks; setting means for setting, for each block of the image, a probability of being background; discriminating means for calculating, for each region, the average value of the probabilities set for the blocks at positions corresponding to the divided region, and determining whether each region is background based on this average value; and pixel determination means for calculating, for each of the divided regions, a representative feature amount of the region based on the feature amounts of the blocks included in that region, determining which region's representative feature amount the extracted feature amount of each block corresponding to all pixels of the original image is most similar to, and determining whether the pixel is background based on whether the most similar region has been determined to be background.

Another aspect (4) of the present invention casts the above aspect as a method: a background discriminating method in which a computer discriminates the background in an image, comprising: a feature amount extraction step in which the computer extracts a feature amount for each block corresponding to all pixels of an original image to be discriminated; a dividing step in which the computer divides the image into a plurality of regions by clustering the blocks based on the extracted feature amounts of the blocks; a setting step in which the computer sets, for each block of the image, a probability of being background; a discriminating step in which the computer calculates, for each region, the average value of the probabilities set for the blocks at positions corresponding to the divided region, and determines whether each region is background based on this average value; and a pixel determination step in which the computer calculates, for each of the divided regions, a representative feature amount of the region based on the feature amounts of the blocks included in that region, determines which region's representative feature amount the extracted feature amount of each block corresponding to all pixels of the original image is most similar to, and determines whether the pixel is background based on whether the most similar region has been determined to be background.

Another aspect (5) of the present invention casts the above aspect as a computer program: a background discriminating program that causes a computer to discriminate the background in an image, the program causing the computer to: extract a feature amount for each block corresponding to all pixels of an original image to be discriminated; divide the image into a plurality of regions by clustering the blocks based on the extracted feature amounts of the blocks; set, for each block of the image, a probability of being background; calculate, for each region, the average value of the probabilities set for the blocks at positions corresponding to the divided region, and determine whether each region is background based on this average value; and calculate, for each of the divided regions, a representative feature amount of the region based on the feature amounts of the blocks included in that region, determine which region's representative feature amount the extracted feature amount of each block corresponding to all pixels of the original image is most similar to, and determine whether the pixel is background based on whether the most similar region has been determined to be background.

In this way, a probability of being background is set for each part of the image, for example such that the likelihood of background is higher farther from the image center, and, for the regions obtained by dividing the image through clustering on the commonality of feature amounts, the average value of the probabilities distributed within each region is calculated. Because this average value represents the degree to which each region in the image is background, determining background on the basis of this average makes it possible to accurately discriminate the main object from the background regardless of the colors and shapes of the objects and background. Feature amounts are extracted from all pixels, but the processing load is reduced by thinning out the feature amounts subjected to clustering or by handling them in block units of multiple pixels. Meanwhile, a representative feature amount is calculated for each region resulting from clustering, and whether each pixel is background is also determined from the similarity between this representative feature amount and the feature amount of each pixel. This reconciles efficient processing through a reduced clustering load with discrimination accuracy that extends to pixels near region boundaries.
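The pixel-level determination described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the representative feature amount of a region is the mean of its blocks' feature vectors and that "most similar" means smallest Euclidean distance, neither of which the text specifies. The function names are hypothetical.

```python
import math


def representative_features(features, labels):
    """Mean feature vector per region label (assumed choice of 'representative')."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def classify_pixels(pixel_features, reps, region_is_background):
    """Mark a pixel as background if its nearest region was judged background."""
    result = []
    for vec in pixel_features:
        # Euclidean distance is an assumption; any similarity measure could be used.
        nearest = min(reps, key=lambda lab: math.dist(vec, reps[lab]))
        result.append(region_is_background[nearest])
    return result
```

Because only the (thinned or block-level) feature amounts are clustered while every pixel is compared against a handful of representatives, the expensive step stays small and boundary pixels are still classified individually.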

Another aspect (2) of the present invention is characterized in that, in any of the above aspects, the apparatus has learning data storage means storing images for each genre as learning data, and the setting means generates in advance, for each genre, a background probability distribution, which is data associating each block of an image with a probability of being background, using the learning data stored in the learning data storage means, stores it in predetermined probability distribution storage means, and sets the probability of being background for each block of the original image based on the background probability distribution stored in the probability distribution storage means.

In this way, discrimination accuracy can be further improved by generating a background probability distribution for each image genre in advance from the learning data and using it to discriminate the background.

Another aspect (3) of the present invention is characterized in that, in any of the above aspects, the setting means expresses the distribution of the probability of being background for each genre using coefficients of a two-dimensional normal distribution.

In this way, by expressing the background probability distribution with a two-dimensional normal distribution (Gaussian distribution) and setting coefficients such as the mean and deviation for each genre, the background probability distribution of each genre can be stored easily and with a small amount of data.

Categories other than those of the above aspects (a method for an apparatus, a program for a method, and so on), as well as the more specific aspects described below, are also included in the present invention. For a different category, terms are to be read accordingly, for example "means" as "step".

According to the present invention, the main object and the background can be discriminated accurately.

FIG. 1 is a functional block diagram showing the configuration of an embodiment of the present invention.
FIG. 2 is a diagram illustrating information (data) used in the embodiment.
FIG. 3 is a flowchart showing the processing procedure in the embodiment.
FIG. 4 is a conceptual diagram showing a background probability distribution in the embodiment.
FIG. 5 is a conceptual diagram showing an example of feature amount extraction in the embodiment.
FIG. 6 is a conceptual diagram showing another example of feature amount extraction in the embodiment.
FIG. 7 is a diagram showing an example of an original image in the embodiment.
FIG. 8 is a conceptual diagram showing regions in the embodiment.
FIG. 9 is a conceptual diagram showing the relationship between the regions and the background probability distribution in the embodiment.
FIG. 10 is a conceptual diagram showing the average background probability of each region in the embodiment by its whiteness.
FIG. 11 is a diagram showing an image from which the background has been removed in the embodiment.

Next, modes for carrying out the present invention (referred to as "embodiments") are described with reference to the drawings. Matters in common with what has already been stated in the background art, the problem, and so on are omitted as appropriate.

[1. Configuration]
As shown in FIG. 1 (configuration diagram), the present embodiment relates to a background discriminating apparatus 1 (hereinafter "the apparatus 1" or "the apparatus") that discriminates the background in an image. The apparatus 1 doubles as a server that provides a similar image search service to terminals T via a communication network N, but it may instead be an apparatus that only discriminates backgrounds without that function.

The apparatus 1 has, as a general computer configuration, at least an arithmetic control unit 6 such as a CPU, a storage device 7 such as an external storage device (HDD or the like) and main memory, and communication means 8 (a LAN adapter or the like) for communicating with the communication network N (the Internet, a mobile phone network, a LAN, or the like). The terminals T are information processing devices used by users, such as personal computers (PCs), smartphones, and mobile phone terminals, and may be of any number.

In the apparatus 1, a predetermined computer program (not shown) stored (installed) in advance in the storage device 7 controls the arithmetic control unit 6, thereby realizing the elements shown in FIG. 1, such as the respective means (11, 12, 13, and so on). Among these elements, the information storage means can be realized in the storage device 7 in any form, such as various databases (also written "DB"), files, variables such as arrays, various stacks and registers, and system setting values.

Among these storage means, the original image storage means 11 stores original images, that is, the images whose background is to be discriminated (for example, the many images for which information such as feature amounts is accumulated as candidates for similar image search). The learning data storage means 12 stores images for each genre as learning data used to set the probability of being background for each block in an image. The means other than the storage means are processing means that realize and execute the information processing functions and operations described below.

[2. Operation]
The flowchart of FIG. 3 shows the processing procedure by which the apparatus 1 described above discriminates the background.
[2-1. Generation of the Background Probability Distribution]
The setting means 13 sets, for each block of an image, the probability of being background (referred to as the "background probability"). In the present embodiment, setting the background probability consists of two processes: one that, prior to background discrimination (FIG. 3), defines in advance the distribution of the background probability over the whole image (referred to as the "background probability distribution") for each image genre (for example, people, products, or landscapes), and one that, at discrimination time, applies the background probability distribution to set the background probability for each block of an individual original image. The conceptual diagram of FIG. 4 shows an example of a background probability distribution. In this example, as indicated by the dashed concentric circles, darker parts have a higher probability of being the main object and a lower background probability, whereas the whitish parts toward the periphery have a higher background probability.

Such a background probability distribution may be obtained for each coordinate in the image by some suitable formula or by manual setting, on the premise that the edges of an image are more likely to be background. In the present embodiment, however, a large number of sample images, in which the background has been identified by hand and only the background portion extracted, are prepared in advance for each image genre as learning data in the learning data storage means 12, and the background probability of each pixel is calculated by statistically processing them.

That is, the setting means 13 uses the learning data stored in the learning data storage means 12 to generate in advance, for each genre, a background probability distribution, which is data associating each block of an image with a probability of being background, and stores it in the predetermined probability distribution storage means 14. As the statistical processing for obtaining the background probability distribution, it suffices to tally, for each pixel, block, or coordinate of the image, the proportion of that genre's learning data in which that part is background.
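The tallying described above can be sketched as follows. This is a minimal illustration under stated assumptions: the learning data are represented as hand-labeled binary masks of identical size (1 for background, 0 for object), which is one possible encoding of "sample images with only the background extracted"; the function names and the genre keys are hypothetical.

```python
def background_distribution(masks):
    """Per-cell fraction of learning masks in which the cell is background.

    masks: list of equally sized 2D lists, 1 = background, 0 = object.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    rows, cols = len(masks[0]), len(masks[0][0])
    totals = [[0] * cols for _ in range(rows)]
    for mask in masks:
        for r in range(rows):
            for c in range(cols):
                totals[r][c] += mask[r][c]
    n = len(masks)
    return [[v / n for v in row] for row in totals]


def distributions_by_genre(masks_by_genre):
    """Build one background probability distribution per genre."""
    return {genre: background_distribution(m) for genre, m in masks_by_genre.items()}
```

For instance, two 2x2 masks that disagree only in the lower-left cell give a probability of 0.5 there and 0 or 1 elsewhere.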

[2-2. Example of a Background Probability Distribution]
The background probability distribution generated and stored by the setting means 13 as described above may be expressed in any form; as one example, the background probability distribution of each genre can be expressed using coefficients of a two-dimensional normal distribution. First, the background probability is basically lower toward the center of the image and higher toward the edges, which is the opposite of the probability of being the main object (tentatively called the "target probability"); the background probability and the target probability sum to 1, that is, they are complements.

Described in terms of the complementary target probability: the target probability is higher toward the image center and, viewed in three-dimensional space, follows a two-dimensional normal distribution that forms a hill rising in the z-axis (target probability) direction above the x-y plane of the image. Taking the horizontal direction of the image as the x-axis and the vertical direction as the y-axis, the vicinity of the image center corresponds to the mean, which is the peak of the hill of the two-dimensional normal distribution.

For example, letting the variances of x and y be σ_x^2 and σ_y^2 and the correlation coefficient be σ_xy, and writing the exponent of the density function of the two-dimensional normal distribution as "−c^2/2", the density function describes similar ellipses, and every point (x, y) on an ellipse can be expressed by the density function of the two-dimensional normal distribution as

[equation not reproduced in the source text]

However, when used for the background probability, it can be assumed that x and y are uncorrelated, so setting the correlation coefficient to 0 gives the background probability distribution as

[equation not reproduced in the source text]

The variances σ_x^2 and σ_y^2, and the means μ_x and μ_y contained in the exponent, may be set to suitable values based on discrimination results, or the means may be set to the center coordinates of the image.
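For reference, the textbook density of a two-dimensional normal distribution, written with the symbols defined above, is given below. This is the standard form and not a reproduction of the patent's own equations, which are not available in this text; ρ stands in for the correlation coefficient that the description writes as σ_xy, and how the density is turned into a background probability (for example, by taking a complement) is left open here.

```latex
% General case: rho is the correlation coefficient (written sigma_xy in the text)
f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}
          \exp\!\left(-\frac{c^2}{2}\right),
\qquad
c^2 = \frac{1}{1-\rho^2}\left[
      \frac{(x-\mu_x)^2}{\sigma_x^2}
      - \frac{2\rho\,(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y}
      + \frac{(y-\mu_y)^2}{\sigma_y^2}\right]

% Uncorrelated case: rho = 0
f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}
          \exp\!\left(-\frac{1}{2}\left[
          \frac{(x-\mu_x)^2}{\sigma_x^2}
          + \frac{(y-\mu_y)^2}{\sigma_y^2}\right]\right)
```

For a fixed value of c the first expression is constant, so the points (x, y) sharing that value lie on one ellipse, and different values of c give similar ellipses around (μ_x, μ_y).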

When the background probability distribution is expressed with such a two-dimensional normal distribution through its complement relationship with the target probability, the coefficients for each genre can, for example, place the center of the probability (the hill) at an arbitrary position in the image via the means in the x-axis and y-axis directions, and set the degree of concentration of the probability via the variance or standard deviation. In this way, a pattern of background probability distribution suited to each genre can be expressed and stored (for example, FIG. 2).
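Storing only a few coefficients per genre and expanding them into a probability map on demand can be sketched as follows. The sketch assumes an uncorrelated Gaussian scaled so that its peak equals 1 (so the background probability is 0 at the center and approaches 1 far from it), and coordinates normalized to the range 0 to 1 so that the same coefficients fit any image size. The scaling, the normalized coordinates, and the coefficient values in `GENRE_COEFFS` are all hypothetical choices, not values from the patent.

```python
import math

# Hypothetical per-genre coefficients in normalized image coordinates (0..1).
GENRE_COEFFS = {
    "person": {"mu_x": 0.5, "mu_y": 0.4, "sigma_x": 0.2, "sigma_y": 0.3},
    "product": {"mu_x": 0.5, "mu_y": 0.5, "sigma_x": 0.25, "sigma_y": 0.25},
}


def background_probability(x, y, mu_x, mu_y, sigma_x, sigma_y):
    """Complement of an uncorrelated 2-D Gaussian whose peak is scaled to 1."""
    exponent = -0.5 * (((x - mu_x) / sigma_x) ** 2 + ((y - mu_y) / sigma_y) ** 2)
    return 1.0 - math.exp(exponent)


def background_map(genre, cols, rows):
    """Background probability at the center of each cell of a cols x rows grid."""
    coeffs = GENRE_COEFFS[genre]
    grid = []
    for r in range(rows):
        y = (r + 0.5) / rows
        grid.append(
            [background_probability((c + 0.5) / cols, y, **coeffs) for c in range(cols)]
        )
    return grid
```

Four numbers per genre replace a full per-block table, which is the storage saving the description points to.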

[2-3. Feature Amount Extraction]
On the premise that the background probability distribution has been generated and stored as above, the processing procedure for discriminating and extracting the background (FIG. 3) is now described. First, the feature amount extraction means 15 reads from the original image storage means 11 the original image to be subjected to background discrimination (also called the "target image" or simply the "image"), divides it into blocks, and extracts a feature amount for each block by calculation or the like (step S11). The feature amounts extracted here may be common ones such as color histograms or texture feature amounts.

As the blocks on which feature amount extraction is based, individual pixels may be used as they are. However, as illustrated in FIG. 5, a feature amount C1 may be extracted from a block B1 centered on a pixel G1, a feature amount C2 from a block B2 centered on the adjacent pixel G2, and so on; extracting a feature amount at each position while shifting a window-like block spanning multiple pixels in this way enriches the information carried by the feature amounts and can be expected to improve discrimination accuracy.

To prevent the load of the subsequent clustering from becoming excessive, when feature amounts are extracted for all pixels as in the example of FIG. 5, the pixels subject to extraction may be thinned out; alternatively, as illustrated in FIG. 6, feature amounts C11, C12, and so on may be extracted in units of blocks B11, B12, and so on, each grouping a predetermined number of pixels. These measures reduce the number of feature amounts to be clustered, which suppresses the processing load and can be expected to improve processing speed.
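The two extraction patterns (a window shifted pixel by pixel, and non-overlapping blocks) can be sketched with a single stride parameter. This is an illustrative simplification: the image is a grayscale 2D list, the feature amount is a coarse intensity histogram, and the window is anchored at its top-left corner rather than centered on a pixel as in FIG. 5. The function names and the bin count are hypothetical.

```python
def block_histogram(image, top, left, size, bins=4, max_value=256):
    """Normalized intensity histogram of one block, clipped at the image edges."""
    hist = [0] * bins
    count = 0
    for r in range(top, min(top + size, len(image))):
        for c in range(left, min(left + size, len(image[0]))):
            hist[image[r][c] * bins // max_value] += 1
            count += 1
    return [h / count for h in hist]


def extract_features(image, size, stride):
    """Map each block origin (top, left) to its histogram.

    stride=1 shifts the window one pixel at a time (one feature per pixel);
    stride=size tiles the image with non-overlapping blocks (fewer features).
    """
    features = {}
    for top in range(0, len(image), stride):
        for left in range(0, len(image[0]), stride):
            features[(top, left)] = block_histogram(image, top, left, size)
    return features
```

On a 4x4 image, `stride=1` yields 16 feature vectors while `stride=2` with 2x2 blocks yields 4, which is the reduction in clustering input that the paragraph above describes.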

[2-4. Clustering Based on Feature Amounts]
Next, the dividing means 16 clusters the blocks based on the feature amounts extracted as above and generates a region (also called a "partial region") for each group of blocks belonging to the same cluster, thereby dividing the image into a plurality of regions (step S12). This clustering is performed in the feature amount space, and each cluster of feature amounts is then laid out as a region in the original image. An example follows.

For example, suppose that extracting and clustering feature amounts from an original image such as that of FIG. 7 divides it into regions R1 to R5 as shown in FIG. 8. In this example, region R2 is what remains after excluding regions R1, R3, R4, and R5. Note that blocks belonging to the same cluster in the feature amount space may form separate regions in the image, depending on the positions and grouping of the corresponding blocks. In the description from FIG. 7 onward, the regions are denoted R1 to R5, but in an implementation the identification information for each region (ID, label, number, and so on) and its ordering may be reassigned as appropriate; for example, identifiers that follow the clustering order immediately after clustering may be renumbered in order of position in the image.
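The description does not name a clustering algorithm. As one possibility, the clustering of block feature amounts could be sketched with plain k-means; the choice of k-means, the fixed cluster count `k`, the Euclidean distance, and the evenly spaced initial centroids are all assumptions made for this illustration.

```python
import math


def kmeans(features, k, iterations=20):
    """Plain k-means over feature vectors; returns (labels, centroids).

    Centroids start at evenly spaced samples so the result is deterministic.
    """
    step = max(1, len(features) // k)
    centroids = [list(features[min(i * step, len(features) - 1)]) for i in range(k)]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assign each feature vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(f, centroids[j])) for f in features
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each label can then be written back to the position of its block to form the regions in the image; as noted above, one cluster may end up as several disconnected regions.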

[2-5. Calculating the Average Probability and Discriminating the Background]
The discriminating means 17 then calculates, for each region, the average value of the probabilities set for the blocks at positions corresponding to that region. In this process, the discriminating means 17 first selects, from the background probability distributions stored in the probability distribution storage means 14, the one corresponding to the same genre as the target image (step S13). The discriminating means 17 then scales that background probability distribution to the size of the target image (step S14) and applies it to the target image, thereby obtaining the background probability corresponding to each block contained in each partial region, and calculates their average value (step S15).

For example, if the background probability distribution shown in FIG. 4 and the partial regions shown in FIG. 8 are matched in size and overlaid as shown in FIG. 9, and the background probabilities overlapping the individual blocks of each region are averaged, the average background probability of each partial region is obtained, as illustrated in FIG. 10. In FIG. 10, whiter regions have a higher background probability.
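Steps S14 and S15 can be sketched as follows. The sketch assumes that both the region labels and the background probability distribution are 2D grids with one entry per block, and it uses nearest-neighbor sampling to scale the distribution; the interpolation method and the function names are assumptions for illustration.

```python
def resize_nearest(grid, rows, cols):
    """Scale a 2D grid to rows x cols by nearest-neighbor sampling (step S14)."""
    src_rows, src_cols = len(grid), len(grid[0])
    return [
        [grid[r * src_rows // rows][c * src_cols // cols] for c in range(cols)]
        for r in range(rows)
    ]


def region_averages(labels, background_prob):
    """Average background probability over the blocks of each region (step S15).

    labels and background_prob are 2D grids of the same shape.
    """
    totals, counts = {}, {}
    for label_row, prob_row in zip(labels, background_prob):
        for lab, p in zip(label_row, prob_row):
            totals[lab] = totals.get(lab, 0.0) + p
            counts[lab] = counts.get(lab, 0) + 1
    return {lab: totals[lab] / counts[lab] for lab in totals}
```

A region that sits mostly near the image edge accumulates high probabilities and ends up with a high average, while a region near the center ends up with a low one, independent of its color or shape.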

The discriminating means 17 then determines whether each region is background based on the average background probability calculated for each partial region as above, and extracts the background regions (step S16). Removing the background regions thus discriminated and extracted from the original image yields an image for similar image search that accurately represents the features of the object.

As the criterion for determining whether a region is background, for example, a threshold may be decided in advance and a region judged to be background when its average background probability exceeds that threshold (in the example of FIG. 10, when it is close to white). Alternatively, depending on the individual image, a gap or difference ratio between regions' average background probabilities that is at least a predetermined value may be taken as the boundary between background and non-background, or the threshold on the average may be derived for each original image so that a predetermined proportion of the image area becomes background.
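Two of these criteria can be sketched as follows. The first applies a fixed threshold; the second is a simplified variant of the gap criterion that splits at the single largest gap between sorted averages rather than testing against a predetermined gap size. The default threshold of 0.5 and the function names are hypothetical.

```python
def by_fixed_threshold(averages, threshold=0.5):
    """Region is background when its average exceeds a preset threshold."""
    return {lab: avg > threshold for lab, avg in averages.items()}


def by_largest_gap(averages):
    """Split regions at the largest gap between sorted averages.

    Regions above the gap are treated as background. This variant always
    splits; the text instead requires the gap to reach a predetermined value.
    """
    ordered = sorted(averages.items(), key=lambda item: item[1])
    if len(ordered) < 2:
        raise ValueError("at least two regions are required")
    gaps = [ordered[i + 1][1] - ordered[i][1] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps))
    return {lab: i > cut for i, (lab, _) in enumerate(ordered)}
```

With illustrative averages such as 0.9, 0.8, and 0.85 for three outer regions and 0.2 and 0.3 for two central ones, both criteria mark the three outer regions as background and keep the two central ones.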

Through the above processing, for example, when the regions R1, R2, and R3 shown in FIG. 10 (that is, the regions other than R4 and R5) are determined to be background for the original image of FIG. 7, the corresponding image portions, such as the standing trees, are removed, and an image such as FIG. 11, in which only the two buildings corresponding to regions R4 and R5 remain, is obtained as the processing result. The resulting image is stored, for example, by the discriminating means 17 in the result storage means 18, and the similar image search means 19, such as a search engine that has received a similar image search request from the terminal T, uses it as a target of search processing based on comparison between images.
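Removing the background regions amounts to masking out every pixel that belongs to a region judged to be background. The sketch below assumes a per-pixel region map and uses `None` as a placeholder for removed pixels; both choices, and the function name `remove_background`, are hypothetical and not taken from the embodiment.

```python
def remove_background(pixels, region_of_pixel, background_regions, fill=None):
    """Return a copy of the image with background pixels replaced by `fill`."""
    return [
        [fill if region in background_regions else value
         for value, region in zip(pixel_row, region_row)]
        for pixel_row, region_row in zip(pixels, region_of_pixel)
    ]

# Hypothetical 2x3 image (one value per pixel) and its per-pixel region map.
pixels = [[10, 11, 12], [13, 14, 15]]
region_of_pixel = [["R1", "R4", "R4"], ["R1", "R1", "R5"]]

# R1, R2, and R3 are treated as background, as in the example of FIG. 10.
result = remove_background(pixels, region_of_pixel, {"R1", "R2", "R3"})
```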

[3. Effects]
In the present embodiment, as described above, a probability of being background is set for each part of the image, for example such that the probability becomes higher with distance from the image center (for example, FIG. 4 and FIG. 9). For each region obtained by dividing the image through clustering based on the commonality of feature amounts (for example, FIG. 8), the average of the probabilities distributed within the region is calculated, and whether the region is background is determined on that basis (for example, FIG. 10). The main object and the background can thus be distinguished accurately regardless of the colors or shapes of the object and the background. Because the features of the object in the image are then represented accurately, the accuracy of similar image search can be improved.

In particular, in the present embodiment, a background probability distribution is generated in advance for each image genre on the basis of learning data and is used for background discrimination, which further improves discrimination accuracy.

Furthermore, in the present embodiment, the background probability distribution is expressed using a two-dimensional normal (Gaussian) distribution, and coefficients such as the mean and deviation are set for each genre (for example, FIG. 2), so that the background probability distribution for each genre can be stored easily and with a small amount of data.
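To illustrate why a handful of coefficients suffices, the sketch below rebuilds a per-block background probability map from four numbers per genre. The exact functional form is not given in this section, so the formula used here (one minus an unnormalized two-dimensional Gaussian with independent axes, so that the probability is lowest at the mean and rises toward the edges) is an assumption, as are the genre name, the coefficient values, and the function names.

```python
import math

def background_probability(x, y, mean_x, mean_y, sigma_x, sigma_y):
    """Assumed form: 1 minus an unnormalized 2-D Gaussian centred on the object."""
    dx = (x - mean_x) / sigma_x
    dy = (y - mean_y) / sigma_y
    return 1.0 - math.exp(-0.5 * (dx * dx + dy * dy))

def background_probability_map(cols, rows, params):
    """Evaluate the distribution at the centre of each block, in [0, 1] coordinates."""
    return [
        [background_probability((c + 0.5) / cols, (r + 0.5) / rows, *params)
         for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical per-genre coefficients: (mean_x, mean_y, sigma_x, sigma_y).
GENRE_PARAMS = {"landscape": (0.5, 0.5, 0.35, 0.25)}

grid = background_probability_map(5, 5, GENRE_PARAMS["landscape"])
```

Only the four coefficients need to be stored per genre; the full map can be regenerated for any block grid size when an image is processed.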

[4. Second Embodiment]
A second embodiment is described as an example in which the background discrimination method of the above embodiment (hereinafter, the “first embodiment”) is further improved to achieve both accuracy and efficiency of processing. Specifically, when the feature amounts have been reduced by thinning out the pixels from which feature amounts are extracted in the example of FIG. 5, or by extracting feature amounts in units of blocks as illustrated in FIG. 6, adding a further process that determines whether each pixel of the original image is background makes it possible to improve determination accuracy, particularly near the boundaries between regions, while maintaining processing efficiency.

In this case, the feature amount extraction means 15 extracts a feature amount for each block corresponding to all pixels from the original image subject to background discrimination, while the pixel determining means 20 calculates, for each divided region, a representative feature amount of the region on the basis of the feature amounts of the blocks included in that region. The representative feature amount of each region can be obtained, for example, by averaging the feature amount vectors.

The pixel determining means 20 then determines, for each per-block feature amount extracted from the original image, which region's representative feature amount it is most similar to, and determines whether the pixel is background according to whether that most similar region has been determined to be background. That is, if the most similar region has been determined to be a background region, the pixel determining means 20 determines that the pixel is background; otherwise, it determines that the pixel is not background. Both accuracy and efficiency of processing can thereby be achieved.
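The pixel-level determination can be sketched as a nearest-representative lookup. The text states that the representative feature amount may be the average of the feature vectors, but it does not name a similarity measure; the squared Euclidean distance used here is an assumption, and the feature vectors (RGB-like triples), region names, and function names are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def squared_distance(a, b):
    """Assumed similarity measure: smaller distance means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify_pixels(pixel_features, region_features, background_regions):
    """Return True for each pixel whose most similar region is background."""
    representatives = {r: mean_vector(v) for r, v in region_features.items()}
    result = []
    for feature in pixel_features:
        nearest = min(
            representatives,
            key=lambda r: squared_distance(feature, representatives[r]),
        )
        result.append(nearest in background_regions)
    return result

# Hypothetical block features per region (for example, mean colours).
region_features = {
    "R1": [(200, 220, 250), (210, 230, 255)],  # sky-like region, judged background
    "R4": [(120, 60, 40), (130, 70, 50)],      # building-like region, judged object
}
pixel_features = [(205, 225, 250), (125, 66, 44), (190, 200, 230)]

is_background = classify_pixels(pixel_features, region_features, {"R1"})
```

Because each pixel is compared only with one representative per region rather than re-clustered, the added cost grows with the number of pixels times the number of regions, which keeps the clustering load unchanged.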

As described above, in the second embodiment, feature amounts are extracted from all pixels, but the feature amounts subjected to clustering are thinned out or grouped into blocks of multiple pixels to reduce the processing load. Meanwhile, a representative feature amount is calculated for each region resulting from clustering, and whether each pixel is background is also determined from the similarity between this representative feature amount and the feature amount of each pixel. This achieves both efficient processing, through a reduced clustering load, and discrimination accuracy that extends to pixels near region boundaries.

As a criterion for which region a pixel resembles, besides the similarity between the representative feature amount of a region and the feature amount of the pixel described above, the similarity between the average background probability of each region and the feature amount of each pixel of the original image can also be used. That is, when the feature amount of a pixel of the original image is similar to the average background probability of a region, the pixel is determined to be background.

[5. Other Embodiments]
The above embodiments are merely examples, and the present invention also encompasses the embodiments illustrated below as well as others. For example, regarding the setting of the background probability for each block, it is not essential to prepare a background probability distribution for each genre in advance; the background probability may instead be calculated and set for each block constituting an individual original image on the basis of a fixed background probability criterion.

Each element, such as the means described above, is not limited to implementation by the arithmetic control unit of a computer and may be realized by another information processing mechanism, such as an electronic circuit based on wired logic. The configuration diagrams, the figures showing data, images, and screens, and the flowcharts are merely examples, and the presence or absence of each element, its order, and its specific contents can be changed as appropriate. For example, the device of the present invention may be realized using a plurality of devices such as servers, and a configuration in which each storage means is realized by a separate, independent server device or system is also common. Depending on the function, the configuration can be changed flexibly, for example by calling an external platform through an API (application program interface) or network computing (so-called cloud computing).

DESCRIPTION OF SYMBOLS
1 Background discriminating device
6 Arithmetic control unit
7 Storage device
8 Communication means
11 Original image storage means
12 Learning data storage means
13 Setting means
14 Probability distribution storage means
15 Feature amount extraction means
16 Dividing means
17 Discriminating means
18 Result storage means
19 Similar image search means
B1, B2, B11, B12 Blocks
C1, C2, C11, C12 Feature amounts
G, G1, G2 Pixels
N Communication network
R1 to R5 Regions
T Terminal

Claims

A background discriminating device for discriminating a background in an image, comprising:
feature amount extraction means for extracting a feature amount for each block corresponding to all pixels of an original image subject to discrimination;
dividing means for dividing the image into a plurality of regions by clustering the blocks on the basis of the extracted feature amounts of the blocks;
setting means for setting, for each block of the image, a probability of being background;
discriminating means for calculating, for each divided region, an average value of the probabilities set for the blocks at positions corresponding to the region, and determining whether each region is background on the basis of the average value; and
pixel determining means for calculating, for each divided region, a representative feature amount of the region on the basis of the feature amounts of the blocks included in the region, determining which region's representative feature amount each extracted per-block feature amount corresponding to all pixels of the original image is most similar to, and determining whether the pixel is background according to whether the most similar region has been determined to be background.

The background discriminating device according to claim 1, further comprising learning data storage means for storing images for each genre as learning data,
wherein the setting means
generates in advance, for each genre, using the learning data stored in the learning data storage means, a background probability distribution, which is data associating a probability of being background with each block of an image, and stores the background probability distribution in predetermined probability distribution storage means, and
sets a probability of being background for each block of the original image on the basis of the background probability distribution stored in the probability distribution storage means.

The background discriminating device according to claim 1, wherein the setting means represents the distribution of the probability of being background using coefficients of a two-dimensional normal distribution.

A background discriminating method in which a computer discriminates a background in an image, comprising:
a feature amount extraction step in which the computer extracts a feature amount for each block corresponding to all pixels of an original image subject to discrimination;
a dividing step in which the computer divides the image into a plurality of regions by clustering the blocks on the basis of the extracted feature amounts of the blocks;
a setting step in which the computer sets, for each block of the image, a probability of being background;
a discriminating step in which the computer calculates, for each divided region, an average value of the probabilities set for the blocks at positions corresponding to the region, and determines whether each region is background on the basis of the average value; and
a pixel determining step in which the computer calculates, for each divided region, a representative feature amount of the region on the basis of the feature amounts of the blocks included in the region, determines which region's representative feature amount each extracted per-block feature amount corresponding to all pixels of the original image is most similar to, and determines whether the pixel is background according to whether the most similar region has been determined to be background.

A background discriminating program for causing a computer to discriminate a background in an image, the program causing the computer to:
extract a feature amount for each block corresponding to all pixels of an original image subject to discrimination;
divide the image into a plurality of regions by clustering the blocks on the basis of the extracted feature amounts of the blocks;
set, for each block of the image, a probability of being background;
calculate, for each divided region, an average value of the probabilities set for the blocks at positions corresponding to the region, and determine whether each region is background on the basis of the average value; and
calculate, for each divided region, a representative feature amount of the region on the basis of the feature amounts of the blocks included in the region, determine which region's representative feature amount each extracted per-block feature amount corresponding to all pixels of the original image is most similar to, and determine whether the pixel is background according to whether the most similar region has been determined to be background.