JP6407467B1

JP6407467B1 - Image processing apparatus, image processing method, and program

Info

Publication number: JP6407467B1
Application number: JP2018097104A
Authority: JP
Inventors: 玄新名
Original assignee: Individual
Current assignee: Individual
Priority date: 2018-05-21
Filing date: 2018-05-21
Publication date: 2018-10-17
Anticipated expiration: 2038-05-21
Also published as: JP2019204163A

Abstract

[Problem] To provide an image processing apparatus, an image processing method, and a program capable of assigning foreground labels and background labels to pixels with sufficient accuracy, in order to cut out a target region from an image while distinguishing it from a background region.
[Solution] An image processing apparatus comprising: an image input unit that inputs an original image; a primary target region calculation unit that calculates a primary target region from the input original image using a DNN and assigns a background label to the region outside the primary target region; a small region slide unit that extracts small regions while sliding a small-region frame within the primary target region; a small region feature quantification unit that statistically evaluates and quantifies the features of each small region using statistics such as the mean and variance; a small region clustering unit that vectorizes the information including pixel position information and features in each small region, clusters the small regions using a neural network based on the information contained in them, and stores a node for each similarity or feature of the original image within the small regions; a small region link unit that links a small region to each position in the primary target region by evaluating, among other things, the relationships among the small regions extracted by the small region slide unit, the input original image, and the nodes of the small region clustering unit; a small region connection evaluation unit that evaluates the positional relationship between the class of each small region given by the small region clustering unit and the input image, evaluates the continuity of the small regions, and calculates an evaluation value; a secondary target region/background region separation unit that uses the evaluation value to divide the primary target region cut out by the primary target region calculation unit into a secondary target region and a background region, assigns the foreground label and the background label to them respectively, and also assigns the foreground label or the background label individually to each pixel in the small regions; a graph cut unit that uses the labels assigned by the secondary target region/background region separation unit as the initial values of a graph cut and, by the graph cut, further assigns the foreground label and the background label across the secondary target region and the background region; and an image generation unit that generates and outputs, as a target image, only the pixels given the foreground label by the graph cut unit.
[Selected drawing] FIG. 2

Description

The present invention relates to image processing technology, and more particularly to an image processing apparatus, an image processing method, and a program.

Conventionally, it has been difficult to cut out only a fashion item (for example, a garment) from a photographic image as the target region while distinguishing it from the background region. When the judgment is based only on pixel intensity, for example in grayscale, parts of the garment are often detected as noise, because garments are made of fibers with an uneven surface and because of lighting conditions, patterns, and the like.

The related art includes the following.

Patent Document 1 discloses the following.
As a technique for extracting a target region with high accuracy, the graph cut method is known, which models the division of an image into a target region and a background region as the cutting of links between pixels. For example, a graph is created in which each pixel is treated as a node, and a cut is derived that divides the graph, at minimum energy, into a group of nodes for the object region and a group of nodes for the background region.
However, extraction accuracy drops at parts of the object for which a given feature is unsuitable.
For example, when a person wearing a white shirt walks in front of a shelf and a white wall, the color feature is unsuitable for extraction near the shirt and accuracy drops there, while the shape feature is unsuitable for extraction at the legs and accuracy drops there.
That is, near the shirt, the energy of the color feature can be small not only at the boundary between the shirt and the wall but also inside the shirt region and inside the wall region, so a person region with part of the shirt missing, or a person region that includes part of the wall, tends to be extracted.
Near the legs, on the other hand, the energy of the shape feature can be small at both the edges of the legs and the edges of the shelf, so a person region with part of a leg missing, or a person region that includes part of the shelf, tends to be extracted.
Extraction accuracy thus drops when parts of the person and the background have similar colors or when background edges lie near the person. However, the colors of people vary, and the colors and edges of the background around a person change as the person moves, so it is difficult to set in advance an appropriate allocation of the contribution ratios of the color-feature energy and the shape-feature energy.
In view of this problem, a technique has been proposed that adjusts the contribution of each feature for each part of a single object, for example by performing color-weighted region division at the head and shape-weighted region division at the legs, so that the object region can be extracted with high accuracy even when different accuracy-degrading factors arise at different parts of the object (see Patent Document 1).

Patent Document 2 discloses the following.
Graph cut cuts out an object based on representative pixels of the object and the background (hereinafter also called seeds), and therefore has the drawback that the seeds must be set manually.
On the other hand, as a method that does not set seeds manually, there is a method that computes the seeds automatically: the target to be extracted is learned in advance by a separate method, the region is roughly estimated, and pixels inside and outside the region are then used as seeds.
In graph cut, a graph is created whose nodes are the pixels of the image plus two further nodes: a source node corresponding to the background (or foreground) label and a sink node corresponding to the foreground (or background) label.
Links called N-links connect adjacent pixel nodes, and links called T-links connect the source node to the pixel nodes and the sink node to the pixel nodes.
In the prior art, correct background and foreground labels are given to some of the pixel nodes manually or by learning. A correct label is also called a seed.
A pixel given the correct background label is a background seed, and a pixel given the correct foreground label is a foreground seed. These seeds are either specified by the user, or obtained by estimating the target object region with a separate method and taking foreground seeds or background seeds from inside or outside that region.
Once the seeds are obtained, the costs assigned to the T-links are computed from the relationship between the features of the seeds and the features of the non-seed pixels. The graph cut method separates the foreground from the background by cutting the graph into two subgraphs so as to minimize the value of an energy function defined as the sum of the T-link and N-link costs.
Because the data cost needs to be computed without setting seeds by hand or collecting large amounts of data for learning, a technique has been proposed that uses the graph cut method but enables pattern extraction without the user setting seeds by hand and without requiring learning (see Patent Document 2).
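The graph construction described above (pixel nodes, a source and a sink, T-links and N-links, and a minimum-energy cut) can be illustrated with a minimal sketch on a one-dimensional "image". Everything specific here is an assumption made for illustration and does not come from either cited document: the function name `min_cut_labels`, the choice of the source as the foreground side, the T-link costs (distance to the mean seed intensity), the N-link weight formula, and the use of a plain Edmonds-Karp maximum-flow search.

```python
from collections import deque

def min_cut_labels(pixels, fg_seeds, bg_seeds, smooth=2.0):
    """Label a 1-D list of intensities 'fg' or 'bg' with a minimum s-t cut.

    Hypothetical cost model: the source stands for foreground, the sink for
    background; T-link costs are distances to the mean seed intensity.
    """
    n = len(pixels)
    src, snk = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)  # reverse edge for the residual graph

    fg_mean = sum(pixels[i] for i in fg_seeds) / len(fg_seeds)
    bg_mean = sum(pixels[i] for i in bg_seeds) / len(bg_seeds)
    big = 1e9  # effectively uncuttable link for seed pixels

    for i, p in enumerate(pixels):  # T-links
        if i in fg_seeds:
            add_edge(src, i, big)
        elif i in bg_seeds:
            add_edge(i, snk, big)
        else:
            add_edge(src, i, abs(p - bg_mean))  # paid if i ends up background
            add_edge(i, snk, abs(p - fg_mean))  # paid if i ends up foreground

    for i in range(n - 1):  # N-links: similar neighbours are costly to separate
        w = smooth / (1.0 + abs(pixels[i] - pixels[i + 1]))
        add_edge(i, i + 1, w)
        add_edge(i + 1, i, w)

    while True:  # Edmonds-Karp: augment along shortest paths until none is left
        parent = {src: None}
        queue = deque([src])
        while queue and snk not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if snk not in parent:
            break
        flow, v = float("inf"), snk
        while parent[v] is not None:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = snk
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u

    # Nodes still reachable from the source lie on the foreground side of the cut.
    return ["fg" if i in parent else "bg" for i in range(n)]
```

For instance, `min_cut_labels([10, 12, 11, 50, 52, 51], {4}, {0})` separates the three dark pixels from the three bright ones. A production implementation would normally rely on a dedicated max-flow library rather than this small search.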

JP 2014-10717 A
JP 2014-132392 A

However, even with the techniques disclosed in Patent Documents 1 and 2 above, it has remained difficult to assign foreground labels and background labels to pixels with sufficient accuracy in order to cut out a target region from an image while distinguishing it from the background region.

An object of the present invention is to provide an image processing apparatus, an image processing method, and a program capable of assigning foreground labels and background labels to pixels with sufficient accuracy, in order to cut out a target region from an image while distinguishing it from the background region.

The image processing apparatus of the present invention comprises:
an image input unit that inputs an original image;
a primary target region calculation unit that calculates a primary target region from the input original image using a DNN and assigns a background label to the region outside the primary target region;
a small region slide unit that extracts small regions while sliding a small-region frame within the primary target region;
a small region feature quantification unit that statistically evaluates and quantifies the features of each small region using statistics such as the mean and variance;
a small region clustering unit that vectorizes the information including pixel position information and features in each small region, clusters the small regions using a neural network based on the information contained in them, and stores a node for each similarity or feature of the original image within the small regions;
a small region link unit that links a small region to each position in the primary target region by evaluating the relationships among the small regions extracted by the small region slide unit, the input original image, and the nodes of the small region clustering unit;
a small region connection evaluation unit that evaluates the positional relationship between the class of each small region given by the small region clustering unit and the input image, evaluates the continuity of the small regions, and calculates an evaluation value;
a secondary target region/background region separation unit that uses the evaluation value from the small region connection evaluation unit to divide the primary target region cut out by the primary target region calculation unit into a secondary target region and a background region, assigns the foreground label and the background label to them respectively, and also assigns the foreground label or the background label individually to each pixel in the small regions;
a graph cut unit that uses the labels assigned by the secondary target region/background region separation unit as the initial values of a graph cut and, by the graph cut, further assigns the foreground label and the background label across the secondary target region and the background region; and
an image generation unit that generates and outputs, as a target image, only the pixels given the foreground label by the graph cut unit.

The image processing method of the present invention comprises:
a step of inputting an original image by an image input unit;
a step of calculating, by a primary target region calculation unit, a primary target region from the input original image using a DNN, and assigning a background label to the region outside the primary target region;
a step of extracting, by a small region slide unit, small regions while sliding a small-region frame within the primary target region;
a step of statistically evaluating and quantifying, by a small region feature quantification unit, the features of each small region using statistics such as the mean and variance;
a step of vectorizing, by a small region clustering unit, the information including pixel position information and features in each small region, clustering the small regions using a neural network based on the information contained in them, and storing a node for each similarity or feature of the original image within the small regions;
a step of linking, by a small region link unit, a small region to each position in the primary target region by evaluating the relationships among the small regions extracted by the small region slide unit, the input original image, and the nodes of the small region clustering unit;
a step of evaluating, by a small region connection evaluation unit, the positional relationship between the class of each small region given by the small region clustering unit and the input image, evaluating the continuity of the small regions, and calculating an evaluation value;
a step of dividing, by a secondary target region/background region separation unit using the evaluation value from the small region connection evaluation unit, the primary target region cut out by the primary target region calculation unit into a secondary target region and a background region, assigning the foreground label and the background label to them respectively, and also assigning the foreground label or the background label individually to each pixel in the small regions;
a step of using, by a graph cut unit, the labels assigned by the secondary target region/background region separation unit as the initial values of a graph cut and, by the graph cut, further assigning the foreground label and the background label across the secondary target region and the background region; and
a step of generating and outputting, by an image generation unit, as a target image only the pixels given the foreground label by the graph cut unit.

The program of the present invention causes an information processing apparatus to execute:
a process of inputting an original image by an image input unit;
a process of calculating, by a primary target region calculation unit, a primary target region from the input original image using a DNN, and assigning a background label to the region outside the primary target region;
a process of extracting, by a small region slide unit, small regions while sliding a small-region frame within the primary target region;
a process of statistically evaluating and quantifying, by a small region feature quantification unit, the features of each small region using statistics such as the mean and variance;
a process of vectorizing, by a small region clustering unit, the information including pixel position information and features in each small region, clustering the small regions using a neural network based on the information contained in them, and storing a node for each similarity or feature of the original image within the small regions;
a process of linking, by a small region link unit, a small region to each position in the primary target region by evaluating the relationships among the small regions extracted by the small region slide unit, the input original image, and the nodes of the small region clustering unit;
a process of evaluating, by a small region connection evaluation unit, the positional relationship between the class of each small region given by the small region clustering unit and the input image, evaluating the continuity of the small regions, and calculating an evaluation value;
a process of dividing, by a secondary target region/background region separation unit using the evaluation value from the small region connection evaluation unit, the primary target region cut out by the primary target region calculation unit into a secondary target region and a background region, assigning the foreground label and the background label to them respectively, and also assigning the foreground label or the background label individually to each pixel in the small regions;
a process of using, by a graph cut unit, the labels assigned by the secondary target region/background region separation unit as the initial values of a graph cut and, by the graph cut, further assigning the foreground label and the background label across the secondary target region and the background region; and
a process of generating and outputting, by an image generation unit, as a target image only the pixels given the foreground label by the graph cut unit.

According to the present invention, it is possible to provide an image processing apparatus, an image processing method, and a program capable of assigning foreground labels and background labels to pixels with sufficient accuracy, in order to cut out a target region from an image while distinguishing it from the background region.

FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a functional block diagram of the image processing apparatus according to the embodiment of the present invention.
FIG. 3 is a diagram explaining the sliding of a small region in the embodiment of the present invention.
FIG. 4 is a flowchart showing the processing operation of the image processing apparatus according to the embodiment of the present invention.
FIG. 5 is a diagram explaining an application example of the image processing method according to the embodiment of the present invention.
FIG. 6 is a diagram explaining an application example of the image processing method according to the embodiment of the present invention.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.

FIG. 1 is a diagram showing an example of the configuration of the image processing apparatus according to the present embodiment. In the example shown in FIG. 1, the image processing apparatus 10 is connected to an imaging unit 20.
The imaging unit 20 may be a camera or the like, and captures, for example, an image of a model or mannequin wearing a newly released fashion item.
The image processing apparatus 10 is shown here as separate from the imaging unit 20, but the two may be provided in the same apparatus, and a display unit or the like may also be provided.

The image processing apparatus 10 is an information processing apparatus, such as a personal computer, a tablet terminal, or a smartphone, that performs image processing such as pattern extraction on an image captured by the imaging unit 20. As hardware, it has a well-known configuration including a control unit 101, a main storage unit 102, an auxiliary storage unit 103, a communication unit 104, and a drive device 105. These units are connected via a bus so that they can exchange data with one another.

The control unit 101 is a CPU (Central Processing Unit) that controls each device and computes and processes data in the information processing apparatus. The control unit 101 is also an arithmetic unit that executes programs stored in the main storage unit 102 or the auxiliary storage unit 103: it receives data from an input device or a storage device, computes and processes the data, and outputs the result to an output device (not shown) or a storage device.

The main storage unit 102 is, for example, a ROM (Read Only Memory) or a RAM (Random Access Memory). It is a storage device that stores or temporarily holds programs and data, such as the OS (the basic software executed by the control unit 101) and application software.

The auxiliary storage unit 103 is, for example, an HDD (Hard Disk Drive), and is a storage device that stores data related to application software and the like. The auxiliary storage unit 103 stores, for example, image data acquired from the imaging unit 20.

The communication unit 104 performs wired or wireless data communication with peripheral devices. For example, the communication unit 104 acquires an image containing a pattern via a network and stores it in the auxiliary storage unit 103.

The drive device 105 reads a predetermined program from a recording medium 106, such as a flexible disk, a CD (Compact Disc), or a magneto-optical disk, and installs it in a storage device. Once installed, the program can be executed by the image processing apparatus 10. The program may instead be acquired from another apparatus via a network.

FIG. 2 is a functional block diagram of the image processing apparatus 10 according to the present embodiment. In the example shown in FIG. 2, the image processing apparatus 10 includes an image input unit 201, a primary target region calculation unit 202, a small region slide unit 203, a small region feature quantification unit 204, a small region clustering unit 205, a small region link unit 206, a small region connection evaluation unit 207, a secondary target region/background region separation unit 208, a graph cut unit 209, and an image generation unit 210.

Each unit shown in FIG. 2 functions, for example, when the control unit 101 loads an image processing program stored in the auxiliary storage unit 103 into the main storage unit 102 and executes it. That is, each unit shown in FIG. 2 can be realized by, for example, the control unit 101 and the main storage unit 102 serving as working memory, or may be realized in hardware.

The image input unit 201 has a function of inputting, for example, an image (the original image) captured by the imaging unit 20. The input image may be a color image. The image input unit 201 stores the input image in the main storage unit 102.

The primary target region calculation unit 202 has a function of calculating a primary target region (which may include part of what is ultimately the background region) from the input image (the original image stored in the main storage unit 102) by a technique using a DNN (multilayer neural network), and assigning a background label to the region outside the (primary) target region.
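As a rough illustration of the label assignment performed after the primary target region has been calculated, the following sketch assumes that the DNN output is already available as a binary mask (1 inside the primary target region, 0 outside). The constants `BACKGROUND` and `UNDECIDED` and the function name `init_labels` are hypothetical, and the DNN itself is outside the scope of the sketch.

```python
BACKGROUND, UNDECIDED = 0, -1  # hypothetical label codes

def init_labels(primary_mask):
    """Assign the background label outside the primary target region.

    Pixels inside the region stay undecided, because the primary region
    may still contain pixels that later turn out to be background.
    """
    return [[UNDECIDED if inside else BACKGROUND for inside in row]
            for row in primary_mask]
```

Leaving the inside of the region undecided reflects the statement above that the primary target region may still include part of the final background.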

The small region slide unit 203 has a function of extracting small regions while sliding a small-region frame within the target region. For example, as shown in FIG. 3, a total of nine cells (3 × 3) may be used as the frame of one small region, and the frame may be slid one cell at a time in the horizontal and vertical directions.
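The 3 × 3 frame sliding one cell at a time can be sketched as follows. The image and mask are assumed to be plain nested lists, and the rule that a frame is kept only when it lies entirely inside the primary target region is an assumption made for this sketch; the description does not state how frames that straddle the region boundary are handled. The function name `slide_small_regions` is hypothetical.

```python
def slide_small_regions(image, mask, size=3, step=1):
    """Yield (top, left, patch) for each size x size frame inside the mask.

    image and mask are lists of rows of equal shape; mask is truthy inside
    the primary target region. Frames touching the outside are skipped
    (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            rows = range(top, top + size)
            cols = range(left, left + size)
            if all(mask[r][c] for r in rows for c in cols):
                yield top, left, [[image[r][c] for c in cols] for r in rows]
```

On a 4 × 4 image whose mask is entirely inside the region, this yields four overlapping frames, at offsets (0, 0), (0, 1), (1, 0), and (1, 1).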

The small region feature quantification unit 204 has a function of quantitatively evaluating and quantifying the features of the small regions extracted by the small region slide unit 203.
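Elsewhere in this document the quantification is said to use statistics such as the mean and variance. A minimal sketch of that step, assuming a single-channel small region given as nested lists and using the population variance (the description does not say whether the population or sample variance is intended), might look like this. The function name `patch_features` is hypothetical.

```python
from statistics import fmean, pvariance

def patch_features(patch):
    """Return (mean, population variance) of the pixel values in a small region."""
    values = [v for row in patch for v in row]
    return fmean(values), pvariance(values)
```

For a color image the same statistics could be computed per channel, which is likewise an assumption rather than something the description specifies.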

The small region clustering unit 205 has a function of clustering the features of the small regions and storing a node for each similarity or feature of the image within the small regions. It vectorizes the information including pixel position information and features in each small region, and uses the information contained in the small regions to cluster (classify) them with a neural network.
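The description does not name the neural network used for clustering. Purely as an illustration, the sketch below assumes a winner-take-all competitive layer, in which each node holds a prototype vector that is pulled toward the inputs it wins. The vector layout (row, column, mean, variance), the function name `cluster_vectors`, the initialization, and the learning-rate schedule are all assumptions of this sketch, not details taken from the patent.

```python
def cluster_vectors(vectors, n_nodes=2, epochs=20, lr=0.5):
    """Cluster feature vectors with winner-take-all competitive learning.

    Returns (nodes, assignments): the learned prototype vectors and, for
    each input vector, the index of its nearest node.
    """
    # Initialise the nodes from evenly spaced inputs (an arbitrary choice).
    stride = max(1, len(vectors) // n_nodes)
    nodes = [list(vectors[min(i * stride, len(vectors) - 1)])
             for i in range(n_nodes)]

    def nearest(v):
        # Index of the node with the smallest squared Euclidean distance to v.
        return min(range(n_nodes),
                   key=lambda k: sum((a - b) ** 2 for a, b in zip(v, nodes[k])))

    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        for v in vectors:
            k = nearest(v)
            # Move only the winning node toward the input vector.
            nodes[k] = [b + rate * (a - b) for a, b in zip(v, nodes[k])]
    return nodes, [nearest(v) for v in vectors]
```

With vectors such as `(0, 0, 1.0, 0.1)` and `(5, 5, 9.0, 0.2)`, small regions that are close in both position and statistics end up on the same node. In practice the position and feature components would probably need rescaling so that neither dominates the distance.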

The small region link unit 206 has a function of expressing, by links, the relationships among the small regions extracted by the small region slide unit 203, the input image (the original image stored in the main storage unit 102), and the nodes of the small region clustering unit 205. It links a small region to each position in the (primary) target region by evaluating, among other things, the connection with the region to which the background label has been assigned and the connection with the small region located at the centroid of the (primary) target region.

The small region connection evaluation unit 207 has a function of evaluating the positional relationship between the class of each small region in the small region clustering unit 205 and the input image (the original image stored in the main storage unit 102), and evaluating the continuity of the small regions. It evaluates the connections between similar small regions and the similarity of their labels.

The secondary target region background region segmentation unit 208 assigns a label to each small region using the evaluation values from the small region connection evaluation unit 207. That is, it divides the (primary) target region cut out by the primary target region calculation unit 202 into a secondary target region and a background region and assigns the foreground label and the background label to them, respectively. It also assigns the foreground label or the background label individually to each pixel within the small regions.

Based on the labels assigned by the secondary target region background region segmentation unit 208, the graph cut unit 209 further assigns foreground and background labels across the (secondary) target region and the background region.

The image generation unit 210 generates an image (target image) consisting only of the pixels given the foreground label by the graph cut unit 209.

Next, the processing operation of the image processing apparatus 10 in the present embodiment is described in detail with reference to the flowchart of FIG. 4. The description below assumes that the foreground label and the background label are assigned to the target region and the background region of the image, respectively.

First, the image input unit 201 of the image processing apparatus 10 inputs, for example, an image captured by the imaging unit 20 (step S01). The image input unit 201 stores the input image in the main storage unit 102.

Next, the primary target region calculation unit 202 of the image processing apparatus 10 applies a DNN (multilayer neural network) to the image input by the image input unit 201 to calculate a primary target region (which may include parts of both the final foreground region and the final background region), and assigns the background label to the area outside the (primary) target region (step S02). At this point, no foreground label is yet assigned inside the (primary) target region.
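The label state after step S02 can be illustrated with a minimal sketch. The DNN itself is not shown; a hypothetical boolean mask stands in for its output, and the names `BACKGROUND`, `UNLABELED`, and `initial_labels` are assumptions introduced here for illustration only.

```python
BACKGROUND = "bg"   # label assigned outside the primary target region in step S02
UNLABELED = None    # inside the primary target region no label is assigned yet

def initial_labels(primary_mask):
    """Map a boolean primary-region mask to the label state after step S02."""
    return [[UNLABELED if inside else BACKGROUND for inside in row]
            for row in primary_mask]

# Hypothetical 3x4 mask standing in for the DNN output (True = inside the region).
mask = [
    [False, True,  True,  False],
    [False, True,  True,  False],
    [False, False, False, False],
]
labels = initial_labels(mask)
```

Only the outside of the region receives a label here; the inside is left open for the later steps to decide.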

Next, the small region slide unit 203 of the image processing apparatus 10 extracts small regions while sliding a small-region frame within the (primary) target region (see FIG. 3) (step S03). As the frame slides, the processing described below evaluates, as needed, the connection to the region to which the background label has been assigned and the connection to the small region located at the centroid of the (primary) target region. The (primary) target region is thereby divided into a secondary target region and a background region, to which the foreground label and the background label are assigned, respectively.

The small region feature digitizing unit 204 of the image processing apparatus 10 quantitatively evaluates the features of each small region and converts them into numerical values (step S04). For example, rather than relying on the shading of individual pixels, the pixel pattern in each small region is statistically evaluated and quantified using the mean and the variance. (A fine black-and-white check looks gray to the human eye, but a machine perceives each square separately, so a given block should be recognized as gray on average.)
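The checkerboard remark can be made concrete with a small sketch, assuming 8-bit gray levels and using the population variance. The function name `small_region_features` and the sample data are illustrative assumptions.

```python
from statistics import fmean, pvariance

def small_region_features(image, top, left, size):
    """Return (mean, variance) of the gray levels in one size x size small region."""
    patch = [image[top + i][left + j] for i in range(size) for j in range(size)]
    return fmean(patch), pvariance(patch)

# A fine black-and-white check and a uniformly gray block (hypothetical data).
checker = [[0, 255, 0, 255],
           [255, 0, 255, 0],
           [0, 255, 0, 255],
           [255, 0, 255, 0]]
flat_gray = [[128] * 4 for _ in range(4)]

mean_c, var_c = small_region_features(checker, 0, 0, 4)
mean_g, var_g = small_region_features(flat_gray, 0, 0, 4)
```

Both blocks have nearly the same mean (127.5 and 128), so on average both read as gray, while the variance (16256.25 versus 0) still distinguishes the patterned block from the flat one.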

The small region clustering unit 205 of the image processing apparatus 10 clusters the features of the small regions and stores a node for each similarity or feature of the image within the small regions (step S05). That is, it vectorizes information containing the position information and features of the pixels in each small region, and uses a neural network to cluster (classify) the small regions on the basis of that information.
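The source does not name the type of neural network used for clustering. As one possible reading, the sketch below uses winner-take-all competitive learning, a simplified relative of a self-organizing map, in which each node is a weight vector that moves toward the inputs it wins. The feature layout `(row, col, mean)`, the initialization, and the function names are all assumptions.

```python
def nearest_node(nodes, vector):
    """Index of the node closest to the vector (squared Euclidean distance)."""
    return min(range(len(nodes)),
               key=lambda k: sum((w - x) ** 2 for w, x in zip(nodes[k], vector)))

def train_nodes(vectors, n_nodes, epochs=10, rate=0.5):
    """Winner-take-all competitive learning over small-region feature vectors."""
    # Assumption: nodes start from evenly spaced input vectors.
    nodes = [list(vectors[i * len(vectors) // n_nodes]) for i in range(n_nodes)]
    for epoch in range(epochs):
        step = rate * (1 - epoch / epochs)  # learning rate decays over the epochs
        for v in vectors:
            k = nearest_node(nodes, v)
            # Move only the winning node toward the input vector.
            nodes[k] = [w + step * (x - w) for w, x in zip(nodes[k], v)]
    return nodes

# Hypothetical small regions described as (row, col, mean gray level).
vectors = [(0, 0, 10.0), (0, 1, 12.0), (5, 5, 200.0), (5, 6, 210.0)]
nodes = train_nodes(vectors, n_nodes=2)
classes = [nearest_node(nodes, v) for v in vectors]
```

The class of a small region is the index of its nearest node, so dark regions near the top-left and bright regions near the bottom-right end up in different classes.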

The small region link unit 206 of the image processing apparatus 10 links a small region to each position in the (primary) target region by evaluating, among other things, the relationships among the small regions extracted by the small region slide unit 203, the input image (the original image stored in the storage unit 102), and the nodes of the small region clustering unit 205 (step S06). Specifically, it evaluates the connection to the region to which the background label has been assigned and the connection to the small region located at the centroid of the (primary) target region.
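How the small region at the centroid might be located can be sketched as follows. The source does not define the centroid computation or the linking criterion; this sketch assumes the centroid is the mean coordinate of the region's pixels and selects the frame whose center is nearest to it. Both function names are hypothetical.

```python
def centroid_of_region(mask):
    """Mean (row, col) of all pixels inside the boolean region mask."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, inside in enumerate(row) if inside]
    n = len(coords)
    return sum(r for r, _ in coords) / n, sum(c for _, c in coords) / n

def centroid_small_region(mask, positions, size):
    """Top-left corner of the small region whose center is nearest the centroid."""
    cr, cc = centroid_of_region(mask)
    half = (size - 1) / 2  # offset from the top-left corner to the frame center
    return min(positions,
               key=lambda p: (p[0] + half - cr) ** 2 + (p[1] + half - cc) ** 2)

# Hypothetical 5x5 region that is entirely inside the primary target region.
mask = [[True] * 5 for _ in range(5)]
positions = [(r, c) for r in range(3) for c in range(3)]  # all 3x3 frames
center_region = centroid_small_region(mask, positions, size=3)
```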

The small region connection evaluation unit 207 of the image processing apparatus 10 evaluates the positional relationship between the class of each small region obtained by the small region clustering unit 205 and the input image (the original image stored in the storage unit 102), evaluates the continuity of the small regions, and calculates an evaluation value (step S07). It evaluates the connections between similar small regions and the similarity of their labels.

The secondary target region background region segmentation unit 208 of the image processing apparatus 10 assigns a label to each small region using the evaluation values from the small region connection evaluation unit 207. That is, it divides the (primary) target region cut out by the primary target region calculation unit 202 into a secondary target region and a background region and assigns the foreground label and the background label to them, respectively. It also assigns the foreground label or the background label individually to each pixel within the small regions (step S08).

This realizes a multi-stage scheme of at least two stages: labeling of the background region by (primary) preprocessing using the DNN, followed by clustering of the small regions associated with each position of the image, which uses this label information together with a further neural network.
By segmenting in multiple stages in this way, the foreground region and the background region can be separated progressively and in finer detail. In addition, by assigning labels to the small regions associated with image positions and clustering the small regions using a neural network, the topology (relationships) of the features of the small regions can be extracted.

Next, the graph cut unit 209 of the image processing apparatus 10 uses, as the initial values of a graph cut, the result of the secondary target region background region segmentation unit 208 assigning foreground and background labels to the (secondary) target region and the background region of the input image. The graph cut extracts the target region in finer detail; that is, it further assigns foreground and background labels across the (secondary) target region and the background region (step S09). In the graph cut, the sum of the energy evaluation functions is minimized so that the pixels contained in the small regions are separated into the foreground region and the background region more accurately. That is, rather than separating the small regions themselves by their foreground and background labels, the pixels in each small region are separated into the foreground region and the background region using the foreground and background labels of those pixels. In other words, the separation is performed on the pixels that the small regions contain, not on the small regions as blocks.
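A graph cut seeded with initial labels can be sketched on a one-dimensional row of pixels. The source states only that the sum of the energy evaluation functions is minimized; the energy used here (hard seed links plus a Gaussian smoothness weight between neighbors), the Edmonds-Karp max-flow solver, and every name and parameter value are assumptions for illustration.

```python
import math
from collections import deque

HARD = 1e9  # stands in for an "infinite" capacity on seed links

def source_side_of_min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow; returns the nodes still reachable from the source."""
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0.0)  # reverse edges
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue:  # breadth-first search in the residual graph
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent)  # no augmenting path left: this is the cut
        bottleneck, v = math.inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:  # push the flow along the path
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u

def refine_labels(intensity, seeds, sigma=30.0):
    """Label each pixel "fg" or "bg"; seeds hold "fg", "bg", or None per pixel."""
    cap = {"S": {}, "T": {}}
    for p in range(len(intensity)):
        cap[p] = {}
    for p, seed in enumerate(seeds):
        if seed == "fg":
            cap["S"][p] = HARD   # seed pixel tied to the foreground terminal
        elif seed == "bg":
            cap[p]["T"] = HARD   # seed pixel tied to the background terminal
    for p in range(len(intensity) - 1):
        diff = intensity[p] - intensity[p + 1]
        w = math.exp(-diff * diff / (2 * sigma ** 2))  # weak link across an edge
        cap[p][p + 1] = w
        cap[p + 1][p] = w
    fg_side = source_side_of_min_cut(cap, "S", "T")
    return ["fg" if p in fg_side else "bg" for p in range(len(intensity))]

row = [10, 12, 11, 200, 205, 198]
seeds = ["bg", None, None, None, None, "fg"]
refined = refine_labels(row, seeds)
```

Because similar neighbors are joined by strong links, the cheapest cut falls on the sharp intensity step between the third and fourth pixels, and the unlabeled pixels inherit the label of the seed on their side. The labels are decided per pixel, not per block, which matches the point made in the text.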

Then, the image generation unit 210 of the image processing apparatus 10 generates a target image consisting only of the pixels given the foreground label by the graph cut unit 209 and outputs it to a predetermined output device (an image display device) or the like (step S10). In this way, only the final target region can be cut out from the input image, separated from the final background region. For example, only a fashion item (such as a garment) can be cut out of a photographic image as the target image, separated from the final background region.
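The final step amounts to masking the image with the foreground labels. In the sketch below, the white fill for non-foreground pixels is an assumption (the source says only that the foreground pixels alone form the target image), and `extract_target` is a hypothetical name.

```python
def extract_target(image, labels, fill=(255, 255, 255)):
    """Keep pixels labeled "fg"; replace every other pixel with the fill color."""
    return [
        [pixel if label == "fg" else fill for pixel, label in zip(img_row, lab_row)]
        for img_row, lab_row in zip(image, labels)
    ]

# Hypothetical 2x2 RGB image and its final per-pixel labels.
image = [[(10, 20, 30), (200, 0, 0)],
         [(0, 0, 0), (180, 10, 10)]]
labels = [["bg", "fg"],
          ["bg", "fg"]]
target = extract_target(image, labels)
```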

According to the present embodiment, the foreground label and the background label can be assigned to pixels with sufficient accuracy to cut the (final) target region out of the image, separated from the (final) background region.

Further, according to the present embodiment, a neural network labels the small regions associated with each position of the input image as target region or background region, so that the input image is divided into target and background regions indirectly. The image features associated with each small region are therefore sorted without a predetermined threshold (on shading, for example) as in conventional methods, which makes it possible to classify small regions that have a nonlinear feature topology. In other words, classification and labeling can take greater account of how image features connect to the surrounding regions.

Further, according to the present embodiment, instead of dividing the image into elementary regions as in conventional methods, the small-region frame is slid within the target region so that small regions are linked to each position of the image, and each linked small region is classified as either target region or background region; that is, labels are assigned to the associated small regions. By labeling the associated small regions and then using them for clustering, the connections between small-region features are evaluated. The connection between the region judged to be background by the DNN and the small regions linked to the target region is also taken into account, so that the connections between features can be evaluated part by part in greater detail.

Conventional methods that classify (group) adjacent pixels are unsuitable for images of fashion items (garments, for example) with varied patterns and textures, because such patterns and textures are likely to be detected as noise. According to the present embodiment, each position of the image is associated with a small region from the outset, and a neural network classifies (groups) not pixel by pixel but over a wide range that includes the surrounding area of the image. Contours can therefore be extracted even where similar pixels are not contiguous in the surrounding area. For example, by looking at the slope of the boundary in each small region, a contour can be extracted along which the connection angles of the small regions remain continuous over the whole image.

Further, according to the present embodiment, the small regions associated with the image in the foreground (target region) and in the background are classified using a neural network. Because the multiple small regions associated with each image are taken into account, statistical differences in image features between those regions can be considered.

Further, according to the present embodiment, more flexible, nonlinear analysis by a neural network can be expected, rather than relying on a specific probability model.

An example of a service that uses the image processing method according to the present embodiment is described below.
Automatic tagging service (ATS: Auto Tagging Service): the method can be used to optimize and streamline the work of attaching descriptions and search words that accompanies the operation of an online store.

In the conventional workflow for running an online store, (1) products are photographed, that is, each new product is put on a model or mannequin and photographed, and (2) the product is registered in the online store, that is, search words and a description are written by hand for each product (taking, for example, about 10 minutes per garment).

The automatic tagging service (ATS) therefore automates the attachment of descriptions and search words.
As shown in FIG. 5, AI automatically generates a suitable description and tags from the image features of similar images. This reduces cost and standardizes the quality level.

As shown in FIG. 6:
(1) An image is uploaded to the ATS.
(2) The target image (target region) is extracted by an algorithm specialized for clothing regions (the image processing method according to the present embodiment).
(3) Features are calculated by an image analysis algorithm. The features are vectorized; for example, clustering is performed using the color feature (RGB values) and the shape feature of each pixel as a feature vector (3 dimensions).
(4) Similar images are detected by a high-speed search algorithm for high-dimensional data. For example, one million or more fashion item images may be collected and training data created by hand to improve the accuracy of the image recognition. Misclassified images may also be stored in a DB so that the AI learns the characteristics of the misclassifications and accuracy improves.
(5) The product master DB is searched. For example, the image is compared with similar past photographic images.
(6) From the similar products, the most suitable candidates for "category", "tag (hashtag)", and "description" are selected and generated automatically.
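Steps (4) to (6) can be sketched as a nearest-neighbor lookup followed by a vote over the neighbors' metadata. The brute-force search below is only a stand-in for the high-speed search algorithm mentioned in step (4), and the catalog layout, field names, and function names are hypothetical.

```python
import math
from collections import Counter

def most_similar(query, catalog, k):
    """Brute-force k-nearest-neighbor search by Euclidean distance."""
    return sorted(catalog, key=lambda item: math.dist(query, item["features"]))[:k]

def suggest_metadata(query, catalog, k=2):
    """Propose a category (majority vote) and tags (union) from similar products."""
    neighbors = most_similar(query, catalog, k)
    category = Counter(n["category"] for n in neighbors).most_common(1)[0][0]
    tags = sorted({tag for n in neighbors for tag in n["tags"]})
    return category, tags

# Hypothetical product master entries with 3-dimensional feature vectors.
catalog = [
    {"features": (200, 30, 40), "category": "dress", "tags": ["red", "summer"]},
    {"features": (190, 40, 50), "category": "dress", "tags": ["red", "party"]},
    {"features": (20, 30, 180), "category": "jeans", "tags": ["blue", "denim"]},
]
category, tags = suggest_metadata((195, 35, 45), catalog)
```

For a catalog of the size mentioned in step (4), an approximate nearest-neighbor index would normally replace the linear scan; the selection logic in `suggest_metadata` would stay the same.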

The embodiment described above is a preferred embodiment of the present invention, and various modifications can be made without departing from the gist of the invention. For example, the functions of each apparatus may be realized by having the apparatus read and execute a program that implements those functions. The program may also be transmitted to another computer system via a computer-readable recording medium such as a CD-ROM or a magneto-optical disk, or by transmission waves via a transmission medium such as the Internet or a telephone line. Part of the system may also be realized through human operation.

Description of Symbols
10 Image processing apparatus
20 Imaging unit
101 Control unit
102 Main storage unit
103 Auxiliary storage unit
104 Communication unit
105 Drive device
106 Recording medium
201 Image input unit
202 Primary target region calculation unit
203 Small region slide unit
204 Small region feature digitizing unit
205 Small region clustering unit
206 Small region link unit
207 Small region connection evaluation unit
208 Secondary target region background region segmentation unit
209 Graph cut unit
210 Image generation unit

Claims

An image processing apparatus comprising:
an image input unit that inputs an original image;
a primary target region calculation unit that calculates a primary target region in the input original image using a DNN and assigns a background label to the region outside the primary target region;
a small region slide unit that extracts small regions while sliding a small-region frame within the primary target region;
a small region feature digitizing unit that statistically evaluates and quantifies the features of the small regions using an average value and a variance;
a small region clustering unit that vectorizes information containing the position information and features of the pixels in the small regions, thereby clusters the small regions using a neural network based on the information contained in the small regions, and stores a node for each similarity or feature of the original image within the small regions;
a small region link unit that links the small regions to positions in the primary target region by evaluating the relationships among the small regions extracted by the small region slide unit, the input original image, and the nodes of the small region clustering unit;
a small region connection evaluation unit that evaluates the positional relationship between the class of each small region obtained by the small region clustering unit and the input image, evaluates the continuity of the small regions, and calculates an evaluation value;
a secondary target region background region segmentation unit that, using the evaluation value from the small region connection evaluation unit, divides the primary target region cut out by the primary target region calculation unit into a secondary target region and a background region, assigns a foreground label and the background label to them respectively, and also assigns the foreground label or the background label individually to each pixel in the small regions;
a graph cut unit that uses the result of the label assignment by the secondary target region background region segmentation unit as initial values of a graph cut and, by the graph cut, further assigns the foreground label and the background label across the secondary target region and the background region; and
an image generation unit that generates and outputs, as a target image, only the pixels given the foreground label by the graph cut unit.

An image processing method comprising:
a step in which an image input unit inputs an original image;
a step in which a primary target region calculation unit calculates a primary target region in the input original image using a DNN and assigns a background label to the region outside the primary target region;
a step in which a small region slide unit extracts small regions while sliding a small-region frame within the primary target region;
a step in which a small region feature digitizing unit statistically evaluates and quantifies the features of the small regions using an average value and a variance;
a step in which a small region clustering unit vectorizes information containing the position information and features of the pixels in the small regions, thereby clusters the small regions using a neural network based on the information contained in the small regions, and stores a node for each similarity or feature of the original image within the small regions;
a step in which a small region link unit links the small regions to positions in the primary target region by evaluating the relationships among the small regions extracted by the small region slide unit, the input original image, and the nodes of the small region clustering unit;
a step in which a small region connection evaluation unit evaluates the positional relationship between the class of each small region obtained by the small region clustering unit and the input image, evaluates the continuity of the small regions, and calculates an evaluation value;
a step in which a secondary target region background region segmentation unit, using the evaluation value from the small region connection evaluation unit, divides the primary target region cut out by the primary target region calculation unit into a secondary target region and a background region, assigns a foreground label and the background label to them respectively, and also assigns the foreground label or the background label individually to each pixel in the small regions;
a step in which a graph cut unit uses the result of the label assignment by the secondary target region background region segmentation unit as initial values of a graph cut and, by the graph cut, further assigns the foreground label and the background label across the secondary target region and the background region; and
a step in which an image generation unit generates and outputs, as a target image, only the pixels given the foreground label by the graph cut unit.

A program that causes an information processing apparatus to execute:
a process in which an image input unit inputs an original image;
a process in which a primary target region calculation unit calculates a primary target region in the input original image using a DNN and assigns a background label to the region outside the primary target region;
a process in which a small region slide unit extracts small regions while sliding a small-region frame within the primary target region;
a process in which a small region feature digitizing unit statistically evaluates and quantifies the features of the small regions using an average value and a variance;
a process in which a small region clustering unit vectorizes information containing the position information and features of the pixels in the small regions, thereby clusters the small regions using a neural network based on the information contained in the small regions, and stores a node for each similarity or feature of the original image within the small regions;
a process in which a small region link unit links the small regions to positions in the primary target region by evaluating the relationships among the small regions extracted by the small region slide unit, the input original image, and the nodes of the small region clustering unit;
a process in which a small region connection evaluation unit evaluates the positional relationship between the class of each small region obtained by the small region clustering unit and the input image, evaluates the continuity of the small regions, and calculates an evaluation value;
a process in which a secondary target region background region segmentation unit, using the evaluation value from the small region connection evaluation unit, divides the primary target region cut out by the primary target region calculation unit into a secondary target region and a background region, assigns a foreground label and the background label to them respectively, and also assigns the foreground label or the background label individually to each pixel in the small regions;
a process in which a graph cut unit uses the result of the label assignment by the secondary target region background region segmentation unit as initial values of a graph cut and, by the graph cut, further assigns the foreground label and the background label across the secondary target region and the background region; and
a process in which an image generation unit generates and outputs, as a target image, only the pixels given the foreground label by the graph cut unit.