JP5979008B2

JP5979008B2 - Image processing apparatus, image processing method, and program

Info

Publication number: JP5979008B2
Application number: JP2013000234A
Authority: JP
Inventors: 美佐子宗
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2013-01-04
Filing date: 2013-01-04
Publication date: 2016-08-24
Anticipated expiration: 2033-01-04
Also published as: JP2014132392A

Description

本発明は、画像処理を行う画像処理装置、画像処理方法及びプログラムに関する。 The present invention relates to an image processing apparatus, an image processing method, and a program for performing image processing.

従来、撮像された画像からパターンを抽出する技術があり、例えば文字抽出を行う様々な技術がある。従来の文字抽出方法は、主に次の３種類に分類される。 Conventionally, there is a technique for extracting a pattern from a captured image. For example, there are various techniques for extracting characters. Conventional character extraction methods are mainly classified into the following three types.

（１）エッジ抽出に基づく方法 (1) Method based on edge extraction
カラー画像から明度のみの濃度画像を作成し、明度変化の大きいところを文字線として抽出する技術がある（例えば特許文献１、非特許文献１参照）。 There is a technique that creates a brightness-only density image from a color image and extracts portions with a large change in brightness as character strokes (see, for example, Patent Document 1 and Non-Patent Document 1).

（２）カラークラスタリングに基づく方法 (2) Method based on color clustering
画像に対し、色クラスタリングを適用し、文字に対応するクラスタを推定して抽出する技術がある（例えば特許文献２〜３参照）。 There is a technique that applies color clustering to an image, then estimates and extracts the clusters corresponding to characters (see, for example, Patent Documents 2 and 3).

（３）色分布の直線近似に基づく方法 (3) Method based on linear approximation of color distribution
カラー画像で出現頻度の高い色を主要色として色空間内で主要色間を線分で結び、各画素を線分に射影した内分点によってどちらの主要色に属するかを決定して文字抽出を行う技術がある（例えば特許文献４参照）。 There is a technique that takes the colors appearing most frequently in a color image as main colors, connects the main colors with line segments in the color space, projects each pixel onto a line segment, and uses the resulting internal division point to decide which main color the pixel belongs to, thereby extracting characters (see, for example, Patent Document 4).

特開２０００−１８１９９２号公報 JP 2000-181992 A
特開２００９−１９９２７６号公報 JP 2009-199276 A
特開２００９−１９１９０６号公報 JP 2009-191906 A
特開２００３−１６４４４号公報 JP 2003-16444 A

Bernsen, J., "Dynamic Thresholding of Grey-Level Images", Proc. of the 8th Int. Conf. on Pattern Recognition, 1986

最近、薬のパッケージをデジタルカメラで撮影して薬名を検索し、効用や副作用を調べたりする薬管理システムを実現するために、画像中から薬名を抽出したいというニーズがある。 Recently, there is a need to extract drug names from images in order to realize a drug management system in which drug packages are photographed with a digital camera, drug names are searched, and the effects and side effects are investigated.

薬のパッケージは、主に格子状の凹凸を持つプラスチック上に金属の薄膜が貼り付けてある場合が多い。よって、このパッケージを撮像して生成された画像は、本来使われている色が文字と背景の２色であっても、影や反射によってテクスチャが生じてしまう。 Medicine packages often consist of a thin metal film laminated onto plastic with lattice-like irregularities. As a result, even if only two colors are actually used, one for the characters and one for the background, an image of such a package contains texture caused by shadows and reflections.

また、デジタルカメラの画像は、例えばＪＰＥＧ（Joint Photographic Experts Group）形式に圧縮されており、色ずれやノイズが発生する。 The digital camera image is compressed in, for example, JPEG (Joint Photographic Experts Group) format, and color shift and noise occur.

図１は、ＪＰＥＧ形式で圧縮された画像の一例を示す図である。図１に示す例は、「Ａ」という文字が記載された所定の材質（例えば金属箔上に凹凸を有する材質とし、材質１と呼ぶ）が撮像された画像を示す。また、所定の材質は薄紫色であり、文字は赤色である。図１に示すように、光が反射し、凹凸のある材質を撮像した画像には、色ずれやノイズが発生する。 FIG. 1 is a diagram illustrating an example of an image compressed in the JPEG format. The example shown in FIG. 1 shows an image in which a predetermined material (for example, a material having irregularities on a metal foil and called material 1) on which the letter “A” is written is captured. Further, the predetermined material is light purple and the characters are red. As shown in FIG. 1, color shift and noise occur in an image in which light is reflected and an uneven material is captured.

一方で、図１に示す画像に対し、従来技術（１）〜（３）を用いて文字抽出を行う。図２は、方法（１）を用いて文字抽出を行った結果を示す図である。図２に示すように、背景のテクスチャ部分が文字として抽出されてしまい、方法（１）では、材質１上の文字が撮像された画像に対して、適切に文字抽出することができない。 On the other hand, character extraction is performed on the image shown in FIG. 1 using conventional techniques (1) to (3). FIG. 2 is a diagram illustrating a result of character extraction using the method (1). As shown in FIG. 2, the texture portion of the background is extracted as a character, and the method (1) cannot appropriately extract a character from an image in which a character on the material 1 is captured.

図３は、方法（２）を用いて文字抽出を行った結果を示す図である。図３に示すように、ＪＰＥＧ圧縮により文字線の一部が濃度化したり、文字及び文字周辺の色相が本来の色相と変わってしまったりすることで、抽出した文字パターンに欠損や変形が生じている。よって、方法（２）では、材質１上の文字が撮像された画像に対して、適切に文字抽出することができない。 FIG. 3 is a diagram illustrating a result of character extraction using method (2). As shown in FIG. 3, JPEG compression turns parts of the character strokes into gray levels and shifts the hue of the characters and their surroundings away from the original hue, so the extracted character pattern has missing parts and deformations. Therefore, method (2) cannot appropriately extract characters from an image of characters on material 1.

次に、材質１に対して、方法（３）を用いて文字抽出を行う場合を説明する。方法（３）は、スキャンやアンチエイリアシングによる画像劣化に対応可能であるが、色ずれや、影、反射により生じたテクスチャには対応できない。 Next, a case where character extraction is performed on the material 1 using the method (3) will be described. Method (3) can deal with image degradation due to scanning and anti-aliasing, but cannot deal with textures caused by color shifts, shadows, and reflections.

図４は、図１に示す画像のＲＧＢ空間における分布画像を示す図である。影、反射により生じたテクスチャや、色ずれが無い場合は、図４に示す実線矢印に沿った近似直線上の分布となるはずである。しかし、テクスチャや色ずれにより分布が大きく変形し、点線矢印に沿った方向に分布が広がっている。この場合は、特に白方向に分布が広がっている。 FIG. 4 is a diagram showing a distribution image in the RGB space of the image shown in FIG. When there is no texture or color shift caused by shadows or reflections, the distribution should be on an approximate line along the solid line arrow shown in FIG. However, the distribution is greatly deformed due to texture and color shift, and the distribution spreads in the direction along the dotted arrow. In this case, the distribution spreads particularly in the white direction.

よって、方法（３）では、材質１上の文字が撮像された画像に対して、適切に文字抽出することができない。 Therefore, in the method (3), it is not possible to appropriately extract characters from an image in which characters on the material 1 are captured.

したがって、材質１上の文字が撮像された画像に対して、従来の文字抽出方法をそのまま適用すると、適切に文字抽出を行うことができないという問題がある。 Therefore, when the conventional character extraction method is applied as it is to an image in which characters on material 1 are captured, there is a problem that character extraction cannot be performed appropriately.

そこで、開示の技術は、文字抽出を適切に行うことができる画像処理装置、画像処理方法及びプログラムを提供することを目的とする。 Accordingly, it is an object of the disclosed technique to provide an image processing apparatus, an image processing method, and a program that can appropriately perform character extraction.

開示の一態様における画像処理装置は、画像を複数のチャンネルに分解し、各チャンネルの濃度画像を生成する分解部と、各濃度画像の濃度分布を２群に分離し、該２群の分離度が最大となる濃度画像を選択する選択部と、選択された濃度画像の濃度の最大値及び最小値を抽出する抽出部と、抽出された最大値及び最小値をシードに設定したグラフカットを用いて、前記選択された濃度画像を２群に分離する分離部と、分離された２群それぞれの画素に対して連結成分解析を行い、何れかの群の画素から画像を生成する画像生成部と、を備える。 An image processing apparatus according to one aspect of the disclosure includes: a decomposition unit that decomposes an image into a plurality of channels and generates a density image for each channel; a selection unit that separates the density distribution of each density image into two groups and selects the density image for which the degree of separation of the two groups is largest; an extraction unit that extracts the maximum and minimum density values of the selected density image; a separation unit that separates the selected density image into two groups using a graph cut in which the extracted maximum and minimum values are set as seeds; and an image generation unit that performs connected component analysis on the pixels of each of the two separated groups and generates an image from the pixels of one of the groups.

開示の技術によれば、文字抽出を適切に行うことができる。 According to the disclosed technique, character extraction can be performed appropriately.

FIG. 1: ＪＰＥＧ形式で圧縮された画像の一例を示す図。 A diagram showing an example of an image compressed in the JPEG format.
FIG. 2: 方法（１）を用いて文字抽出を行った結果を示す図。 A diagram showing the result of character extraction using method (1).
FIG. 3: 方法（２）を用いて文字抽出を行った結果を示す図。 A diagram showing the result of character extraction using method (2).
FIG. 4: 図１に示す画像のＲＧＢ空間における分布画像を示す図。 A diagram showing the distribution in RGB space of the image shown in FIG. 1.
FIG. 5: ３×３の画像の場合のグラフの一例を示す図。 A diagram showing an example of the graph for a 3×3 image.
FIG. 6: 前景と背景のヒストグラムの一例を示す図。 A diagram showing an example of foreground and background histograms.
FIG. 7: 実施例における画像処理装置のハードウェアの一例を示すブロック図。 A block diagram showing an example of the hardware of the image processing apparatus in the embodiment.
FIG. 8: 実施例における画像処理装置の機能の一例を示すブロック図。 A block diagram showing an example of the functions of the image processing apparatus in the embodiment.
FIG. 9: チャンネル分解結果の一例を示す図。 A diagram showing an example of a channel decomposition result.
FIG. 10: 各チャンネルにおける最大分離度を示す図。 A diagram showing the maximum separation degree in each channel.
FIG. 11: 最大分離度チャンネルの濃度画像の濃度の最大値及び最小値の一例を示す図。 A diagram showing an example of the maximum and minimum density values of the density image of the maximum-separation channel.
FIG. 12: 明度ヒストグラムと分離度最大チャンネルの濃度ヒストグラムとの一例を示す図。 A diagram showing an example of the brightness histogram and the density histogram of the maximum-separation channel.
FIG. 13: 着目画素に着目した場合のリンクと重みを説明する図。 A diagram explaining the links and weights for a pixel of interest.
FIG. 14: 生成された２値画像の一例を示す図。 A diagram showing an example of the generated binary image.
FIG. 15: 実施例における画像処理の一例を示すフローチャート。 A flowchart showing an example of image processing in the embodiment.

以下、画像処理装置、画像処理方法及びプログラムの実施例について、添付図面を参照しながら説明する。 Hereinafter, embodiments of an image processing apparatus, an image processing method, and a program will be described with reference to the accompanying drawings.

まず、実施例で用いるグラフカットについて説明する。グラフカットとは、一般的に物体の切り出しを行う技術であり、画像の各画素をグラフのノードとする技術である。例えば、Boycovらの手法は、人手で物体と背景の代表画素（以下、シードとも呼ぶ）、シードを元に物体切出しを行う（例えば、Y. Boycov and G. Funka-lea, "Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images," Proc. Int'l Conf. Computer Vision, vol. l, pp.105-112,July 2001.参照）。Boycovらの手法では、人手でシードを設定しなければならないという欠点を持つ。 First, the graph cut used in the embodiment will be described. Graph cut is a technique generally used to cut out objects, in which each pixel of an image is treated as a node of a graph. For example, in the method of Boycov et al., representative pixels of the object and of the background (hereinafter also called seeds) are specified by hand, and the object is cut out based on the seeds (see, for example, Y. Boycov and G. Funka-lea, "Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images," Proc. Int'l Conf. Computer Vision, vol. l, pp. 105-112, July 2001). The method of Boycov et al. has the drawback that seeds must be set by hand.

一方、シードを人手で設定しない手法としては、シードを自動計算するというHanらの手法がある（例えばDongfeng Han, Wenhui Li, Xiaosuo Lu, Tianzhu Wang, and Yi Wang, "Automatic Segmentation Based on AdaBoost Learning and Graph-Cuts*", ICIAR 2006, LNCS 4141, pp. 215-225, 2006参照）。 On the other hand, as a technique that does not require setting seeds by hand, there is the method of Han et al., which calculates the seeds automatically (see, for example, Dongfeng Han, Wenhui Li, Xiaosuo Lu, Tianzhu Wang, and Yi Wang, "Automatic Segmentation Based on AdaBoost Learning and Graph-Cuts*", ICIAR 2006, LNCS 4141, pp. 215-225, 2006).

Hanらの手法は、予め抽出したい対象を別手法で学習しておき、領域を大まかに推定した後、領域内部と外部の画素をシードとして使用するといったものである。 In the method of Han et al., the target to be extracted is learned in advance by a separate method, the region is roughly estimated, and pixels inside and outside the region are then used as seeds.

しかしながら、切り出し対象を、文字などのパターンにした場合、文字種やデザインによる変形により種類が膨大となる対象物は、学習が困難であり、かつ、誤抽出、未抽出が起きやすいという問題がある。 However, when the object to be cut out is a pattern such as a character, there is a problem that an object whose type is enormous due to deformation due to character type or design is difficult to learn and is likely to be erroneously extracted or not extracted.

また、文字やパターンは、線図形であるため、内部の画素を正しく求めにくい。よって、Boycovらの手法やHanらの手法は、例えば文字と背景の分離には不適切だと考えられる。 In addition, since characters and patterns are line figures, it is difficult to correctly determine internal pixels. Therefore, Boycov's method and Han's method are considered inappropriate for the separation of characters and background, for example.

ここで、薬のパッケージのように、凹凸のある金属の材質上に文字が印刷されている場合で、このパッケージが撮像されると、凹凸による反射や影によって画像の背景にテクスチャが生じ、明度値の変化では、文字パターンを区別できない場合がある。 Here, when characters are printed on an uneven metallic material such as a medicine package and the package is imaged, reflections and shadows from the unevenness produce texture in the image background, and the character pattern may not be distinguishable from changes in brightness value.

また、ＪＰＥＧ圧縮による劣化で、文字線の色が一様でない場合でも、文字パターンを高精度に抽出できない場合がある。 Likewise, when degradation from JPEG compression makes the stroke color non-uniform, the character pattern may not be extracted accurately.

よって、明度変化部分に文字線がある、文字線の色が一様である、画素の色が文字と背景の代表色間を結んだ線分上に分布する、濃淡画像に対してある文字を抽出可能な閾値が必ず存在する、というパターン抽出の前提条件に当てはまらない場合がある。このような前提条件に当てはまらない場合でも、適切にパターン抽出ができるとよい。 Consequently, the preconditions commonly assumed for pattern extraction may not hold: that character strokes lie where brightness changes, that the stroke color is uniform, that pixel colors are distributed along the line segment connecting the representative colors of character and background, and that a grayscale image always has a threshold at which the characters can be extracted. It is desirable that patterns can be extracted appropriately even when these preconditions do not hold.

次に、以下で説明する実施例で用いるグラフカットのセグメンテーションについて、簡単に説明しておく。グラフカットの手法では、画像各画素をノードとし、これに背景（又は前景）ラベルに対応するソースノードと前景（又は背景）ラベルに対応するシンクノードの２つのノードを加えたノードを持つグラフが作成される。 Next, graph cut segmentation as used in the embodiments below is briefly described. In the graph cut technique, a graph is created whose nodes are the pixels of the image plus two additional nodes: a source node corresponding to the background (or foreground) label and a sink node corresponding to the foreground (or background) label.

隣接する画素ノード間には、Ｎリンクと呼ばれるリンクが張ってあり、また、ソースノードと画素ノード間、シンクノードと画素ノード間にはＴリンクと呼ばれるリンクが張ってある。 A link called an N link is extended between adjacent pixel nodes, and a link called a T link is extended between the source node and the pixel node and between the sink node and the pixel node.

図５は、３×３の画像の場合のグラフの一例を示す図である。図５に示すＮリンクには、近傍画素間の状態が等しくなると小さくなるような値をとる平滑化コストが付与される。ここで、従来技術では、画素ノードの一部に、人手または学習によって、背景・前景の正解ラベルを付与する。正解ラベルは、シードとも呼ぶ。 FIG. 5 is a diagram illustrating an example of a graph in the case of a 3 × 3 image. The N link shown in FIG. 5 is given a smoothing cost that takes a value that decreases as the state between neighboring pixels becomes equal. Here, in the prior art, correct labels for the background and the foreground are assigned to a part of the pixel nodes manually or by learning. The correct answer label is also called a seed.

背景の正解ラベルを付与されたものが背景シード、前景の正解ラベルを付与されたものが前景シードである。前述のBoycovらの手法は、これらのシードをユーザが指定するといったものであり、Hanらの手法はこれらのシードを別手法により目的とする物体領域を推定し、領域の内側又は外側から前景シード又は背景シードを求めるといったものである。 Pixels given the correct background label are the background seed, and pixels given the correct foreground label are the foreground seed. In the method of Boycov et al. described above, the user specifies these seeds; in the method of Han et al., the target object region is estimated by a separate method, and the foreground seed or background seed is obtained from inside or outside that region.

シードを求めた後は、シードの特徴とシード以外の画素の特徴との関係からＴリンクに付与するコストが計算される。ＴリンクとＮリンクとのコストの和で定義されたエネルギー関数値が最小となるように、グラフを２つのサブグラフにカットすることで、前景と背景との分離を行うのがグラフカットの手法である。 After the seeds are obtained, the cost assigned to each T link is calculated from the relationship between the features of the seeds and the features of the non-seed pixels. The graph cut technique separates foreground from background by cutting the graph into two subgraphs so that the value of an energy function, defined as the sum of the T-link and N-link costs, is minimized.

上記のエネルギー関数Ｅは、一般的に次の式（１）で定義される。 The energy function E is generally defined by the following equation (1).

ここで、λは定数、Ｒ_ｉがＴリンクに付与されるデータコスト、Ｂ_ｉｊがＮリンクに付与される平滑化コストであり、Ａは背景、前景を表すラベルである。また、Ｄは画像の画素、Ｎは着目画素の近傍を表す。 Here, λ is a constant, R_i is the data cost assigned to a T link, B_ij is the smoothing cost assigned to an N link, and A is the label representing background or foreground. D denotes the pixels of the image, and N denotes the neighborhood of the pixel of interest.
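Equation (1) itself is not reproduced in this text. As a hedged reconstruction only, a form consistent with the symbols defined above and with the usual graph cut formulation might look as follows. The indicator δ and the placement of λ on the data term are assumptions, not taken from the original equation.

```latex
E(A) = \lambda \sum_{i \in D} R_i(A_i)
     + \sum_{(i,j) \in N} B_{ij}\,\delta(A_i, A_j),
\qquad
\delta(A_i, A_j) =
\begin{cases}
1 & \text{if } A_i \neq A_j \\
0 & \text{otherwise}
\end{cases}
```

Under this reading, the first sum collects the T-link (data) costs of the labels chosen for each pixel, and the second sum charges the N-link (smoothing) cost only where neighboring pixels receive different labels.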

シードが何らかの方法で決定された後、以下のようにＲ_ｉが計算される。
（ｉ）前景シードと背景シードとのそれぞれに対応する画素の明度を求め、明度のヒストグラムが求められる。ここで、ヒストグラムの各頻度を、対応するシードの全画素数で割ることで、事後確率Ｐとする。
（ｉｉ）シード以外の画素で、明度がＩであった場合、ヒストグラムを使い、画素が背景だった場合にＩとなる確率Ｐ_ｂと、前景であった場合にＩとなる確率Ｐ_ｆをそれぞれ求める。
（ｉｉｉ）次の式（２）、（３）で示されるように、確率値Ｐのそれぞれの逆数の対数をとることで、ソースノード、シンクノードとのデータコストＲ_ｉが計算される。このデータコストは、確率Ｐが高いほど、コストが小さくなるという意味を持つ。
After the seeds have been determined in some way, R_i is calculated as follows.
(i) The brightness of the pixels belonging to the foreground seed and to the background seed is obtained, and a brightness histogram is built for each. Each histogram frequency is divided by the total number of pixels in the corresponding seed to give the posterior probability P.
(ii) For a non-seed pixel with brightness I, the histograms are used to obtain the probability P_b that a background pixel has brightness I and the probability P_f that a foreground pixel has brightness I.
(iii) As shown in the following equations (2) and (3), the data cost R_i for the links to the source node and to the sink node is calculated as the logarithm of the reciprocal of the respective probability P. The higher the probability P, the smaller this data cost.
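Equations (2) and (3) are not reproduced in this text. Steps (i) to (iii) can nevertheless be sketched as below, reading "logarithm of the reciprocal" as ln(1/P). The function names, the `eps` guard against a zero probability, and the sample seed values are illustrative assumptions, not part of the patent.

```python
import math
from collections import Counter


def seed_probabilities(seed_values):
    """Step (i): normalised brightness histogram of one seed (frequency / seed pixel count)."""
    counts = Counter(seed_values)
    total = len(seed_values)
    return {value: count / total for value, count in counts.items()}


def data_costs(brightness, fg_seed_values, bg_seed_values, eps=1e-6):
    """Steps (ii) and (iii): return (cost_fg, cost_bg) for a non-seed pixel.

    eps is an assumed floor that avoids log(1/0) when a brightness
    never occurs in a seed histogram.
    """
    p_f = seed_probabilities(fg_seed_values).get(brightness, 0.0)
    p_b = seed_probabilities(bg_seed_values).get(brightness, 0.0)
    cost_fg = math.log(1.0 / max(p_f, eps))  # small when P_f is high
    cost_bg = math.log(1.0 / max(p_b, eps))  # small when P_b is high
    return cost_fg, cost_bg


# Hypothetical seeds: the foreground is mostly dark, the background mostly bright.
fg_seed = [10, 10, 12, 10]
bg_seed = [200, 200, 10, 205]
cost_fg, cost_bg = data_costs(10, fg_seed, bg_seed)
```

With these made-up seeds, brightness 10 is more probable under the foreground histogram, so its foreground cost is the smaller of the two.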

図６は、前景と背景のヒストグラムの一例を示す図である。図６に示すように、明度Ｉ_１での確率Ｐ_ｂと確率Ｐ_ｆが同じような値であれば、その明度Ｉ_１の画素について、前景と背景のどちらに振り分けるかの判断が困難になる。 FIG. 6 is a diagram illustrating an example of foreground and background histograms. As shown in FIG. 6, if the probabilities P_b and P_f at brightness I_1 have similar values, it is difficult to decide whether a pixel with brightness I_1 should be assigned to the foreground or the background.

以上を踏まえると、前述のグラフカットを使った手法を、例えば文字抽出に適用する場合、以下のような課題が発生する。
・シードをユーザが手で毎回設定するのは困難である
・数千の文字種あり、かつ多様なフォントがある文字の形状を学習し、正確に領域推定することは困難である（誤抽出、未抽出が発生）
・文字領域が推定できても、線図形であるため前景シードとなる内部の領域を求めることが困難である
・何らかの方法でシードが推定できた場合でも、前景、背景の双方の明度ヒストグラムの重なり部分が大きいと精度が低下する
以上より、シードを手で設定したり、多量なデータを収集して学習したりせずに、データコストを計算する必要がある。そこで、以下に示す実施例では、前述したグラフカットの手法を用いるが、シードをユーザが手で設定せずに、また、学習を必要としないパターン抽出を可能とする方法を実現する。
Based on the above, applying the graph cut technique described earlier to, for example, character extraction raises the following problems.
・ It is difficult for the user to set seeds by hand every time.
・ With thousands of character types and a wide variety of fonts, it is difficult to learn character shapes and estimate regions accurately (erroneous extraction and missed extraction occur).
・ Even if the character region can be estimated, characters are line figures, so it is difficult to find the interior region that would serve as the foreground seed.
・ Even if seeds can be estimated by some method, accuracy drops when the brightness histograms of foreground and background overlap heavily.
From the above, the data cost needs to be calculated without setting seeds by hand or collecting and learning from large amounts of data. The embodiment described below therefore uses the graph cut technique described above, but realizes pattern extraction in which the user does not set seeds by hand and no learning is required.

［実施例］
＜ハードウェア＞
図７は、実施例における画像処理装置１０のハードウェアの一例を示すブロック図である。図７に示す例では、画像処理装置１０は、撮像部２０に接続される。 [Example]
<Hardware>
FIG. 7 is a block diagram illustrating an example of hardware of the image processing apparatus 10 according to the embodiment. In the example illustrated in FIG. 7, the image processing apparatus 10 is connected to the imaging unit 20.

撮像部２０は、例えばカメラであり、例えば、凹凸による反射や影によって画像の背景にテクスチャが生じるような媒体に印刷された文字を撮像する。なお、撮像部２０は、パターンが印刷（又は記載）されたものを撮像すればよい。 The imaging unit 20 is, for example, a camera, and images characters printed on a medium in which a texture is generated in the background of the image due to reflections and shadows due to unevenness, for example. Note that the imaging unit 20 may capture an image on which a pattern is printed (or described).

画像処理装置１０は、例えば撮像部２０により撮像された画像に対し、パターン抽出を行う。画像処理装置１０は、ハードウェアとして、例えば制御部１０１、主記憶部１０２、補助記憶部１０３、通信部１０４、及びドライブ装置１０５を有する。各部は、バスを介して相互にデータ送受信可能に接続されている。 For example, the image processing apparatus 10 performs pattern extraction on an image captured by the imaging unit 20. The image processing apparatus 10 includes, for example, a control unit 101, a main storage unit 102, an auxiliary storage unit 103, a communication unit 104, and a drive device 105 as hardware. Each unit is connected via a bus so that data can be transmitted / received to / from each other.

制御部１０１は、コンピュータの中で、各装置の制御やデータの演算、加工を行うＣＰＵ（Central Processing Unit）である。また、制御部１０１は、主記憶部１０２や補助記憶部１０３に記憶されたプログラムを実行する演算装置であり、入力装置や記憶装置からデータを受け取り、演算、加工した上で、出力装置や記憶装置に出力する。 The control unit 101 is a CPU (Central Processing Unit) that controls each device and performs data calculation and processing in the computer. The control unit 101 is also an arithmetic unit that executes programs stored in the main storage unit 102 or the auxiliary storage unit 103: it receives data from an input device or a storage device, calculates and processes the data, and outputs the result to an output device or a storage device.

主記憶部１０２は、例えば、ＲＯＭ（Read Only Memory）やＲＡＭ（Random Access Memory）などである。主記憶部１０２は、制御部１０１が実行する基本ソフトウェアであるＯＳやアプリケーションソフトウェアなどのプログラムやデータを記憶又は一時保存する記憶装置である。 The main storage unit 102 is, for example, a ROM (Read Only Memory) or a RAM (Random Access Memory). The main storage unit 102 is a storage device that stores or temporarily stores programs and data such as an OS and application software that are basic software executed by the control unit 101.

補助記憶部１０３は、例えばＨＤＤ（Hard Disk Drive）などであり、アプリケーションソフトウェアなどに関連するデータを記憶する記憶装置である。補助記憶部１０３は、例えば撮像部２０から取得した画像などを記憶する。 The auxiliary storage unit 103 is, for example, an HDD (Hard Disk Drive) or the like, and is a storage device that stores data related to application software. The auxiliary storage unit 103 stores, for example, an image acquired from the imaging unit 20.

通信部１０４は、有線又は無線で周辺機器とデータ通信を行う。通信部１０４は、例えばネットワークを介して、パターンを含む画像を取得し、補助記憶部１０３に記憶する。 The communication unit 104 performs data communication with peripheral devices by wire or wireless. The communication unit 104 acquires an image including a pattern via a network, for example, and stores it in the auxiliary storage unit 103.

ドライブ装置１０５は、記録媒体１０６、例えばフレキシブルディスクやＣＤ（Compact Disc）から所定のプログラムを読み出し、記憶装置にインストールする。 The drive device 105 reads a predetermined program from the recording medium 106, for example, a flexible disk or a CD (Compact Disc), and installs it in the storage device.

また、記録媒体１０６に、後述する画像処理プログラムを格納し、この記録媒体１０６に格納されたプログラムは、ドライブ装置１０５を介して画像処理装置１０にインストールされる。インストールされた画像処理プログラムは、画像処理装置１０により実行可能となる。 Further, an image processing program to be described later is stored in the recording medium 106, and the program stored in the recording medium 106 is installed in the image processing apparatus 10 via the drive device 105. The installed image processing program can be executed by the image processing apparatus 10.

なお、画像処理装置１０は、撮像部２０を別構成としたが、同じ装置内に設けてもよく、また、表示部などを設けてもよい。 In addition, although the image processing apparatus 10 has the imaging unit 20 as a separate configuration, it may be provided in the same apparatus, or may be provided with a display unit or the like.

＜機能＞
図８は、実施例における画像処理装置１０の機能の一例を示すブロック図である。図８に示す例では、画像処理装置１０は、画像入力部２０１、分解部２０２、選択部２０３、抽出部２０４、分離部２０５、及び画像生成部２０６を有する。 <Function>
FIG. 8 is a block diagram illustrating an example of functions of the image processing apparatus 10 according to the embodiment. In the example illustrated in FIG. 8, the image processing apparatus 10 includes an image input unit 201, a decomposition unit 202, a selection unit 203, an extraction unit 204, a separation unit 205, and an image generation unit 206.

図８に示す各部は、例えば制御部１０１が、補助記憶部１０３に記憶される画像処理プログラム（抽出処理プログラム）を主記憶部１０２にロードし、実行することで機能する。つまり、図８に示す各部は、例えば制御部１０１及びワークメモリとしての主記憶部１０２により実現されうる。また、図８に示す各部は、例えばハードウェア的に実現されてもよい。以下では、パターン抽出として、文字抽出を例にして説明する。 Each unit illustrated in FIG. 8 functions when, for example, the control unit 101 loads an image processing program (extraction processing program) stored in the auxiliary storage unit 103 into the main storage unit 102 and executes it. That is, each unit illustrated in FIG. 8 can be realized by the control unit 101 and the main storage unit 102 as a work memory, for example. Further, each unit illustrated in FIG. 8 may be realized by hardware, for example. Hereinafter, character extraction will be described as an example of pattern extraction.

画像入力部２０１は、例えば撮像部２０により撮像された画像を入力する。入力される画像は、カラー画像でもよい。画像入力部２０１は、入力された画像を主記憶部１０２に記憶する。 The image input unit 201 inputs an image captured by the imaging unit 20, for example. The input image may be a color image. The image input unit 201 stores the input image in the main storage unit 102.

分解部２０２は、画像入力部２０１が入力した画像を複数のチャンネルに分解し、各チャンネルの濃度画像を生成する。分解部２０２は、画像の座標値の線形結合で複数の濃度画像を生成してもよい。分解部２０２は、生成した各濃度画像を選択部２０３に出力する。 The decomposition unit 202 decomposes the image input by the image input unit 201 into a plurality of channels, and generates a density image of each channel. The decomposition unit 202 may generate a plurality of density images by linear combination of image coordinate values. The decomposition unit 202 outputs each generated density image to the selection unit 203.

選択部２０３は、分解部２０２により生成された各濃度画像の濃度分布を２群に分離し、この２群の分離度が最大となる濃度画像を選択する。選択部２０３は、この処理を行うために、ヒストグラム計算部２３１、及びチャンネル選択部２３２を有する。 The selection unit 203 separates the density distribution of each density image generated by the decomposition unit 202 into two groups, and selects a density image that maximizes the degree of separation of the two groups. The selection unit 203 includes a histogram calculation unit 231 and a channel selection unit 232 for performing this process.

ヒストグラム計算部２３１は、各チャンネルの各濃度画像から濃度ヒストグラムを生成する。ヒストグラム計算部２３１は、計算した各濃度ヒストグラムをチャンネル選択部２３２に出力する。 The histogram calculation unit 231 generates a density histogram from each density image of each channel. The histogram calculation unit 231 outputs the calculated density histograms to the channel selection unit 232.

チャンネル選択部２３２は、各濃度ヒストグラムに対し、例えば、各濃度値で画素を２群に分離した場合の群間分散を群内分散で除算した分離度（分離度Ｓ＝群間分散／群内分散）が最大となる濃度で２群に分離する。チャンネル選択部２３２は、各濃度画像の最大の分離度Ｓのうち、一番大きい分離度Ｓを有するチャンネルの濃度画像を選択する。チャンネルと濃度画像は１対１に対応するため、チャンネル選択部２３２は、チャンネルを選択するようにしてもよい。選択部２０３は、選択された濃度画像を抽出部２０４に出力する。 For each density histogram, the channel selection unit 232 separates the pixels into two groups at the density that maximizes the separation degree S, where S is, for example, the between-group variance divided by the within-group variance when the pixels are split into two groups at a given density value (separation degree S = between-group variance / within-group variance). Among the maximum separation degrees S of the density images, the channel selection unit 232 selects the density image of the channel with the largest S. Because channels and density images correspond one to one, the channel selection unit 232 may instead select the channel. The selection unit 203 outputs the selected density image to the extraction unit 204.
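The selection step can be illustrated with a minimal sketch, assuming 8-bit density values held in flat Python lists. The function names `max_separation` and `select_channel` and the sample channel data are hypothetical. The within-group variance is obtained as total variance minus between-group variance, which is the standard identity and not something stated in the text.

```python
def max_separation(values, levels=256):
    """Return (largest separation degree S, threshold t) for one density image.

    Pixels with value <= t form one group and the rest form the other;
    S = between-group variance / within-group variance.
    """
    n = len(values)
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total_sum = sum(level * count for level, count in enumerate(hist))
    total_sq = sum(level * level * count for level, count in enumerate(hist))
    mean = total_sum / n
    total_var = total_sq / n - mean * mean

    best_s, best_t = 0.0, None
    n0, sum0 = 0, 0
    for t in range(levels - 1):
        n0 += hist[t]
        sum0 += t * hist[t]
        n1 = n - n0
        if n0 == 0 or n1 == 0:
            continue  # one group is empty, so no split at this density
        m0 = sum0 / n0
        m1 = (total_sum - sum0) / n1
        between = (n0 / n) * (n1 / n) * (m0 - m1) ** 2
        within = total_var - between
        s = between / within if within > 0 else float("inf")
        if s > best_s:
            best_s, best_t = s, t
    return best_s, best_t


def select_channel(channels):
    """channels: dict name -> flat list of densities. Return the name with the largest S."""
    return max(channels, key=lambda name: max_separation(channels[name])[0])


# Hypothetical data: "R" is clearly bimodal, "I" is spread evenly.
channels = {
    "R": [10, 11, 10, 12, 200, 201, 199, 202],
    "I": [100, 110, 120, 130, 140, 150, 160, 170],
}
chosen = select_channel(channels)
```

In this toy data the bimodal channel has a far larger separation degree than the evenly spread one, so it is the one selected.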

抽出部２０４は、選択部２０３により選択された濃度画像の濃度の最大値及び最小値を抽出する。抽出部２０４は、抽出された最大値及び最小値を分離部２０５に出力する。 The extraction unit 204 extracts the maximum value and the minimum value of the density of the density image selected by the selection unit 203. The extraction unit 204 outputs the extracted maximum value and minimum value to the separation unit 205.
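As a minimal sketch of this extraction step, the code below finds the minimum and maximum density of a density image and lists the pixels that hold those values. Treating those pixels as the seed pixels is an assumption made for illustration, and this passage does not say which extreme corresponds to foreground and which to background. The name `extract_seeds` is hypothetical.

```python
def extract_seeds(density):
    """density: 2D list of density values.

    Returns (min_value, max_value, min_pixels, max_pixels), where the pixel
    lists hold (row, column) coordinates of the extreme values.
    """
    flat = [v for row in density for v in row]
    lo, hi = min(flat), max(flat)
    lo_pixels = [(y, x) for y, row in enumerate(density)
                 for x, v in enumerate(row) if v == lo]
    hi_pixels = [(y, x) for y, row in enumerate(density)
                 for x, v in enumerate(row) if v == hi]
    return lo, hi, lo_pixels, hi_pixels


# Hypothetical 2x2 density image.
lo, hi, lo_pixels, hi_pixels = extract_seeds([[5, 200], [5, 120]])
```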

分離部２０５は、抽出された最大値及び最小値をシードに設定したグラフカットを用いて、選択された濃度画像を２群に分離する。分離部２０５は、グラフ生成部２５１、及びグラフカット部２５２を有する。 The separation unit 205 separates the selected density image into two groups using a graph cut in which the extracted maximum value and minimum value are set as seeds. The separation unit 205 includes a graph generation unit 251 and a graph cut unit 252.

グラフ生成部２５１は、抽出された最大値及び最小値をシードに設定したグラフを計算し、生成する。グラフ生成部２５１は、計算したグラフをグラフカット部２５２に出力する。 The graph generation unit 251 calculates and generates a graph in which the extracted maximum value and minimum value are set as seeds. The graph generation unit 251 outputs the calculated graph to the graph cut unit 252.

グラフカット部２５２は、グラフ生成部２５１により生成されたグラフに対し、式（１）で与えられるコストの総和が最小となるようにグラフを２つのサブグラフにカットする。グラフカット部２５２は、カットされたサブグラフに対応する画素を、それぞれ群１と、群２とに分離する。 The graph cut unit 252 cuts the graph into two subgraphs with respect to the graph generated by the graph generation unit 251 so that the sum of the costs given by Expression (1) is minimized. The graph cut unit 252 separates pixels corresponding to the cut subgraphs into groups 1 and 2, respectively.

分離部２０５は、群１と群２それぞれの群に分離された画素を、画像生成部２０６に出力する。 The separation unit 205 outputs the pixels separated into the groups 1 and 2 to the image generation unit 206.
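To make the cut concrete, here is a small sketch that splits a graph of pixel nodes into two groups by a minimum cut. It uses a plain Edmonds-Karp max-flow search, chosen only because it is short; the text does not say which min-cut algorithm the apparatus uses. The function name and all capacities in the example are made up for illustration.

```python
from collections import deque


def min_cut_labels(t_source, t_sink, n_links):
    """Return the set of pixel indices that stay on the source side of a minimum cut.

    t_source[i], t_sink[i]: T-link capacities between pixel i and source / sink.
    n_links: dict {(i, j): capacity} for neighbouring pixels (undirected N links).
    """
    n = len(t_source)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # residual edge

    for i in range(n):
        add_edge(s, i, t_source[i])
        add_edge(i, t, t_sink[i])
    for (i, j), c in n_links.items():
        add_edge(i, j, c)
        add_edge(j, i, c)

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no path left: parent now holds the source side of the cut
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
    return {i for i in parent if i < n}


# Hypothetical 1x4 strip: pixels 0-1 lean towards the source, pixels 2-3 towards the sink.
source_side = min_cut_labels(
    t_source=[9, 8, 1, 0],
    t_sink=[0, 1, 8, 9],
    n_links={(0, 1): 5, (1, 2): 1, (2, 3): 5},
)
```

For this strip the cheapest cut severs the weak N link between pixels 1 and 2 together with the three low-capacity T links, which leaves pixels 0 and 1 in one group and pixels 2 and 3 in the other.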

画像生成部２０６は、分離された２群それぞれの画素に対して連結成分解析を行い、何れかの群の画素から画像を生成する。画像生成部２０６は、連結成分解析により連結された何れかの群の部分が、文字を抽出するための条件を満たす場合、この条件を満たす部分から文字画像を生成する。 The image generation unit 206 performs a connected component analysis on each pixel of the two separated groups, and generates an image from any group of pixels. When any part of the group connected by the connected component analysis satisfies a condition for extracting a character, the image generation unit 206 generates a character image from the part that satisfies the condition.

文字を抽出するための条件とは、例えば、連結成分の縦横比やサイズなどで判定することができる。この条件は、実験などにより文字らしいサイズや縦横比を調べ、予め設定しておけばよい。 The condition for extracting characters can be determined by, for example, the aspect ratio and size of the connected component. This condition may be set in advance by examining the character-like size and aspect ratio by experiments or the like.
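The connected component analysis and the size and aspect-ratio check can be sketched as follows. The 4-connectivity and the thresholds `min_size` and `max_aspect` are placeholder assumptions; the text only says that suitable values would be determined in advance by experiment.

```python
from collections import deque


def connected_components(mask):
    """Return the 4-connected components of truthy cells as lists of (row, column)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            component = []
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components


def looks_like_character(component, min_size=3, max_aspect=4.0):
    """Placeholder test on pixel count and bounding-box aspect ratio."""
    ys = [p[0] for p in component]
    xs = [p[1] for p in component]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    aspect = max(width, height) / min(width, height)
    return len(component) >= min_size and aspect <= max_aspect


# Hypothetical group mask: a 2x2 blob, an isolated pixel, and a long thin line.
mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
components = connected_components(mask)
flags = [looks_like_character(c) for c in components]
```

With these placeholder thresholds, only the 2x2 blob passes: the isolated pixel is too small and the line is too elongated.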

なお、連結成分解析の結果、文字らしい連結成分が２群のうちいずれからも得られない場合、文字と背景の分離が失敗した可能性があるため、画像処理装置１０は、次に大きい分離度を持つ濃度画像に対して、シード設定、グラフ生成、グラフカットを行えばよい。 As a result of the connected component analysis, if a connected component that seems to be a character is not obtained from any of the two groups, there is a possibility that the separation of the character and the background may have failed. It is only necessary to perform seed setting, graph generation, and graph cut on the density image having.

画像生成部２０６により生成された文字画像は、例えばＯＣＲ（Optical Character Recognition）処理を行って文字データとして管理されてもよい。 The character image generated by the image generation unit 206 may be managed as character data by performing, for example, OCR (Optical Character Recognition) processing.

＜各処理＞
次に、画像処理装置１０のパターン抽出、例えば文字抽出や、画像生成などの具体例について説明する。 <Each process>
Next, specific examples of pattern extraction of the image processing apparatus 10 such as character extraction and image generation will be described.

（チャンネル分解処理）
分解部２０２は、図１に示す画像に対して、分解処理を行う。分解部２０２は、例えば図１に示す画像を複数のチャンネルに分解し、各チャンネルに対応する濃度画像を生成する。なお、図１に示す画像は、実際はカラー画像であり、背景が薄紫色で、文字が赤色である。 (Channel decomposition processing)
The decomposition unit 202 performs a decomposition process on the image shown in FIG. For example, the decomposition unit 202 decomposes the image shown in FIG. 1 into a plurality of channels, and generates a density image corresponding to each channel. Note that the image shown in FIG. 1 is actually a color image, the background is light purple, and the characters are red.

例えば、分解部２０２は、Ｉチャンネル、Ｒチャンネル、Ｇチャンネル、Ｂチャンネルの４チャンネルに分解する。なお、分解するチャンネル数は４つに限られない。 For example, the disassembling unit 202 disassembles into four channels of I channel, R channel, G channel, and B channel. Note that the number of channels to be decomposed is not limited to four.

図９は、チャンネル分解結果の一例を示す図である。図９に示すように、分解部２０２により、４つのチャンネルの濃度画像に分解される。なお、画像がグレイ画像の場合、画素値は明度値を表すことが多い。また、ＲＧＢの３色のカラー画素値は、例えば明度値=0.299×R＋0.587×G＋0.114×Bの変換式を用いて明度値へ変換することができる。 FIG. 9 is a diagram illustrating an example of the channel decomposition result. As shown in FIG. 9, the decomposition unit 202 decomposes the image into density images of four channels. When the image is a gray image, the pixel value often represents a lightness value. The three RGB color pixel values can be converted into a lightness value using, for example, the conversion formula lightness = 0.299 × R + 0.587 × G + 0.114 × B.
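As a minimal sketch of the decomposition step, the following splits an RGB image into I, R, G, and B density images using the lightness formula quoted above. The function name `decompose_channels` and the list-of-tuples image layout are assumptions made for illustration only.

```python
def decompose_channels(pixels):
    """Split an RGB image into I, R, G, B density images.

    pixels: list of rows, each row a list of (R, G, B) tuples in 0..255.
    Returns a dict mapping channel name to a 2-D list of integer densities.
    """
    channels = {"I": [], "R": [], "G": [], "B": []}
    for row in pixels:
        i_row, r_row, g_row, b_row = [], [], [], []
        for r, g, b in row:
            # Lightness conversion quoted in the text.
            i_row.append(int(round(0.299 * r + 0.587 * g + 0.114 * b)))
            r_row.append(r)
            g_row.append(g)
            b_row.append(b)
        channels["I"].append(i_row)
        channels["R"].append(r_row)
        channels["G"].append(g_row)
        channels["B"].append(b_row)
    return channels
```

Each returned density image has the same height and width as the input, so the later steps can treat every channel uniformly.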

（選択処理） (Selection processing)
選択部２０３は、各濃度画像に対して、濃度分布（例えば濃度ヒストグラム）を計算し、チャンネル別に２群の分離度が最大となる最大分離度を計算する。次に、選択部２０３は、得られた濃度ヒストグラムによってチャンネル毎に２群の最大分離度を計算し、得られた最大分離度が最大となるチャンネルを決定する。このチャンネルを最大分離度チャンネルとも呼ぶ。 The selection unit 203 calculates a density distribution (for example, a density histogram) for each density image and calculates, for each channel, the maximum degree of separation, that is, the largest value of the degree of separation between two groups. The selection unit 203 then determines the channel whose maximum degree of separation is the largest. This channel is also called the maximum separation degree channel.

選択部２０３は、２群の分離度について、例えば公知の大津の方式を用いればよい（例えば、大津展之、"判別および最小２乗基準に基づく自動しきい値選定法"、電子通信学会論文誌 80/4 Vol.J63-D No.4参照）。 For the degree of separation of the two groups, the selection unit 203 may use, for example, the well-known Otsu method (see, for example, Nobuyuki Otsu, "An Automatic Threshold Selection Method Based on Discriminant and Least Squares Criteria", Transactions of the Institute of Electronics and Communication Engineers of Japan, 80/4, Vol. J63-D, No. 4).

例えば、選択部２０３は、着目する濃度画像の濃度ヒストグラムに対し、各濃度値で画素を２群に分離した場合の群間分散／群内分散を２群の分離度とし、２群の分離度が最大となる濃度を濃度閾値とする。 For example, for the density histogram of the density image of interest, the selection unit 203 takes the ratio of the between-group variance to the within-group variance, obtained when the pixels are separated into two groups at each density value, as the degree of separation of the two groups, and takes the density at which this degree of separation is maximized as the density threshold.

選択部２０３は、濃度ヒストグラムを濃度閾値で２群に分離した場合の群間分散／群内分散を、対象とするチャンネルの最大分離度Ｓとする。選択部２０３は、各チャンネルで最大分離度Ｓを求め、Ｓが最大となるチャンネルを最大分離度チャンネルと判定する。 The selection unit 203 takes the ratio of the between-group variance to the within-group variance, obtained when the density histogram is separated into two groups at the density threshold, as the maximum degree of separation S of the target channel. The selection unit 203 obtains S for each channel and determines the channel with the largest S to be the maximum separation degree channel.
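The selection step can be sketched as follows: for each channel, every density value is tried as a threshold, the ratio of between-group to within-group variance is computed, and the channel with the largest resulting S is chosen. The names `max_separation` and `select_channel` are hypothetical, each channel is assumed to be given as a flat list of integer densities, and returning infinity when the within-group variance is zero is a choice made for this sketch rather than something the text specifies.

```python
def max_separation(values, levels=256):
    """Return (S, threshold): the largest between/within variance ratio."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    best_s, best_t = 0.0, None
    for t in range(levels - 1):
        n0 = sum(hist[: t + 1])          # pixels with density <= t
        n1 = total - n0                  # pixels with density > t
        if n0 == 0 or n1 == 0:
            continue
        m0 = sum(k * hist[k] for k in range(t + 1)) / n0
        m1 = sum(k * hist[k] for k in range(t + 1, levels)) / n1
        v0 = sum(hist[k] * (k - m0) ** 2 for k in range(t + 1)) / n0
        v1 = sum(hist[k] * (k - m1) ** 2 for k in range(t + 1, levels)) / n1
        w0, w1 = n0 / total, n1 / total
        between = w0 * w1 * (m0 - m1) ** 2
        within = w0 * v0 + w1 * v1
        # Assumption: treat a zero within-group variance as perfect separation.
        s = between / within if within > 0 else float("inf")
        if s > best_s:
            best_s, best_t = s, t
    return best_s, best_t


def select_channel(channels):
    """channels: dict name -> flat list of densities. Returns (name, S)."""
    scores = {name: max_separation(vals)[0] for name, vals in channels.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

This brute-force version recomputes the group statistics for every threshold, which keeps the correspondence with the text obvious; a running-sum formulation would be faster on large images.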

図１０は、各チャンネルにおける最大分離度を示す図である。図１０に示す例では、選択部２０３は、図９に示す各チャンネルの濃度画像に対して最大分離度を求めている。図１０に示す例では、Ｂチャンネルが最大分離度チャンネルとなる。選択部２０３は、最大分離度チャンネルの濃度画像を抽出部２０４に出力する。 FIG. 10 is a diagram showing the maximum degree of separation in each channel. In the example shown in FIG. 10, the selection unit 203 obtains the maximum degree of separation for the density image of each channel shown in FIG. 9. In this example, the B channel is the maximum separation degree channel. The selection unit 203 outputs the density image of the maximum separation degree channel to the extraction unit 204.

（抽出処理） (Extraction processing)
抽出部２０４は、最大分離度チャンネルの濃度画像に対し、濃度の最大値ｇ_１及び最小値ｇ_０を抽出する。 The extraction unit 204 extracts the maximum density value g_1 and the minimum density value g_0 from the density image of the maximum separation degree channel.

図１１は、最大分離度チャンネルの濃度画像の濃度の最大値及び最小値の一例を示す図である。図１１に示すように、抽出部２０４は、最大分離度チャンネルの濃度画像の最大値をｇ_１とし、最小値をｇ_０とする。抽出部２０４は、抽出した最大値及び最小値を分離部２０５に出力する。 FIG. 11 is a diagram showing an example of the maximum and minimum density values of the density image of the maximum separation degree channel. As shown in FIG. 11, the extraction unit 204 sets the maximum value of the density image of the maximum separation degree channel to g_1 and the minimum value to g_0. The extraction unit 204 outputs the extracted maximum and minimum values to the separation unit 205.

（グラフ生成処理） (Graph generation processing)
グラフ生成部２５１は、群１、群２に対応するノードでそれぞれ終了ノードをシンクノード、開始ノードをソースノードとし、画像画素をその他ノードとしてグラフを作成する。グラフ生成部２５１は、その他ノード間で、着目画素とその近傍にある画素のノード間にリンクを張ってＮリンクとする。 The graph generation unit 251 creates a graph in which the nodes corresponding to group 1 and group 2 are the terminal nodes, the end node being the sink node and the start node being the source node, and the image pixels are the other nodes. Among the other nodes, the graph generation unit 251 links the node of the pixel of interest to the nodes of its neighboring pixels; these links are the N-links.

グラフ生成部２５１は、その他ノードとシンクノード、その他ノードとソースノード間にそれぞれリンクを張ってＴリンクとする。グラフ生成部２５１は、Ｎリンクに対し、公知の手法で平滑化コストを付与する。平滑化コストは、近傍同志の画素の状態（特徴の値など）が類似のときは小さくなり、異なる場合は大きくなるようなコストである。 The graph generation unit 251 links each of the other nodes to the sink node and to the source node; these links are the T-links. The graph generation unit 251 assigns a smoothing cost to the N-links by a known method. The smoothing cost is a cost that is small when the states (such as feature values) of neighboring pixels are similar and large when they differ.

例えば、グラフ生成部２５１は、着目画素の濃度値をｇとしたとき、ｇと最大値ｇ_１との距離をｄ_１とし、ｇと最小値ｇ_０との距離をｄ_０とする。このとき、グラフ生成部２５１は、シンクノード、ソースノードと対象画素間のＴリンクのデータコストを式（４）、式（５）に従って計算する。ここで、ｆは関数を表す。 For example, when the density value of the pixel of interest is g, the graph generation unit 251 sets d_1 to the distance between g and the maximum value g_1 and d_0 to the distance between g and the minimum value g_0. The graph generation unit 251 then calculates the data costs of the T-links between the target pixel and the sink node and source node according to equations (4) and (5). Here, f represents a function.

図１２は、明度ヒストグラムと分離度最大チャンネルの濃度ヒストグラムとの一例を示す図である。図１２に示すように、同じ画像であっても、明度ヒストグラムを用いた場合は、ある明度の画素に対して、どちらの群に含めればよいか分からないときがある。一方、分離度最大チャンネルの濃度ヒストグラムは、そもそも２群の分離度が最大となるヒストグラムであるため、ある濃度の画素をいずれかの群に含めるかを容易に決定することができる。群１、群２は、前景、背景のいずれかに対応する。 FIG. 12 is a diagram illustrating an example of a lightness histogram and the density histogram of the maximum separation degree channel. As shown in FIG. 12, even for the same image, when the lightness histogram is used it is sometimes unclear which group a pixel of a given lightness should belong to. In contrast, the density histogram of the maximum separation degree channel is by construction the histogram in which the two groups are best separated, so it is easy to decide which group a pixel of a given density belongs to. Each of group 1 and group 2 corresponds to either the foreground or the background.

ここで、グラフ生成部２５１は、Ｔリンクとしての式（４）、式（５）に対して、最も簡単な式（６）、式（７）を用いてもよい。 Here, in place of equations (4) and (5), the graph generation unit 251 may use the simplest forms, equations (6) and (7), for the T-links.

ここで、Ｌは、濃度がとるレンジを表す。 Here, L represents the range that the density takes.
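Equations (4) to (7) are not reproduced in this text, so the following is only an assumed stand-in for a T-link data cost: the distance from the pixel density to each seed density, passed through a function f and normalized by the density range L. The name `t_link_costs`, the default identity f, and the default L = 255 are all assumptions, and which terminal each cost is attached to depends on the graph convention used.

```python
def t_link_costs(g, g0, g1, density_range=255, f=lambda d: d):
    """Return (cost of assigning g to the g_0 group, cost for the g_1 group).

    g: density of the pixel of interest.
    g0, g1: minimum and maximum densities used as seeds.
    density_range: L, the range the density can take (assumed 255 here).
    f: hypothetical monotone function of the distance; identity by default.
    """
    d0 = abs(g - g0)  # distance to the minimum density g_0
    d1 = abs(g - g1)  # distance to the maximum density g_1
    cost_g0 = f(d0) / density_range
    cost_g1 = f(d1) / density_range
    return cost_g0, cost_g1
```

A pixel whose density equals g_0 gets zero cost for the g_0 group and a cost proportional to g_1 − g_0 for the other group, so no manual seed labeling is needed.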

（グラフカット処理） (Graph cut processing)
グラフカット部２５２は、生成されたグラフに対し、例えば式（１）によりコストの総和が最小となるように、グラフを２つのサブグラフにカットする。 The graph cut unit 252 cuts the generated graph into two subgraphs so that the total cost given by, for example, equation (1) is minimized.

グラフカット部２５２は、最も簡単な例として、式（８）により得られるエネルギーを求めてもよい。 As the simplest example, the graph cut unit 252 may compute the energy given by equation (8).

図１３は、着目画素に着目した場合のリンクと重みを説明する図である。図１３に示す例は、式（９）、式（１０）の一例を示す。図１３に示す例では、式（１）で、Ａ_ｉに分離度最大チャンネルの濃度を使用する。ｉは、画像中の画素の番号である。図１３に示すｓノードは、ソースノードを表し、ｔノードはシンクノードを表す。また、ｎ１〜ｎ４のノードは、近傍画素（周辺画素）を表す。 FIG. 13 is a diagram illustrating the links and weights around a pixel of interest. The example shown in FIG. 13 illustrates equations (9) and (10). In this example, the density of the maximum separation degree channel is used for A_i in equation (1), where i is the index of a pixel in the image. The s node shown in FIG. 13 represents the source node, and the t node represents the sink node. The nodes n1 to n4 represent neighboring (peripheral) pixels.

Ｔリンクの重みは、着目画素の濃度と真の値として選んだ濃度ｇとの差を表し、Ｎリンクの重みは、着目画素と近傍画素の濃度との差を表す。これらの差が大きくなるほど、エネルギーが大きくなるように設定される。 The T-link weight represents the difference between the density of the pixel of interest and the density g chosen as its true value, and the N-link weight represents the difference between the densities of the pixel of interest and a neighboring pixel. The weights are set so that the energy increases as these differences increase.

λは、実験などにより、適切な値が設定され、Ｔリンクの重みと、Ｎリンクの重みの比率に対応する。 λ is set to an appropriate value through experiments or the like, and corresponds to the ratio of the T link weight and the N link weight.

グラフカット部２５２は、画像内の画素について、いずれのサブグラフに対応するかで群１と群２とに分離する。グラフカット部２５２は、群１に含まれる画素と、群２に含まれる画素とを、画像生成部２０６に出力する。 The graph cut unit 252 separates the pixels in the image into group 1 and group 2 according to which subgraph each pixel belongs to. The graph cut unit 252 outputs the pixels included in group 1 and the pixels included in group 2 to the image generation unit 206.
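The two-group separation can be illustrated with a small s-t minimum cut. This sketch makes several assumptions that are not taken from the patent: T-link costs are the raw distances |g − g_0| and |g − g_1|, the N-link cost is a constant penalty `lam` between 4-connected neighbors (a Potts-style smoothing term), the source side is taken as the g_0 group, and the cut is found with a plain Edmonds-Karp max-flow. The function name `graph_cut_two_groups` is hypothetical.

```python
from collections import deque


def graph_cut_two_groups(image, lam=50):
    """Label each pixel 0 (g_0 group) or 1 (g_1 group) by an s-t minimum cut."""
    h, w = len(image), len(image[0])
    flat = [v for row in image for v in row]
    g0, g1 = min(flat), max(flat)        # seeds taken from the image itself
    n = h * w
    s, t = n, n + 1                      # source and sink node ids
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c_uv, c_vu):
        cap[u][v] = cap[u].get(v, 0) + c_uv
        cap[v][u] = cap[v].get(u, 0) + c_vu

    for p, g in enumerate(flat):
        add_edge(s, p, g1 - g, 0)        # paid if p ends up in the g_1 group
        add_edge(p, t, g - g0, 0)        # paid if p ends up in the g_0 group
        y, x = divmod(p, w)
        if x + 1 < w:
            add_edge(p, p + 1, lam, lam)  # N-link to the right neighbor
        if y + 1 < h:
            add_edge(p, p + w, lam, lam)  # N-link to the lower neighbor

    def find_path():
        """Breadth-first search for a shortest augmenting path s -> t."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    queue.append(v)
        return None

    while True:
        parent = find_path()
        if parent is None:
            break
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Pixels still reachable from the source stay on the g_0 side.
    reachable, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in reachable:
                reachable.add(v)
                queue.append(v)
    return [[0 if y * w + x in reachable else 1 for x in range(w)]
            for y in range(h)]
```

With `lam=0` each pixel is simply assigned to the nearer seed density; increasing `lam` lets the neighborhood override isolated mid-gray pixels, which is the effect the smoothing cost is meant to provide. Integer capacities are used so that the max-flow loop terminates without floating-point tolerances.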

（画像生成処理） (Image generation processing)
画像生成部２０６は、得られた２群の画素を連結成分解析によって、いずれかが前景かを判定し、前景画素から文字画像を生成する。さらに、画像生成部２０６は、生成した文字画像を文字認識に使用するために、文字部分を黒、背景を白とした２値画像を生成する。 The image generation unit 206 applies connected component analysis to the two obtained groups of pixels to determine which group is the foreground, and generates a character image from the foreground pixels. Furthermore, so that the generated character image can be used for character recognition, the image generation unit 206 generates a binary image in which the character portion is black and the background is white.

また、画像生成部２０６は、用途に応じてカラー画像を生成してもよい。図１４は、生成された２値画像の一例を示す図である。図１４に示す例は、Ｂチャンネルの濃度画像を用いてグラフカットを行い、連結成分解析を行った結果である。図１４に示すように、図２や図３と比較して、適切に文字抽出ができている。 The image generation unit 206 may also generate a color image depending on the application. FIG. 14 is a diagram illustrating an example of the generated binary image. The example shown in FIG. 14 is the result of performing a graph cut on the density image of the B channel and then performing connected component analysis. As shown in FIG. 14, the characters are extracted appropriately compared with FIG. 2 and FIG. 3.
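The foreground decision and binary image generation can be sketched as below. The text only says that character-likeness is judged from properties such as the size and aspect ratio of connected components, with thresholds set by experiment, so the specific limits here (`min_size`, `max_size`, `max_aspect`), the 4-connectivity, and the rule of choosing the group with more character-like components are all assumptions for illustration, as are the function names.

```python
def connected_components(mask):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected True regions."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            stack = [(x, y)]
            x0 = x1 = x
            y0 = y1 = y
            while stack:
                cx, cy = stack.pop()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def looks_like_character(box, min_size=2, max_size=40, max_aspect=4.0):
    """Hypothetical size and aspect-ratio test on a bounding box."""
    x0, y0, x1, y1 = box
    bw, bh = x1 - x0 + 1, y1 - y0 + 1
    aspect = max(bw, bh) / min(bw, bh)
    return min_size <= max(bw, bh) <= max_size and aspect <= max_aspect


def pick_foreground(labels):
    """Return the group id (0 or 1) with more character-like components,
    or None when neither group has any."""
    counts = {}
    for group in (0, 1):
        mask = [[v == group for v in row] for row in labels]
        boxes = connected_components(mask)
        counts[group] = sum(looks_like_character(b) for b in boxes)
    if counts[0] == 0 and counts[1] == 0:
        return None
    return 0 if counts[0] > counts[1] else 1


def to_binary_image(labels, foreground):
    """Render foreground pixels black (0) and background white (255)."""
    return [[0 if v == foreground else 255 for v in row] for row in labels]
```

A `None` result from `pick_foreground` corresponds to the case where neither group yields character-like components, in which the text suggests retrying with the density image of the next largest separation.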

以上の構成や機能を有することで、実施例では、シードを手動で設定したり、学習したりする必要がなく、Ｔリンクのデータコストを自動的に計算することが可能となる。 With the above configuration and function, in the embodiment, it is not necessary to manually set or learn the seed, and it is possible to automatically calculate the data cost of the T link.

＜動作＞ <Operation>
次に、実施例における画像処理装置１０の動作について説明する。図１５は、実施例における画像処理の一例を示すフローチャートである。図１５に示すステップＳ１０１で、分解部２０２は、入力された画像のチャンネルを分解し、各チャンネルで濃度画像を生成する。 Next, the operation of the image processing apparatus 10 in the embodiment will be described. FIG. 15 is a flowchart illustrating an example of image processing in the embodiment. In step S101 shown in FIG. 15, the decomposition unit 202 decomposes the input image into channels and generates a density image for each channel.

ステップＳ１０２で、選択部２０３（ヒストグラム計算部２３１）は、チャンネル別に濃度ヒストグラムを計算する。 In step S102, the selection unit 203 (histogram calculation unit 231) calculates a density histogram for each channel.

ステップＳ１０３で、選択部２０３（チャンネル選択部２３２）は、チャンネル別に最大分離度を計算する。 In step S103, the selection unit 203 (channel selection unit 232) calculates the maximum degree of separation for each channel.

ステップＳ１０４で、選択部２０３（チャンネル選択部２３２）は、チャンネル毎の最大分離度で一番大きい最大分離度を有する最大分離度チャンネルを選択する。 In step S104, the selection unit 203 (channel selection unit 232) selects the maximum separation degree channel, that is, the channel whose maximum degree of separation is the largest among all channels.

ステップＳ１０５で、抽出部２０４は、最大分離度チャンネルの濃度の最大値及び最小値を抽出する。 In step S105, the extraction unit 204 extracts the maximum and minimum density values of the maximum separation degree channel.

ステップＳ１０６で、分離部２０５（グラフ生成部２５１及びグラフカット部２５２）は、抽出された最大値及び最小値を用いてグラフを生成し、Ｔリンクのコスト、Ｎリンクのコストを計算する。 In step S106, the separation unit 205 (graph generation unit 251 and graph cut unit 252) generates a graph using the extracted maximum value and minimum value, and calculates the cost of the T link and the cost of the N link.

ステップＳ１０７で、分離部２０５（グラフカット部２５２）は、コストの総和が最小となるように、グラフカットを行い、濃度画像の各画素を２群に分離する。 In step S107, the separation unit 205 (graph cut unit 252) performs a graph cut so that the total cost is minimized, and separates each pixel of the density image into two groups.

ステップＳ１０８で、画像生成部２０６は、各群の画素に対して、連結成分解析を行い、文字を抽出するための条件を満たす方の群の画素を用いて、画像を生成する。生成される画像は、２値画像でも、カラー画像でもよい。 In step S108, the image generation unit 206 performs connected component analysis on the pixels of each group and generates an image using the pixels of the group that satisfies the condition for extracting characters. The generated image may be a binary image or a color image.

以上、実施例によれば、文字抽出を適切に行うことができる。例えば、実施例によれば、従来では文字抽出が困難であった薬のパッケージに記載された文字を撮像した画像から、適切に文字抽出を行うことができる。 As described above, according to the embodiment, character extraction can be appropriately performed. For example, according to the embodiment, it is possible to appropriately perform character extraction from an image obtained by imaging characters described in a medicine package, which has conventionally been difficult to extract characters.

なお、上記実施例は、パターンを含む画像に対して適用可能であるが、特に、影、反射によりテクスチャが生じるような材質に印刷された文字が撮像された画像に対して、好適に機能する。また、上記実施例では、例えば、薬のパッケージに印刷された文字を抽出するのに好適に機能する。 The above embodiment can be applied to any image that includes a pattern, but it works particularly well on captured images of characters printed on a material where shadows and reflections produce texture. For example, the above embodiment is well suited to extracting characters printed on a medicine package.

なお、上記実施例で説明した画像処理を実現するためのプログラムを記録媒体に記録することで、上記実施例での画像処理をコンピュータに実施させることができる。例えば、このプログラムを記録媒体に記録し、このプログラムが記録された記録媒体をコンピュータに読み取らせて、前述した画像処理を実現させることも可能である。 In addition, by recording a program for realizing the image processing described in the above embodiment on a recording medium, the image processing in the above embodiment can be performed by a computer. For example, it is possible to record the program on a recording medium and cause the computer to read the recording medium on which the program is recorded, thereby realizing the above-described image processing.

なお、上記実施例における画像処理装置は、コンピュータ（Personal Computer）や、タブレット端末、スマートフォンなどのプロセッサと、メモリと、データを取得するインタフェースとを有する処理装置であれば適用可能である。また、上記実施例の画像処理装置１０をスキャナ装置に含め、スキャナ装置でも上記画像処理を実行することができるように実装してもよい。 The image processing apparatus in the above embodiment can be any processing apparatus that has a processor, a memory, and an interface for acquiring data, such as a personal computer, a tablet terminal, or a smartphone. The image processing apparatus 10 of the above embodiment may also be incorporated into a scanner apparatus so that the scanner apparatus can execute the image processing.

なお、記録媒体は、ＣＤ−ＲＯＭ、フレキシブルディスク、光磁気ディスク等の様に情報を光学的、電気的或いは磁気的に記録する記録媒体、ＲＯＭ、フラッシュメモリ等の様に情報を電気的に記録する半導体メモリ等、様々なタイプの記録媒体を用いることができる。この記録媒体には、搬送波は含まれない。 Various types of recording media can be used, including recording media that record information optically, electrically, or magnetically, such as a CD-ROM, a flexible disk, or a magneto-optical disk, and semiconductor memories that record information electrically, such as a ROM or a flash memory. The recording medium does not include a carrier wave.

以上、実施例について詳述したが、特定の実施例に限定されるものではなく、特許請求の範囲に記載された範囲内において、種々の変形及び変更が可能である。また、前述した実施例の構成要素を全部又は複数を組み合わせることも可能である。 Although the embodiments have been described in detail above, the invention is not limited to the specific embodiments, and various modifications and changes can be made within the scope described in the claims. It is also possible to combine all or a plurality of the components of the above-described embodiments.

なお、以上の実施例に関し、さらに以下の付記を開示する。
（付記１）
画像を複数のチャンネルに分解し、各チャンネルの濃度画像を生成する分解部と、
各濃度画像の濃度分布を２群に分離し、該２群の分離度が最大となる濃度画像を選択する選択部と、
選択された濃度画像の濃度の最大値及び最小値を抽出する抽出部と、
抽出された最大値及び最小値をシードに設定したグラフカットを用いて、前記選択された濃度画像を２群に分離する分離部と、
分離された２群それぞれの画素に対して連結成分解析を行い、何れかの群の画素から画像を生成する画像生成部と、
を備える画像処理装置。
（付記２）
前記選択部は、
各濃度画像から濃度ヒストグラムを生成し、各濃度ヒストグラムに対し、各濃度値で画素を２群に分離した場合の群間分散を群内分散で除算した分離度が最大となる濃度で２群に分離し、各濃度画像の最大の分離度のうち、一番大きい分離度を有する濃度画像を選択する付記１記載の画像処理装置。
（付記３）
前記画像は、文字部分を含み、
前記画像生成部は、
前記連結成分解析により連結された何れかの群の部分が、文字を抽出するための条件を満たす場合、前記条件を満たす部分から文字画像を生成する付記１又は２記載の画像処理装置。
（付記４）
画像を複数のチャンネルに分解し、各チャンネルの濃度画像を生成し、
各濃度画像の濃度分布を２群に分離し、該２群の分離度が最大となる濃度画像を選択し、
選択された濃度画像の濃度の最大値及び最小値を抽出し、
抽出された最大値及び最小値をシードに設定したグラフカットを用いて、前記選択された濃度画像を２群に分離し、
分離された２群それぞれの画素に対して連結成分解析を行い、何れかの群の画素から画像を生成する処理
をコンピュータが実行する画像処理方法。
（付記５）
画像を複数のチャンネルに分解し、各チャンネルの濃度画像を生成し、
各濃度画像の濃度分布を２群に分離し、該２群の分離度が最大となる濃度画像を選択し、
選択された濃度画像の濃度の最大値及び最小値を抽出し、
抽出された最大値及び最小値をシードに設定したグラフカットを用いて、前記選択された濃度画像を２群に分離し、
分離された２群それぞれの画素に対して連結成分解析を行い、何れかの群の画素から画像を生成する処理
をコンピュータに実行させるプログラム。
In addition, the following appendices are disclosed regarding the above embodiment.
(Appendix 1)
An image processing apparatus comprising:
a decomposition unit that decomposes an image into a plurality of channels and generates a density image of each channel;
a selection unit that separates the density distribution of each density image into two groups and selects the density image for which the degree of separation of the two groups is the largest;
an extraction unit that extracts the maximum value and the minimum value of the density of the selected density image;
a separation unit that separates the selected density image into two groups using a graph cut in which the extracted maximum and minimum values are set as seeds; and
an image generation unit that performs connected component analysis on the pixels of each of the two separated groups and generates an image from the pixels of one of the groups.
(Appendix 2)
The image processing apparatus according to Appendix 1, wherein the selection unit
generates a density histogram from each density image; for each density histogram, separates the pixels into two groups at the density that maximizes the degree of separation, the degree of separation being the between-group variance divided by the within-group variance when the pixels are separated into two groups at each density value; and selects, from among the maximum degrees of separation of the density images, the density image having the largest degree of separation.
(Appendix 3)
The image processing apparatus according to Appendix 1 or 2, wherein
the image includes a character portion, and
when a portion of either group connected by the connected component analysis satisfies a condition for extracting characters, the image generation unit generates a character image from the portion that satisfies the condition.
(Appendix 4)
An image processing method in which a computer executes processing comprising:
decomposing an image into a plurality of channels and generating a density image of each channel;
separating the density distribution of each density image into two groups and selecting the density image for which the degree of separation of the two groups is the largest;
extracting the maximum value and the minimum value of the density of the selected density image;
separating the selected density image into two groups using a graph cut in which the extracted maximum and minimum values are set as seeds; and
performing connected component analysis on the pixels of each of the two separated groups and generating an image from the pixels of one of the groups.
(Appendix 5)
A program that causes a computer to execute processing comprising:
decomposing an image into a plurality of channels and generating a density image of each channel;
separating the density distribution of each density image into two groups and selecting the density image for which the degree of separation of the two groups is the largest;
extracting the maximum value and the minimum value of the density of the selected density image;
separating the selected density image into two groups using a graph cut in which the extracted maximum and minimum values are set as seeds; and
performing connected component analysis on the pixels of each of the two separated groups and generating an image from the pixels of one of the groups.

DESCRIPTION OF SYMBOLS
１０ 画像処理装置 10 Image processing apparatus
２０ 撮像部 20 Imaging unit
１０１ 制御部 101 Control unit
１０２ 主記憶部 102 Main storage unit
１０３ 補助記憶部 103 Auxiliary storage unit
１０４ 通信部 104 Communication unit
１０５ ドライブ装置 105 Drive device
２０１ 画像入力部 201 Image input unit
２０２ 分解部 202 Decomposition unit
２０３ 選択部 203 Selection unit
２０４ 抽出部 204 Extraction unit
２０５ 分離部 205 Separation unit
２０６ 画像生成部 206 Image generation unit
２３１ ヒストグラム計算部 231 Histogram calculation unit
２３２ チャンネル選択部 232 Channel selection unit
２５１ グラフ生成部 251 Graph generation unit
２５２ グラフカット部 252 Graph cut unit

Claims

1. An image processing apparatus comprising:
a decomposition unit that decomposes an image into a plurality of channels and generates a density image of each channel;
a selection unit that separates the density distribution of each density image into two groups and selects the density image for which the degree of separation of the two groups is the largest;
an extraction unit that extracts the maximum value and the minimum value of the density of the selected density image;
a separation unit that separates the selected density image into two groups using a graph cut in which the extracted maximum and minimum values are set as seeds; and
an image generation unit that performs connected component analysis on the pixels of each of the two separated groups and generates an image from the pixels of one of the groups.

2. The image processing apparatus according to claim 1, wherein the selection unit
generates a density histogram from each density image; for each density histogram, separates the pixels into two groups at the density that maximizes the degree of separation, the degree of separation being the between-group variance divided by the within-group variance when the pixels are separated into two groups at each density value; and selects, from among the maximum degrees of separation of the density images, the density image having the largest degree of separation.

3. The image processing apparatus according to claim 1 or 2, wherein
the image includes a character portion, and
when a portion of either group connected by the connected component analysis satisfies a condition for extracting characters, the image generation unit generates a character image from the portion that satisfies the condition.

4. An image processing method in which a computer executes processing comprising:
decomposing an image into a plurality of channels and generating a density image of each channel;
separating the density distribution of each density image into two groups and selecting the density image for which the degree of separation of the two groups is the largest;
extracting the maximum value and the minimum value of the density of the selected density image;
separating the selected density image into two groups using a graph cut in which the extracted maximum and minimum values are set as seeds; and
performing connected component analysis on the pixels of each of the two separated groups and generating an image from the pixels of one of the groups.

5. A program that causes a computer to execute processing comprising:
decomposing an image into a plurality of channels and generating a density image of each channel;
separating the density distribution of each density image into two groups and selecting the density image for which the degree of separation of the two groups is the largest;
extracting the maximum value and the minimum value of the density of the selected density image;
separating the selected density image into two groups using a graph cut in which the extracted maximum and minimum values are set as seeds; and
performing connected component analysis on the pixels of each of the two separated groups and generating an image from the pixels of one of the groups.