JP2012252691A

JP2012252691A - Method and device for extracting text stroke image from image

Info

Publication number: JP2012252691A
Application number: JP2012110573A
Authority: JP
Inventors: Tianyi Gui; 天宜桂; Akihiro Minagawa; 明洋皆川; Yutaka Katsuyama; 裕勝山; Junu Sunu; スヌ・ジュヌ; Yoshinobu Hotta; 悦伸堀田; Satoshi Naoi; 聡直井
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2011-05-31
Filing date: 2012-05-14
Publication date: 2012-12-20
Anticipated expiration: 2032-05-14
Also published as: JP5939023B2; CN102810155A; CN102810155B

Abstract

Problem: To provide a method and an apparatus for extracting a text stroke image from an image.
Solution: According to one aspect of the present invention, a method of extracting a text stroke image from an image includes acquiring edge information and gradient information of the image, applying a predetermined enhancement process to the acquired edge information and gradient information so as to enhance the edge information and gradient information related to text in the image, and obtaining a text stroke image corresponding to the enhanced edge information and gradient information.
Selected drawing: FIG. 1

Description

The present invention relates to the field of image processing, and more particularly to a method and an apparatus for extracting a text stroke image from an image.

The information processing field today holds a large number of video files, and these files need to be searched efficiently. For video annotation, video search, and similar tasks, the text that appears in a video is an accurate and simple clue. How accurately the text information contained in a video can be extracted and recognized is therefore very important for subsequent video annotation and video search.

Some known text stroke extraction techniques have drawbacks such as low speed, high noise, and insensitivity to stroke scale.
A method and an apparatus for extracting a text stroke image from an image that can solve the above problems are therefore desired.

The following is a brief summary of the present invention, intended to provide a basic understanding of some of its aspects. This summary is not exhaustive. It is not intended to identify key or essential parts of the invention, nor to limit its scope; its only purpose is to present some concepts in simplified form as a prelude to the more detailed description given later.
An object of the present invention is to provide a method and an apparatus for extracting a text stroke image from an image.

According to one aspect of the present invention, there is provided a method for extracting a text stroke image from an image, the method including: acquiring edge information and gradient information of the image; applying a predetermined enhancement process to the acquired edge information and gradient information, thereby enhancing the edge information and gradient information related to text in the image; and obtaining a text stroke image corresponding to the enhanced edge information and gradient information.

According to another aspect of the present invention, there is provided an apparatus for extracting a text stroke image from an image, the apparatus including: an information acquisition unit that acquires edge information and gradient information of the image; an enhancement unit that applies a predetermined enhancement process to the acquired edge information and gradient information, thereby enhancing the edge information and gradient information related to text in the image; and a stroke image acquisition unit that obtains a text stroke image corresponding to the enhanced edge information and gradient information.

An embodiment of the present invention further provides a computer program that implements the above method.

An embodiment of the present invention further provides a computer program product, at least in the form of a computer-readable medium, on which computer program code implementing the above method is recorded.

The above and other advantages of the present invention will become more apparent from the following detailed description of preferred embodiments with reference to the drawings.

The above and other objects, features, and advantages of the present invention can be understood more easily from the following description of embodiments with reference to the drawings. The components in the drawings merely illustrate the principles of the invention. In the drawings, identical or similar technical features or components are denoted by identical or similar reference signs.
FIG. 1 is a flowchart showing a method for extracting a text stroke image from an image according to an embodiment of the present invention.
FIG. 2 is a schematic diagram showing a step signal, a pulse signal, signals corresponding to the Sobel operator, the result of applying the Sobel operator to the step signal, the result of applying the Sobel operator to the pulse signal, the absolute value of the step-signal result, the absolute value of the pulse-signal result, and the correspondingly offset signals.
FIG. 3 is a flowchart showing a method for extracting a text stroke image from an image using the Sobel operator according to the present invention.
FIG. 4 shows the original image to be processed.
FIGS. 5A to 5H show images obtained by convolution with the Sobel operator.
FIGS. 6A to 6D show four images obtained by offsetting the Sobel-processed images toward each other and combining the offset images.
FIGS. 7A to 7D show four images obtained by offsetting the Sobel-processed images away from each other and combining the offset images.
FIG. 8 shows the integrated thin-stroke image.
FIG. 9 shows the integrated thick-stroke image.
FIG. 10 shows the thick-stroke image obtained by filtering.
The remaining drawings are a block diagram of an apparatus for extracting a text stroke image from an image according to an embodiment of the present invention, a block diagram of an apparatus that does so using the Sobel operator, and a configuration diagram showing an example of a computer device on which the method and apparatus can be implemented.

Embodiments of the present invention are described below with reference to the drawings. Elements and features described in one drawing or one embodiment may be combined with elements and features shown in one or more other drawings or embodiments. For clarity, the drawings and the description omit components and processes that are known to those skilled in the art and unrelated to the present invention.

A method for extracting a text stroke image from an image according to an embodiment of the present invention is described below with reference to FIG. 1.
FIG. 1 is a flowchart showing the method. As shown in FIG. 1, edge information and gradient information of the image are acquired in step S102.
Preferably, a step signal or a pulse signal representing the edge information and gradient information of the image is analyzed, and the edge information and gradient information are extracted based on the result of the analysis.

In an image, the image data of a thin stroke can be represented by a pulse signal, and the image data of a thick stroke, or of a similar large-scale object, by a step signal. Analyzing the pulse signal makes it possible to extract the edge information and gradient information of thin strokes. Likewise, analyzing the step signal makes it possible to extract the edge information and gradient information of thick strokes and of large-scale objects that resemble thick strokes.

With reference to FIG. 2, the following describes how the Sobel operator extracts a step signal, how it extracts a pulse signal, and how the enhancement is carried out through offset and combination.

FIG. 2(i) shows a step signal; FIG. 2(ii), a pulse signal; FIGS. 2(iii) and 2(iv), signals corresponding to the Sobel operator; FIG. 2(v), the result of applying the Sobel operator to the step signal; FIG. 2(vi), the result of applying the Sobel operator to the pulse signal; FIG. 2(vii), the absolute value of the step-signal result; FIG. 2(viii), the absolute value of the pulse-signal result; and FIGS. 2(ix) and 2(x), the corresponding offset signals.

The coordinates and magnitudes in FIG. 2 are illustrative rather than limiting and serve only to show the principle of the signal processing involved.

As shown in FIG. 2, applying the Sobel operator (FIG. 2(iii)) to the step signal (FIG. 2(i)) yields a single valley-shaped signal (FIG. 2(v)). Applying the Sobel operator (FIG. 2(iv)) to the pulse signal (FIG. 2(ii)) instead yields a signal that combines a valley with a peak (FIG. 2(vi)). The Sobel operator thus produces two clearly different responses for the step signal and the pulse signal (FIG. 2(v) and FIG. 2(vi)).

The step signal and the pulse signal can then be enhanced by the enhancement process described later (which includes, for example, offset and combination). In this way the image data of thin strokes, which correspond to pulse signals, and the image data of thick strokes, which correspond to step signals, are extracted.

For example, taking the absolute value (that is, the magnitude) of the valley in the Sobel response to the pulse signal (FIG. 2(vi)) gives a second peak (FIG. 2(viii)). This second peak and the original peak (the peak produced directly by the Sobel operator) are then offset toward each other and superimposed (FIG. 2(x)), which further enhances the image data of the thin stroke corresponding to the pulse signal. Similarly, taking the absolute value of the valley in the Sobel response to the step signal (FIG. 2(v)) gives a single peak (FIG. 2(vii)), which is then offset (FIG. 2(ix)).
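The one-dimensional behaviour described for FIG. 2 can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the implementation of the specification: the function names `derivative`, `shift`, and `enhance`, the central-difference kernel standing in for a 1-D Sobel operator, the zero padding, and the integer half-width `w // 2` are all hypothetical. The sign convention is also assumed; with this kernel a rising step gives a peak rather than the valley drawn in FIG. 2(v), which only mirrors the picture.

```python
def derivative(sig):
    """Central difference r[i] = sig[i+1] - sig[i-1]; the two border samples stay 0."""
    r = [0] * len(sig)
    for i in range(1, len(sig) - 1):
        r[i] = sig[i + 1] - sig[i - 1]
    return r


def shift(sig, k):
    """Shift a signal right by k samples (left if k < 0), padding with zeros."""
    n = len(sig)
    return [sig[i - k] if 0 <= i - k < n else 0 for i in range(n)]


def enhance(sig, w):
    """Move the peak and the rectified valley toward each other by w // 2 and add them."""
    r = derivative(sig)
    peak = [max(v, 0) for v in r]          # positive lobe of the response
    valley_abs = [max(-v, 0) for v in r]   # negative lobe, as a magnitude
    half = w // 2
    moved_peak = shift(peak, half)         # peak moves right
    moved_valley = shift(valley_abs, -half)  # rectified valley moves left
    return [a + b for a, b in zip(moved_peak, moved_valley)]


# A pulse of width 4 (thin stroke) and a step (thick stroke or large object).
pulse = [0] * 8 + [1] * 4 + [0] * 8
step = [0] * 10 + [1] * 10

pulse_out = enhance(pulse, 4)
step_out = enhance(step, 4)
```

For the pulse, the two lobes land on the same samples and add up, so the response at the stroke centre doubles. For the step there is only one lobe, so nothing reinforces it. That difference is what lets the offset-and-superimpose step favour strokes whose width is close to the assumed `w`.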

Optionally, based on a predetermined filtering condition, the edge information and gradient information of large-scale objects that resemble thick strokes can be filtered out while the edge information and gradient information of the thick strokes themselves are retained. The filtering is described in detail later.

Optionally, depending on the clarity of the image or as otherwise required, the original image shown in FIG. 4 may be filtered. For example, filtering the original image with a low-pass filter suppresses noise in the image. The low-pass filter may be, for example, a Gaussian filter, but is not limited to this and may be any suitable low-pass filter known to those skilled in the art.
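As a rough illustration of the optional low-pass filtering, the sketch below smooths a grayscale image with a 3x3 binomial kernel, a common small approximation of a Gaussian. The kernel size, the function name `smooth`, the list-of-rows image representation, and the edge-replication border handling are assumptions made for this example; the specification does not fix any of them.

```python
# 3x3 binomial kernel, a common small approximation of a Gaussian (weights sum to 16).
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def smooth(img):
    """Low-pass filter a grayscale image given as a list of rows.

    Border pixels are handled by clamping coordinates to the image
    (edge replication), an assumption made for this sketch.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for j in range(3):
                for i in range(3):
                    yy = min(max(y + j - 1, 0), h - 1)
                    xx = min(max(x + i - 1, 0), w - 1)
                    acc += KERNEL[j][i] * img[yy][xx]
            out[y][x] = acc / 16
    return out


# A flat patch with one noisy pixel in the middle.
noisy = [[10, 10, 10],
         [10, 26, 10],
         [10, 10, 10]]
smoothed = smooth(noisy)
```

The isolated bright pixel is pulled toward its neighbours while flat regions are left unchanged, which is the noise suppression the paragraph above refers to.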

In step S104, a predetermined enhancement process is applied to the acquired edge information and gradient information, thereby enhancing the edge information and gradient information related to text in the image.

The edge information and gradient information can be enhanced in various ways. Preferably, they are enhanced by binarization and integration. The integration may be an intersection, a maximum, or an average. Preferably, as described later, the integration is an intersection.
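The three integration options named above (intersection, maximum, average) can be sketched as follows. Which images are integrated, and how the binarization threshold is chosen, are described later in the source. The two small input maps, the threshold `0.5`, and the function names `binarize` and `integrate` are hypothetical values chosen only for illustration.

```python
def binarize(img, threshold):
    """Map every pixel to 1 if it reaches the threshold, else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


def integrate(images, mode="intersection"):
    """Combine equally sized images pixel by pixel.

    mode: "intersection" (logical AND for 0/1 maps), "maximum", or "average".
    """
    h, w = len(images[0]), len(images[0][0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [im[y][x] for im in images]
            if mode == "intersection":
                row.append(min(vals))  # for 0/1 maps, min is a logical AND
            elif mode == "maximum":
                row.append(max(vals))
            elif mode == "average":
                row.append(sum(vals) / len(vals))
            else:
                raise ValueError("unknown mode: " + mode)
        out.append(row)
    return out


# Two hypothetical enhanced response maps.
a = binarize([[0.9, 0.2], [0.6, 0.7]], 0.5)
b = binarize([[0.8, 0.9], [0.1, 0.7]], 0.5)

both = integrate([a, b], "intersection")
either = integrate([a, b], "maximum")
mean = integrate([a, b], "average")
```

The intersection keeps only pixels that every input marks as stroke, so it is the strictest of the three options and the most effective at rejecting noise that appears in only one map.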

In step S106, a text stroke image corresponding to the enhanced edge information and gradient information is obtained.

A method for extracting a text stroke image from an image using the Sobel operator according to an embodiment of the present invention is described below with reference to FIGS. 3 to 10.

FIG. 3 is a flowchart showing the method for extracting a text stroke image from an image using the Sobel operator according to an embodiment of the present invention. FIG. 4 shows the original image to be processed. FIGS. 5A to 5H show images obtained by convolution with the Sobel operator. FIGS. 6A to 6D show four images obtained by offsetting the images toward each other and combining the offset images. FIGS. 7A to 7D show four images obtained by offsetting the images away from each other and combining the offset images. FIG. 8 shows the integrated thin-stroke image. FIG. 9 shows the integrated thick-stroke image. FIG. 10 shows the thick-stroke image after filtering.

As shown in FIG. 3, in step S202 the image to be processed (for example, the original image shown in FIG. 4) can be smoothed (for example, with a low-pass filter). Step S202 is optional. In other words, when the image is relatively clear, or as otherwise required, the smoothing of the image shown in FIG. 4 may be skipped.

Here the original image shown in FIG. 4 is taken as the image to be processed. In FIG. 4, the text strokes to be extracted should include FUJITSU in the upper-right corner of the image and the Japanese characters at the bottom of the image. FUJITSU is written with thin strokes, and the Japanese characters with thick strokes. FIG. 4 is only an example; in practice an image may contain only thin-stroke characters or only thick-stroke characters.

In step S204, the smoothed image is convolved with the Sobel operator. If the smoothing of step S202 is not performed, the Sobel operator is applied directly to the image to be processed (for example, the original image shown in FIG. 4).

Specifically, the Sobel operator is used to convolve the image to be processed in a plurality of directions. In other words, the image data is convolved separately with each of a plurality of Sobel convolution kernels, which yields the edge information and gradient information of the image in a plurality of directions.

For example, the Sobel operator is used to convolve the image to be processed in four directions. Preferably, these four directions are the horizontal direction, the vertical direction, and the two diagonal directions. The convolved images are shown in FIGS. 5A to 5H. These four directions are chosen because each of them corresponds to a commonly used stroke (given as external character 1 in the original publication).

For example, the convolution kernels of the Sobel operator used are as follows:

Here, $S_h$ is the Sobel convolution kernel for the horizontal direction, $S_v$ is the Sobel convolution kernel for the vertical direction, $S_{rd}$ is the Sobel convolution kernel for the first diagonal direction, and $S_{ld}$ is the Sobel convolution kernel for the second diagonal direction. The first diagonal direction runs from the upper-right corner to the lower-left corner, and the second diagonal direction runs from the upper-left corner to the lower-right corner.
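The kernel matrices themselves do not survive in this text, so the sketch below uses the widely known 3x3 Sobel kernels and their 45-degree rotations as stand-ins. The signs, the assignment of the two diagonal kernels to `S_RD` and `S_LD`, the function name `apply_kernel`, and the zero-valued border are all assumptions; the kernels actually used in the specification may differ.

```python
# Textbook 3x3 Sobel kernels (assumed; the original matrices are not available here).
S_H = [[-1, -2, -1],
       [0, 0, 0],
       [1, 2, 1]]      # responds to horizontal strokes (intensity change along y)
S_V = [[-1, 0, 1],
       [-2, 0, 2],
       [-1, 0, 1]]     # responds to vertical strokes (intensity change along x)
S_RD = [[0, 1, 2],
        [-1, 0, 1],
        [-2, -1, 0]]   # one diagonal (assignment to "rd" is an assumption)
S_LD = [[-2, -1, 0],
        [-1, 0, 1],
        [0, 1, 2]]     # the other diagonal (assignment to "ld" is an assumption)


def apply_kernel(img, kernel):
    """Apply a 3x3 kernel to a list-of-rows image; border pixels are left at 0.

    This is the correlation form. A true convolution flips the kernel, which
    for these antisymmetric kernels only changes the sign of the response.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0
            for j in range(3):
                for i in range(3):
                    acc += kernel[j][i] * img[y + j - 1][x + i - 1]
            out[y][x] = acc
    return out


# A 5x5 image with a vertical edge between columns 1 and 2.
edge = [[0, 0, 1, 1, 1] for _ in range(5)]
resp_v = apply_kernel(edge, S_V)
resp_h = apply_kernel(edge, S_H)
```

On this test image the vertical kernel responds strongly along the edge while the horizontal kernel stays at zero, which is why a separate kernel per stroke direction is useful.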

Because the result of the convolution contains both positive and negative values, it can be stored in two image layers: one stores the positive pulse response image and the other stores the negative pulse response image. In other words, the convolution result for each direction is divided into positive pulse response image data and negative pulse response image data for that direction.
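Splitting a signed response into the two layers can be sketched as below. Storing the negative layer as magnitudes (non-negative values) is an assumption, chosen to match the earlier step in which the absolute value of the valley is taken; the function name `split_response` is hypothetical.

```python
def split_response(resp):
    """Split a signed response image into (positive, negative) layers.

    Both layers are non-negative: the negative layer holds magnitudes,
    so that resp == positive - negative at every pixel.
    """
    positive = [[max(v, 0) for v in row] for row in resp]
    negative = [[max(-v, 0) for v in row] for row in resp]
    return positive, negative


resp = [[3, -2, 0],
        [-5, 1, 4]]
pos, neg = split_response(resp)
```

Each pixel is non-zero in at most one of the two layers, so no information is lost and the signed response can be rebuilt as the difference of the layers.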

Although this specification describes an example with four directions, the convolution is not limited to four directions. When higher precision is needed, or as otherwise required, the image can be convolved with Sobel convolution kernels for more directions, for example eight. Convolution in more directions gives a more precise result. The definition of the Sobel operator and its convolution with image data may follow any suitable conventional technique.

Convolving the original image shown in FIG. 4 yields the images shown in FIGS. 5A to 5H, each obtained with one of the Sobel convolution kernels.

Specifically, FIG. 5A shows the image containing the positive pulse response image data for the horizontal direction, and FIG. 5B shows the image containing the negative pulse response image data for the horizontal direction. FIG. 5C shows the image containing the positive pulse response image data for the vertical direction, and FIG. 5D shows the image containing the negative pulse response image data for the vertical direction. FIG. 5E shows the image containing the positive pulse response image data for the diagonal direction from the upper-right corner to the lower-left corner, and FIG. 5F shows the image containing the negative pulse response image data for that diagonal direction. FIG. 5G shows the image containing the positive pulse response image data for the diagonal direction from the upper-left corner to the lower-right corner, and FIG. 5H shows the image containing the negative pulse response image data for that diagonal direction.

Next, in step S206, the positive pulse response image data and the negative pulse response image data are offset using a stroke width estimated in advance, and the offset images are combined, as shown in FIGS. 6A to 6D and FIGS. 7A to 7D.

For each direction, offsetting the positive pulse response image data and the negative pulse response image data toward each other and adding them yields first composite image data for that direction. Likewise, offsetting them away from each other and adding them yields second composite image data for that direction.

For example, the operation of offsetting toward each other and adding may be performed according to equation (1):

$$
\begin{aligned}
I_{h}(x,y) &= \bigl(I_{h\text{-positive}}(x,\,y-w/2) + I_{h\text{-negative}}(x,\,y+w/2)\bigr)/2\\
I_{v}(x,y) &= \bigl(I_{v\text{-positive}}(x-w/2,\,y) + I_{v\text{-negative}}(x+w/2,\,y)\bigr)/2\\
I_{rd}(x,y) &= \bigl(I_{rd\text{-positive}}(x+w/2,\,y-w/2) + I_{rd\text{-negative}}(x-w/2,\,y+w/2)\bigr)/2\\
I_{ld}(x,y) &= \bigl(I_{ld\text{-positive}}(x-w/2,\,y-w/2) + I_{ld\text{-negative}}(x+w/2,\,y+w/2)\bigr)/2
\end{aligned}
\tag{1}
$$

The operation of offsetting away from each other and adding may be performed according to equation (2):

$$
\begin{aligned}
I_{h}'(x,y) &= \bigl(I_{h\text{-positive}}(x,\,y+w/2) + I_{h\text{-negative}}(x,\,y-w/2)\bigr)/2\\
I_{v}'(x,y) &= \bigl(I_{v\text{-positive}}(x+w/2,\,y) + I_{v\text{-negative}}(x-w/2,\,y)\bigr)/2\\
I_{rd}'(x,y) &= \bigl(I_{rd\text{-positive}}(x-w/2,\,y+w/2) + I_{rd\text{-negative}}(x+w/2,\,y-w/2)\bigr)/2\\
I_{ld}'(x,y) &= \bigl(I_{ld\text{-positive}}(x+w/2,\,y+w/2) + I_{ld\text{-negative}}(x-w/2,\,y-w/2)\bigr)/2
\end{aligned}
\tag{2}
$$

Here $x$ is the horizontal coordinate of a pixel and $y$ is its vertical coordinate. $I_{h}(x,y)$, $I_{v}(x,y)$, $I_{rd}(x,y)$, and $I_{ld}(x,y)$ denote the first composite image data for the horizontal direction, the vertical direction, and the two diagonal directions, respectively. The images containing the first composite image data are shown in FIGS. 6A to 6D.

FIG. 6A shows the image containing the data $I_{h}(x,y)$, FIG. 6B the image containing $I_{v}(x,y)$, FIG. 6C the image containing $I_{rd}(x,y)$, and FIG. 6D the image containing $I_{ld}(x,y)$.

$I_{h}'(x,y)$, $I_{v}'(x,y)$, $I_{rd}'(x,y)$, and $I_{ld}'(x,y)$ denote the second composite image data for the horizontal direction, the vertical direction, and the two diagonal directions, respectively. The images containing the second composite image data are shown in FIGS. 7A to 7D.

FIG. 7A shows the image containing the data $I_{h}'(x,y)$, FIG. 7B the image containing $I_{v}'(x,y)$, FIG. 7C the image containing $I_{rd}'(x,y)$, and FIG. 7D the image containing $I_{ld}'(x,y)$.

$I_{h\text{-positive}}(x,y)$, $I_{v\text{-positive}}(x,y)$, $I_{rd\text{-positive}}(x,y)$, and $I_{ld\text{-positive}}(x,y)$ denote the positive pulse response image data for the horizontal direction, the vertical direction, and the two diagonal directions, respectively. The images containing the positive pulse response image data are shown in FIGS. 5A, 5C, 5E, and 5G.

$I_{h\text{-negative}}(x,y)$, $I_{v\text{-negative}}(x,y)$, $I_{rd\text{-negative}}(x,y)$, and $I_{ld\text{-negative}}(x,y)$ denote the negative pulse response image data for the horizontal direction, the vertical direction, and the two diagonal directions, respectively. The images containing the negative pulse response image data are shown in FIGS. 5B, 5D, 5F, and 5H.

The value w used in the above processing is a stroke width estimated in advance. For example, w may be estimated from experience or by visually inspecting the image. Formulas (1) and (2) above are only specific examples of composing the image data for the four directions (horizontal, vertical and the two diagonals) and do not limit the present invention. In practice, any composition processing that suitably enhances strokes may be applied.

Both the offset in facing directions and the offset in opposite directions are performed because, for an arbitrary image, it is not always easy to determine in advance which offset direction will enhance the strokes. Trying both offsets therefore ensures that the stroke enhancement effect is achieved by one of them. If the direction that achieves stroke enhancement can be determined in advance, the two offsets need not both be performed; it suffices to perform either the offset in facing directions or the offset in opposite directions. In the example images of this specification, offsetting in facing directions achieves stroke enhancement, as shown in FIGS. 6A to 6D, whereas offsetting in opposite directions does not, as shown in FIGS. 7A to 7D.
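A one-dimensional toy example may help show why the useful offset direction can depend on the image. The sketch below assumes that one deciding factor is text polarity (bright strokes on a dark background versus the reverse); that reading is an assumption made for illustration and is not stated in the specification. The forward-difference edge detector and all function names are likewise hypothetical.

```python
def shift(signal, k):
    """Move a 1-D signal k samples to the right (k < 0 moves left), zero-filled."""
    n = len(signal)
    return [signal[i - k] if 0 <= i - k < n else 0 for i in range(n)]


def edge_responses(row):
    """Split the forward difference of a row into positive and negative parts."""
    diff = [row[i + 1] - row[i] for i in range(len(row) - 1)] + [0]
    pos = [max(d, 0) for d in diff]
    neg = [max(-d, 0) for d in diff]
    return pos, neg


def merged_peak(row, w, pos_right):
    """Shift both responses by w/2 in mutually reversed directions, average, return the peak.

    pos_right=True moves the positive response right and the negative one left;
    pos_right=False does the reverse.
    """
    pos, neg = edge_responses(row)
    k = w // 2 if pos_right else -(w // 2)
    merged = [(p + n) / 2 for p, n in zip(shift(pos, k), shift(neg, -k))]
    return max(merged)


bright_stroke = [0, 0, 0, 9, 9, 9, 9, 0, 0, 0]  # stroke of width 4, bright on dark
dark_stroke = [9, 9, 9, 0, 0, 0, 0, 9, 9, 9]    # stroke of width 4, dark on bright
```

For the bright stroke the two edge responses coincide only when `pos_right=True`; for the dark stroke they coincide only when `pos_right=False`. When they coincide, the merged peak keeps the full edge strength; otherwise it is halved.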

In step S208, binarization with a high threshold and integration processing can be performed on the images composed in step S206.

For example, using a preset high threshold, the first composite image data and the second composite image data for each direction can each be binarized with the high threshold. Integration processing is then performed on the first composite image data for each direction that has been binarized with the high threshold, and on the second composite image data for each direction that has been binarized with the high threshold.

The integration processing here may be any one of a process for obtaining a product set, a process for obtaining a maximum value, and a process for obtaining an average value.

If the integration processing is a process for obtaining a product set, it can be performed according to the following expressions.

I_output-thin=I_{h-binarized-high}∪I_{v-binarized-high}∪I_{rd-binarized-high}∪I_{ld-binarized-high}
I’_output-thin=I’_{h-binarized-high}∪I’_{v-binarized-high}∪I’_{rd-binarized-high}∪I’_{ld-binarized-high}
Here, I_output-thin is the first thin stroke image data obtained by the product-set operation, and I_{h-binarized-high}, I_{v-binarized-high}, I_{rd-binarized-high} and I_{ld-binarized-high} are the binarized image data obtained by binarizing the first composite image data I_h(x,y), I_v(x,y), I_rd(x,y) and I_ld(x,y), respectively, with the high threshold. Likewise, I’_output-thin is the second thin stroke image data obtained by the product-set operation, and I’_{h-binarized-high}, I’_{v-binarized-high}, I’_{rd-binarized-high} and I’_{ld-binarized-high} are the binarized image data obtained by binarizing the second composite image data I_h’(x,y), I_v’(x,y), I_rd’(x,y) and I_ld’(x,y), respectively, with the high threshold.
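The binarization and combination of the four directional images can be sketched as follows. The sketch follows the expressions as written, which use the ∪ symbol, so the combination is a pixel-wise logical OR; if an intersection is intended instead, `any` would be replaced by `all`. The `>=` comparison, the 0/1 encoding, and the function names are assumptions for illustration.

```python
def binarize(img, threshold):
    """Map each pixel to 1 if it reaches the threshold, else 0 (>= is an assumption)."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


def integrate_union(binary_images):
    """Pixel-wise OR over a list of equally sized binary images."""
    height, width = len(binary_images[0]), len(binary_images[0][0])
    return [[1 if any(b[y][x] for b in binary_images) else 0
             for x in range(width)]
            for y in range(height)]


def stroke_image(directional, threshold):
    """Binarize each directional composite image, then combine them."""
    return integrate_union([binarize(d, threshold) for d in directional])


# Tiny directional composite images (h, v, rd, ld) with made-up values.
h = [[200, 10, 0], [0, 0, 0]]
v = [[0, 0, 0], [0, 180, 0]]
rd = [[0, 90, 0], [0, 0, 0]]
ld = [[0, 0, 0], [0, 0, 0]]

thin = stroke_image([h, v, rd, ld], 150)  # high threshold
thick = stroke_image([h, v, rd, ld], 80)  # lower threshold keeps weaker responses
```

The same helper serves both thresholds: the high threshold keeps only the strongest responses, while a lower threshold also admits weaker ones, so every pixel set in `thin` is also set in `thick`.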

In step S210, the thin stroke image data, namely I_output-thin and I’_output-thin, can be acquired.

In step S212, binarization with a low threshold and integration processing can be performed on the composed images.

For example, using a preset low threshold that is smaller than the high threshold, the first composite image data and the second composite image data for each direction can each be binarized with the low threshold. Integration processing is then applied to the first composite image data for each direction that has been binarized with the low threshold, and to the second composite image data for each direction that has been binarized with the low threshold.

If the integration processing is a process for obtaining a product set, it can be performed, for example, according to the following expressions.
I_output-thick=I_{h-binarized-low}∪I_{v-binarized-low}∪I_{rd-binarized-low}∪I_{ld-binarized-low}
I’_output-thick=I’_{h-binarized-low}∪I’_{v-binarized-low}∪I’_{rd-binarized-low}∪I’_{ld-binarized-low}
Here, I_output-thick is the first thick stroke image data obtained by the product-set operation, and I_{h-binarized-low}, I_{v-binarized-low}, I_{rd-binarized-low} and I_{ld-binarized-low} are the binarized image data obtained by binarizing the first composite image data I_h(x,y), I_v(x,y), I_rd(x,y) and I_ld(x,y), respectively, with the low threshold. Likewise, I’_output-thick is the second thick stroke image data obtained by the product-set operation, and I’_{h-binarized-low}, I’_{v-binarized-low}, I’_{rd-binarized-low} and I’_{ld-binarized-low} are the binarized image data obtained by binarizing the second composite image data I_h’(x,y), I_v’(x,y), I_rd’(x,y) and I_ld’(x,y), respectively, with the low threshold.

As noted above, if the offset direction that achieves stroke enhancement can be determined in advance, it is unnecessary to perform both the offset in facing directions and the offset in opposite directions to obtain both the first and the second composite image data. Instead, the offset is performed only in the direction that achieves stroke enhancement (for example, the facing directions or the opposite directions) to obtain the corresponding composite image data, and the binarization and integration processing described above are then applied to that composite image data.

In step S214, the data processed in step S212 is optionally filtered according to preset filtering conditions.

Specifically, the preset filtering conditions are used to filter the data on connected regions in the first thick stroke image data and in the second thick stroke image data, respectively, thereby obtaining the image data of first connected regions and the image data of second connected regions that satisfy the preset filtering conditions.

This filtering is performed because the first and second thick stroke image data obtained in step S212 may contain, in addition to thick strokes, edge data of other objects (for example, the edges of a person in the image) or other edge data resembling thick strokes. Filtering the first and second thick stroke image data with the preset filtering conditions allows the image data of the connected regions corresponding to thick strokes to be obtained more accurately.

The preset filtering condition may be any one of the following: (1) the gray-level variance of the pixels within the connected region is smaller than a preset variance threshold; (2) the polarity of the pixels from the inner edge to the outer edge of the connected region is consistent; and (3) the size of the connected region is within a preset size threshold. Alternatively, the filtering condition may be any combination of two or more of conditions (1), (2) and (3).

Here, a connected region is a closed region bounded by the edges represented in the thick stroke image data obtained by the processing of step S212.

For filtering condition (1), namely that the gray-level variance of the pixels within the connected region is smaller than a preset variance threshold, coordinate information on the connected region is first obtained from the image processed in step S212. Next, the gray-level variance of the connected region is determined from the coordinate information and the original image shown in FIG. 4, and it is judged whether that variance is smaller than the preset variance threshold. If it is, the connected region is retained, that is, it is considered to be related to a thick stroke. Otherwise, the connected region is filtered out.
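Filtering condition (1) can be sketched as below. The sketch assumes that each connected region is given as a list of (x, y) coordinates, that the original image is a nested list of gray levels indexed as `image[y][x]`, and that the population variance is used; the specification does not fix any of these details, and the function names are hypothetical.

```python
def gray_variance(image, coords):
    """Population variance of the gray levels at the given (x, y) coordinates."""
    values = [image[y][x] for (x, y) in coords]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def keep_by_variance(image, regions, variance_threshold):
    """Retain only regions whose gray-level variance is below the threshold."""
    return [r for r in regions if gray_variance(image, r) < variance_threshold]


# Made-up original image: a nearly uniform patch and a highly varied patch.
image = [[100, 102, 101, 30],
         [99, 100, 250, 10]]
region_a = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1)]  # nearly uniform gray levels
region_b = [(3, 0), (2, 1), (3, 1)]                  # widely varying gray levels
kept = keep_by_variance(image, [region_a, region_b], 50)
```

Only the nearly uniform region survives, which matches the idea that the pixels of a single stroke tend to share a similar gray level.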

For filtering condition (2), namely that the polarity of the pixels from the inner edge to the outer edge of the connected region is consistent, coordinate information on the connected region is first obtained from the image processed in step S212. Next, it is determined from the coordinate information and the original image shown in FIG. 4 whether the polarity of the pixels from the inner edge to the outer edge of the connected region is consistent. If the pixel polarities of the inner edge and the outer edge are consistent, the connected region is retained, that is, it is considered to be related to a thick stroke. Otherwise, the connected region is filtered out.

The polarity of pixels here may be understood as the relationship between the gray level of a pixel on the inner edge and the gray level of the adjacent pixel on the outer edge. For example, when the gray levels of pixels in the binarized image are set to "0" or "255", two pixels are considered to have consistent polarity if their gray values are both "0" or both "255"; otherwise, their polarity is considered inconsistent. Other suitable criteria for consistent pixel polarity may of course be set as needed and are not described in detail here.

For filtering condition (3), namely that the size of the connected region is within a preset size threshold, the connected region is retained if its size is within the preset threshold, that is, it is considered to be related to a thick stroke. Otherwise, the connected region is filtered out. Although thick stroke images are larger than thin stroke images, they generally share a common upper limit on size, so the preset threshold may be set based on that upper limit. If a connected region is too large, it may be regarded as corresponding to a non-stroke object rather than to a thick stroke. For example, in most cases a thick stroke does not occupy half or more of the entire image. Of course, if thick strokes are indeed large in a particular case, the preset threshold can be adjusted to reflect this.
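A size filter of this kind can be sketched with a simple connected-component search. The sketch makes several assumptions that the specification leaves open: regions are found as 4-connected groups of foreground pixels in a binary image, size is measured as a pixel count, and the limit is expressed as a fraction of the image area through a hypothetical `max_fraction` parameter.

```python
from collections import deque


def connected_regions(binary):
    """Return 4-connected foreground regions as lists of (x, y) coordinates."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y0 in range(height):
        for x0 in range(width):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            queue, region = deque([(x0, y0)]), []
            seen[y0][x0] = True
            while queue:  # breadth-first flood fill from the seed pixel
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


def keep_by_size(regions, image_area, max_fraction=0.5):
    """Retain regions whose pixel count does not exceed the given share of the image."""
    return [r for r in regions if len(r) <= max_fraction * image_area]


binary = [[1, 1, 1, 1],
          [1, 1, 1, 0],
          [0, 0, 0, 0],
          [0, 1, 0, 0]]
regions = connected_regions(binary)
kept = keep_by_size(regions, image_area=16, max_fraction=0.25)
```

Here the seven-pixel region exceeds a quarter of the 16-pixel image and is dropped, while the single-pixel region is kept. A bounding-box measure could be substituted for the pixel count without changing the structure.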

The preset variance threshold and the preset size threshold may be determined from empirical values obtained through prior testing, or from visual inspection of image samples. They may of course be determined by any other suitable method.

Filtering may also be performed using a combination of two or all of filtering conditions (1), (2) and (3).

Next, in step S216, first thick stroke edge image data and second thick stroke edge image data, corresponding respectively to the image data of the first connected regions and the image data of the second connected regions, are acquired.

In step S218, a first thin stroke image (as shown in FIG. 8) and a second thin stroke image (not shown), corresponding respectively to the first thin stroke image data and the second thin stroke image data, are acquired, and a first thick stroke image (as shown in FIG. 10) and a second thick stroke image (not shown), corresponding respectively to the first thick stroke edge image data and the second thick stroke edge image data, are acquired.

It should be understood that thick stroke images can be obtained even when step S214 is not performed. In other words, a first thick stroke image (as shown in FIG. 9) and a second thick stroke image can be obtained from the I_output-thick and I’_output-thick acquired in step S212.

Although the second thin stroke image and the second thick stroke image can be obtained through the above steps, they are not specifically shown in this specification, because in the present example the offset in opposite directions does not achieve the stroke enhancement effect.

Optionally, either the first thin stroke image or the second thin stroke image can be selected as the final thin stroke image based on the precision of the text strokes, the recall rate of the text strokes, or a compromise between the two. Likewise, either the first thick stroke image or the second thick stroke image can be selected as the final thick stroke image based on the precision of the text strokes, the recall rate of the text strokes, or a compromise between the two. In the example of this specification, the offset in facing directions achieves the stroke enhancement effect, so the first thin stroke image and the first thick stroke image are obtained as the final stroke extraction result on the basis of these criteria.

The precision of text strokes referred to in this specification may be understood as the ratio of the number of correctly detected strokes to the total number of detected strokes. The recall rate of text strokes may be understood as the ratio of the number of correctly detected strokes to the number of correct strokes actually present.
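The two ratios and a selection based on them can be sketched as follows. The specification does not say how the compromise between precision and recall is computed; the sketch uses the harmonic mean (F-measure) purely as one possible choice, and the function names and the stroke counts in the demo are hypothetical.

```python
def precision(correct_detected, detected):
    """Correctly detected strokes divided by all detected strokes."""
    return correct_detected / detected if detected else 0.0


def recall(correct_detected, actual):
    """Correctly detected strokes divided by the correct strokes actually present."""
    return correct_detected / actual if actual else 0.0


def f_measure(p, r):
    """Harmonic mean of precision and recall (an assumed form of the compromise)."""
    return 2 * p * r / (p + r) if p + r else 0.0


def select_candidate(candidates, actual):
    """Pick the candidate name with the best F-measure.

    candidates maps a name to (correct_detected, detected).
    """
    def score(item):
        correct, detected = item[1]
        return f_measure(precision(correct, detected), recall(correct, actual))
    return max(candidates.items(), key=score)[0]


# Made-up counts: the first image finds 18 of 20 real strokes among 20 detections.
candidates = {"first": (18, 20), "second": (5, 25)}
best = select_candidate(candidates, actual=20)
```

Selecting on precision alone or recall alone only requires swapping the scoring function.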

The method for extracting a text stroke image from an image has been described above using the Sobel operator as an example. This description is merely illustrative, however. The operators that can be used are not limited to the Sobel operator and may include the Roberts operator, the Prewitt operator, the Laplace operator, the LoG operator, the Canny operator, or any other suitable operator.

In addition, although steps S208 and S212 have been described as performing binarization first and integration processing afterwards, the integration processing may in practice be performed first and the binarization afterwards.
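Whether the order matters depends on the integration operation. The sketch below, using hypothetical helper names and a `>=` binarization rule chosen for illustration, shows that integrating by pixel-wise maximum and then binarizing gives the same result as binarizing and then taking the pixel-wise OR, whereas integrating by average does not in general.

```python
def binarize(img, threshold):
    """Map each pixel to 1 if it reaches the threshold, else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


def integrate(images, combine):
    """Combine equally sized images pixel by pixel with the given function."""
    height, width = len(images[0]), len(images[0][0])
    return [[combine([img[y][x] for img in images]) for x in range(width)]
            for y in range(height)]


def mean(values):
    return sum(values) / len(values)


images = [[[200, 40], [0, 160]],
          [[0, 170], [30, 20]]]
t = 150

binarize_then_or = integrate([binarize(i, t) for i in images], max)
max_then_binarize = binarize(integrate(images, max), t)
mean_then_binarize = binarize(integrate(images, mean), t)
```

The first two results are identical because a maximum reaches the threshold exactly when at least one input does. Averaging dilutes a strong response in one direction with weak responses in the others, so it would typically call for a lower threshold.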

The high threshold and the low threshold mentioned above may be determined from experience or through a number of trials. These are merely examples, however, and the high and low thresholds may be determined by any suitable method.

An apparatus 400 for extracting a text stroke image from an image according to an embodiment of the present invention is described below with reference to FIG. 11.

FIG. 11 is a block diagram showing the apparatus 400 for extracting a text stroke image from an image according to an embodiment of the present invention. The apparatus 400 can, for example, execute the method of extracting a text stroke image from an image described above with reference to FIG. 1.

Specifically, the apparatus 400 comprises an information acquisition unit 402 that acquires edge information and gradient information of the image, an enhancement unit 404 that enhances the edge information and gradient information relating to text in the image by applying preset enhancement processing to the acquired edge information and gradient information, and a stroke image acquisition unit 406 that acquires a text stroke image corresponding to the enhanced edge information and gradient information.

Optionally, the apparatus 400 may comprise a selection unit (not shown). The selection unit is configured to select among the plurality of extracted text stroke images.

The information acquisition unit 402 is configured to analyze the step signals or pulse signals that represent the edge information and gradient information of the image, and to extract the edge information and gradient information based on the analysis result.

In an image, the image data of thin strokes can be represented by pulse signals, while the image data of thick strokes and of large-scale objects resembling thick strokes can be represented by step signals. By analyzing the pulse signals, the edge information and gradient information of thin strokes can be extracted based on the analysis result. By analyzing the step signals, the edge information and gradient information of thick strokes, as well as of large-scale objects resembling thick strokes, can be extracted based on the analysis result.

Optionally, the apparatus 400 may further comprise a noise reduction unit (not shown). The noise reduction unit is configured to filter the original image to reduce noise, depending on the clarity of the image or as otherwise needed. For example, noise in the original image can be reduced by filtering the image with a low-pass filter. The low-pass filter may be a Gaussian filter, but is not limited to this and may be any suitable low-pass filter known to those skilled in the art.
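As one concrete instance of such a low-pass filter, the sketch below smooths an image with the common 3×3 binomial approximation of a Gaussian kernel. The kernel size, the replication of border pixels, and the function name are assumptions for illustration; the specification does not prescribe them.

```python
# 3x3 binomial kernel, a common small approximation of a Gaussian (weights sum to 16).
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def gaussian_smooth(img):
    """Smooth a grayscale image given as nested lists indexed img[y][x]."""
    height, width = len(img), len(img[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    # Clamp coordinates so border pixels are replicated (assumption).
                    yy = min(max(y + ky - 1, 0), height - 1)
                    xx = min(max(x + kx - 1, 0), width - 1)
                    acc += KERNEL[ky][kx] * img[yy][xx]
            row.append(acc / 16)
        out.append(row)
    return out


impulse = [[0, 0, 0],
           [0, 160, 0],
           [0, 0, 0]]
smoothed = gaussian_smooth(impulse)
```

An isolated bright pixel, which is typical of noise, is spread over its neighbourhood and its peak is reduced, while a uniform area is left unchanged.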

The enhancement unit 404 is configured to apply enhancement processing to the edge information and gradient information based on various methods. Preferably, the edge information and gradient information can be enhanced by binarization and integration processing. The integration processing here may be any one of a process for obtaining a product set, a process for obtaining a maximum value, and a process for obtaining an average value. Preferably, the integration processing is a product-set process.

An apparatus 400’ for extracting a text stroke image from an image using the Sobel operator according to an embodiment of the present invention is described below with reference to FIG. 12.

FIG. 12 is a block diagram showing the apparatus 400’ for extracting a text stroke image from an image using the Sobel operator according to an embodiment of the present invention.

The apparatus 400’ can execute the method of extracting a text stroke image from an image using the Sobel operator according to the embodiment of the present invention described above with reference to FIGS. 3 to 10.

Specifically, like the apparatus 400, the apparatus 400’ comprises an information acquisition unit 402’, an enhancement unit 404’ and a stroke image acquisition unit 406’. Optionally, the apparatus 400’ may comprise a selection unit and a noise reduction unit (neither shown).

The selection unit is configured to select, based on at least one of the precision of the text strokes and the recall rate of the text strokes, either the first thin stroke image or the second thin stroke image as the final thin stroke image, and/or either the first thick stroke image or the second thick stroke image as the final thick stroke image. The specific selection processing has been described with reference to FIGS. 3 to 10 and is not repeated here.

The noise reduction unit is configured to filter the original image (that is, the image to be processed) to reduce noise. The specific noise reduction processing has been described with reference to FIGS. 3 to 10 and is not repeated here.

The information acquisition unit 402’ may comprise a convolution sub-unit 4022 and a division sub-unit 4024. The convolution sub-unit 4022 is configured to convolve the image data of the image with each of a plurality of Sobel convolution kernels for acquiring the edge information and gradient information of the image in a plurality of directions. The division sub-unit 4024 is configured to divide the convolution result for each direction into positive pulse response image data and negative pulse response image data for that direction.
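The two sub-units can be illustrated with a minimal sketch for a single direction. It uses the standard 3×3 Sobel kernel for the x-gradient, not necessarily the kernels defined in the specification, leaves border pixels at zero, and assumes that the division keeps positive values in one image and the magnitudes of negative values in the other. All function names are hypothetical.

```python
# Standard 3x3 Sobel kernel for the x-gradient (the specification's own kernels
# are defined elsewhere and are not reproduced here).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def apply_kernel(img, kernel):
    """Slide a 3x3 kernel over the image (cross-correlation); borders stay 0.

    A true convolution would flip the kernel, which for this antisymmetric
    kernel only reverses the sign of the response.
    """
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = sum(kernel[ky][kx] * img[y + ky - 1][x + kx - 1]
                            for ky in range(3) for kx in range(3))
    return out


def split_response(response):
    """Divide a signed response into positive and negative parts (assumed rule)."""
    pos = [[max(v, 0) for v in row] for row in response]
    neg = [[max(-v, 0) for v in row] for row in response]
    return pos, neg


# A bright vertical stroke two pixels wide on a dark background.
img = [[0, 0, 9, 9, 0, 0] for _ in range(3)]
response = apply_kernel(img, SOBEL_X)
pos, neg = split_response(response)
```

The positive image responds at one side of the stroke and the negative image at the other, which is what makes the subsequent offset-and-add step meaningful.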

The enhancement unit 404’ may comprise a first composite image acquisition sub-unit 4042 and/or a second composite image acquisition sub-unit 4044. The first composite image acquisition sub-unit 4042 is configured to obtain the first composite image data for each direction by offsetting the positive pulse response image data and the negative pulse response image data for that direction in facing directions and adding them. The second composite image acquisition sub-unit 4044 is configured to obtain the second composite image data for each direction by offsetting the positive pulse response image data and the negative pulse response image data for that direction in opposite directions and adding them.

前記筆画画像取得手段４０６’は、第１の２値化処理サブ手段４０６２と、第１の細い筆画画像取得サブ手段４０６４と、第２の細い筆画画像取得サブ手段４０６６と、第２の２値化処理サブ手段４０６８と、第１の太い筆画画像取得サブ手段４０６１０と、第２の太い筆画画像取得サブ手段４０６１２とを備えても良い。 The stroke image acquisition unit 406 ′ includes a first binarization processing sub unit 4062, a first thin stroke image acquisition sub unit 4064, a second thin stroke image acquisition sub unit 4066, and a second binary value. The image processing sub-unit 4068, the first thick stroke image acquisition sub-unit 40610, and the second thick stroke image acquisition sub-unit 40612 may be provided.

The first binarization sub-unit 4062 is configured to apply first binarization, using a preset first threshold, to the first composite image data and the second composite image data for each direction.

The first thin stroke image acquisition sub-unit 4064 is configured to perform integration processing on the first composite image data for each direction after the first binarization, thereby obtaining first thin stroke image data and, from it, the corresponding first thin stroke image.

The second thin stroke image acquisition sub-unit 4066 is configured to perform integration processing on the second composite image data for each direction after the first binarization, thereby obtaining second thin stroke image data and, from it, the corresponding second thin stroke image.

The second binarization sub-unit 4068 is configured to apply second binarization, using a preset second threshold smaller than the first threshold, to the first composite image data and the second composite image data for each direction.

The first thick stroke image acquisition sub-unit 40610 is configured to perform integration processing on the first composite image data for each direction after the second binarization, thereby obtaining first thick stroke image data and, from it, the corresponding first thick stroke image.

The second thick stroke image acquisition sub-unit 40612 is configured to perform integration processing on the second composite image data for each direction after the second binarization, thereby obtaining second thick stroke image data and, from it, the corresponding second thick stroke image.
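The two-threshold binarization and the per-direction merging described for the sub-units above might be sketched as follows. The three merge modes correspond to the options named in Appendix 4 (intersection, maximum value, average value); the function names, the use of ">=" as the threshold test, and the sample values are assumptions of this sketch.

```python
def binarize(img, threshold):
    """Mark pixels whose value reaches the threshold (the >= test is an assumption)."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


def integrate(masks, mode="max"):
    """Merge the per-direction binary masks pixel by pixel."""
    height, width = len(masks[0]), len(masks[0][0])
    merged = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [m[y][x] for m in masks]
            if mode == "intersection":
                row.append(min(values))
            elif mode == "max":
                row.append(max(values))
            elif mode == "mean":
                row.append(sum(values) / len(values))
            else:
                raise ValueError("unknown mode: " + mode)
        merged.append(row)
    return merged


def stroke_masks(composites, first_threshold, second_threshold, mode="max"):
    """Return (thin, thick) masks from the per-direction composite images."""
    if not second_threshold < first_threshold:
        raise ValueError("second threshold must be smaller than the first")
    thin = integrate([binarize(c, first_threshold) for c in composites], mode)
    thick = integrate([binarize(c, second_threshold) for c in composites], mode)
    return thin, thick


# Composite images for two directions, one row each (made-up values).
composites = [[[0, 30, 80]], [[0, 60, 90]]]
thin, thick = stroke_masks(composites, first_threshold=70, second_threshold=25)
```

Because the second threshold is lower, the thick-stroke mask retains weaker responses that the thin-stroke mask discards.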

Depending on requirements, the stroke image acquisition unit 406 may include only the first binarization sub-unit 4062, the first thin stroke image acquisition sub-unit 4064, and the second thin stroke image acquisition sub-unit 4066, or only the second binarization sub-unit 4068, the first thick stroke image acquisition sub-unit 40610, and the second thick stroke image acquisition sub-unit 40612. In other words, the stroke image acquisition unit 406 may be configured to acquire only thin stroke images or only thick stroke images as needed. This applies, for example, when the image to be processed contains text with only thick strokes or only thin strokes, or when only the text with thick strokes or only the text with thin strokes needs to be acquired.

Further, as noted above, if the offset direction that achieves the stroke enhancement effect can be determined in advance, the enhancement unit 404' may include, of the first composite image acquisition sub-unit 4042 and the second composite image acquisition sub-unit 4044, the one that corresponds to processing in that direction. Likewise, the stroke image acquisition unit 406' may include only whichever of the first thin stroke image acquisition sub-unit 4064 and the second thin stroke image acquisition sub-unit 4066 corresponds to processing in that direction, and only whichever of the first thick stroke image acquisition sub-unit 40610 and the second thick stroke image acquisition sub-unit 40612 corresponds to processing in that direction.

The apparatus 400' is configured, for example, to execute the method of extracting a text stroke image from an image using Sobel operators according to the embodiment of the present invention, described above with reference to FIGS. 3 to 10. For brevity, the first binarization sub-unit 4062, the first thin stroke image acquisition sub-unit 4064, the second thin stroke image acquisition sub-unit 4066, the second binarization sub-unit 4068, the first thick stroke image acquisition sub-unit 40610, and the second thick stroke image acquisition sub-unit 40612 are not described in detail here.

Optionally, the stroke image acquisition unit 406' may further include a filtering sub-unit (not shown). The filtering sub-unit is configured to filter, using a preset filtering condition, the data on connected regions in the first thick stroke edge image data and the second thick stroke edge image data, thereby obtaining image data of a first connected region and image data of a second connected region that satisfy the preset filtering condition.

The preset filtering condition includes at least one of the following: the gray-level variance of the pixels in the connected region is smaller than a preset variance threshold; the pixel polarity is consistent from the inner edge to the outer edge of the connected region; and the size of the connected region is within a preset size threshold. The preset filtering conditions have already been described with reference to FIGS. 3 to 10, so the description is not repeated here.
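Two of the filtering conditions above, the gray-level variance condition and the size condition, might be sketched as follows. The 4-connectivity, the use of the pixel count as the region size, the function names, and the sample thresholds are assumptions of this sketch. The polarity condition is omitted because checking it would require details that this passage does not give.

```python
from collections import deque


def connected_regions(mask):
    """Return each 4-connected region of 1-pixels as a list of (x, y) tuples."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, region = deque([(x0, y0)]), []
            while queue:  # breadth-first flood fill
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


def variance(values):
    """Population variance of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def keep_region(region, gray, max_variance, min_size, max_size):
    """Apply the size condition, then the gray-level variance condition."""
    if not min_size <= len(region) <= max_size:
        return False
    return variance([gray[y][x] for x, y in region]) < max_variance


mask = [[1, 1, 0, 1],
        [0, 0, 0, 1]]
gray = [[100, 102, 0, 10],
        [0, 0, 0, 200]]
regions = connected_regions(mask)
kept = [r for r in regions if keep_region(r, gray, 50, 2, 10)]
```

Here the left-hand region has nearly uniform gray levels and is kept, while the right-hand region spans very different gray levels and is rejected, reflecting the idea that the pixels of a genuine stroke tend to share a similar gray level.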

The method and apparatus for extracting a text stroke image from an image according to the embodiments of the present invention can achieve at least one of the following technical effects: faster processing; and, while the strokes are being extracted, suppression of noise caused by the poor quality of the processed video, together with insensitivity to stroke scale. In addition, the method and apparatus according to the embodiments of the present invention require no prior knowledge such as the text color or the contrast between the text and the background.

The basic principle of the present invention has been described above with reference to specific embodiments. Those skilled in the art will understand that all or any of the steps or components of the method and apparatus of the present invention can be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including a processor, a storage medium, and the like) or in a network of computing devices. Having read the description of the present invention, those skilled in the art can do so with basic programming skills.

Therefore, the object of the present invention can also be achieved by running a program or a set of programs on any computing device. The computing device may be a well-known general-purpose device. Accordingly, the object of the present invention can also be achieved merely by providing a program product containing program code that implements the method or apparatus. That is, such a program product, and a storage medium storing it, also constitute the present invention. The storage medium may of course be any kind of storage medium, whether already known or developed in the future.

When an embodiment of the present invention is implemented in software and/or firmware, the program constituting the software is installed from a storage medium or a network onto a computer having a dedicated hardware configuration, for example the general-purpose computer 1200 shown in FIG. 13. With the various programs installed, the computer can execute the various functions.

In FIG. 13, a central processing unit (CPU) 1201 executes various processes according to a program stored in a read-only memory (ROM) 1202 or a program loaded from a storage unit 1208 into a random access memory (RAM) 1203. The RAM 1203 also stores, as needed, the data that the CPU 1201 requires to execute the various processes. The CPU 1201, the ROM 1202, and the RAM 1203 are connected to one another via a bus 1204. An input/output interface 1205 is also connected to the bus 1204.

An input unit 1206 (including a keyboard, a mouse, and the like), an output unit 1207 (including a display such as a cathode-ray tube (CRT) or a liquid crystal display (LCD), a speaker, and the like), the storage unit 1208 (including a hard disk and the like), and a communication unit 1209 (including a network interface card such as a LAN card, a modem, and the like) are connected to the input/output interface 1205. The communication unit 1209 performs communication over a network such as the Internet. A drive 1210 is also connected to the input/output interface 1205 as needed. A removable medium 1211 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted in the drive 1210 as needed, so that a computer program read from it is installed in the storage unit 1208 as needed.

When the above series of processes is implemented in software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 1211.

Those skilled in the art should understand that such a storage medium is not limited to the removable medium 1211 shown in FIG. 13, which stores the program and is distributed separately from the device in order to provide the program to the user. Examples of the removable medium 1211 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disk (including a MiniDisc (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM 1202, a hard disk included in the storage unit 1208, or the like, which stores the program and is delivered to the user together with the device that contains it.

The present invention further discloses a program product storing computer-readable instruction code. When read and executed by a computer, the instruction code can carry out the method according to the embodiments of the present invention described above.

A storage medium carrying the program product that stores the computer-readable instruction code is likewise included in the disclosure of the present invention. Such storage media include, but are not limited to, floptical disks, optical disks, magneto-optical disks, memory cards, and memory sticks.

Those skilled in the art should understand that what is listed here is illustrative and that the present invention is not limited to it.

In this specification, expressions such as "first", "second", and "N-th" serve only to distinguish the described features verbally so that the present invention can be explained clearly. They therefore carry no limiting meaning.

By way of example, each step of the above method and each constituent module and/or unit of the above apparatus may be implemented as software, firmware, hardware, or a combination thereof, and may form part of the corresponding apparatus. The specific means or manner that can be used when the constituent modules and units of the above apparatus are configured as software, firmware, hardware, or a combination thereof are well known to those skilled in the art and are not described in detail here.

When implemented in software or firmware, the program constituting the software may, by way of example, be installed from a storage medium or a network onto a computer having a dedicated hardware configuration (for example, the general-purpose computer 1200 shown in FIG. 13). With the various programs installed, the computer can execute the various functions.

In the above description of specific embodiments of the present invention, a feature described and/or illustrated for one embodiment may be used in the same or a similar way in one or more other embodiments, combined with features of other embodiments, or substituted for features of other embodiments.

It should be emphasized that the term "comprises/includes", as used in this specification, denotes the presence of a feature, element, step, or component, and does not exclude the presence or addition of one or more other features, elements, steps, or components.

Further, the method of the present invention is not limited to being executed in the chronological order described in the specification; it may be executed in another chronological order, in parallel, or independently. The order of execution of the method described in this specification therefore does not limit the technical scope of the present invention.

以上のように、本発明の具体的な実施の形態の説明により本発明を開示したが、当業者は以下に付記される本願発明の要旨と範囲内に本発明に対する様々な変更、改善又は同等物を設計することができる。これら変更、改善又は同等物は、本発明の保護範囲内に含まれるものと認めるべきである。
（付記）
（付記１）
画像からテキスト筆画画像を抽出する方法であって、
前記画像のエッジ情報と勾配情報を取得するステップと、
取得されたエッジ情報と勾配情報に予め設けられた強調処理を行うことにより、前記画像においてテキストに関するエッジ情報と勾配情報を強調するステップと、
強調されたエッジ情報と勾配情報に対応するテキスト筆画画像を取得するステップと、
を含むことを特徴とする方法。
（付記２）
前記画像のエッジ情報と勾配情報を取得するステップは、
前記画像のエッジ情報と勾配情報を代表するステップ信号又はパルス信号を解析し、解析結果に基づいて前記エッジ情報と前記勾配情報を抽出することを含む、
付記１記載の方法。
（付記３）
前記画像のエッジ情報と勾配情報を取得するステップは、
画像の複数の方向におけるエッジ情報と勾配情報を取得するための複数のソーベル演算子のコンボリューションカーネルを利用して、前記画像の画像データと畳み込み演算をそれぞれ行い、
各方向に対する畳み込み演算の結果を各方向に対する正パルス応答画像データと負パルス応答画像データに区分することを含み、
前記画像においてテキストに関するエッジ情報と勾配情報を強調するステップは、
各方向に対する正パルス応答画像データと負パルス応答画像データに、対向する方向へのオフセット及び加算を実行して、各方向に対する第１の合成画像データを取得し、及び/又は、
各方向に対する正パルス応答画像データと負パルス応答画像データに、反対の方向へのオフセット及び加算を実行して、各方向に対する第２の合成画像データを取得することを含み、
前記強調されたエッジ情報と勾配情報に対応するテキスト筆画画像を取得するステップは、
予め設けられた第１の閾値を用いて各方向に対する第１の合成画像データ及び/又は第２の合成画像データに第１の２値化処理をそれぞれ行い、第１の２値化処理が行われた、各方向に対する第１の合成画像データに整合処理を行って第１の細い筆画画像データを取得することにより、前記第１の細い筆画画像データに対応する第１の細い筆画画像を取得し、及び/又は、第１の２値化処理が行われた、各方向に対する第２の合成画像データに整合処理を行って、対応する第２の細い筆画画像データを取得することにより、前記第２の細い筆画画像データに対応する第２の細い筆画画像を取得し、及び/又は、
予め設けられた、第１の閾値よりも小さい第２の閾値を用いて各方向に対する第１の合成画像データ及び/又は第２の合成画像データに第２の２値化処理をそれぞれ行い、第２の２値化処理が行われた、各方向に対する第１の合成画像データに整合処理を行って第１の太い筆画画像データを取得することにより、前記第１の太い筆画画像データに対応する第１の太い筆画画像を取得し、及び/又は、第２の２値化処理が行われた、各方向に対する第２の合成画像データに整合処理を行って、対応する第２の太い筆画画像データを取得することにより、前記第２の太い筆画画像データに対応する第２の太い筆画画像を取得することを含む、
付記２記載の方法。
（付記４）
前記整合処理は、積集合を求める処理と、最大値を求める処理と、平均値を求める処理との何れかを含む、
付記３記載の方法。
（付記５）
前記画像の複数の方向におけるエッジ情報と勾配情報を取得するための複数のソーベル演算子のコンボリューションカーネルを利用して前記画像の画像データと畳み込み演算をそれぞれ行うステップは、
画像の横方向、縦方向及び二つの対角線方向の四つの方向におけるエッジ情報と勾配情報を取得するための四つのソーベル演算子のコンボリューションカーネルを利用して前記画像の画像データと畳み込み演算をそれぞれ行うことを含む、
付記３又は４記載の方法。
（付記６）
前記対向する方向へのオフセット及び加算は、以下の式
I_h(x,y)=(I_h-positive(x,y-w/2)+I_h-negative(x,y+w/2))/2、
I_v(x,y)=(I_v-positive(x-w/2,y)+I_v-negative(x+w/2,y))/2、
I_rd(x,y)=(I_rd-positive(x+w/2,y-w/2)+I_rd-negative(x-w/2,y+w/2))/2、及び
I_ld(x,y)=(I_ld-positive(x-w/2,y-w/2)+I_ld-negative(x+w/2,y+w/2))/2、
に基づいて行われ、
前記の反対の方向へのオフセット及び加算は、以下の式
I_h’(x,y)=(I_h-positive(x,y+w/2)+I_h-negative(x,y-w/2))/2、
I_v’(x,y)=(I_v-positive(x+w/2,y)+I_v-negative(x-w/2,y))/2、
I_rd’(x,y)=(I_rd-positive(x-w/2,y+w/2)+I_rd-negative(x+w/2,y-w/2))/2、及び
I_ld’(x,y)=(I_ld-positive(x+w/2,y+w/2)+I_ld-negative(x-w/2,y-w/2))/2、
に基づいて行われ、
xは画素の横座標、yは前記画素の縦座標、I_h(x,y)、I_v(x,y)、I_rd(x,y)とI_ld(x,y)は前記画素の横方向、縦方向、及び二つの対角線方向の四つの方向に対する第１の合成画像データをそれぞれ示し、
I_h’(x,y)、I_v’(x,y)、I_rd’(x,y)及びI_ld’(x,y)は、前記画素の横方向、縦方向及び二つの対角線方向の四つの方向に対する第２の合成画像データをそれぞれ示し、
I_h-positive(x,y)、I_v-positive(x,y)、I_rd-positive(x,y)及びI_ld-positive(x,y)は、前記画素の横方向、縦方向及び二つの対角線方向の四つの方向に対する正パルス応答画像データをそれぞれ示し、
I_h-negative(x,y)、I_v-negative(x,y)、I_rd-negative(x,y)及びI_ld-negative(x,y)は、前記画素の横方向、縦方向及び二つの対角線方向の四つの方法に対する負パルス応答画像データをそれぞれ示し、wは予め推定された筆画幅である、
付記３乃至５の何れか記載の方法。
（付記７）
前記第１の太い筆画エッジ画像データと前記第２の太い筆画エッジ画像データにそれぞれ対応する第１の太い筆画画像と第２の太い筆画画像を取得するステップは、
予め設けられたフィルタリング条件を利用して前記第１の太い筆画エッジ画像データと前記第２の太い筆画エッジ画像データにおける連通領域データにフィルタリング処理をそれぞれ行うことにより、前記予め設けられたフィルタリング条件を満たす第１の連通領域の画像データと第２の連通領域の画像データを取得し、
前記第１の連通領域の画像データと前記第２の連通領域の画像データにそれぞれ対応する第１の太い筆画画像と第２の太い筆画画像を取得することを含む、
付記３乃至６の何れか記載の方法。
（付記８）
前記予め設けられたフィルタリング条件は、少なくとも、
前記連通領域内の画素の階調分散が予め設けられた分散の閾値よりも小さい条件と、
前記連通領域内の内側エッジから外側エッジまでの画素極性が一致する条件と、
前記連通領域の大きさが予め設けられた大きさの閾値内にある条件と、
のうちの一つを含む、
付記７記載の方法。
（付記９）
テキスト筆画の精度とテキスト筆画のリコールレートとのうちの少なくとも一つに基づいて、前記第１の細い筆画画像と前記第２の細い筆画画像の何れかを最終的な細い筆画画像として選別し、及び/又は、前記第１の太い筆画画像と前記第２の太い筆画画像の何れかを最終的な太い筆画画像として選別することを更に含む、
付記３乃至８の何れか記載の方法。
（付記１０）
画像からテキスト筆画画像を抽出する装置であって、
前記画像のエッジ情報と勾配情報を取得する情報取得手段と、
取得されたエッジ情報と勾配情報に予め設けられた強調処理を行うことにより、前記画像においてテキストに関するエッジ情報と勾配情報を強調する強調手段と、
強調されたエッジ情報と勾配情報に対応するテキスト筆画画像を取得する筆画画像取得手段と、
を備えることを特徴とする装置。
（付記１１）
前記情報取得手段は、前記画像のエッジ情報と勾配情報を代表するステップ信号又はパルス信号を解析し、解析結果に基づいて前記エッジ情報と前記勾配情報を抽出する、
付記１０記載の装置。
（付記１２）
前記情報取得手段は、
画像の複数の方向におけるエッジ情報と勾配情報を取得するための複数のソーベル演算子のコンボリューションカーネルを利用して、前記画像の画像データと畳み込み演算をそれぞれ行う畳み込み演算サブ手段と、
各方向に対する畳み込み演算の結果を各方向に対する正パルス応答画像データと負パルス応答画像データに区分する区分サブ手段とを備え、
前記強調手段は、
各方向に対する正パルス応答画像データと負パルス応答画像データに、対向する方向へのオフセット及び加算を実行して、各方向に対する第１の合成画像データを取得する第１の合成画像取得サブ手段、及び/又は、
各方向に対する正パルス応答画像データと負パルス応答画像データに、反対の方向へのオフセット及び加算を実行して、各方向に対する第２の合成画像データを取得する第２の合成画像取得サブ手段を備え、
前記筆画画像取得手段は、
予め設けられた第１の閾値を用いて各方向に対する第１の合成画像データ及び/又は第２の合成画像データに第１の２値化処理をそれぞれ行う第１の２値化処理サブ手段と、第１の２値化処理が行われた、各方向に対する第１の合成画像データに整合処理を行って第１の細い筆画画像データを取得することにより、前記第１の細い筆画画像データに対応する第１の細い筆画画像を取得する第１の細い筆画画像取得サブ手段と、及び/又は、第１の２値化処理が行われた、各方向に対する第２の合成画像データに整合処理を行って、対応する第２の細い筆画画像データを取得することにより、前記第２の細い筆画画像データに対応する第２の細い筆画画像を取得する第２の細い筆画画像取得サブ手段と、及び/又は、
予め設けられた、第１の閾値よりも小さい第２の閾値を用いて各方向に対する第１の合成画像データ及び/又は第２の合成画像データに第２の２値化処理をそれぞれ行う第２の２値化処理サブ手段と、第２の２値化処理が行われた、各方向に対する第１の合成画像データに整合処理を行って第１の太い筆画画像データを取得することにより、前記第１の太い筆画画像データに対応する第１の太い筆画画像を取得する第１の太い筆画画像取得サブ手段と、及び/又は、第２の２値化処理が行われた、各方向に対する第２の合成画像データに整合処理を行って、対応する第２の太い筆画画像データを取得することにより、前記第２の太い筆画画像データに対応する第２の太い筆画画像を取得する第２の太い筆画画像取得サブ手段とを備える、
付記１１記載の装置。
（付記１３）
前記整合処理は、積集合を求める処理と、最大値を求める処理と、平均値を求める処理との何れかを含む、
付記１２記載の装置。
（付記１４）
前記畳み込み演算サブ手段は、画像の横方向、縦方向及び二つの対角線方向の四つの方向におけるエッジ情報と勾配情報を取得するための四つのソーベル演算子のコンボリューションカーネルを利用して前記画像の画像データと畳み込み演算をそれぞれ行う、付記１２又は１３記載の装置。
（付記１５）
前記対向する方向へのオフセット及び加算は、以下の式
I_h(x,y)=(I_h-positive(x,y-w/2)+I_h-negative(x,y+w/2))/2、
I_v(x,y)=(I_v-positive(x-w/2,y)+I_v-negative(x+w/2,y))/2、
I_rd(x,y)=(I_rd-positive(x+w/2,y-w/2)+I_rd-negative(x-w/2,y+w/2))/2、及び
I_ld(x,y)=(I_ld-positive(x-w/2,y-w/2)+I_ld-negative(x+w/2,y+w/2))/2、
に基づいて行われ、
前記の反対する方向へのオフセット及び加算は、以下の式
I_h’(x,y)=(I_h-positive(x,y+w/2)+I_h-negative(x,y-w/2))/2、
I_v’(x,y)=(I_v-positive(x+w/2,y)+I_v-negative(x-w/2,y))/2、
I_rd’(x,y)=(I_rd-positive(x-w/2,y+w/2)+I_rd-negative(x+w/2,y-w/2))/2、及び
I_ld’(x,y)=(I_ld-positive(x+w/2,y+w/2)+I_ld-negative(x-w/2,y-w/2))/2、
に基づいて行われ、
xは画素の横座標、yは前記画素の縦座標、I_h(x,y)、I_v(x,y)、I_rd(x,y)及びI_ld(x,y)は、前記画素の横方向、縦方向、及び二つの対角線方向の四つの方向に対する第１の合成画像データをそれぞれ示し、
I_h’(x,y)、I_v’(x,y)、I_rd’(x,y)及びI_ld’(x,y)は、前記画素の横方向、縦方向及び二つの対角線方向の四つの方向に対する第２の合成画像データをそれぞれ示し、
I_h-positive(x,y)、I_v-positive(x,y)、I_rd-positive(x,y)及びI_ld-positive(x,y)は、前記画素の横方向、縦方向及び二つの対角線方向の四つの方向に対する正パルス応答画像データをそれぞれ示し、
I_h-negative(x,y)、I_v-negative(x,y)、I_rd-negative(x,y)及びI_ld-negative(x,y)は、前記画素の横方向、縦方向及び二つの対角線方向の四つの方法に対する負パルス応答画像データをそれぞれ示し、wは予め推定された筆画幅である、
付記１２乃至１４の何れか記載の装置。
（付記１６）
前記筆画画像取得手段は、更に、予め設けられたフィルタリング条件を利用して前記第１の太い筆画エッジ画像データと前記第２の太い筆画エッジ画像データにおける連通領域データにフィルタリング処理をそれぞれ行うことにより、前記予め設けられたフィルタリング条件を満たす第１の連通領域の画像データと第２の連通領域の画像データを取得するフィルタリングサブ手段を備える、
付記１２乃至１５の何れか記載の装置。
（付記１７）
前記予め設けられたフィルタリング条件は、少なくとも、前記連通領域内の画素の階調分散が予め設けられた分散の閾値よりも小さい条件と、前記連通領域内の内側エッジから外側エッジまでの画素極性が一致である条件と、前記連通領域の大きさが予め設けられた大きさの閾値内にある条件と、のうちの一つを含む、
付記１６記載の装置。
（付記１８）
テキスト筆画の精度とテキスト筆画のリコールレートとのうちの少なくとも一つに基づいて、前記第１の細い筆画画像と前記第２の細い筆画画像の何れかを最終的な細い筆画画像として選別し、及び/又は、前記第１の太い筆画画像と前記第２の太い筆画画像の何れかを最終的な太い筆画画像として選別する選別手段を更に備える、
付記１２乃至１７の何れか記載の装置。
（付記１９）
コンピュータに、付記１乃至９の何れか記載の、画像からテキスト筆画画像を抽出する方法を実行させるための命令を含むコンピュータプログラム。
（付記２０）
付記１９記載のコンピュータプログラムを記録したコンピュータ読み取り可能な記録媒体。
As described above, the present invention has been disclosed through the description of its specific embodiments. Those skilled in the art can nevertheless devise various modifications, improvements, or equivalents of the present invention within the spirit and scope of the invention set out in the appendices below. Such modifications, improvements, or equivalents should be regarded as falling within the scope of protection of the present invention.
(Appendix)
(Appendix 1)
A method for extracting a text stroke image from an image, the method comprising:
acquiring edge information and gradient information of the image;
enhancing the edge information and gradient information relating to text in the image by applying preset enhancement processing to the acquired edge information and gradient information; and
acquiring a text stroke image corresponding to the enhanced edge information and gradient information.
(Appendix 2)
The method according to Appendix 1, wherein the step of acquiring edge information and gradient information of the image includes analyzing a step signal or pulse signal representing the edge information and gradient information of the image, and extracting the edge information and the gradient information based on the analysis result.
(Appendix 3)
The method according to Appendix 2, wherein
the step of acquiring edge information and gradient information of the image includes:
convolving the image data of the image with the convolution kernels of a plurality of Sobel operators for acquiring edge information and gradient information in a plurality of directions of the image; and
dividing the convolution result for each direction into positive pulse response image data and negative pulse response image data for that direction,
the step of enhancing the edge information and gradient information relating to text in the image includes:
offsetting the positive pulse response image data and the negative pulse response image data for each direction in mutually facing directions and adding them to obtain first composite image data for each direction, and/or
offsetting the positive pulse response image data and the negative pulse response image data for each direction in the reverse directions and adding them to obtain second composite image data for each direction, and
the step of acquiring a text stroke image corresponding to the enhanced edge information and gradient information includes:
applying first binarization, using a preset first threshold, to the first composite image data and/or the second composite image data for each direction; performing integration processing on the first composite image data for each direction after the first binarization to obtain first thin stroke image data, thereby acquiring a first thin stroke image corresponding to the first thin stroke image data; and/or performing integration processing on the second composite image data for each direction after the first binarization to obtain second thin stroke image data, thereby acquiring a second thin stroke image corresponding to the second thin stroke image data; and/or
applying second binarization, using a preset second threshold smaller than the first threshold, to the first composite image data and/or the second composite image data for each direction; performing integration processing on the first composite image data for each direction after the second binarization to obtain first thick stroke image data, thereby acquiring a first thick stroke image corresponding to the first thick stroke image data; and/or performing integration processing on the second composite image data for each direction after the second binarization to obtain second thick stroke image data, thereby acquiring a second thick stroke image corresponding to the second thick stroke image data.
(Appendix 4)
The method according to Appendix 3, wherein the integration processing includes any one of obtaining an intersection, obtaining a maximum value, and obtaining an average value.
(Appendix 5)
The method according to Appendix 3 or 4, wherein the step of convolving the image data of the image with the convolution kernels of a plurality of Sobel operators includes convolving the image data of the image with the convolution kernels of four Sobel operators for acquiring edge information and gradient information in four directions of the image, namely the horizontal direction, the vertical direction, and the two diagonal directions.
(Appendix 6)
The method according to any one of Appendices 3 to 5, wherein
the offset and addition in the mutually facing directions are performed according to
\[
\begin{aligned}
I_{h}(x,y) &= \bigl(I_{h\text{-positive}}(x,\,y-w/2) + I_{h\text{-negative}}(x,\,y+w/2)\bigr)/2,\\
I_{v}(x,y) &= \bigl(I_{v\text{-positive}}(x-w/2,\,y) + I_{v\text{-negative}}(x+w/2,\,y)\bigr)/2,\\
I_{rd}(x,y) &= \bigl(I_{rd\text{-positive}}(x+w/2,\,y-w/2) + I_{rd\text{-negative}}(x-w/2,\,y+w/2)\bigr)/2,\\
I_{ld}(x,y) &= \bigl(I_{ld\text{-positive}}(x-w/2,\,y-w/2) + I_{ld\text{-negative}}(x+w/2,\,y+w/2)\bigr)/2,
\end{aligned}
\]
the offset and addition in the reverse directions are performed according to
\[
\begin{aligned}
I_{h}'(x,y) &= \bigl(I_{h\text{-positive}}(x,\,y+w/2) + I_{h\text{-negative}}(x,\,y-w/2)\bigr)/2,\\
I_{v}'(x,y) &= \bigl(I_{v\text{-positive}}(x+w/2,\,y) + I_{v\text{-negative}}(x-w/2,\,y)\bigr)/2,\\
I_{rd}'(x,y) &= \bigl(I_{rd\text{-positive}}(x-w/2,\,y+w/2) + I_{rd\text{-negative}}(x+w/2,\,y-w/2)\bigr)/2,\\
I_{ld}'(x,y) &= \bigl(I_{ld\text{-positive}}(x+w/2,\,y+w/2) + I_{ld\text{-negative}}(x-w/2,\,y-w/2)\bigr)/2,
\end{aligned}
\]
where $x$ is the abscissa of a pixel and $y$ is the ordinate of the pixel;
$I_{h}(x,y)$, $I_{v}(x,y)$, $I_{rd}(x,y)$, and $I_{ld}(x,y)$ denote the first composite image data for the four directions of the pixel, namely the horizontal direction, the vertical direction, and the two diagonal directions, respectively;
$I_{h}'(x,y)$, $I_{v}'(x,y)$, $I_{rd}'(x,y)$, and $I_{ld}'(x,y)$ denote the second composite image data for the same four directions, respectively;
$I_{h\text{-positive}}(x,y)$, $I_{v\text{-positive}}(x,y)$, $I_{rd\text{-positive}}(x,y)$, and $I_{ld\text{-positive}}(x,y)$ denote the positive pulse response image data for the same four directions, respectively;
$I_{h\text{-negative}}(x,y)$, $I_{v\text{-negative}}(x,y)$, $I_{rd\text{-negative}}(x,y)$, and $I_{ld\text{-negative}}(x,y)$ denote the negative pulse response image data for the same four directions, respectively; and
$w$ is a stroke width estimated in advance.
(Appendix 7)
The method according to any one of Appendices 3 to 6, wherein the step of acquiring the first thick stroke image and the second thick stroke image respectively corresponding to the first thick stroke edge image data and the second thick stroke edge image data includes:
filtering, using a preset filtering condition, the connected-region data in the first thick stroke edge image data and in the second thick stroke edge image data, thereby obtaining image data of a first connected region and image data of a second connected region that satisfy the preset filtering condition; and
acquiring the first thick stroke image and the second thick stroke image respectively corresponding to the image data of the first connected region and the image data of the second connected region.
(Appendix 8)
The method according to Appendix 7, wherein the preset filtering condition includes at least one of:
a condition that the gray-level variance of the pixels in the connected region is smaller than a preset variance threshold;
a condition that the pixel polarity is consistent from the inner edge to the outer edge in the connected region; and
a condition that the size of the connected region is within a preset size threshold.
(Appendix 9)
Further comprising selecting, based on at least one of the accuracy of the text strokes and the recall rate of the text strokes, one of the first thin stroke image and the second thin stroke image as a final thin stroke image, and/or one of the first thick stroke image and the second thick stroke image as a final thick stroke image.
The method according to any one of appendices 3 to 8.
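As an illustration of such a selection, the sketch below scores two candidate stroke masks against a reference mask and keeps the better one. The text does not state how accuracy or recall is evaluated, so the pixel-level definitions, the reference mask, the reading of "accuracy" as precision, and the function names are assumptions.

```python
def precision_recall(candidate, reference):
    """Pixel-level precision and recall of a binary mask against a reference."""
    tp = fp = fn = 0
    for cand_row, ref_row in zip(candidate, reference):
        for c, r in zip(cand_row, ref_row):
            if c and r:
                tp += 1
            elif c:
                fp += 1
            elif r:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def select_final(first, second, reference, criterion="precision"):
    """Return whichever candidate scores higher on the chosen criterion."""
    index = 0 if criterion == "precision" else 1
    score_first = precision_recall(first, reference)[index]
    score_second = precision_recall(second, reference)[index]
    return first if score_first >= score_second else second
```

With a reference of `[[0, 1, 1, 0]]`, the candidate `[[0, 1, 0, 0]]` wins on precision and the candidate `[[1, 1, 1, 0]]` wins on recall.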
(Appendix 10)
An apparatus for extracting a text stroke image from an image, the apparatus comprising:
information acquisition means for acquiring edge information and gradient information of the image;
emphasis means for emphasizing edge information and gradient information relating to text in the image by performing enhancement processing provided in advance on the acquired edge information and gradient information; and
stroke image acquisition means for acquiring a text stroke image corresponding to the emphasized edge information and gradient information.
(Appendix 11)
The information acquisition means analyzes a step signal or a pulse signal representing the edge information and gradient information of the image, and extracts the edge information and the gradient information based on the analysis result.
The apparatus according to appendix 10.
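One possible reading of the step and pulse signals, offered only as an assumption, is in terms of a one-dimensional intensity profile taken across the image. A plain edge gives a derivative of a single sign (a step), whereas a stroke crossed from side to side gives a pair of responses of opposite sign (a pulse). The sketch below uses a central difference as the derivative; the function names and the labels are hypothetical.

```python
def central_difference(row):
    """First derivative of a 1-D profile; the two end samples are set to 0."""
    inner = [row[i + 1] - row[i - 1] for i in range(1, len(row) - 1)]
    return [0] + inner + [0]


def classify_profile(row):
    """Label a profile 'pulse', 'step' or 'flat' from the derivative signs."""
    derivative = central_difference(row)
    has_positive = any(v > 0 for v in derivative)
    has_negative = any(v < 0 for v in derivative)
    if has_positive and has_negative:
        return "pulse"  # opposite-sign pair, as across a stroke
    if has_positive or has_negative:
        return "step"   # single transition, as across a plain edge
    return "flat"
```

For example, `[0, 0, 0, 9, 9, 9]` is classified as a step and `[9, 9, 0, 0, 9, 9, 9]` (a dark stroke on a light background) as a pulse.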
(Appendix 12)
The information acquisition means includes:
a convolution operation sub-means that performs convolution operations with the image data of the image using convolution kernels of a plurality of Sobel operators for acquiring edge information and gradient information in a plurality of directions of the image, respectively; and
a division sub-means that divides the result of the convolution operation for each direction into positive pulse response image data and negative pulse response image data for that direction,
the emphasis means includes:
a first composite image acquisition sub-means that performs offset and addition in opposite directions on the positive pulse response image data and the negative pulse response image data for each direction to acquire first composite image data for each direction; and/or
a second composite image acquisition sub-means that performs offset and addition in opposite directions on the positive pulse response image data and the negative pulse response image data for each direction to acquire second composite image data for each direction, and
the stroke image acquisition means includes:
a first binarization processing sub-means that performs, using a first threshold provided in advance, a first binarization process on the first composite image data and/or the second composite image data for each direction; a first thin stroke image acquisition sub-means that performs a matching process on the first composite image data for each direction that has undergone the first binarization process to obtain first thin stroke image data, and acquires a first thin stroke image corresponding to the first thin stroke image data; and/or a second thin stroke image acquisition sub-means that performs the matching process on the second composite image data for each direction that has undergone the first binarization process to obtain second thin stroke image data, and acquires a second thin stroke image corresponding to the second thin stroke image data; and/or
a second binarization processing sub-means that performs, using a second threshold provided in advance that is smaller than the first threshold, a second binarization process on the first composite image data and/or the second composite image data for each direction; a first thick stroke image acquisition sub-means that performs the matching process on the first composite image data for each direction that has undergone the second binarization process to obtain first thick stroke image data, and acquires a first thick stroke image corresponding to the first thick stroke image data; and/or a second thick stroke image acquisition sub-means that performs the matching process on the second composite image data for each direction that has undergone the second binarization process to obtain second thick stroke image data, and acquires a second thick stroke image corresponding to the second thick stroke image data.
The apparatus according to appendix 11.
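The relation between the two thresholds can be shown with a small sketch. Binarizing the same composite data with the higher first threshold keeps only the strongest responses (thin strokes), while the lower second threshold also keeps their weaker surroundings (thick strokes). The function name and the choice of `>=` as the comparison are assumptions.

```python
def binarize(image, threshold):
    """Return a 0/1 image: 1 where the value reaches the threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


# Hypothetical composite response across one stroke, strongest at its centre.
composite_row = [[1, 5, 9, 5, 1]]
thin = binarize(composite_row, 8)   # first (higher) threshold
thick = binarize(composite_row, 4)  # second (lower) threshold
```

Here `thin` marks only the centre pixel and `thick` marks the centre pixel together with its two neighbours, so every thin-stroke pixel is also a thick-stroke pixel.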
(Appendix 13)
The matching process includes any one of a process of taking the intersection, a process of taking the maximum value, and a process of taking the average value.
The apparatus according to appendix 12.
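The three alternatives named for the matching process can be sketched as a pixel-by-pixel merge of the per-direction images. The function name and the mode strings are hypothetical, and the text does not say how an averaged result is used afterwards.

```python
def merge_directions(images, mode="intersection"):
    """Merge equally sized images pixel by pixel.

    mode is one of "intersection" (1 only where every input is nonzero),
    "max" (largest value) or "average" (arithmetic mean).
    """
    height, width = len(images[0]), len(images[0][0])
    merged = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [image[y][x] for image in images]
            if mode == "intersection":
                row.append(1 if all(values) else 0)
            elif mode == "max":
                row.append(max(values))
            elif mode == "average":
                row.append(sum(values) / len(values))
            else:
                raise ValueError(f"unknown mode: {mode}")
        merged.append(row)
    return merged
```

For the two binary images `[[1, 1, 0]]` and `[[1, 0, 0]]`, the intersection is `[[1, 0, 0]]`, the maximum is `[[1, 1, 0]]` and the average is `[[1.0, 0.5, 0.0]]`.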
(Appendix 14)
The convolution operation sub-means performs convolution operations with the image data of the image using convolution kernels of four Sobel operators for acquiring edge information and gradient information in four directions of the image (the horizontal direction, the vertical direction and the two diagonal directions), respectively.
The apparatus according to appendix 12 or 13.
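A minimal sketch of the four-direction filtering and of the division into positive and negative responses follows. The kernel coefficients are the common 3x3 Sobel forms and are an assumption, since the text does not list them; the assignment of the keys `rd` and `ld` to particular diagonals is likewise assumed. The helper applies the kernel without flipping it (a correlation), which for these antisymmetric kernels differs from a true convolution only in the sign of the response.

```python
# Assumed 3x3 Sobel-style kernels, indexed as kernel[row][column].
KERNELS = {
    "h": [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],    # differentiates along y
    "v": [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],    # differentiates along x
    "rd": [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],   # one diagonal (assumed)
    "ld": [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],   # other diagonal (assumed)
}


def correlate3x3(image, kernel):
    """Apply a 3x3 kernel to a list-of-rows image; border pixels are left 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            out[r][c] = sum(kernel[i][j] * image[r + i - 1][c + j - 1]
                            for i in range(3) for j in range(3))
    return out


def split_polarity(response):
    """Divide a signed response into positive and negative response images."""
    positive = [[max(v, 0) for v in row] for row in response]
    negative = [[max(-v, 0) for v in row] for row in response]
    return positive, negative
```

For a dark vertical stroke on a light background, the `v` response is negative along one side of the stroke and positive along the other, so the two sides end up in different response images.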
(Appendix 15)
The offset and addition in opposite directions for acquiring the first composite image data are performed based on
$I_h(x, y) = \left( I_{h\text{-positive}}(x,\ y - w/2) + I_{h\text{-negative}}(x,\ y + w/2) \right) / 2$,
$I_v(x, y) = \left( I_{v\text{-positive}}(x - w/2,\ y) + I_{v\text{-negative}}(x + w/2,\ y) \right) / 2$,
$I_{rd}(x, y) = \left( I_{rd\text{-positive}}(x + w/2,\ y - w/2) + I_{rd\text{-negative}}(x - w/2,\ y + w/2) \right) / 2$, and
$I_{ld}(x, y) = \left( I_{ld\text{-positive}}(x - w/2,\ y - w/2) + I_{ld\text{-negative}}(x + w/2,\ y + w/2) \right) / 2$,
and the offset and addition in opposite directions for acquiring the second composite image data are performed based on
$I_h'(x, y) = \left( I_{h\text{-positive}}(x,\ y + w/2) + I_{h\text{-negative}}(x,\ y - w/2) \right) / 2$,
$I_v'(x, y) = \left( I_{v\text{-positive}}(x + w/2,\ y) + I_{v\text{-negative}}(x - w/2,\ y) \right) / 2$,
$I_{rd}'(x, y) = \left( I_{rd\text{-positive}}(x - w/2,\ y + w/2) + I_{rd\text{-negative}}(x + w/2,\ y - w/2) \right) / 2$, and
$I_{ld}'(x, y) = \left( I_{ld\text{-positive}}(x + w/2,\ y + w/2) + I_{ld\text{-negative}}(x - w/2,\ y - w/2) \right) / 2$,
where $x$ is the abscissa of the pixel and $y$ is the ordinate of the pixel,
$I_h(x, y)$, $I_v(x, y)$, $I_{rd}(x, y)$ and $I_{ld}(x, y)$ denote the first composite image data for the four directions of the pixel (the horizontal direction, the vertical direction and the two diagonal directions), respectively,
$I_h'(x, y)$, $I_v'(x, y)$, $I_{rd}'(x, y)$ and $I_{ld}'(x, y)$ denote the second composite image data for the same four directions, respectively,
$I_{h\text{-positive}}(x, y)$, $I_{v\text{-positive}}(x, y)$, $I_{rd\text{-positive}}(x, y)$ and $I_{ld\text{-positive}}(x, y)$ denote the positive pulse response image data for the same four directions, respectively,
$I_{h\text{-negative}}(x, y)$, $I_{v\text{-negative}}(x, y)$, $I_{rd\text{-negative}}(x, y)$ and $I_{ld\text{-negative}}(x, y)$ denote the negative pulse response image data for the same four directions, respectively, and $w$ is a stroke width estimated in advance.
The apparatus according to any one of appendices 12 to 14.
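The offset-and-addition formulas can be sketched as follows. Each composite pixel averages the positive response sampled half a stroke width away in one direction and the negative response sampled half a stroke width away in the opposite direction, and the second composite simply reverses both offsets. Treating samples outside the image as 0, rounding w/2 down to an integer, and all function and table names are assumptions made for illustration.

```python
# Unit offset (dx, dy) applied to the positive response in the first
# composite; the negative response uses the opposite offset.
FIRST_OFFSETS = {"h": (0, -1), "v": (-1, 0), "rd": (1, -1), "ld": (-1, -1)}


def sample(image, x, y):
    """Return image[y][x], or 0 outside the image (assumed border rule)."""
    if 0 <= y < len(image) and 0 <= x < len(image[0]):
        return image[y][x]
    return 0


def composite(positive, negative, direction, w, second=False):
    """Offset the two response images in opposite directions and average."""
    dx, dy = FIRST_OFFSETS[direction]
    if second:  # the second composite reverses both offsets
        dx, dy = -dx, -dy
    half = w // 2  # integer half stroke width (assumed rounding)
    height, width = len(positive), len(positive[0])
    return [[(sample(positive, x + dx * half, y + dy * half)
              + sample(negative, x - dx * half, y - dy * half)) / 2
             for x in range(width)]
            for y in range(height)]
```

With a positive response at x = 2, a negative response at x = 6 and w = 4, the first `v` composite peaks at x = 4, midway between the two responses. When the two responses are exactly w apart in the expected order, they reinforce each other at the stroke centre.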
(Appendix 16)
The stroke image acquisition means further includes a filtering sub-means that performs a filtering process on the connected-region data in the first thick stroke edge image data and the second thick stroke edge image data using a filtering condition provided in advance, thereby acquiring image data of a first connected region and image data of a second connected region that satisfy the filtering condition.
The apparatus according to any one of appendices 12 to 15.
(Appendix 17)
The filtering condition provided in advance includes at least one of: a condition that the gray-level variance of the pixels in the connected region is smaller than a predetermined variance threshold; a condition that the pixel polarities from the inner edge to the outer edge of the connected region match; and a condition that the size of the connected region is within a predetermined size threshold.
The apparatus according to appendix 16.
(Appendix 18)
Further comprising selecting means for selecting, based on at least one of the accuracy of the text strokes and the recall rate of the text strokes, one of the first thin stroke image and the second thin stroke image as a final thin stroke image, and/or one of the first thick stroke image and the second thick stroke image as a final thick stroke image.
The apparatus according to any one of appendices 12 to 17.
(Appendix 19)
A computer program comprising instructions for causing a computer to execute the method for extracting a text stroke image from an image according to any one of appendices 1 to 9.
(Appendix 20)
A computer-readable recording medium on which the computer program according to appendix 19 is recorded.

４００：画像からテキスト筆画画像を抽出する装置
４０２：情報取得手段
４０４：強調手段
４０６：筆画画像取得手段
４００’：画像からテキスト筆画画像を抽出する装置
４０２’：情報取得手段
４０２２：畳込み演算サブ手段
４０２４：区分サブ手段
４０４’：強調手段
４０４２：第１の合成画像取得サブ手段
４０４４：第２の合成画像取得サブ手段
４０６’：筆画画像取得手段
４０６２：第１の２値化処理サブ手段
４０６４：第１の細い筆画画像取得サブ手段
４０６６：第２の細い筆画画像取得サブ手段
４０６８：第２の２値化処理サブ手段
４０６１０：第１の太い筆画画像取得サブ手段
４０６１２：第２の太い筆画画像取得サブ手段

400: Apparatus for extracting a text stroke image from an image
402: Information acquisition means
404: Emphasis means
406: Stroke image acquisition means
400': Apparatus for extracting a text stroke image from an image
402': Information acquisition means
4022: Convolution operation sub-means
4024: Division sub-means
404': Emphasis means
4042: First composite image acquisition sub-means
4044: Second composite image acquisition sub-means
406': Stroke image acquisition means
4062: First binarization processing sub-means
4064: First thin stroke image acquisition sub-means
4066: Second thin stroke image acquisition sub-means
4068: Second binarization processing sub-means
40610: First thick stroke image acquisition sub-means
40612: Second thick stroke image acquisition sub-means

Claims

A method for extracting a text stroke image from an image, the method comprising the steps of:
acquiring edge information and gradient information of the image;
emphasizing edge information and gradient information relating to text in the image by performing enhancement processing provided in advance on the acquired edge information and gradient information; and
acquiring a text stroke image corresponding to the emphasized edge information and gradient information.

The step of acquiring edge information and gradient information of the image includes analyzing a step signal or a pulse signal representing the edge information and gradient information of the image, and extracting the edge information and the gradient information based on the analysis result.
The method of claim 1.

The step of acquiring edge information and gradient information of the image includes:
performing convolution operations with the image data of the image using convolution kernels of a plurality of Sobel operators for acquiring edge information and gradient information in a plurality of directions of the image, respectively; and
dividing the result of the convolution operation for each direction into positive pulse response image data and negative pulse response image data for that direction,
the step of emphasizing edge information and gradient information relating to text in the image includes:
performing offset and addition in opposite directions on the positive pulse response image data and the negative pulse response image data for each direction to acquire first composite image data for each direction, and/or
performing offset and addition in opposite directions on the positive pulse response image data and the negative pulse response image data for each direction to acquire second composite image data for each direction, and
the step of acquiring a text stroke image corresponding to the emphasized edge information and gradient information includes:
performing, using a first threshold provided in advance, a first binarization process on the first composite image data and/or the second composite image data for each direction; performing a matching process on the first composite image data for each direction that has undergone the first binarization process to obtain first thin stroke image data, thereby acquiring a first thin stroke image corresponding to the first thin stroke image data; and/or performing the matching process on the second composite image data for each direction that has undergone the first binarization process to obtain second thin stroke image data, thereby acquiring a second thin stroke image corresponding to the second thin stroke image data; and/or
performing, using a second threshold provided in advance that is smaller than the first threshold, a second binarization process on the first composite image data and/or the second composite image data for each direction; performing the matching process on the first composite image data for each direction that has undergone the second binarization process to obtain first thick stroke image data, thereby acquiring a first thick stroke image corresponding to the first thick stroke image data; and/or performing the matching process on the second composite image data for each direction that has undergone the second binarization process to obtain second thick stroke image data, thereby acquiring a second thick stroke image corresponding to the second thick stroke image data.
The method of claim 2.

The matching process includes any one of a process of taking the intersection, a process of taking the maximum value, and a process of taking the average value.
The method of claim 3.

The step of performing convolution operations with the image data of the image using convolution kernels of a plurality of Sobel operators for acquiring edge information and gradient information in a plurality of directions of the image, respectively, includes
performing convolution operations with the image data of the image using convolution kernels of four Sobel operators for acquiring edge information and gradient information in four directions of the image (the horizontal direction, the vertical direction and the two diagonal directions), respectively.
The method according to claim 3 or 4.

The offset and addition in opposite directions for acquiring the first composite image data are performed based on
$I_h(x, y) = \left( I_{h\text{-positive}}(x,\ y - w/2) + I_{h\text{-negative}}(x,\ y + w/2) \right) / 2$,
$I_v(x, y) = \left( I_{v\text{-positive}}(x - w/2,\ y) + I_{v\text{-negative}}(x + w/2,\ y) \right) / 2$,
$I_{rd}(x, y) = \left( I_{rd\text{-positive}}(x + w/2,\ y - w/2) + I_{rd\text{-negative}}(x - w/2,\ y + w/2) \right) / 2$, and
$I_{ld}(x, y) = \left( I_{ld\text{-positive}}(x - w/2,\ y - w/2) + I_{ld\text{-negative}}(x + w/2,\ y + w/2) \right) / 2$,
and the offset and addition in opposite directions for acquiring the second composite image data are performed based on
$I_h'(x, y) = \left( I_{h\text{-positive}}(x,\ y + w/2) + I_{h\text{-negative}}(x,\ y - w/2) \right) / 2$,
$I_v'(x, y) = \left( I_{v\text{-positive}}(x + w/2,\ y) + I_{v\text{-negative}}(x - w/2,\ y) \right) / 2$,
$I_{rd}'(x, y) = \left( I_{rd\text{-positive}}(x - w/2,\ y + w/2) + I_{rd\text{-negative}}(x + w/2,\ y - w/2) \right) / 2$, and
$I_{ld}'(x, y) = \left( I_{ld\text{-positive}}(x + w/2,\ y + w/2) + I_{ld\text{-negative}}(x - w/2,\ y - w/2) \right) / 2$,
where $x$ is the abscissa of the pixel and $y$ is the ordinate of the pixel,
$I_h(x, y)$, $I_v(x, y)$, $I_{rd}(x, y)$ and $I_{ld}(x, y)$ denote the first composite image data for the four directions of the pixel (the horizontal direction, the vertical direction and the two diagonal directions), respectively,
$I_h'(x, y)$, $I_v'(x, y)$, $I_{rd}'(x, y)$ and $I_{ld}'(x, y)$ denote the second composite image data for the same four directions, respectively,
$I_{h\text{-positive}}(x, y)$, $I_{v\text{-positive}}(x, y)$, $I_{rd\text{-positive}}(x, y)$ and $I_{ld\text{-positive}}(x, y)$ denote the positive pulse response image data for the same four directions, respectively,
$I_{h\text{-negative}}(x, y)$, $I_{v\text{-negative}}(x, y)$, $I_{rd\text{-negative}}(x, y)$ and $I_{ld\text{-negative}}(x, y)$ denote the negative pulse response image data for the same four directions, respectively, and $w$ is a stroke width estimated in advance.
The method according to any one of claims 3 to 5.

The step of acquiring the first thick stroke image data and the second thick stroke image data corresponding to the first thick stroke edge image data and the second thick stroke edge image data, respectively, includes:
performing a filtering process on the connected-region data in the first thick stroke edge image data and the second thick stroke edge image data using a filtering condition provided in advance, thereby acquiring image data of a first connected region and image data of a second connected region that satisfy the filtering condition; and
acquiring a first thick stroke image and a second thick stroke image corresponding to the image data of the first connected region and the image data of the second connected region, respectively.
The method according to claim 3.

The filtering condition provided in advance includes at least one of:
a condition that the gray-level variance of the pixels in the connected region is smaller than a predetermined variance threshold;
a condition that the pixel polarities from the inner edge to the outer edge of the connected region match; and
a condition that the size of the connected region is within a predetermined size threshold.
The method of claim 7.

Further comprising selecting, based on at least one of the accuracy of the text strokes and the recall rate of the text strokes, one of the first thin stroke image and the second thin stroke image as a final thin stroke image, and/or one of the first thick stroke image and the second thick stroke image as a final thick stroke image.
The method according to any one of claims 3 to 8.

An apparatus for extracting a text stroke image from an image, the apparatus comprising:
information acquisition means for acquiring edge information and gradient information of the image;
emphasis means for emphasizing edge information and gradient information relating to text in the image by performing enhancement processing provided in advance on the acquired edge information and gradient information; and
stroke image acquisition means for acquiring a text stroke image corresponding to the emphasized edge information and gradient information.