JP2023034823A

JP2023034823A - Image processing apparatus, and control method, and program for image processing apparatus

Info

Publication number: JP2023034823A
Application number: JP2021141259A
Authority: JP
Inventors: 泰輔石黒; Taisuke Ishiguro
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-08-31
Filing date: 2021-08-31
Publication date: 2023-03-13

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus, a control method for an image processing apparatus, and a program that identify and remove ruled lines without reducing the accuracy of character recognition when handwritten characters are superimposed on the ruled lines.

SOLUTION: In an information processing system comprising a reading apparatus and an information processing apparatus, the image processing apparatus includes: extracting means for extracting pixels corresponding to handwritten characters 302 from an image and obtaining a non-handwritten character image 305 from which the extracted pixels have been removed; generating means for generating, based on the non-handwritten character image and the pixels corresponding to the extracted handwritten characters, ruled line determination information that includes at least some of the handwritten character pixels; specifying means for specifying ruled line pixels from the ruled line determination information generated by the generating means; and removing means for removing the ruled line pixels specified by the specifying means from the non-handwritten character image 305.

SELECTED DRAWING: Figure 3

Description

The present invention relates to an image processing apparatus, a control method for an image processing apparatus, and a program.

Techniques for performing character recognition on document images read by a scanner or camera are known. As one approach to character recognition, a technique is known in which printed type and handwritten characters are separated in order to improve recognition accuracy, and recognition processing specialized for each character type, such as printed or handwritten characters, is performed.
Patent Document 1 discloses a technique that applies preprocessing for specifying the recognition target to a document image and then separates handwritten characters and printed characters from the preprocessed image.

Patent Document 1: JP 2006-107534 A (Japanese Patent Application Laid-Open No. 2006-107534)

However, the technique described in Patent Document 1 does not consider the case where handwritten characters are superimposed on ruled lines. Specifically, the technique described in Patent Document 1 removes ruled lines, which are noise for character recognition, in preprocessing for character recognition. If a ruled line on which handwritten characters are superimposed is removed, however, an image is generated in which part of the handwritten characters is cut apart. When part of a handwritten character is cut apart, the accuracy of the handwritten character separation process and of handwritten character recognition may be lowered.

The present invention has been made in view of the above problem, and its object is to make it possible to identify and remove ruled lines, when handwritten characters are superimposed on the ruled lines, without lowering the accuracy of character recognition.

An image processing apparatus according to the present invention includes: extraction means for extracting pixels corresponding to handwritten characters from an image and obtaining a non-handwritten character image from which the extracted pixels have been removed; generating means for generating, based on the non-handwritten character image and the pixels corresponding to the extracted handwritten characters, ruled line determination information that includes at least some of the pixels of the handwritten characters; specifying means for specifying ruled line pixels from the ruled line determination information generated by the generating means; and removing means for removing the ruled line pixels specified by the specifying means from the non-handwritten character image.

According to the present invention, when handwritten characters are superimposed on ruled lines, the ruled lines can be identified and removed without lowering the accuracy of character recognition.

FIG. 1 is a block diagram showing a configuration example of an information processing system according to the first embodiment.
FIG. 2 is a flowchart showing an example of OCR processing according to the first embodiment.
FIG. 3 is a diagram showing an example of a document image according to the first embodiment.
FIG. 4 is a flowchart showing an example of ruled line removal processing according to the first embodiment.
FIG. 5 is a flowchart showing an example of processing for generating ruled line determination information according to the first embodiment.
FIG. 6 is a diagram showing an example of a document image generated by the ruled line removal processing according to the first embodiment.
FIG. 7 is a diagram showing an example of image data generated by the related information acquisition processing according to the first embodiment.
FIG. 8 is a flowchart showing an example of related information acquisition processing for a ruled line candidate area according to the second embodiment.
FIG. 9 is a diagram showing an example of image data generated by the related information acquisition processing according to the second embodiment.

(First embodiment)
Embodiments for carrying out the present invention are described below with reference to the drawings.
FIG. 1 is a block diagram showing a configuration example of an information processing system according to the first embodiment.
The information processing system 1 includes a reading apparatus 100 and an information processing apparatus 110. The reading apparatus 100 includes a scanner 101 and a communication unit 102. The scanner 101 reads a document and generates a scanned image. The communication unit 102 communicates with external apparatuses such as the information processing apparatus 110 via a network.

The information processing apparatus 110 includes a system control unit 111, a ROM 112, a RAM 113, an HDD 114, a display unit 115, an input unit 116, and a communication unit 117. The information processing apparatus 110 according to this embodiment is an example of an image processing apparatus.
The system control unit 111 reads a control program stored in the ROM 112 and executes various processes. The RAM 113 is used as temporary storage, such as the main memory and work area of the system control unit 111. The HDD 114 stores various data and programs. The functions and processes of the information processing apparatus 110 described later are realized by the system control unit 111 reading a program stored in the ROM 112 or the HDD 114 and executing it.

The communication unit 117 communicates with external apparatuses such as the reading apparatus 100 via a network. The display unit 115 displays various information. The input unit 116 has a keyboard and a mouse and accepts various user operations. The display unit 115 and the input unit 116 may be provided as a single unit, as in a touch panel. Alternatively, the display unit 115 may project an image with a projector, and the input unit 116 may use a camera to recognize the position of a fingertip relative to the projected image.
In this embodiment, the scanner 101 of the reading apparatus 100 reads a paper document such as a form and generates a scanned image. The communication unit 102 transmits the scanned image data to the information processing apparatus 110. In the information processing apparatus 110, the communication unit 117 receives the scanned image data, and the received data is stored in a storage device such as the HDD 114.

Next, an example of OCR processing in this embodiment is described with reference to the flowchart of FIG. 2. The processing of this flowchart is realized by the system control unit 111 of the information processing apparatus 110 executing a program stored in the ROM 112. The processing starts when the user operates the reading apparatus 100 to scan a paper document with the scanner 101.
First, in S200, the system control unit 111 receives, via the communication unit 117, the scanned image data that the scanner 101 generated by scanning a paper document in accordance with a user instruction and then transmitted. The system control unit 111 then stores the received scanned image data in a storage unit such as the HDD 114. FIG. 3(a) shows an example of a scanned image 301 obtained by the processing of S200. The scanned image 301 includes handwritten characters 302 and ruled lines 303.

In S210, the system control unit 111 reads the scanned image stored in the storage unit such as the HDD 114 and performs binarization processing on it. The system control unit 111 then stores the resulting binary image in the RAM 113. Binarization is a process that quantizes image data consisting of multilevel gradation values into binary data such as 0 and 1, and any known binarization method can be used. The discriminant analysis method is one such known method. FIG. 3(b) shows an example of a binary image 304 obtained by binarizing the scanned image 301. The handwritten characters 302 and the ruled lines 303 are also binarized.
The binarization method above is only an example; any specific method may be used as long as it quantizes the image into binary data. For example, a threshold for the gradation values may be determined in advance and used to generate the binary data.
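As an illustration of the discriminant analysis method mentioned above (commonly known as Otsu's method), the following minimal sketch picks the threshold that maximizes the between-class variance of the gray-level histogram. The function names, the 8-bit input range, and the convention that dark pixels map to 1 are assumptions made for this example and are not specified in the source.

```python
def otsu_threshold(gray):
    """Return the threshold (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    count0 = 0  # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = sum0 / count0
        mean1 = (sum_all - sum0) / count1
        # Proportional to the between-class variance (constant factor dropped).
        between_var = count0 * count1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(gray, threshold):
    """Map dark pixels (value <= threshold) to 1 and light pixels to 0 (assumed convention)."""
    return [[1 if value <= threshold else 0 for value in row] for row in gray]


# Tiny grayscale example: three dark pixels (ink) and three light pixels (paper).
gray = [[10, 10, 200],
        [12, 220, 210]]
binary = binarize(gray, otsu_threshold(gray))
```

The predetermined-threshold variant described in the text corresponds to calling `binarize` with a fixed value instead of the computed one.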

In S220, the system control unit 111 performs handwritten character pixel extraction on the binary image generated in S210. The system control unit 111 then generates a non-handwritten character image, obtained by removing the extracted handwritten character pixels from the binary image, and a handwritten character image consisting only of the handwritten character pixels, and stores both in the RAM 113. FIG. 3(c) shows an example of the result of applying the extraction processing to the binary image 304. The extraction processing generates a non-handwritten character image 305 and a handwritten character image 306 from the binary image 304.
Because the handwritten character pixels have been removed from the non-handwritten character image 305, the ruled line is cut apart in the portion 307 where the handwritten characters overlapped the ruled line. FIG. 3(d) shows the area corresponding to the portion 307 before and after the processing of S220. When the handwritten character pixels are removed from the partial image 308 excerpted from the binary image 304, the result is a partial image 309 of the non-handwritten character image, in which the ruled line is finely divided.
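The effect described above can be reproduced on a toy image. The sketch below assumes that some extractor (for example, a deep learning model) has already produced a per-pixel handwriting mask; the function name and the 0/1 list-of-lists image representation are hypothetical choices for this example.

```python
def split_handwriting(binary, handwriting_mask):
    """Split a binary image into (non_handwritten, handwritten) images.

    Both inputs are lists of rows holding 0/1 values; the mask marks pixels
    that the extractor classified as handwriting.
    """
    handwritten = [[b & m for b, m in zip(b_row, m_row)]
                   for b_row, m_row in zip(binary, handwriting_mask)]
    non_handwritten = [[b & (1 - m) for b, m in zip(b_row, m_row)]
                       for b_row, m_row in zip(binary, handwriting_mask)]
    return non_handwritten, handwritten


# A horizontal ruled line (middle row) crossed by a vertical handwritten stroke.
binary = [[0, 0, 1, 0, 0],
          [1, 1, 1, 1, 1],
          [0, 0, 1, 0, 0]]
mask = [[0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0]]
non_handwritten, handwritten = split_handwriting(binary, mask)
# The ruled line in non_handwritten now has a gap where the stroke crossed it.
```

The gap in the middle row of `non_handwritten` corresponds to the divided ruled line shown in partial image 309.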

Handwritten character pixel extraction is a process that extracts only the pixels corresponding to handwritten characters from the scanned image, and any known extraction method can be used. For example, methods based on deep learning are known. Deep learning is only one example; any specific method may be used as long as handwritten character pixels can be extracted from the scanned image. For example, connected pixels may be analyzed, and connected pixels that have the characteristics of handwritten characters may be identified as handwritten character pixels. A technique that analyzes connected pixels to determine handwritten characters is disclosed in, for example, Japanese Patent Application Laid-Open No. H10-162102.

In S230, the system control unit 111 performs ruled line removal processing on the non-handwritten character image generated in S220, using the handwritten character image, and stores the resulting image in the RAM 113 as a printed character image. Details of the ruled line removal processing are described later with reference to FIG. 4. FIG. 3(e) shows an example of the printed character image obtained by applying the ruled line removal processing to the non-handwritten character image 305. The pixels corresponding to ruled lines have been removed by the ruled line removal processing.
In S240, the system control unit 111 performs handwritten OCR processing on the handwritten character image generated in S220 and stores the result in the RAM 113 as a handwritten OCR result. Handwritten OCR processing identifies the character codes corresponding to the characters contained in a handwritten character image, and any known handwritten OCR method can be used. For example, a handwritten character recognition method using pattern matching is known and is disclosed in, for example, Japanese Patent Application Laid-Open No. H10-124618. This handwritten character recognition processing is only an example; any specific method may be used as long as it identifies the character codes of the portions of the image that correspond to characters.

In S250, the system control unit 111 performs printed character OCR processing on the printed character image generated in S230 and stores the result in the RAM 113 as a printed character OCR result. Printed character OCR processing identifies the character codes corresponding to the printed characters contained in a printed character image, and any known printed character OCR method can be used. For example, as with handwritten OCR, a printed character recognition method using pattern matching is known. This printed character recognition processing is only an example; any specific method may be used as long as it identifies the character codes of the portions of the image that correspond to characters.
In S260, the system control unit 111 integrates the handwritten character recognition result generated in S240 and the printed character recognition result generated in S250, and stores the integrated result in the RAM 113 as the OCR result for the scanned image.

With the above processing, even for a form in which handwritten and printed characters are mixed and ruled lines are present, OCR processing suited to each of the handwritten and printed characters can be applied, and OCR accuracy can be improved.
In the above processing, the handwritten OCR processing of S240 and the printed character OCR processing of S250 may be performed in a different order. The handwritten OCR processing may be performed at any time after the handwritten character pixel extraction of S220 has finished and before the OCR result integration of S260. Similarly, the printed character OCR processing may be performed at any time after the ruled line removal of S230 has finished and before the OCR result integration of S260. For example, the expected result is obtained even if the order of the ruled line removal processing of S230 and the handwritten OCR processing of S240 is swapped.

Next, details of the ruled line removal processing in S230 of FIG. 2 are described with reference to the flowchart of FIG. 4. FIG. 4 is a flowchart showing an example of the ruled line removal processing in this embodiment. The processing of this flowchart is realized by the system control unit 111 of the information processing apparatus 110 executing a program stored in the ROM 112.

In S400, the system control unit 111 refers to the RAM 113 and acquires the non-handwritten character image and the handwritten character image. The system control unit 111 then analyzes the acquired non-handwritten character image and handwritten character image to generate ruled line determination information and stores it in the RAM 113. The ruled line determination information is an image consisting only of ruled line candidates.

FIG. 6(a) shows an example of the ruled line determination information. As shown in area 601, the ruled line candidates include, in addition to the ruled lines, some of the handwritten character pixels that overlap the ruled lines. By merging the pixels of the handwritten character image that relate to a ruled line candidate into the ruled line candidate image taken from the non-handwritten character image, a ruled line on which handwritten characters are superimposed is also represented in the image without being cut apart. The extra handwritten character pixels included along with the ruled lines are determined not to be ruled lines by the ruled line pixel specifying process of S410 described later, so it is not a problem that some handwritten character pixels are included as ruled line candidates at this stage.
Details of the ruled line determination information generation processing are described later with reference to FIG. 5.

In S410, the system control unit 111 refers to the RAM 113 and acquires the ruled line determination information. The system control unit 111 then analyzes the acquired ruled line determination information to specify the ruled lines to be removed and stores the specified ruled line information in the RAM 113.
The ruled line pixel specifying process applies thinning and thickening to the image that serves as the ruled line determination information and analyzes the connected pixel blocks in the result. Specifically, for example, the image is processed by thinning followed by thickening so that components shorter than the length assumed for a ruled line disappear. That is, thinning is applied by the minimum length assumed for a ruled line, which removes the partial handwritten character pixels, which are not ruled line pixels. Thickening by the same amount is then applied to the thinned result, so that the ruled lines that did not disappear are restored to their original pixel size.
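Thinning followed by thickening with the same size can be read as a morphological opening (erosion, then dilation). The sketch below is one possible interpretation for horizontal ruled lines only, using a 1 x `k` horizontal window, where `k` stands in for the minimum assumed ruled line length. The function names and the window shape are assumptions for this example; the source does not specify a structuring element.

```python
def erode_h(img, k):
    """Thinning: keep a pixel only if it starts a horizontal run of k set pixels."""
    width = len(img[0])
    return [[1 if x + k <= width and all(row[x:x + k]) else 0
             for x in range(width)] for row in img]


def dilate_h(img, k):
    """Thickening: expand every set pixel k pixels to the right (mirror of erode_h)."""
    width = len(img[0])
    out = [[0] * width for _ in img]
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            if value:
                for dx in range(k):
                    if x + dx < width:
                        out[y][x + dx] = 1
    return out


def open_h(img, k):
    """Remove horizontal runs shorter than k and restore the surviving runs."""
    return dilate_h(erode_h(img, k), k)


# Ruled line determination image: a 7-pixel ruled line plus parts of a
# handwritten stroke above and below it.
candidates = [[0, 0, 0, 1, 0, 0, 0],
              [1, 1, 1, 1, 1, 1, 1],
              [0, 0, 0, 1, 0, 0, 0]]
ruled = open_h(candidates, 4)
# Only the middle row survives; the stroke fragments are gone.
```

Vertical ruled lines could be handled the same way by transposing the image before and after the call.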

Next, connected pixel blocks are extracted from the image generated by the thinning and thickening processes, and the shape and size of each extracted block are used to determine whether the block can be a ruled line. For example, a connected pixel block that is long and narrow can be determined to be a ruled line.
In the ruled line pixel specifying process, straight line detection can also be used for cases such as table ruled lines, where the connected pixel block is large and does not have a long, narrow shape. Specifically, for example, straight line detection is applied to the extracted connected pixel block, and the lengths, number, and arrangement of the detected straight lines are used to determine whether the block constitutes table ruled lines.
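The shape-and-size test for connected pixel blocks might look like the following sketch. It labels 8-connected blocks with a breadth-first search and treats a block as a ruled line when its bounding box is long and narrow. The thresholds `min_length` and `min_aspect` are hypothetical values chosen for this example, and the straight line detection used for table ruled lines is not covered here.

```python
from collections import deque


def connected_components(img):
    """Return the 8-connected blocks of set pixels as lists of (y, x) tuples."""
    height, width = len(img), len(img[0])
    seen = [[False] * width for _ in range(height)]
    blocks = []
    for y in range(height):
        for x in range(width):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            block = []
            while queue:
                cy, cx = queue.popleft()
                block.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            blocks.append(block)
    return blocks


def is_elongated(block, min_length=5, min_aspect=4.0):
    """Judge a block to be a ruled line if its bounding box is long and narrow."""
    ys = [y for y, _ in block]
    xs = [x for _, x in block]
    box_w = max(xs) - min(xs) + 1
    box_h = max(ys) - min(ys) + 1
    long_side, short_side = max(box_w, box_h), min(box_w, box_h)
    return long_side >= min_length and long_side / short_side >= min_aspect


img = [[1, 1, 1, 1, 1, 1, 1, 1],   # long, narrow block: ruled line
       [0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 1, 0, 0, 0],   # compact block: not a ruled line
       [0, 0, 0, 1, 1, 0, 0, 0]]
flags = [is_elongated(block) for block in connected_components(img)]
```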

In this embodiment, connected pixel blocks are analyzed using an image obtained by applying the thinning and thickening processes to the non-handwritten character image, but the analysis may instead be performed on an image to which these processes have not been applied. For example, when a dashed line is present, the thinning process may cause the pixel blocks that make up the dashed line to disappear; using an image without thinning and thickening avoids this problem. Specifically, even if a single connected pixel block viewed on its own is only a minute dot, when surrounding connected pixel blocks with similar features are arranged in a straight line at regular intervals, the group of connected pixel blocks is determined to be a ruled line candidate. This analysis of connected pixel blocks makes it possible to identify ruled line candidates even when a ruled line is finely divided, as with a dashed line.
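One way to express the dashed-line criterion (small blocks with similar features, arranged in a straight line at regular intervals) is sketched below for the horizontal case. The input is a list of bounding boxes of already-labeled blocks; the function name and the tolerance parameters are hypothetical and would need tuning for real scans.

```python
def is_dashed_line(boxes, max_height=3, tol=1, min_count=3):
    """Return True if the boxes look like a horizontal dashed line.

    boxes: bounding boxes (x0, y0, x1, y1) of small connected pixel blocks.
    The blocks must be thin, similar in width, aligned on the same row
    (within tol), and separated by regular gaps (within tol).
    """
    if len(boxes) < min_count:
        return False
    boxes = sorted(boxes)  # left to right by x0
    heights = [y1 - y0 + 1 for _, y0, _, y1 in boxes]
    widths = [x1 - x0 + 1 for x0, _, x1, _ in boxes]
    centers_y = [(y0 + y1) / 2 for _, y0, _, y1 in boxes]
    gaps = [right[0] - left[2] - 1 for left, right in zip(boxes, boxes[1:])]
    return (max(heights) <= max_height
            and max(widths) - min(widths) <= tol
            and max(centers_y) - min(centers_y) <= tol
            and max(gaps) - min(gaps) <= tol)


# Three 3-pixel dashes on row 5, separated by 2-pixel gaps.
dashes = [(0, 5, 2, 5), (5, 5, 7, 5), (10, 5, 12, 5)]
# Three blocks of the same size scattered across the page.
scattered = [(0, 0, 2, 0), (5, 9, 7, 9), (20, 3, 22, 3)]
results = (is_dashed_line(dashes), is_dashed_line(scattered))
```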

FIG. 6(b) shows an example of the ruled line information specified by applying the ruled line pixel specifying process to the ruled line determination information shown in FIG. 6(a). The area 601, which contained some handwritten character pixels in FIG. 6(a), is also specified as a ruled line 602 without handwritten character pixels.
In S420, the system control unit 111 refers to the RAM 113 and acquires the non-handwritten character image and the ruled line information. The system control unit 111 then removes the ruled line pixels indicated by the ruled line information from the non-handwritten character image, generates a non-handwritten character image from which the ruled lines have been removed, and stores it in the RAM 113.
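The removal step of S420 amounts to clearing, in the non-handwritten character image, every pixel that the ruled line information marks. A minimal sketch follows, assuming that the ruled line information is a 0/1 mask of the same size as the image; the function name is hypothetical.

```python
def remove_ruled_lines(non_handwritten, ruled_line_mask):
    """Clear every pixel of the non-handwritten image that the mask marks as ruled line."""
    return [[p & (1 - r) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(non_handwritten, ruled_line_mask)]


# Row 0 holds printed character pixels; row 2 holds a ruled line that was
# already cut at x = 2 by the removal of a handwritten stroke.
non_handwritten = [[0, 1, 1, 0, 0],
                   [0, 0, 0, 0, 0],
                   [1, 1, 0, 1, 1]]
# The specified ruled line is continuous because it was determined from the
# ruled line determination information, not from the cut image.
ruled_line_mask = [[0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0],
                   [1, 1, 1, 1, 1]]
printed = remove_ruled_lines(non_handwritten, ruled_line_mask)
```

Both fragments of the cut ruled line are removed, while the printed character pixels in row 0 are left untouched.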

Next, details of the ruled line determination information generation processing in S400 of FIG. 4 are described with reference to the flowchart of FIG. 5. FIG. 5 is a flowchart showing an example of the process of acquiring the related information of ruled line candidate areas in this embodiment. The processing of this flowchart is realized by the system control unit 111 of the information processing apparatus 110 executing a program stored in the ROM 112.
In S500, the system control unit 111 refers to the RAM 113 and acquires the non-handwritten character image. The system control unit 111 then analyzes the non-handwritten character image to specify areas that are ruled line candidates, and stores the specified ruled line candidate areas and a ruled line candidate image containing those areas in the RAM 113.

The ruled line candidate specifying process is realized by analyzing connected pixel blocks in the non-handwritten character image. Specifically, for example, connected pixel blocks are extracted from the non-handwritten character image, and the shape and size of each extracted block are used to determine whether the block can be a ruled line candidate. As in the ruled line pixel specifying process in S410 of FIG. 4, a connected pixel block that is long and narrow can be specified as a ruled line candidate. For a large connected pixel block, straight line detection can likewise be used, as in S410 of FIG. 4, to determine whether the block constitutes table ruled lines.
The ruled line candidate specifying process also determines ruled line candidates from a viewpoint different from that of the ruled line pixel specifying process in S410 of FIG. 4. Specifically, for example, when minute connected pixel blocks are lined up in a straight line, the group of connected pixel blocks is also determined to be a ruled line candidate. As a result, ruled line pixels in the non-handwritten character image that have been finely divided by overlapping handwritten character pixels can also be extracted as ruled line candidates.

Next, the circumscribed rectangle of each connected pixel block or group of connected pixel blocks specified as a ruled line candidate by the above processing is obtained, and the area formed by adding a predetermined margin to the circumscribed rectangle is set as a ruled line candidate area.
FIG. 7(a) shows an example of a ruled line candidate image obtained by applying the ruled line candidate area specifying process to the non-handwritten character image 305 of FIG. 3(c). Areas 701 and 702 are the areas specified as ruled line candidates. A ruled line divided by handwritten character pixels is also specified as a ruled line candidate, as a group of connected pixel blocks, and the area 702 is likewise set as a ruled line candidate area.
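Setting a ruled line candidate area from a circumscribed rectangle and a margin can be sketched as follows. Clipping the area to the image bounds is an assumption added for this example, as are the function name and the `(x0, y0, x1, y1)` return format.

```python
def candidate_area(pixels, margin, width, height):
    """Return (x0, y0, x1, y1): the circumscribed rectangle of the pixels,
    expanded by margin on every side and clipped to the image bounds.

    pixels: (y, x) coordinates of a connected pixel block or block group.
    """
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return (max(min(xs) - margin, 0),
            max(min(ys) - margin, 0),
            min(max(xs) + margin, width - 1),
            min(max(ys) + margin, height - 1))


# Fragments of one divided ruled line on row 5 of a 10 x 10 image.
fragments = [(5, 0), (5, 1), (5, 4), (5, 5), (5, 9)]
area = candidate_area(fragments, margin=1, width=10, height=10)
```

With a margin of 1 the area spans rows 4 to 6, so it covers the ruled line and the handwritten pixels immediately around it.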

In S510, the system control unit 111 refers to the RAM 113 and acquires the ruled-line candidate area and the handwritten character image. Furthermore, the system control unit 111 acquires the pixel information of the handwritten character image related to the ruled-line candidate area, and stores it in the RAM 113 as related information of the ruled-line candidate area.
The pixel information of the handwritten character image related to the ruled-line candidate area is acquired by taking, from the handwritten character image, the pixels of the area corresponding to the ruled-line candidate area.
For example, the area of the handwritten character image corresponding to area 702 in FIG. 7(a) becomes the related information 703 of the ruled-line candidate area in FIG. 7(b). On the other hand, the part of the handwritten character image corresponding to area 701 in FIG. 7(a) contains no valid black pixels, so it is not set as pixel information of a related handwritten character image.
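The acquisition of related information described for S510 amounts to cropping the handwritten character image to the ruled-line candidate area. A minimal sketch follows, assuming a binary image stored as a list of rows (1 for a black pixel) and a hypothetical function name related_info; returning None for an area without black pixels mirrors the case of area 701.

```python
def related_info(handwriting, area):
    """Crop the handwritten character image to a ruled-line candidate area.

    handwriting: binary image as a list of rows (1 = black pixel).
    area: (top, left, bottom, right) with inclusive coordinates.
    Returns the cropped patch, or None when the patch has no black pixel.
    """
    top, left, bottom, right = area
    patch = [row[left:right + 1] for row in handwriting[top:bottom + 1]]
    if not any(1 in row for row in patch):
        return None  # nothing to register as related information
    return patch
```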

In S520, the system control unit 111 refers to the RAM 113 and acquires the ruled-line candidate image and the related information of the ruled-line candidate area. Furthermore, the system control unit 111 integrates the ruled-line candidate image with the related information of the ruled-line candidate area, and stores the integrated result image in the RAM 113.
FIG. 7(b) is a diagram for explaining an example of the integration processing of S520. In FIG. 7(b), the integrated result image 705 is the result of integrating the ruled-line candidate image 704 corresponding to area 702 of FIG. 7(a) with the related information 703 of the ruled-line candidate area corresponding to area 702 of FIG. 7(a).
Note that, in the present embodiment, it is sufficient to set the margin of the ruled-line candidate area within a range that covers the width and height expected of a ruled line. If a wide margin is used, handwritten character pixels with low relevance are more likely to be treated as related information, which may lower the accuracy of the ruled-line pixel specifying process shown in S410 of FIG. 4.
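The source does not state how the two images are combined in S520. One plausible reading, used here purely as an assumption, is a pixel-wise OR of the ruled-line candidate patch and the related handwritten pixels, so that the gaps left by removed handwriting are filled in again. The function name integrate is hypothetical.

```python
def integrate(candidate_patch, related_patch):
    """Pixel-wise OR of a ruled-line candidate patch and its related information.

    Both patches are binary images (lists of rows, 1 = black) of equal size.
    When there is no related information, a copy of the candidate is returned.
    """
    if related_patch is None:
        return [row[:] for row in candidate_patch]
    return [
        [c | r for c, r in zip(c_row, r_row)]
        for c_row, r_row in zip(candidate_patch, related_patch)
    ]
```

In this sketch a ruled line interrupted at one column becomes continuous again once the handwritten stroke crossing it is merged back, which is the property the subsequent ruled-line pixel specification relies on.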

As described above, according to the present embodiment, even when handwritten characters are superimposed on ruled lines, ruled-line candidates are specified from the non-handwritten character image, and the specified ruled-line candidate information is used to integrate part of the handwritten character image into the non-handwritten character image, so that ruled lines can be removed with high accuracy.

(Second embodiment)
In the first embodiment, the handwritten character image falling within the ruled-line candidate area was treated as related information of the ruled-line candidate area. However, instead of simply using the handwritten character image falling within the ruled-line candidate area as related information, the relevance between the ruled-line pixels and the handwritten character pixels included in the ruled-line candidate area may be used to determine whether or not to finally adopt it as related information.
This case will be described with reference to FIG. 8. FIG. 8 is a flowchart showing the ruled line determination information generation process in this embodiment, and shows the details of the ruled line determination information generation process in S400 of FIG. 4. Processing other than the ruled line determination information generation process in S400 of FIG. 4 is the same as in the first embodiment, and its description is therefore omitted.

S800 is the same processing as S500 in FIG. 5, and its description is therefore omitted.
S810 is the same processing as S510 in FIG. 5, and its description is therefore omitted.
In S820, the system control unit 111 refers to the RAM 113 and acquires the ruled-line pixels included in the ruled-line candidate area and the pixel information of the handwritten character image related to the ruled-line candidate area. Furthermore, the system control unit 111 analyzes the acquired pixel information of the handwritten character image (hereinafter simply referred to as related information in this embodiment) and determines whether or not to adopt it as the final related information. When the system control unit 111 determines not to adopt it as the final related information, it deletes the corresponding related information from the RAM 113.

The related information adoption determination process is performed using the relevance between the ruled-line pixels included in the ruled-line candidate area and the related information. Specifically, for example, the system control unit 111 determines whether or not the ruled-line pixels and the related information are connected as pixels. If they are connected, the system control unit 111 determines that they are relevant. If they are not connected, the system control unit 111 determines that they are not relevant.
An example of the related information adoption determination process will be described below with reference to FIG. 9. FIG. 9(a) is a diagram showing an example of a partial image extracted from a part of a document image where handwritten characters are superimposed on a ruled line. FIG. 9(b) is a diagram showing an example of the handwritten character image for the partial image of FIG. 9(a). FIG. 9(c) is a diagram showing an example of the ruled-line candidate image for the partial image of FIG. 9(a), which includes a ruled-line candidate area 901. FIG. 9(d) is a diagram showing an example of the related information of the ruled-line candidate area 901 acquired in S820, which also includes a part 902 of a handwritten character pixel block that is actually unrelated to the ruled line.

In the related information adoption determination process, the part 902 of the handwritten character pixel block is not connected to any ruled-line pixel, so the system control unit 111 determines that it is not appropriate as related information and deletes it from the related information stored in the RAM 113. FIG. 9(e) is an example of the result of applying the related information adoption determination process to FIG. 9(d). As described above, the part 902 of the handwritten character pixel block has been deleted in FIG. 9(e).
S830 is the same processing as S520 in FIG. 5, and its description is therefore omitted.
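The connectivity-based adoption determination of S820 can be sketched as follows. This is an illustration under stated assumptions rather than the embodiment's implementation: images are binary lists of rows (1 for a black pixel), 8-connectivity is assumed because the source does not specify the neighbourhood, and the function name filter_related is hypothetical.

```python
from collections import deque

def filter_related(ruled, related):
    """Keep only handwritten pixel blocks that touch at least one ruled-line pixel.

    ruled, related: binary images of equal size (lists of rows, 1 = black).
    A block of `related` is kept when any of its pixels overlaps or is
    8-adjacent to a pixel of `ruled`; all other blocks are dropped.
    """
    h, w = len(related), len(related[0])
    seen = [[False] * w for _ in range(h)]
    kept = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if related[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            block, touches = [], False
            while queue:
                cy, cx = queue.popleft()
                block.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if not (0 <= ny < h and 0 <= nx < w):
                            continue
                        if ruled[ny][nx] == 1:
                            touches = True  # connected to a ruled-line pixel
                        if related[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if touches:
                for by, bx in block:
                    kept[by][bx] = 1
    return kept
```

A stroke that crosses the ruled line is retained, whereas a fragment such as part 902, which lies inside the candidate area but touches no ruled-line pixel, is removed before integration.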

As described above, when the related information of a ruled-line candidate area is acquired, determining whether or not to adopt it based on its relevance to the ruled-line pixels included in the ruled-line candidate area reduces the possibility that part of a handwritten character is extracted as a ruled-line candidate. For example, as illustrated with reference to FIG. 9, when only a part of the handwritten character pixels is considered, that part may form an elongated line shape and have features similar to those of a ruled line. If related information with features similar to those of a ruled line is integrated and the ruled-line specifying process in S410 of FIG. 4 is then performed, it may be erroneously extracted as a ruled line. Applying the processing of the present embodiment to this problem makes it possible to perform the ruled-line specifying process after removing related information that is unrelated to the ruled line.

(Other embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

111 System control unit

Claims

1. An image processing apparatus comprising:
extracting means for extracting pixels corresponding to handwritten characters from an image and obtaining a non-handwritten character image from which the extracted pixels are removed;
generating means for generating ruled line determination information including at least part of the pixels of the handwritten characters, based on the non-handwritten character image and the extracted pixels corresponding to the handwritten characters;
specifying means for specifying ruled line pixels from the ruled line determination information generated by the generating means; and
removing means for removing the ruled line pixels specified by the specifying means from the non-handwritten character image.

2. The image processing apparatus according to claim 1, wherein the generating means specifies a ruled-line candidate area based on the shape and size of a pixel block that is a ruled-line candidate included in the non-handwritten character image, extracts pixels of handwritten characters included in the specified ruled-line candidate area as related information, and combines the extracted related information with the non-handwritten character image to generate the ruled line determination information.

3. The image processing apparatus according to claim 2, wherein the generating means further specifies, as the ruled-line candidate area, a pixel block group in which pixel blocks included in the non-handwritten character image are arranged in a straight line.

4. The image processing apparatus according to claim 2, wherein the generating means determines the relevance of pixels of the extracted related information to ruled-line pixels included in the ruled-line candidate area, deletes pixels determined to have no relevance from the related information, and generates the ruled line determination information.

5. The image processing apparatus according to claim 4, wherein, in the relevance determination, when pixels of handwritten characters included in the ruled-line candidate area are not connected to ruled-line pixels included in the ruled-line candidate area, the generating means deletes the unconnected handwritten character pixels from the related information and generates the ruled line determination information.

6. A control method for an image processing apparatus, the method comprising:
an extracting step of extracting pixels corresponding to handwritten characters from an image and obtaining a non-handwritten character image from which the extracted pixels are removed;
a generating step of generating ruled line determination information including at least part of the pixels of the handwritten characters, based on the non-handwritten character image and the extracted pixels corresponding to the handwritten characters;
a specifying step of specifying ruled line pixels from the ruled line determination information generated in the generating step; and
a removing step of removing the ruled line pixels specified in the specifying step from the non-handwritten character image.

7. A program that causes a computer to execute:
an extracting step of extracting pixels corresponding to handwritten characters from an image and obtaining a non-handwritten character image from which the extracted pixels are removed;
a generating step of generating ruled line determination information including at least part of the pixels of the handwritten characters, based on the non-handwritten character image and the extracted pixels corresponding to the handwritten characters;
a specifying step of specifying ruled line pixels from the ruled line determination information generated in the generating step; and
a removing step of removing the ruled line pixels specified in the specifying step from the non-handwritten character image.