JP2021144526A

JP2021144526A - Image processing device, image reader device, determination method, program, and storage medium

Info

Publication number: JP2021144526A
Application number: JP2020043304A
Authority: JP
Inventors: 真彦高島; Masahiko Takashima
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2020-03-12
Filing date: 2020-03-12
Publication date: 2021-09-24

Abstract

To provide an image processing device or the like capable of appropriately determining the orientation of an input image. SOLUTION: An image processing device is provided, comprising: an extraction unit configured to extract, from an input image, an image of an area including parts constituting a character as an area image; a detection unit configured to detect, from the area image, a partial image showing the parts constituting the character; and a determination unit configured to determine the orientation of the input image on the basis of the detected partial image. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing device and the like.

In recent years, digital image processing systems have advanced remarkably, and digital image processing technology continues to be developed. For example, in the field of copiers and multifunction printers (MFPs) that use electrophotographic or inkjet methods, a document is read by a scanner and saved as a document file, which is electronic data, and the document file is compressed and sent by e-mail.

Recently, multifunction devices equipped with an automatic document feeder (ADF) have become widespread. By setting the documents to be read in the ADF, the user can have them conveyed to the scanner automatically and in sequence, so that the image of each document is read one side at a time. A multifunction device with an ADF relieves the user of the burden of repositioning the side to be read on the platen every time a document is scanned.

On the other hand, when a user has a multifunction device read multiple documents at once, the resulting document file becomes less convenient unless the documents are all oriented in a direction that is easy for the user to read. If they are not, the user must rotate the displayed document images when the document file obtained from the multifunction device is shown on a display device or the like. Yet manually aligning the direction of the documents set in the ADF is laborious for the user. Moreover, some documents are printed on both sides with long-edge binding and others with short-edge binding, and when these are mixed it is impossible for the user to align the direction of all the documents by hand alone.

In view of this situation, a widely known technique uses, for example, optical character recognition (OCR) to identify the images of the individual characters contained in the image input by reading a document (the input image), and takes the direction in which the identification accuracy is high as the direction of the input image. By incorporating software that implements this technique into a multifunction device or the like, the direction can be determined automatically for each input image. Furthermore, by rotating the input image according to the determined direction, the input image can be aligned to the correct page direction, that is, the direction in which the user is most likely to perceive the characters as not overturned (as being the right way up).

For example, a technique has been proposed that performs character recognition on the input image in four directions (0°, 90°, 180°, and 270°) and takes the direction that yields the highest certainty as the direction of the input image (see, for example, Patent Document 1).
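The four-direction trial described above can be sketched roughly as follows. This is a minimal illustration, not the method of Patent Document 1: the `confidence` callable stands in for an OCR engine's certainty score and is purely hypothetical, as are the function names and the representation of an image as a list of pixel rows.

```python
from typing import Callable, List

Image = List[List[int]]  # assumed representation: rows of pixel values


def rotate90_cw(img: Image) -> Image:
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rotate_cw(img: Image, degrees: int) -> Image:
    """Rotate clockwise by a multiple of 90 degrees."""
    for _ in range((degrees // 90) % 4):
        img = rotate90_cw(img)
    return img


def best_orientation(img: Image, confidence: Callable[[Image], float]) -> int:
    """Return the clockwise rotation (0, 90, 180, or 270) whose rotated
    image receives the highest score from the hypothetical recognizer."""
    return max((0, 90, 180, 270), key=lambda d: confidence(rotate_cw(img, d)))
```

The cost of this approach lies in `confidence`: it must recognize characters in every language the device is expected to handle, which is the drawback the description goes on to raise.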

As a technique that does not require OCR, it has also been proposed to divide the black pixels judged to be characters in the input image into blocks, detect for each block a vector indicating the directionality of the black pixels, and thereby determine the direction of the input image (see, for example, Patent Document 2).

A further proposed technique obtains, from glyph elements extracted from the input image, the characteristics of the distribution of feature points that match a predetermined pixel pattern for each relative position, and determines the direction of the input image by comparing those characteristics with distribution characteristics obtained in advance from training image data (see, for example, Patent Document 3).

Patent Document 1: Japanese Patent Application Laid-Open No. H11-120321
Patent Document 2: Japanese Patent Application Laid-Open No. 2007-65864
Patent Document 3: Japanese Patent Application Laid-Open No. 2009-020884

To improve user convenience, a multifunction device needs to determine the direction of the input image appropriately. However, the technique disclosed in Patent Document 1 incurs the cost of adapting the OCR to the language of each region in which the multifunction device is used. The technique disclosed in Patent Document 2 requires character recognition based on a character database, and the database becomes enormous when a wide range of languages must be supported. With the technique disclosed in Patent Document 3, characters with many strokes and complex shapes, such as kanji, show little difference in distribution characteristics between directions, so the direction is difficult to identify for Asian languages.

In view of the above problems, an object of the present invention is to provide an image processing device or the like capable of appropriately determining the direction of an input image.

To solve the above problems, an image processing device of the present invention comprises:
an extraction unit that extracts, from an input image, an image of an area containing elements that constitute characters as an area image;
a detection unit that detects, from the area image, a partial image showing an element that constitutes a character; and
a determination unit that determines the direction of the input image based on the detected partial image.

An image reading device of the present invention comprises:
the image processing device described above; and
an image input unit that reads an image and inputs the read image to the image processing device as an input image.

A determination method of the present invention includes:
a step of extracting, from an input image, an image of an area containing elements that constitute characters as an area image;
a step of detecting, from the area image, a partial image showing an element that constitutes a character; and
a step of determining the direction of the input image based on the detected partial image.

A program of the present invention causes a computer to implement:
a function of extracting, from an input image, an image of an area containing elements that constitute characters as an area image;
a function of detecting, from the area image, a partial image showing an element that constitutes a character; and
a function of determining the direction of the input image based on the detected partial image.

A recording medium of the present invention is a computer-readable recording medium on which the above program is recorded.

According to the present invention, the direction of an input image can be determined appropriately.

A diagram for explaining the functional configuration of the image forming apparatus in the first embodiment.
A diagram for explaining the functional configuration of the direction correction unit in the first embodiment.
A flowchart showing the flow of direction correction processing in the first embodiment.
A flowchart showing the flow of direction analysis processing in the first embodiment.
A diagram showing an operation example in the first embodiment.
A diagram showing an operation example in the first embodiment.
A diagram showing an operation example in the first embodiment.
A diagram showing an operation example in the first embodiment.
A diagram showing an operation example in the first embodiment.
A diagram showing an operation example in the first embodiment.
A diagram showing an operation example in the second embodiment.
A diagram showing an operation example in the second embodiment.
A flowchart showing the flow of direction correction processing in the third embodiment.
A flowchart showing the flow of punctuation mark direction analysis processing in the third embodiment.
A diagram showing an operation example in the third embodiment.
A diagram showing an operation example in the third embodiment.
A flowchart showing the flow of direction correction processing in the fourth embodiment.
A diagram for explaining the functional configuration of the image reading device in the fifth embodiment.
A diagram showing an operation example in the sixth embodiment.
A diagram showing an operation example in the seventh embodiment.

Embodiments of the present invention are described below with reference to the drawings. In these embodiments, an image forming apparatus provided with an image processing device to which the present invention is applied is described as an example.

[1. First Embodiment]
[1.1 Functional configuration]
First, the functional configuration of the image forming apparatus 1 is described with reference to FIG. 1. As shown in FIG. 1, the image forming apparatus 1 includes an image input device 10, an image processing device 20, an image output device 30, a transmission/reception device 40, an operation panel 50, a storage unit 60, and a control unit 70.

The various processes executed by the image forming apparatus 1 are carried out by the control unit 70 controlling the image input device 10, the image processing device 20, the image output device 30, the transmission/reception device 40, the operation panel 50, and the storage unit 60. The control unit 70 is composed of, for example, a CPU (Central Processing Unit). The control unit 70 also performs data communication with computers, other digital multifunction devices, and the like connected to a network via a network card, a LAN (Local Area Network) cable, or the like.

[1.1.1 Image reading device]
The image input device 10 is a device that optically reads an image from a document. The image input device 10 is composed of, for example, a color scanner having a CCD (Charge Coupled Device). The image input device 10 uses the CCD to read the reflected light image from the document as RGB (R: red, G: green, B: blue) analog signals.

The image input device 10 is connected to the image processing device 20 and outputs the RGB analog signals it has read to the image processing device 20.

[1.1.2 Image processing device]
The image processing device 20 processes the image data of an input image. Here, an input image is an image input to the image processing device 20, for example, an image read by the image input device 10 or an image received from an external device by the transmission/reception device 40. The image processing device 20 also generates compressed files for transmission to external devices and generates image data for images to be formed (output) on a recording sheet or the like. The image processing device 20 may be connected to the storage unit 60 and store the processed image data in the storage unit 60.

Connected to the image processing device 20 are the image output device 30, which outputs an image based on the image data generated by the image processing device 20, and the transmission/reception device 40, which transmits the compressed file generated by the image processing device 20 to an external device.

Each processing unit in the image processing device 20 is described below. The image processing device 20 generates image data consisting of RGB digital signals (hereinafter, RGB signals) by applying image processing to the RGB analog signals input from the image input device 10 in the A/D conversion unit 202, the shading correction unit 204, the document type determination unit 206, the ACS determination unit 208, the direction correction unit 210, the input gradation correction unit 212, and the area separation processing unit 214.

The A/D conversion unit 202 receives the RGB analog signals input from the image input device 10 to the image processing device 20 and converts them into RGB digital signals (that is, RGB signals). The A/D conversion unit 202 then outputs the converted RGB signals to the shading correction unit 204.

The shading correction unit 204 performs shading correction on the RGB signals input from the A/D conversion unit 202. As shading correction, the shading correction unit 204 removes from the RGB signals, for example, the various distortions that arise in the illumination system, the image-forming optical system, and the image-sensing system of the image input device 10. The shading correction unit 204 then outputs the RGB signals from which the distortions have been removed to the document type determination unit 206.

The document type determination unit 206 uses the RGB signals input from the shading correction unit 204 to perform document type determination processing, which determines the document mode, such as text, photograph, or photographic paper. The document type determination unit 206 also outputs the input RGB signals to the input gradation correction unit 212. The result of the document type determination processing is reflected in the image processing performed by the subsequent processing units.

The ACS determination unit 208 determines, from the RGB signals input from the document type determination unit 206, whether the input image is to be output as a color image containing chromatic colors or as a monochrome image containing no chromatic colors. The ACS determination unit 208 also outputs the input RGB signals to the direction correction unit 210.

Based on the RGB signals input from the ACS determination unit 208, the direction correction unit 210 determines the direction of the input image and executes direction correction processing that corrects that direction.

Here, the direction of the input image means the most appropriate direction to regard as the direction in which the characters in the input image are facing. The appropriate direction is, for example, the direction that appears most often among the characters in the input image. Specifically, when most of the characters in the input image appear rotated 90° counterclockwise, the direction correction unit 210 determines that the direction of the input image is the direction rotated 90° counterclockwise.
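The "most frequent direction" rule stated above can be illustrated with a short sketch. It assumes that a direction has already been estimated for each character and is expressed as a counterclockwise angle in degrees; the function name and that input format are hypothetical, and the text does not say how ties are resolved.

```python
from collections import Counter
from typing import Iterable


def page_direction(char_directions_ccw: Iterable[int]) -> int:
    """Return the counterclockwise angle (in degrees) that occurs most often
    among the per-character direction estimates."""
    counts = Counter(d % 360 for d in char_directions_ccw)
    if not counts:
        raise ValueError("no character directions were supplied")
    # most_common(1) yields the single most frequent angle; ties fall back to
    # first-seen order, which is an arbitrary choice made for this sketch.
    return counts.most_common(1)[0][0]
```

For instance, `page_direction([90, 90, 0, 90, 180])` selects 90, matching the case in which most characters appear rotated 90° counterclockwise.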

Correcting the direction of the input image means rotating the input image (that is, applying a rotation to the RGB signals) so that, when the input image is displayed or output, the user perceives as few overturned characters as possible. For example, if the direction of the input image is the direction rotated 90° counterclockwise, the direction correction unit 210 corrects it by rotating the input image 90° clockwise. As a result, most of the characters in the corrected input image are displayed or output the right way up. The direction correction processing is described later.
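The correction amounts to applying the inverse of the detected rotation. The following sketch assumes an image stored as a list of pixel rows and a detected direction given as a counterclockwise multiple of 90°; both the representation and the function names are illustrative assumptions, not details from the description.

```python
from typing import List

Image = List[List[int]]  # assumed representation: rows of pixel values


def rotate90_cw(img: Image) -> Image:
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def correct_direction(img: Image, detected_ccw_degrees: int) -> Image:
    """Undo a detected counterclockwise rotation by rotating the image
    clockwise through the same angle."""
    if detected_ccw_degrees % 90 != 0:
        raise ValueError("this sketch handles multiples of 90 degrees only")
    for _ in range((detected_ccw_degrees // 90) % 4):
        img = rotate90_cw(img)
    return img
```

A detected direction of 0° leaves the image untouched, which corresponds to passing the input RGB signals through unchanged.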

The direction correction unit 210 outputs to the input gradation correction unit 212 either the input RGB signals as they are or the RGB signals obtained by applying the rotation to them.

The input gradation correction unit 212 corrects the gradation of the RGB signals input from the direction correction unit 210. As gradation correction, the input gradation correction unit 212 performs, for example, color balance adjustment, background density removal, and contrast adjustment. The input gradation correction unit 212 outputs the processed RGB signals to the area separation processing unit 214.

The area separation processing unit 214 classifies each pixel in the image represented by the RGB signals input from the input gradation correction unit 212 into a character area, a halftone dot area, or a photographic area. Based on the classification result, the area separation processing unit 214 outputs an area identification signal, which indicates the area to which each pixel belongs, to the black generation and under color removal unit 218, the spatial filter processing unit 220, and the gradation reproduction processing unit 226. The area separation processing unit 214 also outputs the RGB signals input from the input gradation correction unit 212 to the color correction unit 216.

When the image data is ultimately output to the image output device 30, the color correction unit 216 converts the RGB signals input from the area separation processing unit 214 into CMY digital signals (hereinafter, CMY signals) and, to achieve faithful color reproduction, removes from the CMY signals the color turbidity that arises from the spectral characteristics of CMY color materials containing unnecessary absorption components. The color correction unit 216 then outputs the color-corrected CMY signals to the black generation and under color removal unit 218.

When the image data is ultimately output to the transmission/reception device 40, on the other hand, the color correction unit 216 converts the RGB signals input from the area separation processing unit 214 into color-corrected RGB signals or a gray signal and outputs the result to the spatial filter processing unit 220.

Based on the CMY signals input from the color correction unit 216, the black generation and under color removal unit 218 performs black generation processing, which generates a black (K) signal from the CMY signals, and processing that generates new CMY signals by subtracting the K signal obtained by black generation from the CMY signals. As a result, the three-color CMY digital signals are converted into four-color CMYK digital signals (hereinafter, CMYK signals). The black generation and under color removal unit 218 then outputs the resulting CMYK signals to the spatial filter processing unit 220.

Black generation is generally performed by the skeleton black method. This method obtains the output data C', M', Y', K' from the input data C, M, Y. Let the input/output characteristic of the skeleton curve be y = f(x) and the UCR (Under Color Removal) rate be α (0 < α < 1). The black generation and under color removal processing then converts the CMY signals into CMYK signals using equations (1) to (4) below.
K' = f(min(C, M, Y)) ... (1)
C' = C - αK' ... (2)
M' = M - αK' ... (3)
Y' = Y - αK' ... (4)

Here, the UCR rate α (0 < α < 1) indicates how much of C, M, and Y is removed by replacing the portion where C, M, and Y overlap with K. Equation (1) shows that the K signal is generated according to the smallest of the C, M, and Y signal intensities.
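Equations (1) to (4) can be written out directly. In the sketch below the skeleton curve f defaults to the identity function, which is an assumption made only so the example is self-contained; the description does not specify f. The function name and the use of plain numbers for signal values are likewise illustrative.

```python
from typing import Callable, Tuple


def black_generation_ucr(
    c: float,
    m: float,
    y: float,
    alpha: float,
    f: Callable[[float], float] = lambda x: x,  # assumed skeleton curve y = f(x)
) -> Tuple[float, float, float, float]:
    """Convert a CMY triple to CMYK using equations (1) to (4)."""
    if not 0 < alpha < 1:
        raise ValueError("the UCR rate alpha must satisfy 0 < alpha < 1")
    k = f(min(c, m, y))           # (1) K' = f(min(C, M, Y))
    return (
        c - alpha * k,            # (2) C' = C - alpha * K'
        m - alpha * k,            # (3) M' = M - alpha * K'
        y - alpha * k,            # (4) Y' = Y - alpha * K'
        k,
    )
```

With C = 200, M = 100, Y = 50 and α = 0.5, the smallest component gives K' = 50, and each of C, M, and Y is reduced by 25.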

When the image data is ultimately output to the image output device 30, the spatial filter processing unit 220 applies spatial filtering with a digital filter to the CMYK image data input from the black generation and under color removal unit 218, based on the area identification signal input from the area separation processing unit 214, and thereby corrects the spatial frequency characteristics. This correction reduces blurring and graininess degradation in the image. For example, for areas classified as characters by the area separation processing unit 214, the spatial filter processing unit 220 uses a filter that strongly emphasizes high-frequency components in order to improve the reproducibility of the characters. For areas classified as halftone dots by the area separation processing unit 214, the spatial filter processing unit 220 applies low-pass filtering to remove the input halftone dot component.
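The area-dependent filtering described above can be illustrated on a single channel. The two 3x3 kernels, the string labels used for the area identification signal, and the edge handling (clamping to the nearest pixel) are all assumptions chosen for this sketch; the description does not give the actual filter coefficients.

```python
from typing import List

# Assumed kernels: a high-frequency emphasis filter and a 3x3 averaging low-pass filter.
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]


def filter_pixel(img: List[List[float]], x: int, y: int, kernel: List[List[float]]) -> float:
    """Apply a 3x3 kernel at (x, y), clamping coordinates at the image border."""
    h, w = len(img), len(img[0])
    total = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), h - 1)
            xx = min(max(x + dx, 0), w - 1)
            total += kernel[dy + 1][dx + 1] * img[yy][xx]
    return total


def spatial_filter(img: List[List[float]], areas: List[List[str]]) -> List[List[float]]:
    """Sharpen 'text' pixels, low-pass 'halftone' pixels, and pass others through."""
    out = []
    for y, row in enumerate(img):
        out_row = []
        for x, value in enumerate(row):
            if areas[y][x] == "text":
                out_row.append(filter_pixel(img, x, y, SHARPEN))
            elif areas[y][x] == "halftone":
                out_row.append(filter_pixel(img, x, y, BOX_BLUR))
            else:
                out_row.append(float(value))
        out.append(out_row)
    return out
```

Both kernels sum to 1, so flat regions keep their level while edges in character areas are steepened and the periodic pattern in halftone areas is smoothed.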

When the image data is ultimately output to the transmission/reception device 40, on the other hand, the spatial filter processing unit 220 applies spatial filtering with a digital filter to the RGB signals or gray signal input from the color correction unit 216, based on the area identification signal input from the area separation processing unit 214, and thereby corrects the spatial frequency characteristics. This correction reduces blurring and graininess degradation in the image.

Next, when the image data is ultimately output to the image output device 30, the spatial filter processing unit 220 outputs the processed CMYK signals to the output gradation correction unit 224. When the image data is ultimately output to the transmission/reception device 40, it outputs the processed RGB signals to the resolution conversion processing unit 222.

The resolution conversion processing unit 222 performs resolution conversion so that the image data output to the transmission/reception device 40 has the resolution set on the operation panel 50. For example, when the input image has a resolution of 600 DPI × 300 DPI and the resolution set on the operation panel 50 is 300 DPI × 300 DPI, the average of every two pixels in the main scanning direction is calculated and used as the output value, which converts the resolution from 600 DPI × 300 DPI to 300 DPI × 300 DPI. The resolution conversion processing unit 222 outputs the processed RGB signals to the output gradation correction unit 224.
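The two-pixel averaging in this example can be sketched for one scan line of one channel. The sketch assumes that each list runs along the main scanning direction, that averages are rounded down to integers, and that a leftover pixel in an odd-length line is dropped; none of these details are stated in the description, and the function name is hypothetical.

```python
from typing import List


def halve_main_scan(line: List[int]) -> List[int]:
    """Average each pair of adjacent pixels along the main scanning direction,
    halving the resolution in that direction (for example 600 DPI to 300 DPI)."""
    return [(line[i] + line[i + 1]) // 2 for i in range(0, len(line) - 1, 2)]
```

Applying it to every line, as in `[halve_main_scan(line) for line in image]`, leaves the sub-scanning resolution unchanged while halving the main scanning resolution.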

When the image data is ultimately output to the image output device 30, the output gradation correction unit 224 applies output gradation correction based on the halftone dot area ratio, which is a characteristic of the image output device 30, to the CMYK signals input from the spatial filter processing unit 220, and outputs the corrected CMYK signals to the gradation reproduction processing unit 226.

When the image data is output to the transmission/reception device 40, on the other hand, the output gradation correction unit 224 applies output gradation correction to the RGB signals input from the resolution conversion processing unit 222 as necessary, so that fogging and highlight backgrounds disappear or become lighter, and outputs the corrected RGB signals to the gradation reproduction processing unit 226.

When the image data is ultimately output to the image output device 30, the gradation reproduction processing unit 226 applies halftone processing suited to each area to the CMYK signals input from the output gradation correction unit 224, based on the area identification signal input from the area separation processing unit 214. For example, for areas classified as characters by the area separation processing unit 214, the gradation reproduction processing unit 226 performs binarization or multi-level quantization with a high-resolution screen suited to reproducing high-frequency components. For areas classified as halftone dots by the area separation processing unit 214, the gradation reproduction processing unit 226 performs binarization or multi-level quantization with a screen that emphasizes gradation reproducibility. The gradation reproduction processing unit 226 then outputs the processed image data to the image output device 30.

On the other hand, when the image data is output to the transmission/reception device 40, the gradation reproduction processing unit 226 binarizes the RGB signal input from the output gradation correction unit 224 only when monochrome binary output has been selected on the operation panel 50. The gradation reproduction processing unit 226 then inputs either the RGB signal received from the output gradation correction unit 224 or the binarized signal to the compression processing unit 228.

The compression processing unit 228 compresses image data consisting of RGB signals, gray signals, or black-and-white binary signals as necessary, using JPEG (Joint Photographic Experts Group), MMR (Modified Modified Read), or a similar scheme, in accordance with the file format set on the operation panel 50. It generates an image file from the compressed data and outputs the file to the transmission/reception device 40. The compression processing unit 228 may also generate a document file containing a plurality of image data.

Although the processing of each processing unit in the image processing device 20 has been described as being executed under the control of the control unit 70, it may instead be executed under the control of a computer that includes a processor such as a DSP (Digital Signal Processor).

[1.1.3 Image output device]
The image output device 30 forms an image on a recording sheet and outputs it, based on the image data input from the image processing device 20, by a method such as thermal transfer, electrophotography, or inkjet. The image output device 30 functions as the image forming means of the image forming apparatus 1.

[1.1.4 Transmission/reception device]
The transmission/reception device 40 can be connected to a communication network such as a public line network, a LAN, or the Internet, and transmits a compressed file to the outside via the communication network by a communication method such as facsimile or e-mail. For example, when the user has selected the Scan to e-mail mode on the operation panel 50, the transmission/reception device 40, which uses a network card, a modem, or the like, attaches the compressed file to an e-mail and sends it to the configured destination.

When transmitting a facsimile, the control unit 70 carries out the communication procedure with the destination through the transmission/reception device 40, which uses a modem. Once a state in which transmission is possible has been secured, the control unit 70 applies any necessary processing to the compressed file, such as changing the compression format, and then transmits it sequentially to the destination over the communication line.

The transmission/reception device 40 may also receive a compressed file from another device by a communication method such as facsimile. For example, when receiving a facsimile, the control unit 70 receives the compressed file sent from the other party while carrying out the communication procedure through the transmission/reception device 40, and inputs the file to the image processing device 20. The image processing device 20 decompresses the received compressed file and, as necessary, applies rotation processing and/or resolution conversion processing to the resulting image data. The image processing device 20 also applies output gradation correction in the output gradation correction unit 224 and gradation reproduction processing in the gradation reproduction processing unit 226. The image processing device 20 outputs the image data that has undergone these image processing steps to the image output device 30, and the image output device 30 forms an output image on a recording sheet based on that image data.

[1.1.5 Operation panel]
The operation panel 50 includes an operation unit 52, composed of hard keys such as setting buttons and a numeric keypad with which the user sets the operation mode and other settings of the image forming apparatus 1, and a display unit 54, composed of a device such as a liquid crystal display or an organic EL (electro-luminescence) display. The operation panel 50 may be a touch panel in which the operation unit 52 and the display unit 54 are formed integrally. In that case, the touch panel may detect input by any common detection method, such as a resistive film method, an infrared method, an electromagnetic induction method, or a capacitive method.

[1.1.6 Storage unit]
The storage unit 60 is a functional unit that stores various data, such as image data, and various programs. It is composed of, for example, a non-volatile storage device (for example, an HDD (Hard Disk Drive)) or a semiconductor memory storage device (for example, an SSD (Solid State Drive)).

[1.2 Direction correction processing]
Next, the direction correction processing executed by the direction correction unit 210 will be described. First, the configuration of the direction correction unit 210 is described with reference to FIG. 2. As shown in FIG. 2, the direction correction unit 210 includes an extraction unit 2102, a direction analysis processing unit 2104, a determination processing unit 2106, and a rotation processing unit 2108. These units process, in this order, the RGB signal input from the ACS determination unit 208, and the result is output to the input gradation correction unit 212.

The extraction unit 2102 extracts, from the input image, character areas, each indicating the extent of a character or of an element (part) constituting a character, and character string areas, each composed of one or more character areas. Here, a character is a symbol used to write words. An element constituting a character is a symbol or figure contained in the character as a part of it, such as a radical, a straight line, a curve, or a point. The direction analysis processing unit 2104 detects a specific element among the elements constituting characters from the character areas and character string areas, and calculates a score according to the direction in which the element was detected. The determination processing unit 2106 determines the direction of each input image according to the scores calculated over the entire input image. The rotation processing unit 2108 rotates each input image according to the direction determined by the determination processing unit 2106. For example, when the direction of the input image determined by the determination processing unit 2106 is not the positive direction, the rotation processing unit 2108 rotates the input image into the positive direction, thereby correcting the direction of the input image. The positive direction is the direction in which characters can be read correctly (the direction in which the character angle is 0°, that is, in which the characters are not turned over). For example, as described later, the direction in which characters are arranged as in FIG. 8(a) is the positive direction.

The direction correction processing will be described with reference to FIG. 3. First, the extraction unit 2102 extracts areas containing elements that constitute characters (step S102), for example by the following method.

(1) Extraction of character areas
The extraction unit 2102 extracts sets of character-like pixels in the input image as character areas. Character-like pixels are, for example, pixels that belong to the foreground layer when the input image is separated into a foreground layer and a background layer, and that include edge pixels.
(2) Extraction of character string areas
The extraction unit 2102 divides the character areas into groups based on their relative positions and sizes, and extracts each group as a character string area. At this time, the extraction unit 2102 may determine the character arrangement direction (character string direction) of each character string area from the way its constituent character areas are arranged.

For the extraction of character areas and character string areas, the method disclosed in Japanese Patent Application Laid-Open No. 2012-114744 can be used, for example, as can other known techniques. As in the method disclosed in that publication, an extracted character area that is too large or too small may be excluded as not character-like. Likewise, when a group contains only a few character areas, all character areas in that group may be excluded on the grounds that the group does not form a character string with a sufficient number of characters. This prevents edges in non-character parts, such as photographs, from being mistakenly extracted as character areas.
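The exclusion rules described above can be sketched as follows. This is only an illustrative sketch, not the method of the cited publication: the `CharBox` type, the function names, and the thresholds `min_side`, `max_side`, and `min_chars` are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharBox:
    """Bounding box of one candidate character area (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def is_character_like(box, min_side=4, max_side=200):
    """Reject boxes that are too small or too large to plausibly be a character.

    The thresholds are assumed values, in pixels.
    """
    return min(box.width, box.height) >= min_side and max(box.width, box.height) <= max_side


def filter_character_groups(groups, min_side=4, max_side=200, min_chars=3):
    """Keep only groups that still hold at least `min_chars` character-like boxes."""
    kept = []
    for group in groups:
        boxes = [b for b in group if is_character_like(b, min_side, max_side)]
        if len(boxes) >= min_chars:
            kept.append(boxes)
    return kept
```

With these assumed thresholds, a single very large box (such as an edge inside a photograph) and a group of only two boxes would both be discarded, while a row of three normally sized boxes would be kept as a character string area.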

Because the extraction of character areas is only a preliminary step for extracting character string areas and determining the character string direction, the areas need not be extracted accurately character by character. For example, a character may be split into separately extracted parts such as its "hen" (left-side radical) and "tsukuri" (right-side component), or a single character area may span several characters.

Next, the direction analysis processing unit 2104 sets target areas based on the character areas and character string areas extracted by the extraction unit 2102, and extracts the image of one of the target areas from the input image as a region image (step S104). A target area is an area to be processed by the direction analysis processing described later, and contains the pixels used to determine the direction of the input image. A region image is an image extracted from the input image based on a target area. The direction analysis processing unit 2104 may use a character string area itself as a target area, an area obtained by expanding a character string area outward or shrinking it inward by a predetermined number of pixels, or each of the areas obtained by dividing one character string area into several areas. A character area extracted by the extraction unit 2102 may also be used as a target area as it is.

When the extraction unit 2102 has extracted a plurality of character string areas or character areas from the input image, the direction analysis processing unit 2104 may use only some of the extracted areas as target areas. In this case, the direction analysis processing unit 2104 may select areas of at least a predetermined size, areas located at a specific position (for example, in the upper half of the input image), or a predetermined number of randomly chosen areas.
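The three selection strategies can be illustrated with a short sketch. The function name, the `(x, y, w, h)` tuple layout, and all parameters are assumptions made for this example; the text presents the strategies as alternatives, and the sketch merely allows each one to be switched on independently.

```python
import random


def select_target_regions(regions, image_height, min_area=None,
                          upper_half_only=False, sample_size=None, seed=None):
    """Pick a subset of extracted regions to analyze.

    regions: list of (x, y, w, h) tuples with the origin at the top-left corner.
    Each filter is optional and corresponds to one strategy from the text.
    """
    selected = list(regions)
    if min_area is not None:
        # Strategy 1: keep regions of at least a given size.
        selected = [r for r in selected if r[2] * r[3] >= min_area]
    if upper_half_only:
        # Strategy 2: keep regions lying entirely in the upper half of the image.
        selected = [r for r in selected if r[1] + r[3] <= image_height / 2]
    if sample_size is not None and sample_size < len(selected):
        # Strategy 3: keep a fixed number of randomly chosen regions.
        selected = random.Random(seed).sample(selected, sample_size)
    return selected
```

Restricting the number of target areas trades some robustness for speed, since every selected region image is later rotated and passed through the detector once per rotation direction.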

Next, the direction analysis processing unit 2104 executes direction analysis processing, in which it detects an image showing a target part in the region image and calculates a score (direction-specific score) based on the detection result (step S106). Here, a target part is a part that is to be detected in the region image. Target parts are defined before the direction analysis processing is executed; for example, they are determined or selected in advance by a user of the image forming apparatus 1, a designer, or the like. Parts that are suitable as target parts are described later.

In the present embodiment, the direction analysis processing rotates the region image into a plurality of directions, detects the positive-direction target part in each rotated image, and calculates a score based on the detection results. A positive-direction target part is a target part as it appears in a character oriented in the positive direction.

The directions into which the region image is rotated (hereinafter, the "rotation directions of the region image") consist of the input direction, which is the direction in which the image was read (the direction in which the rotation angle of the input image is 0°), plus at least one of the following directions:
(1) the counterclockwise direction, in which the input image is rotated 90° counterclockwise;
(2) the reverse (inverted) direction, in which the input image is rotated 180°;
(3) the clockwise direction, in which the input image is rotated 90° clockwise (270° counterclockwise).

The direction analysis processing unit 2104 may use all four directions (input, counterclockwise, reverse, and clockwise) as the rotation directions of the region image. Alternatively, it may limit the rotation directions to the input direction and the reverse direction, so that only the top-bottom orientation of the document is determined, or to the input direction and the counterclockwise (or clockwise) direction, so that it determines only whether the document is portrait or landscape. In any of these cases, the rotation directions are chosen so that they include the direction in which the region image becomes positive.
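As a minimal sketch of how a region image could be rotated into the chosen set of directions, the following pure-Python example treats an image as a list of pixel rows. The function names and the `ROTATION_SETS` mode names are hypothetical labels for the three options described above.

```python
# Hypothetical mode names mapped to counterclockwise rotation angles in degrees.
ROTATION_SETS = {
    "all": (0, 90, 180, 270),         # input, counterclockwise, reverse, clockwise
    "top_bottom": (0, 180),           # only decide whether the page is upside down
    "portrait_landscape": (0, 90),    # only decide portrait versus landscape
}


def rotate_ccw(image, angle):
    """Rotate a 2-D image (list of rows) counterclockwise by a multiple of 90 degrees."""
    if angle % 90 != 0:
        raise ValueError("angle must be a multiple of 90")
    rotated = [list(row) for row in image]
    for _ in range((angle // 90) % 4):
        # One 90-degree counterclockwise turn: transpose, then reverse the row order.
        rotated = [list(row) for row in zip(*rotated)][::-1]
    return rotated


def rotated_candidates(image, mode="all"):
    """Return {angle: rotated image} for every rotation direction in the chosen mode."""
    return {angle: rotate_ccw(image, angle) for angle in ROTATION_SETS[mode]}
```

Note that a rotation of 270° counterclockwise is the same as 90° clockwise, which matches the definition of the clockwise direction in the list above.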

The rotation directions of the region image may be selected by the user, or may be determined automatically by a simple method before the direction analysis processing unit 2104 executes the direction analysis processing. In the present embodiment, the number of rotation directions of the region image is denoted by A.

The direction analysis processing of the present embodiment will be described with reference to FIG. 4. First, the direction analysis processing unit 2104 assigns 1 to the variable a, which indicates the rotation direction of the region image (step S122).

Next, the direction analysis processing unit 2104 rotates the region image into the a-th direction and detects an image showing the positive-direction target part (hereinafter, a "partial image").

Parts that are suitable as target parts are now described. In the present embodiment, the positive-direction target part is detected from the region image rotated into the a-th direction. The parts to be detected are preferably, for example, parts whose frequency of appearance in general documents is not low, and parts that are hard to confuse with themselves or with other parts when rotated.

What happens when a part itself is rotated is described with reference to FIG. 5. In the present embodiment, the parts are kanji radicals. FIG. 5(a) shows examples of radicals that are suitable as target parts, and FIG. 5(b) shows examples of radicals that are unsuitable.

FIG. 5(a) shows parts together with their appearance when rotated into the four directions: positive, counterclockwise, reverse, and clockwise. In FIG. 5(a), for example, "kihen" (the tree radical) indicated by P102 is unlikely to be confused with "kihen" itself or with other parts when rotated into any of these directions. In addition, kanji that use "kihen" appear frequently even in general documents, so it is suitable as a target part. Other suitable target parts include "takekanmuri" (the bamboo radical) indicated by P104, "amekanmuri" (the rain radical) indicated by P106, and "kanehen" (the metal radical) indicated by P108.

In FIG. 5(b), on the other hand, "kusakanmuri" (the grass radical) indicated by P112 and "kurumahen" (the vehicle radical) indicated by P114, for example, are parts that are likely to be confused with themselves when rotated. That is, for "kusakanmuri" and "kurumahen", the positive-direction part and the reverse-direction part have almost the same shape. If such a part is defined as a target part, the direction analysis processing unit 2104 detects the positive-direction target part not only in the region image rotated into the positive direction but also in the region image rotated so that the characters are upside down. As a result, the accuracy of the detection result drops.

"Mehen" (the eye radical) indicated by P116 and "renga" (the four-dot fire radical) indicated by P118 are parts that are likely to be confused with other parts when rotated. For example, "mehen" rotated counterclockwise is easily confused with "sara" (the dish radical), and "renga" rotated clockwise is easily confused with "sanzui" (the water radical). If such a part is defined as a target part, the direction analysis processing unit 2104 detects the positive-direction target part in characters that are not in the positive direction. As a result, the accuracy of the detection result drops. For example, if "mehen" indicated by P116 is a target part, positive-direction "mehen" is detected in a region image containing kanji that include "sara" (for example, 「益」, 「盆」, and 「盛」) rotated 90° clockwise.

Another example of an unsuitable target part is "shikahen" (the deer radical). "Shikahen" is hard to confuse with itself or with other parts when rotated, but kanji that use it cannot be said to appear frequently in general, so it is unsuitable as a target part.

To detect partial images in a region image, the direction analysis processing unit 2104 can use a method based on machine learning or on pattern matching.

The method based on machine learning is described here. First, a detector for detecting partial images in a region image is built in advance by training. FIG. 6(a) shows the combinations of input data and correct-answer data (part names) that make up the training data used to build the detector.

For example, to detect a kanji radical such as positive-direction "kihen" in a region image, a detector is built that detects partial images showing positive-direction "kihen" in the region image. The detector is trained with images showing positive-direction "kihen" as input data and "kihen" as the correct-answer data, which produces a model capable of detecting partial images of positive-direction "kihen". When a region image is input to the detector containing the trained model, partial images showing positive-direction "kihen" are detected in it. In this way, positive-direction "kihen" is detected from the region image. The detector and the trained model are stored in, for example, the storage unit 60.

As P122 and P124 in FIG. 6(a) show, even images of the same "kihen" or "takekanmuri" differ in shape from font to font, and even within one font the part is deformed, for example in aspect ratio, depending on the structure of the character that contains it. To improve detection accuracy, it is therefore desirable to enlarge the training data for each target part with input data in varied fonts and aspect ratios. Detection accuracy may also fall when the quality of the input image is degraded. To prevent this, input data whose quality has been deliberately degraded, such as tilted images or images with added noise, may be used.

The detector may also be trained to detect target parts other than positive-direction "kihen". When the target parts vary in size and shape, as with an elongated part such as positive-direction "kihen" and a roughly L-shaped part such as positive-direction "shinnyo" (the movement radical), it becomes difficult for a single detector to detect all of them. In that case, detection may be made easier by enlarging or reducing the input data in advance so that it always has the same size, or by limiting all input data to elongated parts such as "hen" (left-side radicals) and "kanmuri" (top radicals). A separate detector may also be built for each target part.

Instead of the target part itself, whole characters containing the positive-direction target part may be used as input data for training, as P126 and P128 in FIG. 6(b) show. In this case too, the training should account for font differences and degraded input image quality as described above. Some parts, such as "kihen", occur in a large number of characters. Because the purpose of the present embodiment is not to identify the characters themselves, the input data may be restricted to representative characters to reduce the size of the training data.

FIG. 6 illustrates the case where part names are given as the correct-answer data. Instead of a part name, a binary classification result indicating whether or not the input is a positive-direction target part may be given as the correct-answer data.

The direction analysis processing unit 2104 rotates the region image obtained from the input image into the a-th direction (step S124). Next, using the detector built from the training data, the direction analysis processing unit 2104 detects partial images in the region image, thereby detecting the positive-direction target part in the region image (step S126). The direction analysis processing unit 2104 then adds 1 to a and, if a does not exceed A, returns to step S124 (step S128 → step S130; No → step S124). In this way, detection of the positive-direction target part is executed for each region image rotated into one of the predetermined directions.
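The loop over steps S122 to S130 can be sketched as below. This is a simplified illustration under stated assumptions: `rotate` and `detect` are hypothetical callables supplied by the caller, and `detect` is assumed to return a likelihood when the positive-direction target part is found and `None` otherwise; the actual detector interface is not specified in the text.

```python
def analyze_region(region_image, angles, rotate, detect):
    """Run the detector on the region image in each rotation direction.

    angles: the A rotation directions, as counterclockwise angles in degrees.
    rotate(image, angle): returns the rotated image (hypothetical callable).
    detect(image): returns a likelihood, or None if the positive-direction
        target part is not found (hypothetical callable).
    Returns {angle: likelihood} for the directions in which the part was found.
    """
    detections = {}
    total = len(angles)           # A in the text
    a = 1                         # step S122
    while a <= total:             # step S130: repeat while a does not exceed A
        angle = angles[a - 1]
        rotated = rotate(region_image, angle)   # step S124
        likelihood = detect(rotated)            # step S126
        if likelihood is not None:
            detections[angle] = likelihood
        a += 1                    # step S128
    return detections
```

The resulting dictionary is the per-region detection result from which the direction-specific scores are then updated in step S132.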

Next, the direction analysis processing unit 2104 adds a score, calculated from the directions used in step S124 and the detection results of step S126, to the direction-specific scores (step S132). Here, a direction-specific score is the sum, for one rotation direction of the region image, of the scores calculated by a predetermined method. For example, when the positive-direction target part is detected in the region image of only one direction, the direction analysis processing unit 2104 selects that direction as the detection direction. When the positive-direction target part is detected in the region images of several directions, it selects the direction with the highest detection certainty (likelihood, reliability) as the detection direction. The direction analysis processing unit 2104 then adds a score, by a predetermined method, to the direction-specific score of the selected detection direction. As a result, the higher the score of a direction, the more plausible it is that rotating the input image in that direction corrects it to the positive direction.

Here, when the rotation directions of the region image include the direction in which the region image becomes upright, upright target parts are detected from the region image rotated in that direction, so the score of the corresponding direction becomes high. It is then highly plausible that the direction of the input image is the direction of the rotation that returns the region image from the high-scoring direction to the input direction. For example, when the high-scoring direction is the clockwise direction, the rotation that turns the clockwise region image back into the region image of the input direction is counterclockwise. It is therefore highly plausible that the direction of the input image is the counterclockwise direction. In summary, when the high-scoring direction is the input direction (0° counterclockwise), the direction of the input image is the same as the input direction; when it is the counterclockwise direction (90° counterclockwise), the direction of the input image is the clockwise direction (270° counterclockwise); when it is the reverse direction (180° counterclockwise), the direction of the input image is the reverse direction; and when it is the clockwise direction (270° counterclockwise), the direction of the input image is the counterclockwise direction (90° counterclockwise).
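The four cases above amount to taking the inverse rotation. A minimal sketch, assuming directions are expressed as degrees counterclockwise (0, 90, 180, 270); the function name is hypothetical and does not appear in the specification:

```python
def input_direction_from_best_rotation(best_rotation_ccw: int) -> int:
    """Map the highest-scoring region-image rotation to the input image direction.

    Both values are degrees counterclockwise. The input image direction is the
    rotation that undoes the highest-scoring rotation.
    """
    if best_rotation_ccw not in (0, 90, 180, 270):
        raise ValueError("rotation must be 0, 90, 180 or 270")
    return (360 - best_rotation_ccw) % 360


# The four cases listed in the paragraph above.
for best in (0, 90, 180, 270):
    print(best, "->", input_direction_from_best_rotation(best))
```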

The predetermined method of adding a score may give a constant to the one selected direction, or may give a larger value as the above-mentioned confidence increases. Alternatively, when upright target parts are detected in region images of a plurality of directions, instead of selecting a single direction and adding a score only to it, a score based on the magnitude of the confidence may be added to each of the detected directions.
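The three score-adding options described above can be sketched as follows. This is only an illustration: the function name, the `method` labels, and the choice of 1.0 as the constant are assumptions, not values given in the specification.

```python
from typing import Dict


def add_scores(scores: Dict[int, float],
               detections: Dict[int, float],
               method: str = "constant") -> None:
    """Update direction-specific scores in place.

    scores:     rotation (degrees CCW) -> accumulated score
    detections: rotation (degrees CCW) -> confidence, only for rotations in
                which an upright target part was detected
    method:     "constant"   add 1.0 to the most confident direction
                "confidence" add the confidence to the most confident direction
                "all"        add each confidence to its own direction
    """
    if not detections:
        return
    if method == "all":
        for rotation, confidence in detections.items():
            scores[rotation] += confidence
        return
    best = max(detections, key=detections.get)
    scores[best] += 1.0 if method == "constant" else detections[best]


scores = {0: 0.0, 90: 0.0, 180: 0.0, 270: 0.0}
add_scores(scores, {0: 0.75, 180: 0.25}, method="constant")
print(scores)
```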

Returning to FIG. 3, the direction analysis processing unit 2104 next determines whether region images have been extracted from all target regions, and if not, returns to step S104 (step S108; No → step S104). In this way, the direction analysis processing unit 2104 extracts a region image from every target region, detects partial images from each extracted region image, and calculates the direction-specific scores. By sequentially adding the score for each direction calculated for each region image to the direction-specific score of the corresponding direction, the direction analysis processing unit 2104 obtains direction-specific scores totaled per direction.

Once region images have been extracted from all target regions, the determination processing unit 2106 determines the direction of the input image based on the direction-specific scores calculated by the direction analysis processing unit 2104 (step S108; Yes → step S110). Specifically, the determination processing unit 2106 takes the direction with the highest direction-specific score, and determines, as the direction of the input image, the direction of the rotation that returns an input image rotated in that direction to the input image in the input direction.

Next, the rotation processing unit 2108 rotates the input image so that it becomes upright, based on the direction of the input image determined by the determination processing unit 2106 in step S110 (step S112). For example, the rotation processing unit 2108 rotates the input image 90° counterclockwise if the direction of the input image is the clockwise direction (270° counterclockwise), rotates it 180° if the direction is the reverse direction (180° counterclockwise), and rotates it 90° clockwise if the direction is the counterclockwise direction (90° counterclockwise). When the determination processing unit 2106 determines that the input image is already upright, the rotation processing unit 2108 does not rotate the input image (the rotation process is skipped).
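The correction in step S112 can be illustrated with a small pure-Python sketch. It assumes the image is a row-major list of rows and that directions are degrees counterclockwise; `rotate_ccw_90` and `correct_orientation` are hypothetical helper names.

```python
def rotate_ccw_90(grid):
    """Rotate a row-major 2D list by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def correct_orientation(image, input_direction_ccw):
    """Rotate the image so that it becomes upright.

    input_direction_ccw is the determined direction of the input image
    (0, 90, 180 or 270 degrees counterclockwise). The correction is the
    inverse rotation; a direction of 0 leaves the image untouched.
    """
    if input_direction_ccw not in (0, 90, 180, 270):
        raise ValueError("direction must be 0, 90, 180 or 270")
    turns = ((360 - input_direction_ccw) % 360) // 90
    for _ in range(turns):
        image = rotate_ccw_90(image)
    return image


upright = [[1, 2], [3, 4]]
scanned = rotate_ccw_90(upright)          # simulates an input in the 90-degree direction
print(correct_orientation(scanned, 90))   # restores the upright image
```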

Through the direction correction process described above, the direction correction unit 210 outputs the image data obtained after rotating the input image, so the image file output via the compression processing unit 228 contains image data in which the input image has been rotated to the appropriate direction. Furthermore, because the direction correction unit 210 executes the direction correction process for each input document image, each image can be rotated to the correct direction even when a plurality of document images are input. As a result, the images contained in the document file output via the compression processing unit 228 are all aligned in the correct direction.

Besides machine learning, template matching can also be used to detect the target parts. In this case, template images of the target parts are first stored in the storage unit 60. For example, as with the input data examples shown in FIG. 6(a) (for example, P122 and P124), a plurality of template images are stored according to font and aspect ratio. The direction analysis processing unit 2104 then performs template matching between the template images stored in the storage unit 60 and the region image. Because the number of template matching passes grows with the number of template images, processing takes time, but no large-scale filtering of the kind required by machine learning is needed, so the approach can be realized on a relatively small scale. To reduce the amount of processing, the resolution of both the template images and the region image may be lowered so that fine detail is intentionally discarded and the target part is detected using its rough shape as the feature.
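The low-resolution template matching idea can be sketched as below. The specification does not state the matching measure or the downsampling rule; this sketch assumes binary images (lists of 0/1 rows), block-majority downsampling, and the fraction of agreeing pixels as the match score. All function names are hypothetical.

```python
def downsample(img, factor):
    """Reduce a binary image by an integer factor.

    Each factor x factor block becomes 1 when at least half of its pixels
    are set, which keeps the rough shape and drops fine detail.
    """
    h, w = len(img) // factor, len(img[0]) // factor
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = sum(img[y * factor + dy][x * factor + dx]
                        for dy in range(factor) for dx in range(factor))
            row.append(1 if 2 * total >= factor * factor else 0)
        out.append(row)
    return out


def match_template(region, template):
    """Slide the template over the region.

    Returns (best_score, (y, x)), where best_score is the fraction of
    pixels that agree at the best position.
    """
    th, tw = len(template), len(template[0])
    best = (0.0, (0, 0))
    for y in range(len(region) - th + 1):
        for x in range(len(region[0]) - tw + 1):
            agree = sum(region[y + dy][x + dx] == template[dy][dx]
                        for dy in range(th) for dx in range(tw))
            score = agree / (th * tw)
            if score > best[0]:
                best = (score, (y, x))
    return best
```

In use, both the region image and each stored template would be passed through `downsample` with the same factor before calling `match_template`, and a detection would be accepted only above some threshold chosen by the implementer.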

[1.3 Operation example]
An operation example of this embodiment will be described with reference to the drawings. FIG. 7 shows a specific example of extracting character regions and a character string region from an input image. FIG. 7(a) shows an example of an input image E100, in which the characters 「販売促進部」 ("Sales Promotion Department") appear in region E102. That is, region E102 contains the pixels that make up the characters 「販売促進部」.

The extraction unit 2102 extracts character regions from the input image. As a result, as shown in FIG. 7(b), each part of a character enclosed by a dotted line is extracted as a character region. For example, the extracted character regions include a region containing 「反」, an element of the character 「販」, as shown by region M100, and a region containing a dot that forms part of the character 「販」, as shown by region M102. In this way, character regions are extracted not per character but per constituent element of a character, such as a radical or part of a radical.

The extraction unit 2102 may also determine the character string direction or extract character string regions based on the character regions. For example, using the method disclosed in Japanese Unexamined Patent Application Publication No. 2012-114744, the extraction unit 2102 determines the character string direction by comparing the number of consecutive flat pixel blocks in the character regions in the horizontal direction with the number in the vertical direction. In the example of FIG. 7(b), the number of consecutive blocks in the horizontal direction is larger than that in the vertical direction, so the extraction unit 2102 judges that the characters are arranged horizontally and extracts a character string region whose character string direction is horizontal, as shown by region E104 in FIG. 7(c).

After extracting the character regions, the extraction unit 2102 may merge character regions that line up in the line direction within one character string region. For example, on the assumption that, in principle, multiple characters do not line up in the line direction (the direction orthogonal to the character string direction) within one character string region, the extraction unit 2102 may merge the character regions that line up in the line direction within that region. In addition, when the foreground pixels of several character regions are not connected to each other but the bounding rectangles of those regions overlap along the character string direction, the extraction unit 2102 may merge them into a single character region.
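One way to picture this merging step is a sweep over bounding boxes. The sketch below assumes horizontal text and boxes given as `(x0, y0, x1, y1)` tuples; it merges any boxes whose horizontal extents overlap, which covers both parts stacked in the line direction and rectangles that overlap along the string direction. The function name and box format are assumptions, not taken from the specification.

```python
def merge_boxes_horizontal(boxes):
    """Merge character-region boxes whose x-ranges overlap.

    boxes: iterable of (x0, y0, x1, y1) with x0 < x1 and y0 < y1.
    Returns the merged boxes ordered from left to right.
    """
    merged = []
    for x0, y0, x1, y1 in sorted(boxes):
        if merged and x0 < merged[-1][2]:
            # Overlaps the previous merged box along the string direction:
            # grow that box to the union of the two rectangles.
            px0, py0, px1, py1 = merged[-1]
            merged[-1] = (px0, min(py0, y0), max(px1, x1), max(py1, y1))
        else:
            merged.append((x0, y0, x1, y1))
    return merged


parts = [(0, 0, 4, 10), (3, 2, 9, 10), (12, 0, 20, 10), (14, 0, 16, 3)]
print(merge_boxes_horizontal(parts))
```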

FIG. 7(d) shows an example in which the character regions of FIG. 7(b) have been redefined by merging. Merging yields character regions that correspond to individual characters, as shown by region M104, or to coherent units that make up one character, as shown by regions M106 and M108. As a result, the number of target regions is smaller than when the character regions are not merged, which reduces the processing that the direction analysis processing unit 2104 performs to detect partial images.

The direction analysis processing unit 2104 may use as target regions the character regions shown in FIG. 7(b) (for example, M100 and M102), the character string region shown in FIG. 7(c) (for example, E104), or the redefined character regions shown in FIG. 7(d) (for example, M104, M106, and M108). How the direction analysis processing unit 2104 sets the target regions may be predetermined or may be chosen by the user.

FIG. 8 shows region images, and the results of detecting target parts from them, when the input image is upright. In FIG. 8, the target region is a character string region containing the text 「学校に筆箱を忘れた」 ("I forgot my pencil case at school"), and the target parts include 「たけかんむり」 (takekanmuri, the bamboo radical) and 「きへん」 (kihen, the tree radical). FIG. 8 shows the case where the region image is rotated in four directions: the input direction (0° counterclockwise) as the first direction, the counterclockwise direction (90° counterclockwise) as the second direction, the reverse direction (180° counterclockwise) as the third direction, and the clockwise direction (270° counterclockwise) as the fourth direction.

First, the direction analysis processing unit 2104 sets the region image to the input direction, which is the first direction. Specifically, it extracts the image contained in the target region from the input image as the region image, without modification. FIG. 8(a) shows the region image in the input direction. Region E110 in FIG. 8(a) is the target region, and the image inside region E110 is the region image. Regions E112 and E114 are the regions of the image in E110 where partial images of takekanmuri were detected, and regions E116 and E118 are the regions where partial images of kihen were detected. Region E118 shows that a partial image of kihen was detected within the character 「箱」, whose actual radical is takekanmuri; as with region E118, a partial image may be detected from an element other than the character's actual radical.

Next, the direction analysis processing unit 2104 sets the region image to the counterclockwise direction, which is the second direction. FIG. 8(b) shows the region image obtained by turning the image in region E110 to the counterclockwise direction, and region E120 is the target region. Similarly, the direction analysis processing unit 2104 sets the region image to the reverse direction (the third direction) and to the clockwise direction (the fourth direction). FIG. 8(c) shows the image in region E110 turned to the reverse direction, with region E130 as the target region. FIG. 8(d) shows the image in region E110 turned to the clockwise direction, with region E140 as the target region.

The direction analysis processing unit 2104 also attempts to detect partial images of the target parts in the region images inside regions E120, E130, and E140. However, as shown in FIGS. 8(b), 8(c), and 8(d), no partial image is detected in the region images of regions E120, E130, and E140.

FIG. 9 shows an example of the scores calculated for each rotation direction of the region image (input direction, counterclockwise direction, reverse direction, and clockwise direction); specifically, it shows the scores for the example of FIG. 8. In the example of FIG. 8, upright target parts are detected only in the region image rotated to one direction (the input direction). Based on this result, FIG. 9 shows the direction-specific scores obtained when the direction analysis processing unit 2104, each time it detects a target part, selects the input direction as the detection direction and adds 1 to the score of that direction. That is, because the direction analysis processing unit 2104 detected two upright kihen and two upright takekanmuri in the region image in the input direction, the direction-specific score of the input direction is 4 and the direction-specific scores of the other directions are 0.

As a result, the direction-specific score of the input direction is higher than those of the other directions (counterclockwise, reverse, and clockwise). The determination processing unit 2106 can therefore determine that, among the rotation directions of the region image, the input direction is the direction that makes the input image upright, that is, that the input image is already upright. Because the determination processing unit 2106 determines that the input direction is the upright direction, the rotation processing unit 2108 does not rotate the input image.
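The FIG. 9 example can be traced with a few lines of code. This is a sketch under the assumptions that directions are degrees counterclockwise and that the score values are exactly those described for FIG. 9 (4 for the input direction, 0 elsewhere); the function name is hypothetical.

```python
def judge_from_region_rotation_scores(scores):
    """First-embodiment judgment.

    scores maps each region-image rotation (degrees CCW) to its
    direction-specific score. The input image direction is the inverse of
    the highest-scoring rotation.
    """
    best_rotation = max(scores, key=scores.get)
    return (360 - best_rotation) % 360


# Direction-specific scores from the FIG. 9 example.
fig9_scores = {0: 4, 90: 0, 180: 0, 270: 0}
input_direction = judge_from_region_rotation_scores(fig9_scores)
needs_rotation = input_direction != 0
print(input_direction, needs_rotation)
```

With these scores the judged direction is the input direction itself, so no rotation is applied, matching the description above.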

When detecting target parts, the direction analysis processing unit 2104 may impose a constraint on each target part and ignore (not detect) any detection that violates the constraint, even if its confidence (or degree of match) is high. For example, as shown in FIG. 10, in the character 「螢」 rotated by 180°, an upright (0° counterclockwise) takekanmuri may be detected in region E150; however, takekanmuri is normally located at the top of a character, so this detection is not appropriate. If the upright takekanmuri were accepted as is, the wrong direction might be determined as the direction of the input image. Here, as shown in FIG. 10, if another character region belonging to the same character string region exists above the position where takekanmuri was detected, it follows that the takekanmuri lies on the lower side. Therefore, by giving takekanmuri the constraint "located on the upper side when upright", the direction analysis processing unit 2104 can treat even a case like FIG. 10 as a constraint violation and cancel the detection. For target parts other than takekanmuri as well, a detection may be ignored (canceled) when the part is found at a position where it would not normally appear. In this way, a drop in the accuracy of determining the direction of the input image can be prevented.
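A positional constraint of this kind can be expressed as a simple geometric test. The sketch below assumes boxes are `(x0, y0, x1, y1)` tuples with y increasing downward, and treats "above" as "ends at or before the part's top edge and overlaps it horizontally"; the specification does not define the test this precisely, so these are illustrative assumptions, as are the function names.

```python
def violates_top_constraint(part_box, other_char_boxes):
    """Return True if a 'must be at the top' part has a character region above it.

    part_box:         (x0, y0, x1, y1) of the detected part
    other_char_boxes: boxes of other character regions in the same
                      character string region
    """
    px0, py0, px1, _ = part_box
    return any(
        y1 <= py0 and x0 < px1 and px0 < x1   # above the part and horizontally overlapping
        for x0, _y0, x1, y1 in other_char_boxes
    )


def filter_detections(detections, other_char_boxes):
    """Drop detections of top-constrained parts that violate the constraint."""
    return [box for box in detections
            if not violates_top_constraint(box, other_char_boxes)]
```

A detection rejected by `violates_top_constraint` would simply not contribute to the direction-specific scores.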

The above describes detecting target parts using, as the input image, an image of a document printed in type, but the direction of an input image of a handwritten document may also be determined. In this case, the detector may be constructed from input data extracted from characters written by people. Partial images of handwritten characters may be learned together with partial images of printed type, or the detector may be built from handwritten partial images only. A process for detecting handwritten documents may also be added to the direction correction process so that this handwriting-oriented direction correction is applied only to documents detected as handwritten.

As the method of detecting a handwritten document, for example, the method disclosed in Japanese Unexamined Patent Application Publication No. 2007-087196 can be used, or another known technique may be used. In this way, the direction of the input image can be determined and the image rotated even for handwritten documents, to which OCR is difficult to apply, which improves convenience.

Beyond what is described above, the order of the steps may be changed or some steps may be omitted as long as no contradiction arises. For example, in the direction correction process, the region images of all target regions may be extracted first, and then, in the direction analysis process, all region images may be rotated to the a-th direction before partial images are detected.

According to this embodiment, the direction of the input image can be determined by detecting only the main parts, without identifying which character each character is, and using the direction of each part at the time it was detected. Because characters do not need to be recognized, no large-scale database is required and the direction of the input image can be determined with simple processing. In addition, by defining characteristic parts such as kanji radicals as target parts, the method can handle images containing characters, such as kanji, that have many strokes and complex shapes.

[2. Second Embodiment]
Next, the second embodiment will be described. Unlike the first embodiment, the second embodiment does not rotate the region image; instead, it detects partial images rotated in predetermined directions from the region image and determines the direction of the input image from them.

In this embodiment, the direction analysis process leaves the region image in the input direction, detects target parts rotated in predetermined directions, and calculates scores based on the detection results.

In this embodiment, the number of directions in which the partial image is rotated is denoted by A. In step S124 of the direction correction process described in the first embodiment, the direction analysis processing unit 2104 rotates the partial image showing the upright target part to the a-th direction. In step S126, the direction analysis processing unit 2104 detects the partial image rotated to the a-th direction from the region image in the input direction. For example, if the partial images to be detected have four directions (upright, counterclockwise, reverse, and clockwise), the direction analysis processing unit 2104 rotates the partial image showing the target part to each of the upright, counterclockwise, reverse, and clockwise directions and detects each rotated partial image from the region image in the input direction.

In step S132, the direction analysis processing unit 2104 selects the direction of the detected partial image as the detection direction, calculates a score, and adds the calculated score to the direction-specific scores. That is, unlike the first embodiment, this embodiment calculates a score for each direction of the detected partial images. In this case, for example, if the input image is upright, many upright partial images are detected, so the direction-specific score of the upright direction becomes high. If the direction of the input image is the counterclockwise direction, many counterclockwise partial images are detected, so the direction-specific score of the counterclockwise direction becomes high. Thus, unlike in the first embodiment, a higher direction-specific score in this embodiment means it is more plausible that the direction itself is the direction of the input image, so the direction with the highest score can be taken directly as the direction of the input image. Accordingly, in step S110 of the direction correction process, the determination processing unit 2106 may determine the direction with the highest direction-specific score as the direction of the input image. When partial images of a plurality of directions are detected from one region image, the direction analysis processing unit 2104 may calculate the scores based on the detection confidence (likelihood, reliability).
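The difference from the first embodiment is that no inverse mapping is needed. A minimal sketch, assuming each detection is a `(part_name, direction_ccw)` pair and that every detection adds 1 to the score of its direction (the specification also allows confidence-based scores); the names and the romanized part labels are illustrative only.

```python
def judge_direction(detections):
    """Second-embodiment judgment.

    detections: iterable of (part_name, direction_ccw) pairs, where
    direction_ccw is the direction of the detected partial image.
    Returns (judged_input_direction, direction_specific_scores).
    """
    scores = {0: 0, 90: 0, 180: 0, 270: 0}
    for _part, direction_ccw in detections:
        scores[direction_ccw] += 1
    # The highest-scoring direction is taken directly as the input direction.
    return max(scores, key=scores.get), scores


detected = [("takekanmuri", 0), ("takekanmuri", 0), ("kihen", 0), ("kihen", 0)]
print(judge_direction(detected))
```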

In this embodiment, the detector for detecting partial images is constructed using learning data that carries direction information. FIG. 11 shows an example of the learning data in this embodiment. As shown in FIG. 11, each learning data pair consists of input data rotated according to a direction to be detected and correct answer data containing the part name and the direction of rotation, and the detector is constructed by learning from these pairs.

FIG. 12 shows an operation example of this embodiment. FIG. 12(a) shows an example in which the direction analysis processing unit 2104 detects, from the image in target region E200, upright takekanmuri in regions E202 and E204 and upright kihen in regions E206 and E208. FIG. 12(b) shows the scores obtained when 1 is added to the score corresponding to the direction of each detected partial image. As shown in FIG. 12(b), the score of the upright direction is the highest, so the determination processing unit 2106 can determine the upright direction as the direction of the input image.

When template matching is used instead of machine learning, template images of each target part rotated to each direction to be detected may be stored in the storage unit 60 in advance. The direction analysis processing unit 2104 then performs template matching between the template images stored in the storage unit 60 and the region image, and can thereby detect from the region image both the template image and its direction. Alternatively, only upright template images may be stored in the storage unit 60 in advance, and when template matching is executed, template images may be generated by rotating the upright template images to the directions to be detected. By generating partial images rotated to a plurality of directions from the upright partial image in this way, template matching can be performed even when only upright partial images are stored in the storage unit 60.
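Generating the rotated templates at matching time can be sketched as follows, assuming templates are row-major 2D lists and that the directions to be detected are 0, 90, 180, and 270 degrees counterclockwise. The helper names are hypothetical.

```python
def rotate_ccw_90(grid):
    """Rotate a row-major 2D list by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def rotated_templates(upright_template):
    """Build {direction_ccw: template} from a single upright template."""
    templates = {0: upright_template}
    current = upright_template
    for angle in (90, 180, 270):
        current = rotate_ccw_90(current)
        templates[angle] = current
    return templates


for angle, tpl in rotated_templates([[1, 2], [3, 4]]).items():
    print(angle, tpl)
```

Only the upright template needs to be stored; the trade-off is the extra rotation work each time matching is run.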

According to this embodiment, target parts can be detected with their direction taken into account without rotating the region image itself each time an input image is supplied to the direction correction unit 210, which shortens the processing time in actual use.

[3. Third Embodiment]
Next, the third embodiment will be described. In the third embodiment, the direction analysis processing unit 2104 includes a plurality of direction analysis processing means, and the direction of the input image is determined page by page based on an overall direction-specific score obtained by integrating the direction-specific scores calculated by the respective direction analysis processing means.

This embodiment is applicable to both the first and second embodiments. In this embodiment, FIG. 3, which relates to the direction correction process in the first and second embodiments, is replaced with FIG. 13. Functional units and processes that are the same as in the first and second embodiments are given the same reference numerals, and their description is omitted.

なお、本実施形態では、方向解析処理部２１０４がＮ（＞１）個の方向解析処理手段を備えるとき、それぞれ第１方向解析処理、……、第Ｎ方向解析処理と呼ぶこととする。また、本実施形態では、第ｋ方向解析処理（１≦ｋ≦Ｎ）によって算出された方向別スコアを第ｋ方向別スコアといい、第ｋ方向別スコアを統合して得られる方向別スコアを総合的方向別スコアという。また、そのうち第１方向解析処理では、第１実施形態で説明した方向解析処理部２１０４と同等の処理を実行するものとする。なお、本実施形態における方向補正部２１０は、方向解析処理部２１０４以外は第１実施形態と同様の構成をとることができるとし、説明を省略する。 In the present embodiment, when the direction analysis processing unit 2104 includes N (> 1) direction analysis processing means, they are referred to as the first direction analysis process, ..., and the N-th direction analysis process, respectively. The direction-specific score calculated by the k-th direction analysis process (1 ≤ k ≤ N) is referred to as the k-th direction-specific score, and the direction-specific score obtained by integrating the k-th direction-specific scores is referred to as the overall direction-specific score. Of these, the first direction analysis process performs processing equivalent to that of the direction analysis processing unit 2104 described in the first embodiment. Except for the direction analysis processing unit 2104, the direction correction unit 210 in the present embodiment can have the same configuration as in the first embodiment, and its description is omitted.

本実施形態における総合的方向別スコアは、第１実施形態で説明したような入力画像を正方向に補正するための方向に関するスコア又は第２実施形態で説明したような入力画像の方向に関するスコアの何れかである。総合的方向別スコアの種類（入力画像を正方向に補正するための方向に関するスコアであるか入力画像の方向に関するスコアであるか）は、例えば、予め定められている。 The overall direction-specific score in the present embodiment is the score related to the direction for correcting the input image in the positive direction as described in the first embodiment or the score related to the direction of the input image as described in the second embodiment. Either. The type of the overall direction-specific score (whether the score is related to the direction for correcting the input image in the positive direction or the score is related to the direction of the input image) is predetermined, for example.

ここで、第ｋ方向別スコアと総合的方向別スコアとが同じ種類のスコアであれば、方向解析処理部２１０４は、第ｋ方向別スコアを総合的方向別スコアにそのまま統合（例えば、加算）できる。したがって、複数の方向解析処理を、同様の種類のスコアを算出する方向解析処理とすることで、方向解析処理部２１０４は、総合的方向別スコアを効率的に算出できる。一方で、第ｋ方向別スコアと総合的方向別スコアとが、異なる種類のスコアであれば、第ｋ方向別スコアを総合的方向別スコアが示すスコアの種類に揃えた上で、総合的方向別スコアに統合する。 Here, if the k-th direction-specific score and the overall direction-specific score are of the same type, the direction analysis processing unit 2104 can integrate the k-th direction-specific score into the overall direction-specific score as it is (for example, by addition). Therefore, when the plurality of direction analysis processes all calculate the same type of score, the direction analysis processing unit 2104 can calculate the overall direction-specific score efficiently. If, on the other hand, the k-th direction-specific score and the overall direction-specific score are of different types, the k-th direction-specific score is first converted to the score type of the overall direction-specific score and then integrated into the overall direction-specific score.

例えば、総合的方向別スコアが入力画像の方向を示すスコアであり、第ｋ方向別スコアが入力画像を正方向に補正するための方向を示すスコアである場合は、第ｋ方向別スコアを入力画像の方向を示すスコアに揃えた上で、総合的方向別スコアに統合する。 For example, if the overall direction-specific score indicates the direction of the input image and the k-th direction-specific score indicates the direction for correcting the input image to the positive direction, the k-th direction-specific score is converted to a score indicating the direction of the input image and then integrated into the overall direction-specific score.

入力画像を正方向に補正するための方向を示すスコアは、方向毎のスコアを、仮にそのスコアが最も高かった場合に判定される入力画像の方向に対応するスコアとみなすことで、入力画像の方向を示すスコアとして扱うことができる。 A score indicating the direction for correcting the input image to the positive direction can be treated as a score indicating the direction of the input image by regarding the score for each direction as the score for the input-image direction that would be determined if that score were the highest.

入力画像の方向を示すスコアは、方向毎のスコアを、仮にそのスコアが最も高かった場合に、入力画像を正方向に補正するために回転させる方向に対応するスコアとみなすことで、入力画像を正方向に補正するための方向を示すスコアとして扱うことができる。 Conversely, a score indicating the direction of the input image can be treated as a score indicating the direction for correcting the input image to the positive direction by regarding the score for each direction as the score for the direction in which the input image would be rotated to correct it to the positive direction if that score were the highest.
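A minimal sketch of this re-labelling between the two score types is shown below. It assumes the four directions are expressed as counterclockwise angles (0, 90, 180, 270) and that an image whose direction is θ is corrected by a further counterclockwise rotation of (360 − θ) mod 360; the function name `convert_score_kind` is hypothetical and does not appear in the source.

```python
def convert_score_kind(scores):
    """Re-key per-direction scores between "direction of the input image"
    and "direction for correcting the input image to the positive direction".

    Assumption: angles are counterclockwise degrees, and the correction
    rotation for an image at angle a is (360 - a) % 360. The same mapping
    works in both directions because applying it twice is the identity.
    """
    return {(360 - angle) % 360: score for angle, score in scores.items()}

# Usage: correction-direction scores from one analysis, converted before integration.
correction_scores = {0: 1.0, 90: 5.0, 180: 2.0, 270: 0.0}
input_direction_scores = convert_score_kind(correction_scores)
print(input_direction_scores)
```

Under this assumption, the values themselves are untouched; only the direction each value is attributed to changes, so scores of different types can then be added direction by direction.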

図１３は、本実施例において方向補正部２１０が実行する方向補正処理の処理手順の例を示すフローである。なお、方向補正部２１０は、Ｎ個の方向解析処理手段を実行した場合における総合的方向別スコアの初期値として、総合的方向別スコアに統合する対象となる方向に対応する総合的方向別スコアをゼロで初期化する。 FIG. 13 is a flowchart showing an example of the procedure of the direction correction process executed by the direction correction unit 210 in this embodiment. As the initial value of the overall direction-specific score for the case where the N direction analysis processing means are executed, the direction correction unit 210 initializes to zero the overall direction-specific score for each direction to be integrated.

本実施形態では、方向解析処理部２１０４は、１の対象領域から領域画像を抽出したあと、ｋ＝１を初期値として（ステップＳ３０２）、第ｋ方向解析処理により第ｋ方向別スコアを算出する（ステップＳ３０４）。また、方向解析処理部２１０４は、第ｋ方向別スコアの値を、総合的方向別スコアに加算することで、総合的方向別スコアを更新する（ステップＳ３０６）。 In the present embodiment, after extracting an area image from one target area, the direction analysis processing unit 2104 sets k = 1 as the initial value (step S302) and calculates the k-th direction-specific score by the k-th direction analysis process (step S304). The direction analysis processing unit 2104 then updates the overall direction-specific score by adding the value of the k-th direction-specific score to it (step S306).

つづいて、方向解析処理部２１０４は、ｋに１を加算して（ステップＳ３０８）、ｋ≦Ｎの場合は、更新されたｋに基づいて、ステップＳ３０４に戻り（ステップＳ３１０；Ｎｏ→ステップＳ３０４）、同様の処理を繰り返し、ｋ＞Ｎとなるまで総合的方向別スコアを更新する。ｋ＞Ｎとなると、方向解析処理部２１０４は、ステップＳ１０４で抽出した領域画像における対象パーツの検出の処理を終了する。方向解析処理部２１０４は、全ての対象領域から領域画像を抽出するまで、ステップＳ３０４からステップＳ３１０までの処理を繰り返す。このようにして、方向解析処理部２１０４は、総合的方向別スコアを更新する。 Subsequently, the direction analysis processing unit 2104 adds 1 to k (step S308). If k ≤ N, it returns to step S304 with the updated k (step S310; No → step S304) and repeats the same processing, updating the overall direction-specific score until k > N. When k > N, the direction analysis processing unit 2104 ends the detection of target parts in the area image extracted in step S104. The direction analysis processing unit 2104 repeats steps S304 to S310 until area images have been extracted from all target areas. In this way, the direction analysis processing unit 2104 updates the overall direction-specific score.

全ての対象領域から領域画像を抽出した場合、方向解析処理部２１０４は、総合的方向別スコアを、入力画像に対応する方向別スコアであるとして、後段の判定処理部２１０６に入力する（ステップＳ３１０；Ｙｅｓ）。なお、図１３では複数の方向解析処理手段について逐次実行する方法について図示したが、複数の方向解析処理手段を並列で実行しても構わない。また、全ての対象領域から領域画像を抽出した場合、判定処理部２１０６は、総合的方向別スコアに基づいて、入力画像の方向を判定する（ステップＳ１０８；Ｙｅｓ→ステップＳ３１２）。判定処理部２１０６は、総合的方向別スコアが入力画像を正方向に補正するための方向に関するスコアである場合は、最もスコアの高い方向に回転させた入力画像を入力方向の入力画像にするために回転させる方向を、入力画像の方向として判定する。一方で、判定処理部２１０６は、総合的方向別スコアが入力画像の方向を示すスコアである場合は、最もスコアの高い方向を、入力画像の方向として判定する。 When area images have been extracted from all target areas, the direction analysis processing unit 2104 inputs the overall direction-specific score to the subsequent determination processing unit 2106 as the direction-specific score corresponding to the input image (step S310; Yes). Although FIG. 13 illustrates sequential execution of the plurality of direction analysis processing means, they may also be executed in parallel. When area images have been extracted from all target areas, the determination processing unit 2106 determines the direction of the input image based on the overall direction-specific score (step S108; Yes → step S312). When the overall direction-specific score relates to the direction for correcting the input image to the positive direction, the determination processing unit 2106 determines, as the direction of the input image, the rotation that would return the input image rotated in the highest-scoring direction to its orientation as input. When the overall direction-specific score instead indicates the direction of the input image, the determination processing unit 2106 determines the highest-scoring direction as the direction of the input image.
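The integration loop of FIG. 13 can be sketched roughly as below. This is a simplified illustration under stated assumptions: each direction analysis process is modelled as a hypothetical callable that takes an area image and returns a dictionary of scores keyed by counterclockwise angle, all processes are assumed to return the same score type (a score indicating the direction of the input image), and the names `overall_scores` and `decide_direction` are not from the source.

```python
def overall_scores(region_images, analyzers, directions=(0, 90, 180, 270)):
    """Accumulate the k-th direction-specific scores (k = 1..N) of every
    area image into one overall direction-specific score."""
    total = {d: 0.0 for d in directions}      # initialised to zero
    for region in region_images:              # repeat for every target area
        for analyze in analyzers:             # k = 1 .. N (S304 to S310)
            for d, s in analyze(region).items():
                total[d] += s                 # S306: add to the overall score
    return total

def decide_direction(total):
    """Pick the highest-scoring direction (score type: input-image direction)."""
    return max(total, key=total.get)

# Usage with two stand-in analyzers; real ones would detect parts or punctuation.
part_detector = lambda region: {0: 0.2, 90: 0.7, 180: 0.1, 270: 0.0}
punctuation_detector = lambda region: {90: 1.0}
total = overall_scores(["region-1", "region-2"], [part_detector, punctuation_detector])
print(decide_direction(total))
```

The inner loop runs the analyzers one after another, as in FIG. 13; because each analyzer only adds to its own entries, the calls could equally be run in parallel and summed afterwards.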

本実施形態における方向補正処理において、第ｋ方向解析処理（ｋ＞１）が実行するパーツの検出方法及び第ｋ方向別スコアの算出方法については、あらゆる方法をとることができる。例えば、第１実施形態や第２実施形態で説明した対象パーツを検出する方法の他に、文章に用いられる記号であって、その記号が現れる位置に特徴がある記号を検出する方法を用いることができる。 In the direction correction process of the present embodiment, any method may be used for the part detection performed by the k-th direction analysis process (k > 1) and for the calculation of the k-th direction-specific score. For example, in addition to the methods of detecting the target part described in the first and second embodiments, a method of detecting symbols that are used in text and that appear at characteristic positions can be used.

記号を検出する方法の一例として、図１４を参照して、文字列領域に含まれる特定文字として一例である句読点を検出し、その検出結果に基づいて方向別スコアを算出する方法（以下、本解析処理と呼ぶ）について説明する。本解析処理における特定文字は、方向を判定するために使用する文字・記号等である。本実施形態では句読点を一例とするが、例えば、カンマ、ピリオド等の区切り記号であってもよいし、例えば解析対象となる言語でよく登場する文字（例えば、日本語の場合「は」等）を特定文字としてもよい。 As an example of a method of detecting symbols, a method of detecting punctuation marks, as one example of specific characters included in a character string area, and calculating direction-specific scores based on the detection result (hereinafter referred to as the present analysis process) will be described with reference to FIG. 14. The specific characters in the present analysis process are characters, symbols, and the like used to determine the direction. Punctuation marks are used as an example in this embodiment, but the specific characters may be delimiters such as commas and periods, or characters that appear frequently in the language being analyzed (for example, 「は」 in Japanese).

なお、図１４に示す本解析処理では、方向解析処理部２１０４は、領域画像を回転させることなく、特定文字が含まれる位置に応じて特定文字の方向を検出することで、第２実施形態と同様に、入力画像の方向に関するスコアを算出する。したがって、本解析処理と対象パーツを検出する方向解析処理とを組み合わせる場合、対象パーツを検出する方向解析処理として、第１実施形態の方法よりも第２実施形態の方法が用いられることにより、方向解析処理部２１０４は、効率的に総合的方向別スコアを効率的に算出できる。 In the present analysis process shown in FIG. 14, the direction analysis processing unit 2104 detects the direction of a specific character from the position at which it appears, without rotating the area image, and thereby calculates a score relating to the direction of the input image, as in the second embodiment. Therefore, when the present analysis process is combined with a direction analysis process that detects target parts, using the method of the second embodiment rather than that of the first embodiment for the latter allows the direction analysis processing unit 2104 to calculate the overall direction-specific score efficiently.

句読点の検出を通じた本解析処理では、まず、方向解析処理部２１０４は、文字列領域から特定文字である句読点を検出する（ステップＳ３２２）。具体的には、方向解析処理部２１０４は、抽出部２１０２によって抽出された文字列領域に基づく対象領域を設定した上で領域画像を抽出し、領域画像から句読点を検出する。句読点の検出方法は、例えば、第１実施形態で説明した方向解析処理部２１０４の処理と同様に、句読点の部分画像を機械学習やパターンマッチングにより検出する方法でもよいし、公知の方法であってもよい。 In the present analysis process based on punctuation detection, the direction analysis processing unit 2104 first detects punctuation marks, which are the specific characters, from the character string area (step S322). Specifically, the direction analysis processing unit 2104 sets a target area based on the character string area extracted by the extraction unit 2102, extracts an area image, and detects punctuation marks from the area image. The punctuation marks may be detected, for example, by detecting partial images of punctuation marks through machine learning or pattern matching, as in the processing of the direction analysis processing unit 2104 described in the first embodiment, or by any other known method.

なお、例えば、「、」（読点）は、正方向（反時計回りに０°）と反時計回り方向（反時計回りに９０°）とでは向きの違いがあるが、正方向（反時計回りに０°）と逆方向（反時計回りに１８０°）とでは向きの違いが判らないため、総合的方向別スコアに統合する対象となる全ての方向に関して検出を試みる必要は無く、いずれかの方向で検出されたか否かのみ判明すればよい。 Note that the comma 「、」, for example, looks different in the positive direction (0° counterclockwise) and the counterclockwise direction (90° counterclockwise), but cannot be told apart between the positive direction (0° counterclockwise) and the reverse direction (180° counterclockwise). It is therefore unnecessary to attempt detection for every direction to be integrated into the overall direction-specific score; it is sufficient to know whether the mark was detected in any direction.

また、かな文字の濁点や半濁点、「さんずいへん」の一部など、句読点ではないが、よく似た文字のパーツと区別するため、制約を設けて、制約を満たさない場合は句読点では無いとしてキャンセルするようにしてもよい。制約の例として、たとえば、句読点の候補として検出された領域から行方向に別の文字画素が存在しないことを挙げることができる。また、多くの文書レイアウトでは、句読点とその次の文字との間は字間の幅が大きくなるため、検出された句読点と、文字列領域において句読点の前後に位置する文字領域との距離を算出して、前方向に隣り合う文字領域との距離、もしくは後ろ方向に隣り合う文字領域との距離のうち、いずれか一方が所定値以上となることを、句読点であるか否かの条件として加えてもよい。 To distinguish punctuation marks from character parts that closely resemble them but are not punctuation, such as the dakuten and handakuten of kana characters or part of the "sanzui" (water) radical, constraints may be imposed so that a candidate that does not satisfy them is rejected as not being a punctuation mark. One example of such a constraint is that no other character pixels exist in the line direction from the area detected as a punctuation candidate. In addition, since many document layouts leave a wider gap between a punctuation mark and the following character, the distances between the detected punctuation mark and the character areas located immediately before and after it in the character string area may be calculated, and the requirement that either the distance to the preceding character area or the distance to the following character area be at least a predetermined value may be added as a condition for accepting the candidate as a punctuation mark.

図１５を参照して、抽出されている文字領域及び文字列領域に対して句読点を検出した場合の例を示す。図１５（ａ）は、点線で囲まれた範囲がそれぞれの文字領域であり、そのうち網掛けされた文字領域（領域Ｍ３００、領域Ｍ３０２）が、句読点として検出されていることを示す図である。この場合、領域Ｅ３００に示すように、５文字目の「で」など複数の文字において濁点が含まれるが、多くの場合、行方向において別の文字画素が存在し、また前方及び後方の文字領域との距離は小さいため、句読点としての検出はキャンセルされる。 With reference to FIG. 15, an example in which punctuation marks are detected in the extracted character area and character string area will be shown. FIG. 15A is a diagram showing that the range surrounded by the dotted line is each character area, and the shaded character area (area M300, area M302) is detected as a punctuation mark. In this case, as shown in the area E300, a dakuten is included in a plurality of characters such as the fifth character "de", but in many cases, another character pixel exists in the line direction, and the front and rear character areas are present. Since the distance to and is small, the detection as a punctuation mark is cancelled.

次に、方向解析処理部２１０４は、検出方向が未判定の句読点がある場合は、検出方向の判定の対象となる句読点を選択し（ステップＳ３２４；Ｙｅｓ→ステップＳ３２６）、選択した句読点と、前後の文字領域との位置関係を解析する（ステップＳ３２８）。 Next, when there is a punctuation mark whose detection direction has not been determined, the direction analysis processing unit 2104 selects a punctuation mark to be determined in the detection direction (step S324; Yes → step S326), and the selected punctuation mark and the front and back The positional relationship with the character area of is analyzed (step S328).

位置関係の解析については、具体的には、まず、方向解析処理部２１０４は、句読点と、句読点が属する文字列領域において、文字列方向に基づき、句読点の前方に隣り合う文字領域との距離Ｄ１、後方に隣り合う文字領域との距離Ｄ２を算出する。なお、文字列方向が水平方向の場合は、句読点の前方とは句読点の左側であり、句読点の後方とは句読点の右側である。文字列方向が垂直方向の場合は、句読点の前方とは句読点の上側であり、句読点の後方とは句読点の下側である。そして、ｐを正の所定係数とし、
Ｄ１＋ｐ＜Ｄ２・・・・・・（式１）
を満たすとき、句読点は後方に余白を持つとする。一方、式１を満たさず、
Ｄ２＋ｐ＜Ｄ１・・・・・・（式２）
を満たすとき、句読点は前方に余白を持つとする。式１及び式２のいずれも満たさない場合、句読点は前方、後方のいずれにも十分な余白が無いとする。前方、後方のいずれにも十分な余白が無い場合、検出された句読点は検出誤りであるとして、以降の処理を中断するようにしてもよい。
To analyze the positional relationship, the direction analysis processing unit 2104 first calculates, within the character string area to which the punctuation mark belongs and based on the character string direction, the distance D1 between the punctuation mark and the character area adjacent in front of it and the distance D2 between the punctuation mark and the character area adjacent behind it. When the character string direction is horizontal, "in front of" the punctuation mark means to its left and "behind" means to its right. When the character string direction is vertical, "in front of" means above the punctuation mark and "behind" means below it. Then, with p being a predetermined positive coefficient, the punctuation mark is regarded as having a margin behind it when
D1 + p < D2 ... (Equation 1)
is satisfied. When Equation 1 is not satisfied and
D2 + p < D1 ... (Equation 2)
is satisfied, the punctuation mark is regarded as having a margin in front of it. When neither Equation 1 nor Equation 2 is satisfied, the punctuation mark is regarded as having no sufficient margin either in front or behind. In that case, the detected punctuation mark may be treated as a detection error and the subsequent processing for it may be aborted.
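Equations 1 and 2 amount to a three-way classification, which can be sketched as follows. The function name `margin_side` and the labels "front" and "rear" are hypothetical; D1, D2, and p are as defined above.

```python
def margin_side(d1, d2, p):
    """Classify where a punctuation mark has its margin.

    d1: distance to the adjacent character area in front of the mark
    d2: distance to the adjacent character area behind the mark
    p:  predetermined positive coefficient
    Returns "rear" (Equation 1), "front" (Equation 2), or None when
    neither side has a sufficient margin.
    """
    if d1 + p < d2:        # Equation 1
        return "rear"
    if d2 + p < d1:        # Equation 2
        return "front"
    return None            # may be treated as a detection error

# Usage: a mark sitting close to the preceding character with a wide gap after it.
print(margin_side(d1=2, d2=12, p=4))
```

Because p is positive, the two conditions can never hold at the same time, so the order of the two checks does not affect the result.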

更に、方向解析処理部２１０４は、句読点と、句読点が属する文字列領域との位置関係を解析する（ステップＳ３３０）。具体的には、方向解析処理部２１０４は、検出された句読点及び文字列領域を示す位置座標のうち、行方向に対応する成分の最小値をＳ１及びＳ２、最大値をＥ１及びＥ２とする。更に、方向解析処理部２１０４は、文字列領域の行方向における中間位置をＭ２＝（Ｓ２＋Ｅ２）÷２として定義する。そして、方向解析処理部２１０４は、句読点の位置Ｓ１及びＥ１、文字列領域の中間位置Ｍ２の関係から、句読点が文字列領域の上半分（もしくは左半分）にあるか、下半分（もしくは右半分）にあるか、もしくはそのどちらでも無いかを判定する。 Further, the direction analysis processing unit 2104 analyzes the positional relationship between the punctuation mark and the character string area to which it belongs (step S330). Specifically, among the position coordinates of the detected punctuation mark and of the character string area, the direction analysis processing unit 2104 takes the minimum values of the component along the line direction as S1 (punctuation mark) and S2 (character string area), and the maximum values as E1 and E2, respectively. The direction analysis processing unit 2104 also defines the intermediate position of the character string area in the line direction as M2 = (S2 + E2) / 2. Then, from the relationship between the punctuation mark positions S1 and E1 and the intermediate position M2 of the character string area, the direction analysis processing unit 2104 determines whether the punctuation mark lies in the upper half (or left half) of the character string area, in the lower half (or right half), or in neither.

判定方法の例として、たとえば以下の条件に基づいて判定することができる。
Ｅ１ ≦ Ｍ２＋β ・・・・・・（式３）
Ｓ１ ≧ Ｍ２−β ・・・・・・（式４）
式３を満たすとき、方向解析処理部２１０４は、句読点が文字列領域の上半分（もしくは左半分）にあるとする。式４を満たすとき、方向解析処理部２１０４は、句読点が文字列領域の下半分（もしくは右半分）にあるとする。式３及び式４の双方を満たす場合、もしくは逆に双方とも満たさない場合は、方向解析処理部２１０４は、句読点は中間位置周辺にあるか、文字列領域の行方向にわたって大きな幅を持っているとして、どちら側でも無いとする。係数βは式３及び式４の条件を満たす範囲を調整する係数であり、正の値を取る場合は、中間位置よりも多少はみ出ていても許容するための条件として緩和される一方、負の値を取る場合は条件がより厳しくなる。
As an example of the determination method, the determination can be made based on the following conditions.
E1 ≤ M2 + β ... (Equation 3)
S1 ≥ M2 − β ... (Equation 4)
When Equation 3 is satisfied, the direction analysis processing unit 2104 regards the punctuation mark as lying in the upper half (or left half) of the character string area. When Equation 4 is satisfied, it regards the punctuation mark as lying in the lower half (or right half). When both Equation 3 and Equation 4 are satisfied, or when neither is satisfied, the direction analysis processing unit 2104 regards the punctuation mark as lying near the intermediate position or as spanning a large width in the line direction of the character string area, and therefore as belonging to neither side. The coefficient β adjusts the range that satisfies Equations 3 and 4: a positive value relaxes the conditions so that a mark extending slightly past the intermediate position is still accepted, while a negative value makes the conditions stricter.
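The half-of-line test of Equations 3 and 4 can be sketched as below. The function name `half_of_line` and the return labels are hypothetical; "first" stands for the upper (or left) half and "second" for the lower (or right) half, and coordinates are assumed to increase downward (or rightward) along the line direction.

```python
def half_of_line(s1, e1, s2, e2, beta=0.0):
    """Decide which half of the character string area a punctuation mark is in.

    s1, e1: min and max of the mark along the line direction
    s2, e2: min and max of the character string area along the line direction
    beta:   tolerance; positive relaxes the test, negative tightens it
    """
    m2 = (s2 + e2) / 2               # intermediate position M2
    in_first = e1 <= m2 + beta       # Equation 3
    in_second = s1 >= m2 - beta      # Equation 4
    if in_first and not in_second:
        return "first"
    if in_second and not in_first:
        return "second"
    return None                      # both or neither: not on either side

# Usage: a line spanning 0..20 with a mark occupying 14..18 (lower half).
print(half_of_line(s1=14, e1=18, s2=0, e2=20))
```

Returning None when both conditions hold covers the case the text mentions, where a generous β lets a mark near the middle pass both tests.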

図１５（ｂ）は、距離Ｄ１及びＤ２、位置Ｓ１、Ｓ２、Ｅ１、Ｅ２及び中間位置Ｍ２の例を示す図である。図１５（ｂ）に示す文字列領域は水平方向に文字領域が並ぶ水平方向の文字列であり、行方向は垂直方向となるため、位置の行方向に対応する成分はＹ成分となる。 FIG. 15B is a diagram showing an example of distances D1 and D2, positions S1, S2, E1, E2, and intermediate position M2. The character string area shown in FIG. 15B is a character string in the horizontal direction in which the character areas are arranged in the horizontal direction, and the line direction is the vertical direction. Therefore, the component corresponding to the line direction of the position is the Y component.

句読点の検出された文字領域と前後の文字領域との位置関係及び文字列領域との位置関係の解析結果から、方向解析処理部２１０４は、句読点毎に、検出方向を判定する（ステップＳ３３２）。また、方向解析処理部２１０４は、判定した検出方向に基づき、方向別スコアを算出し、加算する（ステップＳ３３４）。方向別スコアの算出方法の例として、たとえば、前記の位置関係の解析結果の組合せから最も適切な検出方向を１つ判定して、判定した検出方向に対応する方向別スコアを加算することができる。 From the results of analyzing the positional relationship between the character area in which the punctuation mark was detected and the preceding and following character areas, and its positional relationship with the character string area, the direction analysis processing unit 2104 determines a detection direction for each punctuation mark (step S332). The direction analysis processing unit 2104 then calculates and adds a direction-specific score based on the determined detection direction (step S334). As an example of how to calculate the direction-specific score, the single most appropriate detection direction can be determined from the combination of the above analysis results, and the direction-specific score corresponding to that detection direction can be incremented.

つづいて、方向解析処理部２１０４は、ステップＳ３２４に戻り、判定方向が未判定の句読点があるか否かを判定する。このとき、候補判定が未判定の句読点がない場合、すなわち、全ての検出された句読点について方向別スコアの加算を終えた場合は、方向解析処理部２１０４は本解析処理を終了する（ステップＳ３２４；Ｎｏ）。 Subsequently, the direction analysis processing unit 2104 returns to step S324 and determines whether or not there is a punctuation mark whose determination direction is undetermined. At this time, if there are no punctuation marks for which the candidate determination has not been determined, that is, when the addition of the directional scores for all the detected punctuation marks is completed, the direction analysis processing unit 2104 ends the present analysis process (step S324; No).

図１６を参照して、本解析処理についての動作例を説明する。図１６（ａ）及び図１６（ｂ）は、前記の位置関係の解析結果の組合せと、対応する最も適切な検出方向との関係を表すテーブルの例である。図１６（ａ）は文字列方向が水平方向である場合、図１６（ｂ）は文字列方向が垂直方向である場合のテーブルを示す。 An operation example of this analysis process will be described with reference to FIG. 16 (a) and 16 (b) are examples of a table showing the relationship between the combination of the analysis results of the positional relationship and the corresponding most appropriate detection direction. FIG. 16A shows a table when the character string direction is horizontal, and FIG. 16B shows a table when the character string direction is vertical.

図１６（ａ）を参照して、文字列方向が水平方向である場合を例に説明する。方向解析処理部２１０４は、句読点が文字列領域の下半分にあり、前後の文字領域との関係において右側（後方）に余白を持つ場合、検出方向として正方向（反時計回りに０°）を判定する。方向解析処理部２１０４は、句読点が文字列領域の上半分にあり、前後の文字領域との関係において左側（前方）に余白を持つ場合、検出方向として逆方向（反時計回りに１８０°）を判定する。方向解析処理部２１０４は、句読点が文字列領域の上半分にあり、前後の文字領域との関係において右側（後方）に余白を持つ場合、検出方向として反時計回り方向（反時計回りに９０°）を判定する。方向解析処理部２１０４は、句読点が文字列領域の下半分にあり、前後の文字領域との関係において左側（前方）に余白を持つ場合、検出方向として時計回り方向（反時計回りに２７０°）を判定する。上記のいずれにも該当しない場合、適切な検出方向が見つからないとして、方向解析処理部２１０４は、検出なし（いずれの方向も検出方向としない）と判定する。方向解析処理部２１０４は、いずれかの検出方向を判定したとき、その検出方向に対応する方向別スコアを加算する。スコアは定数でもよいし、前記算出した距離や位置関係に基づいて動的に算出してもよい。 A case where the character string direction is horizontal will be described as an example with reference to FIG. 16(a). When the punctuation mark is in the lower half of the character string area and has a margin on its right side (behind) relative to the preceding and following character areas, the direction analysis processing unit 2104 determines the positive direction (0° counterclockwise) as the detection direction. When the punctuation mark is in the upper half and has a margin on its left side (in front), it determines the reverse direction (180° counterclockwise). When the punctuation mark is in the upper half and has a margin on its right side (behind), it determines the counterclockwise direction (90° counterclockwise). When the punctuation mark is in the lower half and has a margin on its left side (in front), it determines the clockwise direction (270° counterclockwise). If none of the above applies, the direction analysis processing unit 2104 concludes that no appropriate detection direction can be found and determines "no detection" (no direction is taken as the detection direction). When the direction analysis processing unit 2104 determines one of the detection directions, it adds the direction-specific score corresponding to that direction. The score may be a constant or may be calculated dynamically based on the calculated distances and positional relationships.

また、図１６（ｂ）を参照して、文字列方向が垂直方向である場合の例について説明する。方向解析処理部２１０４は、文字列方向が垂直方向である場合は、句読点が文字列領域の右半分にあり、前後の文字領域との関係において下側（後方）に余白を持つ場合、検出方向として正方向（反時計回りに０°）を判定する。方向解析処理部２１０４は、句読点が文字列領域の左半分にあり、前後の文字領域との関係において上側（前方）に余白を持つ場合、検出方向として逆方向（反時計回りに１８０°）を判定する。方向解析処理部２１０４は、句読点が文字列領域の右半分にあり、前後の文字領域との関係において上側（前方）に余白を持つ場合、検出方向として反時計回り方向（反時計回りに９０°）を判定する。方向解析処理部２１０４は、句読点が文字列領域の左半分にあり、前後の文字領域との関係において下側（後方）に余白を持つ場合、検出方向として時計回り方向（反時計回りに２７０°）を判定する。文字列方向が垂直方向である場合も、上記のいずれにも該当しないときは、方向解析処理部２１０４は、適切な検出方向が見つからないとして、検出なしと判定すればよい。また、方向解析処理部２１０４は、いずれかの検出方向を判定したとき、その検出方向に対応する方向別スコアを加算する。スコアは定数でもよいし、前記算出した距離や位置関係に基づいて動的に算出してもよい。 Next, an example in which the character string direction is vertical will be described with reference to FIG. 16(b). In this case, when the punctuation mark is in the right half of the character string area and has a margin on its lower side (behind) relative to the preceding and following character areas, the direction analysis processing unit 2104 determines the positive direction (0° counterclockwise) as the detection direction. When the punctuation mark is in the left half and has a margin on its upper side (in front), it determines the reverse direction (180° counterclockwise). When the punctuation mark is in the right half and has a margin on its upper side (in front), it determines the counterclockwise direction (90° counterclockwise). When the punctuation mark is in the left half and has a margin on its lower side (behind), it determines the clockwise direction (270° counterclockwise). Here too, if none of the above applies, the direction analysis processing unit 2104 may conclude that no appropriate detection direction can be found and determine "no detection". When the direction analysis processing unit 2104 determines one of the detection directions, it adds the direction-specific score corresponding to that direction. The score may be a constant or may be calculated dynamically based on the calculated distances and positional relationships.
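The two tables of FIG. 16(a) and FIG. 16(b) can be written as a single lookup, sketched below. The key labels ("horizontal", "upper", "rear", and so on), the names `DETECTION_TABLE` and `add_punctuation_vote`, and the constant default weight are all assumptions made for illustration; the angle values follow the combinations listed in the text above.

```python
# (character string direction, half of the area, side with margin) -> angle (deg, CCW)
DETECTION_TABLE = {
    ("horizontal", "lower", "rear"): 0,     # positive direction
    ("horizontal", "upper", "front"): 180,  # reverse direction
    ("horizontal", "upper", "rear"): 90,    # counterclockwise direction
    ("horizontal", "lower", "front"): 270,  # clockwise direction
    ("vertical", "right", "rear"): 0,
    ("vertical", "left", "front"): 180,
    ("vertical", "right", "front"): 90,
    ("vertical", "left", "rear"): 270,
}

def add_punctuation_vote(scores, line_dir, half, margin, weight=1.0):
    """Look up the detection direction for one punctuation mark and add its
    score. Returns the angle, or None for "no detection" (scores unchanged)."""
    angle = DETECTION_TABLE.get((line_dir, half, margin))
    if angle is not None:
        scores[angle] = scores.get(angle, 0.0) + weight
    return angle

# Usage: two marks in a horizontal string, one of them with no usable margin.
scores = {0: 0.0, 90: 0.0, 180: 0.0, 270: 0.0}
add_punctuation_vote(scores, "horizontal", "lower", "rear")
add_punctuation_vote(scores, "horizontal", "lower", None)
print(scores)
```

A constant weight is the simplest choice; the text also allows the score to be computed from the measured distances, which would replace the `weight` argument with a value derived from them.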

文字列方向が水平方向の場合に方向解析処理部２１０４が判定する検出方向について説明する。図１６（ｃ）は文字列方向が水平方向である文字列領域Ｅ３１０を示した図である。文字列領域Ｅ３１０に現れた文字は転倒しておらず、正方向の方向を示している。また、文字列領域Ｅ３１０には、読点Ｍ３１０が含まれている。図１６（ｃ）に記載した格子Ｇ３１０は、読点Ｍ３１０の位置を説明するために記載したものである。ここで、読点Ｍ３１０は、格子Ｇ３１０の左下に位置しており、前後の文字領域との関係において右側に広い余白を持ち、文字列領域の下半分にあることに対応する。したがって、方向解析処理部２１０４は、読点Ｍ３１０の検出方向として、文字列領域Ｅ３１０の文字の方向と同じ正方向と判定することができる。 The detection direction determined by the direction analysis processing unit 2104 when the character string direction is horizontal will now be described. FIG. 16(c) shows a character string area E310 whose character string direction is horizontal. The characters in the character string area E310 are not rotated and face the positive direction. The character string area E310 contains a comma M310. The grid G310 drawn in FIG. 16(c) is shown to explain the position of the comma M310. The comma M310 is located at the lower left of the grid G310, which corresponds to having a wide margin on its right side relative to the preceding and following character areas and lying in the lower half of the character string area. Therefore, the direction analysis processing unit 2104 can determine the detection direction of the comma M310 to be the positive direction, the same as the direction of the characters in the character string area E310.

別の例について説明する。図１６（ｄ）は、文字列方向が水平方向である文字列領域Ｅ３２０を示した図である。文字列領域Ｅ３２０に現れた文字は、反時計回り方向を向いている。文字列領域Ｅ３２０には、格子Ｇ３２０の左上の位置に読点Ｍ３２０が含まれている。すなわち、読点Ｍ３２０は、前後の文字領域との関係において右側に広い余白を持ち、文字列領域の上半分にある。したがって、方向解析処理部２１０４は、読点Ｍ３２０の検出方向として、文字列領域Ｅ３２０の文字の方向と同じ反時計回り方向と判定することができる。 Another example will be described. FIG. 16(d) shows a character string area E320 whose character string direction is horizontal. The characters in the character string area E320 face the counterclockwise direction. The character string area E320 contains a comma M320 at the upper left of the grid G320. That is, the comma M320 has a wide margin on its right side relative to the preceding and following character areas and lies in the upper half of the character string area. Therefore, the direction analysis processing unit 2104 can determine the detection direction of the comma M320 to be the counterclockwise direction, the same as the direction of the characters in the character string area E320.

つづいて、文字列方向が垂直方向の場合に方向解析処理部２１０４が判定する検出方向について説明する。図１６（ｅ）は、文字列方向が垂直方向である文字列領域Ｅ３３０を示した図である。文字列領域Ｅ３３０に現れた文字は、転倒しておらず、正方向の方向を向いている。文字列領域Ｅ３３０には、格子Ｇ３３０の右上の位置に読点Ｍ３３０が含まれている。すなわち、読点Ｍ３３０は、前後の文字領域との関係において下側に広い余白を持ち、文字列領域の右半分にある。したがって、方向解析処理部２１０４は、読点Ｍ３３０の検出方向として、文字列領域Ｅ３３０の文字の方向と同じ正方向と判定することができる。 Next, the detection direction determined by the direction analysis processing unit 2104 when the character string direction is vertical will be described. FIG. 16(e) shows a character string area E330 whose character string direction is vertical. The characters in the character string area E330 are not rotated and face the positive direction. The character string area E330 contains a comma M330 at the upper right of the grid G330. That is, the comma M330 has a wide margin on its lower side relative to the preceding and following character areas and lies in the right half of the character string area. Therefore, in accordance with the table of FIG. 16(b), the direction analysis processing unit 2104 can determine the detection direction of the comma M330 to be the positive direction, the same as the direction of the characters in the character string area E330.

別の例について説明する。図１６（ｆ）は、文字列方向が垂直方向である文字列領域Ｅ３４０を示した図である。文字列領域Ｅ３４０に現れた文字は、反時計回り方向を向いている。文字列領域Ｅ３４０には、格子Ｇ３４０の右下に位置に読点Ｍ３４０が含まれている。すなわち、読点Ｍ３４０は、前後の文字領域との関係において上側に広い余白を持ち、文字列領域の右半分にある。したがって、方向解析処理部２１０４は、読点Ｍ３４０の検出方向として、文字列領域Ｅ３４０の文字の方向と同じ反時計回り方向と判定することができる。 Another example will be described. FIG. 16(f) shows a character string area E340 whose character string direction is vertical. The characters in the character string area E340 face the counterclockwise direction. The character string area E340 contains a comma M340 at the lower right of the grid G340. That is, the comma M340 has a wide margin on its upper side relative to the preceding and following character areas and lies in the right half of the character string area. Therefore, the direction analysis processing unit 2104 can determine the detection direction of the comma M340 to be the counterclockwise direction, the same as the direction of the characters in the character string area E340.

In this way, the direction analysis processing unit 2104 executes the k-th direction analysis process of step S304. In step S306, the direction analysis processing unit 2104 then adds the final k-th direction-specific scores obtained by the k-th direction analysis process of step S304 to the overall direction-specific scores, thereby combining them with the direction-specific scores obtained by the other direction analysis processing means.
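The accumulation in step S306 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes four candidate directions expressed as counterclockwise rotations of 0°, 90°, 180°, and 270°, and the function names `accumulate_scores` and `best_direction` are hypothetical.

```python
# Assumption: four candidate directions, as counterclockwise degrees.
DIRECTIONS = (0, 90, 180, 270)


def accumulate_scores(per_process_scores):
    """Sum the direction-specific scores produced by each analysis process.

    per_process_scores: iterable of dicts mapping direction -> score,
    one dict per direction analysis process (k = 1, 2, ...).
    """
    total = {direction: 0 for direction in DIRECTIONS}
    for scores in per_process_scores:
        for direction, score in scores.items():
            total[direction] += score
    return total


def best_direction(total):
    """Return the direction with the highest overall score."""
    return max(DIRECTIONS, key=lambda direction: total[direction])


# Two hypothetical analysis processes vote on the direction.
overall = accumulate_scores([{0: 3, 90: 1}, {0: 1, 180: 5}])
print(overall, best_direction(overall))
```

Keeping one score table per process and summing at the end means a process can be added or removed without touching the others.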

The k-th direction analysis process may also use a method different from the one described above. The k-th direction analysis process may also adopt the same method as the direction analysis process described in the first embodiment, with a difference introduced between the processes, for example one using part detection by machine learning and another using part detection by pattern matching.

The direction analysis processing unit 2104 may also differentiate the direction analysis processes by changing, for each process, how the target area is set or how scores are assigned, thereby strengthening the processing method described in the first embodiment.

The direction analysis processing unit 2104 may also execute steps common to a plurality of direction analysis processes together, or share commonly used data among them. For example, when two direction analysis processes both include rotating the area image, the direction analysis processing unit 2104 may temporarily store the area image rotated in one direction analysis process and reuse that rotated area image in the other. In this way, the direction analysis processing unit 2104 can execute a plurality of direction analysis processes efficiently.
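One way to reuse rotated area images is a small cache keyed by rotation angle. The sketch below is an illustration under stated assumptions: the image is modeled as a tuple of row tuples, only multiples of 90° are supported, and the class name `RotationCache` is hypothetical.

```python
def rotate_ccw_90(image):
    """Rotate a 2-D image (tuple of row tuples) 90 degrees counterclockwise."""
    return tuple(zip(*image))[::-1]


class RotationCache:
    """Stores each rotated area image so later analysis processes can reuse it."""

    def __init__(self, image):
        self._cache = {0: image}
        self.rotations_computed = 0  # counts actual rotation operations

    def get(self, angle):
        angle %= 360
        if angle % 90 != 0:
            raise ValueError("only multiples of 90 degrees are supported")
        if angle not in self._cache:
            # Build on the previous rotation, which is itself cached.
            previous = self.get(angle - 90)
            self._cache[angle] = rotate_ccw_90(previous)
            self.rotations_computed += 1
        return self._cache[angle]


cache = RotationCache(((1, 2), (3, 4)))
first = cache.get(90)    # computed by the first analysis process
second = cache.get(90)   # reused by the second analysis process
print(first, cache.rotations_computed)
```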

By combining direction analysis processes in this way, the present embodiment can improve the accuracy of determining the direction of the input image and can execute direction analysis processing suited to the input image.

[4. Fourth Embodiment]
Next, the fourth embodiment will be described. In the fourth embodiment, the language type of the document is determined from the features of the character areas extracted from the input image, and the direction correction process corresponding to that language type is executed. In this embodiment, FIG. 3 of the first embodiment is replaced with FIG. 17. The same functional units and processes are denoted by the same reference numerals, and their description is omitted.

The direction correction unit 210 in the present embodiment supports M (≧ 1) language types, and the direction analysis processing unit 2104 includes a different direction analysis processing means for each language type. One language type may be assigned to each language, or languages that use similar characters and grammar may be grouped into a single language type. For convenience, the M language types are referred to as the first language type, ..., and the M-th language type.

FIG. 17 shows the procedure of the direction correction process executed by the direction correction unit 210 in the present embodiment. In the present embodiment, after the areas containing elements that constitute characters are extracted, the direction analysis processing unit 2104 determines the language type of the input image (step S402). Once the language type is determined, the number L (1 ≦ L ≦ M) corresponding to the determined language type is decided. This is called language type L.

A known method can be used to determine the language type. For example, the language family identified by the method described in Patent Document 3 can be used as the language type. The method described in Patent Document 3 determines the language family from features based on pixel patterns; for example, it distinguishes Latin-based alphabets from Asian languages by the number of pixel-pattern features per character. Accordingly, the direction analysis processing unit 2104 may, for example, classify with M = 3, assigning L = 1 to Latin-based languages, L = 2 to Asian languages, and L = 3 to all others.

After determining the language type, the direction analysis processing unit 2104 extracts the area images and executes, on the extracted area images, the L-th direction analysis process corresponding to the language type number L (step S404). The direction analysis processing means of the direction analysis processing unit 2104 may be, for example, the method described in the first or second embodiment applied only to the target parts corresponding to the language type of number L. As in the third embodiment, the direction analysis processing unit 2104 may combine a plurality of different direction analysis processing means for each language type.

When no supported language type is found (in the example above, L = 3, "others", is such a case), the direction analysis processing unit 2104 may treat all target parts as detection targets without restricting them by language type. Alternatively, the direction analysis processing unit 2104 may treat the language as unsupported and output the input image as it is (at 0° counterclockwise) without determining the direction of the input image or rotating it.
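The selection of target parts by language type, including both fallback behaviors for an unsupported type, might look like the sketch below. The numbering (Latin-based = 1, Asian = 2, others = 3) follows the example in the text; the part names in `TARGET_PARTS` and the function `select_target_parts` are hypothetical placeholders, not values from the patent.

```python
LATIN, ASIAN, OTHER = 1, 2, 3  # language type numbers L, with M = 3

# Hypothetical target parts per supported language type.
TARGET_PARTS = {
    LATIN: ["word_the", "word_The"],
    ASIAN: ["radical_a", "radical_b"],
}


def select_target_parts(language_type, skip_unsupported=False):
    """Return the target parts to detect for the given language type.

    For an unsupported type, either return None (the caller then outputs
    the input image unrotated) or fall back to every known target part.
    """
    if language_type in TARGET_PARTS:
        return list(TARGET_PARTS[language_type])
    if skip_unsupported:
        return None
    return [part for parts in TARGET_PARTS.values() for part in parts]


print(select_target_parts(ASIAN))
print(select_target_parts(OTHER))
print(select_target_parts(OTHER, skip_unsupported=True))
```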

According to the present embodiment, determining the language type and switching the processing accordingly allows the direction analysis process to be restricted to the target parts of the language type judged to be present in the document, rather than covering the enormous set of target parts for multiple languages. This reduces the amount of processing needed to determine the direction of the input image, and avoiding detection of irrelevant target parts also reduces erroneous determinations.

[5. Fifth Embodiment]
The first to fourth embodiments describe configurations in which the image processing device is the one included in an image forming device, but the invention is not limited to this. The fifth embodiment describes the case where the image processing device described in the first to fourth embodiments is the one included in an image reading device such as a flatbed scanner.

The present embodiment can be applied to any of the first to fourth embodiments; its application to the first embodiment is described here. In this case, FIG. 1 of the first embodiment is replaced with FIG. 18. The same functional units and processes are denoted by the same reference numerals, and their description is omitted.

FIG. 18 is a block diagram showing the configuration of an image reading device 2 including the image processing device 20 according to the present embodiment. As shown in FIG. 18, the image reading device 2 includes an image input device 10, an image processing device 20, a transmission/reception device 40, an operation panel 50, a storage unit 60, and a control unit 70. The image processing device 20 includes an A/D conversion unit 202, a shading correction unit 204, a document type determination unit 206, an ACS determination unit 208, a direction correction unit 210, an input gradation correction unit 212, an area separation processing unit 214, a color correction unit 216, a spatial filter processing unit 220, a resolution conversion processing unit 222, an output gradation correction unit 224, a gradation reproduction processing unit 226, and a compression processing unit 228. When the direction correction unit 210 executes the direction correction process, the direction of the input image is determined and the image is rotated, as in the first embodiment.

The various processes executed by the image reading device 2 are controlled by the control unit 70 (a computer including a processor such as a CPU or DSP) provided in the image reading device 2. In the present embodiment, the image reading device 2 is not limited to a scanner and may be, for example, a digital still camera, a document camera, or an electronic device equipped with a camera (for example, a mobile phone, smartphone, or tablet terminal).

When the present embodiment is applied to the second to fourth embodiments, the image reading device 2 may be configured with an image processing device 20 that executes the direction correction process described in the respective embodiment.

As described above, according to the present embodiment, the direction of the input image can be determined and corrected even in devices such as scanners, digital still cameras, and electronic devices equipped with a camera.

[6. Sixth Embodiment]
The first to fifth embodiments describe an image processing device 20 that determines the direction of each input image read from a document by the image input device 10 and outputs an image rotated according to the determined direction. In the sixth embodiment, by contrast, the image processing device 20 transmits the input images as image files without determining their direction or rotating them, and the receiving computer determines the direction of each input image in software. In the present embodiment, the user is informed that a read image may not be in the correct (positive) direction and decides whether to rotate it.

FIG. 19 is an example of a screen W600 displayed when the software of the present embodiment is executed. The software determines a recommended direction for each input image (the direction in which to rotate the input image to bring it to the positive direction) and prompts the user to rotate the input image. The recommended direction for each input image can be determined, for example, from the direction-specific scores obtained by executing steps S102 to S110 of the direction correction process executed by the direction correction unit 210 described in the first to fourth embodiments.

In the example of FIG. 19, thumbnail images corresponding to the input images are listed with their page numbers in the area E600 on the left side of the screen W600, and the thumbnail of the page selected by the user is highlighted with a thicker border than the thumbnails of the other pages. The sign icons M600 and M602 are examples of a way to indicate that the determined direction of an input image is not the positive direction (0° counterclockwise). When the determined direction of an input image is not the positive direction, displaying a cautionary figure or character string next to the page number lets the user know that the page may not have been read in the correct (positive) direction.

In the example of FIG. 19, the area E602 on the right side of the screen W600 displays, side by side, the input image of the page selected by the user and the image obtained by rotating that input image according to the recommended direction determined by the software of the present embodiment. The right-pointing arrow A600 between the two images accepts an instruction to make the input image (original) rotated according to the recommended direction the latest state of that page's image. The left-pointing arrow A602 accepts an instruction to return from the rotated state to the state of the image as input, that is, to cancel the rotation. The initial state of each page's image is the state at the time of input, and the latest state is updated page by page by, for example, clicking the left and right arrows.

In FIG. 19, of the input image and the image rotated according to the recommended direction, the one corresponding to the latest state is highlighted with a thick frame, so that the user can easily tell which is the latest state of the selected page. For each selected page, the user can compare the input image with the image rotated according to the recommended direction and switch between them as needed.

The image may be saved in its latest state each time the latest state changes, or the selected page may be saved when saving is instructed from a menu provided in the software of the present embodiment. Alternatively, changes to the latest state of each page may be retained, not discarded, until the software is closed, and saved together when saving is instructed from the menu. A function may also be provided to rotate, in a single operation, all pages whose recommended direction is not the positive direction (0° counterclockwise) to their recommended directions.
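The per-page state handling described for screen W600 can be modeled with a small data structure. This is only a sketch under assumptions: directions are counterclockwise degrees, and the names `Page`, `apply_recommended`, `revert`, and `apply_all_recommended` are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    recommended: int   # recommended rotation, counterclockwise degrees
    applied: int = 0   # initial state: the image as input (no rotation)

    @property
    def needs_attention(self):
        """True when a warning icon such as M600 would be shown."""
        return self.recommended != 0

    def apply_recommended(self):
        """Right-pointing arrow A600: adopt the recommended rotation."""
        self.applied = self.recommended

    def revert(self):
        """Left-pointing arrow A602: cancel the rotation."""
        self.applied = 0


def apply_all_recommended(pages):
    """Batch operation: rotate every flagged page to its recommended direction."""
    for page in pages:
        if page.needs_attention:
            page.apply_recommended()


pages = [Page(1, 0), Page(2, 180), Page(3, 90)]
apply_all_recommended(pages)
pages[2].revert()  # the user rejects the suggestion for page 3
print([(p.number, p.applied) for p in pages])
```

Storing only the applied angle, rather than a rotated copy of the image, keeps the revert operation trivial and defers the actual rotation until the page is saved.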

According to the present embodiment, determination and correction of the direction of the input image can be realized in software on the computer that receives the input image, rather than in a device such as an image forming device or an image processing device.

[7. Seventh Embodiment]
The first embodiment gives kanji radicals as an example of target parts, but other character types, and the parts that constitute those characters, may be defined as target parts as long as they satisfy the criteria. The present embodiment describes cases where something other than kanji radicals is defined as the target parts.

For example, Hangul may be used as target parts, as in FIGS. 20(a) and 20(b), or hiragana such as "あ" or "き" may be used.

In European-language text, the fact that spaces are inserted between words can be exploited to define target parts not as single characters or portions of characters but as whole words made up of a relatively small number of characters. In this case, pairs of character areas whose distance along the character string direction is within a predetermined distance are regarded as connected, and each set of connected character areas is redefined as one character area, so that a word can be treated as a single character area. For example, FIGS. 20(c), 20(d), and 20(e) are examples in which words are used as target parts. Using words as parts increases asymmetry, and because the spaces in European-language text make it easy to split a line of text into words, setting a target area for each separated word and detecting target parts in it also improves detection efficiency.
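The grouping of character areas into words can be sketched as a one-dimensional interval merge along the character string direction. The sketch assumes each character area is reduced to a hypothetical `(start, end)` span on that axis, and `max_gap` stands in for the predetermined distance; neither name comes from the patent.

```python
def merge_into_words(char_spans, max_gap):
    """Merge character spans whose gap is at most max_gap into word spans.

    char_spans: list of (start, end) positions along the string direction.
    Returns a list of merged (start, end) spans, one per word.
    """
    words = []
    for start, end in sorted(char_spans):
        if words and start - words[-1][1] <= max_gap:
            # Connected to the previous area: extend the current word.
            previous_start, previous_end = words[-1]
            words[-1] = (previous_start, max(previous_end, end))
        else:
            # Gap too wide: this character starts a new word.
            words.append((start, end))
    return words


# Four characters forming two words separated by a wide space.
print(merge_into_words([(0, 5), (6, 10), (20, 25), (26, 30)], max_gap=2))
```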

As shown in FIGS. 20(d) and 20(e), defining parts that begin with a capital letter is also useful: such parts appear less often, but they are also falsely detected less often.

[8. Modifications]
The present invention is not limited to the embodiments described above, and various modifications are possible. That is, embodiments obtained by combining technical means modified as appropriate without departing from the gist of the present invention are also included in the technical scope of the present invention. In the embodiments described above, RGB analog signals are input from the image input device 10 to the image processing device 20 and color image data is handled, but the invention is not limited to this and may be configured to handle monochrome image data.

Although parts of the embodiments above are described separately for convenience of explanation, they may of course be combined and executed to the extent technically possible. For example, the third and fourth embodiments may be combined so that a plurality of direction analysis processing means are executed according to the determined language type.

The program that runs on each device in the embodiments is a program that controls a CPU or the like (a program that causes a computer to function) so as to realize the functions of the embodiments described above. The information handled by these devices is temporarily accumulated in a temporary storage device (for example, RAM) during processing, then stored in a storage device such as a ROM (Read Only Memory) or HDD, and read, modified, and written by the CPU as necessary.

The recording medium that stores the program may be any of a semiconductor medium (for example, a ROM or a non-volatile memory card), an optical or magneto-optical recording medium (for example, a DVD (Digital Versatile Disc), MO (Magneto Optical Disc), MD (Mini Disc), CD (Compact Disc), or BD (Blu-ray Disc) (registered trademark)), a magnetic recording medium (for example, magnetic tape or a flexible disk), and the like. The functions of the embodiments described above are realized by executing the loaded program; in addition, the functions of the present invention may be realized by processing performed jointly with the operating system or other application programs based on the instructions of that program.

For distribution on the market, the program can be stored on a portable recording medium and distributed, or transferred to a server computer connected via a network such as the Internet. In this case, the storage device of the server computer is of course also included in the present invention.

1 Image forming device
10 Image input device
20 Image processing device
202 A/D conversion unit
204 Shading correction unit
206 Document type determination unit
208 ACS determination unit
210 Direction correction unit
2102 Extraction unit
2104 Direction analysis processing unit
2106 Determination processing unit
2108 Rotation processing unit
212 Input gradation correction unit
214 Area separation processing unit
216 Color correction unit
218 Black generation and undercolor removal unit
220 Spatial filter processing unit
222 Resolution conversion processing unit
224 Output gradation correction unit
226 Gradation reproduction processing unit
228 Compression processing unit
30 Image output device
40 Transmission/reception device
50 Operation panel
60 Storage unit
70 Control unit
2 Image reading device

Claims

An image processing apparatus comprising:
an extraction unit that extracts, from an input image, an image of an area containing elements that constitute characters as an area image;
a detection unit that detects, from the area image, a partial image showing the elements that constitute the characters; and
a determination unit that determines the direction of the input image based on the detected partial image.

The image processing apparatus according to claim 1, wherein the detection unit detects a partial image from images obtained by rotating the region image in a plurality of directions, and
the determination unit calculates a score for each direction in which the region image is rotated based on the partial image detected by the detection unit, and determines the direction of the input image based on the calculated scores.

The image processing apparatus according to claim 2, wherein the image rotated in a plurality of directions includes an image in at least a positive direction.

The image processing apparatus according to claim 2 or 3, wherein the determination unit calculates, as a score, the number of partial images detected by the detection unit in the region image, and determines the direction of the input image based on the direction of the region image having the highest score.

The image processing apparatus according to claim 1, wherein the detection unit detects, from the region image, the partial image rotated in a plurality of directions, and
the determination unit calculates a score for each direction in which the partial image is rotated based on the partial image detected by the detection unit, and determines the direction of the input image based on the calculated scores.

The image processing apparatus according to claim 5, further comprising a generation unit that generates a partial image rotated in a plurality of directions from the one partial image.

The image processing apparatus according to claim 5 or 6, wherein the detection unit detects a partial image from the region image based on at least an image of the partial image in the positive direction and an image rotated in a direction different from the positive direction.

The image processing apparatus according to any one of claims 5 to 7, wherein the determination unit calculates, as a score, the number of partial images detected by the detection unit in the region image, and determines the direction of the partial image having the highest score as the direction of the input image.

The image processing apparatus according to any one of claims 1 to 8, wherein the partial image is an image showing a radical of a Chinese character.

The image processing apparatus according to any one of claims 1 to 9, wherein when the detection unit detects a partial image from a specific position of the region image, the detection is ignored.

The image processing apparatus according to any one of claims 1 to 10, wherein the detection unit further detects, from the area image, a specific character different from the partial image, and
the determination unit determines the direction of the input image based on the partial image and the specific character detected by the detection unit.

The image processing apparatus according to claim 1, further comprising a language type determination unit that determines a language type from the input image,
wherein the detection unit detects a partial image corresponding to the language type from the region image.

The image processing apparatus according to any one of claims 1 to 12, further comprising a correction unit that corrects the direction of the input image to the positive direction when the direction of the input image determined by the determination unit is other than the positive direction.

An image reading device comprising:
the image processing apparatus according to any one of claims 1 to 13; and
an image input unit that reads an image and inputs the read image to the image processing apparatus.

A determination method comprising:
a step of extracting, from an input image, an image of an area containing elements that constitute characters as an area image;
a step of detecting, from the area image, a partial image showing the elements that constitute the characters; and
a step of determining the direction of the input image based on the detected partial image.

A program that causes a computer to realize:
a function of extracting, from an input image, an image of an area containing elements that constitute characters as an area image;
a function of detecting, from the area image, a partial image showing the elements that constitute the characters; and
a function of determining the direction of the input image based on the detected partial image.

A computer-readable recording medium on which the program according to claim 16 is recorded.