JP4177325B2

JP4177325B2 - Image processing apparatus, image processing program, and image processing method

Info

Publication number: JP4177325B2
Application number: JP2004376155A
Authority: JP
Inventors: 外志正土橋; 博之水谷; 直朗小平; 彰夫古畑
Original assignee: Toshiba Corp; Toshiba Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2004-12-27
Filing date: 2004-12-27
Publication date: 2008-11-05
Anticipated expiration: 2024-12-27
Also published as: JP2006184415A

Description

The present invention relates to an image processing apparatus, an image processing program, and an image processing method that process and display a document image containing geometric information such as characters.

Image input devices such as scanners and digital cameras, and image display devices such as computer displays, mobile terminal monitors, and electronic book viewers, are becoming increasingly diverse.
The size of the characters in a document image input from a device such as a scanner varies with the input resolution, even when the same text is input. In optical digital imaging devices such as digital cameras and cameras built into mobile terminals, neither the resolution nor the distance between the camera and the subject is fixed, and in most cases the subject is photographed so that it fills the imaging area.

Consequently, when document images are captured with such optical digital imaging devices, the character size and the positional relationship of markers differ from image to image.

When document images of various resolutions captured by such optical digital imaging devices are input to an image processing apparatus such as a computer or an internal processor and handled together at their original resolutions, the character sizes are not uniform, so the characters actually displayed on the screen are hard to read and hard to see. Improving readability and visibility requires manual correction, which takes time and effort.

For example, when character sizes are unified by manual operation, the image processing apparatus requires the enlargement or reduction of an image to be specified by an index unrelated to the character size in the document image, such as a ratio relative to the original image size or a resolution, and the scale must be specified for every image to be displayed.

In general, a document image read by a scanner or the like is input to and stored in an image processing apparatus at a relatively high resolution. When such a character image is displayed on a mobile terminal with a small screen, such as a PDA or a mobile phone, the characters are too large and much of the information cannot be displayed at once. In addition, images are sometimes stored at a higher resolution and a larger file size than necessary, even for document images whose content only needs to be checked.

As prior art of this kind, a technique has been proposed in which the user specifies a character size in advance and, when a document image is viewed, the image is processed and displayed at the specified character size (see, for example, Patent Document 1).
Patent Document 1: US Pat. No. 5,754,873

In the above prior art, the user must manually specify the display size of the characters in advance, so characters can be displayed only at the single preset size, and they may be hard to read depending on the language (Japanese, English, etc.) and the character type (font type, etc.).

The present invention has been made to solve this problem. Its object is to provide an image processing apparatus, an image processing program, and an image processing method with which character images obtained at different resolutions can be viewed at a size suitable for reading the characters, without the user having to specify a display size manually for each image.

An image processing apparatus according to the present invention comprises: display means having a screen with a predetermined display area; storage means for storing a business card image larger than the display area, image layout setting information indicating the arrangement of the components of the business card image, and a name dictionary; layout analysis means for obtaining arrangement information of the character strings contained in the business card image by performing layout analysis of the components contained in the business card image, based on the business card image and the image layout setting information read from the storage means; and image arrangement determining means for determining the position of a name in the business card image using the arrangement information of the character strings obtained by the layout analysis means and information obtained from the name dictionary, and outputting the business card image to the display means so that the beginning of the name whose position has been determined is placed at the left edge of the display area.

An image processing program according to the present invention causes an image processing apparatus to execute processing, the apparatus comprising display means having a screen with a predetermined display area, and storage means for storing a business card image larger than the display area, image layout setting information for business card display, and a name dictionary. The program causes the image processing apparatus to function as: layout analysis means for obtaining arrangement information of the character strings contained in the business card image by performing layout analysis of the components contained in the business card image, based on the business card image read from the storage means and the image layout setting information indicating the arrangement of the components of the business card image; and image arrangement determining means for determining the position of a name in the business card image using the arrangement information of the character strings obtained by the layout analysis means and information obtained from the name dictionary, and outputting the business card image to the display means so that the beginning of the name whose position has been determined is placed at the left edge of the display area.

An image processing method according to the present invention is performed in an image processing apparatus comprising: display means having a screen with a predetermined display area; storage means for storing a business card image larger than the display area, image layout setting information indicating the arrangement of the components of the business card image, and a name dictionary; layout analysis means; and image arrangement determining means. The method comprises: a step in which the layout analysis means extracts the components contained in the business card image, based on the business card image read from the storage means and the image layout setting information indicating the arrangement of the components of the business card image, and obtains arrangement information of the character strings contained in the business card image by performing layout analysis of the extracted components; and a step in which the image arrangement determining means determines the position of a name in the business card image using the arrangement information of the character strings and information obtained from the name dictionary, and outputs the business card image to the display means so that the beginning of the name whose position has been determined is placed at the left edge of the display area.

In the present invention, the components contained in a business card image are extracted based on the business card image read from the storage means and the image layout setting information indicating the arrangement of its components, and arrangement information of the character strings contained in the business card image is obtained by performing layout analysis of the extracted components. The position of a name in the business card image is then determined using the arrangement information of the character strings and information obtained from the name dictionary, and the business card image is output to the display means so that the beginning of the name whose position has been determined is placed at the left edge of the display area. As a result, the characters the user needs are displayed on the screen at an easy-to-see position.
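The placement described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the prefix match against the dictionary, the vertical centering, and every name in the code (`viewport_origin`, the sample strings, the sample dictionary) are assumptions made for illustration only. The text states only that the beginning of the name is placed at the left edge of the display area.

```python
def viewport_origin(strings, name_dictionary, display_size, image_size):
    """Return the top-left corner (x, y) of the display window in image coordinates.

    strings: list of (text, (left, top, right, bottom)) from layout analysis.
    name_dictionary: set of known names (hypothetical matching rule: prefix match).
    Returns None when no string matches the dictionary.
    """
    display_w, display_h = display_size
    image_w, image_h = image_size
    for text, (left, top, right, bottom) in strings:
        if any(text.startswith(name) for name in name_dictionary):
            # The beginning of the name goes to the left edge of the display area.
            x = left
            # Assumption: center the name vertically, clamped to the image.
            center_y = (top + bottom) // 2
            y = max(0, min(center_y - display_h // 2, image_h - display_h))
            return (x, y)
    return None


strings = [
    ("Example Corp.", (100, 50, 500, 90)),
    ("山田 太郎", (300, 400, 700, 480)),
]
origin = viewport_origin(strings, {"山田"}, (320, 240), (1280, 1024))
print(origin)
```

With these sample values the window starts at the x coordinate of the matched string, so the first character of the name sits at the left edge of the display area.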

As described above, according to the present invention, character images obtained at different resolutions can be viewed at a size suitable for reading the characters, without the user having to specify a display size manually for each image.

Embodiments of the present invention will now be described in detail with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to a first embodiment of the present invention, and FIG. 2 shows the language / character type / character size correspondence table stored in the image processing apparatus of FIG. 1.

As shown in FIG. 1, the image processing apparatus of this embodiment includes an image input unit 11, a storage unit 12, a layout analysis unit 13, a geometric information estimation unit 14, a language / character type estimation unit 15, an image processing unit 16, an image display unit 17, an image output unit 18, an operation input unit 19, and the like.

The image input unit 11 is image input means for inputting a document image as digital data. An image input device such as a scanner, a digital multifunction peripheral, a digital camera, or a camera built into a mobile terminal is used as the image input unit 11.

The storage unit 12 is image storage means in which the document image input from the image input unit 11 is saved (including in primary storage) and from which it is read out. A magnetic disk device, an optical disk device, various kinds of memory, network storage, a server, or the like is used as the storage unit 12.

As shown in FIG. 2, the storage unit 12 stores a language / character type / character size correspondence table 20, which associates an appropriate converted character size with each language and character type of the characters contained in the image to be converted.

The layout analysis unit 13 functions as image component extraction means that reads the document image stored in the storage unit 12 and extracts the components of the document image. The layout analysis unit 13 also analyzes the layout of the extracted components to obtain layout information (bounding rectangles, character strings, etc.) indicating how the components are arranged in the document image.
The geometric information estimation unit 14 is geometric information estimation means that uses the layout information extracted by the layout analysis unit 13 to estimate geometric information (for characters, for example, the width, height, and size) of the components of the document (characters, symbols, marks, markers, seal impressions, etc.).

The language / character type estimation unit 15 is language / character type estimation means that estimates the language and character type used in the document with reference to the language / character type / character size correspondence table 20 stored in the storage unit 12.

The image processing unit 16 processes the image read from the storage unit 12 using the language and character type estimated by the language / character type estimation unit 15, the geometric information estimated by the geometric information estimation unit 14, and the like, thereby generating an image to be displayed on a screen with a given display area, and outputs it to the image display unit 17. The image processing here is, for example, reduction or enlargement of the image so that the characters of the document image are shown at an easily readable size on the display screen.
In other words, the image processing unit 16 and the language / character type estimation unit 15 function as image processing means that refers to the language / character type / character size correspondence table 20 based on the character size estimated by the geometric information estimation unit 14, reduces or enlarges the image read from the storage unit 12 so that the size of the characters in the document image becomes the converted character size, and outputs the result to the image display unit 17.

The image display unit 17 is display means, such as an LCD, an organic EL display device, an SED, or an FED, that has a screen with a predetermined display area and displays the image generated by the image processing unit 16 on that screen. The image output unit 18 is image output means that stores the image generated by the image processing unit 16 in the storage unit 12 or the like in a file format such as an image file, or outputs it to an external printer. The operation input unit 19 is input receiving means that receives instruction operations from the user.

The operation of this image processing apparatus will be described with reference to FIGS. 3 to 7. FIG. 3 is a flowchart showing the operation of the image processing apparatus. FIGS. 4 and 5 show example images for the case where two document images are both in Japanese, and FIGS. 6 and 7 show example images for the case where one document image is in Japanese and the other is in the Latin alphabet.

In this image processing apparatus, a document image read from a form by the image input unit 11 is stored in the storage unit 12 (step 11 in FIG. 3; "step" is hereinafter abbreviated as S).

The layout analysis unit 13 reads the image data of the document image to be processed from the storage unit 12 and performs layout analysis (S12), thereby extracting layout information indicating how the components contained in the image data are arranged in the image.

In this layout analysis, layout information such as character bounding rectangles and character strings in the document is extracted. Character bounding rectangles are extracted by detecting connected components of black pixels in the image. Character strings are extracted by merging neighboring characters.
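The two extraction steps above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the apparatus's actual algorithm: the 8-connectivity, the breadth-first search, the merge rule (horizontal gap threshold plus vertical overlap), and the function names `character_rectangles` and `merge_into_strings` are all hypothetical choices.

```python
from collections import deque


def character_rectangles(bitmap):
    """Bounding boxes (left, top, right, bottom) of 8-connected black-pixel components.

    bitmap is a list of rows in which 1 is a black pixel and 0 is a white pixel.
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if bitmap[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and bitmap[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


def merge_into_strings(boxes, max_gap):
    """Merge boxes that are horizontally close and vertically overlapping (assumed rule)."""
    strings = []
    for left, top, right, bottom in sorted(boxes):
        for i, (l, t, r, b) in enumerate(strings):
            if left - r <= max_gap and top <= b and bottom >= t:
                strings[i] = (l, min(t, top), max(r, right), max(b, bottom))
                break
        else:
            strings.append((left, top, right, bottom))
    return strings


bitmap = [
    [1, 1, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
]
boxes = character_rectangles(bitmap)
strings = merge_into_strings(boxes, max_gap=2)
print(boxes)
print(strings)
```

In the sample bitmap the two components on the left of the top rows are merged into one string, while the isolated component at the right edge stays separate because its gap exceeds `max_gap`.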

Next, the geometric information estimation unit 14 estimates the geometric information of the components based on the layout information extracted by the layout analysis unit 13 (S13).
The geometric information in this example is, for instance, the size of the characters contained in the document. The character size of the document can be estimated, for example, by averaging the character sizes of the character strings in the document. Alternatively, the character size of the smallest character line, the average character size of the body text, or the most frequent character size may be used as the estimated character size of the document. The width and the height of the character size may each be estimated, or one of them may be used to represent both.
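The estimation options listed above (average, smallest, most frequent) can be sketched in Python as follows. The function name `estimate_character_size`, the `method` parameter, and the use of (width, height) tuples are assumptions for illustration; the text does not prescribe a particular data layout.

```python
from collections import Counter
from statistics import mean


def estimate_character_size(sizes, method="mean"):
    """Estimate a document's character size from per-character (width, height) pairs."""
    if not sizes:
        raise ValueError("no character sizes given")
    if method == "mean":
        # Average width and average height over all characters.
        return (mean(w for w, _ in sizes), mean(h for _, h in sizes))
    if method == "mode":
        # Most frequent (width, height) pair.
        return Counter(sizes).most_common(1)[0][0]
    if method == "min":
        # Size of the smallest characters, compared by height.
        return min(sizes, key=lambda size: size[1])
    raise ValueError(f"unknown method: {method}")


sizes = [(80, 80), (80, 80), (20, 20)]
print(estimate_character_size(sizes, "mean"))
print(estimate_character_size(sizes, "mode"))
print(estimate_character_size(sizes, "min"))
```

Which estimator suits a given document is a design choice: the mode is less sensitive to a few large heading characters than the mean, while the minimum is the conservative choice when even the smallest text must stay readable.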

The language / character type estimation unit 15 estimates the language and character type used in the document based on the geometric information estimated by the geometric information estimation unit 14 (S14). The language and character type are estimated using, for example, a character recognition function (hereinafter referred to as the OCR function). The OCR function converts a character image (image data) in the document image into a character code (text data) by matching it against the image data in a preset character recognition dictionary. OCR software that supports Japanese, for example, can read Japanese characters, alphabetic characters, and numerals in a document. The character recognition dictionary stores character image data and character codes (text data) in pairs. Because the character code range differs by character type (one range of codes is assigned to alphabetic characters, another to kana, another to kanji, and so on), the language and character type of the characters used in the document can be estimated by analyzing the character recognition result produced by the OCR function.
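The code-range analysis described above can be sketched as follows. The text does not say which character encoding is used, so this Python sketch assumes Unicode code points (hiragana and katakana at U+3040 to U+30FF, CJK unified ideographs at U+4E00 to U+9FFF). The majority rule and the names `character_type` and `estimate_language` are also assumptions for illustration.

```python
from collections import Counter


def character_type(ch):
    """Classify one recognized character by its code point (Unicode assumed)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x30FF:      # hiragana and katakana blocks
        return "kana"
    if 0x4E00 <= code <= 0x9FFF:      # CJK unified ideographs
        return "kanji"
    if ch.isascii() and ch.isalpha():  # basic Latin letters
        return "alphabet"
    return None                        # digits, punctuation, spaces, others


def estimate_language(ocr_text):
    """Guess the language of OCR output by counting character types (assumed rule)."""
    counts = Counter(t for t in map(character_type, ocr_text) if t)
    if not counts:
        return None
    japanese = counts["kana"] + counts["kanji"]
    return "Japanese" if japanese >= counts["alphabet"] else "English"


print(estimate_language("画像処理装置です"))
print(estimate_language("Image processing 2004"))
```

Digits and punctuation are ignored because they appear in documents of either language and therefore carry no information about the language.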

Next, based on the language and character type estimated by the language / character type estimation unit 15, the image processing unit 16 refers to the language / character type / character size correspondence table 20 and determines the enlargement or reduction ratio for the image (S15).

For example, let W and H be the width and height of the input image, let Ew and Eh be the width and height of the estimated character size, and let Tw and Th be the width and height of the converted character size, which are set in the storage unit 12 in advance.

In this case, the image processing unit 16 sets the scaling ratio of the image to Tw/Ew in the width direction and Th/Eh in the height direction, and creates the display image by converting the source image into an image of width Tw × W / Ew and height Th × H / Eh (S16).
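The ratio calculation above can be written as a short Python sketch. The function name `scaled_image_size` and the rounding of the result to whole pixels are assumptions; the symbols W, H, Ew, Eh, Tw, and Th follow the text.

```python
from fractions import Fraction


def scaled_image_size(image_size, estimated_char, target_char):
    """Return (ratio_w, ratio_h, (new_width, new_height)) for the conversion."""
    w, h = image_size          # W, H
    ew, eh = estimated_char    # Ew, Eh
    tw, th = target_char       # Tw, Th
    ratio_w = Fraction(tw, ew)  # Tw / Ew
    ratio_h = Fraction(th, eh)  # Th / Eh
    # Rounding to whole pixels is an assumption of this sketch.
    return ratio_w, ratio_h, (round(w * ratio_w), round(h * ratio_h))


result = scaled_image_size((1280, 1024), (80, 80), (20, 20))
print(result)
```

For a 1280 × 1024 image whose characters are estimated at 80 × 80 pixels and a target of 20 × 20 pixels, both ratios are 1/4 and the converted image is 320 × 256 pixels.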

The operation of the image processing apparatus will now be described concretely using the example images of FIGS. 4 to 7. The example of FIG. 4 shows source document images A and B, each 1280 pixels wide × 1024 pixels high (SXGA), which contain Japanese characters of different sizes. Assume that the character size of image A is estimated to be 80 × 80 pixels and that of image B to be 160 × 160 pixels.

The character size of the output image (target character size) is set in the language / character type / character size correspondence table 20 to 20 pixels (20 pixels high × 20 pixels wide) for Japanese kanji and kana.
In this case, the image processing unit 16 sets the scaling ratio of image A to Tw/Ew = 20/80 = 1/4 in the width direction and likewise 1/4 in the height direction, and sets the scaling ratio of image B to Tw/Ew = 20/160 = 1/8 in the width direction and likewise 1/8 in the height direction.

The image processing unit 16 reduces (processes) the document images based on the ratios (reduction ratios) determined in this way.
As a result, as shown in FIG. 5, image A is converted into image A1 of 320 × 256 pixels and image B is converted into image B1 of 160 × 128 pixels, so the character sizes in the two resulting images A1 and B1 are almost identical.
The image display unit 17 displays the images processed (rescaled) by the image processing unit 16, and the image output unit 18 outputs them as files in various image or document file formats (S17).
The example shown in FIG. 6 consists of source document images A and B, each 1280 pixels wide × 1024 pixels high (SXGA). Image A contains Japanese characters and image B contains alphabetic (English) characters. The character size in both images A and B is estimated to be 80 × 80 pixels.

For the character size of the output image (target character size), the optimum converted character size for each language is set in the language / character type / character size correspondence table 20.
For example, the size is set to 20 pixels (20 pixels high × 20 pixels wide) for Japanese kanji and kana, and to 16 pixels (16 pixels high × 16 pixels wide) for the English alphabet.
In this case, the image processing unit 16 sets the scaling ratio of image A to Tw/Ew = 20/80 = 1/4 in the width direction and likewise 1/4 in the height direction, and sets the scaling ratio of image B to Tw/Ew = 16/80 = 1/5 in the width direction and likewise 1/5 in the height direction.
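The table lookup and the per-language ratios in this example can be sketched in Python as follows. The dictionary layout, its keys, and the function name `converted_size` are hypothetical; only the target sizes (20 pixels for Japanese kanji and kana, 16 pixels for the English alphabet) and the image dimensions come from the text.

```python
# Hypothetical in-memory form of the language / character type / character size table.
TARGET_CHARACTER_SIZE = {
    ("Japanese", "kanji/kana"): (20, 20),
    ("English", "alphabet"): (16, 16),
}


def converted_size(image_size, estimated_char, language, char_type):
    """Look up the target character size and return the converted image size."""
    tw, th = TARGET_CHARACTER_SIZE[(language, char_type)]
    w, h = image_size
    ew, eh = estimated_char
    # Rounding to whole pixels is an assumption of this sketch.
    return (round(w * tw / ew), round(h * th / eh))


size_a = converted_size((1280, 1024), (80, 80), "Japanese", "kanji/kana")
size_b = converted_size((1280, 1024), (80, 80), "English", "alphabet")
print(size_a, size_b)
```

Both source images have the same estimated character size, yet the English image is reduced further because its target character size is smaller.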

The image processing unit 16 reduces (processes) the document images based on the ratios (reduction ratios) determined in this way.
As a result, as shown in FIG. 7, image A is converted into image A1 of 320 × 256 pixels and image B is converted into image B1 of 256 × 205 pixels, and the characters in the two resulting images A1 and B1 appear at sizes that are easy to read and look natural.

In this example, the character size T is set small for English documents, which consist of alphabetic characters that remain legible at relatively small sizes, and large for Japanese documents, which contain kanji with many strokes. This makes it possible to balance the readability of the document, the file size, and the amount of information that can be shown on the display.

As other variations of the enlargement and reduction processing, the input image area may be enlarged or reduced so that the height of the converted image equals the height of the display area, or, when the aspect ratio of the display area (Wv/Hv) is larger than the aspect ratio of the input image area (Wi/Hi), the input image area may be enlarged or reduced so that the height of the converted image equals the height of the display area.
As a further variation, when the aspect ratio of the display area (Wv/Hv) is smaller than the aspect ratio of the input image area (Wi/Hi), the input image area may be enlarged or reduced so that the height of the converted image equals the height of the display area, and when the aspect ratio of the display area (Wv/Hv) is larger than the aspect ratio of the input image area (Wi/Hi), the input image area may be enlarged or reduced so that the width of the converted image equals the width of the display area.
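One of these variations can be sketched in Python as follows. The sketch implements only the rule that the heights are matched when Wv/Hv is larger than Wi/Hi. Matching the widths in the remaining case is an assumption of this sketch, chosen so that the whole image fits in the display area, and it differs from the further variation described in the text. The function name `fit_scale` is hypothetical.

```python
from fractions import Fraction


def fit_scale(display_size, image_size):
    """Return the scaling ratio that fits the input image area to the display area."""
    wv, hv = display_size   # Wv, Hv
    wi, hi = image_size     # Wi, Hi
    # Wv/Hv > Wi/Hi, compared by cross-multiplication to stay in integers.
    if wv * hi > wi * hv:
        return Fraction(hv, hi)   # converted height equals the display height
    return Fraction(wv, wi)       # assumption: converted width equals the display width


landscape = fit_scale((320, 240), (1280, 1024))
portrait = fit_scale((240, 320), (1280, 1024))
print(landscape, portrait)
```

For a 320 × 240 display the 1280 × 1024 image is scaled by 15/64 to 300 × 240 pixels, and for a 240 × 320 display it is scaled by 3/16 to 240 × 192 pixels.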

The image display unit 17 displays the image processed (rescaled) by the image processing unit 16, and the image output unit 18 outputs it as a file in various image or document file formats.

Thus, according to the image processing apparatus of the first embodiment, layout analysis, geometric information estimation, and language / character type estimation are performed in that order on a document image to estimate the language and character type of the characters in the document and geometric information such as the character size. Based on the estimated geometric information, the image processing unit 16 refers to the language / character type / character size correspondence table 20 and reduces or enlarges the document image so that, for each language, the characters appear on the display screen at an easily readable size. The appropriate image is then displayed on the screen or output to a file. The document image can therefore be displayed with easily readable characters without the user having to specify a display size manually for each image, and when the document image is reduced, wasteful use of storage and memory is suppressed.

In other words, the character sizes in document images can be unified without changing the display size manually, which improves the readability and visibility of the document images on the display screen. In addition, reducing an image using character size information that is necessary and sufficient for the intended use suppresses wasteful use of storage and memory.

The system can automatically estimate, for each language, a character size suitable for the user to read, and can reduce or enlarge the character image for display on the screen. Because the apparatus includes the language/character type/character size correspondence table 20, in which the character size can be specified (edited) for each language, the settings can be changed freely.
The appropriate character size for each language and character type may be a value fixed by the system, may be set automatically by the system from the size and resolution of the display screen, or may be set by the user from the operation input unit 19.
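The table lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the table keys, the pixel values, the fallback size, and the function names `target_char_size` and `scale_for` are all assumptions made for the example.

```python
# Hypothetical stand-in for the language / character type / character size
# correspondence table 20. Keys and pixel values are illustrative only.
DEFAULT_TABLE = {
    ("japanese", "kanji"): 16,
    ("japanese", "kana"): 14,
    ("english", "alphabet"): 12,
}


def target_char_size(table, language, char_type, fallback=14):
    """Return the preferred on-screen character size in pixels."""
    return table.get((language, char_type), fallback)


def scale_for(table, language, char_type, estimated_size):
    """Return the factor that brings the estimated size to the target size."""
    if estimated_size <= 0:
        raise ValueError("estimated_size must be positive")
    return target_char_size(table, language, char_type) / estimated_size


# A user edit of the table, of the kind the operation input unit 19 allows.
user_table = dict(DEFAULT_TABLE)
user_table[("english", "alphabet")] = 10
```

Keeping the table as plain data separate from the scaling function mirrors the text's point that the sizes can be edited without changing the processing.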

Next, an image processing apparatus according to a second embodiment of the present invention will be described with reference to FIGS. 8 to 15. FIG. 8 is a block diagram showing the configuration of the image processing apparatus according to the second embodiment.

As shown in FIG. 8, the image processing apparatus of this embodiment includes an image input unit 11, a storage unit 12, a layout analysis unit 13, a geometric information estimation unit 14, an enlargement/reduction magnification determination unit 31, an image arrangement determination unit 32, an image processing unit 16, an image display unit 17, an image output unit 18, an operation input unit 19, and the like. Components that are the same as in the first embodiment are given the same reference numerals, and their description is omitted.

The enlargement/reduction magnification determination unit 31 estimates the size of the characters, which are one of the components of the image, based on the layout information obtained by the layout analysis unit 13. It functions as image enlargement/reduction magnification determination means that determines the enlargement or reduction magnification at which the characters displayed on the screen have an appropriate size. The enlargement/reduction magnification determination unit 31 may also be given an OCR function as in the first embodiment, so that it recognizes characters based on the geometric information estimated by the geometric information estimation unit 14, obtains character codes (text data), and then determines the enlargement/reduction magnification according to the language and character type used in the document contained in the image.

The image arrangement determination unit 32 is image processing means that determines where on the screen the characters of the image generated by the image processing unit 16 are to be placed, and executes the image arrangement process. It uses the enlargement/reduction magnification for the language and character type determined by the enlargement/reduction magnification determination unit 31, the geometric information estimated by the geometric information estimation unit 14, and preset form type information for the image (the form type, such as a document or a business card, and information specifying the arrangement of character strings in that form). The image arrangement process determines, for example, where on the screen and at what size the position at which the text begins is displayed, so that the characters of a document image appear at an easily viewed position. In short, the image arrangement determination unit 32 determines how the characters of the generated image are arranged in the display area of the screen.

The image display unit 17 is display means that displays the image generated by the image processing unit 16 on the screen. The image output unit 18 is image output means that stores the image generated by the image processing unit 16 in the storage unit 12 or the like in a file format such as an image file, or outputs it to an external printer.

The operation of this image processing apparatus will now be described with reference to FIGS. 9 to 15. FIG. 9 is a flowchart showing the operation of the image processing apparatus, and FIG. 10 is a flowchart showing the process of determining the enlargement/reduction magnification of an image.

In this image processing apparatus, a document image read from a form by the image input unit 11 is stored in the storage unit 12 (S21 in FIG. 9).

The layout analysis unit 13 reads the image data of the document image to be processed from the storage unit 12 and executes layout analysis (S22), thereby extracting layout information.

The geometric information estimation unit 14 then estimates the geometric information of the components based on the layout information extracted by the layout analysis unit 13 (S23).
In this example, the geometric information is the size of the characters contained in the document. The character size of the document can be estimated, for example, by averaging the character sizes of the character strings in the document. Alternatively, the character size of the smallest text line, the average character size of the body text, or the most frequent character size may be used as the estimate. Width and height may each be estimated, or either one may be used to represent the character size.
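The estimation options listed above can be sketched as follows. This is a minimal illustration that assumes layout analysis has already produced one character height per text line. The function name `estimate_char_size` and the `method` argument are hypothetical. To get the body-text average, the caller would pass only the lines classified as body text.

```python
from statistics import mean, multimode


def estimate_char_size(line_heights, method="mean"):
    """Estimate one representative character size (pixels) for a document.

    line_heights: character heights of the text lines found by layout analysis.
    method: "mean" (average), "min" (smallest line), or "mode" (most frequent).
    """
    if not line_heights:
        raise ValueError("no text lines to estimate from")
    if method == "mean":
        return mean(line_heights)
    if method == "min":
        return min(line_heights)
    if method == "mode":
        # multimode returns every most-frequent value; take the smallest
        # so that the result is deterministic when there is a tie.
        return min(multimode(line_heights))
    raise ValueError(f"unknown method: {method}")
```

The mean is sensitive to large headings, whereas the mode or the body-text average tends to track the size the reader spends most time on. Which is preferable depends on the kind of document.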

Next, the enlargement/reduction magnification determination unit 31 refers to the language/character type/character size correspondence table 20 based on the geometric information (sizes of symbols and characters, etc.) estimated by the geometric information estimation unit 14, and determines an enlargement or reduction ratio for the image as a whole (S24).

As an example, consider enlarging or reducing the input image, or a partial area of the input image that is to be displayed (hereinafter, the input image area), so that the width of the converted image equals the width of the display area while the aspect ratio of the input image area is preserved.

Let Wi and Hi be the width and height of the input image area, Wv and Hv the width and height of the display area, E the estimated character size in the input image area, Tmax the maximum allowable character size in the converted image, Tmin the minimum allowable character size in the converted image (Tmin ≤ Tmax), Tfit the character size in the converted image when the converted image is inscribed in the display area, T the character size in the converted image, and R the determined enlargement/reduction ratio. The enlargement/reduction ratio at which the width of the processed image equals the width of the display area is Wv/Wi, and Tfit is then E × Wv/Wi.

The enlargement/reduction ratio determination process of S24 will now be described in detail with reference to FIG. 10.
The enlargement/reduction magnification determination unit 31 determines the enlargement/reduction ratio R by executing the process of the flowchart in FIG. 10.
The enlargement/reduction magnification determination unit 31 first calculates Tfit (Tfit = E × Wv/Wi) (S31).

The enlargement/reduction magnification determination unit 31 compares Tfit with Tmax (S32).
If Tfit is larger than Tmax (Y in S32), the enlargement/reduction magnification determination unit 31 uses Tmax as the value of T and sets R to Tmax/E (S33).
If Tfit is not larger than Tmax (N in S32) and not smaller than Tmin (N in S34), the enlargement/reduction magnification determination unit 31 sets the enlargement/reduction ratio R to Wv/Wi (S35).

If, on the other hand, Tfit is smaller than Tmin (Y in S34), the enlargement/reduction magnification determination unit 31 uses Tmin as the value of T and sets R to Tmin/E (S36).
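Steps S31 to S36 can be sketched as a single function. The symbols E, Wi, Wv, Tmin, Tmax, Tfit, and R follow the definitions in the text. The function name `decide_scale` and the input checks are assumptions added for the example.

```python
def decide_scale(E, Wi, Wv, Tmin, Tmax):
    """Return (Tfit, R) following steps S31 to S36.

    E: estimated character size in the input image area (pixels)
    Wi, Wv: widths of the input image area and of the display area
    Tmin, Tmax: allowable character size range in the converted image
    """
    if E <= 0 or Wi <= 0:
        raise ValueError("E and Wi must be positive")
    if Tmin > Tmax:
        raise ValueError("Tmin must not exceed Tmax")

    Tfit = E * Wv / Wi       # S31: character size if the width is fitted
    if Tfit > Tmax:          # S32: Y, characters would be too large
        R = Tmax / E         # S33: clamp the character size to Tmax
    elif Tfit < Tmin:        # S34: Y, characters would be too small
        R = Tmin / E         # S36: clamp the character size to Tmin
    else:                    # S34: N, fitted size is acceptable
        R = Wv / Wi          # S35: fit the image width to the display
    return Tfit, R
```

The function is equivalent to clamping the fitted character size to the range from Tmin to Tmax and dividing the result by E, which is why the fitted width is abandoned only when readability would suffer.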

The determination of the enlargement/reduction magnification will now be described using concrete example images.
FIG. 11 shows an example in which the input images to be displayed are three images A, B, and C of 640 × 480 pixels, 640 × 480 pixels, and 1280 × 1024 pixels respectively, so that at least one has a different resolution. The sizes of the alphabetic characters also differ among images A, B, and C.

The source document images, that is, the input images, are thus image A of 640 pixels wide × 480 pixels high (VGA), image B of 640 pixels wide × 480 pixels high (VGA), and image C of 1280 pixels wide × 1024 pixels high (SXGA). The geometric information estimation unit 14 is assumed to have estimated the sizes of the characters in these images as 80 × 80 pixels, 40 × 40 pixels, and 40 × 40 pixels respectively.
Tmax and Tmin are 32 and 12 pixels respectively, and the resolution of the display device is 320 pixels wide by 240 pixels high.

FIG. 12 shows the relationship between the character size Tfit of the converted image and the enlargement/reduction ratio R when the images A, B, and C to be converted are inscribed in the display area of the image display unit 17.
(Image A)
Image A is 640 pixels wide by 480 pixels high, and the character size in the image is estimated to be 80 pixels. The character size of the converted image is therefore Tfit = E × Wv/Wi = 80 × 320/640 = 40 pixels.
Since Tfit > Tmax, the enlargement/reduction ratio is R = Tmax/E = 32/80 = 0.4.

(Image B)
Image B is 640 pixels wide by 480 pixels high, and the character size in the image is estimated to be 40 pixels. The character size of the converted image is therefore Tfit = E × Wv/Wi = 40 × 320/640 = 20 pixels.
Since Tmin ≤ Tfit ≤ Tmax, the enlargement/reduction ratio is R = Wv/Wi = 320/640 = 0.5.

(Image C)
Image C is 1280 pixels wide by 1024 pixels high, and the character size in the image is estimated to be 40 pixels. The character size of the converted image is therefore Tfit = E × Wv/Wi = 40 × 320/1280 = 10 pixels.
Since Tmin > Tfit, the enlargement/reduction ratio is R = Tmin/E = 12/40 = 0.3.

Next, the image processing unit 16 enlarges or reduces each image at the magnification determined by the enlargement/reduction magnification determination unit 31 (S25).
As a result of this image processing, image A, originally 640 × 480 pixels, is converted into image A1 with a width of 256 (640 × 0.4) and a height of 192 (480 × 0.4).
Image B, originally 640 × 480 pixels, is converted into image B1 with a width of 320 (640 × 0.5) and a height of 240 (480 × 0.5).
Image C, originally 1280 × 1024 pixels, is converted into image C1 with a width of 384 (1280 × 0.3) and a height of 308 (1024 × 0.3 = 307.2, rounded up).
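The converted sizes can be reproduced with a short calculation. The text gives a height of 308 for 1024 × 0.3 = 307.2, so this sketch assumes that fractional pixels are rounded up. The source does not state the rounding rule. The function name `scaled_size` is hypothetical.

```python
from fractions import Fraction
from math import ceil


def scaled_size(width, height, ratio):
    """Scale (width, height) by ratio, rounding fractional pixels up.

    ratio is given as a decimal string such as "0.3" so that it converts
    to an exact fraction. A binary float such as 0.4 is stored slightly
    above its nominal value, which could push an exact product like
    640 * 0.4 up to the next pixel when rounding up.
    """
    r = Fraction(ratio)
    return ceil(width * r), ceil(height * r)


# Sizes of the converted images A1, B1, and C1 from the worked example.
sizes = {
    "A1": scaled_size(640, 480, "0.4"),
    "B1": scaled_size(640, 480, "0.5"),
    "C1": scaled_size(1280, 1024, "0.3"),
}
```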

Next, the image arrangement determination unit 32 determines how the image created by the image processing unit 16 is arranged in the display area (S26).
When Tfit is larger than Tmax in the enlargement/reduction ratio determination process of FIG. 10, the image is smaller than the display area, so the image arrangement determination unit 32 places the image so that the center of the image area coincides with the center of the display area, as with image A1 in FIG. 13.
When Tfit is between Tmin and Tmax, the image size equals that of the display area, so the image arrangement determination unit 32 places the image so that its upper-left corner falls on the upper-left corner of the display area, as with image B1 in FIG. 13.
When Tfit is smaller than Tmin, the image is larger than the display area, so simply placing the center of the image area at the center of the display area, as with image C1 in FIG. 13, can leave part of the information in the image (part of the image area) undisplayed.
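The three placement cases can be sketched as follows. The function name `place_image` and the returned scroll flag are assumptions. For an image larger than the display area, the sketch simply aligns the upper-left corners as a placeholder. The source refines that initial position using layout information.

```python
def place_image(img_w, img_h, disp_w, disp_h):
    """Return (x, y, needs_scroll) for an image in a display area.

    (x, y) is where the image's upper-left corner goes, in display
    coordinates. needs_scroll is True when part of the image is hidden.
    """
    fits = img_w <= disp_w and img_h <= disp_h
    if fits and (img_w < disp_w or img_h < disp_h):
        # Smaller than the display area: center it (image A1).
        return (disp_w - img_w) // 2, (disp_h - img_h) // 2, False
    # Same size (image B1) or larger (image C1): align the upper-left
    # corners. A larger image additionally needs scrolling.
    return 0, 0, not fits
```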

Therefore, when the image is larger than the display area, the image arrangement determination unit 32 scrolls the image in response to instructions from an input device such as keys, buttons, a mouse, or a touch panel, thereby displaying the image areas that extended beyond the screen and were hidden. The user can thus view the entire image area even on a small screen.
In this case, the image arrangement determination unit 32 uses the result of layout analysis and knowledge about the type of the input document image to determine an initial arrangement of the image that displays as much of the document content as possible.

For a document image, for example, the positions of the characters in the document are known from the layout analysis by the layout analysis unit 13. The image arrangement determination unit 32 uses this position information to determine the arrangement so that the beginning of the document (the position marked with a star) falls exactly at the upper-left corner of the display area of the screen, as shown in FIG. 14.
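A minimal sketch of this initial arrangement follows. It assumes horizontal, left-to-right text and text-line bounding boxes expressed in the coordinates of the scaled image. The function name `initial_offset` is hypothetical. Clamping the offset so that the viewport does not run past the image edge is a design choice of this sketch, not something the source specifies.

```python
def initial_offset(text_boxes, img_w, img_h, disp_w, disp_h):
    """Return (ox, oy), the image coordinates shown at the display's
    upper-left corner when the image is first displayed.

    text_boxes: (x, y, w, h) bounding boxes of text lines from layout
    analysis, in scaled image coordinates.
    """
    if not text_boxes:
        return 0, 0
    # Start of the document: the topmost line, leftmost among ties.
    x, y, _, _ = min(text_boxes, key=lambda box: (box[1], box[0]))
    # Keep the viewport inside the image where possible.
    ox = max(0, min(x, img_w - disp_w))
    oy = max(0, min(y, img_h - disp_h))
    return ox, oy
```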

When the form type of the input image is known in advance, for example when it is a business card image, image arrangement setting information for business card display is stored in the storage unit 12. The image arrangement determination unit 32 then places the beginning of the name (the position marked with a star) at the left edge of the display area, as shown in FIG. 15, so that the name on the business card is as easy to see as possible.
The image arrangement determination unit 32 estimates the position of the name character string using the character string arrangement information obtained from the layout analysis by the layout analysis unit 13 and information obtained from a name dictionary or the like.
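One way to combine line positions with a name dictionary is sketched below. It rests on several assumptions that the source does not state: recognized text is available for each line, the name occupies its own line, and the name line is centered vertically in the display. The function name `name_start_offset` and the sample data in the usage note are hypothetical.

```python
def name_start_offset(lines, name_dictionary, disp_h):
    """Return (ox, oy), the image coordinates to show at the display's
    upper-left corner so that the name starts at the left edge.

    lines: (text, x, y, w, h) for each text line, in scaled image
    coordinates. name_dictionary: iterable of known name strings.
    """
    for text, x, y, w, h in lines:
        if any(entry in text for entry in name_dictionary):
            ox = x  # the start of the name goes to the left edge
            # Assumption: center the name line vertically in the display.
            oy = max(0, y + h // 2 - disp_h // 2)
            return ox, oy
    return 0, 0  # no name found: fall back to the upper-left corner
```

For example, with the lines `("Example Corp.", 20, 10, 200, 20)` and `("Taro Yamada", 150, 200, 180, 40)`, a dictionary containing `"Yamada"`, and a display height of 240, the offset is `(150, 100)`.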

The image display unit 17 then displays the image processed (magnification-converted) by the image processing unit 16, and the image output unit 18 outputs it as a file in any of various image or document file formats (S27).

As described above, the image processing apparatus of the second embodiment displays document images and business card images with characters adjusted to a size that is easy for the user to read, while making maximum use of the screen that serves as the display area. The user can therefore read the text of documents and business cards on the screen without manually adjusting the image size.

In addition, reducing the image to the character size that is necessary and sufficient for the intended use avoids unnecessary use of storage and memory.
Furthermore, when the image is larger than the display area, optimizing the initial arrangement using the layout analysis result keeps the user's input operations to a minimum.

The present invention is not limited to the embodiments described above.
One method of setting the character size automatically is, for example, to change the character size to one suited to display according to the language and character type.
Another is to estimate an appropriate character size automatically from the characteristics of the display (mainly its resolution and size); for example, the optimal size differs between the screen of a mobile terminal and a PC display.
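As an illustration of the display-based approach, the sketch below converts a physical character height into pixels from the display's pixel density. The source does not give a formula. The 3 mm target height, the 8-pixel floor, and the function name `char_size_from_display` are assumptions chosen for the example.

```python
def char_size_from_display(dpi, target_height_mm=3.0, min_px=8):
    """Return a character height in pixels for a display of the given dpi.

    target_height_mm is the desired physical character height, and
    min_px is a floor below which glyphs are assumed to be illegible.
    Both defaults are illustrative values.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    pixels = round(target_height_mm / 25.4 * dpi)  # 25.4 mm per inch
    return max(min_px, pixels)
```

Because a dense mobile screen packs more pixels into the same physical height than a PC monitor, the same 3 mm target yields a larger pixel size on the mobile screen.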

FIG. 1 is a diagram showing the configuration of the image processing apparatus according to the first embodiment of the present invention.
FIG. 2 is a diagram showing the language/character type/character size correspondence table of the image processing apparatus of FIG. 1.
FIG. 3 is a flowchart showing the processing procedure in the image processing apparatus of the first embodiment.
FIG. 4 is a diagram showing the source images in the case where two document images both contain Japanese characters.
FIG. 5 is a diagram showing the document images obtained by image processing each of the source images of FIG. 4.
FIG. 6 is a diagram showing the source images in the case where one document image contains Japanese characters and the other contains alphabetic characters.
FIG. 7 is a diagram showing the document images obtained by image processing each of the two source images of FIG. 6.
FIG. 8 is a diagram showing the configuration of the image processing apparatus according to the second embodiment of the present invention.
FIG. 9 is a flowchart showing the processing procedure of the image processing apparatus in the second embodiment.
FIG. 10 is a flowchart for obtaining the enlargement/reduction ratio of an image.
FIG. 11 is a diagram for explaining the process of determining the enlargement/reduction ratio of an image.
FIG. 12 is a diagram for explaining the calculation of Tfit and R, which determine the enlargement/reduction ratio of an image.
FIG. 13 is a diagram showing the results of the enlargement/reduction process for images A, B, and C.
FIG. 14 is a diagram showing an example of the initial image arrangement when the generated image is larger than the display area.
FIG. 15 is a diagram showing an example of displaying the generated image on the screen when the input image is known in advance to be a business card image.

Explanation of symbols

11 ... image input unit, 12 ... storage unit, 13 ... layout analysis unit, 14 ... geometric information estimation unit, 15 ... language/character type estimation unit, 16 ... image processing unit, 17 ... image display unit, 18 ... image output unit, 19 ... operation input unit, 20 ... language/character type/character size correspondence table, 31 ... enlargement/reduction magnification determination unit, 32 ... image arrangement determination unit

Claims

An image processing apparatus comprising:
display means comprising a screen having a predetermined display area;
storage means for storing a business card image larger in size than the display area, image layout setting information indicating the layout of the components of the business card image, and a name dictionary;
layout analysis means for performing layout analysis of the components contained in the business card image, based on the business card image and the image layout setting information read from the storage means, to obtain arrangement information of the character strings contained in the business card image; and
image arrangement determining means for determining the position of a name in the business card image using the arrangement information of the character strings obtained by the layout analysis means and information obtained from the name dictionary, and for outputting the business card image to the display means so that the beginning of the name at the determined position is placed at the left edge of the display area.

An image processing program to be executed in an image processing apparatus that includes display means comprising a screen having a predetermined display area, and storage means for storing a business card image larger in size than the display area, image layout setting information for business card display, and a name dictionary, the program causing the image processing apparatus to function as:
layout analysis means for performing layout analysis of the components contained in the business card image, based on the business card image read from the storage means and the image layout setting information indicating the layout of the components of the business card image, to obtain arrangement information of the character strings contained in the business card image; and
image arrangement determining means for determining the position of a name in the business card image using the arrangement information of the character strings obtained by the layout analysis means and information obtained from the name dictionary, and for outputting the business card image to the display means so that the beginning of the name at the determined position is placed at the left edge of the display area.

An image processing method in an image processing apparatus that includes display means comprising a screen having a predetermined display area, storage means for storing a business card image larger in size than the display area, image layout setting information indicating the layout of the components of the business card image, and a name dictionary, layout analysis means, and image arrangement determining means, the method comprising:
a step in which the layout analysis means extracts the components contained in the business card image from the business card image read from the storage means and the image layout setting information indicating the layout of the components of the business card image, and performs layout analysis of the extracted components to obtain arrangement information of the character strings contained in the business card image; and
a step in which the image arrangement determining means determines the position of a name in the business card image using the arrangement information of the character strings and information obtained from the name dictionary, and outputs the business card image to the display means so that the beginning of the name at the determined position is placed at the left edge of the display area.