JP2023003648A

JP2023003648A - Information processing device and program

Info

Publication number: JP2023003648A
Application number: JP2021104849A
Authority: JP
Inventors: 茂樹小澤; Shigeki Ozawa
Original assignee: Laurel Bank Machine Co Ltd; Laurel Precision Machines Co Ltd; Laurel Machinery Co Ltd
Current assignee: Laurel Bank Machine Co Ltd; Laurel Precision Machines Co Ltd; Laurel Machinery Co Ltd
Priority date: 2021-06-24
Filing date: 2021-06-24
Publication date: 2023-01-17

Abstract

To increase character recognition accuracy for a document including numeric characters. SOLUTION: A character recognition device 10, which is an example of an information processing device, includes a numeric character recognition unit 126 that recognizes numeric characters included in an image to be processed, and a symbol recognition unit 127 that recognizes symbols included in the image to be processed. The symbol recognition unit 127 recognizes partial images C1, C4, and C8 corresponding to symbols in a paragraph image P4 showing numeric characters and symbols other than numeric characters. The numeric character recognition unit 126 recognizes the numeric characters included in a paragraph image P4A obtained by deleting the partial images C1, C4, and C8 corresponding to the symbols from the paragraph image P4. SELECTED DRAWING: Figure 9

Description

The present invention relates to an information processing device and a program.

Information processing devices such as character recognition devices generally employ optical character recognition (OCR) technology. OCR technology captures characters written on a document such as a form as an image by optical means such as a camera or an image scanner, and converts the characters in the captured image into character information (for example, character codes) that a computer or the like can use. For example, Patent Document 1 below discloses a technique for generating, from an image to be processed that includes characters and ruled lines, an image from which ruled lines running in a set length direction have been deleted. Removing ruled lines, which act as noise in character recognition, from the image to be processed is expected to improve the recognition accuracy of the characters included in that image.

Patent Document 1: JP 2017-142628 A

When an information processing device performs character recognition on a document containing numbers, symbols that accompany numbers, such as a comma used as a digit separator or a dot indicating the decimal point, may be misrecognized as numbers or as parts of numbers, which lowers the accuracy of number recognition. The conventional technology described above can remove ruled lines in the length direction, but it cannot prevent symbols that accompany numbers from being misrecognized as numbers, and therefore cannot suppress the resulting drop in number recognition accuracy.

An information processing device according to a preferred aspect of the present invention includes: a generation unit that generates a second image by erasing, from a first image representing numbers and symbols other than numbers, a first portion corresponding to the symbols; and a number recognition unit that recognizes the numbers included in the second image generated by the generation unit.

A program according to a preferred aspect of the present invention causes a processor to function as: a generation unit that generates a second image by erasing, from a first image representing numbers and symbols other than numbers, a first portion corresponding to the symbols; and a number recognition unit that recognizes the numbers included in the second image generated by the generation unit.
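The erase-then-recognize arrangement described above can be illustrated with a minimal Python sketch. It assumes a binary image stored as nested lists and assumes the symbol regions are already known as bounding boxes. The names `erase_regions`, `Image`, and `Box` are hypothetical and do not come from the specification, and the number recognition step itself is not shown.

```python
from typing import List, Tuple

# Assumed representation: 0 = background, 1 = notation (ink).
Image = List[List[int]]
# Assumed box format: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def erase_regions(first_image: Image, symbol_boxes: List[Box],
                  background: int = 0) -> Image:
    """Return a copy of first_image with each symbol box filled with background.

    This plays the role of the generation unit: the result is the
    "second image" that a number recognizer would then read.
    """
    second_image = [row[:] for row in first_image]  # leave the input untouched
    height = len(second_image)
    width = len(second_image[0]) if height else 0
    for top, left, bottom, right in symbol_boxes:
        # Clamp each box to the image bounds before erasing.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                second_image[y][x] = background
    return second_image
```

Copying the image rather than erasing in place keeps the first image available, which would matter if the symbol positions are needed again later, for example to reinsert a comma or a decimal point into the recognized number.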

According to the present invention, the character recognition accuracy for a document containing numbers can be improved.

FIG. 1 is a diagram showing the configuration of a character recognition system according to an embodiment.
FIG. 2 is a diagram showing an example of a receipt image.
FIG. 3 is a diagram showing an example of recognition result data.
FIG. 4 is a block diagram showing an example of the hardware configuration of a terminal device.
FIG. 5 is a block diagram showing an example of the functional configuration of the terminal device.
FIG. 6 is a block diagram showing an example of the hardware configuration of a character recognition device.
FIG. 7 is a block diagram showing an example of the functional configuration of the character recognition device.
FIG. 8 is a diagram showing an example of a receipt image after layout analysis.
FIG. 9 is a diagram explaining the character recognition method of a second character recognition unit.
FIG. 10 is a flowchart showing the procedure of character recognition processing by the character recognition device.
FIG. 11 is a flowchart showing the procedure of second character recognition processing by the second character recognition unit.
FIG. 12 is a diagram showing an example of recognition result data in a second embodiment.
FIG. 13 is a diagram explaining the character recognition method of the second character recognition unit in a third embodiment.
FIG. 14 is a block diagram showing the functional configuration of a character recognition device in a fourth embodiment.
FIG. 15 is a diagram showing an example of a receipt image to which the fourth embodiment is suitably applied.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. In the drawings, the dimensions and scale of each part differ from the actual ones as appropriate, and some parts are shown schematically for ease of understanding. The scope of the present invention is not limited to these embodiments unless the following description specifically states that the invention is so limited.

In the present embodiment, a notation refers to lines or dots that are distinguished from the background in a document or in an image obtained by reading the document. A notation may represent a character or a ruled line. A notation may also represent a pattern, or may be a line or dot made when a writing instrument unintentionally touched the paper.
In the present embodiment, a character refers to a notation, such as kanji, hiragana, katakana, letters of the alphabet, numbers, or symbols, that is artificially defined using lines and dots in order to convey and record words and language. Therefore, lines or dots made when a writing instrument unintentionally touched the paper, ruled lines, and the like are not characters.
In the present embodiment, a number is a character that represents a numerical value. In the present embodiment, numbers refer to, for example, Arabic numerals (0, 1, ..., 9).
In the present embodiment, a language character is a character used to write a language, such as kanji, hiragana, katakana, or letters of the alphabet. Dots, commas, Japanese full stops (kuten), and Japanese commas (touten), for example, are used together with language characters, but in the present embodiment they are classified as symbols.
In the present embodiment, a symbol refers to a notation, such as a dot, comma, Japanese full stop, Japanese comma, hyphen, or arithmetic operator, that generally conveys a specific meaning by being used together with language characters or numbers. A symbol may also be used on its own.
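The three character classes defined above (numbers, language characters, symbols) can be made concrete with a small Python sketch. It is only an illustration: the function name `classify_character`, the label strings, and the exact contents of `SYMBOLS` are assumptions, since the text lists its symbols only as examples. Full-width forms are folded to their ASCII equivalents with NFKC normalization, which is a choice made here and is not stated in the specification.

```python
import unicodedata

# Arabic numerals, as in the definition of "number".
DIGITS = set("0123456789")

# Assumed symbol set for illustration: dot, comma, hyphen, some arithmetic
# operators, plus the Japanese full stop and Japanese comma.
SYMBOLS = set(".,-+=") | {"。", "、", "−", "×", "÷"}


def classify_character(ch: str) -> str:
    """Classify one character as 'number', 'symbol', 'language', or 'other'."""
    # NFKC folds full-width forms such as '７' or '，' to '7' and ','.
    normalized = unicodedata.normalize("NFKC", ch)
    if normalized in DIGITS:
        return "number"
    if normalized in SYMBOLS:
        return "symbol"
    # str.isalpha() is True for kanji, hiragana, katakana, and Latin letters.
    if normalized.isalpha():
        return "language"
    return "other"
```

Checking the symbol set before the alphabetic test keeps dots and commas out of the language class even though they normally appear alongside language characters, which matches the classification given in the definitions.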

[First embodiment]
[Overview of character recognition system]
FIG. 1 is a diagram showing the configuration of a character recognition system 1 according to the embodiment. The character recognition system 1 includes a character recognition device 10 and a terminal device 20. The character recognition device 10 is an example of an information processing device. The character recognition device 10 and the terminal device 20 are connected via a network NW. The network NW may include the Internet and a local area network. For example, the network NW includes one or both of a wired network and a wireless network. The connection between the network NW and the character recognition device 10 need only allow the connected elements to communicate with each other; it may use either a wired or a wireless connection, or both.

Any information processing device connectable to the network NW can be used as the character recognition device 10. The character recognition device 10 executes character recognition processing that recognizes characters included in an image, using, for example, optical character recognition (OCR) technology. An example of the receipt image RI subject to character recognition processing is described later with reference to FIG. 2. The configuration of the character recognition device 10 is described later with reference to FIGS. 6 and 7.

Any information processing device connectable to the network NW can be used as the terminal device 20. Specifically, the terminal device 20 may be, for example, a stationary information device such as a personal computer, or a portable information terminal such as a notebook personal computer, a tablet terminal, or a smartphone. The configuration of the terminal device 20 is described later with reference to FIGS. 4 and 5.

In the present embodiment, the character recognition device 10 is an information processing device owned by a business operator that provides a receipt data conversion service, and the terminal device 20 is an information processing device owned by a user of that service. The receipt data conversion service performs character recognition processing on an image showing a receipt (hereinafter referred to as a "receipt image") and converts the characters written on the receipt into data that a computer can process. That is, in the present embodiment, the image to be processed is the receipt image RI (see FIG. 2). Data that a computer can process is, for example, text data such as Shift-JIS codes. The user generates the receipt image RI by reading the receipt to be converted with a scanner, photographing it with a camera, or the like. The user transmits the receipt image RI to the character recognition device 10 using the terminal device 20. The character recognition device 10 performs character recognition processing on the receipt image RI transmitted from the terminal device 20 and transmits recognition result data RD (see FIG. 3), which includes the result of the character recognition processing, to the terminal device 20. The user enters the received recognition result data RD into, for example, an accounting management application to manage expenses. With the receipt data conversion service, the content of a paper receipt is converted into data automatically, which makes expense management more efficient than, for example, entering the content of the receipt into a computer by hand.

FIG. 2 is a diagram showing an example of the receipt image RI. The receipt image RI shown in FIG. 2 includes a document name notation N1, a date notation N2, a destination notation N3, an amount notation N4, a proviso notation N5, a receipt confirmation statement notation N6, an address notation N7, and an issuer name notation N8. Each of the notations N1 to N8 consists of lines and dots written in a color distinguishable from the ground color of the receipt. Hereinafter, in the receipt image RI, the color showing the ground color of the receipt is called the background color, and the color showing the notations N1 to N8 is called the notation color.

The document name notation N1 indicates that the document is a receipt. The date notation N2 indicates the date on which the receipt was issued. The destination notation N3 indicates the addressee of the receipt. The amount notation N4 indicates the amount of money received. The proviso notation N5 indicates the name of the goods or services provided in exchange for the money. The receipt confirmation statement notation N6 is wording that confirms that the amount given in the amount notation N4 was received for the purpose given in the proviso notation N5. The address notation N7 indicates the address of the issuer of the receipt. The issuer name notation N8 indicates the name of the issuer of the receipt.

In the receipt image RI shown in FIG. 2, the document name notation N1 and the receipt confirmation statement notation N6 are printed in type. The destination notation N3, the amount notation N4, the address notation N7, and the issuer name notation N8 are written by hand. The date notation N2 and the proviso notation N5 are a mixture of handwritten and printed characters. In general, on receipt forms sold for handwriting, items common to all receipts are printed in type in advance, and items that differ from receipt to receipt are written by hand. The receipt image RI shown in FIG. 2 is an image of a receipt in which the necessary parts have been filled in by hand on such a commercially available receipt form. The receipt image RI shown in FIG. 2 is only an example; all items of a receipt image may be printed in type, or all items may be handwritten.

FIG. 3 is a diagram showing an example of the recognition result data RD. The recognition result data RD shown in FIG. 3 indicates the result of character recognition of the receipt image RI of FIG. 2. The recognition result data RD includes a date recognition result D1, a destination recognition result D2, an amount recognition result D3, a proviso recognition result D4, and an issuer recognition result D5. The date recognition result D1 indicates the result of character recognition of the date notation N2. The destination recognition result D2 indicates the result of character recognition of the destination notation N3. The amount recognition result D3 indicates the result of character recognition of the amount notation N4. The proviso recognition result D4 indicates the part of the character recognition result for the proviso notation N5 that names the goods or services. The issuer recognition result D5 indicates the result of character recognition of the issuer name notation N8. The recognition result data RD shown in FIG. 3 does not include character recognition results for the document name notation N1, the receipt confirmation statement notation N6, or the address notation N7. The document name notation N1 and the receipt confirmation statement notation N6 are omitted because they are self-evident in a receipt data conversion service. The address notation N7 is omitted because it is not needed for expense management. The user may be allowed to specify which information is transmitted as the recognition result data RD. For example, the user may be allowed to instruct the character recognition device 10 to include the address notation N7 in the recognition result data RD.
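As one way to picture the recognition result data RD, the following Python sketch models the five results D1 to D5 as a record, with the address as an optional field that is filled only when the user asks for it. The class name, field names, the `include_address` flag, and the dictionary keyed by notation label are all hypothetical; the specification does not define a data format. The sketch also assumes the goods or services name has already been extracted from the proviso text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RecognitionResultData:
    date: str                      # D1, from date notation N2
    destination: str               # D2, from destination notation N3
    amount: str                    # D3, from amount notation N4
    proviso: str                   # D4, goods/services name taken from N5
    issuer: str                    # D5, from issuer name notation N8
    address: Optional[str] = None  # from N7, present only on request


def build_recognition_result(recognized: Dict[str, str],
                             include_address: bool = False
                             ) -> RecognitionResultData:
    """Assemble RD from per-notation recognition results.

    N1 and N6 are deliberately ignored; N7 is included only when
    include_address is True (an assumed user-controlled option).
    """
    return RecognitionResultData(
        date=recognized["N2"],
        destination=recognized["N3"],
        amount=recognized["N4"],
        proviso=recognized["N5"],
        issuer=recognized["N8"],
        address=recognized.get("N7") if include_address else None,
    )
```

Keeping the address as an optional field rather than a separate record type means the default output and the user-extended output share one shape, which is convenient for a downstream accounting application.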

[System configuration]
Next, the configurations of the terminal device 20 and the character recognition device 10 are described.

FIG. 4 is a block diagram showing an example of the hardware configuration of the terminal device 20. The terminal device 20 includes a processor 22 that controls each part of the terminal device 20, a memory 24 that stores various information, a communication device 26, an operation device 28, and a display device 29.

The memory 24 includes, for example, one or both of a volatile memory such as a RAM (Random Access Memory) that functions as a work area for the processor 22 and a nonvolatile memory such as an EEPROM (Electrically Erasable Programmable Read-Only Memory) that stores various information such as the control program PG2, and functions as a storage unit 240 (see FIG. 5) described later. The memory 24 may be detachable from the terminal device 20. Specifically, the memory 24 may be a storage medium such as a memory card that can be attached to and removed from the terminal device 20. The memory 24 may also be, for example, a storage device (for example, online storage) communicably connected to the terminal device 20 via the network NW or the like.

The memory 24 stores, for example, the control program PG2. In the present embodiment, the control program PG2 includes, for example, an application program for transmitting the receipt image RI to the character recognition device 10 and obtaining the recognition result data RD. The control program PG2 may also include, for example, an operating system program with which the processor 22 controls each part of the terminal device 20.

The processor 22 includes, for example, one or more CPUs (Central Processing Units). The processor 22 functions as a control unit 220 (see FIG. 5) described later by, for example, executing the control program PG2 stored in the memory 24 and operating in accordance with the control program PG2.

When the processor 22 includes a plurality of CPUs, for example, some or all of the functions of the control unit 220 may be realized by those CPUs operating in cooperation in accordance with a program such as the control program PG2. The processor 22 may also include hardware such as a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array) in addition to the one or more CPUs, or in place of some or all of them. In that case, part or all of the control unit 220 realized by the processor 22 may be realized by hardware such as a DSP.

The communication device 26 is hardware for communicating with external devices outside the terminal device 20 via one or both of a wired network and a wireless network, and functions as a communication unit 260.

The operation device 28 is hardware for accepting operations by the user of the terminal device 20, and functions as an operation unit 280. For example, the operation device 28 may consist of one or more devices including some or all of operation buttons, a touch panel, a keyboard, a mouse, and the like.

The display device 29 is hardware for displaying various information to the user of the terminal device 20, and functions as a display unit 290. For example, the display device 29 may be a display built into the terminal device 20 or a display connected to the terminal device 20.

The hardware configuration of the terminal device 20 is not limited to the example shown in FIG. 4. For example, the terminal device 20 may have an optical device such as a camera or an image scanner for generating the receipt image RI from a paper receipt.

FIG. 5 is a block diagram showing an example of the functional configuration of the terminal device 20. The terminal device 20 has a control unit 220 that controls each part of the terminal device 20, a storage unit 240 that stores various information, a communication unit 260 for communicating with external devices such as the character recognition device 10, an operation unit 280 for accepting operations by the user of the terminal device 20, and a display unit 290 for displaying various information.

The control unit 220 includes an image transmission unit 222 and a recognition result data reception unit 224. The image transmission unit 222 transmits the receipt image RI to the character recognition device 10. The recognition result data reception unit 224 receives the recognition result data RD from the character recognition device 10.

FIG. 6 is a block diagram showing an example of the hardware configuration of the character recognition device 10. The character recognition device 10 has a processor 12 that controls each part of the character recognition device 10, a memory 14 that stores various information, and a communication device 16.

The memory 14 includes, for example, one or both of a volatile memory such as a RAM that functions as a work area for the processor 12 and a nonvolatile memory such as an EEPROM that stores various information such as the control program PG1, and functions as a storage unit 140 (see FIG. 7) described later. Like the memory 24 of the terminal device 20 described with reference to FIG. 4, the memory 14 may be detachable from the character recognition device 10, or may be a storage device (for example, online storage) communicably connected to the character recognition device 10 via the network NW or the like.

In the present embodiment, the control program PG1 includes, for example, an application program with which the character recognition device 10 executes character recognition processing. The control program PG1 may also include, for example, an operating system program with which the control unit 120 controls each part of the character recognition device 10.

The processor 12 is configured in the same way as the processor 22 of the terminal device 20 described with reference to FIG. 4. For example, the processor 12 includes one or more CPUs. The processor 12 functions as a control unit 120 (see FIG. 7) described later by executing the control program PG1 stored in the memory 14 and operating in accordance with the control program PG1.

When the processor 12 includes a plurality of CPUs, for example, some or all of the functions of the control unit 120 may be realized by those CPUs operating in cooperation in accordance with a program such as the control program PG1. The processor 12 may also include hardware such as a GPU, a DSP, or an FPGA in addition to the one or more CPUs, or in place of some or all of them. In that case, part or all of the control unit 120 realized by the processor 12 may be realized by hardware such as a DSP.

The communication device 16 is hardware for communicating with external devices outside the character recognition device 10 via one or both of a wired network and a wireless network, and functions as a communication unit 160.

FIG. 7 is a block diagram showing an example of the functional configuration of the character recognition device 10. The character recognition device 10 has a control unit 120 that controls each part of the character recognition device 10, a storage unit 140 that stores various information, and a communication unit 160 for communicating with external devices such as the terminal device 20.

[Details of character recognition processing]
The components of the control unit 120 of the character recognition device 10 are described below, together with the details of the character recognition processing performed by the character recognition device 10. The control unit 120 includes an image acquisition unit 121, a character region identification unit 122, a character type determination unit 123, a first character recognition unit 124, a second character recognition unit 125 (comprising a number recognition unit 126, a symbol recognition unit 127, a symbol erasing unit 128, and an output unit 129), and a recognition result generation unit 130.

The image acquisition unit 121 acquires the image data to be processed. In the present embodiment, the image acquisition unit 121 receives, via the network NW, the receipt image RI transmitted from the terminal device 20.

The character region identification unit 122 separates the receipt image RI acquired by the image acquisition unit 121 into components such as characters and ruled lines, identifies the regions to be read as characters paragraph by paragraph, and divides the image into paragraph images P (P1 to P8). The processing performed by the character region identification unit 122 is generally called layout analysis. FIG. 8 is a diagram showing an example of the receipt image RI after layout analysis. Layout analysis identifies the following in the receipt image RI: a paragraph image P1, a rectangular region enclosing the document name notation N1; a paragraph image P2, a rectangular region enclosing the date notation N2; a paragraph image P3, a rectangular region enclosing the destination notation N3; a paragraph image P4, a rectangular region enclosing the amount notation N4; a paragraph image P5, a rectangular region enclosing the proviso notation N5 and the receipt confirmation statement notation N6; and a paragraph image P6, a rectangular region enclosing the address notation N7 and the issuer name notation N8. Each of the paragraph images P1 to P6 is a part of the receipt image RI.
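Since each paragraph image is a rectangular part of the receipt image, the step from identified regions to paragraph images amounts to cropping. The Python sketch below shows only that cropping step and assumes the rectangles have already been found by layout analysis, which is not implemented here. The function names, the nested-list image representation, and the box format are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed representation: the image is a list of pixel rows.
Image = List[List[int]]
# Assumed box format: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def crop(image: Image, box: Box) -> Image:
    """Return the rectangular part of image covered by box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def extract_paragraph_images(receipt_image: Image,
                             regions: Dict[str, Box]) -> Dict[str, Image]:
    """Cut one paragraph image per identified region, keyed by its label."""
    return {label: crop(receipt_image, box) for label, box in regions.items()}
```

For example, passing `{"P1": (0, 0, 2, 4), "P4": (2, 1, 4, 3)}` for a 4 x 4 image returns two sub-images, each holding only the pixels inside its rectangle.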

The character type determination unit 123 determines the character type of the notation included in each paragraph image P specified by the character area specifying unit 122. In this embodiment, the character type determination unit 123 determines, for each paragraph image P included in the receipt image RI, whether it is a number paragraph image, which is expected to contain only numeric notation, or a mixed paragraph image, which is expected to contain a mixture of kanji, hiragana, katakana, symbols, numbers, and the like.

In this embodiment, the character type determination unit 123 determines that the paragraph image P4 corresponding to the amount notation N4 is a number paragraph image, and that the remaining paragraph images P1 to P3 and P5 to P6 are mixed paragraph images. The character type determination unit 123 uses, for example, pattern matching to identify the paragraph image P4 corresponding to the amount notation N4 in the receipt image RI. Commercially available receipt forms and receipts printed by store cash registers have largely fixed layouts. The character type determination unit 123 therefore holds common receipt layouts as templates and determines which template the receipt image RI currently being processed matches. In each template, the position of the paragraph corresponding to the amount is specified in advance. The character type determination unit 123 compares the position of the amount paragraph in the template with the position of each paragraph image P included in the receipt image RI, and identifies the paragraph image P corresponding to the amount in that receipt image RI.
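
The position comparison described above can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the text does not say how positions are compared, so the use of intersection over union, the `Box` type, the `min_overlap` threshold, and all coordinates are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in normalized page coordinates (0.0 to 1.0)."""
    left: float
    top: float
    right: float
    bottom: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two rectangles (an assumed similarity measure)."""
    width = min(a.right, b.right) - max(a.left, b.left)
    height = min(a.bottom, b.bottom) - max(a.top, b.top)
    if width <= 0 or height <= 0:
        return 0.0
    inter = width * height
    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    return inter / (area_a + area_b - inter)


def find_amount_paragraph(paragraphs: Dict[str, Box],
                          template_amount_box: Box,
                          min_overlap: float = 0.5) -> Optional[str]:
    """Return the paragraph that best overlaps the template's amount region, or None."""
    best_name, best_score = None, 0.0
    for name, box in paragraphs.items():
        score = iou(box, template_amount_box)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_overlap else None


# Hypothetical paragraph positions produced by layout analysis.
paragraphs = {
    "P1": Box(0.35, 0.05, 0.65, 0.15),
    "P2": Box(0.70, 0.05, 0.95, 0.12),
    "P3": Box(0.05, 0.18, 0.60, 0.28),
    "P4": Box(0.20, 0.32, 0.80, 0.45),
    "P5": Box(0.10, 0.48, 0.90, 0.60),
    "P6": Box(0.40, 0.65, 0.95, 0.85),
}
# Hypothetical amount region stored in the matched template.
template_amount_box = Box(0.18, 0.30, 0.82, 0.46)

amount_paragraph = find_amount_paragraph(paragraphs, template_amount_box)
print(amount_paragraph)
```

With these made-up coordinates only P4 overlaps the template's amount region, so it is the paragraph treated as the number paragraph image.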

Both the first character recognition unit 124 and the second character recognition unit 125 recognize the notation included in a processing target image as characters. In this embodiment, the first character recognition unit 124 and the second character recognition unit 125 perform character recognition by, for example, AI (Artificial Intelligence)-OCR. In AI-OCR, the first character recognition unit 124 and the second character recognition unit 125 recognize the characters included in the processing target image by using a learning model that has learned the relationship between images containing characters and the characters contained in those images. The learning model is composed of a multilayer neural network. Using AI-OCR makes it possible to recognize characters, in particular handwritten characters, with high accuracy. The first character recognition unit 124 and the second character recognition unit 125 use different character recognition algorithms. The first character recognition unit 124 uses an algorithm that is standard for Japanese recognition and suited to documents containing a mixture of kanji, hiragana, katakana, symbols, numbers, and the like. The second character recognition unit 125 uses an algorithm specialized for recognizing numbers in the number recognition unit 126, described later, and an algorithm specialized for recognizing symbols in the symbol recognition unit 127.

The first character recognition unit 124 performs character recognition on the paragraph images P that the character type determination unit 123 has determined to be mixed paragraph images. In this embodiment, the first character recognition unit 124 performs character recognition on the notation included in the paragraph image P1 corresponding to the document name notation N1, the paragraph image P2 corresponding to the date notation N2, the paragraph image P3 corresponding to the addressee notation N3, the paragraph image P5 corresponding to the proviso notation N5 and the receipt confirmation statement notation N6, and the paragraph image P6 corresponding to the address notation N7 and the issuer name notation N8. When the notation in any of the paragraph images P1 to P3 and P5 to P6 extends over multiple lines, the first character recognition unit 124 segments that paragraph image line by line, and then further segments it into partial images, each presumed to contain one character. The first character recognition unit 124 then recognizes the notation included in each partial image cut out in this way as one character. The recognition result of the first character recognition unit 124 is output to the recognition result generation unit 130. In this embodiment, the recognition results of the first character recognition unit 124 and the second character recognition unit 125 are output as text data, for example in Shift-JIS code. Also, in this embodiment the first character recognition unit 124 uses an algorithm that is standard for Japanese recognition. However, if any of the paragraph images P1 to P3 and P5 to P6 contains only a specific character type (or is highly likely to contain only a specific character type), an algorithm specialized for recognizing that character type may be applied to that paragraph image P.

The second character recognition unit 125 performs character recognition on the paragraph image P that the character type determination unit 123 has determined to be a number paragraph image. In this embodiment, the second character recognition unit 125 performs character recognition on the paragraph image P4 corresponding to the amount notation N4. The paragraph image P4 of the receipt image RI is an example of a first image representing numbers and symbols other than numbers. As described above, the amount notation N4 is handwritten. The paragraph image P4 is therefore an image obtained by reading a document containing handwritten numbers. The second character recognition unit 125 includes a number recognition unit 126, a symbol recognition unit 127, a symbol erasing unit 128, and an output unit 129.

The number recognition unit 126 performs character recognition processing using an algorithm specialized for recognizing numbers (hereinafter referred to as the "number algorithm"). The symbol recognition unit 127 performs character recognition processing using an algorithm specialized for recognizing symbols (hereinafter referred to as the "symbol algorithm"). The symbol recognition unit 127 may, for example, use an algorithm specialized for recognizing those symbols that are frequently used together with numbers. Symbols frequently used together with numbers include, for example, the comma (,), the dot (.), the hyphen (-), and currency symbols (¥, $, and the like). In addition, particularly when the numbers represent a monetary amount, kanji (language characters) that are frequently used together with numbers, such as "円" (yen), "金", and "也", may be treated as symbols and made recognizable by the symbol recognition unit 127.

The symbol erasing unit 128 erases the notation corresponding to symbols from the processing target image (the paragraph image P4 in this embodiment). The symbol erasing unit 128 erases the notation indicating a symbol by, for example, replacing the partial image C that contains the notation corresponding to the symbol with a plain image of the same color as the background of the receipt image RI.

The output unit 129 outputs the recognition result of the notation included in the processing target image (the paragraph image P4 in this embodiment) based on the recognition result of the number recognition unit 126. The output unit 129 may also output the recognition result of the notation included in the processing target image based on the recognition result of the symbol recognition unit 127 in addition to that of the number recognition unit 126.

The character recognition method of the second character recognition unit 125 is described in detail with reference to FIG. 9. First, as shown in FIG. 9A, the second character recognition unit 125 segments the notation included in the paragraph image P4 into units each presumed to be one character (hereinafter referred to as "character units"). The areas segmented by character unit are referred to as partial images C (C1 to C8). In this embodiment, a partial image C is a rectangular area containing notation presumed to be one character. In this embodiment, the following are cut out from the paragraph image P4: a partial image C1 containing the notation "¥", a partial image C2 containing the notation "2", a partial image C3 containing the notation "5", a partial image C4 containing the notation "," (comma), a partial image C5 containing the notation "7", a partial image C6 containing the notation "6", a partial image C7 containing the notation "0", and a partial image C8 containing the notation "-" (hyphen).

Next, the second character recognition unit 125 causes the symbol recognition unit 127 to perform character recognition processing on the notation included in each of the partial images C1 to C8. That is, the symbol recognition unit 127 recognizes the symbols included in the paragraph image P4. For example, the second character recognition unit 125 determines that notation is a symbol when the confidence of its recognition as a symbol by the symbol recognition unit 127 is equal to or greater than a predetermined value, and determines that a partial image C is not a symbol when the confidence of the recognition result is less than the predetermined value. The confidence of a recognition result is an index value indicating how likely the recognition result is to be correct; in this embodiment, a higher confidence means the recognition result is more likely to be correct. In this embodiment, the partial image C1 containing the notation "¥", the partial image C4 containing the notation ",", and the partial image C8 containing the notation "-" are determined to contain notation indicating a symbol. In the first embodiment, it is sufficient to identify the partial images C that contain symbol notation, and the recognition results for that symbol notation need not be saved.
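
The threshold test described above can be sketched as follows. The threshold of 0.8 and the confidence values are invented for illustration; the text only states that a "predetermined value" is used, and the recognizer that would produce these scores is not shown.

```python
# Hypothetical threshold standing in for the "predetermined value".
SYMBOL_THRESHOLD = 0.8


def is_symbol(confidence: float, threshold: float = SYMBOL_THRESHOLD) -> bool:
    """A partial image counts as a symbol when the confidence reaches the threshold."""
    return confidence >= threshold


# Made-up confidences of the symbol recognizer for partial images C1 to C8.
symbol_confidence = {
    "C1": 0.97,  # "¥"
    "C2": 0.12,
    "C3": 0.08,
    "C4": 0.91,  # ","
    "C5": 0.20,
    "C6": 0.15,
    "C7": 0.31,
    "C8": 0.88,  # "-"
}

# Keep only the partial images judged to contain a symbol, in reading order.
symbol_images = [name for name, conf in symbol_confidence.items() if is_symbol(conf)]
print(symbol_images)
```

A value exactly equal to the threshold is treated as a symbol, matching the "equal to or greater than" wording of the text.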

Subsequently, the second character recognition unit 125 causes the symbol erasing unit 128 to erase the notation indicating symbols, as shown in FIG. 9B. In this embodiment, the symbol erasing unit 128 erases the notation indicating symbols by, for example, replacing the partial images C1, C4, and C8 with plain images of the same color as the background of the receipt image RI. The paragraph image P4 from which the notation indicating symbols has been erased is referred to as a paragraph image P4A. The paragraph image P4A is shown in FIG. 9C. That is, the symbol erasing unit 128 is an example of a generation unit that generates the paragraph image P4A by erasing the partial images C1, C4, and C8 corresponding to symbols from the paragraph image P4. More specifically, the symbol erasing unit 128 generates the paragraph image P4A by erasing the portions corresponding to the symbols recognized by the symbol recognition unit 127. The partial images C1, C4, and C8 are an example of a first portion, and the paragraph image P4A is an example of a second image.
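
Erasing by filling with the background color can be sketched as follows. The image is modeled as a nested list of grayscale pixel values to keep the example dependency-free; the function name, the box format, and the pixel values are assumptions made for the illustration.

```python
from typing import List, Sequence, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
PixelBox = Tuple[int, int, int, int]


def erase_regions(image: List[List[int]],
                  boxes: Sequence[PixelBox],
                  background: int) -> List[List[int]]:
    """Return a copy of the image with every box filled with the background value."""
    result = [row[:] for row in image]  # leave the original image untouched
    for left, top, right, bottom in boxes:
        for y in range(top, bottom):
            for x in range(left, right):
                result[y][x] = background
    return result


WHITE, INK = 255, 0

# Toy 2 x 6 "paragraph image" in which every pixel is inked.
p4 = [[INK] * 6 for _ in range(2)]

# Hypothetical symbol regions: columns 0 to 1 and columns 4 to 5.
p4a = erase_regions(p4, [(0, 0, 2, 2), (4, 0, 6, 2)], WHITE)
for row in p4a:
    print(row)
```

Returning a copy keeps the original paragraph image available, which is convenient if the symbol positions are needed again later.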

Then, the second character recognition unit 125 causes the number recognition unit 126 to perform character recognition processing on the notation included in the paragraph image P4A. That is, the number recognition unit 126 recognizes the numbers included in the paragraph image P4A generated by the symbol erasing unit 128. In this embodiment, the paragraph image P4A contains the partial images C2 to C3 and C5 to C7. The number recognition unit 126 recognizes the notation of the partial image C2 as "2", that of the partial image C3 as "5", that of the partial image C5 as "7", that of the partial image C6 as "6", and that of the partial image C7 as "0".

The second character recognition unit 125 causes the output unit 129 to output the recognition result of the paragraph image P4. In this embodiment, the output unit 129 arranges the recognition results for the notation of the partial images C2 to C3 and C5 to C7 in the order in which those partial images appear in the paragraph image P4, and outputs the result as the recognition result of the notation included in the paragraph image P4. In other words, the output unit 129 outputs a character string in which only the recognition results for numbers, among the notation included in the paragraph image P4, are arranged. In doing so, the positions of the partial images C that contained symbol notation need not be left blank; the numbers may be closed up and arranged consecutively. Specifically, as shown in FIG. 9E, the output unit 129 outputs "25760" as the recognition result U of the notation included in the paragraph image P4. The recognition result of the second character recognition unit 125 is output to the recognition result generation unit 130.

The recognition result generation unit 130 generates recognition result data by entering the recognition results output by the first character recognition unit 124 and the second character recognition unit 125 into a predetermined format. In this embodiment, the predetermined format is the format of the recognition result data RD shown in FIG. 3. The recognition result generation unit 130 generates the recognition result data by entering the recognition result of the paragraph image P2 into the date recognition result D1, the recognition result of the paragraph image P3 into the addressee recognition result D2, the recognition result of the paragraph image P4 into the amount recognition result D3, the character string obtained by removing unnecessary parts such as "但し" and "として" from the recognition result of the paragraph image P5 into the proviso recognition result D4, and the character string corresponding to the issuer name notation N8 in the paragraph image P6 into the issuer recognition result D5. The recognition result generation unit 130 also transmits the generated recognition result data to the terminal device 20.

[Flowchart]
FIG. 10 is a flowchart showing the procedure of the character recognition processing performed by the character recognition device 10. The control unit 120 of the character recognition device 10 functions as the image acquisition unit 121 and receives the receipt image RI to be processed from the terminal device 20 (step S100). Next, the control unit 120 functions as the character area specifying unit 122, performs layout analysis on the receipt image RI acquired in step S100, and specifies the paragraph images P included in the receipt image RI (step S102). Subsequently, the control unit 120 functions as the character type determination unit 123 and determines the character type of each paragraph image P specified in step S102 (step S104). More specifically, the control unit 120 determines whether each paragraph image P is a number paragraph image or a mixed paragraph image.

The control unit 120 performs character recognition sequentially on each paragraph image P specified in step S102. If the paragraph image P to be processed is a number paragraph image (step S106: YES), the control unit 120 functions as the second character recognition unit 125 and performs character recognition on the notation in the paragraph image to be processed using the number algorithm and the symbol algorithm (step S108: second character recognition processing). The details of step S108 are described later with reference to the flowchart of FIG. 11.

If the paragraph image to be processed is not a number paragraph image (step S106: NO), that is, if it is a mixed paragraph image, the control unit 120 functions as the first character recognition unit 124 and performs character recognition on the notation in the paragraph image to be processed using an algorithm that is standard for Japanese recognition (step S110: first character recognition processing).

Until the character recognition processing has been completed for all paragraph images (step S112: NO), the control unit 120 returns to step S106 and continues the character recognition processing. When character recognition has been completed for all paragraph images (step S112: YES), the control unit 120 functions as the recognition result generation unit 130, generates recognition result data by entering the recognition result for each paragraph image P into the predetermined format (step S114), transmits the generated recognition result data to the terminal device 20 (step S116), and ends the processing of this flowchart.
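
The branch and loop of steps S106 to S112 can be sketched as follows. This is a toy illustration under stated assumptions: plain strings stand in for paragraph images, the two recognizers are trivial placeholders rather than OCR engines, and every name here is hypothetical.

```python
from typing import Callable, Dict


def recognize_all(paragraphs: Dict[str, str],
                  is_number_paragraph: Callable[[str], bool],
                  first_recognizer: Callable[[str], str],
                  second_recognizer: Callable[[str], str]) -> Dict[str, str]:
    """Dispatch every paragraph to the recognizer chosen in step S106."""
    results = {}
    for name, image in paragraphs.items():            # repeat until S112: YES
        if is_number_paragraph(name):                 # S106: YES
            results[name] = second_recognizer(image)  # S108
        else:                                         # S106: NO
            results[name] = first_recognizer(image)   # S110
    return results                                    # input to S114


# Toy stand-ins: strings play the role of paragraph images.
paragraphs = {"P1": "RECEIPT", "P4": "¥25,760-"}

results = recognize_all(
    paragraphs,
    is_number_paragraph=lambda name: name == "P4",
    first_recognizer=lambda image: image,  # placeholder for the standard algorithm
    second_recognizer=lambda image: "".join(ch for ch in image if ch.isdigit()),
)
print(results)
```

Passing the recognizers in as arguments mirrors the way the control unit 120 takes on the role of either recognition unit depending on the paragraph type.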

Next, the processing of step S108 (the second character recognition processing by the second character recognition unit 125) is described with reference to FIG. 11. As described above, when the paragraph image P to be processed is a number paragraph image, the control unit 120 functions as the second character recognition unit 125 and performs the following processing. The control unit 120 segments the notation included in the paragraph image P to be processed (the paragraph image P4 in this embodiment) into character units (see FIG. 9A) and cuts out a plurality of partial images C (step S200).

The control unit 120 functions as the symbol recognition unit 127 and performs character recognition processing using the symbol algorithm on each partial image C included in the paragraph image P (step S202: symbol recognition processing). If the paragraph image P to be processed contains a partial image C that includes notation indicating a symbol (step S204: YES), the control unit 120 functions as the symbol erasing unit 128 and erases the partial image C that includes the notation indicating the symbol (see FIG. 9B) (step S206). If the paragraph image P to be processed contains no partial image C that includes notation indicating a symbol (step S204: NO), the control unit 120 proceeds to step S208.

The control unit 120 functions as the number recognition unit 126 and performs character recognition on the notation included in the paragraph image P, for each partial image C, using the number algorithm (see FIG. 9C) (step S208: number recognition processing). The control unit 120 functions as the output unit 129, outputs the recognition result of step S208 to the recognition result generation unit 130 as the recognition result of the paragraph image P (see FIG. 9E) (step S210), and ends the processing of this flowchart.
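
Steps S202 to S210 can be sketched end to end as follows. This is a simplified model, not the implementation of the embodiment: single characters stand in for the partial images cut out in step S200, erasure is modeled by dropping the symbol images rather than repainting pixels, and the two recognizers and the threshold are hypothetical stubs.

```python
from typing import Callable, Sequence, Tuple


def second_character_recognition(
        partial_images: Sequence[str],
        recognize_symbol: Callable[[str], Tuple[str, float]],
        recognize_digit: Callable[[str], str],
        threshold: float = 0.8) -> str:
    """Erase what the symbol recognizer accepts, then read the rest as digits."""
    kept = []
    for image in partial_images:
        _symbol, confidence = recognize_symbol(image)  # S202
        if confidence >= threshold:                    # S204: YES
            continue                                   # S206: erased
        kept.append(image)
    digits = [recognize_digit(image) for image in kept]  # S208
    return "".join(digits)                               # S210


def stub_symbol_recognizer(image: str) -> Tuple[str, float]:
    """Stub: high confidence only for three known symbols."""
    return (image, 0.95) if image in "¥,-" else ("", 0.05)


def stub_digit_recognizer(image: str) -> str:
    """Stub: the placeholder image already is its own digit."""
    return image


# Characters stand in for partial images C1 to C8.
partial_images = list("¥25,760-")
amount = second_character_recognition(
    partial_images, stub_symbol_recognizer, stub_digit_recognizer)
print(amount)
```

Because the symbol pass runs first, the digit recognizer never sees the three symbol images, which is the ordering the flowchart prescribes.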

As described above, in the first embodiment, the symbol erasing unit 128 generates the paragraph image P4A (see FIG. 9C) by erasing the partial images C1, C4, and C8 corresponding to symbols from the paragraph image P4 (see FIG. 8), which represents numbers and symbols other than numbers, and the number recognition unit 126 recognizes the numbers included in the paragraph image P4A. Because the number recognition unit 126 recognizes numbers in the paragraph image P4A, from which symbols other than numbers have been erased in advance, symbols are less likely to be misrecognized as numbers, and the number recognition accuracy can be improved compared with performing number recognition on the paragraph image P4, in which symbols other than numbers are mixed.

Also, in the first embodiment, the symbol recognition unit 127 recognizes the symbols included in the paragraph image P4, and the symbol erasing unit 128 generates the paragraph image P4A by erasing the partial images C1, C4, and C8 corresponding to the symbols recognized by the symbol recognition unit 127. Having the symbol recognition unit 127 recognize the symbols included in the paragraph image P4 makes it possible to reliably identify the partial images C corresponding to symbols and erase them from the paragraph image P4. For example, compared with using image processing to delete as noise only symbols whose area is smaller than that of numbers, such as commas and dots, using the symbol recognition unit 127 also makes it possible to erase symbols that are about as large as numbers, such as "¥", so symbols can be erased from the paragraph image P4 more reliably.

Also, in the first embodiment, the output unit 129 outputs, as the recognition result of the paragraph image P4, a character string in which only the recognition results for numbers, among the notation included in the paragraph image P4, are arranged. As a result, the character type in the recognition result of the paragraph image P4 is unified to numbers only, which reduces the load of processing that uses the recognition result data. For example, when the character recognition result for the receipt image RI is used in an accounting application, the recognition result for the amount notation N4 consists of numbers only, so it can be used directly in calculations and other processing without preprocessing such as removing extra symbols.

Also, in the first embodiment, the number recognition unit 126 performs character recognition processing using the number algorithm, which is specialized for recognizing numbers. In general, when the character type contained in a processing target image is fixed, character recognition accuracy can be increased by using an algorithm specialized for recognizing that character type. On the other hand, an algorithm specialized for a specific character type tends to try to recognize any character or line as one of the characters of that type, so recognition accuracy tends to drop when other character types are mixed into the processing target image. By erasing notation other than the specific character type (numbers) from the processing target image in advance, as in this embodiment, the accuracy of character recognition using an algorithm specialized for a specific character type can be improved further.

[Second Embodiment]
Next, a second embodiment is described. In the following examples, elements whose functions are the same as in the first embodiment are given the reference numerals used in the description of the first embodiment, and their detailed descriptions are omitted as appropriate. In the first embodiment, the recognition result output by the output unit 129 did not include the recognition results for notation recognized as symbols. In the second embodiment, the recognition result output by the output unit 129 includes the recognition results for notation recognized as symbols.

The character recognition method of the second character recognition unit 125 in the second embodiment is described with reference again to FIG. 9. In the second embodiment as well, the second character recognition unit 125 first segments the notation included in the paragraph image P4 into character units, as shown in FIG. 9A. In this embodiment, the partial images C1 to C8 are cut out from the paragraph image P4.

Next, the second character recognition unit 125 causes the symbol recognition unit 127 to perform character recognition processing on the notation included in each of the partial images C1 to C8. In this embodiment, the notation included in the partial image C1 is recognized as "¥", the notation included in the partial image C4 is recognized as ",", and the notation included in the partial image C8 is recognized as "-", while the notation included in the other partial images C2 to C3 and C5 to C7 is recognized as not being a symbol. In the second embodiment, the output unit 129 saves the positions of the partial images C that contain symbol notation together with the recognition results for that symbol notation.

Subsequently, the second character recognition unit 125 causes the symbol erasing unit 128 to erase the notation indicating symbols, as shown in FIG. 9B. In this embodiment, the notation of the partial images C1, C4, and C8 is erased. The paragraph image P4 from which the notation indicating symbols has been erased is referred to as the paragraph image P4A and is shown in FIG. 9C. The second character recognition unit 125 then causes the number recognition unit 126 to perform character recognition processing on the notation included in the partial images C2 to C3 and C5 to C7 contained in the paragraph image P4A, and recognizes the number indicated by each notation. In this embodiment, as shown in FIG. 9D, the notation of the partial image C2 is recognized as "2", that of the partial image C3 as "5", that of the partial image C5 as "7", that of the partial image C6 as "6", and that of the partial image C7 as "0".

Once the recognition results for all of the partial images C1 to C8 are available, the second character recognition unit 125 uses the output unit 129 to generate a character string in which the recognition results for the notations of all the partial images C1 to C8 are arranged according to the positions of the partial images C1 to C8 in the paragraph image P4. The output unit 129 then outputs this character string as the recognition result for the paragraph image P4. Specifically, the second character recognition unit 125 outputs, as the recognition result for the paragraph image P4, the character string "¥25,760-", in which "¥" (partial image C1), "2" (C2), "5" (C3), "," (C4), "7" (C5), "6" (C6), "0" (C7), and "-" (C8) are arranged in this order. In other words, the output unit 129 outputs a character string in which the numbers identified by the number recognition unit 126 and the symbols identified by the symbol recognition unit 127 are arranged. The arrangement of the numbers and symbols in this character string corresponds to their arrangement in the paragraph image P4.
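As a rough illustration of this arrangement step, the sketch below merges the stored symbol results and the number results by position. The dictionaries keyed by partial-image index and the function name `assemble` are assumptions made for the example only.

```python
from typing import Dict

YEN = "\u00a5"  # the yen sign


def assemble(results: Dict[int, str]) -> str:
    """Join per-part recognition results in the order of their positions."""
    return "".join(results[position] for position in sorted(results))


# Symbol results saved with their positions (C1, C4, C8).
symbols = {1: YEN, 4: ",", 8: "-"}
# Number results obtained from the symbol-free image (C2, C3, C5, C6, C7).
numbers = {2: "2", 3: "5", 5: "7", 6: "6", 7: "0"}

text = assemble({**symbols, **numbers})
print(text)
```

With the values described for the paragraph image P4, the merged string is the yen sign followed by "25,760-".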

FIG. 12 is a diagram showing an example of recognition result data in the second embodiment. The recognition result data RDA shown in FIG. 12 represents the result of character recognition performed on the receipt image RI of FIG. 2. In the recognition result data RDA, the date recognition result D1, the destination recognition result D2, the proviso recognition result D4, and the issuer recognition result D5 are identical to those in the recognition result data RD shown in FIG. 3. In the recognition result data RD shown in FIG. 3, however, the amount recognition result D3 is "25760", a character string consisting only of numbers. In the recognition result data RDA shown in FIG. 12, by contrast, the amount recognition result D3 is "¥25,760-", which contains both numbers and symbols.

As described above, in the second embodiment the output unit 129 outputs a character string in which the numbers identified by the number recognition unit 126 and the symbols identified by the symbol recognition unit 127 are arranged. The arrangement of the numbers and symbols in this character string corresponds to their arrangement in the paragraph image P4. That is, in the second embodiment, the output unit 129 outputs, as the recognition result for the paragraph image P4, a character string in which the recognition results for all of the notations contained in the paragraph image P4 are arranged. As a result, the recognition results for the notations in the paragraph image P4 are recorded in the recognition result data without omission. This improves convenience, for example, when the amount notation in the paragraph image P4 includes digits after a decimal point, or when the currency of the amount is unknown and must be identified from a currency notation such as "¥".

Note that the output unit 129 may be configurable so that, among the symbols identified by the symbol recognition unit 127, only specific symbols are included in the recognition result character string, or only specific symbols are excluded from it. For example, when the numeric notation "1,234,567.89" is recognized, the commas separating digit groups may be unnecessary, but the numeric value of the recognition result changes if the dot corresponding to the decimal point is not included. In this case, the output unit 129 is set in advance so that, among the recognized symbols, dots are included in the recognition result and other symbols are not. As a result, the recognition result for the above notation is output as "1234567.89".
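The configurable filtering can be sketched as follows, assuming each recognized character is tagged with whether it came from the symbol recognizer. The `keep` set, the tuple format, and the function name `filter_symbols` are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def filter_symbols(recognized: Iterable[Tuple[str, bool]], keep: Set[str]) -> str:
    """Keep every number, and keep a symbol only if it is in `keep`.

    Each item is (character, is_symbol).
    """
    return "".join(
        char for char, is_symbol in recognized
        if not is_symbol or char in keep
    )


# Tag each character of the example notation: non-digits count as symbols.
notation = "1,234,567.89"
recognized: List[Tuple[str, bool]] = [(ch, not ch.isdigit()) for ch in notation]

# Setting: include dots, exclude every other symbol.
result = filter_symbols(recognized, keep={"."})
print(result)  # 1234567.89
```

An exclusion-style setting could be expressed the same way by passing every symbol except the unwanted ones in `keep`.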

[Third Embodiment]
Next, a third embodiment will be described. In the examples below, elements whose functions are the same as in the first or second embodiment are denoted by the reference signs used in the description of those embodiments, and detailed descriptions of them are omitted as appropriate. In the first and second embodiments, the second character recognition unit 125 erases the symbols recognized by the symbol recognition unit 127 from the image and then performs character recognition of numbers with the number recognition unit 126. In the third embodiment, the number recognition unit 126 and the symbol recognition unit 127 each perform character recognition on all of the notations contained in the paragraph image P4, and the symbol notations are determined based on the accuracy of the recognition results. The notations determined to be symbols are then erased from the paragraph image, and the number recognition unit 126 performs character recognition of the numbers.

The character recognition method of the second character recognition unit 125 in the third embodiment will be described with reference to FIG. 13. In the third embodiment as well, the second character recognition unit 125 first divides the notation contained in the paragraph image P4 into individual characters, as shown in FIG. 13A. In this embodiment, partial images C1 to C8 are cut out from the paragraph image P4.

Next, the second character recognition unit 125 uses the symbol recognition unit 127 to perform character recognition on the notations in the partial images C1 to C8. In the third embodiment, the symbol recognition unit 127 outputs, together with the recognition result for the notation of each of the partial images C1 to C8, accuracy information for each recognition result. Hereinafter, a recognition result from the symbol recognition unit 127 is referred to as a "second recognition result", and the accuracy of a second recognition result is referred to as the "second accuracy". In this embodiment, as shown in FIG. 13B, the second recognition result for the notation of partial image C1 is "¥" with a second accuracy of 95%. For partial image C2 it is ">" with 15%, for C3 "$" with 18%, for C4 "," with 93%, for C5 "/" with 28%, for C6 "&" with 11%, for C7 "@" with 20%, and for C8 "-" with 95%. Note that the "%" notation for the accuracies is omitted in FIG. 13.

The second character recognition unit 125 also uses the number recognition unit 126 to perform character recognition on the notations in the partial images C1 to C8. The number recognition unit 126 likewise outputs, together with the recognition result for the notation of each of the partial images C1 to C8, accuracy information for each recognition result. Hereinafter, a recognition result from the number recognition unit 126 is referred to as a "first recognition result", and the accuracy of a first recognition result is referred to as the "first accuracy". In this embodiment, as shown in FIG. 13C, the first recognition result for the notation of partial image C1 is "7" with a first accuracy of 28%. For partial image C2 it is "2" with 93%, for C3 "5" with 95%, for C4 "1" with 20%, for C5 "7" with 94%, for C6 "6" with 95%, for C7 "0" with 96%, and for C8 "7" with 11%.

The second character recognition unit 125 uses the symbol erasing unit 128 to compare the first accuracy and the second accuracy for each of the partial images C1 to C8, and determines that the notation of any partial image C whose second accuracy is higher is a symbol. In this embodiment, the notations of partial images C1, C4, and C8 are determined to be symbols. Thereafter, as in the first embodiment, the second character recognition unit 125 uses the symbol erasing unit 128 to erase the notations representing symbols, as shown in FIG. 9C. The second character recognition unit 125 then uses the number recognition unit 126 to perform character recognition on the paragraph image P4A from which the symbol notations have been deleted. The output unit 129 outputs the recognition result from the number recognition unit 126 as the recognition result for the paragraph image P4.

That is, in the third embodiment, the paragraph image P4 is divided into a plurality of partial images C1 to C8, and the number recognition unit 126 recognizes the notation in each of the partial images C1 to C8 as a number and calculates a first accuracy indicating the accuracy of that recognition result. The symbol recognition unit 127 recognizes the notation in each of the partial images C1 to C8 as a symbol and calculates a second accuracy indicating the accuracy of that recognition result. Based on the result of comparing the first accuracy with the second accuracy, the symbol erasing unit 128 identifies the partial images C that contain symbol notations and erases them from the paragraph image P4.
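The comparison of the two accuracies can be sketched with the values given for FIG. 13B and FIG. 13C. The tuple layout and the function name `find_symbol_parts` are assumptions for illustration; only the rule "the second accuracy is higher" comes from the description.

```python
from typing import List, Tuple

YEN = "\u00a5"  # the yen sign

# (part, first result, first accuracy %, second result, second accuracy %)
Part = Tuple[str, str, int, str, int]

PARTS: List[Part] = [
    ("C1", "7", 28, YEN, 95),
    ("C2", "2", 93, ">", 15),
    ("C3", "5", 95, "$", 18),
    ("C4", "1", 20, ",", 93),
    ("C5", "7", 94, "/", 28),
    ("C6", "6", 95, "&", 11),
    ("C7", "0", 96, "@", 20),
    ("C8", "7", 11, "-", 95),
]


def find_symbol_parts(parts: List[Part]) -> List[str]:
    """Return the parts whose second (symbol) accuracy exceeds the first (number) accuracy."""
    return [
        name
        for name, _number, first_accuracy, _symbol, second_accuracy in parts
        if second_accuracy > first_accuracy
    ]


symbol_parts = find_symbol_parts(PARTS)
print(symbol_parts)  # ['C1', 'C4', 'C8']
```

The parts returned here are the ones that would be erased before the number recognition unit processes the image again.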

In this way, the third embodiment identifies the symbol notations in the paragraph image P4 using not only the recognition results from the symbol recognition unit 127 but also those from the number recognition unit 126, so the symbol notations can be identified more accurately. For example, even for a symbol whose shape resembles a number, whether the notation is a symbol or a number can be determined based on the recognition results and accuracies from both the symbol recognition unit 127 and the number recognition unit 126.

In the description above, the second character recognition unit 125 determines the partial images C that contain symbol notations based on the first accuracy and the second accuracy, erases those partial images C, and then performs character recognition again with the number recognition unit 126. The embodiment is not limited to this. The second character recognition unit 125 may instead select, for the notation of each of the partial images C1 to C8, whether to adopt the first recognition result or the second recognition result based on the first accuracy and the second accuracy. In this case, the second character recognition unit 125 arranges the selected recognition results according to the positions of the partial images C1 to C8 in the paragraph image P4 and outputs them as the recognition result for the paragraph image P4.

For example, in FIG. 13, the first accuracy for the notation of partial image C1 is 28% and the second accuracy is 95%. The more probable recognition result for the notation of partial image C1 is therefore the second recognition result, "¥". Similarly, "2" is selected as the more probable recognition result for partial image C2, "5" for C3, "," for C4, "7" for C5, "6" for C6, "0" for C7, and "-" for C8. The second character recognition unit 125 uses the output unit 129 to arrange the recognition results for the notations of the partial images C1 to C8 in the same order as the partial images C1 to C8 in the paragraph image P4, giving "¥25,760-", and outputs this as the recognition result for the paragraph image P4.
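This selection variant can be sketched as below, using the accuracies listed for FIG. 13B and FIG. 13C. The function name `choose_results` and the tuple layout are hypothetical, and giving ties to the number result is an assumption, since the description does not say how equal accuracies are handled.

```python
from typing import List, Tuple

YEN = "\u00a5"  # the yen sign

# One entry per partial image C1..C8, in reading order:
# (first result, first accuracy %, second result, second accuracy %)
Candidates = Tuple[str, int, str, int]

PARTS: List[Candidates] = [
    ("7", 28, YEN, 95),
    ("2", 93, ">", 15),
    ("5", 95, "$", 18),
    ("1", 20, ",", 93),
    ("7", 94, "/", 28),
    ("6", 95, "&", 11),
    ("0", 96, "@", 20),
    ("7", 11, "-", 95),
]


def choose_results(parts: List[Candidates]) -> str:
    """Pick, for each part, the result with the higher accuracy and join them in order."""
    chosen = []
    for number, first_accuracy, symbol, second_accuracy in parts:
        # Assumption: on a tie the number result is kept.
        chosen.append(symbol if second_accuracy > first_accuracy else number)
    return "".join(chosen)


recognized_text = choose_results(PARTS)
print(recognized_text)
```

Unlike the erase-then-recognize flow, this variant needs only one pass of each recognizer over the partial images.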

If symbols are not wanted in the recognition result for the paragraph image P4, the output unit 129 may be set so that any second recognition results contained in the recognition result for the paragraph image P4 are deleted, leaving a character string composed only of first recognition results.

[Fourth Embodiment]
Next, a fourth embodiment will be described. In the examples below, elements whose functions are the same as in the first to third embodiments are denoted by the reference signs used in the description of those embodiments, and detailed descriptions of them are omitted as appropriate. In the first to third embodiments, the second character recognition unit 125 includes the number recognition unit 126, the symbol recognition unit 127, and the symbol erasing unit 128. The symbol notations recognized by the symbol recognition unit 127 are erased by the symbol erasing unit 128, and the number recognition unit 126 then performs character recognition of the numbers. In the fourth embodiment, the symbol recognition unit 127 and the symbol erasing unit 128 are not used. Instead, symbols are erased from the paragraph image P4 by applying noise removal processing, a type of image processing, to the paragraph image P4. The number recognition unit 126 then performs character recognition of numbers on the paragraph image P4 after the noise removal processing and outputs the recognition result for the paragraph image P4.

FIG. 14 is a block diagram showing the functional configuration of a character recognition device 10A according to the fourth embodiment. The second character recognition unit 125A of the character recognition device 10A includes the number recognition unit 126, an image processing unit 132, and the output unit 129. The image processing unit 132 performs noise removal processing on the paragraph image P4 to be processed. Because noise removal processing for images is a conventional technique, a detailed description is omitted. The image processing unit 132 removes the symbols contained in the paragraph image P4 by applying, for example, a median filter or a smoothing filter to it. In other words, the image processing unit 132 is an example of a generation unit that, by performing noise removal processing on the paragraph image P4, generates an image in which the partial images C corresponding to symbols have been erased from the paragraph image P4.
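For reference, a median filter of the kind mentioned here can be sketched in a few lines. This is a generic 3x3 median filter on a grayscale grid, not the implementation of the image processing unit 132; the window size, the padding with the background value, and the toy image are assumptions.

```python
from statistics import median
from typing import List


def median_filter_3x3(image: List[List[int]], background: int = 255) -> List[List[int]]:
    """Replace each pixel with the median of its 3x3 neighborhood.

    Pixels outside the image are treated as background.
    """
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            window = [
                image[ny][nx] if 0 <= ny < height and 0 <= nx < width else background
                for ny in range(y - 1, y + 2)
                for nx in range(x - 1, x + 2)
            ]
            output[y][x] = int(median(window))
    return output


# Toy 5x8 image: 0 = ink, 255 = background.
image = [[255] * 8 for _ in range(5)]
for row in range(1, 4):
    for col in range(1, 4):
        image[row][col] = 0   # a 3x3 block standing in for a number stroke
image[2][6] = 0               # an isolated speck standing in for a small symbol

filtered = median_filter_3x3(image)
```

In this toy case the isolated speck disappears while the center of the larger block survives. The filter also erodes the corners of the block, so whether it is suitable in practice depends on the relative sizes of symbols and number strokes.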

FIG. 15 is a diagram showing an example of a receipt image to which the fourth embodiment is well suited. In addition to the numbers "25760", the amount notation N4 of the receipt image RIA in FIG. 15 contains a comma, a digit-group separator, between "5" and "7", and a hyphen, a symbol marking the end of the amount, after "0". These symbols occupy a smaller area of the paragraph image P4 than the numbers do and can therefore be removed as noise.

In addition, when the notation contained in the paragraph image P4 is divided into character-level partial images C, as shown for example in FIG. 9A, any partial image C whose size is at or below a predetermined value may be removed as noise. Removing a partial image C as noise may mean, for example, replacing it with a plain image of the same color as the background of the receipt image RI. A partial image C whose size is at or below the predetermined value may be, for example, one whose number of constituent pixels is at or below a predetermined number. This predetermined number may be a fixed value set in advance, or may be determined based on the pixel counts of the other partial images C in the same paragraph image P.
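One way to derive the threshold from the other partial images is sketched below. The use of the median pixel count, the ratio of 0.3, and the pixel counts themselves are all hypothetical choices for illustration; the description only states that the threshold may be fixed or based on the other partial images.

```python
from statistics import median
from typing import Dict, List


def small_parts(pixel_counts: Dict[str, int], ratio: float = 0.3) -> List[str]:
    """Return the parts whose pixel count is at or below ratio * median count."""
    threshold = ratio * median(pixel_counts.values())
    return [name for name, count in pixel_counts.items() if count <= threshold]


# Made-up ink pixel counts for the notation "25,760-", keyed by character.
counts = {"2": 130, "5": 125, ",": 12, "7": 110, "6": 140, "0": 135, "-": 20}

noise = small_parts(counts)
print(noise)  # [',', '-']
```

The parts returned would then be replaced with a plain background-colored image before number recognition.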

As described above, in the fourth embodiment, the image processing unit 132 performs noise removal processing on the paragraph image P4 to generate an image in which the partial images C corresponding to symbols have been erased from the paragraph image P4. This allows the partial images C corresponding to symbols to be erased from the paragraph image P4 with simple processing, reducing the processing load on the character recognition device 10.

[Modifications]
The present invention is not limited to the embodiments exemplified above. Specific modifications are exemplified below. Two or more aspects freely selected from the following examples may be combined.

[First Modification]
In the embodiments described above, the receipt image RI (in particular the paragraph image P4 representing numbers and symbols other than numbers) is an image obtained by reading a receipt containing handwritten numbers. The embodiments are not limited to this and are also applicable when the numbers in the receipt image RI are characters printed in type (for example, the various fonts used in text editors and cash register terminals, or the typefaces used in letterpress printing).

[Second Modification]
In the embodiments described above, the image to be processed by the character recognition device 10 is the receipt image RI. The image to be processed is not limited to this and may be any image generated by, for example, reading a medium containing numbers (such as a document) with a scanner or photographing it with a camera. Media containing numbers include documents from a wide range of fields, such as accounting documents (for example, invoices, financial statements, transfer forms, and quotations) and record sheets on which measured values from experiments or observations are recorded.

[Third Modification]
In the embodiments described above, the character recognition device 10 performs character recognition processing on the receipt image RI transmitted from the terminal device 20 and returns the recognition result data RD to the terminal device 20. The embodiments are not limited to this. For example, the character recognition device 10 may itself generate the image to be processed. Specifically, a user may mail receipts to the provider of a receipt digitization service, and the provider may read the receipts with a scanner connected to the character recognition device 10 to generate the image to be processed. As another example, a program for executing the character recognition processing (for example, the control program PG1 in FIG. 6) may be installed on the terminal device 20. In this case, the terminal device 20 functions as the character recognition device 10.

REFERENCE SIGNS LIST: 1: character recognition system; 10, 10A: character recognition device; 120: control unit; 121: image acquisition unit; 122: character area identification unit; 123: character type determination unit; 124: first character recognition unit; 125, 125A: second character recognition unit; 126: number recognition unit; 127: symbol recognition unit; 128: symbol erasing unit; 129: output unit; 130: recognition result generation unit; 132: image processing unit; 140: storage unit; 160: communication unit; 20: terminal device; 220: control unit; 222: image transmission unit; 224: recognition result data reception unit; 240: storage unit; 260: communication unit; 280: operation unit; 290: display unit; NW: network; RD, RDA: recognition result data; RI, RIA: receipt image.

Claims

1. An information processing apparatus comprising:
a generation unit that generates a second image by deleting, from a first image representing numbers and symbols other than numbers, a first portion corresponding to the symbols; and
a number recognition unit that recognizes the numbers included in the second image generated by the generation unit.

2. The information processing apparatus according to claim 1, further comprising a symbol recognition unit that recognizes the symbols included in the first image, wherein
the generation unit generates the second image by erasing, as the first portion, the portion corresponding to the symbols recognized by the symbol recognition unit.

3. The information processing apparatus according to claim 2, further comprising an output unit that outputs a character string in which the numbers identified by the number recognition unit and the symbols identified by the symbol recognition unit are arranged, wherein
the arrangement of the numbers and symbols in the character string corresponds to the arrangement of the numbers and symbols in the first image.

4. The information processing apparatus according to claim 2 or 3, wherein
the first image is divided into a plurality of parts,
the number recognition unit recognizes the notation in each of the plurality of parts as a number and calculates a first accuracy indicating the accuracy of the recognition result of the number,
the symbol recognition unit recognizes the notation in each of the plurality of parts as a symbol and calculates a second accuracy indicating the accuracy of the recognition result of the symbol, and
the generation unit identifies the first portion in the first image based on a result of comparing the first accuracy with the second accuracy, and deletes the first portion from the first image.

5. The information processing apparatus according to claim 1, wherein
the generation unit generates the second image by performing noise removal processing on the first image.

6. The information processing apparatus according to any one of claims 1 to 5, wherein
the first image is an image obtained by reading a document containing handwritten numbers.

7. A program that causes a processor to function as:
a generation unit that generates a second image by deleting, from a first image representing numbers and symbols other than numbers, a first portion corresponding to the symbols; and
a number recognition unit that recognizes the numbers included in the second image generated by the generation unit.