JP2011145725A - Character recognition unit and text processing device

Info

Publication number: JP2011145725A
Application number: JP2010003615A
Authority: JP
Inventors: Dai Takada
Original assignee: Murata Machinery Ltd
Current assignee: Murata Machinery Ltd
Priority date: 2010-01-12
Filing date: 2010-01-12
Publication date: 2011-07-28

Abstract

PROBLEM TO BE SOLVED: To shorten the time spent on character recognition of a plurality of originals having a common section and individual sections.

SOLUTION: A character recognition unit performs character recognition of a plurality of originals having a common section and individual sections, and is provided with: a first acquisition means for acquiring the respective images of the plurality of originals; a second acquisition means for acquiring specification information that specifies the common section, obtained on the basis of the first original, which is the first of the plurality of originals to undergo character recognition; a setting means for setting the common section and the individual sections as object regions to be subjected to character recognition for the first original, and for setting the object regions in regions other than the common section for the second and following originals; and a recognition means for performing character recognition only on the object regions.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a character recognition unit that recognizes characters in a document, and to a text processing device that processes the results of character recognition.

In recent years, character recognition devices that read the characters written on a document and make use of the reading results have come into use for applications such as reading information from business cards.

In a character recognition device for business cards, for example, an image of the card surface is first captured into the device with a scanner (image input). Then, for each item such as “company name”, “name”, “address”, and “telephone number”, a character recognition target region containing the group of characters to be recognized is set (target region setting).

Next, the individual characters contained in each target region are identified, for example by using a dictionary suited to the attribute of each item (character recognition), and the recognized characters are corrected using linguistic information such as words (post-processing).

The position at which each item appears on a business card (the layout) usually differs from company to company. Among business cards of the same kind, such as cards from the same company, however, the layout is in many cases almost identical.

Accordingly, in the character recognition device described in Patent Literature 1, the position information of each character recognition target region is associated with an item name, and the result is set and stored as a template. When business cards of the same kind are processed, the template corresponding to the card being processed is selected and used. This removes the need to set the target regions for every single card and raises the efficiency of character recognition.

Patent Literature 1: JP 2001-202475 A

However, when the character recognition device of Patent Literature 1 recognizes a plurality of business cards of the same kind, items carrying common information whose content is shared among the cards, such as “company name” and “address”, must still be recognized on every single card, just like items carrying information specific to each card, such as “name”.

Among the series of character reading steps described above, character recognition is usually the most costly in terms of both the capability required of the processing device and the processing time. With the character recognition device of Patent Literature 1 it is therefore difficult to shorten the processing time, or to reduce cost by adopting a device with lower processing capability.

The present invention has been made to solve these problems. Its object is to provide a technique that, when a plurality of business cards of the same kind are recognized, shortens the time required for character recognition by efficiently reducing the amount of data subjected to character recognition.

To solve the above problems, the invention of claim 1 is a character recognition unit that performs character recognition on a plurality of documents having a common part and individual parts, comprising: first acquisition means for acquiring an image of each of the plurality of documents; second acquisition means for acquiring specifying information that specifies the common part, obtained on the basis of the first document, that is, the document subjected to character recognition first among the plurality of documents; setting means for setting, for the first document, the common part and the individual parts as target regions to be subjected to character recognition and, for the second and subsequent documents, setting the target regions in a region other than the common part; and recognition means for performing character recognition only on the target regions.
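
As an illustration only, the region-setting rule of claim 1 can be sketched in Python as follows. The `Region` type, the function names, and the `ocr` callable are hypothetical and do not appear in the specification. The sketch also assumes that the regions outside the common part are already known as a list of rectangles, which the claim does not require.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Region:
    """Rectangle on a scanned document image (hypothetical pixel coordinates)."""
    name: str
    x: int
    y: int
    width: int
    height: int


def set_target_regions(sheet_index: int,
                       common: Sequence[Region],
                       individual: Sequence[Region]) -> List[Region]:
    """Setting means: pick the regions to be recognized on one document.

    sheet_index 0 is the first document; it gets common + individual regions.
    Later documents get only the regions outside the common part.
    """
    if sheet_index == 0:
        return list(common) + list(individual)
    return list(individual)


def recognize(image: object,
              regions: Sequence[Region],
              ocr: Callable[[object, Region], str]) -> Dict[str, str]:
    """Recognition means: run the OCR callable on the target regions only."""
    return {region.name: ocr(image, region) for region in regions}
```

Because `recognize` never sees a region that `set_target_regions` left out, the common part of the second and subsequent documents is simply never passed to the recognizer.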

The invention of claim 2 is the character recognition unit according to claim 1, further comprising storage means for storing the specifying information.

The invention of claim 3 is the character recognition unit according to claim 2, further comprising third acquisition means for acquiring the specifying information from the storage means in response to a predetermined operation input, wherein, when the third acquisition means has acquired the specifying information, (a) the character recognition unit disables the second acquisition means and (b) the setting means sets the target regions in each of the plurality of documents on the basis of the acquired specifying information.
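
The choice between the two acquisition paths in claim 3 might be sketched as below. This is one possible reading, not the claimed implementation: the `SpecifyingInfo` tuple, the function name, and the boolean return flag are assumptions introduced for illustration.

```python
from typing import Callable, Optional, Tuple

# Hypothetical representation of the specifying information:
# one (x, y, width, height) rectangle covering the common part.
SpecifyingInfo = Tuple[int, int, int, int]


def acquire_specifying_info(
    stored: Optional[SpecifyingInfo],
    derive_from_first_document: Callable[[], SpecifyingInfo],
) -> Tuple[SpecifyingInfo, bool]:
    """Return the specifying information and whether the second acquisition ran.

    stored: what the third acquisition means obtained from storage, or None
        if no stored information was selected by the operator.
    derive_from_first_document: stand-in for the second acquisition means.
    """
    if stored is not None:
        # Third acquisition succeeded, so the second acquisition is not used.
        return stored, False
    return derive_from_first_document(), True
```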

The invention of claim 4 is a text processing device comprising: the character recognition unit according to any one of claims 1 to 3; and a generation unit that generates character data for each document on the basis of the character recognition result for the common part of the first document and the character recognition result for the target regions of each of the second and subsequent documents.
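
A minimal sketch of how such a generation unit could combine the results is given below. The dictionary-based record format and all names are assumptions made for illustration; the specification does not prescribe a data format.

```python
from typing import Dict, Iterable, List


def generate_character_data(
    first_result: Dict[str, str],
    common_items: Iterable[str],
    later_results: Iterable[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Build one complete record per document.

    first_result: recognition result for the first document (all items).
    common_items: names of the items that lie in the common part.
    later_results: results for the second and subsequent documents,
        which contain only the items outside the common part.
    """
    common = {item: first_result[item] for item in common_items}
    records = [dict(first_result)]
    for result in later_results:
        # Reuse the common text from the first document; the per-document
        # result supplies the individual items.
        records.append({**common, **result})
    return records
```

Each record for a later document is thus complete even though its common part was never recognized on that document.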

According to each of the inventions of claims 1 to 4, specifying information that specifies the common part shared by the documents is acquired on the basis of the first document, and when the second and subsequent documents are recognized, the character recognition target regions are set in a region other than the common part. Character recognition of the common part therefore becomes unnecessary for those documents, the amount of data subjected to character recognition is reduced efficiently, and the time required for character recognition is shortened.

FIG. 1 is a diagram showing an example of the external appearance of the text processing device according to the embodiment.
FIG. 2 is a block diagram showing an example of the main configuration of the text processing device according to the embodiment.
FIG. 3 is a block diagram showing an example of the main functional configuration of the text processing device and the character recognition unit according to the embodiment.
FIG. 4 is a diagram showing an example of the structure of a business card.
FIG. 5 is a diagram showing an example of a selection screen for the common part.
FIG. 6 is a diagram showing an example of position information of the common part on a business card.
FIG. 7 is a diagram showing an example of a template.
FIG. 8 is a diagram showing an example of the structure of a business card.
FIG. 9 is a diagram explaining the setting of target regions on the second and subsequent business cards.
FIG. 10 is a diagram explaining the setting of target regions on the second and subsequent business cards.
FIG. 11 is a diagram showing an example of a template selection screen.
FIG. 12 is a diagram showing an example of the operation flow of the text processing device and the character recognition unit according to the embodiment.
FIG. 13 is a diagram showing an example of the operation flow of the text processing device and the character recognition unit according to the embodiment.

<1. Configuration of the text processing device:>
FIG. 1 is a diagram showing an example of the external appearance of the text processing device 1 according to the embodiment. FIG. 2 is a block diagram showing an example of the main configuration of the text processing device 1.

As shown in FIGS. 1 and 2, the text processing device 1 mainly includes a document feeding unit 31, a document placing unit 32, a reading unit 41, an input/output unit 42, a recording unit 51, an operation unit 61 provided with a display unit 62, and a document detection unit 71.

By using these components under appropriate control, the text processing device 1 provides functions such as a fax function, a copy function, a scan function, a print function, a character recognition function, and an address book registration function. The address book registration function, and the character recognition function provided by the character recognition unit 2 (FIG. 3), are described later.

The document feeding unit 31 feeds documents to the reading unit 41 by the so-called automatic document feeder (ADF) method. The image of a document fed by the document feeding unit 31 is acquired by the reading unit 41 at a predetermined position on the feeding path while the document is being fed.

As shown in FIG. 1, the document feeding unit 31 mainly has a document table 33 on which documents to be read are placed, and a document discharge tray 34 that collects documents that have been read by the reading unit 41 using the ADF scan method.

The document feeding unit 31 is configured to feed documents of various sizes to the reading unit 41, for example from small business-card-size documents to large A3-size documents.

As shown in FIG. 1, the document placing unit 32 is arranged below the document feeding unit 31 and is used for image reading by the so-called flatbed scan method, in which the reading unit 41 acquires the image of a document while the document is held stationary.

For example, the document feeding unit 31 is opened and closed relative to the document placing unit 32 so that the document to be read can be placed on the document placing unit 32.

The reading detection light projected from the reading unit 41 passes through the document placing unit 32 and is reflected by the surface of the document. The reflected light passes through the document placing unit 32 again and is read by the light receiving sensor of the reading unit 41, whereby the image of the document is acquired.

The document placing unit 32 is also configured so that, when a business card holder that holds a plurality of business cards in a matrix along the main scanning direction and the sub-scanning direction of image reading by the reading unit 41 is placed on it, the plurality of business cards can be read efficiently at one time by the flatbed scan method.

The reading unit 41 is a so-called scanner unit and, as shown in FIG. 1, is provided below the document feeding unit 31 and the document placing unit 32. The reading unit 41 reads the information on a document as image data using a scanner that has a light projecting unit, which projects detection light for image reading, and a light receiving unit, which photoelectrically converts the detection light reflected by the document surface.

The reading unit 41 is configured to read documents both as a flatbed-scan scanner unit and as an ADF-scan scanner unit.

The image data read by the reading unit 41 is compressed, for example in JPEG format, and stored in the image storage unit 23 of the RAM 22.

The input/output unit 42 is an interface unit for connecting the text processing device 1 to an external network. For example, it inputs image data supplied from an external computer to the text processing device 1 functioning as a printer, and outputs image data acquired by the text processing device 1 functioning as a scanner to an external computer or the like.

The input/output unit 42 also has an interface to a telephone line and serves, for example, as the input/output interface when the text processing device 1 functions as a fax machine.

The recording unit 51 is an image forming unit that records a toner image based on an electrostatic latent image onto recording paper by electrophotography.

For example, the recording unit 51 forms a toner image on a photosensitive drum (not shown) on the basis of the image data stored in the image storage unit 23, and transfers this toner image to recording paper supplied from a paper feed unit provided below the recording unit 51, thereby recording the image read from the document onto the recording paper.

The operation unit 61 is a so-called operation panel. It includes the display unit 62, which also functions as a touch panel, and various operation buttons including a reading start button 63.

The display unit 62 is, for example, a liquid crystal display. It functions as display means that presents the operating status of the text processing device 1, the content of operations, and the like to the operator of the text processing device 1 (hereinafter simply the “operator”) through the information it displays.

The display unit 62 also functions as a “touch panel” on which a position on the screen can be designated by touching the screen with a finger or a dedicated pen.

The operator can therefore cause the text processing device 1 to execute a predetermined process (for example, recording an image on recording paper) by giving an instruction through the “touch panel” function of the display unit 62 on the basis of what the display unit 62 shows.

In other words, the display unit 62 also functions as an input unit that accepts the operator's operations on the text processing device 1 via the touch panel and the various buttons.

In this way, the operator can obtain the status of the text processing device 1 from the display unit 62 on the operation unit 61, and can input settings, operations, and the like to the text processing device 1 using the display unit 62 and the various operation buttons.

The document detection unit 71 is, for example, a contact or non-contact sensor provided in the document feeding unit 31, and detects the presence of a document placed on the document table 33.

The MPU (Micro Processing Unit) 11 is a control processing device that exercises overall control over the functional units of the text processing device 1, and executes control and processing in accordance with a program stored in the ROM 21.

As described later, the MPU 11 also functions as a recognition unit 12, a setting unit 13, a third acquisition unit 14, a second acquisition unit 15, a first-document detection unit 16, and a generation unit 17.

The MPU 11, the ROM 21, the RAM 22, a storage device 26, the reading unit 41, the recording unit 51, and so on are electrically connected to one another via a signal line 19. The MPU 11 can therefore execute, for example, the recording process of the recording unit 51 at a predetermined timing.

The RAM (Random Access Memory) 22 is a readable and writable volatile memory. It functions as the image storage unit 23, which stores images read by the reading unit 41 and images input from the input/output unit 42, and as a work memory that temporarily stores processing information of the MPU 11.

The ROM (Read Only Memory) 21 is a read-only memory and stores, among other things, the program that operates the MPU 11. A readable and writable nonvolatile memory (for example, a flash memory) may be used in place of the ROM 21.

The storage device 26 is, for example, a readable and writable nonvolatile memory such as a flash memory, and persistently records various kinds of information such as setting information for the text processing device 1.

The storage device 26 also includes a template storage unit 27 and an address storage unit 28, which store the templates and the address book described later, respectively.

<2. Configuration of the character recognition unit:>
◎ The text processing device 1 in relation to the address book registration function:
FIG. 3 is a block diagram showing an example of the main functional configuration of the text processing device 1 and the character recognition unit 2 according to the embodiment.

FIG. 3 shows, among the functional components of the text processing device 1, the functional units related to the function of reading an image of a document such as a business card, performing character recognition on the image, and registering the necessary information from the recognition result in the address book (hereinafter simply the “address book registration function”).

As shown in FIG. 3, the text processing device 1 provides the address book registration function mainly by operating functional units such as the character recognition unit 2, the generation unit 17, and the address storage unit 28.

The generation unit 17 is a functional unit realized by the MPU 11. It generates registration data to be registered in the address book on the basis of the character recognition results that the character recognition unit 2 produces for various documents such as business cards.

The address storage unit 28 is a functional unit realized by the storage device 26, and stores the registration data generated by the generation unit 17 as the address book.

◎ The character recognition unit 2:
The character recognition unit 2 shown in FIG. 3 is a functional unit that performs character recognition on images of various documents and thereby obtains character data as the recognition result.

The character recognition unit 2 includes, as its main functional units, a first acquisition unit 43, the image storage unit 23, the display unit 62, the recognition unit 12, the setting unit 13, the third acquisition unit 14, the second acquisition unit 15, the first-document detection unit 16, and the template storage unit 27.

By operating these functional units as appropriate, the character recognition unit 2 handles character recognition of a plurality of documents that have a common part shared by all the documents and individual parts specific to each document, such as business cards from the same company or forms of the same kind. In doing so, it acquires specifying information that specifies the common part on the basis of the first document, that is, the document subjected to character recognition first among the plurality of documents.

When recognizing the second and subsequent documents, the character recognition unit 2 sets the character recognition target regions in a region other than the common part.

For the second and subsequent documents, the common part therefore need not be recognized again, and the time required for character recognition is shortened by efficiently reducing the amount of data subjected to character recognition.
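
The saving can be made concrete with a simple model. The symbols below are assumptions introduced for illustration and do not appear in the specification: $N$ is the number of documents, $t_c$ is the time to recognize the common part of one document, and $t_i$ is the time to recognize its individual parts, both taken as constant across documents. Overheads such as acquiring the specifying information are ignored.

```latex
\begin{align*}
T_{\text{every sheet}} &= N\,(t_c + t_i) \\
T_{\text{proposed}}    &= (t_c + t_i) + (N - 1)\,t_i = t_c + N\,t_i \\
T_{\text{every sheet}} - T_{\text{proposed}} &= (N - 1)\,t_c
\end{align*}
```

Under this model the saving grows linearly with the number of documents and with the recognition time attributable to the common part.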

As already noted, in the configuration example of FIG. 2 the recognition unit 12, the setting unit 13, the third acquisition unit 14, the second acquisition unit 15, and the first-document detection unit 16 are realized by the MPU 11. Each of these functional units may instead be realized by, for example, a dedicated hardware circuit.

The first acquisition unit 43 is a functional unit realized by, for example, the reading unit 41 or the input/output unit 42, and acquires the images of the plurality of documents to be processed by the text processing device 1.

The acquired image data is stored in the image storage unit 23 of the RAM 22 by the MPU 11, and is displayed on the display unit 62 either as an image or as character data that has undergone character recognition by the recognition unit 12.

When the reading unit 41 operates as the first acquisition unit 43, the first acquisition unit 43 acquires the document images by performing an image reading process, such as the ADF scan method, on the business cards to be processed.

When the input/output unit 42 operates as the first acquisition unit 43, the first acquisition unit 43 acquires the document images by inputting, to the text processing device 1, document images that have already been acquired by an external computer or the like.

In this way, the reading unit 41 and the input/output unit 42 function as the first acquisition unit 43 of the character recognition unit 2.

The first-document detection unit 16 is a functional unit that detects the first document, that is, the document subjected to character recognition first among the plurality of documents having a common part and individual parts. The second acquisition unit 15 acquires the specifying information that specifies the common part shared by the documents, obtained on the basis of the first document detected by the first-document detection unit 16.

The specifying information acquired by the second acquisition unit 15 is stored in the template storage unit 27 as a template, that is, a record containing the specifying information that specifies the common part of a plurality of documents having a common part and individual parts (hereinafter simply a “template”). It is later acquired from the template storage unit 27 by the third acquisition unit 14.

The third acquisition unit 14 is a functional unit that, in response to a predetermined operation input, acquires from the template storage unit 27 the specifying information that specifies the common part shared by the plurality of documents, such as business cards, that are subject to character recognition.

Specifically, for example, the operator selects, from among the templates stored in the template storage unit 27, the template that corresponds to the documents to be recognized, and performs an input operation confirming the selection via the touch panel or the like. In response to this input operation, the third acquisition unit 14 acquires the specifying information contained in the selected template.

Alternatively, for example, the following configuration may be adopted. First, the third acquisition unit 14 automatically identifies the desired template by overlaying and comparing the image of the business card to be processed with the common-part region contained in each template, and displays that template on the display unit 62. Next, the operator performs an input via the touch panel or the like confirming that the template is the appropriate one for the business card being processed. In response to this input operation, the third acquisition unit 14 acquires the specifying information from the identified template.
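
The specification does not say how the overlay comparison is computed. One possible sketch, under several assumptions, is given below: page images are binarized into nested lists of 0 and 1, each template stores a reference bitmap of its common-part rectangle, and the candidate is the template with the highest fraction of matching pixels above a threshold. All names, the bitmap representation, and the threshold value are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]          # binarized page: 1 = ink, 0 = background
Box = Tuple[int, int, int, int]  # (x, y, width, height) of the common part


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by box out of the page image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def match_score(image: Image, box: Box, reference: Image) -> float:
    """Fraction of pixels in the boxed area that equal the reference bitmap."""
    patch = crop(image, box)
    total = sum(len(row) for row in reference)
    if total == 0 or len(patch) != len(reference):
        return 0.0
    same = 0
    for patch_row, ref_row in zip(patch, reference):
        if len(patch_row) != len(ref_row):
            return 0.0
        same += sum(1 for p, r in zip(patch_row, ref_row) if p == r)
    return same / total


def select_template(image: Image,
                    templates: Dict[str, Tuple[Box, Image]],
                    threshold: float = 0.9) -> Optional[str]:
    """Return the name of the best-matching template, or None if none qualifies."""
    best_name, best_score = None, threshold
    for name, (box, reference) in templates.items():
        score = match_score(image, box, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

In the configuration described above, the returned candidate would only be shown to the operator; the specifying information is acquired after the operator confirms it.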

The setting unit 13 is a functional unit that sets the character recognition target regions on a document image, and the recognition unit 12 is a functional unit that performs character recognition only on the target regions set by the setting unit 13.

Next, the operation of each functional unit shown in FIG. 3 is described in detail with reference to the flowcharts of FIGS. 12 and 13 and the other drawings.

<3. Operation of the text processing device 1 and the character recognition unit 2:>
FIGS. 12 and 13 show an example of the operation flow of the text processing device 1 and the character recognition unit 2 when the plurality of documents having a common part and individual parts are a plurality of business cards (hereinafter simply the “plurality of business cards”) and these business cards are to be processed to create an address book.

◎ Character recognition and address book creation based on the first business card:
The following describes, with reference to the flowcharts shown in FIGS. 12 and 13, the operation of the text processing device 1 and the character recognition unit 2 when no template has yet been stored in the template storage unit 27, either for these business cards or for any other documents. Here, a template (hereinafter simply referred to as a “template”) is the specific information that specifies the common part of a plurality of documents having a common part and individual parts.

先ず、名刺の情報をアドレス帳へと登録するために、操作者が複数の名刺を文章処理装置１の原稿台３３に載置することにより、該載置が原稿検出部７１によって検出されて、検出信号が、ＭＰＵ１１へと供給される。 First, in order to register the information on the business card in the address book, the operator places a plurality of business cards on the document table 33 of the text processing device 1, and the placement is detected by the document detection unit 71. A detection signal is supplied to the MPU 11.

そして、該検出信号を供給されたＭＰＵ１１からの制御によって、表示部６２には、文章処理装置１の動作モードの選択を促す画面が表示される。 Then, a screen that prompts the user to select an operation mode of the text processing apparatus 1 is displayed on the display unit 62 under the control of the MPU 11 supplied with the detection signal.

操作者は、表示部６２に表示された該画面において「複数の同種の原稿からのアドレス帳登録」モード（以下、単に「アドレス帳登録モード」と称する）を選択する。 The operator selects the “address book registration from a plurality of documents of the same type” mode (hereinafter simply referred to as “address book registration mode”) on the screen displayed on the display unit 62.

When this mode is selected, the text processing device 1 starts the “address book creation processing from a plurality of business cards of the same type” shown in FIG. 12.

Next, when the operator presses the reading start button 63 in the address book registration mode, a signal for this operation is supplied to the MPU 11. The first draft detection unit 16 then determines that the business card currently being processed is the first business card and sets a first sheet detection flag, which indicates the first business card, in the RAM 22.

ＭＰＵ１１は、１枚目検出フラグの設定状態によって、処理対象の名刺が１枚目であるか否かを判定する（ステップＳ１０）。 The MPU 11 determines whether or not the business card to be processed is the first sheet according to the setting state of the first sheet detection flag (step S10).

ここでは、１枚目検出フラグがセットされているので、ＭＰＵ１１は処理をステップＳ２０へと移し、１以上の既存のテンプレートがテンプレート格納部２７に格納されているか否かを判定する。 Here, since the first sheet detection flag is set, the MPU 11 moves the process to step S20 and determines whether one or more existing templates are stored in the template storage unit 27.

When character recognition is performed on a plurality of business cards of the same type, informing the operator in this way that the first business card is the current recognition target, and prompting the operator to indicate whether a template specifying the common part already exists, reduces the occurrence of situations in which time is wasted creating an unnecessary template even though one already exists. Character recognition for a plurality of business cards of the same type can thereby be sped up.

Furthermore, creating a template from the first business card whenever no suitable template exists shortens the processing time needed for character recognition of the whole set of business cards, compared with performing character recognition on every item of the first several cards and only afterwards setting the common part based on the subsequent cards.

なお、１枚目検出フラグに代えて、例えば、処理する名刺を計数するカウンタを用いることによって１枚目の名刺を検出してもよい。 Note that the first business card may be detected by using, for example, a counter that counts business cards to be processed instead of the first sheet detection flag.
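The two ways of detecting the first business card mentioned above, a flag and a counter, can be sketched as follows. This is a minimal illustration under assumed names (`FirstSheetFlag`, `SheetCounter`, and their methods are hypothetical), not the implementation in the MPU 11.

```python
class FirstSheetFlag:
    """Flag-based detection: set when reading starts, cleared after registration."""

    def __init__(self):
        self.is_set = False

    def on_read_start(self):
        # Corresponds to setting the first sheet detection flag in RAM.
        self.is_set = True

    def on_registered(self):
        # Cleared once the first card has been registered in the address book.
        self.is_set = False

    def is_first(self):
        return self.is_set


class SheetCounter:
    """Counter-based alternative: the first card is the one seen at count 0."""

    def __init__(self):
        self.processed = 0

    def on_registered(self):
        self.processed += 1

    def is_first(self):
        return self.processed == 0
```

Either object can answer the step S10 question ("is this the first card?"); the counter additionally records how many cards have been processed.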

Here, since no existing template is present as described above, the MPU 11 moves the process to step S50 in order to create a template based on the business card currently being processed.

ステップＳ５０では、第１取得部４３（ここでは、読取部４１）が１枚目の名刺の画像データを取得する。また、取得された画像データはＭＰＵ１１によって画像格納部２３（図３）格納される。 In step S50, the first acquisition unit 43 (here, the reading unit 41) acquires the image data of the first business card. The acquired image data is stored by the MPU 11 in the image storage unit 23 (FIG. 3).

Here, FIG. 4 is a diagram showing an example of the configuration of the first business card 6a. As shown in FIG. 4, a business card generally has common parts, which carry items such as “company name” and “address” whose content is the same on every card, and individual parts, which carry items such as “name” whose content differs from card to card.

In the example shown in FIG. 4, the common parts 8a, 8b, and 8c contain the “company name”, “address”, and “telephone number”, respectively, and the individual parts 9a, 9b, 9c, and 9d contain the “affiliation”, “position”, “name”, and “e-mail address”, respectively.

As shown in FIG. 4, each individual part and each common part occupies its own region within the overall area of the business card 6a.

A common part needs character recognition only once: after its recognition result has been stored, it does not have to be recognized again. An individual part, by contrast, must be recognized for each document.

Next, returning to FIG. 12, the setting unit 13 analyzes the entire image data of the acquired business card and sets, for each item in the common and individual parts (such as “company name”, “name”, “address”, and “telephone number”), a character recognition target area containing the group of characters to be recognized (step S60).

すなわち、設定部１３は、１枚目の原稿については共通部分と個別部分とを文字認識すべき対象領域として設定する。なお、この段階では、通常、文字認識処理は行われておらず、公知の画像処理技術によって対象領域の設定が行われる。 That is, the setting unit 13 sets the common part and the individual part as target areas to be recognized for the first original. At this stage, character recognition processing is usually not performed, and the target area is set by a known image processing technique.

文字認識の対象領域が設定されると、認識部１２は、設定された対象領域のみを対象として、該対象領域から個々の文字の画像を抽出し、ＲＯＭ２１または記憶装置２６などに記憶されている辞書の内容と照合することによって個々の文字が何であるかを認識する文字認識を実施する（ステップＳ７０）。 When the target area for character recognition is set, the recognition unit 12 extracts an image of each character from the set target area and stores it in the ROM 21 or the storage device 26. Character recognition is performed to recognize what each character is by collating with the contents of the dictionary (step S70).

次に、ＭＰＵ１１は、認識部１２によって行われた各対象領域毎の文字認識の結果に基づいて各対象領域の項目名を推測する。 Next, the MPU 11 estimates the item name of each target area based on the result of character recognition for each target area performed by the recognition unit 12.

For this estimation, a method is used in which, for example, a target area is assumed to be an “address” item when its character group contains characters such as 「都道府県」 (the Japanese prefecture designators).
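The keyword-based estimation of item names can be sketched as follows. Only the address rule comes from the description above; the e-mail and telephone rules, the English item names, and the function name `guess_item_name` are assumptions added for illustration.

```python
def guess_item_name(text):
    """Guess an item name from recognized text using simple keyword rules.

    Returns None when no rule applies, leaving the item for the operator.
    """
    # Rule from the description: prefecture designators suggest an address.
    if any(ch in text for ch in "都道府県"):
        return "address"
    # Assumed rule: an at sign suggests an e-mail address.
    if "@" in text:
        return "email"
    # Assumed rule: mostly digits and separators suggest a telephone number.
    digits = sum(ch.isdigit() for ch in text)
    if digits >= 9 and all(ch.isdigit() or ch in "-() +" for ch in text):
        return "telephone"
    return None
```

Rules this simple will misfire (a personal name containing 都, for instance), which is one reason the flow described here lets the operator correct item names afterwards.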

Here, FIG. 5 is a diagram showing an example of the selection screen for the common part. Once the MPU 11 has estimated the item name of each target area, it displays on the display unit 62, one row per item as shown in FIG. 5, the estimated item name, the content of the item (that is, the character data for the character group contained in the target area), and an interface (the “Y”/“N” buttons in FIG. 5) that lets the operator select whether the item belongs to a common part.

This screen also provides an interface for switching to a correction mode in which the operator can correct an item name or its content by pointing to it on the display unit 62. The operator judges whether the recognition results displayed on the display unit 62 are valid and, if they are not, switches the character recognition unit 2 to the correction mode and corrects the character recognition results (step S80).

In the correction mode, the following kinds of corrections are made, for example: correcting characters that were misrecognized; correcting a misidentified item name, such as an e-mail address recognized as a telephone number; supplying missing information; and changing the item name of the second line when an address written over two lines on the document has had only its first line recognized as an address and its second line recognized as some other item.

また、修正については、ＲＯＭ２１などに格納された単語辞書を参照することなどによる自動修正が採用されてもよい。 For correction, automatic correction by referring to a word dictionary stored in the ROM 21 or the like may be employed.

In the example shown in FIG. 5, the position information of the common part is corrected automatically in conjunction with the correction of the character recognition result. Alternatively, a configuration may be adopted in which the acquired business card image is displayed on the display unit 62 with the target areas distinguishable from the surrounding area, and the operator corrects the region of the common part directly on the image using the touch panel.

Since the common part can thus also be specified from the image rather than from the character recognition result, doing so does not impair the usefulness of the present invention. However, acquiring the specific information of the common part from the character recognition result of the first document, as illustrated in FIG. 5, allows a smaller display unit than specifying the common part on a displayed image. Even a small text processing device with a small display can therefore specify the common part appropriately and perform character recognition at high speed on a plurality of documents having a common part and individual parts.

Once the necessary corrections to the character recognition results are finished, the operator judges whether each item displayed on the display unit 62 corresponds to a common part or to an individual part, and selects the common parts by pressing the “Y” button of the “Y”/“N” buttons on the touch panel for each recognition result that corresponds to a common part (step S90).

FIG. 6 is a diagram illustrating the position information of the common part 8a in the business card being processed. As shown in FIG. 6, the position of a region on the business card is specified, for example, by the coordinates of the point P1 at the upper left corner and the point P2 at the lower right corner of the region, in a coordinate system whose origin P0 is the upper left corner of the business card.
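A region specified by the two corner points P1 and P2 can be represented as in the following sketch. It assumes that x increases to the right and y increases downward from the origin P0, and that coordinates are integers in some unit such as pixels; neither the axis directions nor the unit is stated in the description. The class name `Region` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned region; the origin P0 is the card's upper left corner."""
    left: int
    top: int
    right: int
    bottom: int

    @classmethod
    def from_points(cls, p1, p2):
        """Build a region from P1 (upper left) and P2 (lower right)."""
        (x1, y1), (x2, y2) = p1, p2
        if x2 <= x1 or y2 <= y1:
            raise ValueError("P2 must lie below and to the right of P1")
        return cls(x1, y1, x2, y2)

    @property
    def width(self):
        return self.right - self.left

    @property
    def height(self):
        return self.bottom - self.top

    def contains(self, x, y):
        """True if (x, y) is inside; right and bottom edges are exclusive."""
        return self.left <= x < self.right and self.top <= y < self.bottom
```

For example, `Region.from_points((20, 15), (180, 40))` describes a region 160 units wide and 25 units high.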

操作者は、共通部分の選択が完了すれば、画面上の「確認」ボタンを指示することによって共通部分の選択を終了する。 When the selection of the common part is completed, the operator terminates the selection of the common part by instructing a “confirm” button on the screen.

図１２に戻って、該指示操作によって共通部分の選択が終了することにより、共通部分についての選択結果は、ＭＰＵ１１へと供給される。 Returning to FIG. 12, when the selection of the common part is completed by the instruction operation, the selection result for the common part is supplied to the MPU 11.

The second acquisition unit 15 then acquires, based on the operator's selections, the specific information that specifies each common part (such as the item name, the item content, and each common part), creates a template from it, and stores the template in the template storage unit 27 (step S100).

なお、既述した修正モードで修正が行われた場合は、修正内容が反映された特定情報が取得される。 In addition, when correction is performed in the correction mode described above, specific information reflecting the correction content is acquired.

図７は、テンプレートの一例を示す図であり、該テンプレートは、図４に示される名刺に対応している。 FIG. 7 is a diagram showing an example of a template, and the template corresponds to the business card shown in FIG.

図７に示されるテンプレートでは、例えば、各共通部分のそれぞれについて項目名と、内容と、共通部分の左上端部の座標および右下端部の座標とが各共通部分の特定情報として対応づけられており、一組の同種の名刺群に対しては、通常、１つのテンプレートが設定される。 In the template shown in FIG. 7, for example, the item name, the content, the coordinates of the upper left corner and the coordinates of the lower right corner of each common portion are associated as specific information of each common portion. In general, one template is set for a group of business cards of the same type.

Each template is stored in the template storage unit 27 using a data structure that has, for example, a header portion and a data portion. The header portion records the name of the template, its data size, the number of common parts it contains, and the number of characters in the item name and in the content of each common part. The data portion stores, in sequence for each common part, the item name, the content, the upper left coordinates of the region, and the lower right coordinates of the region.

テンプレートにおける各共通部分の特定情報は、ヘッダ部分の情報を参照することによって順不同に参照され得る。 The specific information of each common part in the template can be referred to in any order by referring to the information of the header part.

The template storage unit 27 also stores, for example, a template library management table that lists the name, storage address, and so on of each template, so that the templates can be accessed in any order by consulting this table.
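The template layout and the management table described above can be modeled in memory as in the following sketch. It is an illustration only: the class names `CommonPart`, `Template`, and `TemplateLibrary` are hypothetical, the data size field of the header is omitted, and a list index stands in for the storage address.

```python
from dataclasses import dataclass, field


@dataclass
class CommonPart:
    item_name: str
    content: str
    upper_left: tuple   # (x, y) of the region's upper left corner
    lower_right: tuple  # (x, y) of the region's lower right corner


@dataclass
class Template:
    name: str
    parts: list = field(default_factory=list)

    def header(self):
        """Summary comparable to the header portion described in the text."""
        return {
            "name": self.name,
            "part_count": len(self.parts),
            # (item name length, content length) for each common part
            "lengths": [(len(p.item_name), len(p.content)) for p in self.parts],
        }


class TemplateLibrary:
    """Stands in for the template storage unit and its management table."""

    def __init__(self):
        self._table = {}  # management table: template name -> slot index
        self._slots = []  # stored templates

    def store(self, template):
        self._table[template.name] = len(self._slots)
        self._slots.append(template)

    def names(self):
        return list(self._table)

    def load(self, name):
        """Look a template up by name, in any order, via the table."""
        return self._slots[self._table[name]]
```

Because every lookup goes through the table, templates can be retrieved in any order regardless of the order in which they were stored.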

In the above description, the second acquisition unit 15 acquires the specific information of the common parts as determined by the operator's operation. However, the second acquisition unit 15 may instead specify the common parts automatically, based on information such as the item names of common information stored in advance in the ROM 21 or the like, and then set and record the template.

That is, the second acquisition unit 15 acquires the specific information that specifies the common part either in response to an operator's operation based on the first document or automatically based on the first document.

なお、図７に示されるテンプレートでは、各共通部分の位置情報についての特定情報として、共通部分の領域範囲の左上端点と、右下端点の座標を採用しているが、例えば、共通部分以外の領域範囲を示す座標情報等の位置情報を採用しても本発明の有用性を損なうことはない。 In the template shown in FIG. 7, the coordinates of the upper left end point and the lower right end point of the area range of the common part are adopted as the specific information about the position information of each common part. Even if position information such as coordinate information indicating a region range is employed, the usefulness of the present invention is not impaired.

In the example of FIG. 6, the region of each common part is shown as a rectangle enclosing the character string made up of all the characters of one item. However, the region may instead be, for example, an ellipse enclosing the character string, or a separate enclosing region may be designated for each character of the item's character string.

Next, returning to FIG. 12, the generation unit 17 generates the registration data for the address book by, for example, arranging in a predetermined order the information on the predetermined items to be registered, taken from the character recognition results for the common and individual parts as corrected where necessary in step S80 (step S110).

When the generation unit 17 has generated the registration data for the address book, the MPU 11 registers the data in the address book by storing the generated registration data in the address storage unit 28 (step S120).

アドレス帳への登録が完了することにより、ＭＰＵ１１は、１枚目の名刺であることを示す１枚目検出フラグをクリアする。 Upon completion of registration in the address book, the MPU 11 clears the first sheet detection flag indicating that it is the first business card.

次に、ＭＰＵ１１は、全ての名刺についてのアドレス帳作成処理が完了したか否かを確認する（ステップＳ１３０）。該完了の確認は、例えば、ＡＤＦスキャン方式によって読み取られる場合であれば原稿検出部７１による名刺の検出の有無によって行われる。 Next, the MPU 11 checks whether or not the address book creation process for all business cards has been completed (step S130). The confirmation of the completion is performed, for example, based on whether or not the business card is detected by the document detection unit 71 in the case of reading by the ADF scan method.

When a plurality of business cards held in the dedicated holder described earlier are read by the flatbed scanning method, completion is confirmed, for example, by checking whether the business card held at a predetermined end position in the holder has been processed.

When the address book is created by repeating, one card at a time, the sequence of inserting a business card into the text processing device 1, performing character recognition and address book registration, and removing the card from the text processing device 1, a method such as the following may be adopted: each time the address book registration for a card is completed, the display unit 62 shows a screen with an interface for choosing “continue” or “end”, and whether the address book creation processing is complete for all the business cards is determined from the operator's choice.

全ての名刺についてのアドレス帳作成処理が完了している場合には、文章処理装置１はアドレス帳作成処理を終了するが、ここでは、全ての名刺についてのアドレス帳作成処理が完了していないので、処理はステップＳ１０へと戻される。 If the address book creation processing for all business cards has been completed, the text processing apparatus 1 ends the address book creation processing, but here the address book creation processing for all business cards has not been completed. The process returns to step S10.

ステップＳ１０において、ここでは２枚目の名刺が処理対象であって、１枚目検出フラグがクリアされているので、ＭＰＵ１１は処理をステップＳ１７０（図１３）に移し、該ステップにおいて、ステップＳ５０と同様に名刺の画像データが取得される。 In step S10, since the second business card is the processing target and the first sheet detection flag is cleared here, the MPU 11 moves the process to step S170 (FIG. 13). Similarly, business card image data is acquired.

図８は、２枚目の名刺６ｂの構成の一例を示す図である。名刺６ｂにおいては、共通部分８ａから８ｃの位置および記載内容は、名刺６ａと同じである。 FIG. 8 is a diagram showing an example of the configuration of the second business card 6b. In the business card 6b, the positions and description contents of the common portions 8a to 8c are the same as those of the business card 6a.

また、名刺６ｂの個別部分９ｅから９ｈは、それぞれ名刺６ａにおける個別部分９ａから９ｄの各項目に対応しているが、それぞれの記載内容は名刺６ａとは異なっている。 Further, the individual portions 9e to 9h of the business card 6b correspond to the respective items of the individual portions 9a to 9d of the business card 6a, but the description contents thereof are different from those of the business card 6a.

さらに、文字数の変動に伴って、それぞれの位置情報も、個別部分９ａから９ｄのそれぞれの位置情報とは異なっている。 Further, with the variation in the number of characters, the position information also differs from the position information of the individual portions 9a to 9d.

この場合、１枚目の名刺６ａに基づいて共通部分を特定する特定情報がテンプレートとして既に取得されているので、２枚目の名刺６ｂについての画像データが取得されると、設定部１３は、テンプレートとして設定されている各共通部分の位置情報に基づいて、名刺６ｂについての文字認識の対象領域を各共通部分以外の領域に設定する（ステップＳ１８０）。 In this case, since the specific information for specifying the common part based on the first business card 6a has already been acquired as a template, when the image data for the second business card 6b is acquired, the setting unit 13 Based on the position information of each common part set as a template, the character recognition target area for the business card 6b is set to an area other than each common part (step S180).

図９および図１０は、２枚目の名刺６ｂを例として２枚目以降の名刺における対象領域の設定を説明する図である。 FIG. 9 and FIG. 10 are diagrams for explaining setting of target areas in the second and subsequent business cards, taking the second business card 6b as an example.

FIG. 9 shows the non-common part 7a, which is the area of the business card 6b other than the common parts 8a to 8c, and FIG. 10 shows the individual parts 9e to 9h, which are the character recognition target areas in the business card 6b.

Using the template that stores the specific information of the common parts for the business card being processed, the setting unit 13 sets the character recognition target areas for the second and subsequent documents within the non-common part 7a, that is, the area other than the common parts, as shown in FIGS. 9 and 10.
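One way to restrict the target areas to the non-common part is to discard any candidate text box that overlaps a common-part region recorded in the template, as in the following sketch. It assumes that candidate boxes have already been found by layout analysis and that all boxes are (left, top, right, bottom) tuples; the function names are hypothetical, and the description does not state that the setting unit 13 works this way.

```python
def overlaps(a, b):
    """True if boxes a and b, each (left, top, right, bottom), intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def select_target_regions(candidates, common_regions):
    """Keep only candidate text boxes that lie outside every common region."""
    return [
        box for box in candidates
        if not any(overlaps(box, region) for region in common_regions)
    ]
```

Only the boxes returned by `select_target_regions` would then be passed to character recognition, so the common parts are never recognized a second time.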

次に、図１３に戻って、設定部１３による対象領域の設定がされると、認識部１２が、ステップＳ７０と同様にして対象領域のみについての文字認識を行う（ステップＳ１９０）。 Next, returning to FIG. 13, when the target area is set by the setting unit 13, the recognition unit 12 performs character recognition only for the target area in the same manner as in step S70 (step S190).

文字認識が完了すれば、文字認識の結果が表示部６２に表示される。操作者は、表示された内容を参照してステップＳ８０と同様に文字認識の結果の必要な修正を行う（ステップＳ２００）。 When the character recognition is completed, the result of character recognition is displayed on the display unit 62. The operator refers to the displayed content and performs the necessary correction of the character recognition result as in step S80 (step S200).

Once the necessary corrections to the character recognition results are complete, the generation unit 17 generates the registration data for the address book by, for example, arranging in a predetermined order the information on the predetermined items to be registered, taken from the character recognition results for the second or subsequent business card and from the character recognition results for the common parts, which were generated from the first business card and are stored in the template (step S210).
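Generating a registration entry from the stored common-part results and the per-card results can be sketched as follows. The field names, the field order, and the function name `build_entry` are assumptions for illustration; the description only says that predetermined items are arranged in a predetermined order.

```python
def build_entry(common, individual, field_order):
    """Merge cached common-part text with per-card text in a fixed field order.

    `common` and `individual` map item names to recognized text. Items not
    listed in `field_order` are dropped, and listed items that are missing
    from both inputs are skipped.
    """
    merged = {**common, **individual}  # per-card values win on a name clash
    return [(name, merged[name]) for name in field_order if name in merged]
```

For the second and later cards, `common` would come from the template rather than from a new recognition pass, which is where the time saving arises.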

生成部１７によってアドレス帳への登録データが生成されると、ＭＰＵ１１は、１枚目の名刺と同様にして２枚目の名刺についての登録データをアドレス帳に登録する（ステップＳ１２０）。 When the registration data for the address book is generated by the generation unit 17, the MPU 11 registers the registration data for the second business card in the address book in the same manner as the first business card (step S120).

次に、ＭＰＵ１１は、１枚目の名刺についての処理と同様にして、ステップＳ１３０の処理において全ての名刺についてのアドレス帳作成処理が完了したか否かを確認する。 Next, the MPU 11 confirms whether or not the address book creation processing for all business cards has been completed in the processing of step S130 in the same manner as the processing for the first business card.

If the address book creation processing has not been completed for all the business cards, the MPU 11 returns the process to step S10, and the functional units of the text processing device 1 and the character recognition unit 2 repeat, for every unprocessed business card, the same processing as for the second business card.

全ての名刺についてのアドレス帳作成処理が完了している場合には、ＭＰＵ１１は、アドレス帳作成処理を終了する。 If the address book creation process for all business cards has been completed, the MPU 11 ends the address book creation process.

以上に説明した処理手順によって、共通部分を特定する特定情報であるテンプレートが１つもテンプレート格納部２７に格納されていない状態から共通部分と個別部分とを有する複数の原稿を対象とするアドレス帳作成処理が開始される場合におけるアドレス帳作成処理が行われる。 By the processing procedure described above, an address book is created for a plurality of documents having common parts and individual parts from a state in which no template, which is identification information for identifying common parts, is stored in the template storage unit 27. An address book creation process is performed when the process is started.

上述したように、文字認識ユニット２においては、複数の原稿のうち最初に文字認識の対象となる１枚目の原稿に基づいて各原稿に共通する共通部分を特定する特定情報が取得され、２枚目以降の原稿の文字認識を行う際には共通部分以外の領域に文字認識の対象領域が設定されるので、共通部分についての再度の文字認識が不要となり、効率よく文字認識の対象となるデータ量を減ずることによって文字認識に要する時間を短縮することができる。 As described above, the character recognition unit 2 acquires specific information for specifying a common portion common to each document based on the first document to be character-recognized first among a plurality of documents. When character recognition is performed on the first and subsequent documents, character recognition target areas are set in areas other than the common part, so that it is not necessary to repeat character recognition for the common part, and the character recognition is efficiently performed. By reducing the amount of data, the time required for character recognition can be shortened.
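The reduction in recognition work can be estimated by counting how many regions are recognized, as in the following sketch. The count of regions is only a rough stand-in for processing time (it ignores region size, image acquisition, and operator interaction), and the function name `regions_recognized` is hypothetical.

```python
def regions_recognized(num_cards, num_common, num_individual, use_template=True):
    """Count region recognitions for a batch of same-type cards.

    Without a template, every card has all of its regions recognized.
    With one, only the first card is recognized in full and later cards
    have only their individual regions recognized.
    """
    if num_cards <= 0:
        return 0
    per_card_full = num_common + num_individual
    if not use_template:
        return num_cards * per_card_full
    return per_card_full + (num_cards - 1) * num_individual
```

With the layout of FIG. 4 (three common parts and four individual parts) and ten cards, this gives 70 region recognitions without a template and 43 with one.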

◎ Character recognition and address book creation based on a template:
Next, the address book creation processing is described for the case where it is started, for a plurality of documents having a common part and individual parts, with one or more templates created from a first document already stored in the template storage unit 27.

図１２において、アドレス帳作成処理が開始されると、ステップＳ１０の判定処理において、処理対象の名刺が１枚目であるとの判定がなされ、次に、ステップＳ２０の判定処理において、１以上のテンプレートが既存であるとの判定がなされて処理はステップＳ３０へと移される。 In FIG. 12, when the address book creation process is started, it is determined that the business card to be processed is the first one in the determination process in step S10, and then one or more in the determination process in step S20. It is determined that the template already exists, and the process proceeds to step S30.

このタイミングでＭＰＵ１１は、１枚目の名刺に基づいて共通部分を特定する特定情報を取得する第２取得部１５の処理機能を不能化する。 At this timing, the MPU 11 disables the processing function of the second acquisition unit 15 that acquires the specific information for specifying the common part based on the first business card.

Also at this timing, the third acquisition unit 14 of the MPU 11 displays on the display unit 62 a screen with an interface that prompts the operator to choose, for the address book creation processing of the business cards to be processed, between creating a new template from the first business card and applying it to the second and subsequent cards (character recognition based on the first business card) and applying an existing template to all the cards, including the first (character recognition based on a template).

操作者は、表示部６２に表示された画面のタッチパネルを操作することによって、これから行おうとする処理が「１枚目の名刺基準の文字認識処理」であるか、「テンプレート基準の文字認識処理」であるかを選択する（ステップＳ３０）。 The operator operates the touch panel of the screen displayed on the display unit 62 to determine whether the process to be performed is “character recognition process based on the first business card” or “character recognition process based on the template”. Is selected (step S30).

ステップＳ３０において、「１枚目の名刺基準の文字認識処理」が選択された場合には、「１枚目の名刺基準の文字認識処理」の説明において既述した各処理フローに沿って文字認識およびアドレス帳作成が行われ、「テンプレート基準の文字認識処理」が選択された場合には、処理はステップＳ１４０へ移される（ステップＳ４０）。 If “first business card reference character recognition processing” is selected in step S30, character recognition is performed according to the processing flow described above in the description of “first business card reference character recognition processing”. If the address book is created and “template-based character recognition process” is selected, the process proceeds to step S140 (step S40).
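The branching across steps S10, S20, S30, and S40 can be summarized in the following sketch, which returns the label of the step that the flow moves to next. It is a simplified reading of the flowchart text: the operator's choice is reduced to a boolean, and the function name `next_step` is hypothetical.

```python
def next_step(is_first_card, templates_exist, use_existing_template=False):
    """Return the label of the step that follows the S10/S20/S30/S40 decisions."""
    if not is_first_card:
        # S10: second or later card, so read it and recognize only
        # the areas outside the common parts.
        return "S170"
    if not templates_exist:
        # S20: nothing stored yet, so build a template from the first card.
        return "S50"
    # S30/S40: templates exist, so the operator's choice decides.
    return "S140" if use_existing_template else "S50"
```

For example, `next_step(True, True, use_existing_template=True)` leads to the template selection of step S140, while `next_step(True, False)` leads to step S50.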

操作者の選択に基づいて処理がステップＳ１４０に移されると、ＭＰＵ１１は、テンプレート格納部２７に格納されている各テンプレートを表示部６２に表示するなどして、操作者に対して使用するテンプレートの選択を促す。 When the process moves to step S140 based on the operator's selection, the MPU 11 displays each template stored in the template storage unit 27 on the display unit 62, for example, to select a template to be used for the operator. Encourage selection.

図１１は、ステップＳ１４０において表示部６２に表示されるテンプレートの選択画面の一例を示す図である。 FIG. 11 is a diagram showing an example of a template selection screen displayed on the display unit 62 in step S140.

図１１に示される例では、各テンプレートとともにテンプレート格納部２７に格納されているテンプレートライブラリ管理テーブルに基づいて、各テンプレートの名称が画面に表示されている。 In the example shown in FIG. 11, the name of each template is displayed on the screen based on the template library management table stored in the template storage unit 27 together with each template.

既述したようにテンプレートライブラリ管理テーブルには、各テンプレートのそれぞれについて名称および格納先アドレスなどが格納されている。 As described above, the template library management table stores the name and storage address of each template.

そして、操作者によって選択されたテンプレートの参照は、テンプレートライブラリ管理テーブルをＭＰＵ１１が参照することによって行われる。 The template selected by the operator is referred to by the MPU 11 referring to the template library management table.

The example of FIG. 11 illustrates a case in which templates for the business cards of Company A through Company D are stored in the template storage unit 27 under the names “Company A business card”, “Company B business card”, “Company C business card”, and “Company D business card”, respectively, and a template corresponding to the business cards of “XX Co., Ltd.”, the plurality of business cards to be processed this time, is stored in the template storage unit 27 under the name “XX Co., Ltd. business card”.

Returning to FIG. 13, the operator selects the template by touching the template name “XX Co., Ltd. business card” shown in FIG. 11 on the touch panel.

As a result of this selection, the displayed name is inverted from black characters on a white background to white characters on a black background. After confirming from this that the selection was made correctly, the operator presses the “Confirm” button to finalize the selection and end the template selection process (step S150).

Triggered, for example, by detecting the press signal from the “Confirm” button pressed by the operator, the third acquisition unit 14 acquires, based on the template selected by the operator, the specific information on the common parts, namely the item names of the common parts, the character recognition results, the position information, and the like (step S160).

Once the specific information on the common parts has been acquired, the processes for the second and subsequent business cards, described in the section on the “character recognition process based on the first business card”, are applied to every business card from the first onward in the current “template-based character recognition process”, and the address book creation process for each business card is completed.
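The template-based flow can be pictured with the minimal sketch below. It assumes that each card has already been segmented into named regions and that a recognizer callable is available. `CommonPart`, `overlaps`, `process_cards_with_template`, and the stub recognizer are hypothetical names introduced for illustration and are not part of the disclosed apparatus.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), assumed pixel units


@dataclass(frozen=True)
class CommonPart:
    """Specific information held in a template for one common part."""
    item_name: str  # item name, e.g. "company name"
    text: str       # stored character recognition result
    box: Box        # stored position information


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def process_cards_with_template(
    template: List[CommonPart],
    cards: List[Dict[str, Box]],
    recognize: Callable[[str, Box], str],
) -> List[Dict[str, str]]:
    """Build one address book entry per card, recognizing only non-common regions."""
    entries = []
    for regions in cards:
        # Common parts are filled in from the template without running recognition.
        entry = {part.item_name: part.text for part in template}
        for item_name, box in regions.items():
            if any(overlaps(box, part.box) for part in template):
                continue  # region lies in a common part: not a target area
            entry[item_name] = recognize(item_name, box)
        entries.append(entry)
    return entries


# Stub recognizer that records which regions were actually recognized.
calls: List[str] = []


def fake_recognize(item_name: str, box: Box) -> str:
    calls.append(item_name)
    return f"<{item_name} #{len(calls)}>"


template = [CommonPart("company name", "XX Co., Ltd.", (10, 10, 200, 40))]
cards = [
    {"company name": (10, 10, 200, 40), "name": (10, 60, 200, 90)},
    {"company name": (10, 10, 200, 40), "name": (10, 60, 200, 90)},
]
entries = process_cards_with_template(template, cards, fake_recognize)
```

In this sketch the recognizer runs once per card, for the individual part only, and that includes the first card, which is the point of starting from a stored template.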

The processing procedure described above carries out the address book creation process for a plurality of documents having common parts and individual parts in the case where the process starts with one or more templates, each storing specific information that specifies the common parts, already stored in the template storage unit 27.

Furthermore, in character recognition of a plurality of documents of the same type having individual parts and common parts, the template created based on the first document can be saved. If the saved template is then applied to the first and subsequent documents when character recognition is performed again on business cards of the same type, the specific information specifying the common parts need not be acquired again from the first document, so the time required for character recognition of the plurality of documents in that repeated run can be shortened.

<4. Modifications:>
Although embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various modifications are possible.

(1) In the present embodiment, the pieces of specific information on the common parts have been described as being stored in a single template. However, as long as the common parts and the pieces of specific information are properly associated with each other, a configuration in which each piece of specific information is stored individually may be adopted, for example.

(2) In the present embodiment, the template has been described as being permanently stored in the template storage unit 27 of the storage device 26. However, even if the template is only temporarily stored in the RAM 22, the usefulness of the present invention is not impaired, because a large number of business cards of the same type are often divided into several groups of business cards for character recognition processing.

(3) In the present embodiment, the first acquisition unit 43 acquires an image of the entire area of the business card even for business cards for which the common parts have already been specified. Instead, the image acquisition area may be set based on the common parts so that at least the entire image of the non-common part 7a within the business card is acquired.

With this configuration, for example, when a business card is placed on the flatbed so that its vertical direction lies along the sub-scanning direction of the reading unit 41 and the non-common part 7a exists only in the upper half of the business card, only the upper half needs to be subject to image acquisition. This reduces the image acquisition time for the second and subsequent business cards and thus shortens the total time required for character recognition of the plurality of business cards.
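A rough sketch of how such an image acquisition area might be derived is given below. It assumes that positions are measured in millimetres along the sub-scanning direction and that the non-common part is given as one or more (start, end) spans. The function names and the 55 mm card length are illustrative assumptions, not values from the embodiment.

```python
from typing import Iterable, Tuple

Span = Tuple[float, float]  # (start, end) along the sub-scanning direction, in mm


def acquisition_range(card_length_mm: float, non_common_spans: Iterable[Span]) -> Span:
    """Return the smallest scan range that covers every non-common span."""
    spans = list(non_common_spans)
    if not spans:
        raise ValueError("at least one non-common span is required")
    start = max(0.0, min(s for s, _ in spans))
    end = min(card_length_mm, max(e for _, e in spans))
    return start, end


def scan_fraction(card_length_mm: float, scan_range: Span) -> float:
    """Fraction of the card length that still has to be scanned."""
    start, end = scan_range
    return (end - start) / card_length_mm


# Hypothetical card: 55 mm long, non-common content only in the upper half.
rng = acquisition_range(55.0, [(4.0, 12.0), (15.0, 27.5)])
frac = scan_fraction(55.0, rng)
```

Under these assumptions the scan covers 4.0 mm to 27.5 mm, well under half of the card length, which is where the saving in image acquisition time for the second and subsequent cards would come from.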

DESCRIPTION OF SYMBOLS
1 Text processing apparatus
2 Character recognition unit
31 Document feeding unit
32 Document placing unit
33 Document table
34 Document discharge table
41 Reading unit
42 Input/output unit
43 First acquisition unit
51 Recording unit
61 Operation unit
62 Display unit
63 Reading start button
71 Document detection unit
6a, 6b Business card
7a Non-common part
8a, 8b, 8c Common part
9a, 9b, 9c, 9d, 9e, 9f, 9g, 9h Individual part

Claims

A character recognition unit that performs character recognition of a plurality of originals having a common part and individual parts, the character recognition unit comprising:
first acquisition means for acquiring an image of each of the plurality of originals;
second acquisition means for acquiring specific information that specifies the common part, the specific information being obtained based on a first original that is subjected to character recognition first among the plurality of originals;
setting means for setting, for the first original, the common part and the individual parts as target areas for character recognition, and for setting, for the second and subsequent originals, the target area in an area other than the common part; and
recognition means for performing character recognition only on the target areas.

The character recognition unit according to claim 1,
further comprising storage means for storing the specific information.

The character recognition unit according to claim 2,
further comprising third acquisition means for acquiring the specific information from the storage means in response to a predetermined operation input,
wherein, when the third acquisition means acquires the specific information,
(a) the character recognition unit disables the second acquisition means, and
(b) the setting means sets the target area in each of the plurality of originals based on the acquired specific information.

A sentence processing apparatus comprising:
the character recognition unit according to any one of claims 1 to 3; and
generation means for generating character data of each original based on the result of character recognition of the common part in the first original and the result of character recognition of the target area in each of the second and subsequent originals.