JP6582464B2

JP6582464B2 - Information input device and program

Info

Publication number: JP6582464B2
Application number: JP2015053081A
Authority: JP
Inventors: 里葉子芦田; 晃慶山下
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2015-03-17
Filing date: 2015-03-17
Publication date: 2019-10-02
Anticipated expiration: 2035-03-17
Also published as: JP2016173710A

Description

The present invention relates to an information input device that uses optical character recognition to support data entry by a user.

Conventionally, when a user applies to take part in a street campaign run by a company, or to enroll as a member (for example, as a point-card member) in ordinary commercial transactions, a method has been used in which the personal information printed on a personal authentication medium carried by the user is photographed and subjected to optical character recognition (OCR), thereby reducing the user's burden of filling in an application form.

In this case, when the product targeted by the campaign or benefit is one whose use is subject to a legal age restriction, such as alcohol or tobacco, the personal authentication medium is desirably an official identification document bearing a date of birth, such as the user's driver's license. With such a medium, age verification can be carried out at the same time as the membership application, which reduces the burden of age verification for both the user and the company.

On official identification documents such as driver's licenses, the positions and character sizes of the items to be read by the information input device, such as the full name, address, and date of birth, are largely fixed, so such documents are considered particularly well suited to automatic character recognition by OCR.

Patent Document 1 presents an authentication device for authenticating a person carrying out a procedure, such as a family member of the policyholder, in procedures such as changing contract details in the insurance industry. The device obtains the person's surname as text data from image data of an identification document such as the person's driver's license, extracts it as official identification data, and authenticates the person when it matches the policyholder's surname.

Patent Document 2 introduces a character recognition processing device used when recognizing full names with an optical character reader (OCR). To recognize a full name, the device reads the character string making up the name, sets a name-specific partition according to the number of characters, and splits the string into a first half and a second half. The first half is matched against a surname dictionary and the second half against a given-name dictionary. If recognition fails, the partition is moved and both halves are matched against the dictionaries again, so that the surname and the given name are recognized.

Patent Document 3 concerns a character recognition method in an optical character reader (OCR) that recognizes full names written in handwritten kanji (printed characters are also acceptable). It introduces a method of displaying candidate data for selecting one of the ways of splitting when there is more than one way of splitting the surname and the given name.

Patent Document 1: Japanese Patent No. 5166330
Patent Document 2: Japanese Patent No. 2892376
Patent Document 3: Japanese Patent No. 2933178

However, when automatic character recognition is performed using OCR, problems can arise even if the printed positions and character sizes (character heights) are uniform. For the full name printed in the name field, for example, the correct boundary between surname and given name may not be recognized automatically because there is no clear separator (such as a space) between them. For addresses and similar items, the limited size of the field means that when many characters are printed, their width is compressed to one half or two thirds of their height, which can make optical character recognition difficult.

In the method disclosed in Patent Document 1, the surname to be extracted is known before the name boundary is determined (it is expected to match the policyholder's surname), so splitting the name accurately is easy. In ordinary character recognition, however, the surname is not known in advance. In Patent Document 2, as pointed out in Patent Document 3, when there are several ways of splitting the surname and the given name, one of them is simply selected, so there is no guarantee that the name is split correctly. In Patent Document 3, when there are several ways of splitting, candidate data for selecting among them is displayed and the user selects the desired split from the candidates, so the name is split correctly. However, because several candidate splits are displayed, a split that the user finds unpleasant may be shown. In addition, the user must select from the displayed list using a separate input device such as a keyboard or mouse, which requires a physically fixed space. Both points can be a nuisance for the user and for the company.

An object of the present invention is to provide an information input device that photographs the personal information printed on a personal authentication medium carried by the user, accurately extracts the printed data from the image data by optical character recognition (OCR), and, even when there is no clear separator (such as a space) between the surname and the given name, allows the name to be split accurately with a simple operation regardless of location.

To solve the above problem, the first invention of the present application is an information input device comprising: photographing means that photographs an official identification document and stores it as image data; image dividing means that divides the stored image data into a plurality of partial image data using layout information defined for each type of official identification document; image correction means that scales the partial image data to a predetermined size; character recognition means that converts each scaled partial image data into text data using optical character recognition and holds it together with the attribute obtained from the layout information; name classification means that splits the character string obtained from the partial image data of the full name into a surname and a given name; classification determination means that detects an input indicating the boundary between the surname and the given name and determines the boundary between the surname text data and the given-name text data; and data storage means that stores each text data. The image correction means determines the character width of each by comparing the number of characters of the full name contained in the partial image data of the full name with the width of one character in the partial image data of the address, and scales each partial image data so that both have the same character width. In this way, the user's personal information can be acquired by photographing an official identification document carried by the user, and correcting the character scaling in advance improves the accuracy of character recognition by OCR.

The second invention is an information input device comprising: photographing means that photographs an official identification document and stores it as image data; image dividing means that divides the stored image data into a plurality of partial image data using layout information defined for each type of official identification document; image correction means that scales the partial image data to a predetermined size; character recognition means that converts each scaled partial image data into text data using optical character recognition and holds it together with the attribute obtained from the layout information; name classification means that splits the character string obtained from the partial image data of the full name into a surname and a given name; classification determination means that detects an input indicating the boundary between the surname and the given name and determines the boundary between the surname text data and the given-name text data; and data storage means that stores each text data. The name classification means splits the character string into a surname and a given name in one or more ways and obtains, for each way of splitting, an index of its likelihood, and the classification determination means decorates the character string on the basis of the index and displays it on a display unit. In this way, the user's personal information can be acquired by photographing an official identification document carried by the user, and the user can specify the correct boundary between surname and given name more reliably.

The third invention is the information input device of the second invention, in which the image correction means determines the character width of each by comparing the number of characters of the full name contained in the partial image data of the full name with the width of one character in the partial image data of the address, and scales each partial image data so that both have the same character width. Correcting the character scaling in advance in this way improves the accuracy of character recognition by OCR.

The fourth invention is a computer program for causing a computer to function as the information input device according to any one of the first to third inventions.

According to the present invention, simply by photographing the personal information printed on a personal authentication medium carried by the user, the data printed on the medium can be extracted accurately, and the surname and the given name can be split accurately with a simple operation regardless of location.

FIG. 1 is a diagram showing the configuration of the present embodiment.
FIG. 2 is a functional block diagram describing the information input device 100 in more detail.
FIG. 3 is a diagram showing the hardware configuration of the information input device 100.
FIG. 4 is a diagram explaining the layout information 131.
FIG. 5 is a diagram explaining the name dictionary 161 and the name history 162.
FIG. 6 is a flowchart explaining the processing flow of the present embodiment.
FIG. 7 is a diagram showing an example of the appearance of the personal authentication medium 200.
FIG. 8 is a flowchart of the character size correction in the image correction means 140.
FIG. 9 is a flowchart for obtaining the index of the name split in the name classification means 160.
FIG. 10 is a diagram showing an example of the full-name text that the classification determination means 170 displays on the display unit 105.

The configuration of a system according to one embodiment of the present invention is described below in more detail with reference to the drawings.

FIG. 1 shows the configuration of the present embodiment. Reference numeral 100 denotes the information input device, typically a portable terminal such as a tablet computer, equipped with a camera 121. The information input device 100 uses the camera 121 to photograph a personal authentication medium 200, such as an official identification document, carried by the user. The information input device 100 also exchanges information with external devices via a network 500. Here, in particular, it exchanges information with a member server 300 of the company that manages users' personal information.

FIG. 2 is a functional block diagram describing the information input device 100 of the present embodiment in more detail in terms of its functions. The information input device 100 comprises: photographing means 120 that stores image data captured by the built-in camera; image dividing means 130 that divides the image data into a plurality of partial image data using predetermined layout information; image correction means 140 that scales the partial image data to a predetermined size; character recognition means 150 that converts each scaled partial image data into text data using optical character recognition; name classification means 160 that splits the full name converted into text data and displays the index obtained for that split on the display unit together with the name string; classification determination means 170 that detects an input indicating the boundary between the surname and the given name of the displayed name and splits the text data into surname text data and given-name text data; data storage means 180 that stores each text data; and control means 110 that controls each of these means. The processing flow is described in more detail later.

FIG. 3 shows the hardware configuration of the information input device 100. As noted above, the information input device 100 is typically a tablet computer (hereinafter, tablet PC) and, as hardware, is a single computer system. It comprises a control unit 101, a storage unit 102, a peripheral device interface (I/F) unit 103, an input unit 104, a display unit 105, and a communication unit 106, which are connected via a bus 109. As a feature of this configuration, a camera 121 connected to the peripheral device I/F unit 103 is built in. The hardware configuration of FIG. 3 is only an example, and various other configurations may be adopted according to the purpose.

The control unit 101 consists of a CPU (Central Processing Unit), ROM (Read Only Memory), RAM (Random Access Memory), and the like. The CPU loads programs stored in the storage unit 102, the ROM, a recording medium, or the like into a work memory area in the RAM and executes them, drives and controls each device connected via the bus 109, and thereby carries out the processing performed by the computer. The ROM is non-volatile memory that permanently holds the boot program, data, and the like of the information input device 100. The RAM is volatile memory that temporarily holds programs, data, and the like loaded from the storage unit 102, the ROM, a recording medium, or the like, and provides the work area that the control unit 101 uses for its various processes.

The storage unit 102 stores the programs executed by the control unit 101, the data needed to execute them, the OS (operating system), and the like. The stored programs comprise a control program corresponding to the OS and the program code that functions as each of the control means 110, the photographing means 120, the image dividing means 130, the image correction means 140, the character recognition means 150, the name classification means 160, the classification determination means 170, and the data storage means 180. The control unit 101 reads out this program code as needed and transfers it to the RAM, and the CPU reads and executes it so that it functions as the respective means.

The peripheral device I/F unit 103 is a port for connecting peripheral devices to the computer, and the computer exchanges data with peripheral devices through it. The peripheral device I/F unit 103 is implemented with USB, IEEE 1394, RS-232C, or the like. In the present embodiment it is connected to the camera 121 built into the tablet PC and receives the captured data. The input unit 104 is an input device such as a touch panel. Through the input unit 104, the user can give operation instructions and action instructions to the information input device 100 and enter data. The display unit 105 is a display device such as a liquid crystal panel, and may be installed so as to overlap the touch panel of the input unit 104.

The communication unit 106 has a communication control device, a communication port, and the like. It is a wireless communication interface that mediates communication between the information input device 100 and the network 500, and it controls communication with other computers via the network 500.

Returning to FIG. 2, the control means 110 consists of the control unit 101 and a computer program that runs the OS (operating system) and the like, and the photographing means 120 consists of the camera and a computer program that controls photographing by the camera. The image dividing means 130 through the data storage means 180 are likewise computer programs concerned with recognition of the photographed image, and they function when the control means 110 interprets and executes each program.

The databases held by the information input device 100 of the present embodiment are described below. Only the minimum items needed for the explanation are illustrated for each database. In actual operation, various items derived from these may be added and stored.

FIG. 4 explains the layout information 131, one of the databases held by the information input device 100. The image dividing means 130 of the information input device 100 uses the layout information 131 to divide the image data stored by the photographing means 120 into a plurality of partial image data.

The layout information 131 is used per personal authentication medium name (T12), with one record for each medium name. Which medium's record the image dividing means 130 uses may be set in advance by the operator of the information input device 100 or may be set automatically, as described later. The number in T11 is an identifier (ID) for the personal authentication medium name (T12).

For example, when the personal authentication medium is set to a driver's license, the record contains several attributes (also called items, T13). The data for the full name (one of the T13 attributes), for instance, lies in the range enclosed by the start coordinates (T14, T15; X = 29, Y = 15 in this example) and the end coordinates (T16, T17; X = 40, Y = 105 in this example), measured from the reference point of the input image (normally the upper-left corner). The record also shows that the typeface (font, T18) of the name characters is "Maru Gothic" (rounded Gothic) and that their vertical size (T19) is 12 points. The coordinate system, the font names, the units of character size, and so on may be made freely configurable within the program.
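The record described above can be sketched as a small data structure. The following is a minimal illustration only, not the embodiment's actual storage format: the class names, the `crop` helper, the medium ID value 1, and the convention that X indexes columns and Y indexes rows of a pixel grid are all assumptions introduced here. Only the field meanings (T11 to T19) and the example values for the full-name item come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutField:
    attribute: str        # T13: item name, e.g. "full_name"
    start_x: int          # T14
    start_y: int          # T15
    end_x: int            # T16
    end_y: int            # T17
    font: str             # T18
    char_height_pt: int   # T19


@dataclass(frozen=True)
class LayoutRecord:
    medium_id: int        # T11 (the value used below is hypothetical)
    medium_name: str      # T12
    fields: tuple         # tuple of LayoutField


def crop(image, field):
    """Cut one partial image out of a 2D list of pixels.

    Assumed convention: image[y][x], origin at the upper-left corner,
    end coordinates exclusive.
    """
    rows = image[field.start_y:field.end_y]
    return [row[field.start_x:field.end_x] for row in rows]


# Example values for the full-name item, taken from the description.
LICENSE = LayoutRecord(
    medium_id=1,
    medium_name="driver's license",
    fields=(LayoutField("full_name", 29, 15, 40, 105, "Maru Gothic", 12),),
)
```

Calling `crop(image, LICENSE.fields[0])` on a photographed image would then return the partial image data for the full name, which is the role the description assigns to the image dividing means 130.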

FIG. 5 explains the name dictionary 161 and the name history 162, which are among the databases held by the information input device 100. The name classification means 160 of the information input device 100 uses the name dictionary 161 and the name history 162 to split the full name, converted into text data by the character recognition means 150, into a surname and a given name, and extracts from the name dictionary 161 and the name history 162 an index representing the likelihood of the split.

FIG. 5(a) shows an example of the structure of the name dictionary 161. In this example, a list of surnames expected to exist (T21) is given alongside an index (T22) assigned on the basis of the probability that each surname occurs. This index can be interpreted as a statistics-based "index of the likelihood of a split".

Similarly, FIG. 5(b) shows an example of the structure of the name history 162. In this example, the same list of surnames as in the name dictionary 161 (T23) is given alongside an index (T24) that is the number of times each surname has actually occurred in this processing. This index can be interpreted as a usage-based "index of the likelihood of a split". As described later, the name dictionary 161 may be replaced freely, and the name history 162 may have the histories of other devices imported and added to it, or may otherwise be updated as appropriate.
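As a rough illustration of how the two tables could be consulted, the sketch below enumerates every way of splitting a name string and looks up the leading part in both tables. It is a sketch under stated assumptions, not the procedure of the embodiment: the function name, the dictionary-based table representation, and all sample index values are hypothetical, and because this part of the description does not say how the two indices are combined, both are returned unmerged.

```python
def candidate_splits(full_name, name_dictionary, name_history):
    """List the splits whose leading part is a surname in the dictionary.

    name_dictionary maps surname -> index T22 (statistics-based).
    name_history maps surname -> index T24 (number of past occurrences).
    Both mappings are hypothetical stand-ins for the tables of FIG. 5.
    """
    candidates = []
    # Try every boundary that leaves at least one character on each side.
    for i in range(1, len(full_name)):
        surname, given_name = full_name[:i], full_name[i:]
        if surname in name_dictionary:
            candidates.append({
                "surname": surname,
                "given_name": given_name,
                "dictionary_index": name_dictionary[surname],
                "history_index": name_history.get(surname, 0),
            })
    return candidates


# Hypothetical sample data; the index values are invented for illustration.
sample_dictionary = {"山": 5, "山田": 80}
sample_history = {"山田": 3}
splits = candidate_splits("山田太郎", sample_dictionary, sample_history)
```

With this sample data, `splits` holds two candidates, one with surname "山" and one with surname "山田", each carrying its own pair of indices.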

Next, the processing flow of the present embodiment is described in more detail with reference to FIG. 6. First, the photographing means 120 of the information input device 100 photographs the personal authentication medium 200 carried by the user whose data is to be entered, using the camera 121 built into the information input device 100. The photographing means 120 stores the captured image data.

Next, the image dividing means 130 of the information input device 100 obtains information on which record of the layout information 131 to use, that is, which type of personal authentication medium's layout applies. The operator of the information input device 100 may specify the layout information to the control means 110 by manual entry through the input unit 104. Alternatively, the record may be selected and set automatically: from each held record of the layout information 131, the regions where characters are expected are extracted, the image data for those regions is obtained from the captured image data, gray-level analysis or spatial frequency analysis is applied to it, and the record of the layout information 131 with the highest likelihood of containing characters is chosen. The image dividing means 130 then divides the image data into a plurality of partial image data using the selected layout information.
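The automatic selection described above can be sketched as follows. This is only one possible reading: the description names gray-level analysis and spatial frequency analysis without giving a formula, so the sketch substitutes the variance of pixel intensities in each expected region as a simple stand-in for "likelihood that characters are present". The function names and the `(x0, y0, x1, y1)` box format are assumptions introduced here.

```python
from statistics import pvariance


def region_score(image, box):
    """Gray-level variance of one region (assumed stand-in for text likelihood)."""
    x0, y0, x1, y1 = box
    pixels = [p for row in image[y0:y1] for p in row[x0:x1]]
    return pvariance(pixels) if pixels else 0.0


def select_layout(image, layouts):
    """Return the name of the layout whose expected regions look most text-like.

    layouts maps a layout name to a non-empty list of boxes (x0, y0, x1, y1).
    """
    def mean_score(name):
        boxes = layouts[name]
        return sum(region_score(image, b) for b in boxes) / len(boxes)

    return max(layouts, key=mean_score)


# Tiny synthetic image: blank (255) except a checker pattern in the top-left.
demo_image = [[255] * 10 for _ in range(10)]
for y in range(5):
    for x in range(5):
        demo_image[y][x] = 0 if (x + y) % 2 == 0 else 255

demo_layouts = {"A": [(0, 0, 5, 5)], "B": [(5, 5, 10, 10)]}
chosen = select_layout(demo_image, demo_layouts)
```

In the synthetic example, layout "A" expects text where the pattern is and layout "B" expects it in a blank corner, so "A" is chosen. A real implementation would likely need a more robust measure than raw variance, for example to avoid being misled by the photograph on a license.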

From the partial image data produced by the image dividing means 130, the image correction means 140 obtains the partial image data of the full name and the partial image data of the address. Using spatial frequency analysis or the like, it first determines the number of characters in the full name and judges the character width from that count: for example, full width for 7 characters or fewer, two-thirds width for 8 to 14 characters, and half width for 15 characters or more. It then obtains the partial image data of the address, determines the width of one character, and judges the address character width by comparing it with the character width of the full name. Finally, it enlarges each partial image data in the width direction so that one character becomes full width, that is, the predetermined size. The detailed processing flow is described later.
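The character-count rule given as an example above can be written down directly. In the sketch below the thresholds (7 and 14) come from the description, while the function names and the use of exact fractions to represent the width classes are choices made here for illustration.

```python
from fractions import Fraction

FULL_WIDTH = Fraction(1)
TWO_THIRDS_WIDTH = Fraction(2, 3)
HALF_WIDTH = Fraction(1, 2)


def name_width_class(char_count):
    """Judge the character width of the full name from its character count."""
    if char_count <= 7:
        return FULL_WIDTH
    if char_count <= 14:
        return TWO_THIRDS_WIDTH
    return HALF_WIDTH


def horizontal_scale(width_class):
    """Factor by which a partial image must be widened to reach full width."""
    return 1 / width_class
```

A four-character name is therefore judged full width and needs no widening, whereas a two-thirds-width field would be widened by a factor of 3/2 and a half-width field by a factor of 2.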

FIG. 7 shows an example of the appearance of the personal authentication medium 200, such as an official identification document. In this example, four characters are detected in the partial image data of the full name, so the image correction means 140 judges the character width 201 of the full name to be full width. The character width 202 detected in the partial image data of the address is compared with the character width 201 of the full name and is judged here to be two-thirds width. The partial image data of the full name is therefore left uncorrected, and the partial image data of the address is enlarged in the width direction so that one character becomes full width.

図８は、画像補正手段１４０における文字サイズの補正に関する詳細な処理の流れを示すフローチャートである。画像補正手段１４０は、先ず、姓名に係る部分画像データの取得する（Ｓ２０１）。続いて、空間周波数分析などを用いて姓名の文字数を取得する（Ｓ２０２）。次に、先に説明したように、取得した文字数から、文字幅が全角か、２／３角か、半角かの判定を行う（Ｓ２０３、Ｓ２０４〜Ｓ２０６）。続いて、住所に係る部分画像データの取得し（Ｓ２０７）、空間周波数分析などを用いて住所の文字幅を取得する（Ｓ２０８）。ここで、先に判定した姓名の文字幅と比較し、住所の文字幅は全角か、２／３角か、半角かを判定する（Ｓ２０９、Ｓ２１０〜Ｓ２１２）。ここで求められた文字幅に基づいて、姓名に係る部分画像データの大きさを文字幅が全角になるように補正し（Ｓ２１３）、住所に係る部分画像データの大きさを文字幅が全角になるように補正して（Ｓ２１４）、姓名と住所に係る部分画像データにおける文字幅が同一になるようにして処理を終える。 FIG. 8 is a flowchart showing the detailed flow of character-size correction in the image correction means 140. The image correction means 140 first acquires the full-name partial image data (S201) and obtains the number of characters in the full name using spatial frequency analysis or a similar technique (S202). As described above, it determines from that count whether the character width is full-width, two-thirds width, or half-width (S203, S204 to S206). It then acquires the address partial image data (S207) and obtains the address character width using spatial frequency analysis or a similar technique (S208). By comparison with the full-name character width determined earlier, it determines whether the address character width is full-width, two-thirds width, or half-width (S209, S210 to S212). Based on the widths so determined, it corrects the size of the full-name partial image data so that its characters become full-width (S213) and likewise corrects the address partial image data (S214), so that both have the same character width, and the process ends.
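The width decision of FIG. 8 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the thresholds are the example values given in the text (7 and 14 characters), and matching the address to the nearest width class is an assumed reading of "comparison with the full-name character width". Measuring character counts and pixel widths (S202, S208) is outside the sketch.

```python
from fractions import Fraction

# Width classes named in the text: full-width, two-thirds width, half-width.
FULL, TWO_THIRDS, HALF = Fraction(1), Fraction(2, 3), Fraction(1, 2)


def name_width_class(char_count):
    """S203-S206: infer the printed width class from the full-name length."""
    if char_count <= 7:
        return FULL
    if char_count <= 14:
        return TWO_THIRDS
    return HALF


def address_width_class(addr_char_px, name_char_px, name_class):
    """S209-S212: classify the address width relative to the name width.

    Assumption: the measured ratio is snapped to the nearest class.
    """
    full_px = name_char_px / float(name_class)  # pixel width of a full-width character
    ratio = addr_char_px / full_px
    return min((FULL, TWO_THIRDS, HALF), key=lambda c: abs(float(c) - ratio))


def horizontal_scale(width_class):
    """S213-S214: horizontal enlargement that makes the characters full-width."""
    return 1 / width_class
```

For the FIG. 7 case (a four-character name, address characters two-thirds as wide), the name image gets a scale of 1 and the address image a scale of 3/2. The pixel widths used to exercise the sketch are made-up values.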

図６に戻り、情報入力装置１００の文字認識手段１５０は、補正された各々の部分画像データを光学文字認識（ＯＣＲ）を用いてテキストデータ化する。ここで、光学文字認識（ＯＣＲ）処理については幅広く知られており、市販もされているため、本実施形態の装置では、既存技術のＯＣＲを搭載して使用する。このとき、汎用的なＯＣＲを利用したいため、上述の画像補正手段１４０で説明した処理を行い、文字幅を全角に統一するような正規化を行って、入力データの汎用化を図る。 Returning to FIG. 6, the character recognition means 150 of the information input device 100 converts each corrected partial image into text data using optical character recognition (OCR). OCR processing is widely known and commercially available, so the device of this embodiment incorporates an existing OCR engine. Because a general-purpose OCR engine is to be used, the processing described for the image correction means 140 normalizes the character width to full-width so that the input data suits such an engine.

続いて、姓名区分手段１６０は、文字認識手段１５０によってテキストデータ化された姓名について、姓名辞書１６１と姓名履歴１６２を用いて姓と名に区分する。 Next, the name classification means 160 uses the name dictionary 161 and the name history 162 to split the full name, converted to text data by the character recognition means 150, into a surname and a given name.

図９は、姓名区分手段１６０における姓名の区切りの指標を求めるフローチャートである。先ず、テキストデータ化された姓名の文字列を取得する（Ｓ３０１）。次に、取得した姓名の文字列から姓名の区切りとなる可能性がある位置の数を求める（Ｓ３０２）、例えば、「波多野理子」という姓名のユーザがいた場合、区切りとなる可能性がある位置は次の「／」で示される４か所である。 FIG. 9 is a flowchart for obtaining the split indices in the name classification means 160. First, the full-name character string converted to text data is acquired (S301). Next, the number of positions at which the string could be split into surname and given name is obtained (S302). For example, for a user whose full name is 波多野理子 (Hatano Riko), the possible split positions are the four places marked with "／" below.
波／多／野／理／子
ここで、この数「４」回だけループして（Ｓ３０３）、姓のパターンを４つ作成する（Ｓ３０４）。作成される姓のパターンは「波」、「波多」、「波多野」、「波多野理」の４種である。ここで、作成した姓のパターンの数だけループして（Ｓ３０５）、その姓のパターンが姓名辞書１６１に存在するかどうか姓名辞書に問い合わせる（Ｓ３０６、Ｓ３０７）。ここで姓が存在した場合、姓名辞書１６１の当該姓欄（Ｔ２１）に対応する指標（Ｔ２２）を取得し、区切り方の確からしさの指標値として保持する（Ｓ３０８）。続いて、この姓が姓名履歴１６２に存在するかどうか姓名辞書に問い合わせる（Ｓ３０９、Ｓ３１０）。ここで姓が存在した場合、姓名履歴１６２の当該姓欄（Ｔ２３）に対応する指標（Ｔ２４）を取得し、先の区切り方の確からしさの指標値に加算する（Ｓ３１１）。尚、姓名辞書１６１と姓名履歴１６２が保有する姓のレコードが同期している場合（常に同じ場合）、姓名辞書に問い合わせる（Ｓ３０９、Ｓ３１０）処理は不要で、姓名辞書の指標を保持した（Ｓ３０８）直後に姓名履歴の指標を加算する（Ｓ３１１）ようにしてもよい。こうして、それぞれの姓のパターンに対する区切り方の確からしさの指標を記憶して（Ｓ３１２）処理を終える。 The process loops this number of times, four (S303), and creates four surname patterns (S304): 波, 波多, 波多野, and 波多野理. It then loops over the created surname patterns (S305) and queries the name dictionary 161 for each pattern (S306, S307). If the surname exists, the index (T22) corresponding to the surname field (T21) of the name dictionary 161 is acquired and held as the index of how probable that split is (S308). Next, it checks whether this surname exists in the name history 162 (S309, S310). If the surname exists there, the index (T24) corresponding to the surname field (T23) of the name history 162 is acquired and added to the held index (S311). If the surname records held by the name dictionary 161 and the name history 162 are synchronized (always identical), the query of S309 and S310 is unnecessary, and the name history index may be added (S311) immediately after the name dictionary index is held (S308). The index of split probability for each surname pattern is thus stored (S312), and the process ends.
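The loop of FIG. 9 can be sketched in Python as follows. The function names are hypothetical, the dictionary 161 and history 162 are modeled as plain mappings from surname to index, and the index values in the usage note are invented for illustration only. Giving an index of 0 to a pattern found in neither table is an assumption; the text does not say what is stored in that case.

```python
def surname_candidates(full_name):
    """S302-S304: one candidate surname per possible split position."""
    return [full_name[:i] for i in range(1, len(full_name))]


def score_splits(full_name, dictionary, history):
    """S305-S312: index of split probability for every candidate surname.

    dictionary and history map a surname to its index (T21/T22 and T23/T24).
    Assumption: a surname missing from a table contributes 0.
    """
    scores = {}
    for surname in surname_candidates(full_name):
        score = dictionary.get(surname, 0)   # S306-S308: dictionary index
        score += history.get(surname, 0)     # S309-S311: add history index
        scores[surname] = score              # S312: store per pattern
    return scores
```

With hypothetical tables `{"波多": 3, "波多野": 5}` and `{"波多野": 2}`, the name 波多野理子 yields indices 0, 3, 7 and 0 for 波, 波多, 波多野 and 波多野理, so the split after 波多野 ranks highest.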

図６に戻り、情報入力装置１００の区分決定手段１７０は、こうして作成した姓のパターンに対する区切り方の確からしさの指標をもとに、テキストデータ化された姓名の文字列を修飾して情報入力装置１００の表示部１０５に表示し、ユーザによる姓名の姓と名の区切りを示す入力を情報入力装置１００の入力部１０４から検知して、テキストデータを姓のテキストデータと名のテキストデータに区分する。尚、この例ではタッチパネルディスプレイを搭載したタブレットＰＣを想定しているため、入力部１０４と表示部１０５は一つの重畳されたデバイスとなっている。 Returning to FIG. 6, the classification determination means 170 of the information input device 100 decorates the full-name character string converted to text data, based on the split-probability indices created for the surname patterns, and displays it on the display unit 105 of the information input device 100. It then detects, from the input unit 104, the user's input indicating the boundary between surname and given name, and divides the text data into surname text data and given-name text data. Because this example assumes a tablet PC with a touch panel display, the input unit 104 and the display unit 105 form a single overlaid device.

図１０は、本実施形態における区分決定手段１７０が情報入力装置１００の表示部１０５に表示する、姓名のテキストの例である。図１０（ａ）は、姓名のテキストを、区切り方の確からしさの指標に関係なくそのまま表示した例である。ユーザは、例えば、このテキストの文字と文字の間を、指で上から下に切るジェスチャによって姓と名に区切ることにより姓と名の区切りを決定する。指で上から下に切るジェスチャは、情報入力装置１００の入力部１０４であるタッチパネルで検知すればよい。図１０（ｂ）は、姓名のテキストを、区切り方の確からしさの指標に基づいて、姓と名の間隔を調整（修飾）して表示した例である。この例では区切り方の確からしさが高い文字間が広く表示されており（波多野理子より波多野理子の方が確からしい）、ジェスチャによる操作がやりやすくなっている。さらに、図１０（ｃ）や図１０（ｄ）のように、区切り方の確からしさの指標に基づいて、姓と名の間に記号やマークのような修飾を施し、ジェスチャによる操作をやりやすくするようにしてもよい。 FIG. 10 shows examples of the full-name text that the classification determination means 170 of this embodiment displays on the display unit 105 of the information input device 100. FIG. 10(a) is an example in which the full-name text is displayed as is, regardless of the split-probability indices. The user determines the boundary between surname and given name by, for example, making a gesture of slicing downward with a finger between two characters of the text. The downward slicing gesture can be detected by the touch panel that serves as the input unit 104 of the information input device 100. FIG. 10(b) is an example in which the character spacing of the full-name text is adjusted (decorated) according to the split-probability indices. Here, gaps between characters where a split is more probable are displayed wider (one split of 波多野理子 is shown as more probable than the other), which makes the gesture easier to perform. Furthermore, as in FIG. 10(c) and FIG. 10(d), a decoration such as a symbol or mark may be placed between surname and given name based on the split-probability indices to make the gesture easier.
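The decorations of FIG. 10 can be approximated on plain text as in the Python sketch below. The function names, the `max_gap` cap, the "｜" marker, and the use of the index value directly as a number of spaces are all assumptions made for illustration. A real device would adjust rendered spacing on screen, not insert spaces. The `scores` mapping is keyed by candidate surname, each of which is assumed to be a prefix of the full name, and holds integer indices.

```python
def spaced(full_name, scores, max_gap=3):
    """FIG. 10(b) style: wider gaps where a split is more probable."""
    parts = []
    for i, ch in enumerate(full_name, start=1):
        parts.append(ch)
        if i < len(full_name):
            # Gap after the i-th character grows with the index of that split.
            gap = min(scores.get(full_name[:i], 0), max_gap)
            parts.append(" " * gap)
    return "".join(parts)


def marked(full_name, scores, marker="｜"):
    """FIG. 10(c)/(d) style: a mark at the most probable split."""
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] <= 0:
        return full_name  # nothing to suggest; show the text unchanged
    return full_name[:len(best)] + marker + full_name[len(best):]


def split_at(full_name, position):
    """Apply the boundary chosen by the user (for example by a slicing gesture)."""
    return full_name[:position], full_name[position:]
```

The decoration only guides the user. The split itself is still fixed by the position the user indicates, which `split_at` models.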

図６に戻り、区分決定手段１７０は、こうして区分が決定された姓と名のテキストに加え、住所や生年月日、有効期間などといった、レイアウト情報によって部分画像データに分割されて文字認識された様々なテキストデータを情報入力装置１００の表示部１０５に表示するようにしてもよい。 Returning to FIG. 6, in addition to the surname and given-name text whose split has been determined in this way, the classification determination means 170 may display on the display unit 105 of the information input device 100 the various other text data, such as address, date of birth, and validity period, obtained by dividing the image into partial image data according to the layout information and performing character recognition.

次に、情報入力装置１００のデータ保存手段１８０は、上述の区分された姓と名のテキスト、住所、生年月日、有効期間などといった、レイアウト情報によって部分画像データに分割されて文字認識された様々なテキストデータを記憶し、個人情報ＤＢ（データベース）１８１として記憶する。 Next, the data storage means 180 of the information input device 100 stores the split surname and given-name text described above, together with the various text data such as address, date of birth, and validity period obtained by dividing the image into partial image data according to the layout information and performing character recognition, as a personal information DB (database) 181.
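As a rough illustration of the storage step, the Python sketch below writes one recognized record to an SQLite table. The source does not specify any schema or database product. The table name, column names, and the `save_record` helper are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: one column per recognized field named in the text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS personal_info (
    surname TEXT, given_name TEXT, address TEXT,
    birth_date TEXT, valid_until TEXT
)
"""


def save_record(conn, record):
    """Store one set of recognized text fields (keys match the columns)."""
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO personal_info VALUES "
        "(:surname, :given_name, :address, :birth_date, :valid_until)",
        record,
    )
    conn.commit()
```

A caller would pass a connection such as `sqlite3.connect(":memory:")` and a dict holding the five fields.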

また、情報入力装置１００は、前記個人情報ＤＢ１８１に記録されている情報を、通信部１０６を経由して、ネットワーク５００を介して、外部の企業の会員サーバ３００などに情報を送信することもできる。 The information input device 100 can also transmit information recorded in the personal information DB 181 to the member server 300 of an external company via the communication unit 106 and the network 500. .

本実施形態によれば、ユーザが所持する個人認証用の媒体に記載されている個人情報を撮影し、その画像データを光学文字認識（ＯＣＲ）によって、正確に記載データを抽出するとともに、姓と名の間に明瞭な区切り（空白）などがない場合でも、容易な操作で正確に姓と名を区分することができる情報入力装置が提供される。この装置はポータブルで設置場所を選ばず、狭いイベント会場であっても使用することができ、また、個人認証用の媒体から文字認識した生年月日から現在の年齢を算定することにより、年齢認証をその場で完了することができる。 According to this embodiment, an information input device is provided that photographs the personal information printed on a personal authentication medium carried by the user, accurately extracts the printed data from the image by optical character recognition (OCR), and allows the surname and given name to be separated accurately with a simple operation even when there is no clear delimiter (such as a space) between them. The device is portable, can be set up anywhere, and can be used even in a cramped event venue. In addition, by calculating the current age from the date of birth recognized from the personal authentication medium, age verification can be completed on the spot.
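The age calculation mentioned above is simple date arithmetic. A minimal Python sketch follows. The function names are hypothetical, and the minimum age is left as a parameter because the applicable limit depends on the product and jurisdiction. Parsing the recognized date text (including Japanese era notation, if present) is outside the sketch.

```python
from datetime import date


def age_on(birth, today):
    """Full years elapsed between the date of birth and the given day."""
    years = today.year - birth.year
    # Subtract one year if this year's birthday has not yet arrived.
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1
    return years


def passes_age_check(birth, today, minimum_age):
    """True if the person has reached minimum_age on the given day."""
    return age_on(birth, today) >= minimum_age
```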

本発明は、上述の実施形態に限定されることなく、特許請求の範囲内で種々変更、応用が可能である。例えば、情報入力装置１００は、本実施形態で示したようなタブレットＰＣでなく、通常のＰＣ等の装置であってもよい。この場合、区分決定手段１７０における、指で上から下に切るジェスチャなどの指示の入力をマウスで行うようにしても同様に本発明の実施をすることが可能である。 The present invention is not limited to the embodiment described above, and various modifications and applications are possible within the scope of the claims. For example, the information input device 100 need not be a tablet PC as in this embodiment and may be an ordinary PC or similar device. In that case, the invention can be practiced in the same way if instructions to the classification determination means 170, such as the downward slicing gesture, are entered with a mouse instead.

DESCRIPTION OF SYMBOLS
１００ 情報入力装置 100 Information input device
１１０ 制御手段 110 Control means
１２０ 撮影手段 120 Photographing means
１３０ 画像分割手段 130 Image dividing means
１３１ レイアウト情報 131 Layout information
１４０ 画像補正手段 140 Image correction means
１５０ 文字認識手段 150 Character recognition means
１６０ 姓名区分手段 160 Name classification means
１６１ 姓名辞書 161 Name dictionary
１６２ 姓名履歴 162 Name history
１７０ 区分決定手段 170 Classification determination means
１８０ データ保存手段 180 Data storage means
１８１ 個人情報ＤＢ 181 Personal information DB
２００ 個人認証用の媒体 200 Personal authentication medium
３００ 企業の会員サーバ 300 Corporate member server
５００ ネットワーク 500 Network

Claims

Photographing means for photographing a public identification card and storing it as image data;
Image dividing means for dividing the stored image data into a plurality of partial image data using layout information determined for each type of the public identification card;
Image correction means for enlarging or reducing the partial image data to a predetermined size;
Character recognition means for converting each of the partial image data scaled to the predetermined size into text data using optical character recognition and holding it together with attributes obtained from the layout information;
Name classification means for splitting the character string obtained by converting the partial image data relating to the full name into text data;
Classification determination means for detecting an input indicating the boundary between the surname and the given name of the full name and determining the split of the text data into surname text data and given-name text data;
Data storage means for storing each item of text data,
the information input device being characterized in that the image correction means determines the character width by comparing the number of characters of the full name contained in the partial image data relating to the full name with the width of one character in the partial image data relating to the address, and enlarges or reduces the partial image data so that they have the same character width.

Photographing means for photographing a public identification card and storing it as image data;
Image dividing means for dividing the stored image data into a plurality of partial image data using layout information determined for each type of the public identification card;
Image correction means for enlarging or reducing the partial image data to a predetermined size;
Character recognition means for converting each of the partial image data scaled to the predetermined size into text data using optical character recognition and holding it together with attributes obtained from the layout information;
Name classification means for splitting the character string obtained by converting the partial image data relating to the full name into text data;
Classification determination means for detecting an input indicating the boundary between the surname and the given name of the full name and determining the split of the text data into surname text data and given-name text data;
Data storage means for storing each item of text data,
the information input device being characterized in that the name classification means splits the character string obtained by converting the partial image data relating to the full name into text data into one or more combinations of surname and given name and obtains an index of the probability of each split,
and the classification determination means decorates the character string based on these indices and displays it on a display unit.

The information input device according to claim 2, wherein the image correction means determines the character width by comparing the number of characters of the full name contained in the partial image data relating to the full name with the width of one character in the partial image data relating to the address, and enlarges or reduces the partial image data so that they have the same character width.

A computer program for causing a computer to function as the information input device according to any one of claims 1 to 3.