JP2010009440A

JP2010009440A - Character recognition program, character recognition apparatus, and character recognition method

Info

Publication number: JP2010009440A
Application number: JP2008169844A
Authority: JP
Inventors: Kenichi Hirooka; 健一廣岡; Minoru Fukuda; 稔福田
Original assignee: Fujitsu Frontech Ltd
Current assignee: Fujitsu Frontech Ltd
Priority date: 2008-06-30
Filing date: 2008-06-30
Publication date: 2010-01-14
Anticipated expiration: 2028-06-30
Also published as: JP5107157B2

Abstract

PROBLEM TO BE SOLVED: To enable a character string to be accurately recognized by low-load processing. SOLUTION: Character estimating means "1d" estimates each character of the character string based on image information, and outputs one or more candidate characters for each character within the character string as candidates of the estimation result. Candidate character string producing means "1e" sequentially produces one or more candidate character strings as candidates of the character string by extracting, one by one in descending order of frequency of appearance, the candidate characters output by the character estimating means "1d" for each character within the character string and combining them, based on weighted information in which the frequency of appearance of the characters included in word registration information, in which a plurality of words are registered, is associated with each character. Character string specifying means "1f" checks the candidate character strings against the words within the word registration information in the order in which they were produced, and specifies the word corresponding to the character string from the check result. COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a character recognition program, a character recognition device, and a character recognition method, and more particularly to a character recognition program, a character recognition device, and a character recognition method for identifying a character string included in image information.

Conventionally, character recognition systems have been used that read a character string of one or more characters entered on a form or the like as image information and recognize the entered character string from that image information. Such a system has, for example, an image reading device and a computer connected to the image reading device. The image reading device, for example an image scanner, reads image information from the form or the like. The computer acquires the image information from the image reading device and recognizes the character string image corresponding to the character string included in the image information. The computer then identifies the character string corresponding to the character string image as a predetermined character code string that corresponds to the character string and can be processed by the computer.

Application software that implements such processing on a computer is called optical character recognition (OCR) software. The entire apparatus that implements such a system is also sometimes called an optical character reader (OCR). In the following, "OCR" refers to the former, that is, OCR software.

In a character recognition system, the character string image to be recognized contains variation caused by, for example, differences in the handwriting of the person who filled in the form. This variation lowers the accuracy with which the character code string is identified, and in some cases the character code string cannot be uniquely identified. Improving the identification accuracy of the character code string is therefore a challenge for character recognition systems. High identification accuracy here means that the candidates can be accurately narrowed down to a small number.

To address this problem, a technique is known that holds appearance frequency information for character patterns (features of character shapes) within a predetermined range of the image information and narrows down the recognition result of the character string image based on that frequency information (see, for example, Patent Document 1). A technique is also known in which the character code strings that can be recognized as a character string image are limited in advance, and any other character code string that is obtained is excluded from the identification result (see, for example, Patent Document 2). Furthermore, a technique is known that holds the frequency with which each obtained character code string has been acquired up to the previous time and narrows down the identification result based on that acquisition frequency (see, for example, Patent Document 3).
Patent Document 1: Japanese Patent Laid-Open No. 5-298489
Patent Document 2: Japanese Patent Laid-Open No. 6-096287
Patent Document 3: Japanese Patent Laid-Open No. 8-016730

However, in the methods described in Patent Documents 1 and 3, the frequency information must be updated every time character string recognition is executed. This places an extra load on the computer while the character string recognition process is running.

In addition, the method described in Patent Document 2 has the problem that, when a large number of character strings may be used, it is difficult to narrow them down accurately to a small number of candidates.
The present invention has been made in view of these points, and an object thereof is to provide a character recognition program, a character recognition device, and a character recognition method capable of accurately recognizing a character string with low-load processing.

To solve the above problems, a character recognition program for recognizing a character string included in image information is provided. A computer that executes this character recognition program functions as character estimation means, candidate character string generation means, and character string identification means. The character estimation means estimates each character of the character string based on the image information, and outputs, for each character in the character string, one or more candidate characters as candidates of the estimation result. The candidate character string generation means uses weighting information, in which the number of appearances of each character included in word registration information (in which a plurality of words are registered) is associated with that character. Based on this weighting information, it extracts the candidate characters output by the character estimation means for each character in the character string one by one, in descending order of the number of appearances, and combines them, thereby sequentially generating one or more candidate character strings as candidates for the character string. The character string identification means checks the candidate character strings against the words in the word registration information in the order in which they were generated, and identifies the word corresponding to the character string from the result of the check.

According to such a character recognition program, the character estimation means estimates each character of the character string based on the image information, and one or more candidate characters are output for each character in the character string as candidates of the estimation result. Next, based on the weighting information in which the number of appearances of each character included in the word registration information is associated with that character, the candidate character string generation means extracts the candidate characters for each character in the character string one by one, in descending order of the number of appearances, and combines them, so that one or more candidate character strings are sequentially generated. The character string identification means then checks the candidate character strings against the words in the word registration information in the order of generation, and the word corresponding to the character string is identified from the result of the check.
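The flow described above can be sketched as follows. This is only a minimal illustration, not the claimed implementation: the function names are hypothetical, the weighting information is assumed to be a plain dictionary from character to appearance count, the check against the word registration information is assumed to be an exact match, and ordering whole candidate strings by the sum of their character counts is an assumption, since the text here only says that candidate characters are taken in descending order of appearance count.

```python
from itertools import product


def generate_candidate_strings(candidates_per_position, weights):
    """Yield candidate strings, those built from frequent characters first.

    candidates_per_position: one list of candidate characters per character
    position, as output by the character estimation step.
    weights: dict mapping a character to its appearance count (assumed form).
    """
    # Within each position, put the most frequent candidate character first.
    ranked = [
        sorted(cands, key=lambda ch: -weights.get(ch, 0))
        for cands in candidates_per_position
    ]
    # Assumption: whole strings are ordered by the sum of their character counts.
    # Sorting all combinations is simple but not lazy; fine for a small sketch.
    combos = sorted(
        product(*ranked),
        key=lambda combo: -sum(weights.get(ch, 0) for ch in combo),
    )
    for combo in combos:
        yield "".join(combo)


def specify_word(candidates_per_position, weights, registered_words):
    """Return the first generated candidate that is a registered word, else None."""
    registered = set(registered_words)
    for candidate in generate_candidate_strings(candidates_per_position, weights):
        if candidate in registered:
            return candidate  # stop early: lower-priority candidates are skipped
    return None
```

Because the loop returns on the first hit, candidate strings with lower priority are never checked, which is where the reduction in processing load would come from in this sketch.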

To solve the above problems, a character recognition program for recognizing a first character string and a second character string included in image information is also provided. A computer that executes this character recognition program functions as character estimation means, first candidate character string generation means, first character string identification means, second candidate character string generation means, and second character string identification means. The character estimation means estimates each character of the first character string and the second character string based on the image information, and outputs, as candidates of the estimation result, one or more first candidate characters corresponding to each character of the first character string and one or more second candidate characters corresponding to each character of the second character string.
The first candidate character string generation means uses first weighting information, in which the number of appearances of each character included in first word registration information (in which a plurality of words are registered) is associated with that character. Based on this first weighting information, it extracts the first candidate characters output by the character estimation means for each character in the first character string one by one, in descending order of the number of appearances in the first word registration information, and combines them, thereby sequentially generating one or more first candidate character strings as candidates for the first character string. The first character string identification means checks the first candidate character strings against the words in the first word registration information in the order of generation, and from the result of the check selects and outputs a plurality of first candidate words, each indicating a word estimated to match the first character string. The second candidate character string generation means selects, for each first candidate word, the second word registration information identified from that first candidate word from among a plurality of pieces of second word registration information, in each of which a plurality of words are registered. Based on a plurality of pieces of second weighting information, in which the number of appearances in the words included in each selected piece of second word registration information is associated with each character, it combines the second candidate characters output by the character estimation means for each character in the second character string one by one, in descending order of the number of appearances in the second word registration information, thereby generating, in order and for each first candidate word, one or more second candidate character strings as candidates for the second character string.
The second character string identification means checks the second candidate character strings, in the order of generation, against the words in the corresponding second word registration information. From the result of the check, it selects from one of the pieces of second word registration information, and outputs, a second candidate word indicating a word estimated to match the second character string, and it confirms the first candidate word corresponding to that second candidate word as the word that matches the first character string.

According to a computer that executes such a character recognition program, the character estimation means estimates each character of the first character string and the second character string based on the image information, and one or more first candidate characters corresponding to each character of the first character string and one or more second candidate characters corresponding to each character of the second character string are output as candidates of the estimation result. Next, based on the first weighting information, the first candidate character string generation means extracts the first candidate characters for each character in the first character string one by one, in descending order of the number of appearances in the first word registration information, and combines them, so that one or more first candidate character strings are sequentially generated.
The first character string identification means then checks the first candidate character strings against the words in the first word registration information in the order of generation, and a plurality of first candidate words, each indicating a word estimated to match the first character string, are selected and output. Further, the second candidate character string generation means selects, for each first candidate word, the second word registration information identified from that first candidate word, and, based on the corresponding second weighting information, combines the second candidate characters for each character in the second character string one by one, in descending order of the number of appearances in the second word registration information, so that one or more second candidate character strings are generated in order for each first candidate word. The second character string identification means then checks the second candidate character strings, in the order of generation, against the words in the corresponding second word registration information. From the result of the check, a second candidate word indicating a word estimated to match the second character string is selected from one of the pieces of second word registration information and output, and the first candidate word corresponding to that second candidate word is confirmed as the word that matches the first character string.
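The two-stage flow for a first and a second character string can be sketched as follows. This is a hedged illustration only: the function names, the dictionary layout of the second word registration information (keyed by first candidate word), the use of exact matching, and the ordering of whole strings by summed appearance counts are all assumptions made for the example.

```python
from collections import Counter
from itertools import product


def ranked_strings(candidates_per_position, registered_words):
    """List candidate strings, ordered by summed character appearance counts.

    The counts are taken from the given registered words (assumed weighting).
    """
    weights = Counter(ch for word in registered_words for ch in word)
    combos = sorted(
        product(*candidates_per_position),
        key=lambda combo: -sum(weights[ch] for ch in combo),
    )
    return ["".join(combo) for combo in combos]


def recognize_pair(first_cands, second_cands, first_words, second_tables):
    """Return (first_word, second_word), or None if no pair is found.

    second_tables: hypothetical dict mapping each first word to the list of
    second words registered for it.
    """
    registered_first = set(first_words)
    # Stage 1: keep every generated string that is a registered first word.
    first_candidates = [
        s for s in ranked_strings(first_cands, first_words) if s in registered_first
    ]
    # Stage 2: for each first candidate word, use only its own second word table.
    for first in first_candidates:
        second_words = second_tables.get(first, [])
        registered_second = set(second_words)
        for second in ranked_strings(second_cands, second_words):
            if second in registered_second:
                # The matched second word also confirms the first word.
                return first, second
    return None
```

In this sketch an ambiguous first string is resolved by the second one: a first candidate word is confirmed only when the second string matches a word registered under it.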

To solve the above problems, a character recognition device and a character recognition method that perform the same processing as the computer executing the above character recognition program are also provided.

According to the above character recognition program, character recognition device, and character recognition method, a character string can be accurately recognized with low-load processing.

Hereinafter, the present embodiment will be described in detail with reference to the drawings.
FIG. 1 is a diagram showing an outline of a character recognition system. This character recognition system includes a computer 1 and an image information capturing device 2. The computer 1 and the image information capturing device 2 are connected by a predetermined interface and can exchange data with each other. The computer 1 acquires image information from the image information capturing device 2. The computer 1 then identifies a character string of one or more characters included in the image information as the character code string corresponding to that character string. In the following description, identifying a character string and identifying a character code string are treated as synonymous. The computer 1 executes a character recognition program that performs this character string identification processing. By executing the character recognition program, the computer 1 functions as word registration information storage means 1a, weighting information storage means 1b, image information input means 1c, character estimation means 1d, candidate character string generation means 1e, and character string identification means 1f.

The word registration information storage means 1a stores word registration information in which a plurality of predetermined words are registered.
The weighting information storage means 1b stores weighting information in which the number of appearances of each character included in the word registration information stored in the word registration information storage means 1a is associated with that character.
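As a small illustration of what such weighting information could look like, the sketch below counts how often each character appears across the registered words. The function name and the choice of a `Counter` as the storage form are assumptions for the example; the text does not specify a data format.

```python
from collections import Counter


def build_weighting_info(registered_words):
    """Map each character to the number of times it appears in the registered words."""
    # Every occurrence is counted, including repeats within a single word.
    return Counter(ch for word in registered_words for ch in word)


# Hypothetical word registration information.
weighting_info = build_weighting_info(["tokyo", "kyoto", "osaka"])
```

Because the counts depend only on the registered words, they can be computed ahead of time rather than during each recognition run.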

The image information input means 1c outputs the image information acquired from the image information capturing device 2 to the character estimation means 1d.
On acquiring the image information from the image information input means 1c, the character estimation means 1d estimates the plurality of characters included in the image information and generates a plurality of candidate characters for each of the estimated characters. The character estimation means 1d then outputs the generated candidate characters to the candidate character string generation means 1e.

The candidate character string generation means 1e generates a plurality of candidate character strings for the character string included in the image information, based on the weighting information stored in the weighting information storage means 1b and the plurality of candidate characters generated by the character estimation means 1d. In doing so, the candidate character string generation means 1e assigns a priority to each of the candidate character strings based on the weighting information. The candidate character string generation means 1e then outputs the generated candidate character strings to the character string identification means 1f.

The character string identification means 1f identifies the character string included in the image information from among the plurality of candidate character strings generated by the candidate character string generation means 1e, based on the word registration information stored in the word registration information storage means 1a. One possible identification method is based on the degree of matching (match rate) between each candidate character string and each of the words included in the word registration information. In this case, the character string identification means 1f evaluates the match rate in descending order of the priority described above.
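A match-rate check evaluated in priority order might look like the sketch below. The text does not define how the match rate is computed, so the definition used here (the fraction of positions holding the same character, relative to the longer string), the acceptance threshold, and the function names are all assumptions for illustration.

```python
def match_rate(candidate, word):
    """Fraction of positions where both strings have the same character (assumed definition)."""
    longest = max(len(candidate), len(word))
    if longest == 0:
        return 1.0
    same = sum(a == b for a, b in zip(candidate, word))
    return same / longest


def specify_by_match_rate(candidates_in_priority_order, registered_words, threshold=0.75):
    """Return (word, rate) for the first pair reaching the threshold.

    Candidates are evaluated highest priority first; if nothing reaches the
    threshold, the best pair seen so far is returned instead.
    """
    best_word, best_rate = None, 0.0
    for candidate in candidates_in_priority_order:
        for word in registered_words:
            rate = match_rate(candidate, word)
            if rate >= threshold:
                return word, rate  # remaining lower-priority candidates are skipped
            if rate > best_rate:
                best_word, best_rate = word, rate
    return best_word, best_rate
```

With a threshold below 1.0, a candidate string that differs from a registered word in one character can still identify that word, which is one way to tolerate a misrecognized character.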

The image information capturing device 2 captures, as image information, the contents of a sheet of paper on which a character string has been entered. The image information capturing device 2 outputs the captured image information to the computer 1. The image information capturing device 2 is, for example, an image scanner (hereinafter simply referred to as a scanner).

According to such a character recognition system, the number of appearances of each character included in predetermined character string information is associated with that character in advance as a weighting value and held as weighting information. Based on this weighting information, candidate character strings are generated with priorities assigned to them. The character string included in the image information is then identified based on the generated candidate character strings, the priorities assigned to them, and the word registration information.

Assigning a weight to each character in advance based on the word registration information in this way improves the accuracy of character identification. In addition, performing processing such as the match rate evaluation of the candidate character strings in descending order of priority allows the character string to be identified in a short time. Furthermore, because the match rate evaluation can be omitted for candidate character strings with low priority, the load of the recognition processing is reduced. In other words, a character string can be accurately recognized with low-load processing.

The character recognition system shown in FIG. 1 is useful, for example, when a financial institution identifies a character string entered on a paper form and performs business processing based on it. The embodiment is therefore described in detail below with reference to the drawings, taking as an example a case in which such a character recognition system is applied to banking operations.

FIG. 2 is a diagram illustrating the hardware configuration of the computer according to the present embodiment. The computer 100 executes the character recognition program of the present embodiment and identifies the name of a financial institution included in the image information of a predetermined form. The computer 100 as a whole is controlled by a CPU (Central Processing Unit) 101. A RAM (Random Access Memory) 102, an HDD (Hard Disk Drive) 103, a graphic processing device 104, input interfaces 105 and 106, and a communication interface 107 are connected to the CPU 101 via a bus 108.

The RAM 102 temporarily stores at least part of the OS (Operating System) program and the application software (hereinafter referred to as application) programs to be executed by the CPU 101. The RAM 102 also stores various data necessary for processing by the CPU 101.

The HDD 103 is a disk device for storing data. The HDD 103 stores the OS program and application programs. The HDD 103 also stores various data necessary for processing by the CPU 101.

A monitor 11 is connected to the graphic processing device 104. The graphic processing device 104 displays images on the screen of the monitor 11 in accordance with commands from the CPU 101.
The input interfaces 105 and 106 are interfaces that accept data input from external devices. A keyboard 12 and a mouse 13 are connected to the input interface 105. The input interface 105 transmits signals sent from the keyboard 12 and the mouse 13 to the CPU 101 via the bus 108. A scanner 14 is connected to the input interface 106. The input interface 106 transmits signals corresponding to the image information of a predetermined form sent from the scanner 14 to the CPU 101 via the bus 108. The input interface 106 may also use its DMA (Direct Memory Access) function to store the acquired image information directly in the RAM 102 via the bus 108.

The communication interface 107 is connected to a network 10. The communication interface 107 transmits and receives data to and from other information processing apparatuses via the network 10.

FIG. 3 is a block diagram illustrating the functions of the computer according to the present embodiment. The computer 100 includes a character code storage unit 110, a financial institution dictionary storage unit 120, a weighting information storage unit 130, an update information input unit 140, a weighting processing unit 145, an image information input unit 150, a character identification unit 160, a candidate character exclusion unit 170, a candidate name generation unit 180, and a name identification unit 190.

The character code storage unit 110 stores a character code correspondence table in which the characters usable on the computer 100 are associated with character codes.
The financial institution dictionary storage unit 120 stores a financial institution name table in which financial institution names are registered. The financial institution dictionary storage unit 120 also stores a group of branch name tables in which the branch names of each financial institution are registered in association with that financial institution's name. The information comprising the financial institution name table and the group of branch name tables is referred to as the financial institution dictionary.

The weighting information storage unit 130 stores a financial institution name weighting table in which the number of appearances of each character used in the financial institution names included in the financial institution dictionary is associated with the corresponding character code. The weighting information storage unit 130 also stores, in association with each financial institution name, a branch name weighting table in which the number of appearances of each character used in that financial institution's branch names is associated with the corresponding character code.

The update information input unit 140 acquires update information for the financial institution dictionary. The update information includes changes to financial institution names and to the branch names of each financial institution. The update information is, for example, entered periodically by an operator or distributed periodically via a network or the like. On acquiring update information, the update information input unit 140 updates the financial institution dictionary stored in the financial institution dictionary storage unit 120 accordingly.

When the weighting processing unit 145 detects that the financial institution dictionary stored in the financial institution dictionary storage unit 120 has been updated, it calculates the number of occurrences of each character contained in the financial institution name table and generates the financial institution name weighting table by associating each count with its character. The weighting processing unit 145 stores the generated financial institution name weighting table in the weighting information storage unit 130.

In addition, based on an instruction from the name specifying unit 190, the weighting processing unit 145 refers to the branch name table stored in the financial institution dictionary storage unit 120, calculates the number of occurrences of each character contained in that branch name table, and generates a branch name weighting table by associating each count with its character. The weighting processing unit 145 stores the generated branch name weighting table in the weighting information storage unit 130.
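The counting performed by the weighting processing unit 145 can be illustrated with a minimal sketch. The function name `build_weighting_table` and the sample names are hypothetical, and characters are used as keys instead of character codes purely for readability.

```python
from collections import Counter


def build_weighting_table(names):
    """Count how often each character occurs across all registered names."""
    counts = Counter()
    for name in names:
        # Every occurrence of a character in a name adds 1 to its weight.
        counts.update(name)
    return counts


# Hypothetical contents of a financial institution name table.
institution_names = ["東京ＡＢＣ銀行", "東西銀行", "南北信用金庫"]
weights = build_weighting_table(institution_names)

# A Counter reports 0 for characters that never occur, which matches
# the "weighting value 0" case used later for candidate exclusion.
print(weights["東"], weights["海"])
```

The same function would apply unchanged to a branch name table, yielding a branch name weighting table for one financial institution.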

The image information input unit 150 outputs the image information acquired from the scanner 14 to the character identification unit 160.

On acquiring image information from the image information input unit 150, the character identification unit 160 extracts the character images contained in it and performs predetermined image identification processing on them. The extracted character images include the financial institution name and the branch name of that financial institution. The type of name (financial institution name or branch name) is distinguished, for example, by the region of the image in which it was written. The character identification unit 160 then acquires the character code corresponding to the shape of each character based on the character code correspondence table stored in the character code storage unit 110.

The result of identifying a character image is seldom uniquely determined, owing to differences in the handwriting of the characters entered on the form and similar factors. The character identification unit 160 therefore specifies a plurality of candidate characters, in descending order of likelihood, based on the likelihood (plausibility) of each identification result. In other words, it acquires a plurality of candidate character codes corresponding to the candidate characters, starting with the most probable. It outputs the acquired candidate character codes to the candidate character exclusion unit 170, distinguishing those for the financial institution name from those for the branch name. In the following, the term "candidate character" refers to the candidate character code corresponding to that candidate character.
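As an illustration of ordering candidates by likelihood, the following sketch assumes the recognizer yields a likelihood score per character for one character image. The function name `top_candidates`, the cutoff `k`, and the scores are hypothetical; the text does not state how many candidates are kept.

```python
def top_candidates(likelihoods, k=3):
    """Return up to k candidate characters, most likely first."""
    ranked = sorted(likelihoods.items(), key=lambda item: item[1], reverse=True)
    return [char for char, _ in ranked[:k]]


# Hypothetical likelihoods for a single handwritten character image.
scores = {"束": 0.21, "東": 0.62, "乗": 0.06, "車": 0.11}
candidates = top_candidates(scores)
print(candidates)
```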

The candidate character exclusion unit 170 acquires the candidate characters for the financial institution name and the candidate characters for the branch name from the character identification unit 160. For the financial institution name, it excludes from the candidates any candidate character whose weighting value is 0 in the financial institution name weighting table stored in the weighting information storage unit 130. It outputs the candidate characters for the financial institution name that remain after this exclusion processing to the candidate name generation unit 180.

In addition, based on an instruction from the name specifying unit 190, the candidate character exclusion unit 170 excludes from the candidates for the branch name any candidate character whose weighting value is 0 in the branch name weighting table stored in the weighting information storage unit 130. It outputs the candidate characters for the branch name that remain after this exclusion processing to the candidate name generation unit 180.
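The exclusion of zero-weight candidates can be sketched as a simple filter. The names `exclude_unused`, `candidates_per_position`, and the sample weights are hypothetical, and characters stand in for character codes. The sketch does not address what happens when every candidate at a position is excluded, since the text above does not specify it.

```python
def exclude_unused(candidates_per_position, weights):
    """Drop candidates that never occur in the registered names (weight 0)."""
    return [
        [char for char in candidates if weights.get(char, 0) > 0]
        for candidates in candidates_per_position
    ]


# Hypothetical weighting table and per-position candidates.
weights = {"東": 5, "京": 3}
raw = [["東", "束"], ["京", "涼"]]
filtered = exclude_unused(raw, weights)
print(filtered)
```

Because a character with weight 0 cannot occur in any registered name, removing it early reduces the number of combinations that have to be generated and verified later.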

The candidate name generation unit 180 acquires, from the candidate character exclusion unit 170, the candidate characters for the financial institution name and for the branch name that remain after the exclusion processing. Using the acquired candidate characters for the financial institution name, it generates a plurality of candidate financial institution names based on the financial institution name weighting table stored in the weighting information storage unit 130. In doing so, it assigns a priority to each candidate financial institution name based on the weighting values in that table. For example, a candidate financial institution name generated by combining candidate characters with large weighting values is given a higher priority than one generated by combining candidate characters with small weighting values. The candidate name generation unit 180 outputs the candidate financial institution names, generated with priorities assigned in this way, to the name specifying unit 190.

Similarly, using the candidate characters for the branch name acquired from the candidate character exclusion unit 170, the candidate name generation unit 180 generates a plurality of candidate branch names based on the branch name weighting table stored in the weighting information storage unit 130. In doing so, it assigns a priority to each candidate branch name based on the weighting values in the branch name weighting table. Priorities can be assigned in the same way as when generating candidate financial institution names. The candidate name generation unit 180 outputs the generated candidate branch names to the name specifying unit 190.
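One way to picture prioritized candidate generation is the sketch below. It assumes, as a simplification not stated in the text, that a candidate name's priority is the sum of the weighting values of its characters, and it enumerates all combinations eagerly. The function name and sample data are hypothetical.

```python
from itertools import product


def generate_candidate_names(candidates_per_position, weights):
    """Combine one candidate per position and order names by total weight."""
    scored = []
    for chars in product(*candidates_per_position):
        # Assumed scoring rule: sum of per-character weighting values.
        score = sum(weights.get(char, 0) for char in chars)
        scored.append(("".join(chars), score))
    # Stable sort: names with equal scores keep their generation order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored]


weights = {"東": 5, "車": 1, "京": 3, "涼": 1}
names = generate_candidate_names([["東", "車"], ["京", "涼"]], weights)
print(names)
```

With this ordering, the name built from the most frequent characters is verified first, so a match is likely to be found before the low-priority combinations are examined.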

The name specifying unit 190 acquires the plurality of candidate financial institution names from the candidate name generation unit 180. It determines whether a financial institution name corresponding to each acquired candidate exists in the financial institution name table stored in the financial institution dictionary storage unit 120, proceeding in order from the candidate with the highest assigned priority. As the determination method, for example, the two character strings are compared to calculate a match rate indicating the proportion of characters that are the same, and entries with a high match rate are preferentially adopted as the determination result.
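The match rate comparison can be sketched as follows. The text only says the rate is the proportion of identical characters, so the position-by-position comparison and the normalization by the longer string are assumptions, as are the names `match_rate` and `best_match`.

```python
def match_rate(candidate, registered):
    """Proportion of positions at which the two strings hold the same character."""
    longest = max(len(candidate), len(registered))
    if longest == 0:
        return 0.0
    same = sum(1 for a, b in zip(candidate, registered) if a == b)
    return same / longest


def best_match(candidates_by_priority, registered_names):
    """Return (registered name, rate) with the highest match rate.

    Candidates are examined in priority order, and the strict comparison
    means an earlier candidate wins when rates tie.
    """
    best_name, best_rate = None, 0.0
    for candidate in candidates_by_priority:
        for name in registered_names:
            rate = match_rate(candidate, name)
            if rate > best_rate:
                best_name, best_rate = name, rate
    return best_name, best_rate


registered = ["東京ＡＢＣ銀行", "大阪ＸＹＺ銀行"]  # hypothetical table contents
name, rate = best_match(["東京ＡＢＤ銀行"], registered)
print(name, rate)
```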

Thereafter, the name specifying unit 190 instructs the weighting processing unit 145 to generate the branch name weighting table corresponding to the financial institution name obtained as a result of the determination. When generation of the branch name weighting table is complete, the name specifying unit 190 instructs the candidate character exclusion unit 170 to process the candidate branch names. Then, on acquiring the candidate branch names from the candidate name generation unit 180, the name specifying unit 190 determines whether a branch name corresponding to each acquired candidate exists in the branch name table stored in the financial institution dictionary storage unit 120. As when specifying the financial institution name, it proceeds in order from the candidate branch name with the highest assigned priority, and it uses the same match rate calculation as the determination method. If the financial institution name was not uniquely specified in the earlier step, the name specifying unit 190 specifies the financial institution name based on the specified branch name.

FIG. 4 is a diagram illustrating the table stored in the character code storage unit. The character code storage unit 110 stores a character code correspondence table 111. The character code correspondence table 111 is information that associates each character contained in a character image with the character code corresponding to that character.

FIG. 5 is a diagram illustrating an example of the data structure of the character code correspondence table. The character code correspondence table 111 has a No. field, a character code field, and a character field. The pieces of information arranged in the same row across these fields together constitute the information for one character code.

The No. field holds the entry number. The character code field holds a code that the computer 100 can recognize. The character field holds the character associated with the character code.

For example, the character code correspondence table 111 contains an entry whose No. is “12306”, whose character code is “0x3012”, and whose character is “東”. This indicates that “0x3012” is defined as the character code of entry number “12306” and that the character corresponding to this code is “東”. That is, when the character identification unit 160 recognizes the character “東”, it acquires the corresponding character code “0x3012”.

Note that “null” indicates that no character is defined for the character code.

As the code system of the character code correspondence table 111, for example, Unicode or JIS (Japanese Industrial Standards) codes can be used.
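A character code correspondence table of this kind can be modeled as a mapping from code to character. The sketch below reuses the example codes given in this description (“0x3012” for “東” and “0x82A6” for “新”); the undefined entry at 0x3013 and the names `CODE_TABLE` and `code_for` are hypothetical.

```python
# Code -> character; None plays the role of "null" (no character defined).
CODE_TABLE = {
    0x3012: "東",  # in the example entry, No. 12306 equals 0x3012 in decimal
    0x3013: None,  # hypothetical undefined code
    0x82A6: "新",
}

# Reverse index used when a recognized character must be turned into a code.
CHAR_TO_CODE = {
    char: code for code, char in CODE_TABLE.items() if char is not None
}


def code_for(character):
    """Return the character code for a recognized character, or None."""
    return CHAR_TO_CODE.get(character)


print(hex(code_for("東")))
```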

FIG. 6 is a diagram illustrating the tables stored in the financial institution dictionary storage unit. The financial institution dictionary storage unit 120 stores a financial institution name table 121 and a branch name table group 122. The financial institution name table 121 is a table in which the names of the financial institutions used in the business processing system are registered. The branch name table group 122 is a set of tables in which the branch names associated with each financial institution are registered. The branch name table group 122 includes branch name tables 122a, 122b, and 122c, each of which registers the branch names of one financial institution registered in the financial institution name table 121, in association with that institution. The financial institution names and branch names are registered as character code strings.

FIG. 7 is a diagram illustrating an example of the data structure of the financial institution name table. The financial institution name table 121 has a No. field and a financial institution name field.

The No. field holds the entry number. The financial institution name field holds information indicating the name of the financial institution.

For example, the financial institution name table 121 contains an entry whose No. is “1” and whose financial institution name is “Tokyo ABC Bank” (東京ＡＢＣ銀行).

FIG. 8 is a diagram illustrating an example of the data structure of the branch name table. The branch name tables 122a, 122b, and 122c each have a No. field and a branch name field. The following describes the branch name table 122a, in which the branch names of the financial institution “Tokyo ABC Bank” are registered; the same applies to the branch name tables 122b and 122c.

The No. field holds the entry number. The branch name field holds information indicating the name of the branch.

For example, the branch name table 122a contains an entry whose No. is “1” and whose branch name is “Head Office” (本店). The branch names of the other financial institutions are registered in the branch name tables 122b and 122c in the same way.

FIG. 9 is a diagram illustrating the tables stored in the weighting information storage unit. The weighting information storage unit 130 stores a financial institution name weighting table 131 and a branch name weighting table group 132. The financial institution name weighting table 131 associates each character code with the number of times it occurs in the financial institution names registered in the financial institution name table 121. The branch name weighting table group 132 is a set of tables that associate each character code with the number of times it occurs in the branch names associated with each financial institution. The branch name weighting table group 132 includes branch name weighting tables 132a, 132b, and 132c, which associate each character code with the number of times it occurs in the branch names registered in the branch name tables 122a, 122b, and 122c, respectively.

The financial institution name weighting table 131 and the branch name weighting table group 132 need not be stored in the weighting information storage unit 130 at the same time. In the present embodiment, the weighting processing unit 145 generates the financial institution name weighting table 131 by performing the weighting processing only once, at system startup or when the financial institution name table is updated, and stores it in the weighting information storage unit 130. The weighting processing for branch names, by contrast, often imposes only a small processing load, so the weighting processing unit 145 performs it as needed and registers the result in the weighting information storage unit 130. This allows the memory resources available to the computer 100 to be used efficiently.

Alternatively, like the financial institution name weighting table 131, the branch name weighting table group 132 may be generated by performing the weighting processing only once, at system startup or when the financial institution name table 121 is updated, and then stored in the weighting information storage unit 130.
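The difference between building the financial institution name weighting table once and building branch name weighting tables only when needed can be sketched as below. The class `WeightingStore` and its methods are hypothetical, and keeping each generated branch table in a dictionary afterwards is an assumption; an implementation concerned with memory could discard it after use instead.

```python
from collections import Counter


class WeightingStore:
    """Holds one institution weighting table and branch tables made on demand."""

    def __init__(self, institution_names, branch_tables):
        self._branch_tables = branch_tables
        # Built once, e.g. at startup or after a dictionary update.
        self.institution_weights = self._count(institution_names)
        # Filled lazily, only for institutions that were actually identified.
        self._branch_weights = {}

    @staticmethod
    def _count(names):
        counts = Counter()
        for name in names:
            counts.update(name)
        return counts

    def branch_weights(self, institution):
        """Return the branch weighting table, generating it on first request."""
        if institution not in self._branch_weights:
            branches = self._branch_tables[institution]
            self._branch_weights[institution] = self._count(branches)
        return self._branch_weights[institution]


# Hypothetical dictionary contents.
branch_tables = {"東京ＡＢＣ銀行": ["本店", "新宿支店", "新橋支店"]}
store = WeightingStore(list(branch_tables), branch_tables)
table = store.branch_weights("東京ＡＢＣ銀行")
print(table["新"], table["店"])
```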

The branch name weighting table group 132 may contain a plurality of branch name weighting tables, as in FIG. 9, or only one.

FIG. 10 is a diagram illustrating an example of the data structure of the financial institution name weighting table. The financial institution name weighting table 131 has a No. field, a character code field, and a weighting value field. The pieces of information arranged in the same row across these fields together constitute the information for one character code.

The No. field holds the entry number. The character code field holds a character code usable by the computer 100. The weighting value field holds the number of times the corresponding character code occurs in the financial institution names registered in the financial institution name table 121.

For example, the financial institution name weighting table 131 contains an entry whose No. is “12306”, whose character code is “0x3012 (東)”, and whose weighting value is “5”. This indicates that the character code “0x3012 (東)” occurs five times in the financial institution names registered in the financial institution name table 121.

FIG. 11 is a diagram illustrating an example of the data structure of the branch name weighting table. The branch name weighting tables 132a, 132b, and 132c each have a No. field, a character code field, and a weighting value field. The following describes the branch name weighting table 132a, which corresponds to the branch names of the financial institution “Tokyo ABC Bank”; the same applies to the branch name weighting tables 132b and 132c. The pieces of information arranged in the same row across these fields together constitute the information for one character code.

The No. field holds the entry number. The character code field holds a character code usable by the computer 100. The weighting value field holds the number of times the corresponding character code occurs in the branch names registered in the branch name table 122a.

For example, the branch name weighting table 132a contains an entry whose No. is “33446”, whose character code is “0x82A6 (新)”, and whose weighting value is “4”. This indicates that the character code “0x82A6 (新)” occurs four times in the branch names registered in the branch name table 122a of “Tokyo ABC Bank”.

Next, the details of the processing executed by the computer 100 configured as described above are explained.

FIG. 12 is a flowchart showing the procedure of the weighting processing for financial institution names. The process shown in FIG. 12 is described below in order of step number.

[Step S11] The update information input unit 140 updates the financial institution name table 121 and the branch name table group 122 stored in the financial institution dictionary storage unit 120 based on update information acquired through operator input or through distribution over a network or the like.

[Step S12] When the weighting processing unit 145 detects that the financial institution name table 121 stored in the financial institution dictionary storage unit 120 has been updated, it calculates the number of occurrences of each character contained in the financial institution name table 121. Using each calculated count as a weighting value, it generates the financial institution name weighting table 131 in which the weighting value is associated with the corresponding character.

[Step S13] The weighting processing unit 145 stores the generated financial institution name weighting table 131 in the weighting information storage unit 130.

In this way, on acquiring update information, the computer 100 updates the financial institution name weighting table 131 based on the updated financial institution name table 121.

The weighting processing unit 145 may detect an update of the financial institution name table 121 by, for example, receiving a notification to that effect from the update information input unit 140 or monitoring the financial institution name table 121 at predetermined intervals.

This makes it possible to keep the financial institution name table 121 and the financial institution name weighting table 131 properly synchronized even when the financial institution dictionary is updated.

FIG. 13 is a flowchart showing the procedure for specifying the financial institution name and the branch name. The process shown in FIG. 13 is described below in order of step number.

[Step S21] The image information input unit 150 acquires image information corresponding to the form from the scanner 14 and outputs the acquired image information to the character identification unit 160.

[Step S22] The character identification unit 160 identifies the characters of the financial institution name and the characters of the branch name of that financial institution contained in the image information. Based on the identification results and the character code correspondence table 111 stored in the character code storage unit 110, it acquires candidate characters for the financial institution name and candidate characters for the branch name, and outputs them to the candidate character exclusion unit 170.

[Step S23] From the candidate characters for the financial institution name acquired from the character identification unit 160, the candidate character exclusion unit 170 excludes any candidate character whose weighting value is 0 in the financial institution name weighting table 131 stored in the weighting information storage unit 130. It outputs the candidate characters for the financial institution name that remain after the exclusion processing to the candidate name generation unit 180. The candidate character exclusion unit 170 also acquires the candidate characters for the branch name from the character identification unit 160. These are held in a storage area available to the candidate character exclusion unit 170 until the processing for specifying the financial institution name is complete.

[Step S24] The candidate name generation unit 180 acquires the candidate characters for the financial institution name from the candidate character exclusion unit 170. Using these candidate characters, it generates a plurality of candidate financial institution names, assigning priorities as it does so, based on the financial institution name weighting table 131 stored in the weighting information storage unit 130. It outputs the generated candidate financial institution names to the name specifying unit 190.

[Step S25] The name specifying unit 190 acquires the plurality of candidate financial institution names from the candidate name generation unit 180. In descending order of priority, it determines whether a financial institution name corresponding to each acquired candidate exists in the financial institution name table 121 stored in the financial institution dictionary storage unit 120, and specifies that financial institution name. Here, it is assumed that “Tokyo ABC Bank” is among the specified financial institution names. This specifying processing is performed by calculating the match rate between each candidate financial institution name and the financial institution names in the financial institution name table 121 and selecting those with a high match rate.

[Step S26] The name specifying unit 190 determines whether the result of the specifying processing in step S25 is unique. If it is unique, the name specifying unit 190 instructs the weighting processing unit 145 to execute the weighting processing for the branch names of the specified financial institution, and the process proceeds to step S27. If it is not unique, the name specifying unit 190 instructs the weighting processing unit 145 to execute the weighting processing for the branch names of the plurality of specified financial institutions, and the process proceeds to step S31.

[Step S27] The weighting processing unit 145 calculates the number of occurrences of each character contained in the branch name table 122a of the relevant financial institution stored in the financial institution dictionary storage unit 120, generates the branch name weighting table 132a, and stores it in the weighting information storage unit 130. The weighting processing unit 145 then notifies the name specifying unit 190 that the weighting processing is complete. On receiving this notification, the name specifying unit 190 instructs the candidate character exclusion unit 170 to execute the processing for the candidate characters for the branch name.

[Step S28] Based on the instruction from the name specifying unit 190, the candidate character exclusion unit 170 excludes from the acquired candidate characters for the branch name any candidate character whose weighting value is 0 in the branch name weighting table 132a stored in the weighting information storage unit 130. It then outputs the candidate characters for the branch name that remain after the exclusion processing to the candidate name generation unit 180.

[Step S29] The candidate name generation unit 180 acquires the branch name candidate characters from the candidate character excluding unit 170. Using these candidate characters, it generates a plurality of candidate branch names, assigning each a priority based on the branch name weighting table 132a stored in the weighting information storage unit 130. The candidate name generation unit 180 outputs the generated candidate branch names to the name specifying unit 190.

[Step S30] The name specifying unit 190 acquires the candidate branch names from the candidate name generation unit 180. In descending order of priority, it determines whether the branch name table 122a stored in the financial institution dictionary storage unit 120 contains a branch name corresponding to each candidate branch name, and thereby identifies the branch name. This identification is performed by calculating the match rate between each candidate branch name and the branch names in the branch name table 122a and selecting one with a high match rate.

[Step S31] The weighting processing unit 145 counts the number of appearances of each character in the branch name tables, stored in the financial institution dictionary storage unit 120, of the plurality of financial institutions identified in step S26, and generates a branch name weighting table for each financial institution. The weighting processing unit 145 stores the generated branch name weighting tables in the weighting information storage unit 130 and notifies the name specifying unit 190 that the weighting process is complete. On receiving this notification, the name specifying unit 190 instructs the candidate character excluding unit 170 to process the branch name candidate characters using each of the branch name weighting tables.

[Step S32] Following the instruction from the name specifying unit 190, the candidate character excluding unit 170 checks the branch name candidate characters against each of the branch name weighting tables stored in the weighting information storage unit 130. For each table, it removes the candidate characters whose weighting value is 0, producing a set of candidate characters for each financial institution. The candidate character excluding unit 170 then outputs these per-institution branch name candidate characters to the candidate name generation unit 180.

[Step S33] The candidate name generation unit 180 acquires the per-institution branch name candidate characters from the candidate character excluding unit 170. Using these candidate characters, it generates a plurality of candidate branch names for each financial institution, assigning priorities based on the branch name weighting tables stored in the weighting information storage unit 130. The candidate name generation unit 180 outputs the generated candidate branch names for each financial institution to the name specifying unit 190.

[Step S34] The name specifying unit 190 acquires the candidate branch names for each financial institution from the candidate name generation unit 180. In descending order of priority, it determines whether the branch name table of each financial institution, stored in the financial institution dictionary storage unit 120, contains a branch name corresponding to each candidate branch name, and thereby identifies the branch name. For each financial institution, this identification is performed by calculating the match rate between each candidate branch name and the branch names in the branch name table and selecting one with a high match rate. The name specifying unit 190 then identifies the name of the financial institution that has the branch name with the highest match rate.

[Step S35] The name specifying unit 190 outputs the identified financial institution name and branch name to other business systems as necessary.
In this way, the computer 100 can identify the financial institution name and the branch name contained in the acquired image information. Candidate names are given priorities based on the weighting values calculated in advance, and the determination process is executed from the highest priority downward, so processing completes in order starting from the candidates most likely to be correct. As a result, the determination process for low-priority candidates can be omitted, which reduces the processing load on the computer 100 while maintaining identification accuracy.

Next, the above processing flow will be described more concretely.
FIG. 14 is a diagram showing an example of a filled-in form. The form has a field 201 for the financial institution name and a field 202 for the branch name. The person filling in the form writes the name of the financial institution in field 201 and the name of that institution's branch in field 202. Such information is used, for example, by a business system to identify a bank account.

The form with the financial institution name and branch name filled in is captured as image information by the scanner 14. The computer 100 acquires the image information captured by the scanner 14.

FIG. 15 is a first schematic diagram showing the flow of the name identification process. The process shown in FIG. 15 is described below in order of step number.
[Step ST1] The character identification unit 160 acquires the candidate character list 301 as the result of its character identification process on the financial institution name written on the form of FIG. 14. These candidate characters are arranged in an order based on, for example, the likelihood of the identification result from the character identification unit 160; the order itself has no particular significance.

[Step ST2] From the characters in the candidate character list 301, the candidate character excluding unit 170 removes the characters “糸”, “余”, “令”, “Ｐ”, “及”, “て”, and “Ｏ”, whose weighting value in the financial institution name weighting table 131 is 0. The candidate character excluding unit 170 then ranks the remaining characters by weighting value, designating them the first candidate, the second candidate, and so on in descending order of priority, and obtains the candidate character list 302.

In the candidate character list 302, a cell containing “−” (hyphen) means that no candidate character exists for that cell. For example, for the third character there is no fourth or later candidate, and for the fourth and fifth characters there is no third or later candidate.
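The ranking and hyphen padding of the candidate character list can be sketched as follows. The sketch assumes that zero-weight characters have already been removed and that ties keep the order produced by character identification; `rank_candidates` is a hypothetical name, and only the weighting values 5 for “東” and 4 for “束” come from the example in this description, the others being made up for illustration.

```python
def rank_candidates(candidates_per_position, weights, pad="-"):
    """Order each position's candidates by weighting value and pad short rows."""
    ranked = [
        # sorted() is stable, so equal weights keep their recognition order.
        sorted(candidates, key=lambda ch: weights[ch], reverse=True)
        for candidates in candidates_per_position
    ]
    depth = max((len(row) for row in ranked), default=0)
    return [row + [pad] * (depth - len(row)) for row in ranked]


# Weights for 東 and 束 follow the description; the rest are assumptions.
weights = {"東": 5, "束": 4, "京": 3, "西": 2, "A": 1}
remaining = [["束", "東"], ["京", "西"], ["A"]]

candidate_list = rank_candidates(remaining, weights)
for position, row in enumerate(candidate_list, start=1):
    print(position, row)
```

Each row holds the first candidate, second candidate, and so on for one character position, with the pad marking cells for which no candidate remains.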

Excluding characters that clearly cannot occur in the input in this way prevents unnecessary steps in subsequent processing.
FIG. 16 is a second schematic diagram showing the flow of the name identification process. The process shown in FIG. 16 is described below in order of step number. The process shown in FIG. 16 is executed after step ST2 shown in FIG. 15.

[Step ST3] The candidate name generation unit 180 combines the characters in the candidate character list 302 to obtain the candidate name list 303. In the candidate name list 303, combinations of candidate characters with larger weighting values are ranked as higher-priority candidates. For example, in the financial institution name weighting table 131, “東” has a weighting value of 5 and “束” has a weighting value of 4. Therefore, among the candidate characters identified for the first character, a candidate financial institution name generated with “東” has a higher priority than one generated with “束”. The same applies to the candidate characters for the second and subsequent characters. The candidate name generation unit 180 outputs the higher-priority candidate financial institution names (for example, the first through fifth candidates) to the name specifying unit 190.
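One way to realize the prioritized generation of step ST3 is sketched below. The description does not fix how the weighting values of individual characters are combined into a priority for a whole name, so the sketch assumes the sum of the weighting values as the score. `generate_candidate_names` is a hypothetical name, and all weights other than those of “東” and “束” are assumptions.

```python
from itertools import product


def generate_candidate_names(candidates_per_position, weights, limit=5):
    """Combine one candidate per position and keep the highest-scoring names.

    Assumption: a name's priority is the sum of its characters' weights.
    """
    combos = product(*candidates_per_position)
    ranked = sorted(
        combos,
        key=lambda chars: sum(weights[ch] for ch in chars),
        reverse=True,
    )
    return ["".join(chars) for chars in ranked[:limit]]


weights = {"東": 5, "束": 4, "京": 3, "西": 2, "銀": 6, "行": 6}
# Candidate characters per position, without padding cells.
candidates = [["東", "束"], ["京", "西"], ["銀"], ["行"]]

names = generate_candidate_names(candidates, weights)
print(names)  # the name built from 東 and 京 comes first
```

Enumerating every combination is acceptable for short names with few candidates per position. For longer strings, a lazy best-first enumeration would avoid building the full product.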

This reduces the load of the identification process in the name specifying unit 190. In addition, using candidate characters in descending order of weight-based priority when creating candidate financial institution names improves the accuracy of the generated candidates.

FIG. 17 is a third schematic diagram showing the flow of the name identification process. The process shown in FIG. 17 is described below in order of step number. The process shown in FIG. 17 is executed after step ST3 shown in FIG. 16.

[Step ST4] The name specifying unit 190 compares each candidate financial institution name acquired from the candidate name generation unit 180 with the financial institution names in the financial institution name table 121 stored in the financial institution dictionary storage unit 120, calculates the match rates, and obtains the candidate name list 304. From the financial institution name table 121, it then identifies the financial institution name with the highest match rate among the candidates in the candidate name list 304. If exactly one financial institution has the highest match rate, the process proceeds to step ST5a. If more than one does, the process proceeds to step ST5b.
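The collation of step ST4 can be sketched as follows. This passage does not give a formula for the match rate, so the sketch assumes it is the fraction of positions at which two strings have the same character, relative to the longer string. `match_rate`, `best_matches`, and the dictionary entry “南北ABC銀行” are hypothetical.

```python
def match_rate(candidate, word):
    """Assumed definition: share of positions with identical characters."""
    longest = max(len(candidate), len(word))
    if longest == 0:
        return 0.0
    same = sum(a == b for a, b in zip(candidate, word))
    return same / longest


def best_matches(candidate_names, dictionary):
    """Return the highest match rate and every dictionary word reaching it."""
    best_rate, best_words = 0.0, []
    for candidate in candidate_names:  # already in priority order
        for word in dictionary:
            rate = match_rate(candidate, word)
            if rate > best_rate:
                best_rate, best_words = rate, [word]
            elif rate == best_rate and rate > 0 and word not in best_words:
                best_words.append(word)
    return best_rate, best_words


dictionary = ["東京ABC銀行", "東西ABC銀行", "南北ABC銀行"]

rate, words = best_matches(["東京ABC銀行", "束京ABC銀行"], dictionary)
print(rate, words)  # one word at the top rate: proceed as in step ST5a

rate2, words2 = best_matches(["東京ABC銀行", "東西ABC銀行"], dictionary)
print(rate2, words2)  # two words tie at the top rate: proceed as in step ST5b
```

The length of the returned list distinguishes the unique case from the case in which the branch name is needed to decide.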

[Step ST5a] The identification result list 305a shows the case where exactly one financial institution has the highest match rate, that is, where the financial institution name is uniquely identified. Here, the first candidate “東京ＡＢＣ銀行” (Tokyo ABC Bank) matches “東京ＡＢＣ銀行” in the financial institution name table 121 at a match rate of 100%, and the other candidates have lower match rates. In this case, the name specifying unit 190 uniquely identifies Tokyo ABC Bank as the financial institution name written on the form. Then, to identify “新宿西支店” (Shinjuku West Branch), written on the form as the branch name of Tokyo ABC Bank, the processing of steps ST1 through ST4 in FIGS. 15 to 17 is executed again.

[Step ST5b] The identification result list 305b shows the case where two financial institutions share the highest match rate, that is, where the financial institution name cannot be uniquely identified. For example, the name specifying unit 190 determines that the first candidate “東京ＡＢＣ銀行” (Tokyo ABC Bank) matches “東京ＡＢＣ銀行” in the financial institution name table 121 at a match rate of 100%. It also determines that the fifth candidate “東西ＡＢＣ銀行” (Tozai ABC Bank) matches “東西ＡＢＣ銀行” in the financial institution name table 121 at a match rate of 100%. The identification result of the name specifying unit 190 is then not unique and consists of both Tokyo ABC Bank and Tozai ABC Bank. To decide between the two financial institution names, the branch name identification result is used.

In the above description, the determination of step ST5b is made when multiple financial institution names have the same match rate. Alternatively, all financial institution names whose calculated match rate falls within a predetermined range (for example, 90% to 100%) may be treated as candidates and processed as in step ST6 described below. In that case, the range may be defined relative to the maximum calculated match rate (for example, as a predetermined ratio range or a predetermined numerical range). As another alternative, a predetermined number of financial institution names may be extracted as candidates in descending order of calculated match rate.
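The two alternative selection rules just described can be sketched as below. The sketch assumes the match rates have already been calculated and are held in a non-empty dictionary keyed by financial institution name; the function names, the margin of 0.10, and the sample rates are hypothetical.

```python
def within_margin_of_best(rates, margin=0.10):
    """Keep every name whose rate is within `margin` of the maximum rate."""
    best = max(rates.values())  # assumes rates is not empty
    return [name for name, rate in rates.items() if rate >= best - margin]


def top_n(rates, n):
    """Keep the n names with the highest rates."""
    return sorted(rates, key=rates.get, reverse=True)[:n]


# Hypothetical match rates for three registered financial institution names.
rates = {"東京ABC銀行": 1.0, "東西ABC銀行": 0.95, "南北XYZ銀行": 0.4}

print(within_margin_of_best(rates))
print(top_n(rates, 2))
```

Both rules pass more than one financial institution name on to the branch name stage, which tolerates a slightly misrecognized institution name at the cost of extra collation work.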

FIG. 18 is a fourth schematic diagram showing the flow of the name identification process. The process shown in FIG. 18 is described below in order of step number. The process shown in FIG. 18 is executed after step ST5b shown in FIG. 17.

[Step ST6] The name specifying unit 190 acquires the candidate branch names for “東京ＡＢＣ銀行” (Tokyo ABC Bank) and “東西ＡＢＣ銀行” (Tozai ABC Bank) from the candidate name generation unit 180. It compares each candidate branch name for Tokyo ABC Bank with the branch names in the branch name table 122a, calculates the character match rate for each candidate, and obtains the candidate name list 306a. Likewise, it compares each candidate branch name for Tozai ABC Bank with the branch names in the branch name table of Tozai ABC Bank, calculates the character match rate for each candidate, and obtains the candidate name list 306b.

The name specifying unit 190 then identifies the candidate branch name with the highest match rate based on the candidate name lists 306a and 306b. The candidate name list 306a shows the case where the branch name “新宿西支店” (Shinjuku West Branch) for “東京ＡＢＣ銀行” (Tokyo ABC Bank) matches “新宿西支店” in the branch name table 122a at a match rate of 100%. This is the highest match rate among the candidate branch names in the candidate name lists 306a and 306b.

[Step ST7] The name specifying unit 190 identifies the branch name written on the form as “新宿西支店” (Shinjuku West Branch) and, at the same time, identifies the financial institution name written on the form as “東京ＡＢＣ銀行” (Tokyo ABC Bank), obtaining the identification result 307.

A branch name with the highest match rate for the candidate branch names may exist at both “東京ＡＢＣ銀行” (Tokyo ABC Bank) and “東西ＡＢＣ銀行” (Tozai ABC Bank). For example, in step ST6, both financial institutions may have a branch named “新宿西支店” (Shinjuku West Branch), so that each has a branch name with a match rate of 100%. In this case, for example, the financial institution whose name was generated with the higher priority in the candidate name list 303 of FIG. 16 is adopted; that is, the first candidate, Tokyo ABC Bank. This gives preference to the name that is more likely to be correct and improves identification accuracy.
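The two-stage decision of steps ST6 and ST7, including the tie-break by institution priority, can be sketched as follows. The sketch reuses the assumed positional match rate, takes the candidate institutions in the priority order of the candidate name list, and replaces the best result only on a strictly higher rate so that the higher-priority institution wins ties. All function names and sample tables are hypothetical.

```python
def match_rate(candidate, word):
    """Assumed definition: share of positions with identical characters."""
    longest = max(len(candidate), len(word))
    if longest == 0:
        return 0.0
    return sum(a == b for a, b in zip(candidate, word)) / longest


def resolve(institutions, branch_candidates, branch_tables):
    """Return (rate, institution, branch) for the best branch match.

    `institutions` must be in priority order; a later institution replaces
    the current best only with a strictly higher rate, so ties favor the
    higher-priority institution.
    """
    best = None
    for institution in institutions:
        for candidate in branch_candidates[institution]:
            for branch in branch_tables[institution]:
                rate = match_rate(candidate, branch)
                if best is None or rate > best[0]:
                    best = (rate, institution, branch)
    return best


institutions = ["東京ABC銀行", "東西ABC銀行"]  # first candidate listed first
branch_candidates = {name: ["新宿西支店"] for name in institutions}
branch_tables = {
    "東京ABC銀行": ["新宿西支店", "渋谷支店"],
    "東西ABC銀行": ["新宿西支店", "池袋支店"],
}

result = resolve(institutions, branch_candidates, branch_tables)
print(result)  # both banks have the branch; the first candidate is adopted
```

If only one of the institutions has a matching branch, that institution is selected regardless of its priority, which is how the branch name disambiguates the institution name.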

Although this embodiment has been described using the operations of a financial institution as an example, the character strings to be identified are not limited to bank names and branch names. For example, the method can also be used to identify an address written on paper. The two-stage identification method is applicable to hierarchically managed information, such as the correspondence between prefecture names and municipality names.

As described above, assigning a weight to each character in advance based on the financial institution dictionary improves character identification accuracy. Performing the match rate determination and related processing on candidate character strings in descending order of priority allows a character string to be identified in a short time. Moreover, because the match rate determination for low-priority candidate character strings can be omitted, the load of recognition processing is reduced. In short, character strings can be recognized accurately with low-load processing.

At least some of the processing functions shown in FIGS. 1 and 3 can be implemented by a computer. In that case, a program describing the processing content of these functions is provided, and the functions are realized on the computer by executing the program. The program describing the processing content can be recorded on a computer-readable recording medium. Computer-readable recording media include magnetic recording devices, optical discs, magneto-optical recording media, and semiconductor memories.

To distribute the program, for example, portable recording media such as optical discs on which the program is recorded are sold. The program can also be stored in a storage device of a server computer and transferred from the server computer to other computers over a network.

The computer that executes the program stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage device. The computer then reads the program from its own storage device and executes processing according to the program. The computer can also read the program directly from the portable recording medium and execute processing according to it. Alternatively, each time a program is transferred from the server computer, the computer can execute processing according to the received program.

The character recognition program, character recognition device, and character recognition method of the present invention have been described above based on the illustrated embodiments, but the invention is not limited to them; the configuration of each part can be replaced with any configuration having a similar function. Any other components or steps may also be added. The present invention may also combine any two or more configurations (features) of the embodiments described above.

FIG. 1 is a diagram showing an overview of the character recognition system.
FIG. 2 is a diagram showing the hardware configuration of the computer of this embodiment.
FIG. 3 is a block diagram showing the functions of the computer of this embodiment.
FIG. 4 is a diagram showing the tables stored in the character code storage unit.
FIG. 5 is a diagram showing an example of the data structure of the character code correspondence table.
FIG. 6 is a diagram showing the tables stored in the financial institution dictionary storage unit.
FIG. 7 is a diagram showing an example of the data structure of the financial institution name table.
FIG. 8 is a diagram showing an example of the data structure of the branch name table.
FIG. 9 is a diagram showing the tables stored in the weighting information storage unit.
FIG. 10 is a diagram showing an example of the data structure of the financial institution name weighting table.
FIG. 11 is a diagram showing an example of the data structure of the branch name weighting table.
FIG. 12 is a flowchart showing the procedure of the weighting process for financial institution names.
FIG. 13 is a flowchart showing the procedure of the identification process for financial institution names and branch names.
FIG. 14 is a diagram showing an example of a filled-in form.
FIG. 15 is a first schematic diagram showing the flow of the name identification process.
FIG. 16 is a second schematic diagram showing the flow of the name identification process.
FIG. 17 is a third schematic diagram showing the flow of the name identification process.
FIG. 18 is a fourth schematic diagram showing the flow of the name identification process.

Explanation of symbols

1 Computer
1a Word registration information storage means
1b Weighting information storage means
1c Image information input means
1d Character estimation means
1e Candidate character string generating means
1f Character string specifying means
2 Image information capture device

Claims

1. A character recognition program for recognizing a character string included in image information, the program causing a computer to function as:
character estimation means for estimating each character of the character string based on the image information and outputting, for each character in the character string, one or more candidate characters as candidates for the estimation result;
candidate character string generating means for sequentially generating one or more candidate character strings as candidates for the character string by extracting, one at a time in descending order of the number of appearances, the candidate characters output by the character estimation means for each character in the character string and combining them, based on weighting information that associates each character with the number of times it appears in word registration information in which a plurality of words are registered; and
character string specifying means for collating the candidate character strings with the words in the word registration information in the order in which they were generated, and specifying the word corresponding to the character string from the collation results.

2. The described character recognition program, wherein the candidate character string generating means generates the candidate character strings after excluding, from the candidate characters output by the character estimation means, characters not included in the word registration information.

3. The described character recognition program, wherein the character string specifying means specifies the word corresponding to the character string based on the character match rate between the candidate character string and the words in the word registration information.

4. The character recognition program according to claim 1, further causing the computer to function as weighting processing means for calculating the number of appearances of the characters included in the word registration information and generating the weighting information.

5. A character recognition program for recognizing a first character string and a second character string included in image information, the program causing a computer to function as:
character estimation means for estimating each character of the first character string and the second character string based on the image information and outputting, as candidates for the estimation result, one or more first candidate characters corresponding to each character of the first character string and one or more second candidate characters corresponding to each character of the second character string;
first candidate character string generating means for sequentially generating one or more first candidate character strings as candidates for the first character string by extracting, one at a time in descending order of the number of appearances in first word registration information in which a plurality of words are registered, the first candidate characters output by the character estimation means for each character of the first character string and combining them, based on first weighting information that associates each character with the number of times it appears in the first word registration information;
first character string specifying means for collating the first candidate character strings with the words in the first word registration information in the order in which they were generated, and selecting and outputting, from the collation results, a plurality of first candidate words each indicating a word estimated to match the first character string;
second candidate character string generating means for selecting, for each of the first candidate words, the second word registration information specified by that first candidate word from among a plurality of pieces of second word registration information in each of which a plurality of words are registered, and for sequentially generating, for each of the first candidate words, one or more second candidate character strings as candidates for the second character string by extracting, one at a time in descending order of the number of appearances in the second word registration information, the second candidate characters output by the character estimation means for each character of the second character string and combining them, based on a plurality of pieces of second weighting information in which the number of appearances of the words included in each selected piece of second word registration information is associated with each character; and
second character string specifying means for collating the second candidate character strings with the words in the corresponding second word registration information in the order in which they were generated, selecting and outputting, from the collation results, a second candidate word indicating a word estimated to match the second character string from any of the pieces of second word registration information, and determining the first candidate word corresponding to the second candidate word to be the word that matches the first character string.

6. The character recognition program according to claim 5, wherein the second character string specifying means selects, as the second candidate word, the word having the highest match rate with the collated second candidate character string among all the words collated with the second candidate character strings.

The character recognition program according to claim 5 or 6, wherein the first character string specifying means determines, based on the matching rate between the first candidate character string and the words in the first word registration information, whether the word estimated to match the first character string can be fixed to one;
if the word cannot be fixed to one, selects the plurality of first candidate words and outputs them to the second character string specifying means; and
if the word can be fixed to one, outputs the fixed word to the second character string specifying means as the only first candidate word, causes the second candidate character string to be generated based on the second weighting information specified from that first candidate word, and causes the second character string specifying means to perform the matching process using the generated second candidate character string.

The character recognition program according to claim 7, wherein the first character string specifying means determines whether the word estimated to match the first character string can be fixed to one according to whether a plurality of calculated values, or only one calculated value, exists within a predetermined ratio range or a predetermined numerical range from the maximum of the calculated matching rates.
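The decision described in this claim can be illustrated with a short sketch. It is not the patented implementation: the function name `can_fix_to_one`, the `ratio` parameter, and its default value are assumptions chosen for illustration, and only the "predetermined ratio range" variant is shown.

```python
def can_fix_to_one(match_rates, ratio=0.9):
    """Return True if exactly one matching rate lies in [ratio * max, max].

    Hypothetical helper; the threshold 0.9 is an assumed example value.
    """
    if not match_rates:
        return False
    top = max(match_rates)
    # Collect every rate that falls inside the ratio range below the maximum.
    near_top = [r for r in match_rates if r >= top * ratio]
    # One value in range: the word is fixed. Several: it stays ambiguous.
    return len(near_top) == 1
```

When the function returns False, the words whose rates fall in the range would be passed on together as the plurality of first candidate words.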

The character recognition program according to claim 5, further causing the computer to function as weighting processing means for calculating the number of appearances of characters included in the first word registration information to generate the first weighting information, and for calculating the number of appearances of characters included in each of the plurality of pieces of second word registration information to generate the plurality of pieces of second weighting information, each associated with the corresponding second word registration information.
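As a rough illustration of the weighting processing, the sketch below counts character appearances per word list. The function name `build_weighting` and the sample word lists are hypothetical; the claim does not prescribe a data structure, and a `Counter` is simply one convenient choice.

```python
from collections import Counter

def build_weighting(words):
    """Count how often each character appears across all registered words."""
    counts = Counter()
    for word in words:
        counts.update(word)  # adds one per character occurrence
    return counts

# Hypothetical registration data: one first-stage list and one
# second-stage list per first-stage word.
first_words = ["ABC", "ABD"]
second_words = {"ABC": ["EAST", "WEST"], "ABD": ["NORTH"]}

first_weighting = build_weighting(first_words)
second_weighting = {name: build_weighting(ws) for name, ws in second_words.items()}
```

Keeping one table per second-stage list means the ordering of candidate characters can differ depending on which first candidate word is being examined.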

The character recognition program according to claim 5, wherein the first character string is the name of a financial institution and the second character string is the name of a branch of that financial institution.
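The two-stage idea, in which an ambiguous institution name is settled by the branch name, might be sketched as follows. Everything here is illustrative: `resolve_institution`, the sample names, and the use of `difflib.SequenceMatcher` as a stand-in similarity measure are assumptions, not the matching rate defined in the patent.

```python
from difflib import SequenceMatcher

def resolve_institution(first_candidate_words, branch_candidate, branches_by_institution):
    """Pick the (institution, branch) pair whose branch best fits the read string.

    Hypothetical helper: similarity is difflib's ratio, used only as a stand-in.
    """
    best = None  # (rate, institution, branch)
    for institution in first_candidate_words:
        # Only branches registered for this institution are collated.
        for branch in branches_by_institution.get(institution, []):
            rate = SequenceMatcher(None, branch_candidate, branch).ratio()
            if best is None or rate > best[0]:
                best = (rate, institution, branch)
    if best is None:
        return None
    return best[1], best[2]

# Made-up example data.
branches = {"ALPHA BANK": ["NORTH", "SOUTH"], "ALPHA BANC": ["HARBOR"]}
result = resolve_institution(["ALPHA BANK", "ALPHA BANC"], "HARBOR", branches)
```

In this made-up data the branch string is registered only under the second institution, so the branch lookup also decides which institution name was on the form.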

In a character recognition device that recognizes a character string included in image information,
Character estimation means for estimating each character of the character string based on the image information and outputting one or more candidate characters as candidates of the estimation result for each character in the character string;
Candidate character string generating means for sequentially generating one or more candidate character strings as candidates for the character string by extracting and combining, one by one in descending order of the number of appearances, the candidate characters corresponding to each character in the character string output by the character estimation means, based on weighting information in which the number of appearances of characters included in word registration information, in which a plurality of words are registered, is associated with each character;
Character string specifying means for matching the candidate character string with the words in the word registration information in the order of generation, and specifying the word corresponding to the character string from the matching result;
A character recognition device comprising:
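One possible reading of the generation and collation steps is sketched below. The names `generate_candidates` and `specify_word` are hypothetical, the enumeration order produced by `itertools.product` is only one way to realize "descending order of the number of appearances", and an exact dictionary hit is assumed as the stopping rule for simplicity.

```python
from collections import Counter
from itertools import product

def generate_candidates(candidates_per_position, weighting):
    """Yield candidate strings, trying heavily weighted characters first."""
    ordered = [
        # Counter returns 0 for characters that never appear in the word list.
        sorted(chars, key=lambda c: weighting[c], reverse=True)
        for chars in candidates_per_position
    ]
    for combo in product(*ordered):
        yield "".join(combo)

def specify_word(candidates_per_position, words):
    """Collate candidates in generation order; return the first registered word."""
    weighting = Counter(ch for w in words for ch in w)
    registered = set(words)
    for candidate in generate_candidates(candidates_per_position, weighting):
        if candidate in registered:
            return candidate
    return None
```

Because likely characters are combined first, a registered word tends to be reached after few collations, which is the low-load behavior the abstract aims for.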

12. The character recognition device described, wherein the candidate character string generating means generates the candidate character string by excluding, from the candidate characters output by the character estimation means, characters not included in the word registration information.
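The exclusion step can be pictured as a simple filter applied before any strings are combined. This is a sketch under assumptions: `filter_candidates` is a hypothetical name, and candidates are modeled as one list of characters per position.

```python
def filter_candidates(candidates_per_position, words):
    """Drop candidate characters that occur in no registered word."""
    allowed = {ch for word in words for ch in word}
    return [
        [c for c in chars if c in allowed]
        for chars in candidates_per_position
    ]

# Hypothetical input: "G" and "4" are misreads that appear in no registered word.
filtered = filter_candidates([["G", "C"], ["A", "4"]], ["CAT"])
```

Removing such characters early shrinks the number of combinations that later have to be collated.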

The character recognition device described, wherein the character string specifying means specifies the word corresponding to the character string based on the matching rate of characters between the candidate character string and the words in the word registration information.
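The claim does not fix a formula for the matching rate. The sketch below assumes one simple definition, the share of positions at which the two strings carry the same character, and the names `match_rate` and `best_word` are hypothetical.

```python
def match_rate(candidate, word):
    """Fraction of positions with identical characters (assumed definition)."""
    longest = max(len(candidate), len(word))
    if longest == 0:
        return 0.0
    # zip stops at the shorter string; unmatched tail positions count as misses.
    matches = sum(a == b for a, b in zip(candidate, word))
    return matches / longest

def best_word(candidate, words):
    """Return the registered word with the highest matching rate."""
    return max(words, key=lambda w: match_rate(candidate, w))
```

With this definition a candidate that differs from a registered word in one of three characters scores 2/3, so a near miss can still identify the word.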

The character recognition device according to claim 11, further comprising weighting processing means for calculating the number of appearances of characters included in the word registration information and generating the weighting information.

In a character recognition method of a character recognition device that recognizes a character string included in image information,
Character estimation means estimates each character of the character string based on the image information, and outputs one or more candidate characters as candidates of the estimation result for each character in the character string,
Candidate character string generating means sequentially generates one or more candidate character strings as candidates for the character string by extracting and combining, one by one in descending order of the number of appearances, the candidate characters corresponding to each character in the character string output by the character estimation means, based on weighting information in which the number of appearances of characters included in word registration information, in which a plurality of words are registered, is associated with each character,
The character string specifying means matches the candidate character string with the words in the word registration information in the order of generation, and specifies the word corresponding to the character string from the matching result.
A character recognition method characterized by the above.