JPH07262320A - Address recognition device

Info

Publication number: JPH07262320A
Application number: JP6048590A
Authority: JP
Inventors: Toshio Niwa; 寿男丹羽; Koji Yamamoto; 浩司山本; Yoshihiro Kojima; 良宏小島; Hidetsugu Maekawa; 英嗣前川; Kazuhiro Kayashima; 一弘萱嶋
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1994-03-18
Filing date: 1994-03-18
Publication date: 1995-10-13

Abstract

PURPOSE: To improve the recognition rate of an address recognition device, which recognizes an address from an address image, by inferring the erroneous part from the correctly recognized characters even when a character cutout error or a character recognition error occurs. CONSTITUTION: A character cutout unit 11 outputs a plurality of cutout candidates, and a character recognition unit 12 recognizes characters for each candidate. A key character extraction unit 13 extracts key characters from the resulting candidate characters, and a place name area candidate search unit 14 infers place name areas based on the key characters. A place name candidate evaluation value calculation unit 16 calculates a place name candidate evaluation value from, among other things, the number of characters that match between each place name candidate and a place name dictionary 18, and a place name candidate selection unit 17 selects the correct place name.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to an address recognition device that reads and recognizes an address written on a delivery item or the like.

[0002]

[Prior Art] Conventionally, character recognition devices that recognize address character strings have introduced knowledge processing using a place name dictionary to improve recognition accuracy. This knowledge processing collates the recognition result for each character against the place name dictionary and thereby corrects the recognition result to the most probable character.

[0003] A conventional address recognition device is disclosed, for example, in Japanese Patent Laid-Open No. 4-111186. FIG. 6 shows this conventional address recognition device. The character cutout unit 61 cuts out the image area of one character from the input image, and the character recognition unit 62 outputs N candidate characters per character from the cut-out image. The place name search unit 63 uses the place name dictionary 64 to find, within the set of candidate character strings, combinations of characters that form place names, and the place name candidate evaluation value calculation unit 65 obtains a place name candidate evaluation value based on, for example, the similarity of the recognition results. The place name candidate selection unit 66 uses the place name hierarchy dictionary 67 to select place name candidates with high evaluation values in order from the highest-level place name, and outputs these place name candidates as the recognition result.

[0004] Recognizing the address in this way makes it possible to correct characters that the character recognition unit 62 has misrecognized, and thus to improve recognition accuracy.

[0005]

[Problems to be Solved by the Invention] However, in the conventional address recognition device described above, the character cutout unit 61 may cut out characters incorrectly, for example when recognizing handwritten characters. In that case, the character recognition unit 62 cannot recognize the correct characters, and the place name search unit 63 cannot retrieve the correct place name candidate. Moreover, if a cutout error shifts the character positions, even the correctly cut-out portions cannot be collated with the place name dictionary 64, so the place name search unit 63 cannot retrieve the correct place name candidate.

[0006] Furthermore, because the place name candidate selection unit 66 selects place name candidates in order from the highest-level place name, a misrecognized higher-level place name causes all place names at the lower levels to be misrecognized as well.

[0007] The present invention solves these conventional problems, and its object is to raise the character recognition rate by selecting place names starting from the place name candidate considered most likely to be correct.

[0008]

[Means for Solving the Problems] To achieve the above object, the present invention comprises: a character cutout unit that cuts out an input image into per-character regions and outputs a plurality of cutout candidates as cutout regions; a character recognition unit that outputs N candidate characters for each cutout candidate; a key character extraction unit that extracts predetermined key characters from the set of candidate characters; a place name area candidate search unit that searches, with the key characters as reference points, for place name area candidates corresponding to the levels of the place name hierarchy; a place name search unit that uses a place name dictionary to search for place name candidates in the candidate character set contained in each place name area candidate; a place name candidate evaluation value calculation unit that calculates the likelihood of each place name candidate and outputs a place name candidate evaluation value; and a place name candidate selection unit that selects place name candidates based on the place name candidate evaluation values.

[0009]

[Operation] With the above configuration, the present invention obtains a plurality of cutout candidates in the character cutout unit, performs character recognition on all of the cutout candidates, and selects the most correct place name from the candidate character set of recognition results obtained from all of the cutout candidates. In addition, among the place name candidates retrieved by the place name search unit, place names are selected starting from the place name candidate at the most reliable level of the place name hierarchy.

[0010] Accordingly, even when the character cutout unit cannot settle on a single cutout result, searching for the correct answer among the cutout candidates reduces misrecognition. Furthermore, selecting place names starting from the place name candidate at the most reliable level of the place name hierarchy eliminates search errors.

[0011]

[Embodiments] An embodiment of the first invention is described below. FIG. 1 shows the overall configuration of the address recognition device of this embodiment.

[0012] The character cutout unit 11 cuts out the input image into regions of one character each and outputs cutout candidates. The character recognition unit 12 performs character recognition on each character image and outputs a candidate character set containing n candidate characters per character, from the first candidate character to the n-th candidate character. The key character extraction unit 13 extracts from the candidate character set the characters that serve as keys to the place name hierarchy, for example "県" (prefecture), "市" (city), and "町" (town).

[0013] The place name area candidate search unit 14 uses the key characters as reference points to search for place name area candidates, that is, the place name areas of each hierarchy level. The place name search unit 15 searches the place name dictionary 18 for each place name area candidate and picks out, from the combinations of the candidate character set, the combinations that form place name candidates. The place name candidate evaluation value calculation unit 16 calculates a place name candidate evaluation value for each place name candidate retrieved by the place name search unit 15, based on its degree of agreement with the place name dictionary 18 and on the recognition similarity from the character recognition unit 12. The place name candidate selection unit 17 selects the place name candidate with the largest evaluation value and outputs the recognized address.

[0014] The address recognition device configured as above recognizes an address as follows. First, the character cutout unit 11 processes the address image to be recognized and cuts it out into regions of one character each. When several cutout positions are plausible, the unit outputs each plausible cutout region as a cutout candidate, so that a plurality of candidates is output. FIG. 2 shows an example of a plurality of cutout candidates: the result of cutting out an input image of the address "富士市松岡" (Matsuoka, Fuji-shi). The labels a to k are the resulting cutout candidates, and the range between the vertical lines for each label is the cut-out region.
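One way to picture what a set of overlapping cutout candidates represents is as a small lattice of alternative segmentations. The following is a minimal sketch under stated assumptions: the span coordinates, the half-open interval convention, and the function name `enumerate_segmentations` are hypothetical and do not come from the patent, which pools all candidates rather than enumerating paths explicitly.

```python
def enumerate_segmentations(spans, width):
    """Return every sequence of cutout labels that tiles [0, width).

    spans: dict mapping a label (e.g. "a".."k" as in FIG. 2) to a
           half-open horizontal interval (start, end). Coordinates are
           hypothetical; the patent does not specify a representation.
    """
    # Index the cutout candidates by their left boundary.
    by_start = {}
    for label, (start, end) in spans.items():
        by_start.setdefault(start, []).append((label, end))

    def walk(pos):
        # Reaching the right edge completes one segmentation.
        if pos == width:
            return [[]]
        paths = []
        for label, end in by_start.get(pos, []):
            for rest in walk(end):
                paths.append([label] + rest)
        return paths

    return walk(0)


# Example: the second character may be one wide glyph ("b")
# or two narrow glyphs ("c" followed by "d").
spans = {"a": (0, 10), "b": (10, 20), "c": (10, 15), "d": (15, 20), "e": (20, 30)}
segmentations = enumerate_segmentations(spans, 30)
```

In this toy input there are two consistent segmentations, which is why later stages have to work with every cutout candidate rather than a single committed sequence.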

[0015] Next, the character recognition unit 12 performs character recognition on the region image of each cutout candidate and outputs a candidate character set containing n candidate characters per cutout candidate, from the first candidate character to the n-th candidate character. FIG. 2 also shows the result of recognizing the first through tenth candidate characters for each cutout candidate.

[0016] The key character extraction unit 13 then extracts from the candidate character set the characters that delimit the levels of the place name hierarchy (referred to here as key characters), for example "県" (ken, prefecture), "都" (to, metropolis), "道" (do, circuit), "府" (fu, urban prefecture), "市" (city), "郡" (county), "町" (town), "区" (ward), "村" (village), "大字" (oaza), and "字" (aza). Key characters are extracted from all of the characters in the candidate character set, from the first candidate character through the n-th candidate character.
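The key character extraction step can be sketched as a scan over all n ranked candidates of every cutout candidate. This is an illustrative sketch only: the `CutoutCandidate` class, its field names, and the restriction to single-character keys (so "大字" is omitted) are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Single-character key characters named in the description.
# The two-character key "大字" is left out of this simplified sketch.
KEY_CHARACTERS = {"県", "都", "道", "府", "市", "郡", "町", "区", "村", "字"}


@dataclass
class CutoutCandidate:
    label: str                      # e.g. "a".."k" as in FIG. 2
    ranked_chars: list = field(default_factory=list)  # best candidate first


def extract_key_characters(cutouts):
    """Return (label, rank, character) for each key character found.

    Every rank from 1 to n is inspected, not only the first candidate,
    so a key character that was recognised with low similarity is
    still picked up.
    """
    hits = []
    for cutout in cutouts:
        for rank, ch in enumerate(cutout.ranked_chars, start=1):
            if ch in KEY_CHARACTERS:
                hits.append((cutout.label, rank, ch))
    return hits


cutouts = [
    CutoutCandidate("a", ["富", "宮", "市"]),
    CutoutCandidate("b", ["士", "土"]),
    CutoutCandidate("c", ["巾", "市"]),
]
key_hits = extract_key_characters(cutouts)
```

Here "市" is found as the third candidate of cutout "a" and the second candidate of cutout "c", even though it is never the top-ranked recognition result.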

[0017] The place name area candidate search unit 14 searches for place name area candidates, which are the areas bounded on one side by a key character extracted by the key character extraction unit 13 and on the other side by another key character, the start of the character string, or the end of the character string. When searching for place name areas, the order of the key characters that bound an area is constrained by the order of the place name hierarchy. For example, only the key characters "町" and "村" may follow the key character "郡". In addition, the length of the character string in each place name area must fall within a range determined by the type of key character. For example, a place name area whose last character is the key character "県" is three or four characters long.
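The ordering and length constraints above can be sketched as a filter over pairs of boundaries. In this sketch only two rules are taken from the text: "郡" may be followed only by "町" or "村", and a "県" area is three or four characters long (the four-character upper limit for "郡" follows the FIG. 3 discussion). Every other table entry, the `START`/`END` markers, and the function name are assumptions for illustration.

```python
# Which key character may close the area that follows a given boundary.
# Only the "郡" row is stated in the text; the rest is assumed.
ALLOWED_NEXT = {
    "START": {"県", "府", "市", "郡", "区", "町"},
    "県": {"市", "郡"},
    "府": {"市", "郡"},
    "市": {"区", "町", "END"},
    "郡": {"町", "村"},
    "区": {"町", "END"},
    "町": {"END"},
    "村": {"END"},
}

# Permitted area length (key character included) per closing key.
# "県": (3, 4) is from the text; other ranges are assumed.
LENGTH_RANGE = {
    "県": (3, 4), "府": (3, 3), "市": (2, 6), "郡": (2, 4),
    "区": (2, 4), "町": (2, 4), "村": (2, 4), "END": (1, 6),
}


def find_region_candidates(length, key_hits):
    """Return (start, end, closing_key) for each admissible area.

    length:   number of character positions in the string
    key_hits: list of (position, key_character)
    """
    boundaries = [(-1, "START")] + sorted(key_hits) + [(length, "END")]
    regions = []
    for i, (left_pos, left_key) in enumerate(boundaries):
        for right_pos, right_key in boundaries[i + 1:]:
            if right_pos <= left_pos:
                continue
            if right_key not in ALLOWED_NEXT.get(left_key, set()):
                continue  # violates the hierarchy order
            start = left_pos + 1
            end = right_pos if right_key != "END" else length - 1
            lo, hi = LENGTH_RANGE.get(right_key, (1, 6))
            if lo <= end - start + 1 <= hi:
                regions.append((start, end, right_key))
    return regions


# Eight positions with "県" found at index 2 and "市" at index 5.
regions = find_region_candidates(8, [(2, "県"), (5, "市")])
```

With these tables the call yields a three-character "県" area, both a six-character and a three-character "市" area, and a trailing two-character area with no key character, mirroring the way FIG. 3 keeps overlapping hypotheses alive.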

[0018] FIG. 3 shows an example of a place name area search, for the case in which "府", "郡", "市", "町", and "区" have been extracted as key characters at their respective character positions. The place name area candidates found are "○○府", "○○市", "○○区", "○○", "○郡", "○○○町", "○○○○○市", and "○○町". Candidates such as "○○○区" (because "区" does not connect to "郡") and "○○○○郡" (a "郡" area five or more characters long) are not found. Note that this example shows a case in which there are not multiple cutout candidates.

[0019] The place name search unit 15 matches the characters from the first candidate character through the m-th candidate character (m ≦ n) of the candidate character set against the place name dictionary 18, and retrieves as a place name candidate every place name that matches the place name dictionary 18 in at least one character. Because a larger m yields more place name candidates and more computation, the number m of candidate characters used for matching against the place name dictionary 18 must be chosen to suit the available computation time. The number n of candidate characters used for key character extraction, on the other hand, is important for determining the place name area candidates, so it is better to make it reasonably large.
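The effect of m on the search can be illustrated with a small sketch. It assumes, for simplicity, that a dictionary entry is compared position by position with an area of the same length; the patent does not spell out the alignment, and the function name and toy dictionary are hypothetical.

```python
def search_place_names(region_candidates, dictionary, m):
    """Return (name, match_mask) for entries matching at least one position.

    region_candidates: one ranked candidate list per character position
    dictionary:        iterable of place name strings
    m:                 how many top-ranked candidates to consult (m <= n)
    """
    results = []
    for name in dictionary:
        if len(name) != len(region_candidates):
            continue  # simplifying assumption: equal-length alignment only
        mask = [ch in ranked[:m] for ch, ranked in zip(name, region_candidates)]
        if any(mask):  # a single matching character is enough
            results.append((name, mask))
    return results


region = [["宮", "富", "官"], ["土", "士"], ["市", "巾"]]
dictionary = ["富士市", "富山市", "沼津市", "三島"]

top2 = search_place_names(region, dictionary, 2)
top1 = search_place_names(region, dictionary, 1)
```

With m = 2 the entry "富士市" matches at all three positions; with m = 1 it matches only on "市", which shows why too small an m can hide the correct name while a larger m costs more comparisons.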

[0020] By using the character cutout unit 11, the character recognition unit 12, the key character extraction unit 13, the place name area candidate search unit 14, and the place name search unit 15 in this way, address recognition remains possible even when the character cutout unit 11 cannot determine the character cutout uniquely. In addition, inferring place name areas from the key characters narrows down the candidates for the place name search.

[0021] To find the most probable place name among the place name candidates, the place name candidate evaluation value calculation unit 16 calculates an evaluation value for each place name candidate according to (Equation 1), based on the outputs of the matched character count calculation unit 21, the adjacent character match count calculation unit 22, the unmatched character position correspondence count calculation unit 23, and the unmatched character count calculation unit 24.

[0022] (Equation 1)

evaluation value = a × (number of matched characters) + b × (number of adjacent character matches) + c × (number of corresponding unmatched characters) − d × (number of unmatched characters)

Here a, b, c, and d are constants. The number of matched characters is the number of characters that match between the place name candidate and the place name dictionary 18. The number of adjacent character matches is the number of pairs of adjacent characters in the place name candidate for which both characters match the place name dictionary 18. The number of corresponding unmatched characters is the total number of characters for which an unmatched character in the place name candidate and an unmatched character in the place name dictionary 18 lie at corresponding positions. The number of unmatched characters is the number of characters that did not match in the place name dictionary 18.
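Equation 1 can be worked through in a short sketch. The patent gives a, b, c, and d only as constants, so the default weights below are placeholders, and the alignment representation (a list of candidate/dictionary character pairs, with `None` for a missing character) is an assumption introduced for the example. The reading of "corresponding unmatched characters" as positions where both sides have a character but the characters differ is likewise one interpretation.

```python
def evaluation_value(alignment, a=1.0, b=0.5, c=0.25, d=0.5):
    """Compute the Equation 1 score for one place name candidate.

    alignment: list of (candidate_char, dictionary_char) pairs;
               either element may be None when nothing lines up.
    a, b, c, d: placeholder weights, not values from the patent.
    """
    matched = [x is not None and x == y for x, y in alignment]

    n_match = sum(matched)
    # Adjacent pairs in which both characters matched.
    n_adjacent = sum(1 for p, q in zip(matched, matched[1:]) if p and q)
    # Unmatched characters that still sit at corresponding positions.
    n_corresponding = sum(
        1 for x, y in alignment
        if x is not None and y is not None and x != y
    )
    # Dictionary characters that were not matched at all.
    n_unmatched = sum(
        1 for (x, y), ok in zip(alignment, matched)
        if y is not None and not ok
    )
    return a * n_match + b * n_adjacent + c * n_corresponding - d * n_unmatched


perfect = evaluation_value([("富", "富"), ("士", "士"), ("市", "市")])
one_error = evaluation_value([("富", "富"), ("土", "士"), ("市", "市")])
```

With the placeholder weights, a fully matched three-character name scores 3 + 0.5 × 2 = 4.0, while a single misread middle character gives 2 + 0 + 0.25 − 0.5 = 1.75: the adjacency term rewards unbroken runs, and the correspondence term softens the penalty when the misread character at least occupies the expected position.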

[0023] Using the place name candidate evaluation value calculation unit 16 configured as above makes it possible to infer the whole place name from the correctly recognized part, even when there are recognition errors in the character recognition unit 12 or cutout errors in the character cutout unit 11.

[0024] In the place name candidate selection unit 17, the place name candidate evaluation value comparison unit 25 retrieves the candidate with the highest evaluation value from among the place name candidates, and the place name validity evaluation unit 26 uses the place name hierarchy dictionary 19 to check whether the retrieved candidate is consistent with the place name candidates connected before and after it. If there is a contradiction, the most recently retrieved place name candidate is rejected, and the place name candidate evaluation value comparison unit 25 again retrieves the candidate with the highest evaluation value. This is repeated until place names have been selected for all character areas, at which point the selected place name candidates are output as the recognized address.
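The select, check, and reject loop can be sketched as a greedy pass over candidates sorted by evaluation value. The data layout (dicts with `name`, `start`, `end`, `score`), the representation of the place name hierarchy dictionary as a set of (upper, lower) name pairs, and the overlap rule are assumptions for illustration.

```python
def select_place_names(candidates, hierarchy, length):
    """Greedily pick consistent place names, best score first.

    candidates: list of dicts with keys name, start, end, score
    hierarchy:  set of (upper_name, lower_name) pairs that may be adjacent
    length:     number of character positions to be covered
    """
    selected, covered = [], set()
    for cand in sorted(candidates, key=lambda c: c["score"], reverse=True):
        span = set(range(cand["start"], cand["end"] + 1))
        if span & covered:
            continue  # area already decided by a more reliable candidate
        consistent = True
        for s in selected:
            # Check the neighbour immediately before or after the candidate.
            if s["end"] + 1 == cand["start"] and (s["name"], cand["name"]) not in hierarchy:
                consistent = False
            if cand["end"] + 1 == s["start"] and (cand["name"], s["name"]) not in hierarchy:
                consistent = False
        if not consistent:
            continue  # reject and move on to the next-best candidate
        selected.append(cand)
        covered |= span
        if len(covered) == length:
            break
    return [c["name"] for c in sorted(selected, key=lambda c: c["start"])]


hierarchy = {("静岡県", "富士市"), ("富士市", "松岡")}
candidates = [
    {"name": "富士市", "start": 3, "end": 5, "score": 4.0},
    {"name": "福岡県", "start": 0, "end": 2, "score": 3.5},
    {"name": "静岡県", "start": 0, "end": 2, "score": 3.0},
    {"name": "松岡", "start": 6, "end": 7, "score": 2.0},
]
address = select_place_names(candidates, hierarchy, 8)
```

In this toy run "富士市" is fixed first because it scores highest; "福岡県" scores above "静岡県" but is rejected because it contradicts the already chosen city, so the lower-scoring but consistent prefecture is selected instead.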

[0025] With the place name candidate selection unit 17 configured as above, candidates are fixed starting from the one considered most likely to be correct, and uncertain candidates are fixed last. This leaves less room for errors to creep in and improves the recognition rate.

[0026] The internal configurations of the place name candidate evaluation value calculation unit 16 and the place name candidate selection unit 17 may instead be conventional ones.

[0027] Next, an embodiment of the second invention is described. FIG. 4 shows the overall configuration of the address recognition device of this embodiment. In the figure, the character cutout unit 11, the character recognition unit 12, the key character extraction unit 13, the place name area candidate search unit 14, the place name search unit 15, and the place name candidate evaluation value calculation unit 16 are the same as in the embodiment of the first invention.

[0028] The place name candidate selection unit 17 selects the place name candidate with the largest evaluation value and outputs the recognized address. If none of the place name candidates appears to be correct, the place name area candidates are set again and processing is repeated from the place name search unit 15.

[0029] The address recognition device configured as above recognizes an address as follows. First, the character cutout unit 11 processes the address image to be recognized and cuts it out into regions of one character each. When several cutout positions are plausible, the unit outputs each plausible cutout region as a cutout candidate, so that a plurality of candidates is output.

[0030] Next, the character recognition unit 12 performs character recognition on the region image of each cutout candidate and outputs a candidate character set containing n candidate characters per cutout candidate, from the first candidate character to the n-th candidate character.

[0031] The key character extraction unit 13 then extracts from the candidate character set the key characters that delimit the levels of the place name hierarchy. Key characters are extracted from all of the characters in the candidate character set, from the first candidate character through the n-th candidate character.

[0032] The place name area candidate search unit 14 searches for place name area candidates, which are the areas bounded on one side by a key character extracted by the key character extraction unit 13 and on the other side by another key character, the start of the character string, or the end of the character string. When searching for place name areas, the order of the key characters that bound an area is constrained by the order of the place name hierarchy. In addition, the length of the character string in each place name area must fall within a range determined by the type of key character.

[0033] The place name search unit 15 matches the characters from the first candidate character through the m-th candidate character (m ≦ n) of the candidate character set against the place name dictionary 18, and retrieves as a place name candidate every place name that matches the place name dictionary 18 in at least one character. Because a larger m yields more place name candidates and more computation, the number m of candidate characters used for matching against the place name dictionary 18 must be chosen to suit the available computation time. The number n of candidate characters used for key character extraction, on the other hand, is important for determining the place name area candidates, so it is better to make it reasonably large.

[0034] To find the most probable place name among the place name candidates, the place name candidate evaluation value calculation unit 16 calculates the place name candidate evaluation values in the same way as in the first embodiment.

[0035] In the place name candidate selection unit 17, the place name candidate evaluation value comparison unit 25 retrieves the candidate with the highest evaluation value from among the place name candidates, and the place name validity evaluation unit 26 uses the place name hierarchy dictionary 19 to check whether the retrieved candidate is consistent with the place name candidates connected before and after it. If there is a contradiction, the most recently retrieved place name candidate is rejected, and the place name candidate evaluation value comparison unit 25 again retrieves the candidate with the highest evaluation value. This is repeated until place names have been selected for all character areas. The reject determination unit 27 then decides whether to reject the result, judging the likelihood of the selected place names from their place name candidate evaluation values.

[0036] If the result is not rejected, the selected place names are output as the recognized address. If it is rejected, the key character inference unit 28 infers the key characters that character recognition failed to recognize, the place name area candidate setting unit 29 sets place name area candidates according to the inferred key characters, and processing is repeated from the place name search unit 15 using those place name area candidates.

[0037] FIG. 5 shows an example of processing by the place name area candidate setting unit 29. In FIG. 5, the initial key character extraction found "市", "町", and "区", and the place name area candidates "○○○○○市", "○○区", "○○町", and "○○" were retrieved. When the place names determined from these place name area candidates are rejected by the reject determination unit 27, the key character inference unit 28 infers the key characters that were not extracted, using knowledge of the place name hierarchy and the key characters already extracted. In FIG. 5, it infers that "県" is the third character. Based on the key character inferred by the key character inference unit 28, the place name area candidate setting unit 29 sets the new place name area candidates "○○県" and "○○市".
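The reject-and-retry step of FIG. 5 can be sketched as follows. The patent does not state how the reject threshold is chosen or what rule the key character inference unit applies, so both the threshold test and the heuristic below (a "市" area of five or more characters probably hides an unrecognized "県" at its third character) are hypothetical stand-ins that merely reproduce the FIG. 5 outcome.

```python
def should_reject(scores, threshold):
    """Reject when any selected place name scores below the threshold.

    The threshold rule is an assumption; the patent only says that the
    decision is based on the place name candidate evaluation values.
    """
    return min(scores) < threshold


def infer_missing_key(regions):
    """Split an over-long "市" area by hypothesising "県" at its third character.

    regions: list of (start, end, closing_key) with inclusive indices.
    The five-character trigger is a hypothetical heuristic.
    """
    new_regions = []
    for start, end, key in regions:
        if key == "市" and end - start + 1 >= 5:
            new_regions.append((start, start + 2, "県"))   # "○○県"
            new_regions.append((start + 3, end, "市"))     # remainder, ending in "市"
        else:
            new_regions.append((start, end, key))
    return new_regions


# FIG. 5 style input: a six-character "市" area followed by a "区" area.
initial = [(0, 5, "市"), (6, 7, "区")]
if should_reject([1.0, 4.0], threshold=2.0):
    retried = infer_missing_key(initial)
else:
    retried = initial
```

After the split, the place name search would be run again on the new areas, which is the loop back to the place name search unit 15 described in the text.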

[0038]

[Effects of the Invention] As described above, according to the present invention, even when the character cutout unit cannot settle on a single cutout result and a cutout error occurs, searching for the correct answer among the plurality of cutout candidates allows the correct characters to be determined, which reduces misrecognition.

[0039] Furthermore, estimating the place name areas from the key characters and selecting place names starting from the place name candidate at the most reliable level of the place name hierarchy prevents place name search errors and thereby improves the character recognition rate.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of the address recognition device according to the first embodiment of the present invention.

FIG. 2 is a diagram showing the outputs of the character cutout unit and the character recognition unit according to the first embodiment of the present invention.

FIG. 3 is a diagram showing the output of the place name area candidate search unit according to the first embodiment of the present invention.

FIG. 4 is a block diagram showing the configuration of the address recognition device according to the second embodiment of the present invention.

FIG. 5 is a diagram showing the output of the place name area candidate setting unit according to the second embodiment of the present invention.

FIG. 6 is a block diagram showing the configuration of a conventional address recognition device.

[Explanation of symbols]

11 Character cutout unit
12 Character recognition unit
13 Key character extraction unit
14 Place name area candidate search unit
15 Place name search unit
16 Place name candidate evaluation value calculation unit
17 Place name candidate selection unit
18 Place name dictionary
19 Place name hierarchy dictionary
21 Matched character count calculation unit
22 Adjacent character match count calculation unit
23 Unmatched character position correspondence count calculation unit
24 Unmatched character count calculation unit
25 Place name candidate evaluation value comparison unit
26 Place name validity evaluation unit
27 Reject determination unit
28 Key character inference unit
29 Place name area candidate setting unit

Continuation of front page: (72) Inventor Hidetsugu Maekawa, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma City, Osaka Prefecture; (72) Inventor Kazuhiro Kayashima, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma City, Osaka Prefecture

Claims

[Claims]

1. An address recognition device comprising: a character cutout unit that cuts out an input image into per-character regions and outputs a plurality of cutout candidates as cutout regions; a character recognition unit that outputs N candidate characters for each cutout candidate; a key character extraction unit that extracts predetermined key characters from the set of candidate characters; a place name area candidate search unit that searches, with the key characters as reference points, for place name area candidates corresponding to the levels of a place name hierarchy; a place name search unit that uses a place name dictionary to search for place name candidates in the candidate character set contained in each place name area candidate; a place name candidate evaluation value calculation unit that calculates the likelihood of each place name candidate and outputs a place name candidate evaluation value; and a place name candidate selection unit that selects place name candidates based on the place name candidate evaluation values.

2. The address recognition device according to claim 1, wherein the place name candidate evaluation value calculation unit comprises: a matched character count calculation unit that calculates the number of characters that match between the place name candidate and the place name dictionary; an adjacent character match count calculation unit that calculates the number of pairs of adjacent characters of the place name candidate for which both characters match the place name dictionary; an unmatched character position correspondence count calculation unit that calculates the number of characters, unmatched between the place name candidate and the place name dictionary, that lie at corresponding positions; and an unmatched character count calculation unit that calculates the number of characters that did not match in the place name dictionary.

3. The address recognition device according to claim 1, wherein the place name candidate selection unit comprises: a place name candidate evaluation value comparison unit that selects the place name candidate having the largest place name candidate evaluation value; and a place name validity evaluation unit that checks whether the retrieved place name candidate is consistent with the place name candidates connected before and after it.

4. The address recognition device according to claim 1, wherein the place name candidate selection unit comprises: a place name candidate evaluation value comparison unit that selects the place name candidate having the largest place name candidate evaluation value; a place name validity evaluation unit that checks whether the retrieved place name candidate is consistent with the place name candidates connected before and after it; a reject determination unit that decides on rejection based on the evaluation values of the selected place names; a key character inference unit that infers key characters that could not be recognized; and a place name area candidate setting unit that sets the place name area candidates again.