JPH04283852A

JPH04283852A - Chinese character converting method for Japanese syllabary of described address

Info

Publication number: JPH04283852A
Application number: JP3072277A
Authority: JP
Inventors: Yukari Sato (佐藤 ゆかり); Tsunefumi Shindo (進藤 恒文)
Original assignee: Oki Electric Industry Co Ltd; Oki Software Co Ltd
Current assignee: Oki Electric Industry Co Ltd; Oki Software Co Ltd
Priority date: 1991-03-12
Filing date: 1991-03-12
Publication date: 1992-10-08
Anticipated expiration: 2015-07-04
Also published as: JP3058706B2

Abstract

PURPOSE: To convert input kana (Japanese syllabary) into the correct kanji (Chinese character) data, limited to place names at the village-section level (the "character" level, 字) of an address. CONSTITUTION: When a character dictionary file contains no kana data identical to the input kana data, it is checked whether the kanji corresponding to the input kana data has another accepted reading; if so, the input kana data is corrected into kana data based on that reading, and the corrected kana data is collated with the kana data in the character dictionary file. When they match, the kanji data corresponding to that kana data is taken out of the character dictionary file.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to a method for converting address kana into kanji, and in particular to an address kana-to-kanji conversion method that correctly converts character-level input kana into kanji when the character-level kanji of an address (including a place of residence) has multiple readings.

[0002]

[Description of the Related Art] Conventionally, at an insurance company, for example, when an address and a name were entered on a form, the address was written in kanji and the name in kana or in kanji with kana. The form is read by an optical character reader (hereinafter, OCR device) that can recognize handwritten kanji in addition to ANK (alphanumeric and kana) characters. To convert the kanji data of the read address into kana data, a conversion device performs word matching between the read kanji data and the kanji registered in a kanji dictionary file held in a storage device, in which each kanji is registered as a pair with its kana. Here, word matching means dividing an address into the prefecture-name level (the first level), the level of place names such as cities and wards (the second level), and the level of place names such as towns and villages (the third level), treating the place name at each level as one word, and performing matching word by word. The third level is referred to here specifically as the character level (字, aza); it is the place-name level immediately above the street-number level, that is, the last place-name level of the address.

[0003] When word matching found a match, the kana data for the registered kanji was taken out of the kanji dictionary file to generate kana data. However, converting the handwritten kanji of an address into correct kana data in this way had a poor accuracy rate, and word matching took a long time. By contrast, for the names entered on the form, the parts written in kana had the advantage that the handwritten kanji OCR device recognized the kana data with a high recognition rate (accuracy rate) and that matching took little time. Therefore, at users' request, the address on the form was also to be written in kana, and a function that generates kanji data from kana data by word matching of the address (kana-to-kanji conversion of addresses) was added to the conversion device. Because kana is not as complex in shape as kanji, the read data obtained as the recognition result for kana from the handwritten kanji OCR device is more accurate than for kanji. The kanji data output by the conversion device after kana-to-kanji conversion was therefore processed on the assumption that it had been converted correctly.

[0004]

[Problems to be Solved by the Invention] However, the conventional address kana-to-kanji conversion method described above had the following problem. Input kana data read by the handwritten kanji OCR device could be converted into kanji data only when it was exactly identical to the kana data registered in the dictionary file of the storage device. Even when either reading of a place-name kanji in the address is acceptable (for example, "町" read as "チョウ" (chō) or as "マチ" (machi)), any input that differed from the kana registered in the dictionary file was treated as an error and could not be converted into kanji data. Note that in the actual dictionary the small "ョ" is stored as the full-size "ヨ", and a voiced sound mark (dakuten) occupies one character position, as in a telegram; in this document, for convenience of explanation, the ordinary notation is used. In view of this problem, an object of the present invention is to provide an address kana-to-kanji conversion method that, limited to character-level place names of an address and in cases where the kanji of the place name has multiple accepted readings, corrects input kana data that differs from the kana data registered in the character dictionary file into the correct kana data (the same kana data as that registered in the character dictionary file) and converts it into the correct kanji data.
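The dictionary-side notation noted above (full-size "ヨ" in place of small "ョ", and a voiced sound mark occupying its own character position) could be approximated as in the following Python sketch. It is an illustration only: the function name `to_dictionary_form`, the list of small kana, and the use of Unicode decomposition are assumptions, since the patent does not specify the actual character encoding.

```python
import unicodedata

# Assumed subset of small katakana and their full-size counterparts.
SMALL_TO_LARGE = str.maketrans("ァィゥェォッャュョヮ", "アイウエオツヤユヨワ")


def to_dictionary_form(kana: str) -> str:
    """Rewrite ordinary katakana in a telegram-like dictionary notation.

    Hypothetical helper: the real dictionary encoding is not given in the text.
    """
    # NFD splits a voiced kana such as バ into ハ plus a combining mark.
    decomposed = unicodedata.normalize("NFD", kana)
    # Replace the combining marks with spacing marks so that each mark
    # occupies one character position of its own.
    spaced = decomposed.replace("\u3099", "\u309B").replace("\u309A", "\u309C")
    # Store small kana as their full-size forms.
    return spaced.translate(SMALL_TO_LARGE)
```

Under these assumptions, "チョウ" would be stored as "チヨウ", and "シバウラ" would take five character positions instead of four.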

[0005]

[Means for Solving the Problems] The present invention is an address kana-to-kanji conversion method in which the input kana data of one word of a place name at the character level of an address is word-matched against the kana data of a character dictionary file provided in a storage means and, when they match, the kanji data for that kana data is taken out of the character dictionary file. In this method, when the character dictionary file contains no kana data identical to the input kana data, it is checked whether the kanji corresponding to the input kana data has another accepted reading. If it has, the input kana data is corrected into kana data based on that other reading, and the corrected kana data is word-matched against the kana data of the character dictionary file. When they match, the kanji data for that kana data is taken out of the character dictionary file.

[0006]

[Operation] When the character dictionary file contains no kana data identical to the input kana data, it is checked whether the kanji corresponding to the input kana data has another accepted reading. If it has, the input kana data is corrected into kana data based on that other reading, the corrected kana data is matched against the kana data of the character dictionary file, and, when they match, the kanji data for that kana data is taken out of the character dictionary file. Consequently, limited to character-level place names of an address, input kana can be converted into the correct kanji data.

[0007]

[Embodiment] Next, an embodiment of the present invention will be described. FIG. 3 is a block diagram showing an embodiment of an OCR system according to the present invention. In the figure, reference numeral 1 denotes an OCR device that can recognize handwritten kanji in addition to ANK characters (hereinafter, the handwritten kanji OCR device). The handwritten kanji OCR device 1 reads the kanji, kana, and other characters on a form serving as the reading medium, performs character recognition, and passes the read data (recognition data) to the control unit 2. Reference numeral 2 denotes a control unit with a built-in memory 2-1. Connected to the control unit 2 are the handwritten kanji OCR device 1, a keyboard (KB) 3 serving as an input device, a display device (CRT) 4, and a hard disk (HD) 5. The control unit 2 exercises overall control over the handwritten kanji OCR device 1, the CRT 4, the hard disk 5, and so on.

[0008] Here, the CRT 4 displays input data, data read by the handwritten kanji OCR device 1, and the like. The hard disk 5 stores dictionary files, in particular the dictionary files for word matching of place names in addresses (in which the kana of each place name and the kanji for that kana are registered).

[0009] The dictionary files for word matching of address place names comprise a file at the prefecture-name level (the first-level file), files at the level of place names such as wards, cities, and counties (the second-level files), and files at the level of place names such as towns and villages (the third-level files, or character-level files). The second-level files are provided separately for each first-level prefecture. The third-level files are further provided separately for each corresponding second-level city, ward, or county. The present invention concerns the character-level file, which is hereinafter called the character dictionary file. Reference numeral 6 denotes a conversion device, which converts input kana data into kanji data, kanji data into kana data, and so on. The conversion device 6 comprises the control unit 2, the keyboard 3, the CRT 4, the hard disk 5, and so on. Reference numeral 7 denotes an OCR device that reads a reading medium such as a form, matches the read data (recognition data) against the data in the dictionary files, and converts, for example, read kana data (or kanji data) into kanji data (or kana data). The OCR device 7 can also match input data from the keyboard 3 against the data in the dictionary files and convert the input data into kana data, kanji data, or the like.
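One way to picture the three-level dictionary layout described above is the following Python sketch. The class name `AddressDictionaries`, its field names, and the choice to key lower-level files by the kanji of the parent place name are assumptions made for illustration; the patent states only that second-level files are grouped by prefecture and third-level files by city, ward, or county.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

KanaToKanji = Dict[str, str]  # kana reading -> kanji spelling


@dataclass
class AddressDictionaries:
    """Hypothetical in-memory stand-in for the files on the hard disk."""

    # First-level file: prefecture names.
    level1: KanaToKanji = field(default_factory=dict)
    # Second-level files: one per prefecture (keyed by prefecture kanji).
    level2: Dict[str, KanaToKanji] = field(default_factory=dict)
    # Third-level (character-level) files: one per (prefecture, municipality).
    level3: Dict[Tuple[str, str], KanaToKanji] = field(default_factory=dict)

    def character_file(self, prefecture: str, municipality: str) -> KanaToKanji:
        """Return the character dictionary file for one municipality."""
        return self.level3.get((prefecture, municipality), {})


# Entries taken from the example address used in the description.
dictionaries = AddressDictionaries(
    level1={"トウキョウト": "東京都"},
    level2={"東京都": {"ミナトク": "港区"}},
    level3={("東京都", "港区"): {"シバウラ": "芝浦"}},
)
```

Grouping the lower-level files by their parent place name keeps each lookup small, which fits the description's remark that character-level names are far more numerous than first-level and second-level names.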

[0010] Next, an outline of the word matching process according to the present invention will be described with reference to FIG. 2, an explanatory diagram outlining that process. A form 11 serving as the reading medium (on which the address is written in kana as illustrated) is fed into the handwritten kanji OCR device 1. Alternatively, the address is entered in kana from the keyboard 3 as illustrated. The control unit 2 stores the read data from the handwritten kanji OCR device 1, or the kana data entered from the keyboard 3, in its memory 2-1. The control unit 2 then performs word matching (kana-to-kanji conversion of the address), as follows. Because the kana data is delimited by spaces, the control unit 2 takes each space-delimited run of kana out of the memory 2-1 as one word. In the illustrated example, "トウキョウト" (Tōkyō-to), "ミナトク" (Minato-ku), and "シバウラ" (Shibaura) each constitute one unit. The control unit 2 then takes out the place names of the address one word at a time, in order from the first level to the third level, and matches them. In the illustrated example, the control unit 2 first takes "トウキョウト" out of the memory 2-1 and matches it against the prefecture names in the first-level file of the word-matching dictionary files on the hard disk 5; if there is a match, it takes the kanji "東京都" for "トウキョウト" from the first-level file and stores it in the memory 2-1. Next, the control unit 2 takes "ミナトク" out of the memory 2-1 and matches it against the ward names, city names, and so on in the second-level file; if there is a match, it takes the kanji "港区" for "ミナトク" from the second-level file and stores it in the memory 2-1. Further, the control unit 2 takes "シバウラ" out of the memory 2-1 and matches it against the town names, village names, and so on in the character dictionary file; if there is a match, it takes the kanji "芝浦" for "シバウラ" and stores it in the memory 2-1. In this way the address is converted into the kanji "東京都港区芝浦".
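The level-by-level matching outlined above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names `LEVEL1`, `LEVEL2`, `LEVEL3`, and `convert_address` are hypothetical, the dictionaries hold only the example address from the description, and the sketch handles only the exact-match case with all three levels present.

```python
from typing import Optional

# Hypothetical dictionary contents, limited to the example in the text.
LEVEL1 = {"トウキョウト": "東京都"}
LEVEL2 = {"東京都": {"ミナトク": "港区"}}
LEVEL3 = {("東京都", "港区"): {"シバウラ": "芝浦"}}


def convert_address(kana_address: str) -> Optional[str]:
    """Convert a space-delimited kana address into kanji, or return None."""
    # Each space-delimited run of kana is one word (one place-name level).
    words = kana_address.split()
    if len(words) != 3:
        return None  # this sketch assumes all three levels are present
    prefecture_kana, municipality_kana, character_kana = words

    # First level: prefecture name.
    prefecture = LEVEL1.get(prefecture_kana)
    if prefecture is None:
        return None
    # Second level: city, ward, and so on within that prefecture.
    municipality = LEVEL2.get(prefecture, {}).get(municipality_kana)
    if municipality is None:
        return None
    # Third (character) level: town, village, and so on.
    place = LEVEL3.get((prefecture, municipality), {}).get(character_kana)
    if place is None:
        return None
    return prefecture + municipality + place
```

With these example dictionaries, `convert_address("トウキョウト ミナトク シバウラ")` yields "東京都港区芝浦", and any word missing from its level's file makes the whole conversion fail.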

[0011] Next, an embodiment of the present invention will be described using the flowchart of FIG. 1, which shows one embodiment of the present invention. FIG. 1 is explained with reference to FIG. 4, an explanatory diagram showing an example of kana data correction, as a concrete example. First, kana data is input (step S1). Here, a form serving as the reading medium (on which the address place names are written in kana) is fed into the handwritten kanji OCR device 1. Alternatively, an operator enters the address in kana from the keyboard 3. The kana data (recognition data) read by the handwritten kanji OCR device 1 is supplied to the control unit 2, which stores it in the memory 2-1. Kana input from the keyboard 3 is likewise stored in the memory 2-1 of the control unit 2.

[0012] As explained with FIG. 2, in the kana data stored in the memory 2-1 a space separates the first-level place name from the second-level place name and the second-level place name from the third-level (character-level) place name, and each space-delimited run of kana is treated as one word. In the example of FIG. 4, a space separates the second-level place name from the third-level place name. The control unit 2 takes the kana data of one word out of the kana data of the address stored in the memory 2-1 (step S2). The kana data of one word taken out of the memory 2-1 is called the input kana. Words are taken out of the memory 2-1 in order from the first level toward the third level. Accordingly, if there is no first-level place name, the control unit 2 takes the second-level place name out of the memory 2-1.

[0013] Next, the control unit 2 checks whether the input kana is at the character level (step S3). If it is not, conventional processing is performed (step S4). That is, the control unit 2 first word-matches the input kana against the kana in the first-level file on the hard disk 5 and, if there is a match, takes the kanji for that kana (the prefecture name) from the first-level file. If the input kana does not match any kana in the first-level file, the control unit 2 word-matches it against the kana in the second-level file and, if there is a match, takes the kanji for that kana (the name of a city, ward, or the like) from the second-level file. In the example of FIG. 4, "カワゴエシ" (Kawagoe-shi) falls under this case, but since it does not bear on the invention its explanation is omitted from FIG. 4. The control unit 2 stores the place-name kanji data taken from the hard disk 5 in the memory 2-1. In practice, first-level and second-level place names are fewer in number than character-level place names and generally cause no problem. When the conventional processing of step S4 is finished, the kana data of the next word is taken out of the memory 2-1 (steps S5, S6); if it is again not at the character level, conventional processing is performed as described above (step S4).

[0014] Next, the control unit 2 takes the kana data of one word out of the memory 2-1 (steps S5, S6). If the input kana is at the character level, the control unit 2 word-matches it against the kana in the character dictionary file on the hard disk 5. If the matching succeeds (the character dictionary file contains the same kana as the input kana), it takes the kanji data for that kana from the character dictionary file and generates kanji data (steps S7, S13). This kanji data is stored in the memory 2-1. If, in step S7, the input kana does not match any kana in the character dictionary file (the file contains no kana identical to the input kana), the control unit 2 consults the dictionary file on the hard disk 5 to check whether the kanji corresponding to the input kana has another accepted reading (steps S8, S9); if not, it treats the input as an error and generates no kanji data. If the kanji corresponding to the input kana does have another accepted reading, the control unit 2 corrects the input kana to the characters of that other reading and again word-matches the corrected kana against the kana in the character dictionary file (steps S10, S11). If this matching succeeds (the corrected kana is in the character dictionary file), it takes the kanji data for that kana from the character dictionary file and generates kanji data (steps S12, S13). If the character dictionary file does not contain the same kana as the corrected kana, the input is treated as an error and no kanji data is generated.
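Steps S7 to S13 can be sketched in Python as follows. This is a hedged illustration, not the patented implementation: the names `ALTERNATE_READINGS` and `convert_character_level` are hypothetical, the table holds only the "マチ"/"チョウ" pair mentioned in the description, and treating the alternative reading as a word-final suffix is an assumption about how the correction is applied.

```python
from typing import Dict, List, Optional

# Hypothetical table of readings that share a kanji (here 町).
ALTERNATE_READINGS: Dict[str, List[str]] = {
    "マチ": ["チョウ"],
    "チョウ": ["マチ"],
}


def convert_character_level(
    input_kana: str,
    character_dictionary: Dict[str, str],
    alternates: Dict[str, List[str]] = ALTERNATE_READINGS,
) -> Optional[str]:
    """Return the kanji for a character-level word, or None on error."""
    # S7: exact match against the character dictionary file.
    if input_kana in character_dictionary:
        return character_dictionary[input_kana]  # S13

    # S8, S9: does part of the input have another accepted reading?
    for reading, others in alternates.items():
        if not input_kana.endswith(reading):
            continue
        stem = input_kana[: -len(reading)]
        for other in others:
            corrected = stem + other  # S10: corrected kana
            # S11, S12: match the corrected kana against the dictionary.
            if corrected in character_dictionary:
                return character_dictionary[corrected]  # S13

    # No alternative reading, or the corrected kana is not registered.
    return None
```

For example, with a character dictionary containing only "サイワイチョウ" mapped to "幸町", the input "サイワイマチ" fails the exact match, is corrected to "サイワイチョウ", and is converted to "幸町".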

[0015] In the example of FIG. 4, "サイワイマチ" (Saiwai-machi) is not in the character dictionary file. The control unit 2 therefore searches a table, created within a word matching program provided separately on the hard disk 5, for another accepted reading of the kanji corresponding to, for example, "マチ" (machi), changes "マチ" to "チョウ" (chō), and matches the corrected kana "サイワイチョウ" (Saiwai-chō) against the character dictionary file (steps S7 to S11). Since the matching finds the corrected kana in the character dictionary file, the kanji data "幸町" for that kana is taken from the character dictionary file and generated as illustrated (steps S12, S13). As the above description shows, limited to character-level place names of an address, when the kanji of the place name has multiple accepted readings, input kana data that differs from the kana data registered in the character dictionary file can be corrected to the correct kana data (the same kana data as that registered in the character dictionary file), and the kanji data for that correct kana data can be taken from the character dictionary file. Thus, even if the input kana differs from the kana in the character dictionary file, the input kana can be corrected to the correct kana and converted into the correct kanji data. The present invention is not limited to this embodiment; various applications and modifications are possible without departing from the gist of the invention.

[0016]

[Effects of the Invention] As described above, according to the present invention, limited to character-level place names of an address, when the kanji of the place name has multiple accepted readings, input kana data that differs from the kana data registered in the character dictionary file can be corrected to the correct kana and converted into the correct kanji data, among other effects.

[Brief explanation of the drawing]

FIG. 1 is a flowchart showing one embodiment of the present invention.

FIG. 2 is an explanatory diagram outlining the word matching process according to the present invention.

FIG. 3 is a block diagram showing an embodiment of an OCR device according to the present invention.

FIG. 4 is an explanatory diagram showing an example of kana data correction.

[Explanation of symbols]

1 Handwritten kanji OCR device
2 Control unit
2-1 Memory
3 Keyboard
4 CRT
5 Hard disk
6 Conversion device
7 OCR system

Claims

[Claims]

Claim 1: An address kana-to-kanji conversion method in which the input kana data of one word of a place name at the character level of an address is word-matched against the kana data of a character dictionary file provided in a storage means and, when they match, the kanji data for that kana data is taken out of the character dictionary file, the method being characterized in that, when the character dictionary file contains no kana data identical to the input kana data, it is checked whether the kanji corresponding to the input kana data has another accepted reading; if it has, the input kana data is corrected into kana data based on that other reading; the corrected kana data is word-matched against the kana data of the character dictionary file; and, when they match, the kanji data for that kana data is taken out of the character dictionary file.