JPH07225807A

JPH07225807A - Census register data generator

Info

Publication number: JPH07225807A
Application number: JP6016114A
Authority: JP
Inventors: Kazuo Kobashi; 一夫小橋; Masatoshi Hino; 匡利樋野; Takuya Okamoto; 卓哉岡本; Takehiko Kumada; 武彦熊田; Shinichi Yokoi; 慎一横井; Yukio Sakano; 幸生坂野; Akitoshi Sakamoto; 晃敏坂本; Motoaki Kamata; 素明鎌田; Kensuke Sarai; 謙介皿井; Masao Nishizawa; 正夫西沢
Original assignee: Hitachi Ltd; Hitachi Information Systems Ltd
Current assignee: Hitachi Ltd; Hitachi Information Systems Ltd
Priority date: 1994-02-10
Filing date: 1994-02-10
Publication date: 1995-08-22

Abstract

PURPOSE: To perform the revision work and itemization work required for creating family register data efficiently, by systematizing the series of steps that create a family register database from family registers and by using code information. CONSTITUTION: Image data input from a scanner 1 is stored in a family register data storage unit 2, converted into code data by a character recognition unit 3, and temporarily stored in a family register master 4. A data processing unit 5 then edits the code data for revision and itemization on the basis of data read from a description pattern definition file 6, and outputs the result to a family register database 7, a display 9, and a printer 10. Family register data may also be entered from a keyboard 8; the code data stored in the family register master 4 is then sent to the data processing unit, edited for revision and itemization, and the edited family register data is output to the display 9 and the printer 10.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a system that creates family register data by converting family registers, which are stored and managed on paper, into the information needed for processing by a computer.

[0002]

[Prior Art] Conventionally, OCR or a character recognition device has been used in input work to convert image data into code data. However, data setup work for document data such as a family register also requires revision work, in which it is judged in light of the Family Register Act and related rules whether each matter recorded in the family register should be stored in the family register database that replaces the original register, and itemization work, in which entries written as sentences are divided into items. Data editing of this kind has not been taken into consideration.

[0003]

[Problems to Be Solved by the Invention] The prior art described above gives no consideration to the revision work and itemization work performed during data setup. Data setup work that combines the three kinds of work (data input, revision, and itemization) requires the knowledge of a specialist in family register administration, and when done by hand it takes time, errors are hard to find when they occur, and the data tends to be inconsistent. Because the work cannot be reduced to simple mechanical operations, it is also laborious.

[0004] The object of the present invention is to remedy these problems, to make the revision work and itemization work needed to create family register data more efficient, and to provide a family register data creation device that supports input work, revision work, and itemization work.

[0005]

[Means for Solving the Problems] To solve the above problems, the family register data creation device of the present invention inputs a family register as image information using family register input means, and stores the family register data in a family register data storage unit. Next, data conversion means performs character recognition on the family register data and converts it from image information into code information. Data that the data conversion means could not recognize (hereinafter, reject data) is output by output means, and corrected data is entered using correction data input means. The family register data converted into code information by the data conversion means and the family register data entered through the correction data input means are then stored in a family register master. The data conversion means then performs revision work, which selects whether the family register data stored in the family register master is to be registered in the family register database, and itemization work, which breaks the recorded matters into items element by element, and family register database creation means creates the family register database. The object is achieved in this way.

[0006] The object is also achieved as follows. The family register data creation device of the present invention uses sorting means to sort the family register data stored in the family register data storage unit into typed-character family register data and non-typed-character family register data. The typed-character data is character-recognized by the data conversion means and converted from image information into code information. The non-typed-character data is output by image data output means, and code data is entered through code data input means on the basis of that image data. The family register data converted into code information by the data conversion means, the family register data entered through the correction data input means, and the family register data entered through the code data input means are stored in the family register master.

[0007]

[Operation] Of the above means, the family register input means inputs the image information of the family register, and the data conversion means recognizes the image information of the family register data one character at a time and converts it into code information. The output means outputs characters that the data conversion means did not convert into code information, as image information, to a printed form or a display, and the correction data input means is used to enter, as code information, the image information output by the output means. The data conversion means also edits the data by performing revision work and itemization work on the code data of the family register stored in the family register master, and the family register data creation means stores the code information in the family register database.

[0008] The sorting means performs character recognition on the family register data and classifies it as typed-character family register data when the recognition rate is at or above a preset threshold, and as non-typed-character family register data when the recognition rate is below the threshold. The data conversion means recognizes the image information of the typed-character data one character at a time and converts it into code information; the output means outputs the typed-character data that character recognition did not convert into code information; and the correction data input means is used to enter, as code data, the typed-character data output by the output means. The image data output means outputs the non-typed-character data as image data to a printed form and a display, and the code data input means is used to enter code data on the basis of the image data output by the image data output means.

[0009] By combining the three kinds of work, namely data input, revision, and itemization, across the whole process from input of the family register data to creation of the database, the large volume of manual input work is reduced as far as possible and accurate, uniform data can be provided in a short time, which solves the problems of the prior art described above.

[0010]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings.

[0011] FIG. 1 is a system configuration diagram according to the present invention. In FIG. 1, 1 is a scanner, 2 is a family register data storage unit, 3 is a character recognition unit, 4 is a family register master, 5 is a data processing unit, 6 is a description pattern definition file, 7 is a family register database, 8 is a keyboard, 9 is a display, and 10 is a printer.

[0012] An overview of the operation of the family register data creation device of this embodiment follows. Image data input from the scanner 1 is stored in the family register data storage unit 2, then sent to the character recognition unit 3, converted into code data, and temporarily stored in the family register master 4. The data processing unit 5 then edits this code data for revision and itemization on the basis of data read from the description pattern definition file 6, and outputs the result to the family register database 7, the display 9, and the printer 10. Family register data can also be entered from the keyboard 8; in that case the code data stored in the family register master 4 is sent to the data processing unit 5 for the revision and itemization editing, and the edited family register data can then be output to the display 9 and the printer 10.

[0013] FIG. 3 shows an example of the format of a family register. The sentences entered in a family register generally follow fixed forms. As shown in FIG. 3, a family register text 301 is made up of several case sentences 302, each recording a matter such as a marriage or a birth. A family register text can therefore be expressed as follows.

[0014] Family register text = "case sentence" "case sentence" ... "case sentence"

Taking the family register text 301 as an example gives the following.

[0015] Family register text = "昭和拾年参月四日東京都台東区で出生同月五日父届出入籍" (born in Taito-ku, Tokyo, on March 4, Showa 10; notification by father on the 5th of the same month; entered in the register) "昭和参拾弐年九月八日青空晴子と婚姻届出東京都台東区東上野五丁目八番日立一郎戸籍から入籍" (notification of marriage to Aozora Haruko on September 8, Showa 32; entered from the family register of Hitachi Ichiro, 5-chome 8-ban, Higashi-Ueno, Taito-ku, Tokyo)

A case sentence in turn consists of identity items, which are the multiple pieces of extractable information such as "date of birth" and "date of notification", and feature data such as "で出生" (born in) and "届出入籍" (notification, entered in the register). The feature data is used to identify the case type (birth, marriage, divorce, death, transfer of register, and so on) and the description pattern, that is, the pattern formed by the order in which the identity items are written.

[0016] Case sentence = "identity item" "identity item" "feature data" "identity item" "identity item" "feature data" ...

Taking the case sentence 302 as an example gives the following.

[0017] Case sentence = "昭和拾年参月四日" (March 4, Showa 10) "東京都台東区" (Taito-ku, Tokyo) "で出生" (born in) "同月五日" (the 5th of the same month) "父" (father) "届出入籍" (notification, entered in the register)

In this sentence the feature data are "で出生" and "届出入籍".

[0018] For example, the description pattern data for the sentence "昭和拾年参月四日東京都台東区で出生同月五日父届出入籍" is "(date of birth) (place of birth) で出生 (date of notification) (notifier) 届出入籍", and the case sentence has the form "(identity item) (identity item) (feature data) (identity item) (identity item) (feature data) (NL)". Here "NL" is short for "new line" and indicates a line break.

[0019] FIG. 2 shows the processing flow of the family register data creation device of this embodiment. The family register input and storage means 101 inputs the family register 100 as image data through the scanner 1 and stores it in the family register data storage unit 2. The scanner 1 is, for example, an image scanner device, and the family register data storage unit 2 is, for example, an optical disc. The data conversion means 1060 performs character recognition on the family register image data stored in the family register data storage unit 2 and converts it into the corresponding character codes one character at a time. The output means 1063 prints the reject data that could not be recognized on the printer 10, or displays it on the display 9, as a reject list 1064.

[0020] FIG. 7 shows the print format of the reject list 1064. 700 is a heading, printed for example as "family register number". 701 is information indicating the position of the rejected character, for example the line number. 702 is the character position within the line; for example, when the rejected character is the 9th character as shown in FIG. 7, the character positions (8, 9, 10), which include one character before and one after, are displayed. 703 shows the sentence containing the rejected character as image information, and 704 is the result of character recognition of that sentence converted into code information. In 704, a character that could not be recognized at all is displayed as "?" as shown in FIG. 7. A character for which character recognition produced several candidates may also be flagged by displaying its character position 702 in the same way.
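The position display and the "?" placeholder described for the reject list can be sketched as follows. This is only an illustration: the function names are hypothetical, and clipping the window at the start and end of a line is an assumption that the text does not address.

```python
from typing import List, Optional, Tuple


def reject_window(position: int, line_length: int) -> Tuple[int, ...]:
    """Return the 1-based positions shown for a rejected character:
    the character itself plus one neighbour on each side.
    Clipping at the line edges is an assumption."""
    first = max(1, position - 1)
    last = min(line_length, position + 1)
    return tuple(range(first, last + 1))


def mask_unrecognized(recognized: List[Optional[str]]) -> str:
    """Render a recognition result, printing '?' where no character was recognized."""
    return "".join(ch if ch is not None else "?" for ch in recognized)


# The 9th character of a 20-character line was rejected.
print(reject_window(9, 20))
print(mask_unrecognized(["父", None, "出"]))
```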

[0021] Using the characters in the image information 703 as a guide, the correct characters for the rejected characters in the reject list 1064 are entered through the correction data input means 1065, and the corrected data overwrites the code information of the rejected characters in the code information 704. For normal data containing no reject data, a confirmation list 1062 can also be output by the output means 1061 to a form on the printer 10 or to the display 9 so that the result of character recognition by the data conversion means 1060 can be checked, and any errors found can be corrected through the correction data input means 1065. Next, the code data that the data conversion means 1060 recognized normally and the code data entered through the correction data input means are stored in the family register master 4. A logical check is then made of whether the data stored in the family register master 4 was entered correctly (108). This logical check 108 looks for contradictions in the entered data. It includes checks of the record structure, such as whether personal records exist for the registered domicile and the head of the register and whether there is at least one registered person; a check for characters that are rarely used in personal names, such as "死" (death) or "怨" (grudge); a check that the town name of the registered domicile actually exists; and a check that the spouse entry is "husband" or "wife". When the logical check 108 is finished, the data conversion means 109 performs the itemization work and revision work. Using the description pattern data stored in advance in the description pattern definition file 110, it edits the family register information stored in the family register master 4 into the required words and phrases (details are given later). After this editing, a consistency check is made to confirm the consistency of the edited data (111). This check covers date items (for example, the chronological order of dates) and consistency between items (whether husband and wife form a pair, whether an adoptive parents field exists where there is an adoption, whether the entry immediately preceding a recognition-of-unborn-child matter is a birth matter, and so on). When the consistency check 111 is finished, the family register information edited by the data conversion means 109 is edited by the family register database editing means 112 and stored in the family register database 7, which is made up of several databases.
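The logical check 108 can be pictured as a set of independent rules applied to one record. The sketch below is a minimal illustration, not the patented implementation: the dictionary layout, the field names, and the `known_towns` lookup are all assumptions, and only the example rules named in the text are covered.

```python
RARE_NAME_CHARS = {"死", "怨"}  # the two example characters given in the text
SPOUSE_VALUES = {"夫", "妻"}    # husband, wife


def logical_check(record, known_towns):
    """Return a list of problems found in one family register record.
    The record layout (a dict with these keys) is hypothetical."""
    problems = []
    if not record.get("head_of_register"):
        problems.append("missing head-of-register record")
    members = record.get("members", [])
    if len(members) < 1:
        problems.append("no registered person")
    for name in members:
        if RARE_NAME_CHARS & set(name):
            problems.append(f"unusual character in name: {name}")
    if record.get("town") not in known_towns:
        problems.append("unknown town name")
    spouse = record.get("spouse")
    if spouse is not None and spouse not in SPOUSE_VALUES:
        problems.append("spouse must be 夫 or 妻")
    return problems


good = {"head_of_register": "日立一郎", "members": ["日立一郎"], "town": "東上野", "spouse": "夫"}
print(logical_check(good, {"東上野"}))
```

Returning a list of problems rather than stopping at the first one lets an operator correct all issues in a record in one pass.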

[0022] FIG. 8 shows the structure of one of the databases. The data consists of item names 801 and item lengths in bytes 802. For example, the family register number 803 is a numeric item of 10 digits, and the father's name 804 is a Japanese-text item of 30 digits.
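A layout of this kind can be modelled as a list of item definitions with a type and a maximum length. The sketch below is an assumption-laden illustration: the names `Item`, `LAYOUT`, and `fits` are hypothetical, and the width is counted in characters because the text does not specify how bytes map to digits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    kind: str   # "numeric" or "japanese"
    width: int  # maximum length; counting characters is an assumption


# The two items named in the description of FIG. 8.
LAYOUT = [
    Item("family_register_number", "numeric", 10),
    Item("father_name", "japanese", 30),
]


def fits(item: Item, value: str) -> bool:
    """Check that a value matches the item's type and does not exceed its width."""
    if item.kind == "numeric" and not value.isdigit():
        return False
    return len(value) <= item.width


print(fits(LAYOUT[0], "0000012345"), fits(LAYOUT[1], "日立一郎"))
```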

[0023] The sorting means 103 of the character recognition unit 3 may also sort the family register image data into typed-character family register data 105 and non-typed-character family register data 1040. In this case, with the threshold set to a recognition rate of 90%, for example, the sorting means 103 classifies family register data whose character recognition rate for one line is 90% or higher as typed-character family register data, and data whose rate is below 90% as non-typed-character family register data. Non-typed-character family register data comes mainly from handwritten family registers, for which the character recognition rate is far lower than for typed characters. For the sorted non-typed-character family register data 1040, the image data output means 1041 prints the family register data 1042 on the printer 10 or shows it on the display 9, still as image data. Code data is then entered through the code data input means 1043 on the basis of the output image information. The sorted typed-character family register data 105 is input to the data conversion means 1060 and processed as described above. The code data entered through the code data input means 1043 is stored in the family register master 4.
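The threshold rule used by the sorting means can be sketched in a few lines. The representation of a recognized line as a list in which `None` marks an unrecognized character is an assumption, as is treating an empty line as non-typed; the 90% threshold is the example value from the text.

```python
from typing import List, Optional


def recognition_rate(recognized: List[Optional[str]]) -> float:
    """Fraction of characters in one line that were recognized (None = not recognized)."""
    if not recognized:
        return 0.0  # assumption: an empty line counts as unrecognized
    return sum(ch is not None for ch in recognized) / len(recognized)


def sort_line(recognized: List[Optional[str]], threshold: float = 0.90) -> str:
    """Classify a line as 'typed' when its rate is at or above the threshold."""
    return "typed" if recognition_rate(recognized) >= threshold else "non-typed"


line = list("東京都台東区で出生") + [None]  # 9 of 10 characters recognized
print(sort_line(line))
```

Lines classified as "non-typed" would be routed to manual entry from the image, while "typed" lines go on to character-by-character conversion.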

[0024] Next, the data conversion means 109 is described with reference to FIGS. 4 and 5. The description pattern data registered in advance is read from the description pattern definition file 6 and expanded into a table (400).

[0025] FIG. 6 shows the data structure of the description pattern definition file 6 and the internal table. The description pattern data is divided into multiple patterns for each kind of case, such as birth or marriage, and all the description pattern data needed to expand case sentences is stored. For example, when the case type is birth 600, there are 10 kinds of description pattern data, as shown at 601. Each piece of description pattern data represents a combination of feature data and identity items, with the feature data arranged in order of appearance and the identity items arranged in order of appearance. In pattern 1 (602), "で出生" (born in) and "届出入籍" (notification, entered in the register) are the feature data 603, and the description pattern data 604 is registered in the order "date of birth", "place of birth", "date of notification", "notifier".
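One way to hold such a table in memory is to store each description pattern as an ordered sequence of tokens and derive the feature data list and the identity item list from it. The sketch below is a hypothetical layout, not the file format of the patent: the class names, the English item names, and the choice to store the tokens interleaved are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str   # "item" for an identity item, "feature" for feature data
    value: str  # item name, or the literal feature string


@dataclass(frozen=True)
class DescriptionPattern:
    case_type: str
    tokens: tuple  # Token objects in the order they appear in the sentence

    @property
    def features(self):
        """Feature data in order of appearance."""
        return tuple(t.value for t in self.tokens if t.kind == "feature")

    @property
    def items(self):
        """Identity items in order of appearance."""
        return tuple(t.value for t in self.tokens if t.kind == "item")


# Pattern 1 for the case type "birth", following the example in the text.
BIRTH_PATTERN_1 = DescriptionPattern("birth", (
    Token("item", "date_of_birth"),
    Token("item", "place_of_birth"),
    Token("feature", "で出生"),
    Token("item", "date_of_notification"),
    Token("item", "notifier"),
    Token("feature", "届出入籍"),
))

# Internal table: case type -> patterns in the order they were registered.
PATTERN_TABLE = {"birth": [BIRTH_PATTERN_1]}

print(PATTERN_TABLE["birth"][0].features)
```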

[0026] Returning to FIG. 4, family register data is next input from the family register master 4 (401). If multiple sets of family register data have already been entered, the family register data to be processed is taken out. Once the family register data has been input, the description pattern data expanded in the table is matched against it. The family register data is stored with the case sentence data of the register organized as one record per sentence. To avoid matching against every registered piece of description pattern data, the case type is first identified using the feature data that identifies the case type of the entry (403). If the case type is not uniquely determined at this point (NO at 404), the sentence is matched against the feature data of the description patterns for each candidate case type (405). Because the feature data of the description patterns of a case type that is not the true one never matches the feature data of the sentence being processed, the case type can be determined uniquely. Once the case type of the entry has been determined, the description pattern is identified. This is done by a matching process that compares the input data with the feature data identifying each description pattern, for all description patterns of the case type concerned, in the order in which they were expanded in the internal table (406). When the matching process is finished (407), the case sentence is expanded using the description pattern data of the identified description pattern (408). An example of case sentence expansion follows.
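The two-stage matching in steps 403 to 406 (narrow down the case type first, then find the description pattern) can be sketched as below. Everything here is illustrative: the key strings in `KEY_FEATURES` and the marriage feature data are assumptions, since the text gives feature data only for the birth pattern, and plain substring search stands in for whatever matching the device actually uses.

```python
# Hypothetical key strings used to narrow down the case type (step 403).
KEY_FEATURES = {"birth": "出生", "marriage": "婚姻"}

# Feature data per description pattern, in registration order (step 406).
# The marriage entry is an assumed example.
PATTERNS = {
    "birth": [("で出生", "届出入籍")],
    "marriage": [("と婚姻届出", "戸籍から入籍")],
}


def features_in_order(sentence, features):
    """True if every feature string occurs in the sentence, in the given order."""
    pos = 0
    for feature in features:
        found = sentence.find(feature, pos)
        if found < 0:
            return False
        pos = found + len(feature)
    return True


def identify(sentence):
    """Return (case_type, pattern_index) for the first matching pattern, or None."""
    candidates = [ct for ct, key in KEY_FEATURES.items() if key in sentence]
    for case_type in candidates:
        for index, features in enumerate(PATTERNS[case_type]):
            if features_in_order(sentence, features):
                return case_type, index
    return None


print(identify("昭和拾年参月四日東京都台東区で出生同月五日父届出入籍"))
```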

[0027] Input data: "平成５年３月１日東京都千代田区で出生同月２日父届出同月３日同区長から送付入籍" (born in Chiyoda-ku, Tokyo, on March 1, Heisei 5; notification by father on the 2nd of the same month; sent by the mayor of the same ward on the 3rd of the same month; entered in the register)

Description pattern: (date of birth) (place of birth) で出生 (date of notification) (notifier) 届出 (date of receipt) (accepting official) から送付入籍

The input data is matched against the description pattern data and the case sentence is expanded as follows.

[0028] (a) The first piece of information in the description pattern data is "date of birth", and the next piece of information is an identity item.

[0029] Because "date of birth" is a date, the text is split at "日" (day) and "平成５年３月１日" is extracted.

(b) The next piece of information is "place of birth", and it is followed by feature data.

[0030] The text is split at the feature data "で出生" and "東京都千代田区" is extracted.

(c) The next piece of information is "date of notification", and it is followed by an identity item.

[0031] Because "date of notification" is a date, the text is split at "日" and "同月２日" is extracted.

(d) The next piece of information is "notifier", and it is followed by feature data.

[0032] The text is split at the feature data "届出" and "父" is extracted.

(e) The next piece of information is "date of receipt", and it is followed by an identity item.

[0033] Because "date of receipt" is a date, the text is split at "日" and "同月３日" is extracted.

(f) The next piece of information is "accepting official", and it is followed by feature data.

[0034] The text is split at the feature data "から送付入籍" and "同区長" is extracted.

(g) The next piece of information is NL.

[0035] End.

When expansion has proceeded as far as NL (409), the case sentence data is set in the work area of the data processing unit (410).
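The expansion walk-through above can be reproduced with a short routine that consumes the sentence from left to right: a date item is cut at the first "日", and any other identity item is cut at the feature data that follows it. This is a sketch under stated assumptions: the English item names, the `DATE_ITEMS` set, and the error handling are hypothetical, and the case of a non-date item followed directly by another identity item is rejected because the text gives no rule for it.

```python
# Items that are dates and are therefore cut at "日" (assumed names).
DATE_ITEMS = {"date_of_birth", "date_of_notification", "date_of_receipt"}

# Description pattern from the example, as (kind, value) pairs in sentence order.
PATTERN = [
    ("item", "date_of_birth"),
    ("item", "place_of_birth"),
    ("feature", "で出生"),
    ("item", "date_of_notification"),
    ("item", "notifier"),
    ("feature", "届出"),
    ("item", "date_of_receipt"),
    ("item", "accepting_official"),
    ("feature", "から送付入籍"),
]


def expand(sentence, pattern):
    """Split a case sentence into identity items according to a description pattern."""
    fields = {}
    pos = 0
    for i, (kind, value) in enumerate(pattern):
        if kind == "feature":
            # Feature data must appear literally at the current position.
            if not sentence.startswith(value, pos):
                raise ValueError(f"feature data {value!r} not found at position {pos}")
            pos += len(value)
        elif value in DATE_ITEMS:
            # A date ends at the first "日" (day).
            end = sentence.find("日", pos)
            if end < 0:
                raise ValueError(f"no date found for {value}")
            fields[value] = sentence[pos:end + 1]
            pos = end + 1
        else:
            # Any other item runs up to the feature data that follows it.
            if i + 1 >= len(pattern) or pattern[i + 1][0] != "feature":
                raise ValueError(f"{value} is not followed by feature data")
            end = sentence.find(pattern[i + 1][1], pos)
            if end < 0:
                raise ValueError(f"delimiter for {value} not found")
            fields[value] = sentence[pos:end]
            pos = end
    if pos != len(sentence):
        raise ValueError("text left over after the end of the pattern")
    return fields


sentence = "平成５年３月１日東京都千代田区で出生同月２日父届出同月３日同区長から送付入籍"
for name, text in expand(sentence, PATTERN).items():
    print(name, text)
```

Raising an error on any mismatch keeps a wrongly chosen pattern from silently producing plausible-looking fields.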

[0036] The sentence is reconstructed from the case sentence data set in the work area of the data processing unit and the description pattern, and its consistency with the input data is checked (411). The result is classified as consistent or inconsistent (412). If there is no inconsistency, the case sentence data in the work area of the data processing unit is set in the save area of the data processing unit (413). If there is an inconsistency, the sentence is treated, for example, as an exceptional description pattern (414).
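The round-trip check in steps 411 and 412 amounts to rebuilding the sentence from the extracted items and the pattern, then comparing it with the input. A minimal sketch, assuming the pattern is held as ordered (kind, value) pairs and the extracted items as a dictionary (both hypothetical representations):

```python
# Birth pattern 1 as (kind, value) pairs in sentence order (assumed representation).
PATTERN = [
    ("item", "date_of_birth"),
    ("item", "place_of_birth"),
    ("feature", "で出生"),
    ("item", "date_of_notification"),
    ("item", "notifier"),
    ("feature", "届出入籍"),
]


def reconstruct(pattern, fields):
    """Rebuild the sentence by concatenating items and feature data in pattern order."""
    return "".join(fields[value] if kind == "item" else value for kind, value in pattern)


def is_consistent(sentence, pattern, fields):
    """True when the reconstructed sentence equals the input sentence."""
    return reconstruct(pattern, fields) == sentence


fields = {
    "date_of_birth": "昭和拾年参月四日",
    "place_of_birth": "東京都台東区",
    "date_of_notification": "同月五日",
    "notifier": "父",
}
print(is_consistent("昭和拾年参月四日東京都台東区で出生同月五日父届出入籍", PATTERN, fields))
```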

[0037] Any case sentence data in the save area that was written in abbreviated form is restored according to predetermined rules (415), for example by changing "同月" (the same month) to "３月" (March). This completes the itemization of the family register data (416).
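The restoration rule for "同月" can be sketched as a pass over the dates in sentence order that remembers the most recent explicit month. This is one possible rule, not the one the patent necessarily uses: the function name is hypothetical, and the sketch assumes that an explicit date always contains "年" (year) before the month.

```python
import re

# Captures the month part (for example "３月") that follows the year in an explicit date.
MONTH_AFTER_YEAR = re.compile(r"年(.+?月)")


def restore_same_month(dates):
    """Replace a leading "同月" (the same month) with the month of the last explicit date."""
    restored = []
    current_month = None
    for date in dates:
        match = MONTH_AFTER_YEAR.search(date)
        if match:
            current_month = match.group(1)
        elif date.startswith("同月") and current_month is not None:
            date = current_month + date[len("同月"):]
        restored.append(date)
    return restored


print(restore_same_month(["平成５年３月１日", "同月２日", "同月３日"]))
```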

[0038] Next, the revision work on the itemized family register data is described with reference to FIG. 5. The data processing unit 5 holds a registration condition definition table in which the conditions for judging whether a case type should be stored in the family register database are registered. On the basis of this condition definition table, each case sentence of the itemized family register data is first selected for registration or not according to its case type (501). For example, the case names "marriage" and "recognition" are subject to registration, whereas "death" and "divorce" are not. For the family register data selected for registration, whether it should be registered is then decided from its feature data on the basis of the condition definition table (502). For example, a case sentence containing the feature data "を認知" (recognized a child) is not subject to registration. Next, whether the family register data selected for registration is a case sentence that is still in effect is decided from the case sentences that follow it and the condition definition table (503). For example, if the feature data "婚姻取消" (annulment of marriage) appears in the data following a case sentence of case type "marriage", that marriage is no longer in effect and the case sentence is not subject to registration.
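The three selection steps (501 to 503) can be sketched as filters driven by a small condition table. The table contents below cover only the examples given in the text; the variable names, the English case type labels, and the record layout (a dict with `case_type` and `text`) are assumptions.

```python
# Hypothetical registration condition definition table, limited to the examples in the text.
REGISTRABLE_CASE_TYPES = {"marriage", "recognition"}  # step 501
EXCLUDED_FEATURES = {"を認知"}                          # step 502
CANCELLING_FEATURES = {"marriage": ["婚姻取消"]}         # step 503


def select_for_registration(sentences):
    """Return the case sentences to register.
    `sentences` is a list of dicts with 'case_type' and 'text', in register order."""
    selected = []
    for i, sentence in enumerate(sentences):
        # 501: select by case type.
        if sentence["case_type"] not in REGISTRABLE_CASE_TYPES:
            continue
        # 502: exclude sentences that contain excluded feature data.
        if any(feature in sentence["text"] for feature in EXCLUDED_FEATURES):
            continue
        # 503: exclude sentences cancelled by a later sentence.
        cancels = CANCELLING_FEATURES.get(sentence["case_type"], [])
        if any(c in later["text"] for later in sentences[i + 1:] for c in cancels):
            continue
        selected.append(sentence)
    return selected


register = [
    {"case_type": "marriage", "text": "青空晴子と婚姻届出"},
    {"case_type": "death", "text": "死亡"},
]
print([s["case_type"] for s in select_for_registration(register)])
```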

【００３９】[0039]

【発明の効果】本発明によれば、戸籍簿から戸籍データ
ベースを作成するための入力作業と改製作業、及び項目
化作業を含む一連の過程をシステム化して、コード情報
を用いることにより、人手による判断が最小限の作業で
戸籍データを編集して戸籍データベースを作成できると
いう効果がある。According to the present invention, the series of processes for creating a family register database from family registers, including the input work, the revision work, and the itemizing work, is systematized and code information is used, so that family register data can be edited and a family register database created with a minimum of manual judgment.

【００４０】また、手書きの戸籍簿や汚れの多い戸籍簿
が大量に混在する場合、文字認識しやすいタイプ文字型
戸籍データと文字認識しずらい非タイプ文字型戸籍デー
タとに仕分けしてから戸籍データベースの作成処理を行
うことができ、効率良く戸籍データベースを作成でき
る。Furthermore, when a large number of handwritten or heavily soiled family registers are mixed in, the data can be sorted into type-character family register data, which is easy to recognize, and non-type-character family register data, which is difficult to recognize, before the family register database is created, so the database can be created efficiently.
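The sorting step can be illustrated with a short sketch. It assumes the criterion recited in claim 2, namely whether the character recognition rate is at or above a predetermined value. The threshold of 0.9, the `(page_id, rate)` input shape, and the function name are hypothetical.

```python
def sort_by_recognition_rate(pages, threshold=0.9):
    """Split pages into type-character and non-type-character groups.

    `pages` is an iterable of (page_id, recognition_rate) pairs. The 0.9
    threshold is an illustrative stand-in for the "predetermined value".
    """
    type_character, non_type_character = [], []
    for page_id, rate in pages:
        if rate >= threshold:
            type_character.append(page_id)
        else:
            non_type_character.append(page_id)
    return type_character, non_type_character


pages = [("p1", 0.98), ("p2", 0.42), ("p3", 0.90)]
type_character, non_type_character = sort_by_recognition_rate(pages)
```

Pages in the second group would presumably need more manual correction, which is why separating them before database creation can save effort.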

[Brief description of drawings]

【図１】本発明に係るシステム構成図である。FIG. 1 is a system configuration diagram according to the present invention.

【図２】実施例の戸籍データ作成装置の処理の流れを示
す図である。FIG. 2 is a diagram showing a flow of processing of the family register data creation device of the embodiment.

【図３】戸籍簿の形式の一例を示す図である。FIG. 3 is a diagram showing an example of a format of a family register.

【図４】項目化作業の処理の流れを示す図である。FIG. 4 is a diagram showing a flow of processing of itemization work.

【図５】改製作業の処理の流れを示す図である。FIG. 5 is a diagram showing the processing flow of the revision work.

【図６】記載パターンデータの構成を示す図である。FIG. 6 is a diagram showing a structure of written pattern data.

【図７】リジェクトリストの一例を示す図である。FIG. 7 is a diagram showing an example of a reject list.

【図８】戸籍データベースのデータ構成の一例を示す図
である。FIG. 8 is a diagram showing an example of a data structure of a family register database.

[Explanation of symbols]

１・・・スキャナー、２・・・戸籍簿データ記憶部、３
・・・文字認識部、４・・・戸籍簿マスタ、５・・・デ
ータ処理部、６・・・記載パターン定義ファイル、７・
・・戸籍データベース 1: scanner; 2: family register data storage unit; 3: character recognition unit; 4: family register master; 5: data processing unit; 6: description pattern definition file; 7: family register database

Continuation of the front page (フロントページの続き)
(72) Inventor: Takuya Okamoto (岡本卓哉), Systems Development Laboratory, Hitachi, Ltd., 1099 Ozenji, Asao-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Takehiko Kumada (熊田武彦), Public Information Division, Hitachi, Ltd., 1-6-27 Shinsuna, Koto-ku, Tokyo
(72) Inventor: Shinichi Yokoi (横井慎一), Public Information Division, Hitachi, Ltd., 1-6-27 Shinsuna, Koto-ku, Tokyo
(72) Inventor: Yukio Sakano (坂野幸生), Public Information Division, Hitachi, Ltd., 1-6-27 Shinsuna, Koto-ku, Tokyo
(72) Inventor: Akitoshi Sakamoto (坂本晃敏), Hitachi Information Systems, Ltd., 1-16-5 Dogenzaka, Shibuya-ku, Tokyo
(72) Inventor: Motoaki Kamata (鎌田素明), Hitachi Information Systems, Ltd., 1-16-5 Dogenzaka, Shibuya-ku, Tokyo
(72) Inventor: Kensuke Sarai (皿井謙介), Hitachi Information Systems, Ltd., 1-16-5 Dogenzaka, Shibuya-ku, Tokyo
(72) Inventor: Masao Nishizawa (西沢正夫), Hitachi Information Systems, Ltd., 1-16-5 Dogenzaka, Shibuya-ku, Tokyo
(72) Inventor: Shinji Omura (大村真二), Hitachi Information Systems, Ltd., 1-16-5 Dogenzaka, Shibuya-ku, Tokyo
(72) Inventor: Akiko Kikuchi (菊池章子), Hitachi Information Systems, Ltd., 1-16-5 Dogenzaka, Shibuya-ku, Tokyo
(72) Inventor: Katsuhisa Shibata (柴田克尚), Hitachi Information Systems, Ltd., 1-16-5 Dogenzaka, Shibuya-ku, Tokyo
(72) Inventor: Masaru Akatsuka (赤塚勝), Hitachi Information Systems, Ltd., 1-16-5 Dogenzaka, Shibuya-ku, Tokyo

Claims

[Claims]

1. A family register data creation device comprising: family register input means for inputting a family register as image information; a family register data storage unit for storing the family register data input by the family register input means; data conversion means for performing character recognition on the family register data stored in the family register data storage unit and converting the image information into code information; a family register master for storing the data code-converted by the data conversion means; and family register database creation means for creating a family register database from the data stored in the family register master.

2. The family register data creation device according to claim 1, wherein the data conversion means includes means for sorting the family register data stored in the family register data storage unit into type-character family register data and non-type-character family register data according to whether or not the character recognition rate of that data is equal to or greater than a predetermined value.

3. The family register data creation device according to claim 1, further comprising storage means for storing a plurality of items of characteristic data such as "birth" and "notification registration", and a plurality of items of written pattern data each representing a combination of identification items such as "birth date" and "notification date", wherein the family register database creation means compares the family register data input from the family register master with the data stored in the storage means to identify the matching written pattern data, and then divides the input family register data by characteristic data or by identification item.
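As an informal illustration of the comparison recited in claim 3, the sketch below models each item of written pattern data as a regular expression whose named groups are the identification items. This is an assumption made for illustration; the claim does not say how patterns are represented. The English sample sentences, the pattern wording, and the function name are hypothetical.

```python
import re

# Hypothetical written pattern data: a characteristic label plus a regular
# expression whose named groups stand for the identification items.
WRITTEN_PATTERNS = [
    ("birth", re.compile(
        r"^Born on (?P<birth_date>.+?); "
        r"notification registered on (?P<notification_date>.+?)\.$")),
    ("marriage", re.compile(
        r"^Married (?P<spouse>.+?) on (?P<marriage_date>.+?)\.$")),
]


def itemize(case_sentence):
    """Return the items of the first matching pattern, or None if none match."""
    for characteristic, pattern in WRITTEN_PATTERNS:
        match = pattern.match(case_sentence)
        if match:
            return {"characteristic": characteristic, **match.groupdict()}
    return None


items = itemize("Born on 3 March 1950; notification registered on 10 March 1950.")
unmatched = itemize("An entry that fits no stored pattern.")
```

A sentence that matches no stored pattern returns `None` here; how such sentences are handled in practice is outside what this sketch covers.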