JPH0581313A

JPH0581313A - Dictionary preparing device

Info

Publication number: JPH0581313A
Application number: JP3240711A
Authority: JP
Inventors: Noboru Hatano; 昇波多野
Original assignee: KOBE NIPPON DENKI SOFTWARE KK; NEC Software Kobe Ltd
Current assignee: KOBE NIPPON DENKI SOFTWARE KK; NEC Software Kobe Ltd
Priority date: 1991-09-20
Filing date: 1991-09-20
Publication date: 1993-04-02

Abstract

PURPOSE: To prepare an index by automatically expanding the inflected forms of a headword. CONSTITUTION: A language data creation unit 2 creates new language data for a dictionary. A table lookup unit 3 consults an inflected-form expansion table 4, using the information stored in the language data (the base form of the headword, its part of speech, and its inflection type) as the lookup key. The inflected-form expansion table 4 expands the inflected forms according to the inflection pattern in the matched entry. An index creation unit 7 links each expanded inflected form to the language data and prepares the index.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a dictionary creation device, and in particular to a device for creating electronic dictionaries for computer processing.

[0002]

[Prior Art] With the recent development of natural language processing technology, electronic dictionaries for computer processing have become increasingly important. Because the amount of information described in a dictionary determines the performance of a natural language processing system, the dictionary must be reliable, free of bugs, and of high quality. To that end, dictionary development devices that allow linguistic information to be entered through menus have been proposed.

[0003] To make lookup more efficient, many dictionaries have an index part for retrieval that is separate from the data part describing the linguistic information. With conventional dictionary development devices, creating the data part is easy, but the index part is created by a person who works out the headwords (base form and inflected forms) for each word and links each one individually to the language data.

[0004]

[Problems to Be Solved by the Invention] With the conventional method described above, creating the index part is inevitably a very costly task. Moreover, in a language with many irregular inflections, such as English, incorrect inflected forms are often linked, and linking mistakes caused by the creator's carelessness cannot be prevented.

[0005] In addition, because physical link information is the only thing that maintains the connection between the index and the language data, restoring the index part is almost impossible if this link information is damaged.

[0006]

[Means for Solving the Problems] The present invention is a dictionary creation device that creates and updates an electronic dictionary for computer processing having a data part storing linguistic information and an index part for retrieval, characterized by comprising: an input unit that accepts input from a user; a language data creation unit that creates new language data based on the information entered through the input unit; an inflected-form expansion table that expands the inflected forms of a headword from the information stored in the language data; a table lookup unit that consults the inflected-form expansion table; an index creation unit that creates an index for the language data created by the language data creation unit; and a dictionary access unit that writes the language data and the index to the dictionary.

[0007]

[Embodiment] Next, the present invention will be described with reference to the drawings.

[0008] FIG. 1 is a block diagram of an embodiment of the present invention.

[0009] The input unit 1 receives linguistic information, such as the base form of a headword and its part of speech, as input from the user and passes it to the language data creation unit 2.

[0010] The language data creation unit 2 creates new language data for the dictionary based on the input information. The table lookup unit 3 consults the inflected-form expansion table 4, using as arguments the headword's base form, part of speech, inflection type, and other information stored in the language data.

[0011] The inflected-form expansion table 4 is a table made up of entries that each pair a search key with an inflection pattern, as shown in FIGS. 2 and 3, and consists of two tables: an irregular inflection expansion table 5 and a regular inflection expansion table 6. Words that inflect irregularly are registered in advance in the irregular inflection expansion table 5, and its search key is the base-form string, the part of speech, and the inflection type. In the regular inflection expansion table 6, by contrast, the search key is the string at the end of the base form, the part of speech, and the inflection type.
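The two tables described above could be modelled, as a minimal sketch, with the Python records below. The class and field names are hypothetical, and representing an inflection pattern as a mapping from form name to a suffix (or to a complete form) is an assumption; the patent shows the actual layout only in FIGS. 2 and 3.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IrregularEntry:
    """Entry of the irregular inflection expansion table (table 5)."""
    base_form: str                  # search key: whole base-form string
    part_of_speech: str             # search key
    inflection_type: Optional[str]  # search key; None means "unspecified"
    forms: Dict[str, str]           # assumed: form name -> complete inflected form


@dataclass
class RegularEntry:
    """Entry of the regular inflection expansion table (table 6)."""
    ending: str                     # search key: string at the end of the base form
    part_of_speech: str             # search key
    inflection_type: Optional[str]  # search key; None means "unspecified"
    suffixes: Dict[str, str]        # assumed: form name -> suffix replacing the ending


# One entry per table, filled with values from the Japanese example in this document.
iku = IrregularEntry("行く", "verb", "godan", {"continuative_2": "行っ"})
ku = RegularEntry("く", "verb", "godan", {"continuative_2": "い"})
```

The only structural difference between the two record types is the first key field: a whole base form for irregular words, and a word ending for regular ones.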

[0012] The tables are searched in order: first the irregular inflection expansion table 5, then the regular inflection expansion table 6. The inflected forms are expanded according to the inflection pattern in the first matching entry, and the result (the headword for each inflected form) is returned to the table lookup unit 3.
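The lookup order described above can be sketched in Python as follows. The function name, the dictionary-based entry layout, and the suffix-replacement representation of an inflection pattern are assumptions made for illustration; only the search order and the search keys come from the description.

```python
def expand_inflections(base, pos, infl_type, irregular_table, regular_table):
    """Return {form name: inflected headword} for one headword.

    The irregular table is searched first, then the regular table, and the
    first matching entry decides the result.
    """
    # Irregular table: the key is the whole base-form string.
    for entry in irregular_table:
        if (entry["base"], entry["pos"], entry["type"]) == (base, pos, infl_type):
            return dict(entry["forms"])
    # Regular table: the key is the string at the end of the base form.
    for entry in regular_table:
        if (base.endswith(entry["ending"])
                and (entry["pos"], entry["type"]) == (pos, infl_type)):
            stem = base[: len(base) - len(entry["ending"])]
            return {name: stem + suffix for name, suffix in entry["suffixes"].items()}
    return {}  # no entry matched; how the device handles this case is not described


# Tiny tables holding a single form each, taken from the Japanese example.
irregular = [{"base": "行く", "pos": "verb", "type": "godan",
              "forms": {"continuative_2": "行っ"}}]
regular = [{"ending": "く", "pos": "verb", "type": "godan",
            "suffixes": {"continuative_2": "い"}}]

print(expand_inflections("書く", "verb", "godan", irregular, regular))
print(expand_inflections("行く", "verb", "godan", irregular, regular))
```

Because 行く also ends in く, searching the irregular table first is what keeps it from being expanded to the incorrect 行い.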

[0013] The index creation unit 7 links each expanded inflected form to the language data to create an index. It then sends the created language data and index to the dictionary access unit 8.

[0014] The dictionary access unit 8 writes the created language data and its index to the data file and the index file of the dictionary 9, respectively.
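As a rough illustration of the linking step, the sketch below keeps the data part as a list of records and the index part as a mapping from each headword to record positions. The in-memory layout, the function name `register`, and the use of list positions as link information are assumptions; the patent does not specify the format of the data file or the index file.

```python
def register(data_part, index_part, language_data, inflected_forms):
    """Append one language-data record and index it under every headword."""
    record_id = len(data_part)          # assumed link: position in the data part
    data_part.append(language_data)
    headwords = {language_data["base_form"], *inflected_forms}
    for headword in headwords:
        # One headword may lead to several records (for example, homographs).
        index_part.setdefault(headword, []).append(record_id)
    return record_id


data_part, index_part = [], {}
register(data_part, index_part,
         {"base_form": "take", "part_of_speech": "verb"},
         ["take", "takes", "took", "taking", "taken"])

# Looking up any inflected form now reaches the same language data.
record = data_part[index_part["took"][0]]
print(record["base_form"])
```

Collecting the headwords in a set avoids indexing the same record twice when an inflected form coincides with the base form.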

[0015] Next, the operation of the embodiment of FIG. 1 and the changes in the data will be described with reference to FIG. 4.

[0016] For example, when inputs 01a and 01b are newly registered as a Japanese example, the language data creation unit 2 creates language data such as data 23a and 23b based on the input information passed from the input unit 1. At this point, only the base form of the headword, which is the input information, is linked in the index.

[0017] Next, the table lookup unit 3 consults the inflected-form expansion table 4. Because 「書く」 ("write") is a regularly inflecting verb, it matches the following entry in the regular inflection expansion table.

String at the end of the base form: く
Part of speech: verb
Inflection type: godan (five-grade) conjugation

The following inflected forms are then expanded.

Irrealis form 1 (未然形１): 書か
Irrealis form 2 (未然形２): 書こ
Continuative form 1 (連用形１): 書き
Continuative form 2 (連用形２): 書い
Terminal form (終止形): 書く
Attributive form (連体形): 書く
Hypothetical form (仮定形): 書け
Imperative form (命令形): 書け

By contrast, 「行く」 ("go") is an irregularly inflecting verb (with continuative form 2 it becomes 「行った」, not 「行いた」), so it matches the following entry in the irregular inflection expansion table.

Base-form string: 行く
Part of speech: verb
Inflection type: godan (five-grade) conjugation

The following inflected forms are then expanded.

Irrealis form 1 (未然形１): 行か
Irrealis form 2 (未然形２): 行こ
Continuative form 1 (連用形１): 行き
Continuative form 2 (連用形２): 行っ
Terminal form (終止形): 行く
Attributive form (連体形): 行く
Hypothetical form (仮定形): 行け
Imperative form (命令形): 行け

The index creation unit 7 links each of the resulting inflected forms to the language data and creates indexes such as data 78a and 78b.
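The two expansions listed above can be reproduced with a few lines of Python. This is only a sketch: the English form names, the table layout, and the helper `expand` are hypothetical, and storing the regular pattern as suffixes that replace the final く is an assumption about how the entry is encoded.

```python
FORM_NAMES = ["irrealis_1", "irrealis_2", "continuative_1", "continuative_2",
              "terminal", "attributive", "hypothetical", "imperative"]

# Regular entry: ending く, verb, godan. Suffixes replace the ending.
REGULAR_KU = dict(zip(FORM_NAMES, ["か", "こ", "き", "い", "く", "く", "け", "け"]))

# Irregular entry: base form 行く, verb, godan. Forms are stored whole.
IRREGULAR = {
    "行く": dict(zip(FORM_NAMES,
                   ["行か", "行こ", "行き", "行っ", "行く", "行く", "行け", "行け"])),
}


def expand(base_form):
    """Expand a godan verb ending in く, checking the irregular table first."""
    if base_form in IRREGULAR:
        return IRREGULAR[base_form]
    stem = base_form[:-1]  # drop the final く
    return {name: stem + suffix for name, suffix in REGULAR_KU.items()}


for verb in ("書く", "行く"):
    print(verb, list(expand(verb).values()))
```

The two results differ only in continuative form 2, which is the single irregularity the example points out.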

[0018] Finally, the language data and index created in this way are sent to the dictionary access unit 8, which writes them to the dictionary 9.

[0019] With respect to inflected-form expansion, the language data creation unit 2, the table lookup unit 3, and the index creation unit 7 do not depend on any particular language. By using an inflected-form expansion table for each language, inflected-form expansion and index creation can be performed automatically for any language.

[0020] As an example for another language, FIG. 5 shows the operation and the changes in the data for English input. The processing flow and the changes in the data are the same as in the Japanese example above; only the results expanded by the English inflected-form expansion table differ.

[0021] The matched entries and the expanded results are shown below.

"study" (regularly inflecting verb)
(Entry)
String at the end of the base form: consonant + y
Part of speech: verb
Inflection type: unspecified
(Result)
Base form: study
Third-person singular present: studies
Past: studied
Present participle: studying
Past participle: studied

"take" (irregularly inflecting verb)
(Entry)
String at the end of the base form: take
Part of speech: verb
Inflection type: unspecified
(Result)
Base form: take
Third-person singular present: takes
Past: took
Present participle: taking
Past participle: taken
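The English case differs in that the regular key is a class of endings ("consonant + y") rather than a literal string. A minimal Python sketch of one way to handle this is shown below; using a regular expression for the key and a small function for the inflection pattern are assumptions, as are all names, and only the single regular entry from the example is included.

```python
import re

# Irregular table: keyed by the whole base form, forms stored whole.
IRREGULAR = {
    "take": {"base": "take", "third_singular": "takes", "past": "took",
             "present_participle": "taking", "past_participle": "taken"},
}

# Regular table: (pattern for the end of the base form, builder for the forms).
REGULAR = [
    (re.compile(r"[^aeiou]y$"), lambda w: {   # "consonant + y"
        "base": w,
        "third_singular": w[:-1] + "ies",
        "past": w[:-1] + "ied",
        "present_participle": w + "ing",
        "past_participle": w[:-1] + "ied",
    }),
]


def expand_english_verb(word):
    """Check the irregular table first, then the regular entries in order."""
    if word in IRREGULAR:
        return IRREGULAR[word]
    for pattern, build in REGULAR:
        if pattern.search(word):
            return build(word)
    return {"base": word}  # no matching entry in this reduced table


print(expand_english_verb("study"))
print(expand_english_verb("take"))
```

The lookup procedure is the same as for Japanese; only the table contents change, which matches the language independence described in paragraph [0019].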

[Effects of the Invention] As described above, according to the present invention, a person does not need to create the inflected-form headwords individually when creating a dictionary, so the cost of dictionary creation can be reduced.

[Brief description of drawings]

FIG. 1 is a block diagram of an embodiment of the present invention.

FIG. 2 is a schematic diagram of the inflected-form expansion table in FIG. 1.

FIG. 3 is a schematic diagram of the inflected-form expansion table in FIG. 1.

FIG. 4 is an explanatory diagram showing processing in the embodiment of FIG. 1.

FIG. 5 is an explanatory diagram showing processing in the embodiment of FIG. 1.

[Explanation of symbols]

1 Input unit
2 Language data creation unit
3 Table lookup unit
4 Inflected-form expansion table
5 Irregular inflection expansion table
6 Regular inflection expansion table
7 Index creation unit
8 Dictionary access unit
9 Dictionary

Claims

[Claims]

1. A dictionary creation device that creates and updates an electronic dictionary for computer processing having a data part storing linguistic information and an index part for retrieval, the device comprising: an input unit that accepts input from a user; a language data creation unit that creates new language data based on the information entered through the input unit; an inflected-form expansion table that expands the inflected forms of a headword from the information stored in the language data; a table lookup unit that consults the inflected-form expansion table; an index creation unit that creates an index for the language data created by the language data creation unit; and a dictionary access unit that writes the language data and the index to the dictionary.

2. The dictionary creation device according to claim 1, wherein the inflected-form expansion table is a table made up of entries that each pair a search key with an inflection pattern and consists of two tables, an irregular inflection expansion table and a regular inflection expansion table; words that inflect irregularly are registered in advance in the irregular inflection expansion table, for which the base-form string, the part of speech, and the inflection type serve as the search key; and for the regular inflection expansion table, the string at the end of the base form, the part of speech, and the inflection type serve as the search key.