JPH02148265A - Automatic indexing system

Info

Publication number: JPH02148265A
Application number: JP63301036A
Authority: JP
Inventors: Akiko Mikami; 三上　明子
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-11-30
Filing date: 1988-11-30
Publication date: 1990-06-07

Abstract

PURPOSE: To extract only the more important nouns as index words by taking nouns accompanied by case particles from the results of morphological analysis and removing unnecessary words from those nouns.

CONSTITUTION: A morpheme analyzing part 3 uses a morpheme analysis dictionary file 2 to divide a sentence 1 into words and attaches part-of-speech information to each word, generating word data 4. A noun extracting part 6 uses a case particle taking-out rule file 5 to extract nouns accompanied by case particles, generating noun data 7. Because only nouns accompanied by case particles are extracted, the nouns that carry important meaning in the sentence are obtained. An unnecessary word eliminating part 9 uses an unnecessary word dictionary file 8 to remove unnecessary words such as pronouns from the noun data 7, and an index word 10 is extracted. Thus, when the number of index words assigned to the target sentence is limited, the more important nouns are taken out as index words.

Description

[Detailed description of the invention]

[Industrial application field] The present invention relates to an automatic indexing system.

[Prior art] Conventionally, an automatic indexing system performs morphological analysis on the target text, extracts all of the nouns, and then removes unnecessary words from the extracted nouns to obtain the index words.

[Problems to be solved] In the conventional system described above, every noun other than an unnecessary word is extracted as an index word. As a result, a large number of index words are extracted for the target text, and there is no information for gauging the relative importance of the index words.

Consequently, the conventional system cannot extract only the more important nouns in the target text.

The present invention has been made in view of the above problems, and its purpose is to provide an automatic indexing system that can extract only the more important nouns as index words.

[Means for solving the problem] To achieve the above object, the automatic indexing system of the present invention comprises: a morphological analysis means that morphologically analyzes the target text to create word data; a noun extraction means that extracts nouns accompanied by case particles from the word data obtained by the morphological analysis means to create noun data; and an unnecessary word removal means that removes unnecessary words from the noun data obtained by the noun extraction means to extract index words.

[Example] An embodiment of the present invention is described below with reference to the drawing.

FIG. 1 is a block diagram of an embodiment of the automatic indexing system according to the present invention.

In the automatic indexing system of this embodiment, the morphological analysis unit 3 first uses the morphological analysis dictionary file 2 to divide the sentence 1 into words, and attaches part-of-speech information to each word to create word data 4.
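The word-splitting step can be illustrated with a toy sketch. The patent does not specify the analysis algorithm or the dictionary format, so the greedy longest-match strategy, the `Word` type, the `DICTIONARY` contents, and the part-of-speech labels below are all assumptions made for illustration.

```python
from typing import NamedTuple


class Word(NamedTuple):
    surface: str  # the word as it appears in the sentence
    pos: str      # part-of-speech label (labels are illustrative)


# Hypothetical stand-in for the morphological analysis dictionary file 2.
DICTIONARY = {
    "私": "noun",
    "が": "particle",
    "図書館": "noun",
    "で": "particle",
    "本": "noun",
    "を": "particle",
    "読む": "verb",
}


def analyze(sentence: str, dictionary: dict[str, str]) -> list[Word]:
    """Split a sentence by greedy longest match against the dictionary.

    Characters that match no entry are emitted one at a time as 'unknown'.
    """
    max_len = max(map(len, dictionary))
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            if candidate in dictionary:
                words.append(Word(candidate, dictionary[candidate]))
                i += length
                break
        else:
            words.append(Word(sentence[i], "unknown"))
            i += 1
    return words


# Corresponds to word data 4 for the sample sentence.
word_data = analyze("私が図書館で本を読む", DICTIONARY)
```

A production analyzer would resolve ambiguous segmentations with a far larger dictionary and connection costs; the sketch only shows the shape of the input and of the resulting word data.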

Next, the noun extraction unit 6 uses the case particle extraction rule file 5 to extract the nouns accompanied by case particles, and creates noun data 7. Extracting the nouns accompanied by case particles makes it possible to pick out the nouns that carry important meaning in the text.
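A minimal sketch of this extraction step follows. The patent does not describe the contents of the rule file, so the rule used here (keep a noun when the very next word is a listed case particle), the `CASE_PARTICLES` set, and the pre-tagged `word_data` are assumptions for illustration.

```python
# Hypothetical stand-in for the case particle extraction rule file 5.
CASE_PARTICLES = frozenset({"が", "を", "に", "で"})

# Pre-tagged word data for "今日私が図書館で本を読む"
# ("Today I read a book at the library"); tags are illustrative.
word_data = [
    ("今日", "noun"),
    ("私", "noun"),
    ("が", "particle"),
    ("図書館", "noun"),
    ("で", "particle"),
    ("本", "noun"),
    ("を", "particle"),
    ("読む", "verb"),
]


def extract_case_marked_nouns(word_data, case_particles):
    """Return the nouns that are immediately followed by a case particle."""
    nouns = []
    # Pair each word with its successor; the last word has no successor.
    for (surface, pos), (next_surface, next_pos) in zip(word_data, word_data[1:]):
        if pos == "noun" and next_pos == "particle" and next_surface in case_particles:
            nouns.append(surface)
    return nouns


noun_data = extract_case_marked_nouns(word_data, CASE_PARTICLES)
```

In this sample, 今日 ("today") is a noun with no case particle after it, so it is left out of the noun data, while 私, 図書館, and 本 are kept.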

Then, the unnecessary word removal unit 9 uses the unnecessary word dictionary file 8 to remove unnecessary words from the noun data 7. Unnecessary words here are nouns such as pronouns.
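The removal step amounts to a dictionary lookup, sketched below. The entries of `UNNECESSARY_WORDS` are a few Japanese pronouns chosen for illustration; the actual contents of the unnecessary word dictionary file are not given in the patent.

```python
# Hypothetical stand-in for the unnecessary word dictionary file 8.
UNNECESSARY_WORDS = frozenset({"私", "これ", "それ", "あれ"})


def remove_unnecessary_words(noun_data, unnecessary_words):
    """Drop every noun listed in the unnecessary word dictionary, keeping order."""
    return [noun for noun in noun_data if noun not in unnecessary_words]


# Noun data 7 as it might look after the extraction step.
noun_data = ["私", "図書館", "本"]
index_words = remove_unnecessary_words(noun_data, UNNECESSARY_WORDS)
```

Here the pronoun 私 ("I") is removed, leaving 図書館 ("library") and 本 ("book") as the index words.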

Through the above operations, the index word 10 is extracted.

[Effects of the invention] As explained above, the present invention provides, for automatic indexing, a means for morphologically analyzing the target text, a means for extracting nouns accompanied by case particles from the result of the morphological analysis, and a means for removing unnecessary words from the extracted nouns. It therefore has the effect that only the more important nouns can be extracted, rather than all of the nouns in the text.

As a result, when the number of index words that can be assigned to a target text is limited, the more important nouns can be extracted as index words.
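The difference from the conventional approach can be seen in a small side-by-side sketch. The sample sentence, its tags, and both dictionaries are assumptions for illustration, and the case-particle rule (a noun immediately followed by a listed particle) is one plausible reading of the rule file rather than the patent's stated rule.

```python
# Pre-tagged word data for "昨日彼が新しい辞書を買った"
# ("Yesterday he bought a new dictionary"); tags are illustrative.
WORD_DATA = [
    ("昨日", "noun"),
    ("彼", "noun"),
    ("が", "particle"),
    ("新しい", "adjective"),
    ("辞書", "noun"),
    ("を", "particle"),
    ("買った", "verb"),
]
CASE_PARTICLES = frozenset({"が", "を", "に", "で"})  # hypothetical rule file
UNNECESSARY_WORDS = frozenset({"彼"})                  # hypothetical dictionary


def conventional_index(word_data, unnecessary_words):
    """Prior art: take every noun, then drop unnecessary words."""
    return [s for s, pos in word_data
            if pos == "noun" and s not in unnecessary_words]


def case_particle_index(word_data, case_particles, unnecessary_words):
    """Proposed flow: keep only case-marked nouns, then drop unnecessary words."""
    nouns = [
        s
        for (s, pos), (nxt, nxt_pos) in zip(word_data, word_data[1:])
        if pos == "noun" and nxt_pos == "particle" and nxt in case_particles
    ]
    return [n for n in nouns if n not in unnecessary_words]


conventional = conventional_index(WORD_DATA, UNNECESSARY_WORDS)
proposed = case_particle_index(WORD_DATA, CASE_PARTICLES, UNNECESSARY_WORDS)
```

For this sentence the conventional flow keeps both 昨日 ("yesterday") and 辞書 ("dictionary"), whereas the case-particle flow keeps only 辞書, so a fixed budget of index words is spent on fewer candidates.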

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing an embodiment of the automatic indexing system according to the present invention.

1: Sentence
2: Morphological analysis dictionary file
3: Morphological analysis unit
4: Word data
5: Case particle extraction rule file
6: Noun extraction unit
7: Noun data
8: Unnecessary word dictionary file
9: Unnecessary word removal unit
10: Index word

Agent: Patent Attorney Kihei Watanabe

Claims

[Claims]

An automatic indexing system comprising: a morphological analysis means for morphologically analyzing a target text to create word data; a noun extraction means for extracting nouns accompanied by case particles from the word data obtained by the morphological analysis means to create noun data; and an unnecessary word removal means for removing unnecessary words from the noun data obtained by the noun extraction means to extract index words.