JPH0492966A - Numerical quantity expression processing system

Info

Publication number: JPH0492966A
Application number: JP2207931A
Authority: JP
Inventors: Shinichiro Kamei; 亀井　眞一郎
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1990-08-06
Filing date: 1990-08-06
Publication date: 1992-03-25
Anticipated expiration: 2013-12-24
Also published as: JP2841778B2

Abstract

PURPOSE: To accurately identify phrase structure in text processing, to assign readings using that structure, and to accurately determine the appropriate expression in the target language in translation, by organizing and distinguishing the classifier function and the noun function. CONSTITUTION: The dictionary of Japanese classifiers, nouns, and suffixes describes, in organized form, the syntactic features and semantic interpretation of each word when it is related to a numeral, so quantity expressions can be analyzed and generated accurately by using the dictionary. Examples of the dictionary contents are shown in the figure. Because the corresponding English expressions are also organized and described in the dictionary, Japanese quantity expressions can be matched accurately to English ones. The numerical quantity expression processing system is thus a natural language processing system that takes natural language as input and identifies its structure and meaning; it has a dictionary of Japanese classifiers, nouns, and suffixes that describes the syntactic features and semantic interpretation of each word when it is related to a numeral, and it analyzes Japanese containing quantity expressions using that information. As a result, Japanese sentences involving classifiers can be processed accurately. The system can be applied, for example, to a machine translation system.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a method for processing quantity expressions in natural language.

[Conventional technology]

Japanese, as a natural language, requires classifiers to count nouns. A classifier is a measure word that appears immediately after a numeral; typical examples are 「つ」 (tsu) in 「一つ」 and 「二つ」, and 「人」 in 「一人」 and 「二人」.

Until now, no processing method has been able to uniformly and correctly analyze, identify, and interpret linguistic expressions involving classifiers.

[Problem to be solved by the invention]

As stated above, no processing method exists that can uniformly and correctly analyze, identify, and interpret linguistic expressions involving classifiers. This is because Japanese classifiers have the properties described below.

(i) Distinguishing nouns from classifiers

In Japanese, a numeral is normally connected to a noun through a classifier. For example, counting animals requires the classifier 「匹」 (hiki), as in 「三匹の犬」 ("three dogs"); the numeral and the noun are never joined directly, as in 「三犬」. The 「匹」 in 「三匹」 is a typical classifier and never functions as an independent noun on its own. Nouns and classifiers thus generally have entirely different functions.

However, an ordinary noun is sometimes placed immediately after a numeral and used as a measure word, like 「袋」 (bag) in 「一袋のトマト」. To identify the meaning of this expression correctly and map it to appropriate English, the noun function and the classifier function of 「袋」 must be distinguished. If 「袋」 in 「一袋のトマト」 were merely an ordinary noun, English would follow the ordinary noun word order and give "tomatoes of a bag". The correct English for 「一袋のトマト」 is "a bag of tomatoes": as in Japanese, the measure word precedes the noun, with the preposition "of" in between.

Conventionally, the method for distinguishing classifiers from nouns was unclear, so this phenomenon could not be handled correctly.

(ii) Correspondence with English

Japanese uses 「人」 to count people and 「本」 to count pencils, but English has no equivalent expressions.

English simply joins the numeral directly to the noun: 「一人の男」 is "a man" and 「二本の鉛筆」 is "two pencils".

Classifiers are thus far less developed in English than in Japanese.

However, when counting paper, English uses a classifier-equivalent expression, "a sheet of paper", just as Japanese says 「一枚の紙」.

Conventionally, questions such as when Japanese and English diverge and when they correspond had not been sorted out. As a result, adequate dictionary descriptions could not be written, and no processing method for handling these phenomena correctly had been fully established.

An object of the present invention is to eliminate these drawbacks and to provide a quantity expression processing method that accurately processes expressions involving classifiers.

[Means to solve the problem]

The present invention is a natural language processing method that takes natural language as input and identifies its structure and meaning. It is characterized by having a dictionary section in which a dictionary of Japanese parts of speech describes the syntactic features and semantic interpretation of each part of speech when it is related to a numeral, and by analyzing Japanese containing quantity expressions using the information in the dictionary section.

[Effect]

In the classifier processing method of the present invention, the dictionary of Japanese classifiers, nouns, and suffixes describes, in organized form, the syntactic features and semantic interpretation of each word when it is related to a numeral, so quantity expressions can be analyzed and generated correctly by using the dictionary. Fig. 4 shows an example of the dictionary contents. The dictionary also describes the corresponding English expressions in organized form, so Japanese quantity expressions can be matched correctly to English ones.

The linguistic phenomena involving Japanese classifiers that form the background of the present invention are explained in detail below with reference to Figs. 2 and 3. Dictionary contents for classifier-related words, such as those shown in Fig. 4, are created on the basis of the analysis shown in Figs. 2 and 3.

First, the types of classifiers are explained with reference to Fig. 2.

Fig. 2 is a classification diagram of classifiers, used to explain the dictionary contents of the present invention. The numbers in Fig. 2 are classification symbols for classifiers. These symbols are recorded in a classifier dictionary such as the one shown in Fig. 4.

Fig. 2 first divides the semantic content of classifiers into two broad classes: those indicating "amount" (classification symbol 1) and those indicating "order" (classification symbol 2).

The "amount" expressed by a classifier falls into four types:
classification symbol 11, "the total quantity of a collection of things";
classification symbol 12, "the kinds making up the whole" (e.g. 種類, 通り);
classification symbol 13, "an attribute quantity of a thing" (e.g. cm, 才);
classification symbol 14, "the frequency of an action" (e.g. 回, 周り, 楯).

Furthermore, the ways of counting "the total quantity of a collection of things" fall into three types:
classification symbol 111, "counting the thing itself" (e.g. 人, 個, 匹, 枚);
classification symbol 112, "counting in sets" (e.g. 組, 足, 段, 層);
classification symbol 113, "counting with a measuring container" (e.g. 杯, 袋, 箱, 皿).

How a classifier may be used is restricted by the meaning and form of the noun that follows it. For example, the classifiers of classification symbol 111 include the following classes:
(a) neutral: 「つ」
(b) implying the shape of the noun: 「本」, 「片」, 「枚」, 「粒」
(c) implying the semantic class of the noun: 「人」, 「冊」, 「台」, 「脚」

Similarly, classification symbol 112 includes:
(a) neutral: 「組」, 「セット」
(b) number of constituents: 「対」, 「ペア」; number of constituents plus the semantic class of the noun: 「足」, 「つがい」
(c) shape of the grouping: 「重」, 「段」, 「層」, 「束」
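The classification symbols of Fig. 2 lend themselves to a small lookup table. The Python sketch below is illustrative only: it assumes, as the numbering suggests, that the prefixes of a symbol name its parent classes (111 under 11 under 1). The English glosses and the function names `ancestors` and `describe` are not from the patent.

```python
# Classification symbols of Fig. 2 with English glosses (the glosses are ours).
CLASSIFICATION = {
    "1": "amount",
    "11": "total quantity of a collection of things",
    "111": "counting the thing itself",
    "112": "counting in sets",
    "113": "counting with a measuring container",
    "12": "kinds making up the whole",
    "13": "attribute quantity of a thing",
    "14": "frequency of an action",
    "2": "order",
}


def ancestors(code):
    """Return the chain of symbols from the top-level class down to `code`.

    Assumption: every prefix of a symbol names one of its parent classes.
    """
    return [code[:i] for i in range(1, len(code) + 1)]


def describe(code):
    """Join the glosses of a symbol and all of its parent classes."""
    return " > ".join(CLASSIFICATION[c] for c in ancestors(code))


print(describe("113"))
```

Storing the symbols as strings rather than integers keeps the parent lookup a plain prefix operation.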

Next, the points concerning Japanese classifiers that the prior art did not address are explained with reference to Figs. 2 and 3.

First, the point not addressed by the prior art regarding (i) above, "distinguishing nouns from classifiers", is explained. Fig. 3 shows the classification of words needed to distinguish classifiers from nouns. In Fig. 3, classifier-related words, which were not clearly distinguished before, are divided into six word groups according to their positional relationship with the numeral.

Two positions relative to the numeral are considered.

Normally in Japanese a noun is connected to a numeral through a classifier. For example, counting animals requires the classifier 「匹」, as in 「三匹の犬」; the numeral and the noun are never joined directly, as in 「三犬」. This is the syntactic property of a typical noun.

By contrast, the 「匹」 in 「三匹」 is a typical classifier: it is always placed immediately after the numeral, and no other element can come between it and the numeral.

Nouns and classifiers thus generally appear in entirely different positions. Fig. 3 shows this difference in row 1, "ordinary noun", and row 6, "ordinary classifier". In Fig. 3, the symbol ○ indicates that a word can occur in that position, and the symbol × indicates that it cannot.

Previously, every word that could stand immediately after a numeral was regarded as classifier-like, and no clear classification was made. However, some words behave differently from both ordinary nouns and ordinary classifiers. There are four such word groups, shown in rows 2 to 5 of Fig. 3. All four can be placed immediately after a numeral.

First, some words are used both as nouns and as classifiers, like 「袋」 in 「一袋のトマト」.

In the classification of Fig. 2, such words are typically nouns meaning "container", under classification symbol 113. When 「袋」 itself is counted, it takes a classifier and is counted as an ordinary noun, as in 「一つの袋」 ("one bag"). When it comes immediately after the numeral, as in 「一袋」, it is not the bag that is counted; 「袋」 serves as a classifier for counting the noun that follows, for example 「トマト」 (tomatoes). For this word group the referent is the same whether the word is used as a noun or as a classifier, but its syntactic behavior as a quantity expression differs. When placed immediately after a numeral, that is, when used as a classifier, it also behaves differently from an ordinary noun in English: it is placed before the noun with the preposition "of" in between, as in "a bag of tomatoes". This word group is shown in row 4 of Fig. 3.

Another word group resembles the 「袋」 type: words such as 「語」 (word), 「文」 (sentence), 「列」 (column), and 「行」 (row), shown in row 2 of Fig. 3. Like the 「袋」 type, they are nouns that can stand immediately after a numeral. Unlike the 「袋」 type, however, when the thing itself is counted, 「一語」 without a classifier means exactly the same as 「一つの語」 with one. In other words, this group can stand immediately after a numeral but does not work as a classifier. Its semantic characteristic is that the thing denoted is itself a constituent of something larger. Nouns are generally counted through a classifier such as 「一つ」 or 「二つ」, as under classification symbol 111, "counting the thing itself", in Fig. 2; but a noun with the property just described can be counted without a classifier.

A further word group, listed in row 5 of Fig. 3, consists of words such as 「本」, 「人」, 「台」, and 「頭」. These resemble the 「袋」 type in that they can appear both as nouns and as classifiers, but their meaning as a noun is entirely different from their meaning as a classifier.

For example, 「本」 as a noun denotes a book, but as a classifier it is used to count long, thin objects and has nothing to do with books. In many cases the pronunciation also differs between the noun and the classifier.

For example, 「人」 is read "hito" as a noun but "ri" or "nin" as a classifier.

Yet another word group whose distinction from classifiers has been unclear is listed in row 3 of Fig. 3: suffixes such as 「誌」 for magazines, 「校」 for schools, 「社」 for companies, and 「紙」 for newspapers. These are placed immediately after a numeral, as in 「三誌」 and 「四校」, and, unlike ordinary nouns, 「誌」 or 「校」 alone does not function as a noun. In this respect they closely resemble classifiers such as the 「つ」 in 「一つ」, but semantically they differ from classifiers. A classifier is used to count some other noun, whereas 「誌」, 「校」, and the like count themselves; in that respect they resemble 「語」, 「文」, and so on above.

Words such as 「語」 and 「文」 stand on their own as nouns, whereas 「誌」, 「校」, and the like have little syntactic independence. The two groups nevertheless share the property that they do not function as measure words, that is, as classifiers. This is also evident in the English correspondence: for example, 「校」, meaning 「学校」, corresponds to the noun "school".

As described above, classifier-related words, which were previously not clearly organized, are classified as in Fig. 3 and described in the dictionary according to this classification, so the quantity expression processing method of the present invention can distinguish nouns from classifiers correctly. An example of the dictionary contents is shown in Fig. 4.
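The six word groups of Fig. 3 can be tabulated by the properties the description attributes to them. The following Python sketch is one possible encoding and not the patent's own table: the three boolean columns, the English labels, and the helper `groups_for` are assumptions made for illustration, and the example words are the ones quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordGroup:
    row: int                  # row number in Fig. 3
    label: str                # English label (ours)
    examples: tuple           # example words quoted in the description
    follows_numeral: bool     # may stand directly after a numeral
    independent_noun: bool    # may be used as a free-standing noun
    counts_other_noun: bool   # acts as a measure word for another noun


WORD_GROUPS = (
    WordGroup(1, "ordinary noun", ("犬",), False, True, False),
    WordGroup(2, "noun that combines with a numeral",
              ("語", "文", "列", "行"), True, True, False),
    WordGroup(3, "suffix that combines with a numeral",
              ("誌", "校", "社", "紙"), True, False, False),
    WordGroup(4, "noun and classifier, same meaning",
              ("袋",), True, True, True),
    WordGroup(5, "noun and classifier, different meaning",
              ("本", "人", "台", "頭"), True, True, True),
    WordGroup(6, "ordinary classifier", ("匹", "つ"), True, False, True),
)


def groups_for(word):
    """Return the Fig. 3 rows whose example list contains `word`."""
    return [g.row for g in WORD_GROUPS if word in g.examples]


print(groups_for("袋"))
```

Rows 4 and 5 share the same three flags; what separates them in the description is whether the noun meaning and the classifier meaning coincide, which this sketch records only in the label.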

Next, the point not addressed by the prior art regarding (ii) above, "correspondence with English", is explained.

The classification of Fig. 2 also makes it possible to organize the correspondence between Japanese and English expressions. The expressions differ most under classification symbol 111, "counting the thing itself", and classification symbol 2, "order". Within classification symbol 111, English has no classifiers corresponding to 「人」, 「個」, and the like, which imply that the noun denotes discrete individuals. Even under the same symbol 111, however, when the noun denotes something non-discrete, English does have words with a classifier function, as in "a sheet of paper" for 「一枚の紙」. Outside classification symbol 111, English also has classifier-equivalent expressions for "set" (組), "container" (容器), "unit" (単位), "kind" (種類), and "times" (回): "two groups of" for sets, "two cups of" for containers, "2 kg of" for units, "two kinds of" for kinds, and "two times" for times.
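The correspondences listed above amount to a small table from classifier category to English frame. The sketch below records only the example phrases quoted in the description; the category names, the dictionary name `ENGLISH_FRAME`, and the helper are hypothetical labels chosen for this illustration.

```python
# Category of Japanese classifier -> English frame quoted in the description.
# None means English joins the numeral directly to the noun.
ENGLISH_FRAME = {
    "discrete individual": None,          # 人, 個: "a man", "two pencils"
    "non-discrete thing": "a sheet of",   # 枚: "a sheet of paper"
    "set": "two groups of",               # 組
    "container": "two cups of",           # 容器
    "unit": "2 kg of",                    # 単位
    "kind": "two kinds of",               # 種類
    "times": "two times",                 # 回
}


def has_english_equivalent(category):
    """True when English uses a classifier-equivalent word for the category."""
    return ENGLISH_FRAME[category] is not None


print(has_english_equivalent("discrete individual"))
```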

The other major difference between Japanese and English concerns expressions of "order" (classification symbol 2). As is well known, English expresses order with ordinal numbers, whereas Japanese attaches a classifier to a cardinal number.

In addition, Fig. 2 divides so-called units into volume, capacity, and weight (classification symbol 113) and other attributes (classification symbol 13). This division rests on the semantic difference between "the total quantity of a collection of things" and "an attribute of the thing itself". In English this semantic difference is reflected in syntax: weight and the like take a construction in which the quantity precedes the preposition "of", as in "3 kg of sugar", whereas other attributes take a construction in which the quantity comes after it, as in "an output of 30 mV".
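The two English word orders for units can be captured in a few lines. This is a minimal sketch, not the patent's procedure: the function name `unit_phrase` is hypothetical, the article is fixed to "a" for simplicity, and "a length of 5 cm" is an invented example (the text only names cm as a class 13 unit).

```python
def unit_phrase(code, quantity, unit, noun):
    """Place a measured quantity relative to the noun by classification symbol.

    113 (volume, capacity, weight): the quantity precedes "of".
    13 (other attributes): the quantity follows "of".
    """
    if code == "113":
        return f"{quantity} {unit} of {noun}"
    if code == "13":
        # Simplification: the article is always "a".
        return f"a {noun} of {quantity} {unit}"
    raise ValueError(f"no unit word order defined for symbol {code}")


print(unit_phrase("113", 3, "kg", "sugar"))
print(unit_phrase("13", 5, "cm", "length"))
```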

As explained above, the present invention organizes classifier-related words, which were previously unorganized, and describes their behavior in the dictionary, so Japanese sentences involving classifiers can be processed correctly.

[Example]

Next, an embodiment of the present invention is described with reference to the drawings.

Fig. 1 is a block diagram showing one embodiment of the present invention.

The quantity expression processing system of Fig. 1 comprises a morphological analysis unit 10, a quantity expression recognition unit 20, a dictionary 30, and a quantity expression generation unit 40.

In this system, the morphological analysis unit 10 segments the input Japanese sentence into words.

When a sentence contains a quantity expression, the quantity expression recognition unit 20 identifies the structure and meaning of that expression, referring to the dictionary contents for classifiers described in the dictionary 30. The processing procedure of the quantity expression recognition unit 20 is shown in Fig. 5.

As shown in Fig. 5, the unit receives the result of the morphological analysis of the input sentence (step S1), consults the dictionary 30 to retrieve the dictionary contents of each morpheme (step S2), and examines the part of speech of the word immediately after the numeral (step S3).

If this word is a classifier (step S4), the unit checks whether its classification code is "111" and whether it is used for discrete nouns (step S5). If the result of step S5 is YES, the unit creates the semantic structure value - quantity - noun (step S6). If the result of step S5 is NO, it creates the semantic structure value - quantity - classifier - quantity - noun (step S7).

If, in step S3, the word immediately after the numeral is a suffix that combines with a numeral (step S8), the unit creates the semantic structure value - quantity - suffix (step S9).

If, in step S3, the word immediately after the numeral is a noun that combines with a numeral (step S10), the unit creates the semantic structure value - quantity - suffix (step S11).

If, in step S3, the word immediately after the numeral is not a classifier, a suffix that combines with a numeral, or a noun that combines with a numeral, the unit recognizes the input sentence as a syntactic construction (step S12).
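The branching of Fig. 5 described above (steps S3 to S12) can be sketched as a single function. This is a rough illustration under stated assumptions, not the patented implementation: the part-of-speech labels, the tuple representation of a semantic structure, and the `None` return for step S12 are all choices made here, and step S11 follows the wording of the text as given.

```python
def recognize(pos, code=None, discrete=False):
    """Return the shape of the semantic structure chosen by the Fig. 5 branches.

    `pos` is the part of speech of the word right after the numeral;
    `code` and `discrete` are the classifier's dictionary attributes.
    """
    if pos == "classifier":                                   # step S4
        if code == "111" and discrete:                        # step S5: YES
            return ("value", "quantity", "noun")              # step S6
        return ("value", "quantity", "classifier",
                "quantity", "noun")                           # step S7
    if pos == "numeral_suffix":                               # step S8
        return ("value", "quantity", "suffix")                # step S9
    if pos == "numeral_noun":                                 # step S10
        return ("value", "quantity", "suffix")                # step S11, as worded
    return None                                               # step S12 (assumed)


print(recognize("classifier", code="111", discrete=True))
print(recognize("classifier", code="113"))
```

The first call corresponds to a classifier such as 「本」 counting discrete objects, where the classifier leaves no trace in the structure; the second to a container classifier such as 「袋」, which is kept as a node of its own.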

In this way, the quantity expression recognition unit 20 creates a semantic structure and passes it to the quantity expression generation unit 40.

The dictionary 30 is connected to the morphological analysis unit 10, the quantity expression recognition unit 20, and the quantity expression generation unit 40. It is a dictionary of Japanese classifiers, nouns, and suffixes, and it describes, in organized form, the syntactic features and semantic interpretation of each word when it is related to a numeral. An example of the dictionary contents is shown in Fig. 4. The contents in Fig. 4 are written according to the classification of the present invention shown in Figs. 2 and 3. Fig. 4 shows the Japanese surface form, its part of speech and semantic function, and the corresponding English part of speech and surface form. In the semantic function field of a classifier, a classification code is assigned according to the classification of Fig. 2.
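A dictionary record with the five fields listed for Fig. 4 might look like the sketch below. The field names, the `Entry` type, and the `lookup` helper are hypothetical, since Fig. 4 itself is not reproduced in this text; the four sample entries follow what the description says about 「本」 and 「袋」, with `None` standing for the "-" mark on the English side.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    surface: str                # Japanese surface form
    pos: str                    # Japanese part of speech
    function: Optional[str]     # Fig. 2 classification code (classifiers only)
    en_pos: Optional[str]       # English part of speech
    en_surface: Optional[str]   # English surface form; None stands for "-"


DICTIONARY = [
    Entry("本", "classifier", "111", None, None),
    Entry("本", "noun", None, "noun", "book"),
    Entry("袋", "classifier", "113", "classifier noun", "bag"),
    Entry("袋", "noun", None, "noun", "bag"),
]


def lookup(surface, after_numeral):
    """Pick the classifier entry right after a numeral, else the noun entry."""
    wanted = "classifier" if after_numeral else "noun"
    for entry in DICTIONARY:
        if entry.surface == surface and entry.pos == wanted:
            return entry
    return None


print(lookup("袋", after_numeral=True))
```

Keeping the noun and classifier readings as separate records mirrors the statement that one written form can carry both a noun entry and a classifier entry.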

The quantity expression generation unit 40 receives the semantic structure of the quantity expression identified by the quantity expression recognition unit 20, and outputs English using the English-side information in the dictionary contents.

The processing procedure of the quantity expression generation unit 40 is shown in Fig. 6.

As shown in Fig. 6, the unit receives the semantic structure from the quantity expression recognition unit 20 (step S21), consults the dictionary 30 (step S22), and checks whether a classifier is included (step S23). If the result of step S23 is YES, it generates the words in the order numeral, classifier noun, preposition "of", noun (step S24). If the result is NO, it generates the words in the order numeral, noun (step S25).
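The two output orders of Fig. 6 reduce to one conditional. The sketch below is illustrative only: the function name and its parameters are hypothetical, and the caller is assumed to supply already inflected English words (article choice and pluralization are outside this sketch).

```python
def generate(numeral, noun, classifier_noun=None):
    """Order English words following the two branches of Fig. 6."""
    if classifier_noun is not None:                      # step S23: YES
        return [numeral, classifier_noun, "of", noun]    # step S24
    return [numeral, noun]                               # step S25


print(" ".join(generate("a", "pencils", "bag")))
print(" ".join(generate("one", "pencil")))
```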

In this way, the quantity expression generation unit 40 generates and outputs English.

Next, the operation of this embodiment is described.

As examples, consider the inputs 「一本の鉛筆」 and 「一袋の鉛筆」.

First, when 「一本の鉛筆」 is input, the morphological analysis unit 10 segments it as 一/本/の/鉛筆. This result is sent to the quantity expression recognition unit 20, which consults the dictionary 30.

The dictionary contents for 「本」 in the dictionary 30 are as shown in Fig. 4, and state that a classifier and a noun exist with the same written form 「本」. Through the processing shown in Fig. 5, the quantity expression recognition unit 20 determines that 「本」 is a classifier, because the numeral 「一」 and 「本」 are adjacent in this input. The unit therefore interprets the semantic structure of the input as the following structure.

This structure expresses that the quantity of pencils is "1".

On receiving this structure, the quantity expression generation unit 40 generates English from the information in the dictionary 30 through the processing shown in Fig. 6. In Fig. 4, the English side of the entry for the classifier 「本」 contains the symbol "-", which indicates that no English word corresponds to the classifier 「本」 and that none is needed in English. Based on this dictionary information, the quantity expression generation unit 40 outputs no word for 「本」 and generates "one pencil" or "a pencil".

Next, consider the input 「一袋の鉛筆」. As above, it is morphologically analyzed as 一/袋/の/鉛筆. When the quantity expression recognition unit 20 consults the dictionary 30, the entries for 「袋」 show, as in Fig. 4, that there are two: a noun and a classifier. Because the numeral 「一」 and 「袋」 are adjacent in the input, the unit determines that 「袋」 is a classifier. As shown in Fig. 4, the semantic function field of the dictionary indicates that the classifier 「袋」 serves as the measuring container when "counting with a measuring container". The following semantic structure is therefore created.

This structure expresses that the quantity of pencils, measured with a bag as the measuring container, is "1". On receiving this semantic structure, the quantity expression generation unit 40 consults the dictionary 30 and finds that the English word corresponding to 「袋」 is a "classifier noun". In English, classifier nouns take one of two word orders. For an expression of class 13 in Fig. 2, "an attribute of the thing itself", the classifier noun is placed after the noun with the preposition "of" in between, like "30 mV" in "an output of 30 mV". For class 113 in Fig. 2, "counting with a measuring container", the order is reversed and the classifier noun comes before the noun. The dictionary information shows that 「袋」 is the latter case, so the quantity expression generation unit 40 places "bag", corresponding to 「袋」, before the noun and outputs "a bag of pencils".

Thus the quantity expression processing system of this embodiment is a natural language processing system that takes natural language as input and identifies its structure and meaning. It has a dictionary of Japanese classifiers, nouns, and suffixes that describes the syntactic features and semantic interpretation of each word when it is related to a numeral, and it uses this information to analyze Japanese containing quantity expressions.

Because this embodiment organizes classifier-related words, which were previously unorganized, and describes their behavior in the dictionary, Japanese sentences involving classifiers can be processed correctly. Such a system can be applied, for example, to a machine translation system.

[Effects of the Invention]

As explained above, by organizing and distinguishing the classifier function and the noun function, the present invention makes it possible to accurately identify phrase structure in text processing, to assign readings using that structure, and to determine the appropriate expression in the target language in translation.

[Brief explanation of drawings]

Fig. 1 is a block diagram showing one embodiment of the present invention; Fig. 2 is a diagram explaining the classification of classifiers used in the present invention; Fig. 3 is a diagram explaining the distinction between classifiers and nouns; Fig. 4 is a diagram showing an example of the dictionary contents for classifiers, nouns, suffixes, and the like described in the dictionary 30 in Fig. 1; Fig. 5 is a flowchart explaining the operation of the quantity expression recognition unit 20 in Fig. 1; and Fig. 6 is a flowchart explaining the operation of the quantity expression generation unit 40 in Fig. 1.

10: morphological analysis unit
20: quantity expression recognition unit
30: dictionary
40: quantity expression generation unit

Claims

[Claims]

(1) A quantity expression processing method in a natural language processing method that takes natural language as input and identifies its structure and meaning, characterized by having a dictionary section in which a dictionary of Japanese parts of speech describes the syntactic features and semantic interpretation of each part of speech when it is related to a numeral, and by analyzing Japanese containing quantity expressions using the information in the dictionary section.