JPS6395569A

JPS6395569A - Language analyzing device

Info

Publication number: JPS6395569A
Application number: JP61240215A
Authority: JP
Inventors: Toshihiko Yokogawa; 横川　壽彦
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1986-10-11
Filing date: 1986-10-11
Publication date: 1988-04-26
Anticipated expiration: 2011-03-04
Also published as: JPH0821032B2

Abstract

PURPOSE: To achieve proper morphological analysis of expressions that include numerical values by including, in the dictionary data for a dictionary lookup unit that represents a number, an identification indicator showing that the unit represents a number. CONSTITUTION: A dictionary retrieving part 104 searches a word dictionary 18 on the basis of the search-key character string input from a unit segmenting part 102, extracts the dictionary information, and sends it to processing parts 110, 112, 114 and 118. The part 110 converts number words into numerical values and the part 114 processes number words joined by hyphens. A processing part 116 processes consecutive digits, and the part 118 groups a currency symbol and a numerical value into a single noun. A processing part 120 groups a numerical value and a unit into a single noun. A processing part 122 then combines the result of number-word conversion, hyphenated-number processing or consecutive-digit processing with the immediately preceding numerical value.

Description

DETAILED DESCRIPTION OF THE INVENTION. TECHNICAL FIELD: The present invention relates to a language analysis device, and more particularly to a language analysis device that analyzes natural language and is useful, for example, in an automatic translation device.

BACKGROUND ART: When a Japanese sentence is produced from a sentence in a foreign language such as English, the morphemes of the input English sentence are analyzed, its syntax is analyzed, its sentence structure is transformed, and a Japanese translation is then generated.

When the morphemes of a sentence are analyzed, number expressions in one language do not necessarily correspond one-to-one with number expressions in another language. For example, European languages such as English differ from Japanese in the basic idea behind counting, namely the grouping of place values. Consequently, if number words are mapped one-to-one between English and Japanese, the translation may be inappropriate. For example, English "ten" is 「十」 in Japanese and English "thousand" is 「千」. Under such a simple mapping, "ten thousand" is translated merely as 「十千」; in other words, the place values are misaligned. To convert 「十千」 into 「万」, the word for that place value, conventional systems kept a table of such correspondences and consulted it every time. If place-value conversion data of this kind is held as a correspondence table covering every place value, the total amount of data held by the system becomes excessive.

For example, a simple system that merely breaks the English number expression "a hundred and two thousand two hundred and four" into its components and replaces each with the corresponding Japanese number expression parses it only as [百と2千-2百と4] ("hundred and 2 thousand, 2 hundred and 4"). It should ultimately be understood, in Japanese as well, as "102,204", that is, 「10万2千2百4」.

In English, expressions such as "$1.5 million" are also common. To translate such a compound, which consists of a number with a unit symbol (including a currency symbol) attached, it must be correctly analyzed as the number "1.5 million" with the unit symbol "$" attached. However, a conventional system that performs only simple analysis, treating spaces (blank characters) as word boundaries, wrongly analyzes it as consisting of two elements, "1.5 dollars" and "million".

OBJECT: In view of these requirements, an object of the present invention is to provide a language analysis device capable of appropriate morphological analysis of expressions that include numerical values.

CONSTITUTION: To achieve the above object, the present invention provides a language analysis device having dictionary means in which dictionary data is stored for each dictionary lookup unit, and analysis means that divides an input sentence into dictionary lookup units and performs morphological analysis by consulting the dictionary means for each unit. The dictionary means includes, as dictionary data for a lookup unit that represents a number, an identification indicator showing that the unit represents a number. The analysis means consults the dictionary means for each lookup unit contained in the input sentence and, when the retrieved dictionary data contains the identification indicator, combines that lookup unit with a nearby lookup unit for which the indicator was also retrieved, computes a single numerical value from the values the two units denote, and treats the two lookup units as a single unit of analysis.

The invention is described in detail below on the basis of one embodiment.

FIG. 2 shows the overall configuration of an embodiment in which the language analysis device according to the present invention is applied to an English-Japanese automatic translation device. Needless to say, the invention applies effectively not only to an English-Japanese automatic translation device but also to any language analysis device that analyzes sentences of an input language, chiefly when one language is translated into another.

This embodiment has an input unit 10, through which the English text 12 to be translated into Japanese is input. The input unit 10 may include, for example, a keyboard with character keys such as alphanumeric keys and with function keys, an optical character reader (OCR) that reads English text recorded on paper, and/or a file storage device that reads English text recorded on a storage medium such as a magnetic disk.

The English text input through the input unit 10 is read into the pre-editing unit 14, which pre-processes it for translation. This stage mainly identifies sentences and handles unknown words. It functions as part of the morphological analysis.

The pre-edited English data is transferred to the morphological analysis unit 16 together with the information obtained in pre-editing. The morphological analysis unit 16 consults the word dictionary 18, divides the text into sentences, analyzes the morphemes of the English sentence, handles unknown words, groups items such as proper nouns, time expressions and number expressions, and performs sentence-level processing such as recognizing tag questions and apposition. The morphological analysis rules are stored in the analysis rule file 36.

The morphologically analyzed English data is transferred, together with the dictionary information obtained in morphological analysis, to the syntactic analysis I unit 20. The syntactic analysis I unit 20 applies grammar rules to the English data, analyzes the surface structure of the sentence, and finds all syntactic possibilities.

The English data parsed by the syntactic analysis I unit 20 is sent, together with its analysis information, to the syntactic analysis II unit 22. Here structural descriptions are applied to the surface-level results of syntactic analysis I to select a solution. A plausible parse tree of the English sentence is thereby created and its structure built. These syntactic analysis rules are likewise stored in the analysis rule file 36.

The parsed English data is transferred as parse tree data to the structure conversion unit 24. The structure conversion unit 24 creates the syntax tree of the corresponding Japanese sentence from the syntax tree that is the intermediate structure of the English sentence, converting it into a Japanese base structure from which a Japanese sentence is easily generated.

The syntax tree data representing the Japanese base structure obtained by this conversion is sent to the translation generation unit 2B, which generates the translation. This is the functional unit that generates a Japanese sentence from the tree structure of the Japanese syntax tree.

The generated Japanese sentence data, that is, the translation data, is sent to the post-editing unit 30. The post-editing unit 30 uses the information employed in the translation process and consults the dictionary 18 to correct the translation data and complete a more natural Japanese sentence. This Japanese sentence data is transferred to the output unit 32, which outputs it as the translated Japanese text 34. The output unit 32 includes, for example, a printer, a display, and/or a file storage device such as a magnetic disk.

The flow of this series of translation processes is controlled by the control unit 38, which supervises the entire device.

In this embodiment the word dictionary 18 stores dictionary data for English and Japanese words, describing not only the words themselves but also various information such as dependency (co-occurrence) relations, meaning, singular or plural number, and part of speech. The analysis rule file 36 stores the rule data for morphological and syntactic analysis.

An operation display unit 40 is connected to the control unit 38. The operation display unit 40 has operation keys, such as a translation instruction key and cursor keys, with which the operator gives various instructions to the device, and a display and indicators that visibly present the input English text, the translated Japanese sentences, intermediate data such as dictionary information, and various prompts to the operator.

Many of these operation and display functions may be incorporated in the keyboard where the input unit 10 has one, and in the display where the output unit 32 has one.

FIG. 1 illustrates the detailed configuration of the morphological analysis unit 16 as it relates to number processing. The morphological analysis unit 16 naturally has other analysis functions as well, but only the parts directly relevant to understanding the invention are shown. Morphological analysis proceeds from the head of the input character string: a dictionary search is requested for each search-key string in order, and the dictionary information obtained from the dictionary search unit 104 is processed according to, among other things, the numeric flag described below.

The morphological analysis unit 16 has an input processing unit 100, which receives and processes the input character string data from the pre-editing unit 14. English character string data is input to the input processing unit 100 in the form of code data such as ASCII, and the unit is provided with an input string buffer that temporarily stores this data.

The input character string data temporarily stored in the input processing unit 100 is sent to the unit extraction unit 102, which cuts it into dictionary lookup units such as words. The unit extraction unit 102 identifies the dictionary lookup units that make up the search-key strings with which the dictionary search unit 104 later searches the dictionary 18. The lookup delimiters used in this extraction are placed at any character other than a letter, digit, apostrophe, hyphen or period, and at an apostrophe that follows a blank character. They are stored in the delimiter table 10B, which the unit extraction unit 102 consults when cutting out lookup units.

In particular, the word dictionary 18 stores the information used to look up the extracted units. As the example entry information in FIG. 8 shows, each dictionary lookup unit, for example each word entry, holds grammatical information such as part of speech and, for a word that represents a number, an identification indicator showing that it represents a number, i.e. a numeric flag, together with numeric information giving its value.

As illustrated in the figure, the word dictionary 18 lists both singular and plural forms, each forming one entry. The numeric flag, when set to "1", indicates that the word denotes a number. Other registered information includes, for example, whether a noun is countable or uncountable, whether a verb is intransitive or transitive, and the translation equivalents. For example, "thousand" is a noun denoting a number, so its numeric flag is "1" and its value is "1000". By contrast, "thread" is a noun but not a number noun, i.e. not a numeral, so its numeric flag is registered as "0".
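The entry layout described above can be sketched as follows. This is only an illustration: the specification states that an entry holds grammatical information, a numeric flag and a numeric value, but the field names (`pos`, `numeric_flag`, `value`), the mapping `WORD_DICTIONARY` and the helper `is_number_word` are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    pos: str                     # part of speech (grammatical information)
    numeric_flag: int            # 1 if the word denotes a number, otherwise 0
    value: Optional[int] = None  # numeric value, present only when the flag is 1


# Hypothetical contents mirroring the two examples given in the text.
WORD_DICTIONARY = {
    "thousand": Entry(pos="noun", numeric_flag=1, value=1000),
    "thread": Entry(pos="noun", numeric_flag=0),
}


def is_number_word(word: str) -> bool:
    """True when the word is registered and its numeric flag is set to 1."""
    entry = WORD_DICTIONARY.get(word)
    return entry is not None and entry.numeric_flag == 1
```

With these assumed contents, `is_number_word("thousand")` is true and `is_number_word("thread")` is false, matching the two cases in the description.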

For words registered in the dictionary 18, such as "one" and "thousand", numbers are recognized by the numeric flag. Unregistered items are also recognized as numbers: digit strings such as "123", decimals such as "10.24" in which a period stands between two digit strings, and strings with commas between digit strings such as "1,000,000". In this specification the term "numeral" normally covers not only Arabic numerals but also spelled-out number expressions such as "thirteen".
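The recognition of unregistered numeric items can be sketched with a regular expression. This is an assumed formulation, not the device's actual mechanism: the pattern accepts a digit string, optional comma-separated digit strings, and an optional period followed by a digit string. It does not check that comma groups have three digits, since the text does not state such a check. The names `NUMBER_PATTERN` and `looks_like_number` are illustrative.

```python
import re

# Digits, optionally joined by commas, optionally followed by "." and more digits.
# re.ASCII restricts \d to the characters 0-9.
NUMBER_PATTERN = re.compile(r"\d+(?:,\d+)*(?:\.\d+)?", re.ASCII)


def looks_like_number(unit: str) -> bool:
    """True when the whole lookup unit has the shape of a numeric item."""
    return NUMBER_PATTERN.fullmatch(unit) is not None
```

Under this sketch "123", "10.24" and "1,000,000" are accepted, while "thread" and "10." are rejected.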

As shown in FIG. 12, the dictionary 18 also holds a currency symbol table 18a in which various currency symbols are registered, a grouping-separator table 18b in which separators such as "," and " " (space) are registered, and a decimal-point table 18c in which decimal marks such as "." and "," are registered. Tables are kept for grouping separators and decimal points because, as is well known, the usage of these symbols differs with the language concerned: Japanese and English use "," as the grouping separator and "." as the decimal point, whereas other European languages such as French and German mainly use a space or "." as the grouping separator and "," as the decimal point.

The dictionary search unit 104 searches the word dictionary 18 on the basis of the search-key string received from the unit extraction unit 102, extracts the dictionary information, and passes it to the processing units 110, 112, 114 and 118.

Runs of numbers are combined in two steps. First, when a unit has been recognized as a number as described above, the next dictionary lookup unit is examined; if it too is recognized as a number, the two are combined into one number. This is repeated as long as numbers continue. For example, "30 thousand" becomes "30000" and "1.5 million" becomes "1500000". Second, when a further number expression follows across an "and", the two are combined into one number if, in terms of their values, every digit on the left of the "and" that corresponds to a digit of the value on the right (the value the pointer indicates) is "0". For example, "one hundred and thirty" becomes "130" and "30 thousand and two hundred" becomes "30200".
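The two combining steps can be sketched as below. The "and" test follows the rule stated above. The rule for adjacent values (multiply by "hundred", close a group at "thousand" or "million", otherwise add) is a common accumulator technique assumed here for illustration; the text does not spell out the device's own arithmetic. The names `combine`, `joinable_across_and`, `SCALES` and `HUNDRED` are hypothetical.

```python
from decimal import Decimal

HUNDRED = Decimal(100)
SCALES = {Decimal(1000), Decimal(1_000_000)}  # values that close a digit group


def joinable_across_and(left: Decimal, right: Decimal) -> bool:
    """The digits of `left` that line up with the digits of `right` must all be 0."""
    width = len(str(int(right)))
    return left % (Decimal(10) ** width) == 0


def combine(units):
    """Combine a run of numeric values, with the string "and" where it occurs.

    Returns a list of combined numbers; a new number starts when the
    "and" test fails.
    """
    results = []
    total = current = Decimal(0)
    after_and = False
    for u in units:
        if u == "and":
            after_and = True
            continue
        value = Decimal(u)
        if after_and and not joinable_across_and(total + current, value):
            results.append(total + current)  # cannot merge: close the number
            total = current = Decimal(0)
        after_and = False
        if value == HUNDRED:
            current = (current or Decimal(1)) * value
        elif value in SCALES:
            total += (current or Decimal(1)) * value
            current = Decimal(0)
        else:
            current += value
    results.append(total + current)
    return results
```

For instance, `combine([30, 1000, "and", 2, 100])` yields a single value equal to 30200, while `combine([25, "and", 30])` keeps 25 and 30 apart because the low digits of 25 are not zero.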

After numbers have been recognized in this way, any further local analysis that is needed is performed. On the basis of local analysis rules, consecutive analysis units triggered by the morpheme activation information of each analysis unit are grouped into a single analysis unit. For example, the currency symbol and number "¥1,000" are grouped into 「1000円」 ("1000 yen"), and the number and unit "1.5 km" into 「1.5キロメートル」 ("1.5 kilometers").

These grouping processes are performed by processing units 110 to 122. Processing unit 110 groups number words with a currency symbol or unit. Processing unit 112 converts number words into numerical values. Processing unit 114 handles number words joined by hyphens. Processing unit 116 handles consecutive digits.

For a number word subjected to grouping with a currency symbol or unit, processing unit 118 groups the currency symbol and the value into a single noun in the currency case, and processing unit 120 groups the value and the unit into a single noun in the unit case. Items that have undergone number-word conversion, hyphenated-number processing or digit-sequence processing are combined with the immediately preceding value by processing unit 122. The dictionary information of an input string that has completed these processes is stored in the searched-dictionary-information buffer, i.e. the dictionary information storage table 124.

The result of morphological analysis is transferred from the dictionary information storage table 124 to the syntactic analysis I unit 20.

Processing based on the numeric flag follows the sequence shown in FIGS. 3A and 3B. The input processing unit 100 receives the input character string data and performs input processing (200). The unit extraction unit 102 then cuts the input string into dictionary lookup units so that the dictionary 18 can be consulted (201). The dictionary search unit 104 searches the dictionary 18 accordingly (203) and, if a dictionary entry exists (204), checks its numeric flag (205). If the flag is not set, the unit is not a number word, so its dictionary information is stored in the dictionary information storage table 124. If the flag is set to "1", processing unit 112 converts the number word into a numerical value (206) and processing unit 122 combines it with the immediately preceding value (process 207). When these steps have been carried out up to the final position of the sentence represented by the input string data (202), processing units 118 and 120 perform the grouping with currency symbols or units (process 209), and the morphological analysis results are output to the syntactic analysis I unit 20 (210).

If the dictionary lookup finds no entry at step 204 and the element contains a hyphen (212), processing unit 114 performs hyphenated-number processing 213. If the element has no hyphen and begins with a currency symbol (214), the currency symbol alone is stored in the dictionary information storage table 124 (218) and is deleted from the dictionary lookup unit (217). If the element does not begin with a currency symbol (214), processing unit 116 performs digit-sequence processing 215. This is carried out up to the final position (202).
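The branching of FIGS. 3A and 3B, as far as the text describes it, can be sketched as a small classifier. The function name `classify`, the returned labels, the dictionary layout and the default currency symbols are all assumptions made for illustration; only the order of the tests is taken from the description.

```python
def classify(unit, dictionary, currency_symbols=("$", "¥")):
    """Decide which processing path a dictionary lookup unit takes."""
    entry = dictionary.get(unit)
    if entry is not None:                  # step 204: an entry exists
        if entry["numeric_flag"] == 1:     # step 205: numeric flag set
            return "number_word"           # steps 206 and 207 follow
        return "other_word"                # stored as an ordinary word
    if "-" in unit:                        # step 212: hyphenated element
        return "hyphenated"                # process 213
    if unit[:1] in currency_symbols:       # step 214: leading currency symbol
        return "currency_prefixed"         # symbol stored, then stripped (217)
    return "digit_sequence"                # process 215
```

A unit classified as `currency_prefixed` would be examined again once the symbol has been removed, which this sketch leaves to the caller.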

Process 209, the grouping with currency symbols and units, is performed by processing unit 110 following the flow shown in FIG. 4. In the initial step 220, the processing pointer is first set to the head of the buffer. If the element the pointer indicates is not a number (221), the pointer is advanced (226). Even if it is a number, the pointer is likewise advanced when the element immediately before it is not a currency symbol and the element immediately after it is not a unit (222, 224). This is carried out up to the final position of the dictionary lookup units (227).

If the number is immediately preceded by a currency symbol (222), the currency symbol and the number are grouped into one noun (223); for example, the currency symbol and number "¥1,000" become one noun. If it is not preceded by a currency symbol but is immediately followed by a unit, the number and the unit are grouped into one noun (225); for example, the number and unit "1.5 km" become one noun. This is carried out up to the final position of the dictionary lookup units (227).
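The flow of FIG. 4 as described above can be sketched as a single pass over a buffer of analyzed elements. The `Element` type, its `kind` labels and the function name `group_currency_and_units` are assumptions for illustration; the step numbers in the comments refer to the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    kind: str  # "number", "currency", "unit", "noun" or "other"
    text: str


def group_currency_and_units(buffer: List[Element]) -> List[Element]:
    out: List[Element] = []
    i = 0                                    # step 220: pointer at the buffer head
    while i < len(buffer):                   # step 227: until the final position
        cur = buffer[i]
        if cur.kind == "number":             # step 221
            if out and out[-1].kind == "currency":                    # step 222
                symbol = out.pop()
                out.append(Element("noun", symbol.text + cur.text))   # step 223
                i += 1
                continue
            if i + 1 < len(buffer) and buffer[i + 1].kind == "unit":  # step 224
                joined = cur.text + " " + buffer[i + 1].text
                out.append(Element("noun", joined))                   # step 225
                i += 2
                continue
        out.append(cur)
        i += 1                               # step 226: advance the pointer
    return out
```

Elements that are neither preceded by a currency symbol nor followed by a unit pass through unchanged.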

Hyphenated-number processing 213 is performed by processing unit 114 following the flow shown in FIGS. 5A and 5B. First, in the initial step 230, the hyphenated dictionary lookup unit is saved in a buffer. The value "0" is also saved, and the hyphens in the original lookup unit are replaced with spaces.

A dictionary lookup unit is then cut out (231) and a dictionary search 235 is performed. If the search finds no entry, that is, if the word is not registered in the dictionary (236), the whole hyphenated lookup unit is stored in the dictionary information storage table 124 as an unregistered word (237).

If the dictionary lookup yields an entry (236), its numeric flag is checked to see whether it is "1". If the numeric flag is not "1", the element is not a numeral, and the whole hyphenated lookup unit is stored in the dictionary information storage table 124 as an unregistered word (237).

If the numeric flag of the dictionary entry is set to "1", processing unit 112 converts the number word into a numerical value on the basis of the entry data (239). This value is then added to the currently saved value (240), and the sum is saved (241). In this way, for example, the "two" of "twenty-two" is added to the "20" of the immediately preceding "twenty", giving "22". This is carried out up to the final position of the dictionary lookup unit (232).

When the final position is reached, step 232 passes control to step 233, where the saved value is taken as the value of the whole hyphenated lookup unit. Process 207 then combines this value with the immediately preceding value.
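Hyphenated-number processing as described above can be sketched as follows. The dictionary contents, the tuple layout `(numeric flag, value)` and the function name `hyphenated_value` are assumptions for illustration; returning `None` stands in for storing the whole unit as an unregistered word.

```python
from typing import Optional

# Hypothetical dictionary contents: word -> (numeric flag, value).
NUMBER_WORDS = {
    "twenty": (1, 20),
    "two": (1, 2),
    "well": (0, None),
    "known": (0, None),
}


def hyphenated_value(unit: str, dictionary=NUMBER_WORDS) -> Optional[int]:
    saved = 0                                    # step 230: saved value starts at 0
    for part in unit.replace("-", " ").split():  # steps 230/231: hyphens to spaces, cut out
        entry = dictionary.get(part)             # step 235: dictionary search
        if entry is None or entry[0] != 1:       # no entry (236) or flag not "1"
            return None                          # whole unit is an unregistered word (237)
        saved += entry[1]                        # steps 239 to 241: convert, add, save
    return saved                                 # step 233: value of the whole unit
```

With these assumed entries, "twenty-two" evaluates to 22, while "well-known" is left as an unregistered word.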

The digit-sequence processing 215 executed by processing unit 116 is described with reference to FIGS. 6A and 6B. In these flowcharts the symbol "<=" denotes assignment. First, initialization 250 sets the saved value val-save to "0", sets the parameter "i" to "1", and sets the pointer p to the head of the character string of the dictionary lookup unit.

Next, the character *p indicated by the pointer p is checked to see whether it is a digit (251), a grouping character (252) or a decimal point (253). If it is none of these, the whole character string is stored in the dictionary information storage table 124 as an unregistered word (255). If it is a decimal point (253), the parameter "i" is multiplied by 10 (254) and step 258 is executed. Step 258 adds the numeric value num(*p) of the character *p to the saved value val-save to give the new saved value. The value num(*p) is the value of the character *p regarded as a number.

If the character is found to be a digit or a grouping character at step 251 or 252, step 257 is executed. Step 257 multiplies the saved value val-save by 10 and adds the numeric value num(*p) of the character *p to give the new saved value.
If it is a scale character, step 257 is executed. In step 257, the saved numerical value maa1-save is multiplied by 10 and the character-like numerical value nu■ (book p) is added thereto to form a new saved numerical value.

After these steps, the pointer is advanced (259), and the process is repeated up to the final position of the dictionary lookup unit (260). At the final position of the character string, the saved value is taken as the numerical value of the entire character string (281), and processing unit 122 executes process 207, which combines it with the immediately preceding numerical value. In this way, for example, the digit sequence "1,000.5" is analyzed as the numerical value 1000.5.
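
A minimal sketch of the consecutive-digit process follows. It is an interpretation rather than a transcription of FIGS. 6A and 6B: it assumes that place-value characters such as "," are skipped, that i stays at 1 while the integer part is read, and that each digit after the decimal point is divided by i before being added, with i then multiplied by 10. The symbol sets and the function name are hypothetical; the names `val_save` and `i` mirror the description.

```python
from decimal import Decimal

DIGITS = "0123456789"
PLACE_VALUE_CHARS = {","}   # assumed contents of the place-value symbol table
DECIMAL_POINTS = {"."}      # assumed contents of the decimal-point table


def parse_digit_run(text):
    """Return the value of a run such as "1,000.5", or None for an unregistered word."""
    val_save = Decimal(0)
    i = 1  # 1 while reading the integer part, then 10, 100, ... for fraction digits
    for ch in text:
        if ch in DIGITS:
            if i == 1:
                val_save = val_save * 10 + int(ch)    # integer digit (cf. step 257)
            else:
                val_save += Decimal(int(ch)) / i      # fraction digit (assumed reading of step 258)
                i *= 10
        elif ch in PLACE_VALUE_CHARS:
            continue                                  # grouping separator carries no value
        elif ch in DECIMAL_POINTS and i == 1:
            i = 10                                    # start of the fractional part (cf. step 254)
        else:
            return None                               # not a number (cf. step 255)
    return val_save


print(parse_digit_run("1,000.5"))
```

`Decimal` is used in this sketch so that fractional values such as 1000.5 stay exact when they are later multiplied by a numeral such as "thousand".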

Process 207, which combines a value with the immediately preceding numerical value, is performed by processing unit 122 as follows. First, the pointer into the dictionary information storage table is set to the position immediately before the current dictionary lookup unit (270). If nothing exists at that position, the numerical value is the first entry of the storage table, and the value of the current dictionary lookup unit is recorded in dictionary information storage table 124 (284). It is recorded at the position following the one indicated by the current pointer p.

If step 271 finds that a word exists immediately before, and the entry indicated by pointer p is not "and" (272) and pointer p does not indicate a numerical value (273), the value of the current dictionary lookup unit is recorded in dictionary information storage table 124 at the position following the one indicated by the current pointer p (284). In the example "To him two ...", "two" is newly recorded as the numerical value 2.

If pointer p indicates a numerical value in step 273, the value p→v of the entry indicated by pointer p is multiplied by the value v-now of the current dictionary lookup unit, and the product becomes the new value p→v of that entry (274). In the example "two thousand", 2 × 1000 = 2000 is computed and "two thousand" as a whole is treated as one unit. The end position of the current dictionary lookup unit is then made the end position of the entry of pointer p, that is, p→end position (282).

If the entry indicated by pointer p is "and" in step 272, pointer p is moved to the preceding dictionary lookup unit (275). If that is not the final position (276) and it is a numerical value (277), the value v-now of the current dictionary lookup unit is rounded up at its most significant digit, and the result is taken as the value v1. If the value v-now of the current dictionary lookup unit is, for example, 8, [illegible], 9[illegible], or 11, the value v1 is 10, 10, 100, or 100, respectively.

It is then checked whether the remainder obtained by dividing the value p→v of the entry indicated by pointer p by v1, that is, mod(p→v, v1), is 0.

If it is not 0, pointer p is incremented (283), and the value of the current dictionary lookup unit is recorded in dictionary information storage table 124 at the position following the one indicated by the current pointer p (284). In the example "I and two", "two" is newly recorded as the numerical value 2.

If the remainder is 0 in step 279, the value v-now of the current dictionary lookup unit is added to the value p→v of the entry indicated by pointer p, and the sum becomes the new value p→v of that entry (280). In the example "two thousand and two", "two thousand" has already been combined into 2000 by this stage. Addition 280 therefore adds it to the 2 of "two" to give 2002, and the whole is treated as one unit. The information for "and" indicated by pointer p+1 is then deleted from information storage table 124 (281), and the process moves to step 282.
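
Steps 270 to 284 can be summarized in a minimal sketch. It models dictionary information storage table 124 as a Python list whose last element plays the role of the entry indicated by pointer p, and it extends the `word` field where the description updates the end position. The rounding helper assumes that v1 is the smallest power of ten strictly greater than v-now. The function names and the entry layout are hypothetical.

```python
def round_up_to_power_of_ten(v_now):
    """Smallest power of ten strictly greater than v_now, e.g. 8 -> 10, 22 -> 100."""
    v1 = 10
    while v1 <= v_now:
        v1 *= 10
    return v1


def combine_with_previous(table, word, v_now):
    """Store the numeric unit `word` (value v_now), merging it with a preceding number."""
    entry = {"word": word, "value": v_now}
    if not table:                                    # 270, 271: nothing precedes
        table.append(entry)                          # 284
        return
    prev = table[-1]
    if prev["word"] == "and":                        # 272
        before = table[-2] if len(table) >= 2 else None          # 275, 276
        if before is not None and before["value"] is not None:   # 277
            v1 = round_up_to_power_of_ten(v_now)
            if before["value"] % v1 == 0:            # 279: low-order digits are free
                before["value"] += v_now             # 280
                before["word"] += " and " + word     # 282: entry now ends at this unit
                table.pop()                          # 281: drop the "and" entry
                return
        table.append(entry)                          # 283, 284
    elif prev["value"] is not None:                  # 273
        prev["value"] *= v_now                       # 274
        prev["word"] += " " + word                   # 282
    else:
        table.append(entry)                          # 284


table = [{"word": "To", "value": None}, {"word": "him", "value": None}]
combine_with_previous(table, "two", 2)
combine_with_previous(table, "thousand", 1000)
table.append({"word": "and", "value": None})
combine_with_previous(table, "twenty-two", 22)
print(table[-1])  # {'word': 'two thousand and twenty-two', 'value': 2022}
```

The remainder test is what keeps a phrase such as "one and two" from being merged: 1 is not a multiple of 10, so the two numerals stay separate entries.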

An example will now be described. As shown in FIG. 9, when dictionary lookup is performed on the input character string "To him two thousand and twenty-two ...", dictionary entry information such as that shown in FIG. 10A is written into dictionary information storage table 124. For "him", for example, the start position is 4, the end position is 6, and the part of speech is pronoun.
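
The start and end positions in this example are consistent with counting characters from 1 and including both ends. A minimal sketch under that assumption follows; it splits on whitespace only, which is a simplification of the unit segmenting part, and the function name and field names are hypothetical.

```python
import re


def segment_with_positions(sentence):
    """Split a sentence on whitespace and attach 1-based inclusive positions."""
    return [
        {"word": m.group(), "start": m.start() + 1, "end": m.end()}
        for m in re.finditer(r"\S+", sentence)
    ]


units = segment_with_positions("To him two thousand and twenty-two")
print(units[1])  # {'word': 'him', 'start': 4, 'end': 6}
```

In the device, each such unit would additionally carry the dictionary information retrieved for it, such as the part of speech and the number flag.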

In the number processing, "two" is handled first: its number flag is 1 (205), and its numerical value is identified as 2. In this character string the unit immediately before "two" is not a numerical value, so the value is stored as is in table 124 (206, 207, 284).

Next, the pointer is incremented and processing moves to "thousand". Its number flag is 1 and its numerical value is 1000 (205, 208). Moreover, because it is immediately preceded by the numerical value 2 (207, 273), the multiplication 2 × 1000 is executed (274) and the result is stored in table 124 (FIG. 10B). For the following "and", the dictionary information is for the time being stored as is in table 124 (FIG. 100).

The pointer is advanced further and "twenty-two" is processed. As it stands, this is a hyphenated word that has no dictionary entry (212), so the hyphenated-numeral process 213 computes 20 + 2 = 22 (237, 239 to 241). It is immediately preceded by "and" (272), which is in turn preceded by the numerical value 2000 (277). The value 22 is therefore rounded up at its most significant digit to 100 (278), and division 279 is executed. Because the remainder is 0, addition 280 of 2000 and 22 is performed. The information for "and" is deleted from storage table 124 (282), and the sum 2022 is stored in table 124 as the numerical value.

Process 207, which combines a value with the immediately preceding numerical value, has thus recognized "two thousand and twenty-two" as 2022.

Another example follows. As shown in FIG. 11, the input character string "You said $1,000.5 thousand was ..." is analyzed. "$1,000.5" is not registered in dictionary 18. The first character is the symbol "$", which is recognized as a currency symbol from its dictionary entry. It is recorded separately in storage table 124 (214, 216; FIG. 13A).

Next, "1,000.5" is converted to the numerical value 1000.5 by the consecutive-digit process 215. It is immediately preceded by the symbol "$", which is not a numerical value, so the value is recorded as is (270 to 273, 284; FIG. 13B).

The next word, "thousand", is a numeral whose value is 1000. Because it is immediately preceded by a numerical value (272, 273), operation 274 is executed to compute 1000.5 × 1000 = 1000500 (FIG. 13C).

After dictionary lookup has been completed in this way, the contents of dictionary information storage table 124 are examined in order. Because the currency symbol "$" immediately precedes the numerical value 1000500, the two are combined and "$1000500" is made a single noun entry (209, 221 to 223; FIG. 13D).
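
The final pass over the storage table can be sketched as follows. This is a minimal illustration that assumes a list of entries with `word` and `value` fields and a one-element currency symbol table; the function name, the field names, and the `pos` label are hypothetical. Units such as those handled by processing unit 120 are not modeled.

```python
def merge_currency_symbols(table, currency_symbols=("$",)):
    """Fold each currency symbol into the numeric entry that directly follows it."""
    merged = []
    for entry in table:
        prev = merged[-1] if merged else None
        if (
            entry["value"] is not None
            and prev is not None
            and prev["word"] in currency_symbols
        ):
            # Replace the symbol entry with one combined noun entry.
            merged[-1] = {
                "word": prev["word"] + str(entry["value"]),
                "value": entry["value"],
                "pos": "noun",
            }
        else:
            merged.append(entry)
    return merged


table_13c = [
    {"word": "You", "value": None},
    {"word": "said", "value": None},
    {"word": "$", "value": None},
    {"word": "1,000.5 thousand", "value": 1000500},
    {"word": "was", "value": None},
]
result = merge_currency_symbols(table_13c)
print(result[2])  # {'word': '$1000500', 'value': 1000500, 'pos': 'noun'}
```

Running this pass only after all lookups are finished means the numeric value is already fully combined (here 1000.5 × 1000) before the symbol is attached.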
Since the currency symbol "S" exists immediately before 0J, both are combined to form a single noun entry r$1000500J (209..221-223.Figure 130).

EFFECTS According to the present invention, during morphological analysis a numeric expression is replaced with its numerical value. When a hyphenated numeral or a sequence of digits occurs, or when a numerical value immediately precedes, the values are combined by addition or multiplication into a single numerical value and a single analysis unit. Currency symbols and units that accompany a number are likewise analyzed together with the numerical value as one analysis unit. As a result, expressions containing numerical values can be morphologically analyzed with the appropriate place value and units.

[BRIEF DESCRIPTION OF THE DRAWINGS] FIG. 1 is a functional block diagram showing a detailed configuration example of the morphological analysis unit of the embodiment shown in FIG. 2. FIG. 2 is a block diagram showing the overall configuration of an embodiment in which the language analysis device according to the present invention is applied to an English-Japanese automatic translation device. FIGS. 3A and 3B are flowcharts showing an example of the morphological analysis processing in the embodiment shown in FIG. 1. FIG. 4 is a flowchart showing an example of the process of combining currency symbols and units in the morphological analysis processing. FIGS. 5A and 5B are flowcharts showing an example of the processing of hyphenated numerals in the morphological analysis processing. FIGS. 6A and 6B are flowcharts showing an example of the processing of consecutive digits in the morphological analysis processing. FIGS. 7A and 7B are flowcharts showing an example of the process of combining a value with the immediately preceding numerical value in the morphological analysis processing. FIG. 8 is an explanatory diagram showing a configuration example of the dictionary file with number flags in the embodiment. FIG. 9 is an explanatory diagram showing an example of an input character string in the embodiment. FIGS. 10A through 100 are explanatory diagrams showing, stage by stage, the contents of the dictionary information storage table obtained by dictionary lookup of the input character string illustrated in FIG. 9. FIG. 11 is an explanatory diagram showing another example of an input character string in the embodiment. FIG. 12 is an explanatory diagram showing examples of the contents of the currency symbol table, the place-value symbol table, and the decimal point table of the dictionary in the embodiment. FIGS. 13A through 13D are explanatory diagrams showing, stage by stage, examples of the dictionary information storage table obtained by dictionary lookup of the input character string illustrated in FIG. 11.

Explanation of reference numerals for main parts: 18, morphological analysis unit; 18, dictionary; 104, dictionary search unit; 110, processing unit for combining with currency symbols and units; 112, processing unit for converting numerals to numerical values; 114, processing unit for hyphenated numerals; 116, processing unit for consecutive digits; 118, processing unit that combines a currency symbol and a numeral into one noun; 120, processing unit that combines a numerical value and a unit into one noun; 122, processing unit for combining with the immediately preceding numerical value; 124, dictionary information storage table.

Claims

[Claims] 1. A language analysis device comprising: dictionary means storing dictionary data for each dictionary lookup unit; and analysis means for dividing an input sentence into dictionary lookup units and performing morphological analysis on each dictionary lookup unit by referring to the dictionary means, wherein the dictionary means includes in the dictionary data, for a dictionary lookup unit representing a number, an identification indication showing that the dictionary lookup unit represents a number, and wherein the analysis means refers to the dictionary means for each dictionary lookup unit contained in the input sentence and, when the retrieved dictionary data includes the identification indication, combines the dictionary lookup unit for which the identification indication was retrieved with another nearby dictionary lookup unit for which the identification indication was retrieved, computes a single numerical value from the meanings of the two dictionary lookup units, and treats the two dictionary lookup units as a single analysis unit. 2. The language analysis device according to claim 1, wherein, when the analysis unit is accompanied by a dictionary lookup unit representing a currency symbol or a unit, the analysis means combines it with the numerical value to form a single analysis unit.