JPH02254565A - Syntax analysis system - Google Patents

Syntax analysis system

Info

Publication number
JPH02254565A
Authority
JP
Japan
Prior art keywords
speech
words
parts
word
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP1077341A
Other languages
Japanese (ja)
Inventor
Norikazu Ito
則和 伊藤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP1077341A
Publication of JPH02254565A

Landscapes

  • Machine Translation (AREA)

Abstract

PURPOSE: To improve syntax analysis efficiency by restricting the parts of speech of each word to those that form the combination with the highest connection probability, thereby resolving the ambiguity of words that have several parts of speech before syntax analysis is performed. CONSTITUTION: A morphological analysis part 7a of a translation main body part (translation part) 7 looks up the input text in a dictionary; a syntax analysis part 7b obtains information on the individual words, performs parsing in accordance with grammatical rules, and generates a tree structure from the analysis results. A converting part 7c transforms the tree structure of the input language into that of the output language, and a generating part 7d translates each node of the resulting tree structure. The parts of speech of the respective words are restricted to those satisfying the combination with the highest connection probability, found by calculating the product of the connection probabilities between parts of speech, so that the ambiguity of words having several parts of speech is resolved before syntax analysis. Syntax analysis efficiency is thereby improved.

Description

[Detailed Description of the Invention]

Technical Field

The present invention relates to a syntax analysis method for natural language processing.

Prior Art

Prior art related to the present invention includes Japanese Patent Application Laid-Open No. 61-40672 and No. [6]1-74069. The invention described in Japanese Patent Application Laid-Open No. 61-40672 relates to a processing method for resolving words with multiple parts of speech: resolution rules for determining the part of speech are prepared, and a single part of speech is determined taking the occurrence rate of each part of speech into account. The invention described in Japanese Patent Application Laid-Open No. 61-40672 relates to an interactive translation method in which a human specifies the part of speech of a word in the input sentence and the translation is carried out on that basis.

Handling words with multiple parts of speech is a very troublesome problem in the syntax analysis of natural language. When one word has several meanings, various analysis results can be derived, but every result other than the correct one is a misanalysis.

Disambiguating words with multiple parts of speech benefits syntax analysis in two ways. One is improved analysis accuracy: disambiguation greatly reduces misanalyses. The other is improved analysis efficiency: because the other parts of speech are excluded, almost no analysis is performed other than the one that leads to the correct answer. For example, suppose that parsing a sentence containing several multi-part-of-speech words as it stands yields eight candidate solutions. Resolving the parts of speech beforehand should greatly reduce the number of candidates, so the number of times the analysis rules are applied and the number of combinations both fall, and the burden of syntax analysis is significantly reduced.

Purpose

The present invention was made in view of the circumstances described above. Its purpose is to provide a syntax analysis method that newly introduces connection probabilities and uses them together with word priorities, so as to improve syntax analysis efficiency, achieve more accurate syntax analysis, and also handle idioms.

Constitution

To achieve the above purpose, the present invention is characterized as follows.

(1) In a natural language analysis system such as machine translation, after the morphological analysis section has looked up the text to be analyzed in a dictionary, the parts of speech of each word are restricted to those satisfying the combination with the highest connection probability, found by calculating the product of the connection probabilities between them, and syntax analysis is performed after the ambiguity that arises when words with several parts of speech are analyzed has been resolved in this way.

(2) In a natural language analysis system such as machine translation, after the morphological analysis section has looked up the text to be analyzed in a dictionary, when the parts of speech of each word are restricted to those satisfying the combination with the highest connection probability by calculating the product of the connection probabilities between them, the priority of each part of speech is also included in the calculation as a coefficient of the product, and syntax analysis is performed after the ambiguity of words with several parts of speech has been resolved.

(3) In a natural language analysis system such as machine translation, after the morphological analysis section has looked up the text to be analyzed in a dictionary, when the parts of speech of each word are restricted to those satisfying the combination with the highest connection probability by calculating the product of the connection probabilities between them, and the words looked up include an idiom, parallel calculation and exclusive calculation are carried out according to the priority of that idiom, and syntax analysis is performed after the ambiguity of words with several parts of speech has been resolved.

The invention is described below on the basis of embodiments.

FIG. 1 is a block diagram for explaining an embodiment of a translation device using the syntax analysis method according to the present invention. In the figure, 1 is a CRT, 2 is a keyboard, 3 is an OCR, 4 is an input document, 5 is a spell check section, 6 is a pre-editing section, 7 is a translation main unit, 8 is a post-editing section, 9 is a dictionary, 10 is the grammar rules, 11 is an output document, and 12 is a printer. An input sentence obtained by file input, keyboard input, or OCR input can be preprocessed using the spell check section 5 and the pre-editing section 6.

The output sentence obtained by the translation main unit 7 can be edited by the post-editing section 8 using the translation information. The input and output sentences can be printed using the printer 12.

FIG. 2 shows the processing flow of the translation main unit 7. The translation main unit (translation section) 7 is broadly divided into four processes: morphological analysis, syntax analysis, conversion, and generation. The morphological analysis section 7a looks up the input text in the dictionary. The syntax analysis section 7b obtains information on the individual words, performs parsing according to the grammar rules, and builds a tree structure from the analysis results. The conversion section 7c transforms the tree structure of the input language into the tree structure of the output language, and the generation section 7d translates the resulting tree structure node by node.

The present invention belongs to the syntax analysis section described above; the input text is assumed here to be English. The morphological analysis section 7a looks up the input text in the dictionary, and the result is passed to the syntax analysis section 7b. The syntax analysis section 7b first performs multi-part-of-speech resolution, using the method introduced in the following paper by S. J. DeRose.

Computational Linguistics, Vol. 14, No. 1, Winter 1988, pp. 31-39, "Grammatical Category Disambiguation by Statistical Optimization", S. J. DeRose (Brown Univ.)

The multi-part-of-speech resolution method called VOLSUNGA, proposed by the author of the above paper (DeRose), is used as the introductory processing of the analysis section. VOLSUNGA has the following features.

■ It is based on a fully mathematical algorithm, and ad hoc additions are kept to a minimum.

■ The optimal part-of-speech sequence is defined as the one for which the product of the part-of-speech connection probabilities and the relative part-of-speech probabilities making up the sequence is largest.

■ An efficient search for the optimal part-of-speech sequence (dynamic programming) overcomes the exponential amount of computation.
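The second feature can be written compactly as follows. This is only a sketch in notation that does not appear in the original: w_k is the k-th word, t_k the part of speech chosen for it, P_c(t_{k-1}, t_k) the connection probability between adjacent parts of speech, and P_r(t_k | w_k) the relative part-of-speech probability.

```latex
\hat{t}_1,\dots,\hat{t}_n
  = \operatorname*{arg\,max}_{t_1,\dots,t_n}
    \prod_{k=1}^{n} P_r(t_k \mid w_k)\,
    \prod_{k=2}^{n} P_c(t_{k-1}, t_k)
```

Under this reading, leaving out the relative part-of-speech probabilities simply drops the first product.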

The parts of speech referred to here are a fine-grained classification, about 100 kinds in all.

The data required are a dictionary holding the part-of-speech classification, the relative part-of-speech probabilities, and idiom priority information (FIG. 4), and a table of part-of-speech connection probabilities (Table 2).

The method finds the optimal combination of parts of speech by dynamic programming, on the basis of Table 1 below. FIG. 3 shows a flowchart of the optimal part-of-speech sequence selection.

Table 1

If the optimal part-of-speech combination from T0,1 to Tn,j passes through Tn-1,i, then the portion up to Tn-1,i is the optimal part-of-speech combination from T0,1 to Tn-1,i.

This is because, if the portion up to Tn-1,i were not optimal, replacing it with the optimal part-of-speech combination up to Tn-1,i would give a better overall combination.

Therefore, the optimal part-of-speech combination from T0,1 to Tn,j is the best among the combinations formed, for each i (= 1, ..., i0), from the optimal part-of-speech combination from T0,1 to Tn-1,i and the step from Tn-1,i to Tn,j.

As an example, the optimal part-of-speech combination is calculated for the sentence The man still saw her.

Assume here that each word has the following parts of speech.

To keep the explanation simple, the relative part-of-speech probabilities of each word are omitted. In practice, the relative part-of-speech probabilities also enter the product as coefficients in addition to the connection probabilities.

The  man  still  saw  her
AT   NN   NN     NN   PPO
     VB   VB     VBD  PP$
          RB

Here, AT (= article), NN (= noun), PPO (= pronoun, objective case), PP$ (= possessive pronoun), RB (= adverb), VB (= verb), and VBD (= verb, past tense).

There are 1*2*3*2*2 = 24 part-of-speech combinations from the leading The to the final her. The optimal combination is calculated using the probabilities in Table 2.
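The search over these 24 combinations can be sketched in Python as follows. The connection probabilities in `CONN` are hypothetical values invented for illustration and are not the values of Table 2; the function names are likewise assumptions. The sketch compares exhaustive enumeration with a dynamic programming pass that keeps, for each candidate part of speech of the current word, only the best sequence ending there.

```python
from itertools import product
from math import prod

# Candidate parts of speech per word, as listed in the example above.
CANDIDATES = [
    ("The", ["AT"]),
    ("man", ["NN", "VB"]),
    ("still", ["NN", "VB", "RB"]),
    ("saw", ["NN", "VBD"]),
    ("her", ["PPO", "PP$"]),
]

# HYPOTHETICAL connection probabilities P(prev-cur); not the values of Table 2.
CONN = {
    ("AT", "NN"): 0.60, ("AT", "VB"): 0.01,
    ("NN", "NN"): 0.05, ("NN", "VB"): 0.05, ("NN", "RB"): 0.08,
    ("VB", "NN"): 0.07, ("VB", "VB"): 0.01, ("VB", "RB"): 0.09,
    ("NN", "VBD"): 0.12, ("VB", "VBD"): 0.01,
    ("RB", "NN"): 0.02, ("RB", "VBD"): 0.15,
    ("NN", "PPO"): 0.01, ("NN", "PP$"): 0.02,
    ("VBD", "PPO"): 0.10, ("VBD", "PP$"): 0.06,
}

def score(tags):
    """Product of the connection probabilities along one tag sequence."""
    return prod(CONN[(a, b)] for a, b in zip(tags, tags[1:]))

# Exhaustive enumeration of every combination.
ALL_SEQUENCES = list(product(*(tags for _, tags in CANDIDATES)))

def brute_force():
    return max(ALL_SEQUENCES, key=score)

def best_sequence(candidates):
    """Dynamic programming: keep the best path ending in each current tag."""
    best = {tag: (1.0, (tag,)) for tag in candidates[0][1]}
    for _, tags in candidates[1:]:
        best = {
            cur: max(
                ((p * CONN[(prev, cur)], path + (cur,))
                 for prev, (p, path) in best.items()),
                key=lambda item: item[0],
            )
            for cur in tags
        }
    return max(best.values(), key=lambda item: item[0])[1]

print(len(ALL_SEQUENCES))          # number of combinations
print(brute_force())               # best sequence by enumeration
print(best_sequence(CANDIDATES))   # best sequence by dynamic programming
```

With these made-up numbers both searches select the same sequence. The dynamic programming version examines each adjacent pair of candidates once per word instead of scoring every complete sequence.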

Table 2 (examples of connection probabilities) [table contents not recoverable from the source]

Next, an example of correction using relative part-of-speech probabilities is shown.

so: QL (qualifier, 932), CS (subordinating conjunction, 479), UH (interjection, 1). The numbers are the relative part-of-speech probabilities.

The parts of speech of the sequence so that are estimated first using connection probabilities alone and then using connection probabilities together with relative part-of-speech probabilities.
The part of speech of the sequence t is estimated using only the conjunction probability and the relative part of speech probability.

Using connection probabilities alone:

P(UH-CS) > P(CS-CS)

Using connection probabilities and relative part-of-speech probabilities:

P(UH-CS)*P(so-UH)*P(that-CS) < P(CS-CS)*P(so-CS)*P(that-CS)

P() denotes a probability: P(UH-CS) is the connection probability of UH-CS, and P(so-CS) is the relative part-of-speech probability of CS for so. The correct answer CS-CS is obtained when connection probabilities and relative part-of-speech probabilities are used together.
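The effect of the correction can be illustrated with a small Python sketch. The figures 932, 479 and 1 for so come from the text; normalising them into probabilities is an assumption, and the connection probabilities and the value for that are hypothetical, chosen only so that P(UH-CS) > P(CS-CS) holds as stated above.

```python
# Figures given in the text for "so"; normalising them is an assumption.
SO_FIGURES = {"QL": 932, "CS": 479, "UH": 1}
TOTAL = sum(SO_FIGURES.values())
REL_SO = {tag: n / TOTAL for tag, n in SO_FIGURES.items()}

# HYPOTHETICAL connection probabilities satisfying P(UH-CS) > P(CS-CS).
CONN = {("UH", "CS"): 0.05, ("CS", "CS"): 0.01}

# HYPOTHETICAL P(that-CS); it is a common factor on both sides.
REL_THAT_CS = 0.5

# The two readings of "so" that the text compares.
CANDIDATES_FOR_SO = ["UH", "CS"]

def connection_only(tag):
    """Score using the connection probability alone."""
    return CONN[(tag, "CS")]

def with_relative(tag):
    """Score using connection and relative part-of-speech probabilities."""
    return CONN[(tag, "CS")] * REL_SO[tag] * REL_THAT_CS

print(max(CANDIDATES_FOR_SO, key=connection_only))  # UH
print(max(CANDIDATES_FOR_SO, key=with_relative))    # CS
```

Because UH is so rare a reading of so, its relative part-of-speech probability outweighs the higher connection probability, and the choice changes to CS.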

Next, an example of processing when the sentence contains an idiom is shown.

Assume that the words of The man came in order to win. have the following parts of speech.

The  man  came  in   order  to   win.
AT   NN   VBD   PR   NN     TO   NN
     VB         RB   VB     PR   VB
                            RB
                <---- TO ---->

TO (to with infinitive), PR (preposition), PPN (personal pronoun, nominative)

The number of part-of-speech combinations from the leading The to the final win is 1*2*1*2*2*3*2 = 48 when every word is tagged individually, and 1*2*1*1*2 = 4 when in order to is taken as a single unit. Strictly speaking, an idiom should not be handled word by word but regarded as forming one expression as a whole. Here, however, for convenience each of its constituent words is kept as a provisional word so that the idiom remains a possible combination. Naturally, this part is not allowed to combine with other words. The connection probability and the relative part-of-speech probability are calculated at the last word of the idiom. With this method, the longer the idiom, the fewer coefficients enter the product, so an additional coefficient has to be multiplied in to compensate.
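The two counts can be checked with a short Python sketch. The dictionary `TAGS` restates the candidate parts of speech listed above; the function names are assumptions made for illustration.

```python
from math import prod

# Candidate parts of speech per word, as listed in the example above.
TAGS = {
    "The": ["AT"], "man": ["NN", "VB"], "came": ["VBD"],
    "in": ["PR", "RB"], "order": ["NN", "VB"], "to": ["TO", "PR", "RB"],
    "win": ["NN", "VB"],
}
SENTENCE = ["The", "man", "came", "in", "order", "to", "win"]
IDIOM = ["in", "order", "to"]  # read as the single part of speech TO

def count_word_by_word(words):
    """Number of combinations when every word is tagged individually."""
    return prod(len(TAGS[w]) for w in words)

def count_with_idiom(words, idiom):
    """Number of combinations when the idiom is one fixed unit."""
    n = len(idiom)
    start = next(i for i in range(len(words)) if words[i:i + n] == idiom)
    rest = words[:start] + words[start + n:]
    return prod(len(TAGS[w]) for w in rest)  # the idiom contributes a factor of 1

print(count_word_by_word(SENTENCE))       # 48
print(count_with_idiom(SENTENCE, IDIOM))  # 4
print(count_word_by_word(IDIOM))          # 12
```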

In this example, in order to is an idiom. Taken word by word, "in", "order" and "to" have 2*2*3 = 12 combinations, but "in order to" as an idiom has only one. Below, it is written *--*--TO; the "--" indicates that the combination is fixed.
There is only one type of "o". Below, it is expressed as this -*-TO. The -ゝ° part indicates that the combination is fixed.

[Diagram: step-by-step selection of the optimal part-of-speech sequences for The man came in order to win., with the idiom path written *--*--TO. The layout of the diagram is not recoverable from the source.]

The  man  came  in  order  to  win.
AT   NN   VBD   *--*--TO       VB

Priority information can also be given to idioms in the dictionary. The explanation so far covers the case where the idiom has no priority information. When an idiom has priority information, the possibilities other than the idiom are discarded. That is, when "in order to" has priority information, "in", "order" and "to" are not looked up individually in the dictionary, and the example sentence then has only the following parts of speech. Priority information is assigned to some idioms and not to others.

The  man  came  in order to  win.
AT   NN   VBD   <-- TO -->   NN
     VB                      VB

Syntax analysis is performed after the multiple parts of speech have been resolved in this way. Because the ambiguity has largely been eliminated, the main task of syntax analysis is to build the tree structure.
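The handling of idiom priority described above can be sketched in Python as follows. The function `readings` and its parameters are hypothetical names, not part of the original. As a simplification, the sketch looks up every word and then discards the word-by-word reading, whereas the text says the individual words are not looked up at all when the idiom has priority.

```python
# Candidate parts of speech per word, as listed in the idiom example.
LEXICON = {
    "The": ["AT"], "man": ["NN", "VB"], "came": ["VBD"],
    "in": ["PR", "RB"], "order": ["NN", "VB"], "to": ["TO", "PR", "RB"],
    "win": ["NN", "VB"],
}
SENTENCE = ["The", "man", "came", "in", "order", "to", "win"]
IDIOM = ["in", "order", "to"]

def readings(words, idiom, idiom_tag, has_priority):
    """Return the alternative readings of `words` as lists of (unit, tags)."""
    n = len(idiom)
    start = next(
        (i for i in range(len(words) - n + 1) if words[i:i + n] == idiom),
        None,
    )
    word_by_word = [(w, LEXICON[w]) for w in words]
    if start is None:
        return [word_by_word]
    # The idiom is one unit with a single, fixed part of speech.
    as_unit = (
        word_by_word[:start]
        + [(" ".join(idiom), [idiom_tag])]
        + word_by_word[start + n:]
    )
    # With priority, only the idiom reading survives; without it, both are kept.
    return [as_unit] if has_priority else [as_unit, word_by_word]

for unit, tags in readings(SENTENCE, IDIOM, "TO", has_priority=True)[0]:
    print(unit, tags)
```

Keeping both readings when there is no priority lets the probability calculation decide between the idiom and the word-by-word interpretation.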

Effect

As is clear from the above description, according to the present invention, claim 1 improves syntax analysis efficiency by newly introducing connection probabilities. Claim 2 enables more accurate syntax analysis by additionally using word priorities. Claim 3 makes it possible to handle idioms.

[Brief Description of the Drawings]

FIG. 1 is a block diagram for explaining an embodiment of a translation device using the syntax analysis method according to the present invention; FIG. 2 shows the processing flow of the translation main unit in FIG. 1; FIG. 3 shows the flow of optimal part-of-speech sequence selection; and FIG. 4 shows an example of the dictionary.

1: CRT, 2: keyboard, 3: OCR, 4: input document, 5: spell check section, 6: pre-editing section, 7: translation main unit, 8: post-editing section, 9: dictionary, 10: grammar rules, 11: output document, 12: printer.

Claims (1)

[Claims]

1. A syntax analysis method characterized in that, in a natural language analysis system such as machine translation, after the morphological analysis section has looked up the text to be analyzed in a dictionary, the parts of speech of each word are restricted to those satisfying the combination with the highest connection probability by calculating the product of the connection probabilities between them, and syntax analysis is performed after the ambiguity that arises when words with several parts of speech are analyzed has been resolved.

2. A syntax analysis method characterized in that, in a natural language analysis system such as machine translation, after the morphological analysis section has looked up the text to be analyzed in a dictionary, when the parts of speech of each word are restricted to those satisfying the combination with the highest connection probability by calculating the product of the connection probabilities between them, the priority of each part of speech is also included in the calculation as a coefficient of the product, and syntax analysis is performed after the ambiguity of words with several parts of speech has been resolved.

3. A syntax analysis method characterized in that, in a natural language analysis system such as machine translation, after the morphological analysis section has looked up the text to be analyzed in a dictionary, when the parts of speech of each word are restricted to those satisfying the combination with the highest connection probability by calculating the product of the connection probabilities between them, and the words looked up include an idiom, parallel calculation and exclusive calculation are carried out according to the priority of that idiom, and syntax analysis is performed after the ambiguity of words with several parts of speech has been resolved.
JP1077341A 1989-03-29 1989-03-29 Syntax analysis system Pending JPH02254565A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP1077341A JPH02254565A (en) 1989-03-29 1989-03-29 Syntax analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP1077341A JPH02254565A (en) 1989-03-29 1989-03-29 Syntax analysis system

Publications (1)

Publication Number Publication Date
JPH02254565A true JPH02254565A (en) 1990-10-15

Family

ID=13631218

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1077341A Pending JPH02254565A (en) 1989-03-29 1989-03-29 Syntax analysis system

Country Status (1)

Country Link
JP (1) JPH02254565A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05189481A (en) * 1991-07-25 1993-07-30 Internatl Business Mach Corp <Ibm> Computor operating method for translation, term- model forming method, model forming method, translation com-putor system, term-model forming computor system and model forming computor system
WO1998039711A1 (en) * 1997-03-04 1998-09-11 Hiroshi Ishikura Language analysis system and method
US7672829B2 (en) 1997-03-04 2010-03-02 Hiroshi Ishikura Pivot translation method and system

Similar Documents

Publication Publication Date Title
KR20090066067A (en) Method and apparatus for providing hybrid automatic translation
De Gispert et al. Catalan-English statistical machine translation without parallel corpus: bridging through Spanish
JPS63223962A (en) Translating device
US20010029443A1 (en) Machine translation system, machine translation method, and storage medium storing program for executing machine translation method
JPH02254565A (en) Syntax analysis system
JP2005284723A (en) Natural language processing system, natural language processing method, and computer program
Laippala et al. Towards automated processing of clinical Finnish: Sublanguage analysis and a rule-based parser
Sánchez-Martínez et al. Using alignment templates to infer shallow-transfer machine translation rules
Kogure Parsing Japanese spoken sentences based on HPSG
Ney et al. The RWTH system for statistical translation of spoken dialogues
JPH04112364A (en) Dictionary consulting system
JP4039205B2 (en) Natural language processing system, natural language processing method, and computer program
JP4033088B2 (en) Natural language processing system, natural language processing method, and computer program
Sobha Resolution of pronominals in tamil
Motavallian et al. An intelligent extension of the training set for the Persian n-gram language model: an enrichment algorithm
Matusov et al. Statistical machine translation of spontaneous speech with scarce resources
JP2004326584A (en) Parallel translation unique expression extraction device and method, and parallel translation unique expression extraction program
Hegde et al. Tagging Speech For Words In Low Resourced Monolingual Contexts of Sanskrit Shlokas
Weller-Di Marco Linguistic Information in Machine Translation
Ejerhed et al. A self-extending lexicon: Description of a word learning program
JPH09160920A (en) Machine translation system
JP2870259B2 (en) Japanese sentence analysis method
Mekuria et al. A hybrid approach to the development of part-of-speech tagger for Kafi-noonoo text
JP4023384B2 (en) Natural language translation method and apparatus and natural language translation program
JPH0348366A (en) Morpheme alanysis, syntax analysis and morpheme forming system