JPS60200359A - Simple sentence producer - Google Patents

Simple sentence producer

Info

Publication number
JPS60200359A
Authority
JP
Japan
Prior art keywords
sentence
phrase
word
analysis
list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP59055509A
Other languages
Japanese (ja)
Inventor
Akishige Masuyama
増山 顕成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP59055509A
Publication of JPS60200359A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Devices For Executing Special Programs (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

PURPOSE: To facilitate natural language processing, such as data collection and analysis, and its debugging, by providing word dividing means, dependency (modification) analyzing means, and related means that decompose a natural language sentence in which complicated factors are intertwined into a simple form. CONSTITUTION: A sentence supplied from an input part 2 is divided into words by a word dividing part 3, which collates it against the words held in a dictionary 6 stored in memory. The resulting word list is sent to a phrase composing part 4, which reads the word list one word at a time from the head of the sentence and produces a phrase list. The phrase list is sent to a dependency analyzing part 5, which produces a parse tree with the help of the dictionary 6 and an instruction part 7. The parse tree is sent to a simple sentence extracting part 8, which produces simple sentences. In this way the parts 3, 4, 5, 7, and 8, together with the dictionary 6, convert a complicated sentence into a simple form, which facilitates natural language processing such as data collection and analysis, and its debugging.

Description

DETAILED DESCRIPTION OF THE INVENTION

(1) Technical Field of the Invention
The present invention relates to natural language processing by a data processing device, and more particularly to a simple sentence generation device that generates simple sentences from an input sentence.

(2) Prior Art and Problems
Natural language sentences generally have a very complex structure, so when they are processed by computer it is extremely difficult to use them directly as data or to collect data from them. However, even a sentence with a complex structure becomes easier to process by computer once it is decomposed into simple sentences (sentences consisting of one subject and one predicate), because the reduced syntactic ambiguity makes it easier to determine the syntactic pattern and the case of each verb. Decomposing a sentence into simple sentences, however, requires that the structure of the input sentence already be determined, and determining that structure requires that the syntactic pattern or the verb cases be known. Since the simple sentences are being sought precisely in order to determine the syntactic pattern or the verb cases, there has been the problem that this cannot be done automatically.

For this reason, natural language processing by computer (data collection) has conventionally relied on either manually rewriting the target sentences into a simple form before input, or selecting only sentences that are already in a simple form for input. The former requires a great deal of labor, and the latter risks missing important information.

(3) Object of the Invention
In view of the above drawbacks of the prior art, an object of the present invention is to provide a system that decomposes natural language sentences in which complicated factors are intertwined into a simple form, so that natural language processing such as data collection and analysis, and its debugging, can be carried out easily.

(4) Structure of the Invention
This object is achieved by a device comprising: phrase composition means that divides input sentence data into words to form a word list and concatenates the word list into phrase (bunsetsu) units to generate a phrase list; dependency analysis means that analyzes the dependency relations among the phrases in the phrase list; instruction accepting means that requests an external judgment on the dependency relation between phrases and accepts the decision; and simple sentence extraction means that builds a parse tree from the outputs of the dependency analysis means and the instruction accepting means and extracts simple sentences from the parse tree. The dependency analysis means determines, by referring to a grammar table in a storage device, whether the middle one of three consecutive phrases in the phrase list depends on the phrase that follows it, and requests an external judgment on the dependency relation when the relation among the three phrases is of a particular kind.

(5) Embodiment of the Invention
FIG. 1 is a block diagram of an example of a device embodying the present invention, in which 1 denotes the simple sentence generation device, 2 an input section, 3 a word division section, 4 a phrase composition section, 5 a dependency analysis section, 6 a storage device, 7 an instruction section, and 8 a simple sentence extraction section.

In FIG. 1, a sentence entered through the input section 2 is matched in the word division section 3 against the words held in a dictionary registered in the storage device 6 to produce a word list, which is passed to the phrase composition section 4.
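The source does not say how the word division section matches the sentence against the dictionary. As a minimal sketch, assuming a greedy longest-match strategy, the step could look like the following. The function name `split_words`, the `max_len` parameter, the part-of-speech labels, and the toy dictionary are all hypothetical and chosen only for illustration.

```python
def split_words(sentence, dictionary, max_len=8):
    """Greedy longest-match segmentation (an assumed strategy, not the patent's).

    Returns a word list of (surface, part_of_speech) pairs.
    """
    words = []
    i = 0
    while i < len(sentence):
        match = None
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        if match is None:
            match = sentence[i]  # unknown character: emit it as a one-character word
        words.append((match, dictionary.get(match, "unknown")))
        i += len(match)
    return words


# Toy dictionary (hypothetical entries and labels).
TOY_DICTIONARY = {
    "この": "adnominal",
    "座標": "noun",
    "領域": "noun",
    "を": "particle",
    "画面": "noun",
    "と": "particle",
    "呼": "verb",
    "ぶ": "verb_ending",
    "。": "punctuation",
}

word_list = split_words("この座標領域を画面と呼ぶ。", TOY_DICTIONARY)
```

A real dictionary would also have to resolve ambiguous matches, which this sketch ignores.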

The phrase composition section 4 reads the word list one word at a time from the beginning of the sentence and composes phrases by the operations listed below. The processing flow is shown in FIG. 2.

(a) Consecutive nouns are joined together.

(b) A particle is attached to the immediately preceding word.

(c) An auxiliary verb is attached to the immediately preceding word.

(d) A semantically empty formal noun (for example "とき", "こと") is attached to the immediately preceding word.

(e) An adnominal is attached to the word to its right.

(f) A verb ending is attached to the immediately preceding word.

(g) Commas, periods, and dots are attached to the immediately preceding word.

(h) Everything inside parentheses is joined together, and the whole parenthesized unit is attached to the immediately preceding word.

(i) A unit (bit, byte, α, H, etc.) is attached to the immediately preceding word.

(j) When words are joined, the grammatical attribute of the last word in the phrase is retained.

For example, in | 学校 | で (continuative modification) | の (noun modification) | 勉強 |, the grammatical attribute of "の (noun modification)" is retained, giving | 学校での (noun modification) |.
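The joining rules above can be sketched in a few lines. This is a simplified illustration, not the flow of FIG. 2: it assumes each word arrives as a hypothetical `(surface, part_of_speech, attribute)` triple, it omits the parenthesis rule, and the part-of-speech labels are invented for the example.

```python
# Parts of speech that attach to the immediately preceding word
# (particles, auxiliaries, formal nouns, verb endings, punctuation, units).
ATTACH_LEFT = {"particle", "auxiliary", "formal_noun", "verb_ending",
               "punctuation", "unit"}


def compose_phrases(words):
    """Merge (surface, part_of_speech, attribute) triples into phrases.

    Each phrase keeps the part of speech and attribute of the last word
    joined into it. The parenthesis rule is not handled in this sketch.
    """
    phrases = []
    prefix = ""  # adnominals waiting for the word to their right
    for surface, pos, attribute in words:
        if pos == "adnominal":
            prefix += surface
            continue
        previous = phrases[-1] if phrases else None
        joins_left = previous is not None and not prefix and (
            pos in ATTACH_LEFT
            or (pos == "noun" and previous["pos"] == "noun")  # consecutive nouns
        )
        if joins_left:
            previous["surface"] += surface
            previous["pos"] = pos
            previous["attribute"] = attribute  # keep the last word's attribute
        else:
            phrases.append({"surface": prefix + surface, "pos": pos,
                            "attribute": attribute})
            prefix = ""
    if prefix:  # a trailing adnominal with nothing to its right
        phrases.append({"surface": prefix, "pos": "adnominal", "attribute": None})
    return phrases


example = [
    ("学校", "noun", "noun"),
    ("で", "particle", "continuative modification"),
    ("の", "particle", "noun modification"),
    ("勉強", "noun", "noun"),
]
phrases = compose_phrases(example)
```

With this input the first phrase is 学校での carrying the attribute of の, and 勉強 starts a new phrase because the preceding phrase no longer ends in a noun.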

Phrase attributes include noun, verb, noun modification, verb modification, verb of continuative modification, and verb of adnominal modification. Of these, noun modification and verb modification can co-occur, as with "と", and the verb at the end of the sentence is given the attribute "sentence end". The attribute of each phrase is determined by its particle or ending: for example, particles such as "は" and "に" indicate verb modification, "の", "における", and the like indicate noun modification, and a verb in the continuative form such as "待ち" indicates continuative modification.
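The attribute assignment described above amounts to a lookup on the last word of the phrase. The sketch below assumes a small table containing only the particles named in the text; the function name `phrase_attributes`, its parameters, and the `verb_form` labels are hypothetical, and attributes are returned as a set so that co-occurring attributes (as with "と") can be represented.

```python
# Only the particles mentioned in the text; a real table would be far larger.
PARTICLE_ATTRIBUTES = {
    "は": {"verb modification"},
    "に": {"verb modification"},
    "の": {"noun modification"},
    "における": {"noun modification"},
    "と": {"noun modification", "verb modification"},  # both can co-occur
}


def phrase_attributes(last_word, pos, verb_form=None, sentence_final=False):
    """Return the set of attributes for a phrase ending in last_word."""
    if pos == "particle":
        # Unknown particles get no attribute in this sketch.
        return set(PARTICLE_ATTRIBUTES.get(last_word, ()))
    if pos == "verb":
        if sentence_final:
            return {"verb", "sentence end"}
        if verb_form == "continuative":
            return {"continuative modification"}
        if verb_form == "adnominal":
            return {"adnominal modification"}
        return {"verb"}
    if pos == "noun":
        return {"noun"}
    return set()


attrs_to = phrase_attributes("と", "particle")
attrs_machi = phrase_attributes("待ち", "verb", verb_form="continuative")
attrs_final = phrase_attributes("呼ぶ", "verb", sentence_final=True)
```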

When the phrase list created by the phrase composition section 4 in this way is sent to the dependency analysis section 5, the dependency analysis section 5 analyzes it from the end of the sentence using "three analysis windows". The "three analysis windows" are three consecutive frames laid over the sentence so that each frame holds one phrase; they are introduced to make the analysis easier to explain, and the windows (frames) are named L, M, and R from the left. A parse tree is built by carrying out the analysis according to the combination of the attributes of the phrases in the windows. The analysis procedure using the "three analysis windows" is shown in Table 1 and FIG. 3. The combinations for the analysis procedure in Table 1 are stored in the storage device 6 as a grammar table. When the user's instruction has to be requested during analysis, the message output and the user's instruction pass through the instruction section 7.
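Table 1 and FIG. 3 are not reproduced in this text, so the actual table entries and control flow are unknown. The sketch below only illustrates the general idea of scanning from the end of the sentence, consulting a grammar table, and deferring to the user for particular combinations. The table contents, the attribute label "noun or verb modification", the rule of retrying with the phrase that R depends on, and the use of L purely as context for the user are all assumptions.

```python
def analyze_dependencies(phrases, grammar_table, ask_user):
    """Return heads, where heads[i] is the index of the phrase that phrase i depends on.

    phrases       -- list of (surface, attribute) pairs
    grammar_table -- maps (M attribute, R attribute) to "yes", "no", or "ask"
    ask_user      -- callback(L, M, R) -> bool, used when the table says "ask"
    """
    heads = [None] * len(phrases)  # the last phrase has no head
    for m in range(len(phrases) - 2, -1, -1):  # work backward from the end
        left = phrases[m - 1] if m > 0 else None  # window L (context only here)
        r = m + 1  # window R starts at the phrase right after M
        while True:
            verdict = grammar_table.get((phrases[m][1], phrases[r][1]), "no")
            if verdict == "ask":
                verdict = "yes" if ask_user(left, phrases[m], phrases[r]) else "no"
            if verdict == "yes" or heads[r] is None:
                # Either M depends on R, or R is the sentence-final phrase
                # and nothing further to the right remains to try.
                heads[m] = r
                break
            r = heads[r]  # otherwise try the phrase that R itself depends on
    return heads


# Toy grammar table (hypothetical entries).
TOY_TABLE = {
    ("verb modification", "continuative modification"): "yes",
    ("verb modification", "sentence end"): "yes",
    ("continuative modification", "sentence end"): "yes",
    ("noun or verb modification", "sentence end"): "ask",
}

toy_phrases = [
    ("図形情報は", "verb modification"),
    ("X-Y座標に対して", "verb modification"),
    ("与えられ、", "continuative modification"),
    ("この座標領域を", "verb modification"),
    ("画面と", "noun or verb modification"),
    ("呼ぶ。", "sentence end"),
]

asked = []  # records every question that would be put to the user


def always_yes(left, middle, right):
    asked.append((middle[0], right[0]))
    return True


heads = analyze_dependencies(toy_phrases, TOY_TABLE, always_yes)
```

In an interactive setting `ask_user` would display the three phrases and read the user's answer; here it is a stub that always answers yes.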

The parse tree created in this way is sent to the simple sentence extraction section 8, which generates simple sentences. The division into simple sentences is carried out by the following steps.

[Table 1]

① Extract each noun phrase (including adnominal modifiers), leaving only its last word.

② Split off the continuative modifications.

③ Convert the continuative form to the terminal form and output the simple sentences.
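As a rough sketch of the second and third steps, assume the parse tree is given as a list of head indices (a hypothetical representation; the source's tree format is shown only in FIG. 5, which is not reproduced here). Each predicate phrase collects the non-predicate phrases that lead to it, and a continuative predicate is rewritten to its terminal form through a lookup table supplied by the caller. The first step (reducing noun phrases to their last word) is not implemented, and all names are illustrative.

```python
PREDICATE_ATTRIBUTES = {"continuative modification", "sentence end"}


def extract_simple_sentences(phrases, heads, terminal_form):
    """Split a dependency-analyzed sentence into simple sentences.

    phrases       -- list of (surface, attribute) pairs
    heads         -- heads[i] is the index of the phrase that phrase i depends on
    terminal_form -- maps a continuative predicate surface to its terminal form
    """
    def is_predicate(i):
        return phrases[i][1] in PREDICATE_ATTRIBUTES

    def owner(i):
        # Follow the head chain until a predicate phrase is reached.
        j = heads[i]
        while j is not None and not is_predicate(j):
            j = heads[j]
        return j

    sentences = []
    for p in range(len(phrases)):
        if not is_predicate(p):
            continue
        arguments = [phrases[i][0] for i in range(len(phrases))
                     if not is_predicate(i) and owner(i) == p]
        predicate = terminal_form.get(phrases[p][0], phrases[p][0])
        sentences.append("".join(arguments) + predicate)
    return sentences


demo_phrases = [
    ("図形情報は", "verb modification"),
    ("X-Y座標に対して", "verb modification"),
    ("与えられ、", "continuative modification"),
    ("この座標領域を", "verb modification"),
    ("画面と", "verb modification"),
    ("呼ぶ。", "sentence end"),
]
demo_heads = [2, 2, 5, 5, 5, None]  # assumed analysis result
demo_terminal = {"与えられ、": "与えられる。"}  # toy conjugation table

simple_sentences = extract_simple_sentences(demo_phrases, demo_heads, demo_terminal)
```

For this input the sketch yields two sentences, one ending in 与えられる。 and one ending in 呼ぶ。; whether this matches the output shown in FIG. 5 cannot be confirmed from the text.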

The process of generating simple sentences in the sections described above is illustrated below with a concrete example.

For example, when the input sentence is "図形情報はX-Y座標に対して与えられ、この座標領域を画面と呼ぶ" ("Graphic information is given with respect to X-Y coordinates, and this coordinate area is called the screen"), the word list becomes "図形|情報|は|X-Y|座標|に対して|与え|られ|、|この|座標|領域|を|画面|と|呼|ぶ|。" and the phrase list becomes "図形情報は|X-Y座標に対して|与えられ、|この座標領域を|画面と|呼ぶ。|" (| marks a word or phrase boundary).

FIG. 4 shows how the phrase list is then analyzed using the "three analysis windows": 9 is the input sentence, 10 is a phrase boundary, and 11 to 13 are the "three analysis windows", with 11 denoting L, 12 denoting M, and 13 denoting R. A parse tree is obtained as the result of this analysis.

FIG. 5 shows the parse trees and the extracted simple sentences: (a) to (c) are parse trees, and (d) shows the simple sentences.

(6) Effects of the Invention
As described in detail above, the simple sentence generation device of the present invention can be realized with a simple configuration. Because it queries the user about points that are difficult to judge mechanically and follows the user's instruction, it needs little dictionary information and loses little processing time, so it is highly effective.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of an example of a device embodying the present invention, FIG. 2 is a flowchart showing the operation of the phrase composition section, FIG. 3 is a flowchart showing the operation of the dependency analysis section, FIG. 4 shows how the phrase list is analyzed using the "three analysis windows", and FIG. 5 shows the parse trees and the extracted simple sentences.
1: simple sentence generation device, 2: input section, 3: word division section, 4: phrase composition section, 5: dependency analysis section, 6: dictionary, 7: instruction section, 8: simple sentence extraction section, 9: input sentence, 10: phrase boundary, 11 to 13: "three analysis windows"

Claims (1)

[Claims]
A simple sentence generation device comprising: phrase composition means that divides input sentence data into words to form a word list and concatenates the word list into phrase units to generate a phrase list; dependency analysis means that analyzes the dependency relations among the phrases in the phrase list; instruction accepting means that requests an external judgment on the dependency relation between phrases and accepts the decision; and simple sentence extraction means that builds a parse tree from the outputs of the dependency analysis means and the instruction accepting means and extracts simple sentences from the parse tree, characterized in that the dependency analysis means determines, by referring to a grammar table in a storage device, whether the middle one of three consecutive phrases in the phrase list depends on the phrase that follows it, and requests an external judgment on the dependency relation when the relation among the three phrases is of a particular kind.
JP59055509A 1984-03-23 1984-03-23 Simple sentence producer Pending JPS60200359A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP59055509A JPS60200359A (en) 1984-03-23 1984-03-23 Simple sentence producer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP59055509A JPS60200359A (en) 1984-03-23 1984-03-23 Simple sentence producer

Publications (1)

Publication Number Publication Date
JPS60200359A true JPS60200359A (en) 1985-10-09

Family

ID=13000643

Family Applications (1)

Application Number Title Priority Date Filing Date
JP59055509A Pending JPS60200359A (en) 1984-03-23 1984-03-23 Simple sentence producer

Country Status (1)

Country Link
JP (1) JPS60200359A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61221875A (en) * 1985-03-08 1986-10-02 Sharp Corp System for converting processing japanese sentence into simple sentence
JPS62263568A (en) * 1986-05-12 1987-11-16 Matsushita Electric Ind Co Ltd Word processor
JPS6386073A (en) * 1986-09-30 1988-04-16 Ricoh Co Ltd Analyzer for qualifying relation of japanese word

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61221875A (en) * 1985-03-08 1986-10-02 Sharp Corp System for converting processing japanese sentence into simple sentence
JPH0512753B2 (en) * 1985-03-08 1993-02-18 Sharp Kk
JPS62263568A (en) * 1986-05-12 1987-11-16 Matsushita Electric Ind Co Ltd Word processor
JPS6386073A (en) * 1986-09-30 1988-04-16 Ricoh Co Ltd Analyzer for qualifying relation of japanese word

Similar Documents

Publication Publication Date Title
JPH02165378A (en) Machine translation system
JPS61255468A (en) Mechanical translating processing device
JP2944346B2 (en) Document summarization device
JPH0682377B2 (en) Emotion information extraction device
JPS60200359A (en) Simple sentence producer
JP2012185567A (en) Display control device, display control method and display control program
JP2866944B2 (en) Machine translation processor
JPH08123976A (en) Animation generating device
JPS63221475A (en) Analyzing method for syntax
JP3446341B2 (en) Natural language processing method and speech synthesizer
JP3363636B2 (en) Accent control device and method related to speech synthesis
JP2005092615A (en) Natural language processing system, natural language processing method, and computer program
JP2719453B2 (en) Machine translation equipment
KR20000026814A (en) Method for separating word clause for successive voice recognition and voice recognition method using the method
JPH05158765A (en) System for acquiring c language differential information
JPH07234872A (en) Morpheme string converting device for language data base
JP3269083B2 (en) Natural language processor
JP2000207395A (en) Device and method for analyzing japanese language and storage medium recording japanese language analyzing program
JP2856736B2 (en) Dictionary reference device and dictionary reference method
JPS60247787A (en) Document converting device
JPH08241319A (en) Machine translation system
JPH02140869A (en) Sentence structure analyzing method
JPS62264367A (en) Japanese word producing device
KR20020048715A (en) Natural Language Analyzing Apparatus and Method for Controlled Korean Grammar
JPS6330968A (en) Language analyzing system