JPS5930169A

JPS5930169A - Document processing system

Info

Publication number: JPS5930169A
Application number: JP57139283A
Authority: JP
Inventors: Yoshinori Goto; 美紀後藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1982-08-11
Filing date: 1982-08-11
Publication date: 1984-02-17

Abstract

PURPOSE: To edit a high-quality book with an index automatically. Merely by marking keyword candidates while the document is being entered, the operator lets the document processing system transform each candidate into a keyword (index headword) and generate the index together with the assigned page numbers. CONSTITUTION: A sentence or phrase to be used as a keyword is specified by the operator during input by adding control codes (start and end of sentence, start and end of reading, coined word, and word to be modified). A word dictionary 18 is used to grammatically analyze the extracted sentence (keyword candidate) delimited by the control codes. The dictionary lists its entries in reverse reading order and holds a connection table for dependent words, derived nouns for inflected words, and a conjugation table. Grammatical analysis divides the extracted sentence into clauses, and the sentence is then transformed into a keyword by one of two methods: conversion to a derived noun, or transformation around a designated word to be modified.

Description

DETAILED DESCRIPTION OF THE INVENTION

Technical Field of the Invention

The present invention relates to a document processing system, and is intended to give the system a function that generates headwords for a table of contents and an index from sentences designated in the document.

Prior Art and Problems

In a document processing system, the document is simply keyed onto a magnetic medium as continuous text together with control codes. The computer then performs typesetting and editing: it breaks pages after a predetermined number of lines, breaks lines after a predetermined number of characters, and extracts designated parts of the body text together with their page numbers to build a table of contents and an index.

In conventional document processing systems, however, the headwords of the table of contents and the index are the phrases of the body text exactly as written. The system cannot apply the changes an editor would make by hand, such as changing inflections or inverting word order. Headwords are normally nominalized, for example 「カードによる入力」 ("input by card"), whereas the phrase in the body text is often a sentence, for example 「カードにより入力する」 ("[one] inputs by card"). Adopting the sentence unchanged gives an unnatural headword. Consequently, one either had to choose phrases in the body text that already suited a headword or, where that was impossible, correct the entries by hand, which was troublesome.

Object of the Invention

The present invention therefore aims to give the document processing system a function that transforms a headword candidate sentence designated in the body text into a phrase suitable as a headword.

Constitution of the Invention

The present invention is a document processing system that paginates the body text of an input document and creates a table of contents, an index, and the like. It is characterized by comprising (a) a word dictionary in which words are arranged in reverse reading order, with a connection table attached to dependent words, derived nouns attached to inflected words, and conjugation tables attached to inflected words and auxiliary verbs, and (b) a processing device. The processing device receives body text in which control codes mark headword candidates, coined words, and words to be treated as the modified word. It converts each headword candidate into derived-noun form; where a modified word is designated, it turns that clause into the modified word and the remaining clauses into an adnominal modifier. It then generates the headword and creates the index together with the assigned page numbers. The invention is described in detail below with reference to an embodiment.

Embodiment of the Invention

Fig. 1 shows the flow of processing in the document processing system. Reference numeral 10 denotes the manuscript, 12 the keyboard of a data entry device or the like, 14 a floppy disk, and 16 a computer, that is, the processing device. The operator uses keyboard 12 to enter the contents of manuscript 10 onto floppy disk 14 as continuous text, adding the required control codes. Processing device 16 reads floppy disk 14 and, following the control codes, assigns pages, breaks lines, builds the table of contents, and so on, to produce camera-ready copy. The index is built from the paginated body text: the terms to be listed and the pages on which they occur are extracted, sorted (for example in Japanese syllabary order), and paginated to produce camera-ready index pages. The present invention adds to this index processing two functions: grammatical analysis of the extracted sentences (the headword candidates) and their transformation into headwords.

The hatched frame in Fig. 1 marks the part of the index processing that belongs to the present invention. Grammatical analysis of the extracted sentence and its transformation into a headword require some changes to the control codes. The invention is therefore described below in the order: input method, analysis of the extracted sentence, transformation of the extracted sentence.

Input method: The operator designates the sentences or phrases to be adopted as headwords while keying the manuscript onto the floppy disk. Four control codes are used for this: start of sentence, end of sentence, start of reading, and end of reading. (The circled symbols printed for these codes in the original are illegible in this text, so the codes are referred to by name below.)

Example of marked-up input:

[sentence start] 端末 [reading start] タンマツ [reading end] からデータを読み込む [reading start] ヨミコム [reading end] [sentence end]

In this example the sentence in the body text is 「端末からデータを読み込む」 ("read data from the terminal"). When keying the body text onto the floppy disk, the operator adds the control codes to this sentence, which has been underlined or otherwise marked in the manuscript. As the example shows, each kanji expression is entered together with its reading. The reading is placed immediately after each word, whether the word is written in kanji only or in kanji mixed with kana. A compound of several words, such as 「端末装置」 ("terminal device"), may be entered as a single word (端末装置 followed by the reading タンマツソウチ) or as separate words (端末 with タンマツ, then 装置 with ソウチ). A word written in kanji mixed with kana may also be annotated kanji by kanji; for example 「カードを読み取る」 ("read a card") may be entered with the reading ヨ after 読 and the reading ト after 取. The processing device treats the kanji that precede a reading-start code (back to the start of the sentence or to the immediately preceding reading-end code) as having the reading that begins at that code.
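The reading-annotation rule above can be illustrated with a small parser. This is only a sketch under stated assumptions: the braces `{` and `}` stand in for the reading-start and reading-end codes (the original symbols are not recoverable from this text), `parse_readings` is a hypothetical name, and "the kanji preceding the code" is interpreted as everything from the first kanji in the span up to the code.

```python
import re

# Assumption: kanji are detected with the basic CJK Unified Ideographs range.
KANJI = re.compile(r"[\u4e00-\u9fff]")

def parse_readings(text, start="{", end="}"):
    """Return (written form, reading) pairs from annotated input.

    `start` and `end` are hypothetical stand-ins for the reading-start
    and reading-end control codes.
    """
    pairs = []
    pos = 0  # position just after the previous reading-end code
    while True:
        s = text.find(start, pos)
        if s == -1:
            break
        e = text.index(end, s)
        span = text[pos:s]          # text since the previous reading-end code
        m = KANJI.search(span)
        # The annotated word runs from the first kanji in the span to the code.
        word = span[m.start():] if m else span
        pairs.append((word, text[s + len(start):e]))
        pos = e + len(end)
    return pairs
```

With this interpretation, both whole-word and kanji-by-kanji annotation yield consistent pairs, for example `parse_readings("カードを読{ヨ}み取{ト}る")` gives the pairs for 読 and 取 separately.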

Because an index mainly lists terms peculiar to the document, it often has to handle coined words (mostly nouns) that are not in the word dictionary. Such coined words are marked explicitly in the sentence by adding control codes at input time. For example, for the coined word 「サブミット」 ("submit") in the sentence 「ジョブをサブミットする」 ("submit a job"), a coined-word-start code and a coined-word-end code are placed around the word: 「ジョブを [coined word start] サブミット [coined word end] する」.

Headwords are nominalized. Take a simple sentence with one substantive (a noun, pronoun, or numeral) and one inflected word (a verb, adjective, or adjectival verb), such as 「カードを読み取る」 ("read a card"), 「画面に表示する」 ("display on the screen"), or 「オペレータに操作させる」 ("have the operator operate"). The inflected word is replaced with its derived noun, if it has one, and the remaining clause becomes a modifier of that derived noun. For these examples the results are 「カードの読取り」 ("reading of a card"), 「画面への表示」 ("display on the screen"), and 「オペレータの操作」 ("operation by the operator"). This is the basic form of headword conversion. When the basic form suffices, no transformation instruction, that is, no additional control code, is needed. However, when the part of the sentence other than the predicate consists of a subject, an object, and so on, and therefore contains several substantives, any of them could serve as the modified word, so the operator designates which one is wanted.

The predicate is then transformed into an adnominal modifier of this modified word. The modified word is designated by placing control codes immediately before and after it. For example, in the sentence 「オペレータはジョブをキャンセルする」 ("the operator cancels the job"), the substantives other than the predicate 「キャンセルする」 are 「オペレータ」 and 「ジョブ」, and either can be the modified word. The operator therefore places the codes around 「オペレータ」 or around 「ジョブ」. The processing device transforms the former into 「ジョブをキャンセルするオペレータ」 ("the operator who cancels the job") and the latter into 「オペレータがキャンセルするジョブ」 ("the job that the operator cancels"). Other ways of transforming a sentence into a headword are conceivable, but the present invention provides for these two: derived-noun conversion and modified-word designation.

Grammatical analysis of the extracted sentence: Documents that carry an index are specialist books, practical books, papers, and the like, which are fairly narrow in style, and the phrases adopted as index entries can be assumed not to be long sentences. Their general characteristics are: (1) they are declarative sentences written in plain-style modern Japanese, and structurally they are simple sentences; (2) the only relations between clauses are subject and predicate, and modifier and modified; (3) dependent words that add special meaning (some adverbial particles, and auxiliary verbs of conjecture, volition, hearsay, and so on) are not used; (4) adnominals and adverbs are not used. The present invention likewise assumes that sentences with these characteristics are selected.

Word dictionary 18 is used to grammatically analyze the extracted sentence delimited by the sentence-start and sentence-end control codes. Word dictionary 18 consists of words selected from a general dictionary of modern Japanese, taking into account the characteristics and restrictions above and the specialist field of the document. Each word is annotated with its part of speech, conjugation type, and so on. It differs from a general Japanese dictionary in that all entries are ordered by their readings read backwards.

For example, 「よみとる」 (読取る, "to read") is treated as 「るとみよ」. It is registered within the group of words whose last character is 「る」, among those whose second character from the end is 「と」, among those whose third from the end is 「み」, at the position for those whose fourth from the end is 「よ」. Entries are ordered by reversed reading because of features of Japanese such as the predicate coming last in the sentence; with this ordering the search becomes more efficient.
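The reverse-reading ordering can be sketched as a sorted list searched by binary search. This is a minimal illustration, not the patent's implementation: the class name `ReverseDictionary`, its methods, and the sample entries are assumptions. Sorting by reversed reading makes every word with a given ending contiguous, so a suffix lookup becomes a prefix range query.

```python
import bisect

def reverse_key(reading):
    """Reverse the reading, e.g. よみとる -> るとみよ."""
    return reading[::-1]

class ReverseDictionary:
    """Hypothetical word dictionary ordered by reversed reading."""

    def __init__(self, entries):
        # entries: iterable of (reading, info) pairs
        self._items = sorted(
            ((reverse_key(r), r, info) for r, info in entries),
            key=lambda item: item[0],
        )
        self._keys = [item[0] for item in self._items]

    def ending_with(self, suffix):
        """Return all (reading, info) entries whose reading ends with `suffix`."""
        prefix = reverse_key(suffix)
        lo = bisect.bisect_left(self._keys, prefix)
        # "\uffff" sorts after any kana, so it closes the prefix range.
        hi = bisect.bisect_left(self._keys, prefix + "\uffff")
        return [(r, info) for _, r, info in self._items[lo:hi]]
```

Scanning a sentence from its last character then narrows the candidate range one character at a time, which matches the character-by-character narrowing the text describes.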

In the word dictionary, a connection table is prepared for dependent words, that is, particles and auxiliary verbs. The connection table for the auxiliary verb 「られる」, for example (the table itself is not reproduced in this text), shows which words 「られる」 may attach to. If the preceding word is a verb, it is the irrealis form (mizenkei) of a sa-row irregular verb (its form 2), an upper one-row (kami-ichidan) verb, a lower one-row (shimo-ichidan) verb, or the ka-row irregular verb. If it is an auxiliary verb, it is the irrealis form of 「せる」 or 「させる」. Because sentences adopted as headwords have the characteristics and restrictions listed above, the number of dependent words involved is small. With this connection table, once the auxiliary verb 「られる」 is detected, the preceding word can be analyzed on the premise that it matches one of the irrealis forms in the table.
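A connection table of this kind can be sketched as a mapping from a dependent word to the set of forms it may follow. The layout below is an assumption for illustration (the original table is not reproduced in this text); the names `CONNECTS_AFTER` and `can_follow`, and the romanized class labels, are hypothetical.

```python
# Hypothetical encoding: (part of speech, conjugation class or word, form).
CONNECTS_AFTER = {
    "られる": {
        ("verb", "sa-hen", "mizen"),         # sa-row irregular, irrealis
        ("verb", "kami-ichidan", "mizen"),   # upper one-row, irrealis
        ("verb", "shimo-ichidan", "mizen"),  # lower one-row, irrealis
        ("verb", "ka-hen", "mizen"),         # ka-row irregular, irrealis
        ("aux", "せる", "mizen"),
        ("aux", "させる", "mizen"),
    },
}

def can_follow(dependent_word, pos, kind, form):
    """True if `dependent_word` may attach to a word of this class and form."""
    return (pos, kind, form) in CONNECTS_AFTER.get(dependent_word, set())
```

During analysis, a candidate for the preceding word would be rejected when `can_follow` returns False, which prunes the search early.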

The kana that follow a kanji stem (okurigana) can be written in more than one way, for example 読み取る and 読取る, or 当たる and 当る. No particular spelling is therefore prescribed: a word can be looked up as long as the overall reading and the kanji in the kana-mixed spelling match.

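Each inflected word in the dictionary also carries its derived noun. For example, 「読取」 is attached to 「読み取る」. For a verb of the form noun + 「する」, the derived noun is simply the noun that remains when the sa-row irregular verb 「する」 is removed, so that noun is attached. If the noun has a negative form, such as 「非分離」 ("non-separation") for 「分離」 ("separation"), that is attached as well.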
``Read'' is added to ``to control.'' If there is a negative form of the noun, such as 1-minute 14+1 J ni kariru 1-non-separation, add 1 to this as well.

Nouns that form an adverbial modifier clause without any dependent word, such as 際, 場合, とき, and 午後 in 「実行の際」 ("at the time of execution"), 「出力する場合」 ("when outputting"), 「入力するとき」 ("when inputting"), and 「午後出発する」 ("depart in the afternoon"), are flagged as such in the dictionary. The flag is useful when a sentence in which an adverbial modifier modifies the predicate is converted into a headword, because the adverbial modifier must then become an adnominal modifier. For example, 「午後出発する」 becomes the headword 「午後の出発」 ("afternoon departure"): the sa-row irregular verb 「出発する」 becomes 「出発」, and the adverbial modifier 「午後」, which modifies the predicate, becomes the adnominal modifier 「午後の」. These nouns may also appear with the particle 「に」, as in 「午後に」. The adnominal form is 「午後の」 in that case too, but because 「に」 expressing direction becomes 「〜への」, the flag provides the criterion for telling the two cases apart. Converting an adverbial modifier into an adnominal modifier means, in plain terms, adding 「の」; for the examples above the results are 「実行の際の」, 「出力する場合の」, and 「入力するときの」.

The inflected words in an extracted sentence appear in various conjugated forms, whereas the dictionary form (shūshikei) is needed for the headword processing. This conversion is done with the conjugation table, which lists the conjugated forms of inflected words and auxiliary verbs (the table itself is not reproduced in this text).

In the conjugation table, the forms of inflected words are held per conjugation type (five-row conjugation, the same with euphonic changes, upper one-row conjugation, and so on), while auxiliary verbs, being irregular, are held word by word. This table is sufficient for processing headwords.

The extracted sentence, that is, the sentence to be analyzed, is handled character by character from the end of the sentence, so it is stored in an area numbered in character units as shown in Fig. 2. Extracted sentences are of two kinds: complete sentences, which end in the dictionary form of an inflected word or auxiliary verb, and incomplete sentences, which end in a conjunctive particle or in the suspended continuative form of an inflected word or auxiliary verb. (Headword extraction is limited to these two kinds.) A complete sentence is preferable for grammatical analysis, so the conversion is made in sentence-end processing. The conjunctive particles can be assumed to be limited to ば, ても (でも), のに, ので, て (で), ながら, たり (だり), and ものの, and they are processed as follows. Here, character (1) means the sentence-final character shown in Fig. 2.

(1) When character (1) is "ば", "の", "も", or "ら", it cannot be the continuative form of an inflected word or auxiliary verb. It is therefore taken to be the final character of the conjunctive particle "ば", "ものの", "ても (でも)", or "ながら", respectively.

(2) When character (1) is "に", "て (で)", or "り", it may be the conjugational ending of an inflected word. The dictionary-form ending is therefore obtained from the conjugation table and the word dictionary is searched (the search method is described later). If no entry is found, the character is taken to belong to the conjunctive particle "のに", "て (で)", or "たり (だり)", respectively.

(3) In the case of a conjunctive particle, the connection pattern is looked up in the connection table of the word dictionary, the dictionary-form ending is obtained from the conjugation table, and the word to which the particle attaches is determined with the word dictionary.

(4) In the case of an inflected word or auxiliary verb, the dictionary-form ending is likewise obtained from the conjugation table and the word is determined with the word dictionary.
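Rules (1) and (2) above amount to a decision on the sentence-final character. The following sketch shows that decision only. The callback `is_inflected_ending` is a hypothetical stand-in for the conjugation-table and dictionary lookup, and the function and table names are assumptions.

```python
# Rule (1): these final characters can only belong to a conjunctive particle.
ALWAYS_PARTICLE = {"ば": "ば", "の": "ものの", "も": "ても(でも)", "ら": "ながら"}

# Rule (2): these may instead be a conjugational ending; check the dictionary first.
MAYBE_PARTICLE = {"に": "のに", "て": "て(で)", "で": "て(で)", "り": "たり(だり)"}

def classify_sentence_end(sentence, is_inflected_ending):
    """Classify the last character of an extracted sentence.

    Returns ("conjunctive_particle", particle) or ("inflected_word", None).
    `is_inflected_ending(sentence)` is a hypothetical callback that restores
    the dictionary form and reports whether the word dictionary has the entry.
    """
    last = sentence[-1]
    if last in ALWAYS_PARTICLE:
        return ("conjunctive_particle", ALWAYS_PARTICLE[last])
    if last in MAYBE_PARTICLE:
        if is_inflected_ending(sentence):
            return ("inflected_word", None)
        return ("conjunctive_particle", MAYBE_PARTICLE[last])
    # Any other ending is treated as an inflected word or auxiliary verb (rule 4).
    return ("inflected_word", None)
```

Steps (3) and (4), which restore the dictionary form and identify the attached word, would follow from the returned category.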

In this way every extracted sentence is made a complete sentence, and grammatical analysis is performed on it. (1) First, character (1) is assumed to be a word and the dictionary is consulted. Because the dictionary is in reverse reading order, the matching entries are found at once. From among them, those whose next character equals character (2) are selected, then those whose next character equals character (3), and so on. Repeating this yields several candidates that can be regarded as words. These become the roots of a tree structure that represents the connections between words.

Fig. 3 shows this. Words W11, W12, ... are the words that connect to word W1, and words W111, W112, ..., W121, W122, ... are the words that connect to W11, W12, .... The position immediately before the character 「を」, the boundaries between katakana, hiragana, and alphanumeric characters, commas, and the start and end points of coined words are all word boundaries, so they serve as clues when cutting out words. (2) If the dictionary lookup shows that the word is a dependent word such as a particle or auxiliary verb, the connection table described above is consulted, the conjugation table is looked up by the dictionary-form class and the conjugated form of the word it attaches to, and the ending is restored to the dictionary form.

The result of this step is not always unique. (3) Next, the dictionary is consulted as in (1), starting from the string that remains after the word was taken out in (1) or from the string whose ending was restored to the dictionary form in (2). This determines the next word, and the same processing is repeated. (4) While (3) is being performed, processing may reach a dead end, for example by hitting a word that is not in the dictionary even though it is not a coined word. In that case the branch is regarded as dead, its processing is stopped, and processing moves to the next branch. (5) If several live branches remain after all branches have been processed, human intervention of some kind is required. In simple cases the word sequence is determined uniquely. Because the extracted sentence is written in kanji mixed with kana, there is a high probability that the word sequence is determined without confusion from homophones. Once the words are determined, their parts of speech are known from the dictionary, so the part of speech of each word is extracted during lookup and attached to the word as shown in Fig. 4 (for inflected words, the conjugated form is included). P1, P2, ... denote the parts of speech of word 1, word 2, .... (6) In the sequence of parts of speech, an independent word followed by a run of dependent words is a clause, so clauses are recognized on this basis. Each clause is recorded by associating it with word numbers as shown in Fig. 5. In this way the extracted sentence is divided into clauses.
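The backward word search with branch pruning, and the grouping of words into clauses, can be sketched as follows. This is a simplified illustration under stated assumptions: the lexicon is a plain set of surface strings, conjugation restoring (step 2) is omitted, and the names `segment_from_end` and `group_clauses` are hypothetical.

```python
def segment_from_end(text, lexicon, coined=frozenset()):
    """Return every split of `text` into known words, scanning from the end.

    Each recursive call extends one branch of the word tree. A branch that
    finds no dictionary word for the remaining text simply yields nothing,
    which corresponds to a dead branch.
    """
    results = []

    def walk(end, words):
        if end == 0:
            results.append(list(reversed(words)))
            return
        for start in range(end - 1, -1, -1):
            piece = text[start:end]
            if piece in lexicon or piece in coined:
                walk(start, words + [piece])

    walk(len(text), [])
    return results

def group_clauses(tagged_words):
    """Group (word, is_dependent) pairs into clauses.

    A clause is an independent word followed by any run of dependent words.
    """
    clauses = []
    for word, is_dependent in tagged_words:
        if is_dependent and clauses:
            clauses[-1].append(word)
        else:
            clauses.append([word])
    return clauses
```

If `segment_from_end` returns more than one result, several branches are still alive, which is the case where the text says human intervention is needed.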

Transformation of the extracted sentence: Once the extracted sentence has been divided into clauses as described above, the function of each clause is recognized. As noted earlier, headword candidates are a fairly restricted kind of sentence, so clauses have only a few functions: subject and predicate, adverbial or adnominal modifier, and modified word. Because the extracted sentence has been put into complete form, the sentence-final clause can be taken as the predicate. In the following two cases, however, the word carrying the predicate meaning is in the immediately preceding clause, so the two clauses are combined into a compound clause: (1) the sentence-final clause is classified as an auxiliary inflected word, like 「いる」 in 「カードを読み取っている」 ("is reading a card"); (2) a clause consisting only of a noun precedes a clause containing the sa-row irregular verb 「する」, like 「オープン」 in 「ファイルをオープンする」 ("open a file"). In the former, 「読み取っている」 is the compound-clause predicate; in the latter, 「オープンする」. As for the subject, the extracted sentence is a simple sentence and thus has a single subject, so the clause containing the case particle 「が」 or the adverbial particle 「は」 is the subject.

Adverbial modifiers generally take one of the following forms, so clauses of these forms are treated as adverbial modifiers: (1) noun + case particle ("に", "を", "へ", "と", "から", "より", "で", and so on); (2) verb (continuative form) + "に"; (3) inflected word or auxiliary verb (continuative form) + a conjunctive particle that forms an adverbial modifier ("て (で)", "ながら", "たり", and so on); (4) the adverbial use of the continuative form of an adjective (adjective class 2 in the dictionary-form classification); (5) the adverbial use of the continuative form of an adjectival verb (adjectival-verb class 3 in the dictionary-form classification); (6) a noun that forms an adverbial modifier clause without a dependent word ("際", "場合", "とき", "おり", "朝", "午後", and so on). Given the constraint stated earlier, that the only relations between clauses are subject and predicate or modifier and modified, every clause other than the subject, the predicate, and the adverbial modifiers above is an adnominal modifier.

In transformation by derived-noun conversion the predicate is nominalized, so the adverbial modifiers that attach to the predicate must become adnominal modifiers. The modification relations therefore have to be recognized. Recognition by meaning is difficult for the processing device, so it is done by position alone: an adverbial modifier is assumed to attach to the nearest inflected word that follows it. For example, in 「ジョブを実行する際に準備する」 ("prepare when executing a job"), 「ジョブを」 modifies 「実行する」 and 「際に」 modifies 「準備する」. Likewise, in 「実行に先立って用意する」 ("prepare prior to execution"), 「実行に」 modifies 「先立って」 and 「先立って」 modifies 「用意する」; in 「午後処理するジョブを用意する」 ("prepare a job to be processed in the afternoon"), 「午後」 modifies 「処理する」.

Once the clauses and their modification relations are known, the sentence is transformed into a headword. There are two kinds of transformation: by derived-noun conversion and by modified-word designation.
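The purely positional rule for modification relations (an adverbial modifier attaches to the nearest following inflected word) can be sketched as below. The `Clause` type, its flags, and the function name are assumptions for illustration; in the system described, the flags would come from the grammatical analysis.

```python
from dataclasses import dataclass

@dataclass
class Clause:
    text: str
    adverbial: bool = False   # clause is an adverbial modifier
    has_yogen: bool = False   # clause contains an inflected word

def attach_modifiers(clauses):
    """Return (modifier, head) text pairs using position only."""
    links = []
    for i, clause in enumerate(clauses):
        if not clause.adverbial:
            continue
        # Attach to the first later clause that contains an inflected word.
        for later in clauses[i + 1:]:
            if later.has_yogen:
                links.append((clause.text, later.text))
                break
    return links
```

A clause can be both a modifier and a head, as 「先立って」 is in 「実行に先立って用意する」, which is why the two flags are independent.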

Transformation by derived-noun conversion proceeds as follows. (1) The derived noun of the inflected word in the predicate clause is obtained from the dictionary. As stated earlier, the dictionary lists the derived noun of each inflected word. If the clause contains a negating auxiliary verb such as "ない", the negative form of the derived noun is used.

(2) The particle "が" or "は" in the subject clause is replaced with the case particle "の". (3) The adverbial modifiers that attach to the predicate are changed into adnominal modifiers. This is rather troublesome and cannot be done uniformly, so it is handled case by case as follows.

a. The particle "を" is changed to the particle "の". For the particles "へ", "と", "から (より)", "で", "て", and so on, "の" is appended after the particle.

Examples:
データを入力する → データの入力 (enter data → entry of data)
道路を走行する → 道路の走行 (travel along a road → travel along a road, nominalized)
データエラーからアベンドする → データエラーからのアベンド (abend from a data error → abend from a data error, nominalized)
データエラーでアベンドする → データエラーでのアベンド (abend due to a data error → abend due to a data error, nominalized)
データを表示しながらチェックする → データを表示しながらのチェック (check while displaying data → check performed while displaying data)

b. The particle "に" is converted as follows.

・A "に" attached to a noun that forms an adverbial modifier clause without a dependent word is replaced with "の". For example, 「ジョブを実行する場合に変更する」 ("change when executing a job") becomes 「ジョブを実行する場合の変更」 ("change when executing a job", nominalized).

・When the predicate contains a passive or causative auxiliary verb (れる, られる, せる, させる), the "に" of an adverbial modifier that attaches to it is replaced with "の". For example, 「オペレータに操作させる」 ("have the operator operate") becomes 「オペレータの操作」 ("operation by the operator"), and 「ユーザに指定される」 ("be specified by the user") becomes 「ユーザの指定」 ("specification by the user").

・Any other "に" usually indicates a direction or place, so it is replaced with "への". For example, 「運動場に集合する」 ("gather at the playground") becomes 「運動場への集合」, 「画面に表示する」 ("display on the screen") becomes 「画面への表示」, and 「オペレータに依頼する」 ("request the operator") becomes 「オペレータへの依頼」.

c. The continuative form of an adjective or adjectival verb is converted to the adnominal form using the inflection table.

d. "の" is appended to a noun (a noun of the kind mentioned above that forms a continuative modifier clause without an annexed word). For example, 「午後入力処理を開始する」 (start input processing in the afternoon) becomes 「午後の入力処理の開始」 (start of the afternoon input processing).
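Rules a, b, and d above amount to a small decision table over the particle that closes each modifier clause. The following is a minimal sketch of that table, not the patent's implementation. It assumes the sentence has already been split into (stem, particle) clauses and that the predicate has already been turned into its derivative noun. The names `adnominal_particle`, `build_headword`, and `ADVERBIAL_NOUNS` are hypothetical, the tiny noun list stands in for the word dictionary, and "ながら" is included only because the example under rule a shows it. Rule c (adjective inflection) is left out because it needs the inflection table.

```python
# Hypothetical sketch of rules a, b and d; all names here are illustrative.

# Passive / causative auxiliaries listed in rule b.
PASSIVE_CAUSATIVE_AUX = ("れる", "られる", "せる", "させる")

# Particles that keep their form and get "の" appended (rule a).
# "ながら" is an assumption based on the worked example, not on the rule text.
APPEND_NO = ("へ", "と", "から", "より", "で", "て", "ながら")

# Stand-in for dictionary information: nouns that can form a continuative
# modifier clause without an annexed word (assumed sample entries).
ADVERBIAL_NOUNS = ("場合", "午後")


def adnominal_particle(stem, particle, predicate_aux=()):
    """Return the particle string that makes the clause adnominal."""
    if particle == "":
        return "の"                      # rule d: bare noun gets "の"
    if particle == "を":
        return "の"                      # rule a: "を" becomes "の"
    if particle in APPEND_NO:
        return particle + "の"           # rule a: append "の"
    if particle == "に":                 # rule b: three cases
        if stem.endswith(ADVERBIAL_NOUNS):
            return "の"
        if any(aux in PASSIVE_CAUSATIVE_AUX for aux in predicate_aux):
            return "の"
        return "への"                    # direction or place
    raise ValueError(f"no rule for particle {particle!r}")


def build_headword(modifiers, derivative_noun, predicate_aux=()):
    """Join the converted modifier clauses and the derivative noun."""
    converted = [
        stem + adnominal_particle(stem, particle, predicate_aux)
        for stem, particle in modifiers
    ]
    return "".join(converted) + derivative_noun


print(build_headword([("午後", ""), ("入力処理", "を")], "開始"))
# 午後の入力処理の開始
print(build_headword([("オペレータ", "に")], "操作", predicate_aux=("させる",)))
# オペレータの操作
print(build_headword([("画面", "に")], "表示"))
# 画面への表示
```

The order of the checks for "に" follows the order of the three bullets in rule b, so the catch-all "への" applies only when neither of the first two conditions holds.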

Transformation by specification of the word to be modified is comparatively simple: the specified word to be modified is moved to the end, and the remainder becomes its modifier. Concretely, the conjugational ending of the predicate is converted to the adnominal form using the inflection table, and the specified word to be modified is placed after it. The annexed word that was attached to the word to be modified is discarded. As an exception, an auxiliary verb that has no adnominal form for attaching to a noun (the assertive auxiliary "だ") is specially changed (to "である"). In addition, if the particle marking the subject is "は", it is changed to "が".

Example: ユーザはコンソールからオペランドを入力する。 (The user inputs an operand from the console.)

Word to be modified → Transformed phrase
ユーザ (user) → コンソールからオペランドを入力するユーザ (the user who inputs an operand from the console)
コンソール (console) → ユーザがオペランドを入力するコンソール (the console from which the user inputs an operand)
オペランド (operand) → ユーザがコンソールから入力するオペランド (the operand that the user inputs from the console)

In this way, a specified phrase in the document can be extracted, converted into a form suitable for a headword, and used to create the index.
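The transformation by specification of the word to be modified can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the sentence is assumed to be already divided into (content, particle) clauses, the predicate is assumed to be already in its adnominal form (so the inflection table lookup and the "だ" to "である" exception are outside the sketch), and the function name `transform_by_modified_word` is hypothetical.

```python
# Hypothetical sketch of the transformation by a specified word to be modified.

def transform_by_modified_word(clauses, predicate_adnominal, modified_word):
    """Move the specified word to the end and turn the rest into its modifier.

    clauses: list of (content, particle) pairs in sentence order.
    predicate_adnominal: the predicate, already in adnominal form.
    modified_word: the content word specified as the word to be modified.
    """
    parts = []
    found = False
    for content, particle in clauses:
        if content == modified_word:
            # The annexed word attached to the modified word is discarded.
            found = True
            continue
        if particle == "は":
            # A subject marked with "は" is changed to "が".
            particle = "が"
        parts.append(content + particle)
    if not found:
        raise ValueError(f"{modified_word!r} is not a clause of the sentence")
    return "".join(parts) + predicate_adnominal + modified_word


# The example sentence: ユーザはコンソールからオペランドを入力する。
sentence = [("ユーザ", "は"), ("コンソール", "から"), ("オペランド", "を")]
for word in ("ユーザ", "コンソール", "オペランド"):
    print(word, "→", transform_by_modified_word(sentence, "入力する", word))
```

For the example sentence this reproduces the three transformed phrases listed above. For a plain verb such as 入力する the adnominal form is identical to the sentence-final form, which is why the sketch can pass the predicate through unchanged.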

Effects of the Invention
As explained above, according to the present invention, it is sufficient to specify headword candidates at the stage where the document is input to the document processing system. The document processing system then transforms each headword candidate into a headword and creates the index together with the page number assigned during pagination. Operator intervention is therefore reduced, and the document processing system can automatically edit a high-grade book with an index.

[Brief Description of the Drawings]

FIG. 1 is a flowchart showing the flow of processing in a document processing system to which the present invention is applied, and FIGS. 2 to 5 are explanatory diagrams of the processing procedure. In the drawings, 10 is a document manuscript, 12 is a keyboard with a display, 14 is a floppy disk, 16 is a processing device, and 18 is a word dictionary.

Claims

[Claims] 1. A document processing system that paginates the body text of an input document, creates a table of contents, an index, and the like, and performs editing, the document processing system comprising: a word dictionary in which words are arranged in reverse reading order and which holds a concatenation table for annexed words, derivative nouns for declinable words, and an inflection table for declinable words and auxiliary verbs; and a processing device that, for text in the document designated by control codes as a headword candidate, as a coined word, or as a word to be modified, converts part of a headword candidate into a derivative noun, or, where a word to be modified is specified, takes the clause concerned as the word to be modified and transforms the remaining clauses into adnominal modifiers, thereby generating a headword, and creates an index together with the extracted page.