JPS58225485A

JPS58225485A - Automatic generation method of book index

Info

Publication number: JPS58225485A
Application number: JP57109498A
Authority: JP
Inventors: Yoshinori Goto; 美紀後藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1982-06-25
Filing date: 1982-06-25
Publication date: 1983-12-27
Also published as: JPS6362767B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

(a) Technical Field of the Invention
The present invention relates to a method for automatically generating the index provided at the end of a book, in a document processing system that performs the layout of a book automatically by computer.

(b) Background of the Technology
FIG. 1 is a block diagram of a document processing system. First, a keypunch operator enters the manuscript 1 of the book's main text using an input device 2, and the text is stored on a magnetic medium 3.

That information is input to the computer system, and layout information such as the number of printed lines per page is specified and input. The editing unit 4 then lays the text out automatically in the computer's memory, and outputting the result on a kanji line printer or similar device yields the camera-ready copy 5 of the main text. The index processing unit 6 automatically extracts the terms that will serve as index headings from the main-text storage area 8, arranges them in, for example, gojūon (Japanese syllabary) order, and prints out the index pages 7.

(c) Prior Art and Its Problems
To generate an index automatically in this way, the conventional approach is as follows. When a term such as "実対称行列の直交変換" (orthogonal transformation of a real symmetric matrix) is used in the main text and is to appear in the index, the term is marked in the text with control codes and its reading is also indicated. For example, it is entered in the main text as:

(IX)実対称行列の直交変換(K)ジツタイショウギョウレツノチョッコウヘンカン(IE)

Here (IX), (K), and (IE) are control codes with the following meanings:

(IX): start of an index entry
(K): delimiter preceding the reading
(IE): end of an index entry

The index processing unit 6 therefore reads the main-text storage area, checks it from beginning to end, extracts the terms that carry such control codes, sorts them in gojūon order, appends their page numbers, and creates the index pages.
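
The conventional flow just described can be sketched as follows. This is only an illustration under stated assumptions: the control codes are written as the ASCII strings "(IX)", "(K)", and "(IE)", the layout result is modeled as a list of page texts numbered from 1, and the names `ENTRY` and `build_index` are hypothetical. Sorting by the katakana reading's code points only approximates gojūon order.

```python
import re

# Hypothetical pattern for one control-coded entry: (IX)term(K)reading(IE)
ENTRY = re.compile(r"\(IX\)(.+?)\(K\)(.+?)\(IE\)")

def build_index(pages):
    """Collect (term, reading, page numbers) from laid-out page texts."""
    entries = {}
    for page_no, text in enumerate(pages, start=1):  # pages numbered from 1 (assumption)
        for term, reading in ENTRY.findall(text):
            entries.setdefault((reading, term), []).append(page_no)
    # Sort on the reading; code-point order is a rough stand-in for gojuon order.
    return [(term, reading, nums) for (reading, term), nums in sorted(entries.items())]

pages = [
    "...(IX)実対称行列の直交変換(K)ジツタイショウギョウレツノチョッコウヘンカン(IE)...",
    "...(IX)行列(K)ギョウレツ(IE)...",
]
index = build_index(pages)
```

Each term appears exactly once, under its own reading, which is the limitation the next paragraphs discuss.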

However, if terms in the main text are simply used as index headwords as they stand, such a term can be looked up from only one place, under "ジ" (ji). The set of headwords is limited, which is inconvenient for readers searching the index. An index exists to make searching easier for readers, so it is desirable for a single term to be classified from several viewpoints. Particularly for academic books, a rich index structure in which a term is decomposed and each resulting term can also be looked up lets readers find the target term easily and reliably.

For example, the term "実対称行列の直交変換" above is of course still used as an index headword as before. In addition, as in FIG. 2, a headword "変換" (transformation) is provided, and lower-level headwords such as "直交" (orthogonal), "フーリエ−" (Fourier −), and "ユニタリー" (unitary) are attached to it. Here "変換" is inserted at the "−" mark (called the "nombre" in the text as filed).

Attached to the headword "直交" there are further lower-level terms such as "対称行列の−" (− of a symmetric matrix) and "対称作用素の−" (− of a symmetric operator).

Here "直交変換" (orthogonal transformation) is inserted at the "−" mark.

In this way the terms in the main text are decomposed, and the decomposed terms are combined with the "−" mark or with further related terms. The result is divided into levels: "変換" is at the first level, "直交", "フーリエ", and "ユニタリー" are at the second level, and "対称行列の−" and "対称作用素の−" are at the third level. This gives a fine-grained, easy-to-use index structure.
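
The role of the "−" mark in the levelled structure can be illustrated with a small sketch. It assumes the index is held as a nested dictionary and that every second-level headword carries a trailing "−"; both the data layout and the names `expand` and `full_terms` are hypothetical and are not taken from the patent.

```python
def expand(entry, parent):
    """Replace the "−" mark in a sub-headword with the parent's full term."""
    return entry.replace("−", parent)

# Level 1 -> level 2 -> level 3, following the example of FIG. 2 (layout assumed).
index = {
    "変換": {
        "直交−": ["対称行列の−", "対称作用素の−"],
        "フーリエ−": [],
        "ユニタリー−": [],
    }
}

def full_terms(tree):
    """List every headword with its "−" mark expanded to the full term."""
    out = []
    for head, subs in tree.items():
        out.append(head)
        for sub, subsubs in subs.items():
            level2 = expand(sub, head)            # "直交−" becomes "直交変換"
            out.append(level2)
            for s in subsubs:
                out.append(expand(s, level2))     # "対称行列の−" becomes "対称行列の直交変換"
    return out

terms = full_terms(index)
```

At the second level the mark stands for "変換", and at the third level it stands for "直交変換", matching the description above.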

Besides "変換" in FIG. 2, it is desirable that the term can also be looked up under headwords such as "対称" (symmetric), "行列の" (of a matrix), and "直交" (orthogonal).

An index in which terms are decomposed into individual words, given subordination and cross-reference relations, and divided into levels by hierarchy is the most desirable form, and is indispensable for books with advanced content. Such sophisticated indexes have conventionally been created by hand. How headwords are decomposed and related depends heavily on the knowledge and skill of the index editor, and this is a problem that is difficult to handle even in a computer-based document processing system.

(d) Object of the Invention
The object of the present invention is to solve this problem of conventional automatic index generation systems, so that a rich index, in which terms in the main text are decomposed and related terms are covered comprehensively, can be created automatically without manual intervention.

(e) Constitution of the Invention
To achieve this object, the present invention concerns a method of automatically generating a book index by computer system in which, when the manuscript of the main text is input, an index code is added to each term to be used as an index heading, the input is recorded in a storage means, and after layout of the main text is complete the main-text data is checked sequentially to extract and classify index terms and page numbers. In this method, a compound term is decomposed into its elements and entered with a separator added at each joint between elements. Based on this, a level identification flag corresponding to each decomposed term is assigned inside the computer. These are read out and every combination of the decomposed terms is created, together with the lower-level terms attached to each decomposed term. Post-processing then uses information such as the level identification flags to thin out unnecessary terms from these combinations according to fixed criteria.

(f) Embodiment of the Invention
An embodiment is now described to show how the automatic book index generation method of the present invention is realized in practice. FIG. 3 is a block diagram showing the processing method according to the invention. As in the conventional method, the main-text storage area 8 is read, control codes are extracted, and index pages are generated. The difference is that separators are added to compound terms as follows.

So that a term in the main text can be decomposed and its smallest-unit terms extracted, a term such as "実対称行列の直交変換" is decomposed as far as possible when it is entered in the text, with separators added, as in Example 1.

Example 1
...(IX)実(p)対称(p)行列の(p)直交(p)変換(K)ジツ(p)タイショウ(p)ギョウレツノ(p)チョッコウ(p)ヘンカン(IE)...

When the term is entered this way, it is processed automatically by the computer after the main text has been edited. The main-text storage area 8 is read, the control code extraction unit 9 extracts one control-coded entry, and processing passes to the preprocessing unit 10. The preprocessing unit 10 analyzes the single index entry that was input with separators and determines the number of separated elements and the characteristics of each element. Reference numeral 11 shows the stored state of one term after preprocessing: each decomposed index term, its reading, and a separator flag are stored. In this example the term is decomposed into five parts as above, so the number of elements is 5.
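
A minimal sketch of this preprocessing step is given below. It assumes the control codes and the separator are the ASCII strings "(IX)", "(K)", "(IE)", and "(p)", and it stores each element as a dictionary; the function name `preprocess` and the record layout are hypothetical. The separator flag is left out here because its value depends on the level rule explained next.

```python
def preprocess(entry):
    """Split one entry (IX)a(p)b...(K)ra(p)rb...(IE) into numbered elements."""
    body = entry.removeprefix("(IX)").removesuffix("(IE)")  # Python 3.9+
    term_part, reading_part = body.split("(K)")
    terms = term_part.split("(p)")
    readings = reading_part.split("(p)")
    if len(terms) != len(readings):
        raise ValueError("term and reading have different element counts")
    # Each element gets a number (0, 1, ...) so later stages can refer to it by number.
    return [
        {"no": i, "term": t, "reading": r}
        for i, (t, r) in enumerate(zip(terms, readings))
    ]

entry = "(IX)実(p)対称(p)行列の(p)直交(p)変換(K)ジツ(p)タイショウ(p)ギョウレツノ(p)チョッコウ(p)ヘンカン(IE)"
elements = preprocess(entry)
```

For the entry of Example 1 this yields five elements, numbered 0 to 4.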

Some of the decomposed terms are handled differently from others. Most phrases adopted in an index have the form "adnominal modifier + nominal", and separators are of two kinds: those between an adnominal modifier and the word it modifies, and those between the components of a compound noun.

The former are called A level and the latter B level. For example, within the compound noun "実対称行列" (real symmetric matrix), "実", "対称", and "行列" are tightly bound to one another, so the separators between them are B level. The modifier "実対称行列の" and the modified word "直交変換" are loosely bound, so the separator between them is A level. In the illustrated example, a separator that follows hiragana (an inflectional ending, a particle, or an auxiliary verb) is regarded as A level and every other separator as B level. The level is thus judged from the presence or absence of kana at the end of the element and is stored in the separator flag.
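
The level rule stated above (hiragana before the separator means A level, anything else means B level) can be sketched as follows. The helper names are hypothetical, and the hiragana test uses the Unicode range U+3041 to U+3096, which is an implementation choice rather than something the patent specifies.

```python
def is_hiragana(ch):
    """True if ch lies in the main Unicode hiragana range."""
    return "\u3041" <= ch <= "\u3096"

def separator_levels(terms):
    """Level of the separator that follows each element except the last one."""
    return ["A" if is_hiragana(t[-1]) else "B" for t in terms[:-1]]

terms = ["実", "対称", "行列の", "直交", "変換"]
levels = separator_levels(terms)
```

Only the separator after "行列の" ends in hiragana ("の"), so it alone is A level.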

These separator levels are used in the thinning process described later.

Each element term is also stored with a number, such as 0 to 4, so that it can subsequently be handled by number.

Accordingly, in the computer's storage area, an index structure like that of FIG. 2 is a permutation obtained by arranging the numbers corresponding to the elements in order from level 1.

This permutation is obtained in the calculation processing unit 12 by using a classification program and computing according to the processing flow of FIG. 4. The function of each processing block in that figure is as follows. INTL sets initial values, such as x = (0), f(1) = (0). CTRL repeats the processing up to the corresponding terminal ■ for each value of the control variable μ, changing it one at a time from 1 to m.

LOAD fetches f(j) from memory. SHFT shifts each character of the elements of list f(j) by i−j+1. CONS concatenates list x and list y to obtain the list (x, y). EXTN extends the elements of list z so that each becomes a permutation of the characters 0, 1, ..., i; that is, if an element of z lacks any of the characters 0, 1, ..., i, those characters are appended after the element in ascending order. STOR stores the contents of the variable in f(i+1) in memory.
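
The three list primitives whose behavior is spelled out above can be sketched individually. This sketch does not reproduce how FIG. 4 wires them together, since that flow is not recoverable from the text. It assumes elements are strings of single digits (so at most ten elements) and that "shifting a character by k" means adding k to the digit; the lowercase function names are hypothetical.

```python
def shft(lst, k):
    """SHFT: add k to every digit character of every element (single digits assumed)."""
    return ["".join(str(int(c) + k) for c in elem) for elem in lst]

def cons(x, y):
    """CONS: concatenate two lists."""
    return x + y

def extn(z, i):
    """EXTN: append the digits of 0..i missing from each element, in ascending order."""
    digits = [str(d) for d in range(i + 1)]
    return [elem + "".join(d for d in digits if d not in elem) for elem in z]

shifted = shft(["0", "10"], 1)
joined = cons(["0"], shifted)
extended = extn(shifted, 2)
```

After EXTN every element is a permutation of the digits 0 to i, which is the representation of one hierarchical arrangement of the elements.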

With this list processing, in the case of Example 1 the number of elements is n = 5, and generating every combination as permutations of the numbers yields 34 combinations, as shown in Table 1. Item 1 is exactly the same as the term in the original text and is placed at the first level; in this case there is of course no second or lower level. In items 2 to 5 a rearrangement has been made: the first level is "対称" and the second level is "実", and no third level exists. From item 6 onward the rearrangements become progressively more complex, and, as seen in some of the items marked ○, some have a fourth or higher level.

However, it is not always necessary to make all of these combinations index headwords. For example, similar terms clustering on an index page are unsightly, page space is limited, and some combinations are unnecessary for index lookup. Therefore, in the hierarchy level check unit 131 and the headword concatenation check unit 132 of the post-processing unit 13, combinations that satisfy either of the following conditions are regarded as unsuitable and are not output to the index pages.

Condition 1: The number of hierarchy levels exceeds the standard (usually 2 to 3).

Condition 2: In the course of concatenating the first level and the levels below it in order, an intermediate result contains an A-level separator, and the result of concatenating the next level ends with a B-level separator. For example, "行列の(p)直交(p)".
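
One possible reading of the two conditions is sketched below. Everything about the representation is an assumption made for illustration: a candidate is a list of levels, each level is a list of element numbers, `SEP_AFTER` gives the level of the separator that follows each element in the original term, and `MAX_LEVELS` is set to 3 from the "usually 2 to 3" remark. Condition 2 in particular is interpreted here as "some element already concatenated is followed by an A-level separator, and the last element of the next level is followed by a B-level separator".

```python
# Separator level after each element of 実(0) 対称(1) 行列の(2) 直交(3) 変換(4).
SEP_AFTER = {0: "B", 1: "B", 2: "A", 3: "B"}  # element 4 ends the term, so no separator
MAX_LEVELS = 3  # assumed standard depth

def violates_condition1(levels):
    """Condition 1: too many hierarchy levels."""
    return len(levels) > MAX_LEVELS

def violates_condition2(levels):
    """Condition 2 (as interpreted here): A-level separator inside, B-level at the end."""
    joined = []
    for level in levels:
        has_a = any(SEP_AFTER.get(e) == "A" for e in joined)
        if has_a and SEP_AFTER.get(level[-1]) == "B":
            return True
        joined.extend(level)
    return False

def keep(levels):
    """True if the candidate survives both checks."""
    return not (violates_condition1(levels) or violates_condition2(levels))

original = keep([[0, 1, 2, 3, 4]])       # the term as written in the text
cut_compound = keep([[2], [3]])          # "行列の(p)直交(p)"
too_deep = keep([[4], [3], [2], [1]])    # four levels
```

Under this reading the candidate "行列の" followed by "直交" is rejected, in line with the example given for Condition 2.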

When the combinations that meet these conditions are excluded, only those marked ○ in Table 1 remain valid, and the index record file 14 is created from them. In Table 1, the combinations with "有" (present) in the "4〜" column are those omitted by the hierarchy level check unit 131 because they have too many levels (Condition 1). The others are omitted by the headword concatenation check unit 132 for the semantic reason of Condition 2. For example, combinations such as items 4 and 8 are removed because the separator after "行列の" is A level and the separator after "直交" is B level, which meets Condition 2.

When this post-processing finishes, the index term extraction unit 9 extracts the next index term, and preprocessing, calculation, and post-processing are repeated.

If the number of index terms is still too large after thinning in the post-processing unit 13, they are thinned again at the output processing stage. That is, based on the post-processed index terms, the index records are classified and accumulated, and the index record file 14 is created and stored. As illustrated, each index record consists of the headwords forming each level, their readings, the "−" mark, and an identification number indicating the term from which the record was derived.

FIG. 5 shows part of the index record file 14 enlarged. As shown there, the reading of each level inherits the readings of the levels above it in order, so that sorting works correctly. The records are then laid out on index pages, as in a conventional document processing system.
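
Why inheriting the upper-level readings makes sorting work can be shown with a short sketch. The record layout (a path of headwords plus one reading per level) and the use of a tuple as sort key are assumptions for illustration; the patent only says that readings are inherited from the levels above. Code-point comparison of katakana again stands in for gojūon order.

```python
records = [
    # (headwords from level 1 downward, reading of each level)
    (["変換", "直交−"], ["ヘンカン", "チョッコウ"]),
    (["変換"], ["ヘンカン"]),
    (["行列"], ["ギョウレツ"]),
    (["変換", "フーリエ−"], ["ヘンカン", "フーリエ"]),
]

def sort_key(record):
    """Sort key that carries the readings of all upper levels before the record's own."""
    _, readings = record
    return tuple(readings)

ordered = sorted(records, key=sort_key)
last_headwords = [path[-1] for path, _ in ordered]
```

Because each sub-headword's key begins with its parent's reading, "直交−" and "フーリエ−" sort directly beneath "変換" instead of under their own initial kana.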

Next, the layout result is read back in. In the output processing unit 15, if headwords derived from the same term (judged by identification number) have been laid out within a fixed distance of each other, for example within 10, at some hierarchy level among the headword items in gojūon or alphabetical order, one of them is deleted according to the rules below. The purpose of an index is to make searching easier for readers, so it is desirable for a term to be classified from several viewpoints; for that reason there is little point in having the same headword concentrated in one vicinity. Particularly for books that have no index pages to spare, there is a demand to keep headwords to the necessary minimum.

The rules for deleting terms that function poorly as headwords are as follows.

1. When the hierarchy depths differ, the deeper entry is more cumbersome, so it is deleted.

2. When the hierarchy depths are the same, deletion follows these rules.

(a) A form in which headwords sandwich the "−" mark from both sides, such as "行列の−変換", is hard to read when it occurs at a higher hierarchy level, so the entry in which it occurs at the higher level is deleted. If both occur at the same level, the entry with more levels matching this rule is deleted.

(b) If rule (a) does not reduce the entries, the entry with more levels in which the A-level separator has been invalidated is deleted.
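
Rule 1 can be sketched as below. This is a simplified illustration, not the patent's procedure: it ignores the "at some hierarchy level" qualification, measures distance in positions within the sorted index, and leaves rules 2(a) and 2(b) unimplemented, so entries of equal depth are simply kept. The `Entry` fields and the function `rethin` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    position: int    # position in the sorted index (unit assumed: entries)
    source_id: str   # identifies the text term the headword was derived from
    depth: int       # number of hierarchy levels

def rethin(entries, max_distance=10):
    """Apply rule 1 only: of two nearby entries from the same term, drop the deeper one."""
    removed = set()
    for i, a in enumerate(entries):
        for b in entries[i + 1:]:
            if a in removed or b in removed:
                continue
            close = abs(a.position - b.position) <= max_distance
            if close and a.source_id == b.source_id and a.depth != b.depth:
                removed.add(a if a.depth > b.depth else b)
    return [e for e in entries if e not in removed]

entries = [
    Entry(1, "t1", 2),
    Entry(4, "t1", 3),    # close to the first entry and deeper, so it is dropped
    Entry(30, "t1", 3),   # same term but far away, so it stays
    Entry(5, "t2", 3),    # derived from a different term, so it stays
]
kept = rethin(entries)
```

Pairs that rule 1 cannot separate would go on to rules 2(a) and 2(b), and finally to the human check described below.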

FIG. 6 is an example of re-thinning under these rules. ■■■■ are too close to one another as second-level entries, but ■ cannot be deleted because it has beneath it a headword with a different identification symbol. Among ■■■, ■, which has no third level, is kept in accordance with rule 1.

As a result, the index page 16 that is finally output is as shown in Example 2.

Example 2
変換
実対称行列の直交−
相似−
直交−
実対称行列の直交−
平面図形の−
フーリエ−
ユニタリー
変数

If the entries are not reduced even by the above processing, a message is displayed and human judgment is requested. The re-thinned index is stored in the checked index page file 17.

(g) Effect of the Invention
As described above, according to the present invention, when an index term is a compound term, it is decomposed into its elements and entered with a separator added at each joint between elements, and on that basis a level identification flag corresponding to each decomposed term is assigned inside the computer. These are read out and every combination of the decomposed terms is created, together with the lower-level terms attached to each decomposed term. Post-processing then uses information such as the level identification flags to thin out unnecessary terms from these combinations according to fixed criteria.

Consequently, unlike the conventional case in which terms in the main text are used as index headwords as they stand, every combination of the terms obtained by decomposing a term in the text is generated, and only the necessary headwords among them are retained. As a result, necessary and sufficient headwords are obtained. Because this is achieved by computer processing, problems arising from an editor's arbitrary choices are also eliminated, making it possible to create sophisticated index pages of very high reliability.

[Brief Description of the Drawings]
FIG. 1 is a block diagram showing the general configuration of a document processing system. FIG. 2 is an example of an index with hierarchy levels. FIG. 3 is a block diagram showing the processing method according to the present invention. FIG. 4 is a block diagram of the calculation processing unit. FIG. 5 is an enlarged view of part of the index records. FIG. 6 shows an example of re-thinning.

In the figures, 6 is the index processing unit, 9 the index term extraction unit, 10 the preprocessing unit, 11 the stored state after preprocessing, 12 the calculation processing unit, 13 the post-processing unit, 14 the index record file, and 15 the output processing unit.

Patent applicant: Fujitsu Ltd.
Agent: Patent attorney Minoru Aoyagi

Written Amendment (voluntary)
Showa 57 (1982), November 11
To: Kazuo Wakasugi, Commissioner of the Patent Office
1. Indication of the case: Patent Application No. Sho 57-109498
2. Title of the invention: Automatic generation method of book index
5. Subject of the amendment: the "Detailed Description of the Invention" section of the specification
6. Content of the amendment: as per the attached sheet

1. On page 3, line 15 of the specification, "読み方と区切り" (reading and delimiter) is corrected to "読み方との区切り" (delimiter from the reading).
2. On page 4, line 19; page 5, lines 3 and 6; and page 15, line 18, "ノンプル" (nombre) is corrected to "ダッシュ" (dash).
3. On page 9, line 9, "用語の" (of a term) is corrected to "用言の" (of an inflecting word).
4. On page 10, lines 5 to 6, "分類プログラム" (classification program) is corrected to "置換処理プログラム" (permutation processing program).
5. On page 10, line 11, "from 1 to m" is corrected to "from 1 to the input value m".
6. On page 10, lines 13 to 18, the passage from "LOAD は、" to "拡大する。" is corrected as follows: "LOAD fetches from memory the f(j) corresponding to the specified number j. SHFT shifts each character of the elements of list f(j) by i−j+1, using the specified value i. CONS concatenates the two input lists, list x and list y, to obtain the list (x + y). EXTN extends the elements of the input list z so that, based on the specified value i, each becomes a permutation of the characters 0, 1, ..., i."
7. On page 11, line 1, "in memory" is corrected to "in memory according to the specified i".
8. On page 11, line 11, "○印のついたものの" (of those marked ○) is corrected to "○印のつかないもの" (those not marked ○).
9. On page 14, line 11 and page 15, line 2, "識別番号" (identification number) is corrected to "識別記号" (identification symbol).
10. On the last line of page 16, "直交" is [illegible].

Patent applicant: Fujitsu Ltd.
Agent: Patent attorney Minoru Aoyagi

Written Amendment (formality)
1. Indication of the case: Patent Application No. Sho 57-109498
2. Title of the invention: Automatic generation method of book index
5. Date of the amendment order: Showa 57 (1982), September 9

1. FIGS. 3, 4, and 6 of the drawings are corrected as shown in the attached sheet.
2. On page 16, lines 7 to 8 of the specification, the passage from "第６図は、この" to "される例で、" is corrected as follows: "FIG. 6(a) shows an example of re-thinning under these rules, and (b) is a table showing the correspondence between the 'index terms in the main text' and the 'identification symbols' in that case."

Claims

[Claims] A method of automatically generating a book index by computer system, in which, when the manuscript of the main text is input, an index code is added to each term used as an index heading and the input is recorded in a storage means, and after layout of the main text is complete the main-text data is checked sequentially to extract and classify index terms and page numbers, the method being characterized in that: a compound term is decomposed into its elements and input with a separator added at each joint between elements; based on this, a level identification flag corresponding to each decomposed term is assigned inside the computer; these are read out and every combination of the decomposed terms is created, together with the lower-level terms attached to each decomposed term; and post-processing is performed that uses information such as the level identification flags to thin out unnecessary terms from the combinations of decomposed terms according to fixed criteria.