JPS63163956A - Document preparation and correction supporting device

Info

Publication number: JPS63163956A
Application number: JP61314279A
Authority: JP
Inventors: Shigeki Kuga (空閑 茂起); Masahiro Wada (和田 正寛); Toshiyuki Tanaka (田中 敏幸); Taro Morishita (森下 太朗); Nobuo Nakamura (中村 信夫)
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1986-12-26
Filing date: 1986-12-26
Publication date: 1988-07-07
Also published as: JPH0531186B2

Abstract

PURPOSE: To make the number-related parts of a sentence easy to check and correct by extracting them through morphological analysis and displaying them in an easily checked form. CONSTITUTION: A Japanese character string entered through an input and editing means 1 is stored in an input character storing means 2. A control means 5 performs morphological analysis on the stored character string, using a dictionary stored in a dictionary storing means 3 and a grammar stored in a grammar storing means 4, and extracts the morphemes related to numbers. The extracted number-related parts are then displayed on a displaying means 6 so that they can be distinguished from the rest of the sentence. This makes it easy to check the number-related parts of the sentence and to correct them with a correcting means 7.

Description

[Detailed Description of the Invention] <Field of Industrial Application> The present invention relates to a document preparation and proofreading support device that extracts the number-related parts of Japanese text by morphological analysis, making them easier to check and proofread.

<Prior Art> Japanese word processors are now in practical use, and the basic technologies they rely on, such as Japanese input and output, editing, kana-kanji conversion algorithms, and dictionaries, are well established.

In Japanese language processing, basic techniques such as morphological analysis, syntactic analysis, and semantic analysis are also known.

Several character code systems have been established for processing language, and distinguishing digits from other characters is a well-established technique, not only for Japanese.

In Europe and the United States, word processors developed early, so the related technology is more advanced, and devices with spell-checking and spell-correcting functions are in practical use.

Whereas Western languages are written with breaks between words, Japanese is normally written as unsegmented mixed kanji-kana text, and its orthography is not strictly standardized. Analysis is therefore difficult, and no device that automates proofreading has been put into practical use.

Conventionally, when accurate Japanese is required, either several people work in pairs and read the text aloud against each other to find problems, or a person with specialized proofreading knowledge checks the text item by item and corrects it.

Recently, devices that support such proofreading work have begun to be developed. They fall broadly into two types.

The first corresponds to the former method: a proofreading device that analyzes mixed kanji-kana text, converts it to speech, and lets the user carry out the read-aloud check with the machine by voice.

The second corresponds to the latter method and aims to automate proofreading or to support specialized proofreading. Because this type requires more advanced technology than the first, only conceptual proposals exist, and there are few reports on how such a device is built or on concrete proofreading methods.

As a further technique, KWIC (Key Word In Context) is in common use in language processing research.

Outside language processing, workstation technology is well established, and devices that use multiple windows to improve work efficiency are in practical use.

<Problems to Be Solved by the Invention> Japanese word processors have become widespread in recent years, and more and more documents are prepared with them. Many word processors now adopt the kana-kanji conversion method, which makes input easy (hereinafter, unless otherwise noted, this includes romaji-kanji conversion).

In kana-kanji conversion, the kana are checked against linguistically grounded information such as a word dictionary and a grammar as they are converted to kanji. Although the algorithm rests on a probabilistic basis, its output therefore has a certain confirmed validity.

For the numbers themselves, however, there is no dictionary or the like to check against, and the digits entered are accepted as they are while the text is composed. The numeric parts are therefore less reliable than the word parts.

In addition, when counters or similar words are attached before or after a number, as in 「１８歳未満」 (under 18) and 「１８歳以下」 (18 or under), proofreading must cover them as well, so a method that extracts only the digits is insufficient.

As noted above, even for the proofreading devices now being developed, no function that focuses on checking and proofreading the numeric parts has been reported.

For example, a device with a read-aloud function decomposes and converts the text into a kana string character by character and reads it out through a speech synthesizer. This allows the numeric parts to be checked rigorously, but processing is slow.

Such a device also needs additional equipment such as a speech synthesizer and a loudspeaker, which raises its cost.

It also requires both language processing and speech processing, which complicates the processing and increases program size.

Further, the sound it produces becomes a noise problem; headphones are needed to avoid this, and putting them on takes extra effort.

With the approach based on morphological, syntactic, and semantic analysis, on the other hand, no function for correcting numeric errors has been reported. The operator must still pick the numeric parts out of the text, check them, and then correct them, which costs the operator effort and time.

A problem common to both approaches is that, although numbers often carry great significance in a document, no function for pulling out the numeric parts for checking has been reported, so work efficiency suffers when only the numeric parts need to be checked.

The present invention addresses these problems by extracting the number-related parts of a text through morphological analysis, displaying them in a form that is easy to check, and allowing the user of the device to correct them easily.

<Means for Solving the Problems> The present invention comprises: means for inputting and editing Japanese; means for storing the input Japanese; means for storing a dictionary; means for storing a grammar; control means, such as a microprocessor, for extracting from the input Japanese the character and symbol strings to be proofread; means for displaying the text, the candidate character and symbol strings, and the like; and means for correcting a character or symbol string when one needs proofreading.

<Operation> Morphological analysis is performed on the Japanese entered into a computer or the like through the input means, and the morphemes related to numbers are extracted and displayed so that they can be distinguished from the rest of the text. This makes it easier to check whether the number-related parts are correct.

When an extracted part is indicated with a pointing device such as a cursor, a KWIC keyed on the numeric part is created and displayed in a separate area, which makes it still easier to check whether the number-related part is correct.

When a text contains several numeric parts, the information that distinguishes the numeric parts from the rest of the text is used to add a function that lets the pointer select only the extracted parts, making the numeric parts easier still to check.

If this check shows that an extracted part contains an error, the erroneous part in the original text or in the KWIC is corrected with the input and editing function, and the text is thereby proofread.

With the checking and correcting functions described above, the numeric parts are proofread efficiently and the problems described earlier are alleviated.

<Embodiment> The present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram of a Japanese text proofreading device according to the present invention.

In the figure, 1 is a means for inputting and editing Japanese character strings.

2 is a means for storing the Japanese character string entered through the input means. A keyboard is normally used as the input means, but instead of entering the text key by key, an external storage means that already holds the Japanese character string, such as a floppy disk or magnetic tape, can be substituted. In other words, a configuration that omits 1 is also possible.

3 is a means for storing the dictionary used for morphological analysis of the Japanese character and symbol strings accumulated in 2.

4 is used for the same purpose and is a means for storing the grammar and other dictionaries.

5 is a control means that extracts the numeric parts from the character string held in 2, stores intermediate results, issues display commands, and so on. The control means includes a means for storing the results obtained under its control.

6 is a display means, such as a CRT, that displays the input character string, intermediate matching results, the character strings to be proofread, the KWIC, and the like.

7 is a proofreading means that, when the KWIC displayed on 6 contains an error, correctly reflects the correction in the original text.

FIG. 2 shows an example of a character string entered through 1, as displayed on 6 by the control means of 5. This sentence is referred to as 8. The sentence contains an error; the correct sentence is taken to be 「昭和６２年度の総売上高は１兆円の予定である。」 ("Total sales for fiscal year Showa 62 are expected to be 1 trillion yen.").

That is, the number 「６１」 is wrong and 「６２」 is correct. The correctly proofread sentence is referred to as 9 to distinguish it from 8.

FIG. 3 shows an example of a display according to the present invention.

The figure shows an example in which the numeric parts 10 have been extracted from sentence 8 and displayed so as to be distinguished from the rest.

FIG. 4 is another display example of the present invention, using KWIC. It shows the state when a pointing device such as a cursor is placed on the first numeric string among the parts 10. In the figure, 11 is the part corresponding to the keyword in the KWIC, and 12 is the KWIC as a whole.

FIG. 5 shows an example of the contents of the word dictionary. 13 is the headword, 14 is information indicating whether the entry is an independent word or an attached word, and 15 is the part of speech or classification. In column 14, 付 marks an attached word and 自 marks an independent word.

FIG. 6 shows an example of the grammar. 16 expresses the conditions for a clause (bunsetsu). Square brackets [ ] indicate that the enclosed element is required for a clause to be formed; the other elements may be omitted.

FIG. 7 shows the information used to assemble morphemes into clauses. 17 is the preceding morpheme, 18 is the following morpheme, and 19 indicates whether the character string formed by joining them satisfies the clause-end condition.
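The connection information of FIG. 7 can be pictured as a lookup table. The following Python sketch is illustrative only: the category names and the table entries are assumptions, since the actual contents of the figure are not reproduced in this text.

```python
# Hypothetical connection table in the spirit of FIG. 7.
# Key:   (category of preceding morpheme 17, category of following morpheme 18)
# Value: whether the joined string may end a clause (item 19)
CONNECTIONS = {
    ("prefix_counter", "number"): False,
    ("number", "counter"): True,
    ("counter", "particle"): True,
}

def can_connect(preceding, following):
    """True if the two categories are allowed to be adjacent."""
    return (preceding, following) in CONNECTIONS

def may_end_clause(preceding, following):
    """True if the pair connects and the joined string satisfies the end condition."""
    return CONNECTIONS.get((preceding, following), False)
```

A pair that is absent from the table is treated as not connectable, which corresponds to the failure case that the description routes to error processing.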

FIG. 8 shows the information used to extract numeric strings and display them distinctly. 20 is a character, 21 is the row in which that character appears on the display screen, and 22 is its column. 23 is information indicating whether the character belongs to a numeric string. 24 records the start and end points of a numeric string and is used when creating the KWIC and the like. Any markers would do as long as they can be told apart; for clarity, the start point of a numeric string is marked S and the end point E.
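The per-character information of FIG. 8 maps naturally onto a small record type. The sketch below is one possible Python rendering, not the patent's own layout: the class name, the function `build_records`, the `(start, end)` span format, and the fixed screen width are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class CharRecord:
    char: str         # item 20: the character itself
    row: int          # item 21: row on the display screen
    col: int          # item 22: column on the display screen
    is_numeric: bool  # item 23: does the character belong to a numeric string?
    marker: str       # item 24: "S" at the start, "E" at the end, "" otherwise

def build_records(text, spans, width=40):
    """Build one record per character.

    spans is a list of non-overlapping (start, end) index pairs, end exclusive.
    width is an assumed number of characters per display row.
    """
    records = []
    for i, ch in enumerate(text):
        row, col = divmod(i, width)
        in_span = any(s <= i < e for s, e in spans)
        marker = ""
        for s, e in spans:
            if i == s:
                marker = "S"
            elif i == e - 1:
                marker = "E"
        records.append(CharRecord(ch, row, col, in_span, marker))
    return records
```

In this sketch a numeric string that is a single character long receives only the S marker; a real implementation would need its own convention for that case.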

FIG. 9 shows that a specific position 25 on the display device can be described by the row and column numbers M and H.

FIG. 10 is a schematic flow diagram of the present invention. It is described in detail below using the example sentence 8 of FIG. 2.

First, the sentence 「昭和６１年度の総売上高は１兆円の予定である。」 entered through the input and editing means 1 is stored in the input character storage device 2. 26 is this processing block.

Each input character string is checked against the word dictionary and the grammar to determine its morphemes. 27 is the word dictionary lookup block. Here, the dictionary search first shows that 「昭和」 can be either a noun or a prefix counter. Processing then moves on to 「６１」.

The character code shows that this is a number.
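Deciding from the character code whether a character is a number can be sketched as follows. This is a modern Python illustration that relies on Unicode categories, which the patent does not use; the set of kanji numerals is an assumption added so that the same check also covers kanji-numeral text.

```python
import unicodedata

# Assumed set of kanji numerals; not taken from the patent.
KANJI_NUMERALS = set("〇一二三四五六七八九十百千万億兆")

def is_numeral(ch):
    """Return True if the single character ch is a digit or a kanji numeral."""
    if ch in KANJI_NUMERALS:
        return True
    # Unicode category "Nd" (decimal digit) covers ASCII and full-width digits alike.
    return unicodedata.category(ch) == "Nd"
```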

Grammatical matching is then applied; this processing block is 28. In blocks 27 and 28, matching against both the dictionary and the grammar may succeed or fail. On failure, error processing 35 is performed and processing moves to the next character. Block 28 uses FIG. 7 to check whether the preceding morpheme can connect to the following morpheme. In this case, if 「昭和」 is a noun the clause ends at 「昭和」, and if it is a prefix counter the provisional string extends to 「昭和６１」. Processing then moves to 「年度」; the same procedure shows that it is a counter and can connect to 「６１」. Next, 「の」 is processed, and 「昭和６１年度の」 is found to form one clause. With the noun reading, the clause breaks after 「昭和」 and 「６１年度の」 becomes the next clause. For such cases the longest-match method is known: it selects the candidate with the fewest clauses and the longest character string. 「昭和６１年度の」 is therefore selected. The processing block that determines the clause is 29.
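The walk-through above can be approximated in a few lines of Python. This is a heavily simplified sketch, not the patent's procedure: the dictionary and connection table are replaced by two hard-coded word lists (both hypothetical), and longest match is applied only when choosing which prefix or counter to attach, rather than when choosing among competing clause segmentations.

```python
# Hypothetical stand-ins for the word dictionary of FIG. 5.
PREFIX_COUNTERS = ["昭和", "第"]
SUFFIX_COUNTERS = ["年度", "年", "兆円", "円", "歳", "未満", "以下"]

def longest_match(text, pos, words):
    """Return the longest entry of words that starts at text[pos], or ''."""
    best = ""
    for w in words:
        if text.startswith(w, pos) and len(w) > len(best):
            best = w
    return best

def extract_numeric_strings(text):
    """Return (start, end) spans, end exclusive, of numeric strings in text."""
    spans = []
    i = 0
    while i < len(text):
        start = i
        prefix = longest_match(text, i, PREFIX_COUNTERS)
        digits_from = i + len(prefix)
        k = digits_from
        while k < len(text) and text[k].isdigit():
            k += 1
        if k == digits_from:
            # No digits here, so this position does not start a numeric string.
            i += 1
            continue
        # Attach every counter that follows, always preferring the longest one.
        while True:
            suffix = longest_match(text, k, SUFFIX_COUNTERS)
            if not suffix:
                break
            k += len(suffix)
        spans.append((start, k))
        i = k
    return spans
```

Preferring 「年度」 over 「年」 at the same position is the longest-match choice in miniature; applied to the example sentence, the sketch yields one span for 「昭和61年度」 and one for 「1兆円」.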

The order in which the input characters are examined is not fixed; starting from any character does not affect the present invention. Here the normal reading order, 「昭 和 ６ １ ...」, is assumed.

Through blocks 27, 28, and 29, the symbol S is placed in field 24 of the character 「昭」 of 「昭和...」 and E in field 24 of the character 「度」 of 「年度」, which identifies 「昭和６１年度」 as one numeric string.

In the same way, 「１兆円」 is identified as one numeric string.

30 is the processing block that decides whether morphological analysis is finished. If a character follows, it is loaded into the control device and prepared for morphological analysis. This is repeated until no characters remain.

If a following character string exists, it is loaded and set on the stack. This processing block is 31.

32 is the processing block that, after character evaluation is finished, displays the numeric strings in the original text distinctly from the rest, based on the stack holding the information shown in FIG. 8.
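On a plain-text display, showing the numeric strings distinctly can be as simple as bracketing them. The Python sketch below assumes the numeric strings are already available as `(start, end)` index pairs (end exclusive); the function name and the bracket characters are illustrative choices, and the patent leaves the actual visual distinction open.

```python
def highlight(text, spans, open_mark="[", close_mark="]"):
    """Return text with each (start, end) span wrapped in marker characters."""
    out = []
    last = 0
    for s, e in sorted(spans):
        out.append(text[last:s])                          # unmarked text before the span
        out.append(open_mark + text[s:e] + close_mark)    # the marked numeric string
        last = e
    out.append(text[last:])                               # remaining unmarked text
    return "".join(out)
```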

33 is another way of extracting and displaying numeric strings: a processing block that generates a KWIC keyed on the numeric strings.
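A KWIC keyed on numeric strings lists each numeric string together with a fixed amount of text on either side. The following Python sketch assumes the numeric strings are given as `(start, end)` index pairs (end exclusive) and measures context in characters; both the function name and these conventions are assumptions, not details from the patent.

```python
def make_kwic(text, spans, context=5):
    """Return one (left context, keyword, right context) triple per span."""
    lines = []
    for s, e in spans:
        left = text[max(0, s - context):s]   # clipped at the start of the text
        right = text[e:e + context]          # clipped at the end of the text
        lines.append((left, text[s:e], right))
    return lines

def format_kwic(lines, context=5):
    """Align the keywords in one column by padding the left context with full-width spaces."""
    return "\n".join(left.rjust(context, "　") + " | " + key + " | " + right
                     for left, key, right in lines)
```

Padding with a full-width space keeps the keyword column aligned when the surrounding text is Japanese and the display uses a fixed-pitch font.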

34 is the processing block in which the numeric strings are checked and, if an error is found, corrected and edited. Here the number 「６１」 is corrected to 「６２」.
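Carrying a correction back into the original text amounts to replacing the span of the numeric string. The Python sketch below is an assumption about how that could be done, not the patent's mechanism: corrections are given as `((start, end), replacement)` pairs and are applied from right to left so that a replacement of a different length does not shift the positions of spans still to be processed.

```python
def apply_corrections(text, corrections):
    """Apply ((start, end), replacement) pairs to text, with end exclusive."""
    # Work from the rightmost span backwards so earlier indices stay valid.
    for (s, e), new in sorted(corrections, key=lambda c: c[0][0], reverse=True):
        text = text[:s] + new + text[e:]
    return text
```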

When checking is complete, the sequence of operations can be ended.

Next, examples of implementations other than the one described above are given.

In FIG. 8, the information that distinguishes digits from other characters and the information that marks the start and end of a numeric string are stored separately in 23 and 24, but a configuration in which they share one field is also conceivable.

Also, in the description above the KWIC is created after the numeric strings have been extracted from the original text, but this order need not be followed. For example, only the numeric parts of the original text may be extracted without generating or displaying a KWIC, or the KWIC may be generated first and the numeric strings in the original text extracted afterwards.

The extracted numeric strings may all be displayed uniformly, or only the part indicated by a cursor or the like may be displayed; either choice does not affect the present invention.

Likewise, numbers may be corrected directly in the extracted numeric parts of the original text, or in the keywords of the KWIC with the result then carried back into the original text; either choice does not affect the present invention.

Although the description above uses arithmetic numerals (算用数字) as the example, the same means can be applied to kanji numerals, Arabic numerals, and the like.

The word dictionary of FIG. 5 holds attached words and independent words in a single structure, but this is only to simplify the explanation; separating independent words, affixes, attached words, and so on, as in ordinary morphological analysis, does not affect the present invention.

<Effects of the Invention> The present invention makes it possible to extract only the numeric parts from documents in which numbers carry important meaning, so that they can be checked and proofread easily.

Because the numeric parts are extracted reliably, the invention also improves proofreading accuracy.

Because it extracts not only the digits but whole numeric strings including prefix counters, counters, and postfix counters, it further improves the accuracy of proofreading numbers.

Because the effort of picking the numeric parts out of the original text is eliminated, it shortens proofreading time and reduces the mental burden and fatigue of the proofreader.

It has the advantage of requiring no special equipment such as a read-aloud device, loudspeaker, or speech synthesizer.

It also requires no special gear such as headphones to be worn, which saves both the effort of wearing it and the cost of buying it.

Complex processing such as language processing and speech processing is unnecessary, which is beneficial for program development, maintenance, and storage capacity.

Because it can be carried out with simple processing, processing is fast.

Because it involves no source of noise such as speech, the machine can be installed and operated without regard to its effect on the surroundings.

[Brief Description of the Drawings]

FIG. 1 through FIG. 10 (drawings not reproduced).

1: input and editing means
2: input character storage means
3: dictionary storage means
4: grammar storage means
5: control means
6: display means
7: proofreading means
8: example sentence containing an error
9: correct example sentence
10: numeric parts in the text
11: keyword in the KWIC
12: KWIC as a whole
13: headword in the word dictionary
14: independent word / attached word distinction in the word dictionary
15: part of speech and classification information in the word dictionary
16: clause formation condition
17: preceding morpheme
18: following morpheme
19: clause end condition
20: character stack for screen display
21: row position stack for screen display
22: column position stack for screen display
23: numeric determination stack for screen display
24: numeric string determination information stack
25: specific position on the display device
26: input character string storage processing block
27: word dictionary matching processing block
28: grammar matching processing block
29: clause determination processing block
30: morphological analysis end processing block
31: next character processing block
32: numeric string extraction and display processing block
33: numeric string KWIC display processing block
34: proofreading processing block
35: error processing block

Agent: Patent Attorney Takeshi Sugiyama (and one other)

Claims

[Claims]

1. A document preparation and proofreading support device, in a document processing system comprising means for inputting and editing Japanese, means for storing the input Japanese, means for storing a dictionary, means for storing a grammar, means for extracting from the input Japanese the character and symbol strings to be proofread, means for displaying the text, the candidate character and symbol strings, and the like, and means for correcting a character or symbol string when one needs proofreading, characterized in that morphemes related to numbers are extracted so that proofreading is made easier.