JPH08137884A

JPH08137884A - Proofreading method for Japanese sentences

Info

Publication number: JPH08137884A
Application number: JP6276867A
Authority: JP
Inventors: Shinsuke Matsushiro; 信輔末代; Yuka Itai; 由花板井
Original assignee: Meidensha Corp; Meidensha Electric Manufacturing Co Ltd
Current assignee: Meidensha Corp; Meidensha Electric Manufacturing Co Ltd
Priority date: 1994-11-11
Filing date: 1994-11-11
Publication date: 1996-05-31

Abstract

PURPOSE: To proofread misused words and unregistered words reliably and easily in Japanese language processing. CONSTITUTION: Unregistered words are extracted and corrected through morphological analysis of the original text (S1); misused words are extracted by checking the morpheme data, including the corrected unregistered words, against a misused-word dictionary (S2); each extracted misused word is pointed out and a list of correction candidates is displayed (S3). When a misused word is pointed out, the term that appears correct is selected from the correction candidates, and the misused and unregistered words are proofread through correction (S4) and confirmation (S5). The morpheme data obtained by the morphological analysis is syntactically analyzed (S6), the morpheme data obtained after correcting the misused word is syntactically analyzed (S7), and when the two syntactic analysis results differ in sentence meaning, the correction is judged to be erroneous and recorrection is prompted.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a Japanese language processing system, and in particular to a method of proofreading misused words and unregistered words in documents.

[0002]

[Prior Art] Computer-based Japanese language processing, such as word processors, machine translation, document databases, and hypertext, has been put into practical use.

[0003] Natural language analysis for this purpose begins with morphological analysis, which divides the text to be analyzed into morphemes (the smallest units of word structure) and identifies the properties of each morpheme. This is followed by syntactic analysis, which analyzes the text according to the syntactic rules of the natural language, and then by semantic analysis and context analysis, which remove ambiguity and vagueness.
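As a rough illustration of the morphological analysis step described above, the sketch below segments a string by greedy longest match against a small dictionary and labels anything it cannot match as unregistered. The lexicon entries, the function name `analyze`, and the longest-match strategy are assumptions made for illustration; the source does not specify how its morphological analyzer works.

```python
# Toy dictionary: surface form -> part of speech (hypothetical entries).
LEXICON = {
    "私": "pronoun",
    "は": "particle",
    "本": "noun",
    "を": "particle",
    "読む": "verb",
}


def analyze(text, lexicon=LEXICON):
    """Greedy longest-match segmentation into (surface, property) pairs."""
    max_len = max(len(word) for word in lexicon)
    morphemes = []
    i = 0
    while i < len(text):
        # Try the longest dictionary entry that could start at position i.
        for length in range(min(max_len, len(text) - i), 0, -1):
            surface = text[i:i + length]
            if surface in lexicon:
                morphemes.append((surface, lexicon[surface]))
                i += length
                break
        else:
            # No entry starts here: flag one character as unregistered.
            morphemes.append((text[i], "unregistered"))
            i += 1
    return morphemes
```

With this toy lexicon, `analyze("私は雑誌を読む")` flags the two characters of 雑誌 as unregistered, which is the kind of item step S1 would hand to the operator.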

[0004] Syntactic analysis uses a grammar to determine whether the morphologically analyzed sentence is well formed and, when it is, produces a tree structure (a parse tree) as the analysis result.

[0005] Because this syntactic analysis considers only grammatical conformity, syntactic ambiguity arises and many parse trees are generated. Semantic analysis is performed to select the correct parse tree from among them.
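The grammaticality check and the ambiguity it produces can be sketched with a small chart-style parser that enumerates every tree a grammar allows. The three binary rules, the tag names, and the function `parse_all` are hypothetical and chosen only to make a classic ambiguous phrase parse two ways; they are not taken from the source.

```python
from functools import lru_cache

# Hypothetical binary rules: (left label, right label) -> parent label.
RULES = {("ADJ", "NP"): "NP", ("NP", "P"): "PP", ("PP", "NP"): "NP"}


def parse_all(tagged, goal="NP"):
    """Return every parse tree rooted in `goal`; an empty list means ungrammatical."""
    words = tuple(tagged)

    @lru_cache(maxsize=None)
    def trees(i, j):
        # All (label, tree) pairs that cover words[i:j].
        if j - i == 1:
            surface, tag = words[i]
            return ((tag, (tag, surface)),)
        found = []
        for k in range(i + 1, j):
            for left_label, left in trees(i, k):
                for right_label, right in trees(k, j):
                    parent = RULES.get((left_label, right_label))
                    if parent:
                        found.append((parent, (parent, left, right)))
        return tuple(found)

    return [tree for label, tree in trees(0, len(words)) if label == goal]


# "beautiful / water mill / 's / daughter": the adjective can attach two ways.
PHRASE = [("美しい", "ADJ"), ("水車小屋", "NP"), ("の", "P"), ("娘", "NP")]
PARSES = parse_all(PHRASE)
```

Here `PARSES` holds two trees, one in which the adjective modifies the daughter and one in which it modifies the mill. Grammar alone cannot choose between them, which is why a semantic step is needed.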

[0006] Semantic analysis uses not only the grammatical category of each word (corresponding to its part of speech) but also its semantic information.

[0007] In such Japanese language processing, document proofreading support systems provide, as one of their functions, the extraction and correction of misused words and unregistered words. This function follows the procedure shown in FIG. 2.

[0008] (S1) The original text to be processed is morphologically analyzed and unregistered words are extracted. The operator corrects these unregistered words directly.

[0009] (S2) The morpheme data produced by the morphological analysis is read, and misused words are extracted by referring to a dictionary. Corrected unregistered words are also checked, and any misused words among them are extracted.

[0010] (S3) Each extracted misused word is pointed out to the operator on the display screen, together with a list of correction candidates.

[0011] (S4) In response, when the operator selects the term that appears correct from the correction candidates, the misused word is replaced with that term.

[0012] (S5) Once the operator confirms the replacement, the next misused word is extracted. Proofreading is complete when every misused word has been corrected and confirmed.
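Steps S2 to S5 of the conventional procedure can be sketched as a dictionary lookup followed by an operator choice. The dictionary entries, the function name `proofread`, and the `choose` callback that stands in for the operator are all hypothetical. Treating each misused expression as a single morpheme is a simplification.

```python
# Hypothetical misused-word dictionary: misused term -> correction candidates.
MISUSE_DICT = {
    "汚名挽回": ["汚名返上", "名誉挽回"],
}


def proofread(morphemes, choose, misuse_dict=MISUSE_DICT):
    """Flag each misused term, offer its candidates, and apply the chosen one."""
    corrected = []
    for surface in morphemes:
        candidates = misuse_dict.get(surface)      # S2: dictionary lookup
        if candidates is None:
            corrected.append(surface)              # not a misused word
            continue
        picked = choose(surface, candidates)       # S3/S4: point out, select
        corrected.append(picked)                   # S5: confirmed replacement
    return corrected


def first_candidate(term, candidates):
    """Stand-in for the operator: always accept the first candidate."""
    return candidates[0]
```

Calling `proofread(["彼", "は", "汚名挽回", "した"], first_candidate)` replaces the flagged term with the first candidate. Nothing in this flow checks whether the chosen candidate preserves the meaning of the sentence, which is the gap the invention addresses.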

[0013]

[Problems to Be Solved by the Invention] Because the conventional method relies on manual proofreading, it has the following problems.

[0014] (1) When there are many correction candidates, the operator may select a wrong one.

[0015] (2) When there are no correction candidates, the result depends heavily on the operator's knowledge, and an appropriate correction may not be possible.

[0016] An object of the present invention is to provide a proofreading method that makes the proofreading of misused words and unregistered words reliable and easy.

[0017]

[Means for Solving the Problems] To solve the above problems, the present invention provides a method of proofreading misused words and unregistered words by morphologically analyzing a Japanese sentence, adding unregistered words to the resulting analysis data, extracting misused words, and having an operator correct them, characterized in that syntactic analysis is performed on the morpheme data obtained by the morphological analysis, syntactic analysis is performed on the morpheme data in which the misused word has been corrected, and recorrection is prompted when the two syntactic analysis results differ in sentence meaning.

[0018]

[Operation] A reliable correction is obtained by checking whether the correction is appropriate, that is, whether the original morpheme data and the morpheme data with the misused word corrected differ in sentence meaning.

[0019] The operator's burden is reduced by checking automatically whether a correction is appropriate and by prompting recorrection when it is not.

[0020]

[Embodiment] FIG. 1 is a processing procedure diagram showing one embodiment of the present invention. It differs from FIG. 2 in that processing elements S6 to S8 are provided.

[0021] Syntactic analysis S6 parses the morpheme data produced by the morphological analysis. This process uses the syntactic analysis function of a conventional Japanese language processing system.

[0022] Syntactic analysis S7 parses the morpheme data as corrected by process S4. This process is the same as syntactic analysis S6.

[0023] Sentence-meaning analysis S8 compares the results of the two syntactic analyses S6 and S7 and uses semantic and context analysis to determine whether the correction has changed the meaning of the sentence. This process uses the semantic and context analysis functions of a conventional Japanese language processing system.

[0024] If sentence-meaning analysis S8 finds a difference in meaning, the corrected term (including any added unregistered word) is treated as a misused word: a warning is issued, the term is pointed out by indication process S3, control returns to correction process S4, and the meaning comparison is repeated on the new correction.

[0025] If the sentence-meaning determination finds no difference in meaning, the correction is accepted as appropriate and processing moves on to the next misused-word determination S2.
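The loop formed by S3, S4 and S6 to S8 can be sketched as follows. The source delegates parsing and meaning comparison to functions of an existing Japanese processing system, so they are passed in here as the callables `parse` and `same_meaning`. The toy sense table, all function names, and the `max_attempts` cutoff are assumptions added for illustration and do not appear in the source.

```python
def verified_correction(morphemes, index, candidates, choose, parse, same_meaning,
                        max_attempts=3):
    """Correct morphemes[index], re-prompting while the correction changes the meaning."""
    original_parse = parse(morphemes)                          # S6
    for attempt in range(max_attempts):
        picked = choose(morphemes[index], candidates, attempt)  # S3/S4
        revised = morphemes[:index] + [picked] + morphemes[index + 1:]
        if same_meaning(original_parse, parse(revised)):       # S7 and S8
            return revised                                     # accepted; continue with S2
        # Meaning differs: the caller would warn here, then the loop re-prompts.
    return None  # added safeguard: no acceptable correction was found


# Toy stand-ins for the parser and the meaning comparison (hypothetical).
SENSE = {"汚名挽回": "restore-honor", "汚名返上": "restore-honor", "名誉毀損": "defame"}


def toy_parse(morphemes):
    # Represent the "parse" as a tuple of sense labels, one per morpheme.
    return tuple(SENSE.get(m, m) for m in morphemes)


def toy_same_meaning(before, after):
    return before == after


def pick_in_order(term, candidates, attempt):
    # Stand-in for the operator: try the candidates one after another.
    return candidates[attempt % len(candidates)]


RESULT = verified_correction(
    ["彼", "は", "汚名挽回", "した"], 2, ["名誉毀損", "汚名返上"],
    pick_in_order, toy_parse, toy_same_meaning)
```

In this run the first pick changes the sense and is rejected, and the second pick is accepted. A real system would substitute its own parser and semantic comparison for the toy functions.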

[0026] Thus, according to this embodiment, when the syntactic analysis results for the morpheme data obtained by morphological analysis and for the morpheme data with the misused word corrected differ in sentence meaning, the correction is judged inappropriate and recorrection is prompted.

[0027] As a result, errors of meaning introduced while correcting misused words and unregistered words can be fixed immediately through the warning and indication, and high proofreading quality can be maintained at all times, independent of the operator's knowledge or preconceptions.

[0028] In addition, whenever a misused word or unregistered word was corrected in the conventional method, the operator had to read the corrected sentence against the original to confirm that the meaning had not changed, which demanded a great deal of correction time and effort. In this embodiment the sentence meaning is evaluated automatically, so the correction time and workload are greatly reduced.

[0029]

[Effects of the Invention] As described above, according to the present invention, syntactic analysis is performed on the morpheme data obtained by morphological analysis, syntactic analysis is performed on the morpheme data in which the misused word has been corrected, and recorrection is prompted when the two syntactic analysis results differ in sentence meaning. This makes proofreading reliable, improves proofreading quality, and greatly reduces the operator's working time and burden.

[Brief description of drawings]

FIG. 1 is a processing procedure diagram showing one embodiment of the present invention.

FIG. 2 is a diagram of the conventional processing procedure.

Claims


1. A method of proofreading Japanese sentences in which misused words and unregistered words are proofread by morphologically analyzing a Japanese sentence, adding unregistered words to the resulting analysis data, extracting misused words, and having an operator correct them, the method being characterized in that syntactic analysis is performed on the morpheme data obtained by the morphological analysis, syntactic analysis is performed on the morpheme data in which the misused word has been corrected, and recorrection is prompted when the two syntactic analysis results differ in sentence meaning.