JPH0258159A - Proofreading system for japanese sentence - Google Patents

Proofreading system for japanese sentence

Info

Publication number
JPH0258159A
Authority
JP
Japan
Prior art keywords
notation
standard
word
expressions
expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP63210055A
Other languages
Japanese (ja)
Inventor
Shiyou Imasato
詔 今郷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP63210055A
Publication of JPH0258159A

Abstract

PURPOSE: To unify expressions to standard expressions, without the operator having to select an expression each time a fluctuation of expression is detected, by using a standard expression dictionary in which standard expressions are determined in advance to detect the presence or absence of fluctuation of expressions, and by automatically correcting the expression to the standard expression with a correcting means when fluctuation is detected.
CONSTITUTION: The system is provided with a document storage means 1 where a Japanese-language document is stored; a word dictionary 3 where expressions of words and, for each expression, part-of-speech information, word discrimination information, and expression discrimination information are stored; a connection table 2 for parts of speech, which stores information indicating whether connection between parts of speech is permitted; and a standard expression dictionary 4 where sets of word discrimination information, expression discrimination information, and expressions are stored. It is further provided with a word dividing means 5 which divides an input sentence into words, an expression fluctuation detecting means 6 which detects expressions other than standard expressions, and an expression fluctuation correcting means 7 which corrects expressions other than standard expressions to standard expressions. The detecting means 6 detects the presence or absence of fluctuation of expressions; when fluctuation is detected, the correcting means 7 automatically corrects the expressions to standard expressions. Thus, expressions are unified to standard expressions without the operator having to select an expression each time a fluctuation is detected.

Description

[Detailed Description of the Invention]
Field of Industrial Application
The present invention relates to a proofreading system for Japanese text that detects and corrects notation variations (inconsistent ways of writing the same word) in machine-processable Japanese text entered with a Japanese word processor or the like.

Prior Art
In Japanese text, a single word can often be written in more than one way, but within one document it is desirable to use a single notation consistently. When several people share the writing of one document, however, word notation tends to become inconsistent. Different notations of the same word then coexist in the same document (notation variation), and the result is an unsightly document.

To address this problem of notation variation in Japanese text, one approach, disclosed for example in Japanese Unexamined Patent Publication No. 62-209668, presents all possible notations to the operator whenever a word that can be written in more than one way is detected, and has the operator choose one.

Another approach, disclosed in Japanese Unexamined Patent Publication No. 62-22176, uses a dictionary that stores pairs of incorrect and correct terms and converts each incorrect term into the correct one.

Problems to Be Solved by the Invention
With the former approach, however, the notation used for a given word is normally fixed, so having to choose every time a possible variation is found is tedious. Moreover, the more selections the operator has to make, the more likely it is that a selection error will leave a notation variation in the document.

With the latter approach, even for a word such as 「合い言葉」/「合言葉」 ("password"), for which either notation is acceptable, one notation must always be registered as the correct one, so the approach lacks flexibility.

Means for Solving the Problems
The invention according to claim 1 provides: a document storage means that stores a Japanese document; a word dictionary that stores the notation of each word together with the part-of-speech information, word identification information, and notation identification information corresponding to that notation; a part-of-speech connection table that stores whether each pair of parts of speech can be connected; and a standard notation dictionary that stores sets of word identification information, notation identification information, and notation. It further provides a word division means that divides an input sentence into words, a notation variation detection means that detects non-standard notations, and a notation variation correction means that corrects non-standard notations to the standard notation.

The invention according to claim 2 adds to the invention of claim 1 a standard changing means that changes the contents stored in the standard notation dictionary.

Operation
According to the invention of claim 1, the notation variation detection means uses a standard notation dictionary, in which the standard notations are determined in advance, to detect whether a notation variation is present; when one is found, the correction means automatically corrects it to the standard notation. Notations are thus unified to the standard notation without the operator having to select a notation each time a variation is detected.

According to the invention of claim 2, the standard changing means can change the contents of the standard notation dictionary, that is, the standard notations themselves. The operator can therefore make any desired notation the standard, which gives the system flexibility.

Embodiment
An embodiment of the present invention will be described with reference to the drawings.

FIG. 1 shows the system configuration of this embodiment. First, a document storage means 1 is provided; this is a file that holds Japanese documents created with a Japanese word processor or the like. As dictionaries, a part-of-speech connection table 2, a word dictionary 3, and a standard notation dictionary 4 are provided.

The part-of-speech connection table 2 stores, as binary information, whether two parts of speech can be grammatically connected; an example is shown in FIG. 2. In the figure, 「○」 indicates that connection is permitted and 「×」 indicates that it is not.
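A table of this kind can be sketched as a lookup keyed on ordered pairs of parts of speech. This is only an illustration: the part-of-speech labels and the specific entries below are assumptions, since FIG. 2 is not reproduced in this text.

```python
# Hypothetical part-of-speech labels and entries; FIG. 2 is not available here.
# True corresponds to "○" (connection permitted), False to "×" (not permitted).
POS_CONNECTION_TABLE = {
    ("noun", "particle"): True,
    ("particle", "verb"): True,
    ("noun", "verb"): False,
    ("verb", "particle"): False,
}


def can_connect(left_pos: str, right_pos: str) -> bool:
    """Return True if right_pos may directly follow left_pos.

    Pairs missing from the table are treated as not connectable,
    which is a choice made for this sketch, not something the patent states.
    """
    return POS_CONNECTION_TABLE.get((left_pos, right_pos), False)
```

Keying on the ordered pair matters because connectability is directional: in this sketch a particle may follow a noun, but nothing is said about a noun following a particle.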

As illustrated in FIG. 3, the word dictionary 3 stores the notation of each word together with the corresponding word identification information, notation identification information, and part-of-speech information. The word identification information is used to determine whether words with different notations are the same word; the same number is assigned to the same word. In FIG. 3, 「取り扱う」 and 「取扱う」 ("to handle") are different notations of the same word, so both are assigned the same word identification number, 102. The notation identification information is a number that distinguishes the different notations of one word. In the example of FIG. 3, a word with only one notation, such as 「薬品」 ("chemicals"), is assigned 0, and each notation of a word with several notations, such as 「取り扱う」 and 「取扱う」, is assigned a number of 1 or greater.
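One possible in-memory shape for such a dictionary is a mapping from notation to a small record. In the sketch below, word ID 102 and the notation IDs for 「取り扱う」 (1) and 「取扱う」 (2) follow the worked example in this description; the word ID 101 for 「薬品」, the part-of-speech labels, and all Python names are assumptions made for illustration.

```python
from typing import NamedTuple


class WordEntry(NamedTuple):
    word_id: int      # same value for every notation of the same word
    notation_id: int  # 0 = only one notation exists; 1 or greater otherwise
    pos: str          # part-of-speech label (hypothetical labels)


WORD_DICT = {
    "薬品": WordEntry(word_id=101, notation_id=0, pos="noun"),  # 101 is made up
    "取り扱う": WordEntry(word_id=102, notation_id=1, pos="verb"),
    "取扱う": WordEntry(word_id=102, notation_id=2, pos="verb"),
}


def is_same_word(notation_a: str, notation_b: str) -> bool:
    """Two notations denote the same word when their word IDs match."""
    return WORD_DICT[notation_a].word_id == WORD_DICT[notation_b].word_id


def has_multiple_notations(notation: str) -> bool:
    """A notation ID of 1 or greater marks a word with several notations."""
    return WORD_DICT[notation].notation_id >= 1
```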

The standard notation dictionary 4 is organized so that, for each word with several notations, the pair consisting of the standard notation and its notation identification information can be retrieved using the word identification information as the key. An example is shown in FIG. 4.

To carry out the actual detection and correction, a word division means 5, a notation variation detection means 6, and a notation variation correction means 7 are provided, together with a standard changing means 8. The word division means 5 extracts one sentence from the document storage means 1 and, using the word dictionary 3 and the part-of-speech connection table 2, divides it into a sequence of words that can be grammatically connected to one another. In doing so, it attaches the word identification information and notation identification information to each word. For example, the input sentence 「薬品を取り扱う」 is divided into 「薬品」 + 「を」 + 「取り扱う」.
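The description does not say which search strategy the word division means uses. As one hedged illustration, the sketch below tries dictionary matches longest-first and backtracks whenever the part-of-speech connection check fails. The entry for 「を」, the word IDs 101 and 201, the part-of-speech labels, and the function name `segment` are all assumptions.

```python
# notation -> (word_id, notation_id, part of speech); 101 and 201 are made up.
WORD_DICT = {
    "薬品": (101, 0, "noun"),
    "を": (201, 0, "particle"),
    "取り扱う": (102, 1, "verb"),
    "取扱う": (102, 2, "verb"),
}

# Ordered pairs of parts of speech that may be adjacent (hypothetical entries).
CONNECTABLE = {("noun", "particle"), ("particle", "verb")}


def segment(sentence, prev_pos=None):
    """Split sentence into (notation, word_id, notation_id) tuples.

    Returns None when no sequence of dictionary words that satisfies the
    connection table covers the whole sentence.
    """
    if not sentence:
        return []
    # Try the longest candidate prefix first, then shorter ones.
    for end in range(len(sentence), 0, -1):
        surface = sentence[:end]
        entry = WORD_DICT.get(surface)
        if entry is None:
            continue
        word_id, notation_id, pos = entry
        if prev_pos is not None and (prev_pos, pos) not in CONNECTABLE:
            continue  # grammatically not connectable; try another split
        rest = segment(sentence[end:], pos)
        if rest is not None:
            return [(surface, word_id, notation_id)] + rest
    return None
```

With these toy tables, `segment("薬品を取り扱う")` yields the three words from the example above, each carrying its word ID and notation ID, while `segment("薬品取り扱う")` returns `None` because the sketch's table does not allow a verb directly after a noun.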

For each word produced by the word division means 5, the notation variation detection means 6 and the correction means 7 detect and correct notation variation according to the flowchart of FIG. 5. First, the notation identification information of the word is checked to see whether it is 0. If it is 0, the word has only one notation, so there is no variation. If it is 1 or greater, the standard notation dictionary 4 is searched using the word identification information as the key, and a pair of notation identification information and notation is obtained.

The notation identification information attached to the word is then compared with the notation identification information obtained from the standard notation dictionary 4. If the two match, the notation in the input sentence is the standard one; it is judged to have no variation and is left unchanged. If they differ, the word in the input sentence is judged to have a notation variation and is replaced with the notation obtained from the standard notation dictionary 4.

For example, consider the word 「取り扱う」. During word division, the word division means 5 uses the word dictionary 3 to attach word identification information 102 and notation identification information 1 to this word. The detection means 6 then searches the standard notation dictionary 4 with word identification information 102 as the key and obtains the pair of notation identification information 2 and the notation 「取扱う」. Because the notation identification information obtained by word division differs from that obtained from the standard notation dictionary 4, a notation variation is judged to be present, and the notation 「取り扱う」 in the input sentence is replaced with the standard notation 「取扱う」.
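The detection and correction steps of FIG. 5, as described in the text, can be sketched roughly as follows. The standard-dictionary record for word ID 102 mirrors the worked example; the word IDs 101 and 201, the function names, and the input format (tuples as a segmenter might produce them) are assumptions of this sketch.

```python
# word_id -> (standard notation_id, standard notation), as in the example.
STANDARD_DICT = {102: (2, "取扱う")}


def correct_word(surface, word_id, notation_id, standard_dict=STANDARD_DICT):
    """Return (notation to output, whether a variation was detected)."""
    if notation_id == 0:
        # Only one notation exists for this word, so no variation is possible.
        return surface, False
    std_notation_id, std_surface = standard_dict[word_id]
    if notation_id == std_notation_id:
        # Already written in the standard notation; leave as is.
        return surface, False
    # Non-standard notation: replace it with the standard one.
    return std_surface, True


def proofread(words, standard_dict=STANDARD_DICT):
    """Rebuild a sentence from segmented words, unifying notations."""
    return "".join(
        correct_word(surface, word_id, notation_id, standard_dict)[0]
        for surface, word_id, notation_id in words
    )


# Segmented form of 「薬品を取り扱う」; word IDs 101 and 201 are made up.
words = [("薬品", 101, 0), ("を", 201, 0), ("取り扱う", 102, 1)]
corrected = proofread(words)
```

Note that the comparison is done on notation IDs rather than on the strings themselves, which is how the description words it.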

In this way, any notation variation in the input sentence is automatically detected and corrected to the standard notation. In particular, because the standard notation is fixed in advance by the standard notation dictionary 4, the operator does not need to choose which notation to correct to, and notations are unified to the standard notation automatically.

The standard notations in the standard notation dictionary 4 are initially those predetermined by the system, but for some words the user may want a different notation to be the standard.

In that case, the user can change the contents of the standard notation dictionary 4 by means of the standard changing means 8.

FIG. 6 is a flowchart of the standard changing process performed by the standard changing means 8. First, the notation that is to become the new standard is input. Next, that notation is looked up in the word dictionary 3 to obtain the corresponding word identification information and notation identification information. The corresponding record in the standard notation dictionary 4 is then retrieved using the word identification information as the key, and the notation identification information and notation in that record are overwritten with the input notation and its notation identification information. FIG. 7 shows the result of inputting the notation 「取り扱う」 and making it the standard notation, starting from the initial standard notation dictionary 4 shown in FIG. 4.
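The standard changing steps can be sketched as a single update keyed on the word ID. The dictionary contents follow the worked example in this description; the function name and the error handling for words without a standard-dictionary record are assumptions of the sketch, since the text does not cover that case.

```python
# notation -> (word_id, notation_id, part of speech); labels are hypothetical.
WORD_DICT = {
    "薬品": (101, 0, "noun"),  # word ID 101 is made up
    "取り扱う": (102, 1, "verb"),
    "取扱う": (102, 2, "verb"),
}


def change_standard(standard_dict, new_notation):
    """Make new_notation the standard notation for its word."""
    # Look the input notation up in the word dictionary.
    word_id, notation_id, _pos = WORD_DICT[new_notation]
    # Locate the existing record by word ID (assumed to exist already).
    if word_id not in standard_dict:
        raise KeyError(f"no standard-notation record for word {word_id}")
    # Overwrite the record with the new notation ID and notation.
    standard_dict[word_id] = (notation_id, new_notation)


# Initial state as in the worked example: 「取扱う」 is the standard.
standard_dict = {102: (2, "取扱う")}
change_standard(standard_dict, "取り扱う")
```

After the call, the record for word ID 102 holds notation ID 1 and 「取り扱う」, which corresponds to the change the text attributes to FIG. 7.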

Effects of the Invention
As described above, the invention of claim 1 provides, in addition to the document storage means, the word dictionary, and the part-of-speech connection table, a standard notation dictionary that stores sets of word identification information, notation identification information, and notation. It also provides a word division means that divides an input sentence into words, a notation variation detection means that detects non-standard notations, and a notation variation correction means that corrects non-standard notations to the standard notation. The detection means uses the standard notation dictionary, in which the standard notations are determined in advance, to detect whether a notation variation is present, and when one is found the correction means automatically corrects it to the standard notation. Notations can therefore be unified to the standard notation without the operator having to select a notation each time a variation is detected. The invention of claim 2 adds to the invention of claim 1 a standard changing means that changes the contents stored in the standard notation dictionary. The standard notations in the dictionary can thus be changed, so the operator can make any desired notation the standard, which gives the system flexibility.

[Brief Description of the Drawings]

The drawings show an embodiment of the present invention. FIG. 1 is a block diagram; FIG. 2 shows the structure of the part-of-speech connection table; FIG. 3 shows the structure of the word dictionary; FIG. 4 shows the structure of the standard notation dictionary; FIG. 5 is a flowchart of the notation variation detection and correction process; FIG. 6 is a flowchart of the standard changing means; and FIG. 7 shows the structure of the standard notation dictionary after the change.
1: document storage means; 2: part-of-speech connection table; 3: word dictionary; 4: standard notation dictionary; 5: word division means; 6: notation variation detection means; 7: notation variation correction means; 8: standard changing means

Claims (1)

[Claims]
1. A proofreading system for Japanese text, comprising: a document storage means that stores a Japanese document; a word dictionary that stores the notation of each word together with the part-of-speech information, word identification information, and notation identification information corresponding to that notation; a part-of-speech connection table that stores whether parts of speech can be connected; a standard notation dictionary that stores sets of word identification information, notation identification information, and notation; a word division means that divides an input sentence into words; a notation variation detection means that detects non-standard notations; and a notation variation correction means that corrects non-standard notations to the standard notation.
2. The proofreading system for Japanese text according to claim 1, further comprising a standard changing means that changes the contents stored in the standard notation dictionary.
JP63210055A 1988-08-24 1988-08-24 Proofreading system for japanese sentence Pending JPH0258159A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63210055A JPH0258159A (en) 1988-08-24 1988-08-24 Proofreading system for japanese sentence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63210055A JPH0258159A (en) 1988-08-24 1988-08-24 Proofreading system for japanese sentence

Publications (1)

Publication Number Publication Date
JPH0258159A true JPH0258159A (en) 1990-02-27

Family

ID=16583067

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63210055A Pending JPH0258159A (en) 1988-08-24 1988-08-24 Proofreading system for japanese sentence

Country Status (1)

Country Link
JP (1) JPH0258159A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03255576A (en) * 1990-03-06 1991-11-14 Matsushita Electric Ind Co Ltd Method and device for correcting japanese language document
JP2887327B2 (en) * 1990-03-06 1999-04-26 松下電器産業株式会社 Japanese document proofreading device

Similar Documents

Publication Publication Date Title
JPH07325828A (en) Grammar checking system
JPH0793328A (en) Inadequate spelling correcting device
JPS61217863A (en) Electronic dictionary
JPH0258159A (en) Proofreading system for japanese sentence
JP2007122660A (en) Document data processor and document data processing program
JPS61214051A (en) Electronic dictionary
JP3945075B2 (en) Electronic device having dictionary function and storage medium storing information retrieval processing program
JPS6210763A (en) Kana to kanji conversion system
JPH05250416A (en) Registering and retrieving device for data base
JP3501240B2 (en) Document creation support device
JP3794369B2 (en) Information display device and information display processing program
JP3278889B2 (en) Machine translation equipment
JPH0612451A (en) Illustrative sentence retrieving system
JPS62212871A (en) Sentence reading correcting device
JPH0267684A (en) Calibration supporting system and dictionary retrieving system
JPH03233669A (en) Document preparing device
JPH0330048A (en) Character input device
JP2702443B2 (en) Japanese input device
JPS62256069A (en) Document processor
JPH0785040A (en) Inscription nonuniformity detecting method and kana/ kanji converting method
JP2019144840A (en) Ruby setting program and ruby setting device
JPH03144850A (en) Back-up system for proofreading of sentence
JPH06119325A (en) Word correcting device
JPH0346058A (en) Sentence reading support device
JPH01306959A (en) System for detecting error in word-separation