JPH0275059A - Error correction processor for japanese sentence - Google Patents

Error correction processor for japanese sentence

Info

Publication number
JPH0275059A
Authority
JP
Japan
Prior art keywords
characters
dictionary
character
word
correction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP63227841A
Other languages
Japanese (ja)
Inventor
Masako Mochizuki
望月 雅子
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP63227841A
Publication of JPH0275059A

Landscapes

  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)

Abstract

PURPOSE: To improve the efficiency of producing correction candidates by using a correction character dictionary that stores pairs of wrong characters and their correct candidate characters in order of how easily the characters are mistyped. CONSTITUTION: A processing part 2 includes a morpheme analyzing part 4, an error detecting part 5, and a correction candidate producing part 6. The part 4 uses a word dictionary 7 and a word connection table 8 to process the Japanese documents received from an input part 1, and the part 5 detects wrong character strings. The part 6 then performs its processing using a character connection table 9 and a correction character dictionary 10 in addition to the dictionary 7 and the table 8. With this constitution, processing is not applied uniformly to all characters; preference is given to characters that are easily mistyped, so correction candidates are produced efficiently.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application: The present invention relates to a Japanese sentence error correction processing device that detects erroneous character strings in Japanese text and presents correct word candidates.

Prior Art: Japanese error correction of this kind has conventionally been described in, for example, the paper "Automatic detection of erroneous characters in Japanese text by a word analysis program and extraction of correction candidates by a second-order Markov model" in the Journal of the Information Processing Society of Japan, Vol. 25, No. 2, Mar. 1984. In outline, that method estimates correction candidates from the chain probabilities between a candidate character and the characters before and after the erroneous part.

Problems to Be Solved by the Invention: With such a conventional method, however, all characters are treated equally, so combinations of characters that are easily confused, the operating conditions at input time, and the like cannot be reflected.

Furthermore, because the method works character by character, even a character that does not connect grammatically to the preceding word may be extracted as a candidate.

Means for Solving the Problems: In a Japanese sentence error correction processing device that detects erroneous character strings in a Japanese document by means of a morphological analysis section and an error detection section using a word dictionary and a word connection table, and that presents, for hiragana errors, the words that are the correct form of the erroneous character string, a correction character dictionary is provided that stores pairs of an error character and a correct candidate character in order of how easily the error occurs.

Operation: Among errors in Japanese text, some occur easily because of the key layout on the keyboard, forgetting to operate the shift key, and so on; conversely, some characters are rarely mistyped. The correction character dictionary takes this into account and stores pairs of an error character and a correct candidate character in order of how easily the error occurs. Processing is therefore not uniform across all characters: error-prone characters are given priority, which makes efficient creation of correction candidates possible.

Embodiment: A first embodiment of the present invention will be described with reference to FIGS. 1 to 7. FIG. 1 shows the block configuration of the Japanese sentence error correction processing device, which consists broadly of an input section 1, a processing section 2, and an output section 3. The processing section 2 consists of a morphological analysis section 4, an error detection section 5, and a correction candidate creation section 6. First, the morphological analysis section 4 processes the Japanese document from the input section 1 using a word dictionary 7 and a word connection table 8, and the error detection section 5 detects erroneous character strings. The correction candidate creation section 6, in turn, performs its processing using the word dictionary 7 and the word connection table 8 together with a character connection table 9 and a correction character dictionary 10, which is the characteristic feature of this embodiment.

A dictionary data specifying means 11 may also be provided for the correction candidate creation section 6. By entering whether the document to be processed from the input section 1 was typed using kana input or romaji input, this means switches the correction character dictionary 10.

Here, the word dictionary 7 consists, as shown in FIG. 3 for example, of a code representing the part of speech, the reading of the word, and the written form of the word.

The word connection table 8, as shown in FIG. 4 for example, represents word connections as a two-dimensional table of connections to the following word against connections from the preceding word.

As illustrated, "O" is stored if the words connect and "×" if they do not. Using this word connection table 8, the morphological analysis section 4 checks the connections between words, and the correction candidate creation section 6 narrows down the correction candidates.
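The two tables described above can be pictured with a minimal Python sketch. The part-of-speech labels (`prefix`, `suru_noun`, `noun`), the specific entries, and the helper names are assumptions made for illustration; the actual contents of FIGS. 3 and 4 are not reproduced in the text.

```python
from typing import NamedTuple

class WordEntry(NamedTuple):
    pos: str      # part-of-speech code (hypothetical labels)
    reading: str  # reading of the word in hiragana
    surface: str  # written form of the word

# Toy word dictionary; the entries are assumed for illustration only.
WORD_DICT = [
    WordEntry("prefix", "さい", "再"),
    WordEntry("suru_noun", "はっこう", "発行"),
    WordEntry("suru_noun", "はっこう", "発酵"),
    WordEntry("noun", "はっこう", "薄幸"),
]

# Toy word connection table: (POS of preceding word, POS of following word).
# True corresponds to "O" in the table, False to "×".
WORD_CONNECTION = {
    ("prefix", "suru_noun"): True,
    ("prefix", "noun"): False,
}

def lookup_by_reading(reading: str) -> list[WordEntry]:
    """Return every dictionary entry with the given reading."""
    return [entry for entry in WORD_DICT if entry.reading == reading]

def words_connect(prev: WordEntry, nxt: WordEntry) -> bool:
    """Check the connection table; unlisted pairs are treated as "×" here."""
    return WORD_CONNECTION.get((prev.pos, nxt.pos), False)
```

With this layout, narrowing candidates amounts to filtering the entries returned for a reading by `words_connect` against the word that precedes the error part.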

The character connection table 9 represents connections between characters and, like the word connection table 8, is a two-dimensional table. As shown in FIG. 5 for example, it indicates whether the character of a column can follow the character of a row. As illustrated, "O" is stored if the characters connect and "×" if they do not. This table is used mainly to check geminate consonants (sokuon) and contracted sounds (yōon).
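As a rough illustration of the character connection table, the sketch below stores a few rows as nested dictionaries. The rows, columns, and O/× values are assumptions chosen to reflect ordinary kana orthography (for example, a small 「ゃ」 can follow 「き」 but not 「は」); they are not the contents of FIG. 5.

```python
# Toy character connection table: TABLE[row][column] is True ("O") when the
# column character may follow the row character, and False ("×") otherwise.
# All values here are assumed for illustration.
TABLE = {
    "は": {"っ": True, "ゃ": False, "ぅ": False, "う": True},
    "こ": {"っ": True, "ゃ": False, "ぅ": False, "う": True},
    "き": {"っ": True, "ゃ": True, "ぅ": False, "う": True},
}

def chars_connect(prev_char: str, next_char: str) -> bool:
    """Look up whether next_char may follow prev_char.

    Pairs missing from this toy table are treated as not connecting,
    which is a simplification made for this sketch.
    """
    return TABLE.get(prev_char, {}).get(next_char, False)
```

A lookup such as `chars_connect("き", "ゃ")` then plays the role of reading the cell at row 「き」, column 「ゃ」.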

The correction character dictionary 10, which is the characteristic feature of this embodiment, is divided broadly into two sections, one for kana input and one for romaji input, as shown in FIG. 2. Characters are classified by key position on the word processor and by whether the shift key is involved, and pairs of a mistyped character (error character) and a character that is a candidate for its correction (candidate character) are arranged in order of how easily the error occurs at input time. The dictionary thus consists of the input method (kana input or romaji input), the error characters, and the candidate characters. If an erroneous character string contains an error character listed in the correction character dictionary 10, the correction candidate creation section 6 replaces it with a candidate character to create a correction candidate.
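One possible in-memory shape for such a dictionary is an ordered list of (error character, candidate characters) pairs per input method, as in the sketch below. The kana entries are assumptions (the candidates for 「ゆ」 in particular are not given in the text), and the romaji section is left empty because its contents are not described.

```python
from typing import Iterator

# Ordered from most to least error-prone; list order therefore carries meaning.
CORRECTION_DICT: dict[str, list[tuple[str, list[str]]]] = {
    "kana": [
        ("つ", ["っ", "ち"]),  # shift-key slip first, then a neighbouring key
        ("ゆ", ["ゅ"]),        # assumed entry
        ("う", ["ぅ", "あ"]),
    ],
    "romaji": [],  # contents not described in the text
}

def substitutions(error_part: str, input_method: str) -> Iterator[str]:
    """Yield strings with one error character replaced by a candidate.

    Error characters are tried in dictionary order and, for each one,
    candidate characters are tried in their stored order.
    """
    for wrong, candidates in CORRECTION_DICT[input_method]:
        for cand in candidates:
            for i, ch in enumerate(error_part):
                if ch == wrong:
                    yield error_part[:i] + cand + error_part[i + 1:]
```

This sketch only performs the substitution step; the checks against the character connection table, the word dictionary, and the word connection table would be applied to each yielded string afterwards.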

In this configuration, the outline of the error correction processing is shown in the flowchart of FIG. 6. First, the dictionary data specifying means 11 is used to select the input method, that is, whether the Japanese document to be subjected to error correction was entered by kana input or by romaji input. This is because this embodiment estimates erroneous characters from the key positions at input time, so kana input and romaji input require different methods of error correction. After the input method is selected, the device checks whether there is an input character string to be processed. If there is, the morphological analysis section 4 performs morphological analysis on the input sentence, and the error detection section 5 performs error detection on the result. The correction candidate creation section 6 then performs correction candidate creation on each character string detected as an error, and the resulting correction candidates are displayed on the screen.
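The overall flow can be sketched as a small driver loop. The function name `correct_document` and its three callable parameters are hypothetical stand-ins for the morphological analysis section 4, the error detection section 5, and the correction candidate creation section 6; their internals are not modelled here.

```python
from typing import Callable, Iterable

def correct_document(
    sentences: Iterable[str],
    input_method: str,  # "kana" or "romaji", chosen first
    analyze: Callable[[str], list],
    detect_errors: Callable[[list], list[str]],
    make_candidates: Callable[[str, str], list[str]],
) -> list[tuple[str, list[str]]]:
    """Run analysis, error detection, and candidate creation per sentence."""
    results = []
    for sentence in sentences:  # continue while input remains
        morphemes = analyze(sentence)
        for error_part in detect_errors(morphemes):
            candidates = make_candidates(error_part, input_method)
            results.append((error_part, candidates))
    return results  # in the device these would be shown on the screen
```

Passing the input method through to `make_candidates` mirrors the idea that the choice of kana or romaji input selects which part of the correction character dictionary is consulted.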

Within this processing, the processing in the correction candidate creation section 6 using the correction character dictionary 10 is explained with reference to the flowchart of FIG. 7. Processing ends when all the error characters in the correction character dictionary 10 have been examined, or when the number of candidate words has become fairly large (10 in this embodiment). Until then, the section checks, in order from the head of the correction character dictionary 10, whether the error part contains an error character listed in the dictionary. If it does not, the next error character is checked in the same way. If the error part does contain an error character from the correction character dictionary 10 and a candidate character exists for it, the character connection table 9 is used to check whether the character immediately before the error character in the error part connects to the candidate character. If it does, the character in the error part that has the same form as the error character is replaced with the candidate character.

If the word produced by the replacement is in the word dictionary 7 and, according to the word connection table 8, connects to the word immediately before the error part, that word is taken as a correction candidate word, and processing proceeds to the next candidate word.

In this way, correction candidates are created for the error part by replacing any character that matches an error character, working from the head of the correction character dictionary 10. As described above, processing ends when all error characters have been examined or when the number of candidate words reaches a fixed value.
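The candidate creation loop described above might be sketched as follows. All table contents are toy data assumed for illustration: the part-of-speech labels, the entry for the reading 「はちこう」, and the treatment of an error character at the start of the error part (no character connection check) are choices made for this sketch rather than details given in the text.

```python
MAX_CANDIDATES = 10  # limit used in this embodiment

# Ordered (error character, candidate characters) pairs for kana input.
CORRECTION_DICT = [("つ", ["っ", "ち"]), ("ゆ", ["ゅ"]), ("う", ["ぅ", "あ"])]
# Character pairs marked "×" in the character connection table.
CHAR_NOT_CONNECT = {("こ", "ぅ")}
# Word dictionary: reading -> [(part of speech, written form)].
WORD_DICT = {
    "はっこう": [("suru_noun", "発行"), ("suru_noun", "発酵"), ("noun", "薄幸"),
                 ("suru_noun", "発効"), ("suru_noun", "発光")],
    "はちこう": [("proper_noun", "ハチ公")],  # hypothetical entry
}
# Word connection table: (POS of preceding word, POS of candidate) marked "O".
WORD_CONNECT = {("prefix", "suru_noun")}

def make_candidates(error_part: str, prev_pos: str) -> list[str]:
    """Create correction candidate words for one error part."""
    found: list[str] = []
    for wrong, cand_chars in CORRECTION_DICT:  # from the head of the dictionary
        for cand in cand_chars:
            for i, ch in enumerate(error_part):
                if ch != wrong:
                    continue
                # The character before the error character must connect
                # to the candidate character.
                if i > 0 and (error_part[i - 1], cand) in CHAR_NOT_CONNECT:
                    continue
                reading = error_part[:i] + cand + error_part[i + 1:]
                for pos, surface in WORD_DICT.get(reading, []):
                    # Keep only words that connect to the preceding word.
                    if (prev_pos, pos) in WORD_CONNECT and surface not in found:
                        found.append(surface)
                        if len(found) >= MAX_CANDIDATES:
                            return found
    return found

print(make_candidates("はつこう", "prefix"))
```

Because the dictionary is scanned from its head and the loop stops at the limit, error-prone substitutions are tried first and the rarer ones may never be reached.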

As a concrete example, consider the target sentence 「定期券を再はつこうする手続き」 (intended meaning: "the procedure for reissuing a commuter pass"). When the morphological analysis section 4 analyzes this sentence, 「はつこう」 turns out to be an unregistered word and is identified as an error part by the error detection section 5. It is then passed to the correction candidate extraction processing of the correction candidate creation section 6 shown in FIG. 7. As the processing described above proceeds, the error part 「はつこう」 is found to contain the error character 「つ」 from the correction character dictionary 10, a candidate character exists, and 「は」 and 「っ」 connect, so the 「つ」 in 「はつこう」 is replaced with the first candidate character, 「っ」. The word dictionary 7 is then searched for the reading 「はっこう」. As illustrated in FIG. 3, the word dictionary 7 contains 「発行」 (issuance), 「発酵」 (fermentation), 「薄幸」 (misfortune), 「発効」 (coming into effect), 「発光」 (light emission), and so on. The word 「再」 immediately before the error part connects to sa-hen (suru-verb) nouns, so 「薄幸」, which does not connect, is deleted. The remaining words are taken as correction candidate words.

Processing then moves on to the next candidate character for the error character 「つ」 in the correction character dictionary 10: the candidate 「ち」, which follows 「っ」, is examined.

「は」 and 「ち」 can connect as characters, so 「つ」 is replaced with 「ち」. The word dictionary 7 contains a word with the reading 「はちこう」, but it does not connect to 「再」, so it is not taken as a candidate. The error character 「つ」 is examined in the same way until its candidates are exhausted, and then the next error character in the correction character dictionary 10, 「ゆ」, is examined.

「ゆ」 does not appear in the error part 「はつこう」, so processing moves on to the next error character, 「う」. 「う」 does appear in the error part 「はつこう」, so the first candidate character for the error character 「う」, namely 「ぅ」, is examined. 「こ」 and 「ぅ」 do not connect, so the next candidate character, 「あ」, is examined in the same way.

Processing continues in the same way and ends for this error part when the number of candidates exceeds 10 or when all error characters in the correction character dictionary 10 have been examined.

Displaying the correction candidates obtained in this way allows the text to be corrected to proper Japanese such as 「定期券を再発行する手続き」. Thus, because this embodiment uses the correction character dictionary 10, which stores candidates in order of error-prone characters with the key layout and the like taken into account, correction candidates can be created more efficiently than with the conventional method that processes all characters equally. Moreover, because this embodiment also checks word connections to narrow down the correction candidates, words that do not connect grammatically are not extracted as candidates, which improves accuracy.

Next, a second embodiment of the present invention will be described with reference to FIGS. 8 to 10. This embodiment adds a frequency information dictionary 12 that stores frequency information on the words in the Japanese document being processed, and uses that frequency information to assign priorities to the correct candidate words.

That is, the morphological analysis section 4 performs its processing using the word dictionary 7 and the word connection table 8, and stores the frequencies of the analyzed words in the frequency information dictionary 12. The correction candidate creation section 6 then performs its processing using this frequency information dictionary 12 together with the word dictionary 7, the word connection table 8, the character connection table 9, and the correction character dictionary 10.

As shown in FIG. 9 for example, the frequency information dictionary 12 consists of pairs of a written form and a frequency, and the frequency is incremented by one each time a word is confirmed during morphological analysis.

FIG. 10 is a flowchart showing the correction candidate creation processing of this embodiment, in which the frequency information dictionary 12 has been added. The processing is basically the same as in the previous embodiment, but when the words produced by replacement are in the word dictionary 7 and can connect to the word immediately before the error part, they are sorted in order of their frequency in the frequency information dictionary 12 and taken as the correction candidate words.

As a concrete example, consider the target text 「一度発行された定期券を紛失した場合を以下に記す。定期券を再はつこうする手続きは」 (intended meaning: "The case where a commuter pass that was once issued has been lost is described below. The procedure for reissuing the commuter pass is"). As in the previous embodiment, replacing the 「つ」 in 「はつこう」 with 「っ」 and searching the word dictionary 7 yields 「発行」, 「発酵」, 「薄幸」, 「発光」, and 「発効」. Following the check of whether the part of speech can connect, 「薄幸」, which does not connect, is deleted.

For each of the remaining words, the frequency information dictionary 12, built from the results of morphological analysis, is consulted, and the words are arranged in descending order of frequency.

Since 「発行」 has the highest frequency, it is placed at the head of the correction candidate words.
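The frequency-based ordering of the second embodiment can be sketched with a counter keyed by written form. The word segmentation passed to `record_confirmed_words` is a hypothetical result of morphological analysis, and keeping tied candidates in their original order is a choice made for this sketch.

```python
from collections import Counter

# Frequency information dictionary: written form -> frequency.
frequency: Counter[str] = Counter()

def record_confirmed_words(words: list[str]) -> None:
    """Add one to the frequency of each word confirmed during analysis."""
    frequency.update(words)

def rank_candidates(candidates: list[str]) -> list[str]:
    """Sort candidates by descending frequency.

    sorted() is stable, so candidates with equal frequency keep the
    order in which they were generated.
    """
    return sorted(candidates, key=lambda word: -frequency[word])

# Hypothetical analysis result for the first sentence of the example text.
record_confirmed_words(["一度", "発行", "定期券", "紛失", "場合"])
ranked = rank_candidates(["発酵", "発行", "発効", "発光"])
print(ranked)
```

Since 「発行」 already occurs in the document, it moves to the head of the list, while the unseen candidates keep their relative order.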

Effects of the Invention: As described above, the present invention provides a correction character dictionary that stores pairs of an error character and a correct candidate character in order of how easily the error occurs. Characters that are error-prone because of the key layout of a keyboard or the like, whether the shift key is used in combination, and so on can therefore be given priority, so correction candidate creation can be performed more efficiently.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing a first embodiment of the present invention, FIG. 2 is a configuration diagram of the correction character dictionary, FIG. 3 is a configuration diagram of the word dictionary, FIG. 4 is a configuration diagram of the word connection table, FIG. 5 is a configuration diagram of the character connection table, FIG. 6 is a flowchart showing the overall processing, FIG. 7 is a flowchart showing the correction candidate creation processing, FIG. 8 is a block diagram showing a second embodiment of the present invention, FIG. 9 is a configuration diagram of the frequency information dictionary, and FIG. 10 is a flowchart showing the correction candidate creation processing.

4: morphological analysis section, 5: error detection section, 7: word dictionary, 8: word connection table, 10: correction character dictionary

Applicant: Ricoh Co., Ltd.

Claims (1)

[Claims]

A Japanese sentence error correction processing device that detects erroneous character strings in a Japanese document by means of a morphological analysis section and an error detection section using a word dictionary and a word connection table, and that presents, for hiragana errors, the words that are the correct form of the erroneous character string, characterized in that a correction character dictionary is provided that stores pairs of an error character and a correct candidate character in order of how easily the error occurs.
JP63227841A 1988-09-12 1988-09-12 Error correction processor for japanese sentence Pending JPH0275059A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63227841A JPH0275059A (en) 1988-09-12 1988-09-12 Error correction processor for japanese sentence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63227841A JPH0275059A (en) 1988-09-12 1988-09-12 Error correction processor for japanese sentence

Publications (1)

Publication Number Publication Date
JPH0275059A true JPH0275059A (en) 1990-03-14

Family

ID=16867208

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63227841A Pending JPH0275059A (en) 1988-09-12 1988-09-12 Error correction processor for japanese sentence

Country Status (1)

Country Link
JP (1) JPH0275059A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011065384A (en) * 2009-09-16 2011-03-31 Nippon Telegr & Teleph Corp <Ntt> Text analysis device, method, and program coping with wrong letter and omitted letter

Similar Documents

Publication Publication Date Title
US5745602A (en) Automatic method of selecting multi-word key phrases from a document
EP0439743B1 (en) Constraint driven on-line recognition of handwritten characters and symbols
EP0971294A2 (en) Method and apparatus for automated search and retrieval processing
KR100798752B1 (en) Apparatus for and method of korean orthography
Uthayamoorthy et al. Ddspell-a data driven spell checker and suggestion generator for the tamil language
JPH0275059A (en) Error correction processor for japanese sentence
Cissé et al. Automatic Spell Checker and Correction for Under-represented Spoken Languages: Case Study on Wolof
JPH09153034A (en) Document preparing device and method therefor
JP4047895B2 (en) Document proofing apparatus and program storage medium
JP4318223B2 (en) Document proofing apparatus and program storage medium
JP2004206659A (en) Reading information determination method, device, and program
JPS63163956A (en) Document preparation and correction supporting device
JPH0612453A (en) Unknown word extracting and registering device
JP3273778B2 (en) Kana-kanji conversion device and kana-kanji conversion method
JP3907106B2 (en) Translation rule creation device and program
JPH0264859A (en) Text processing device
JPH07121542A (en) Machine translation system
Bol'shakov Automatic error correction in inflected languages
JP2006338682A (en) Document corrector and program storage medium
JPH0262659A (en) Extracting device for correction candidate character of japanese sentence
JPS58214931A (en) Word separating device
JPS63136264A (en) Mechanical translating device
JPH0991278A (en) Document preparation device
JPH04296970A (en) Sentence checking device
JPS6177954A (en) Kana-to-kanji conversion system