JPS63292265A - Editing system for Japanese word text data

Info

Publication number: JPS63292265A
Application number: JP62129221A
Authority: JP
Inventors: Kazuhito Furukawa; 古川　和仁
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1987-05-25
Filing date: 1987-05-25
Publication date: 1988-11-29

Abstract

PURPOSE: To record a large quantity of Japanese text data other than ruled-line characters in an output file of limited capacity, by compressing only the ruled-line character strings, which repeat the same character many times, within data that contains many character types, such as Japanese text data. CONSTITUTION: Instructions for input, editing, and output (to an output file) of the Japanese text data are given from a keyboard 1. A memory buffer 2 temporarily stores data supplied from the keyboard 1. A comparison part 3 compares a ruled-line character extracted from the input Japanese text data with other data to extract identical ruled-line characters, and also decides whether the character string should be compressed. A file output part 4 outputs the input data to an output file. An editing part 5 edits the input Japanese text data for display on a display part 6 and for output to the output file, and also compresses the character strings. A large quantity of data can thus be output to a file of limited capacity.

Description

Detailed Description of the Invention (Field of Industrial Application) The present invention relates to text data editing for Japanese language processing. In particular, it relates to a Japanese text data editing method for text data that uses a large amount of ruled-line character data in addition to character data including kanji, in which only the ruled-line character strings, which often consist of repetitions of the same character, are compressed by run-length encoding before output.

(Prior Art) Conventionally, this type of Japanese text data editing method stores ruled-line character data input from the keyboard in an input buffer in the same way as other character data, displays it sequentially, and outputs it to an output file.

(Problems to be Solved by the Invention) The conventional Japanese text data editing method described above outputs ruled-line character data, which often consists of repetitions of the same character, to the output file as is. Consequently, when the ruled-line character data contains many repetitions of the same character, much of the output file is occupied by identical character data, and only a small amount of data can be output to a file of limited capacity.

(Means for Solving the Problems) To solve the above problem, the Japanese text data editing method provided by the present invention is characterized in that, for a run of a predetermined number or more of consecutive identical characters, it applies data compression that encodes an indication of repetition, the repeated character, and the number of repetitions, and then outputs the result to a file. To achieve this, the present invention comprises: means for inputting Japanese text data; means for storing the Japanese text data input by the input means; means for extracting, from the Japanese text data stored in the storage means, a run of a predetermined number or more of consecutive identical characters; means for editing the Japanese text data by compressing the identical character string extracted by the extracting means; means for displaying the Japanese text data edited by the editing means on a display; and means for outputting this Japanese text data to a file.

(Embodiment) An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram of a Japanese text data editing method according to an embodiment of the present invention.

The keyboard 1 is used to input Japanese text data and to give instructions for editing and for output to an output file. The memory buffer 2 temporarily stores data input from the keyboard 1. The comparison unit 3 compares a given ruled-line character in the input Japanese text data with the other data to extract identical ruled-line characters, and also determines whether data compression should be applied to the character string. The file output unit 4 outputs the input data to the output file. The editing unit 5 performs the editing needed to display the input Japanese text data on the display unit 6, the editing needed to output it to the output file, and the data compression of character strings.

The display unit 6 displays the input data.

FIG. 2 is a configuration diagram of the compressed data obtained by compressing a ruled-line character string according to this embodiment, that is, the character string after run-length encoding.

Cs is the data compression indicator code, which shows that the three encoded characters (Cs, C11, and Cc) are the result of data compression. C11 is the code of the ruled-line character that was compressed, and Cc is the number of ruled-line characters that were compressed.
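The three-symbol record of FIG. 2 can be illustrated with a minimal Python sketch. The value chosen for the indicator code `CS` and the helper names `pack_run` and `unpack_run` are hypothetical; the patent does not state the actual code value or how the count is represented.

```python
# Hypothetical indicator value; the patent does not give the real code for Cs.
CS = "\x1b"

def pack_run(char: str, count: int) -> tuple:
    """Build the three-symbol record: indicator, ruled-line character, count."""
    return (CS, char, count)

def unpack_run(record: tuple) -> str:
    """Expand a three-symbol record back into the original run."""
    indicator, char, count = record
    if indicator != CS:
        raise ValueError("record does not start with the compression indicator")
    return char * count

record = pack_run("─", 10)      # ten horizontal ruled-line characters
restored = unpack_run(record)   # the original run of ten characters
```

Whatever the length of the run, the record always occupies three symbols, which is why short runs gain nothing from being encoded.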

Next, the operation of this embodiment is described in detail.

FIG. 3 is a flowchart showing the operation of character string data compression in this embodiment.

In the initial state, the ruled-line character counter, which identifies the first ruled-line character, and the character counter, which accumulates the number of identical ruled-line characters, are set to "0" (Steps 11 and 12). When Japanese text data is input from the keyboard 1, it is temporarily stored in the memory buffer 2 and displayed on the screen by the display unit 6 (Step 13). The comparison unit 3 determines whether input of the data has finished, and if so, processing ends (Steps 14 and 15). It is then determined whether the input Japanese text data is a ruled-line character; if it is not, the process returns to Step 11 without executing the character string data compression processing (Step 16).

If the input Japanese text data is a ruled-line character, whether it is the first ruled-line character is determined by whether the ruled-line character counter is "0" (Step 17). If it is the first ruled-line character, "1" is added to the ruled-line character counter in Step 18, the code of this ruled-line character is stored in the comparison unit 3 in Step 19, and the process returns to Step 13 for input of the next Japanese text data.

If the next input Japanese text data is a ruled-line character, it is checked against the ruled-line character code stored in the comparison unit 3 to detect whether this second ruled-line character is the same as the first (Step 20). If the first and second ruled-line characters do not match, the process proceeds to Step 19, the code of the second ruled-line character is stored in the comparison unit 3, and input processing of the third Japanese text data follows. If they match, "1" is added to the character counter (Step 21). When a character string is compressed by run-length encoding, three characters are required, as shown in FIG. 2. The number of identical ruled-line characters is therefore checked in Step 22; if it is three or fewer, the process returns to Step 13 without compressing the character string. Data compression processing is performed only when the number of identical ruled-line characters is four or more (Step 23).
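The four-character threshold can be illustrated with a short Python sketch. It is not the flowchart of FIG. 3 itself: it groups runs with `itertools.groupby` rather than with the counters described above, and it assumes that ruled-line characters are those in the Unicode Box Drawing block (U+2500 to U+257F), whereas the original system would have used its own character codes. The names `CS`, `MIN_RUN`, `is_ruled_line`, and `compress` are hypothetical.

```python
from itertools import groupby

CS = "CS"     # hypothetical stand-in for the compression indicator code
MIN_RUN = 4   # runs of three or fewer would not shrink, so they are left alone

def is_ruled_line(ch: str) -> bool:
    """Assumption: ruled-line characters are the Unicode Box Drawing block."""
    return "\u2500" <= ch <= "\u257f"

def compress(text: str) -> list:
    """Replace runs of four or more identical ruled-line characters by a triple."""
    out = []
    for ch, group in groupby(text):
        n = len(list(group))
        if is_ruled_line(ch) and n >= MIN_RUN:
            out.append((CS, ch, n))   # indicator, character, run length
        else:
            out.extend(ch * n)        # emit each character unchanged
    return out

tokens = compress("┌────┐")   # ['┌', ('CS', '─', 4), '┐']
```

Runs of other characters, such as repeated kana or kanji, are deliberately left untouched, because only ruled-line strings are compressed in this method.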

If Japanese text data other than ruled-line characters immediately follows a ruled-line character string, it is output after the run-length encoded data. The Japanese text data, including the compressed ruled-line character strings, is output to the output file by the file output unit 4.
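The character-by-character processing, including the point at which a pending run is written out before following text, might be sketched in Python as below. This is a loose reading of the flowchart rather than a transcription of it: the variable `held` stands for the code stored in the comparison unit 3, `count` stands for the run length being accumulated, and the Unicode Box Drawing range used to recognise ruled-line characters is an assumption. All names are hypothetical.

```python
def is_ruled_line(ch: str) -> bool:
    """Assumption: ruled-line characters are the Unicode Box Drawing block."""
    return "\u2500" <= ch <= "\u257f"

def stream_compress(chars, min_run: int = 4) -> list:
    out = []
    held = None   # ruled-line character currently held for comparison
    count = 0     # number of consecutive occurrences of `held`

    def flush():
        """Write out the pending run, encoded only if it is long enough."""
        nonlocal held, count
        if held is not None:
            if count >= min_run:
                out.append(("CS", held, count))
            else:
                out.extend(held * count)
        held, count = None, 0

    for ch in chars:
        if not is_ruled_line(ch):
            flush()            # pending run goes out first ...
            out.append(ch)     # ... then the ordinary character follows it
        elif ch == held:
            count += 1         # same ruled-line character: extend the run
        else:
            flush()            # different ruled-line character: start a new run
            held, count = ch, 1
    flush()                    # end of input: write out any remaining run
    return out

tokens = stream_compress("┌────┐abc")
```

Flushing before appending an ordinary character is what keeps the encoded run ahead of the text that follows it in the output.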

(Effects of the Invention) As described above, the present invention compresses only the ruled-line character string data, which uses the same character many times, within data that has many character types, such as Japanese text data. This has the effect of allowing more Japanese text data other than ruled-line characters to be recorded in an output file of limited capacity.
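The saving for a single run can be estimated with a small Python sketch. It counts symbols rather than bytes, because the patent does not give byte widths, and it assumes the run length fits in the single count symbol of FIG. 2. The names `ENCODED_LEN` and `symbols_saved` are hypothetical.

```python
ENCODED_LEN = 3  # indicator code + ruled-line character code + count (FIG. 2)

def symbols_saved(run_length: int) -> int:
    """Symbols saved by encoding one run of identical ruled-line characters."""
    if run_length <= ENCODED_LEN:
        return 0  # runs of three or fewer are output uncompressed
    return run_length - ENCODED_LEN

# A 40-character horizontal border shrinks from 40 symbols to 3.
saving = symbols_saved(40)
```

Under these assumptions, a table with many long border lines frees most of the space those lines would otherwise occupy for ordinary text.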

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a Japanese text data editing method according to an embodiment of the present invention, FIG. 2 is a configuration diagram of compressed data obtained by compressing a ruled-line character string according to this embodiment, and FIG. 3 is a flowchart showing the operation of character string data compression in this embodiment.

1...keyboard, 2...memory buffer, 3...comparison unit, 4...file output unit, 5...editing unit, 6...display unit.

Claims

[Scope of Claims] A Japanese text data editing method comprising: means for inputting Japanese text data; means for storing the Japanese text data input by the input means; means for extracting, from the Japanese text data stored in the storage means, a run of a predetermined number or more of consecutive identical characters; means for editing the Japanese text data by compressing the identical character string extracted by the extracting means; means for displaying the Japanese text data edited by the editing means on a display; and means for outputting this Japanese text data to a file, wherein the data compression performed by the editing means encodes an indication of repetition, the repeated character, and the number of repetitions for the identical character string.