JPH0451327A

JPH0451327A - Character string processing method

Info

Publication number: JPH0451327A
Application number: JP2159684A
Authority: JP
Inventors: Shigeru Suzuki; 茂鈴木
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1990-06-20
Filing date: 1990-06-20
Publication date: 1992-02-19

Abstract

PURPOSE: To recognize the end of a character string one word at a time, thereby simplifying and speeding up processing, by appending a zero bit string to the end of the character string so that at least one word of data in which all bits are zero is always present there. CONSTITUTION: A character string 11 is generated by a character string generating part 1, and a zero bit string 13 of fixed length is always appended to the end of character string 11 by a zero bit string generating part 12. When such a character string is processed, operations such as copying are executed in word units, that is, in 4-byte units in this example. Each time data is read during such an operation, it is checked, one word at a time, whether the data is zero. In this way, the end of the character string is recognized.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a character string processing method that makes it easier to recognize the end of character data used in an information processing device.

(Prior Art) When an information processing device uses character data, character strings are processed either at a predetermined fixed length or at an arbitrary, indefinite length. In the latter case, a special code of about one byte is appended to the end of the character string so that the end of the character data can be recognized. A typical example of such a code is one in which all bits are zero. In this case, the end of the character data has conventionally been recognized and handled as follows.

FIG. 2 is an explanatory diagram of the operation of the conventional character string processing method.

In the figure, the character data to be recognized consists of a character string 11 of six characters, a through f, and a one-byte zero bit string 12 in which all bits are zero. This zero bit string 12 is written as ¥0 in the figure.

In this example, each character is represented by one or two bytes of data, and the information processing device processes character data in units of one word, which is 4 bytes. The information processing device therefore fetches the character data one word (4 bytes) at a time and performs processing such as copying.

In parallel with this copy processing, each byte of the character data is checked to determine whether it is the ¥0 zero bit string 12. When the ¥0 zero bit string 12 is found, the end of the character string is taken to have been recognized, and the copy processing ends.
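The conventional scheme described above can be sketched roughly as follows. This is a minimal Python illustration, not code from the patent: the function name `copy_conventional` and the comparison counter are assumptions added to show that every byte of every word is inspected.

```python
WORD = 4  # bytes per word, as in the example above


def copy_conventional(src: bytes):
    """Fetch data one word at a time, but test each byte for the terminator.

    Returns the copied characters and the number of byte comparisons made.
    """
    out = bytearray()
    comparisons = 0
    for pos in range(0, len(src), WORD):
        word = src[pos:pos + WORD]      # data is fetched in word units
        for byte in word:               # but the end test is per byte
            comparisons += 1
            if byte == 0:               # the one-byte terminator (written as ¥0 in the figure)
                return bytes(out), comparisons
            out.append(byte)
    raise ValueError("no terminating zero byte found")


copied, comparisons = copy_conventional(b"abcdef\x00")
```

For the six-character string of FIG. 2, this sketch performs seven byte comparisons for two words of data, which is the per-byte overhead the invention aims to remove.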

(Problem to be Solved by the Invention) In the conventional method described above, the processor itself groups several bytes into one word and processes character data word by word, yet the end of the character string is recognized byte by byte. Consequently, finding the end of a character string always requires comparing parts of each word, which takes time and effort and lowers processing efficiency.

The present invention was made in view of the above, and its object is to provide a character string processing method that can efficiently recognize the end of a character string in accordance with the length of one word handled by the information processing device.

(Means for Solving the Problem) The character string processing method of the present invention is characterized in that a zero bit string of fixed length is appended so that, when a character string of indefinite length is divided into one-word units, at least one word of data in which all bits are zero exists at the end of the character string, and in that, when one word near the end of the character string is entirely zero, that word is recognized as the end of the indefinite-length character string.

(Operation) According to the character string processing method of the present invention, a zero bit string of fixed length, set in advance in consideration of the word length and other factors, is always appended to the end of the character string. That is, if the zero bit string is appended so that one word of data in which all bits are zero is always present when the character string is read one word at a time, the end of the character string can be recognized by finding that word.

(Example) The present invention is described in detail below using the embodiment shown in the drawings.

FIG. 1 is an explanatory diagram of the operation of the method of the present invention.

According to the method of the present invention shown in the figure, a character string 11 is first generated by a character string generator 1, and a zero bit string 13 of fixed length is always appended to the end of character string 11 by a zero bit string generator 12.

Consider, for example, an information processing device that treats 4 bytes (32 bits) as one word and processes data one word at a time. In this case, some characters are represented by one byte and others by two bytes. The example in the figure shows a character string 11 made up only of one-byte characters. That is, character string 11 consists of the codes for the characters a, b, c, d, e, f, g, h, i, and j. The zero bit string 13 appended to this character string 11 consists of seven consecutive bytes of ¥0, the one-byte datum in which all bits are zero.
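As a rough illustration of the generation step, the following Python sketch appends the fixed seven-byte zero bit string to a character string. The names `terminate`, `WORD`, and `PAD` are illustrative assumptions, and the check that the characters contain no zero byte is an added safeguard rather than something the patent states.

```python
WORD = 4  # bytes per word (32 bits) in this embodiment
PAD = 7   # fixed number of zero bytes appended in this embodiment


def terminate(chars: bytes) -> bytes:
    """Return chars followed by the fixed-length zero bit string."""
    if 0 in chars:
        # Assumption: character codes themselves never contain a zero byte.
        raise ValueError("character data must not contain a zero byte")
    return chars + bytes(PAD)


string_11 = terminate(b"abcdefghij")
```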

When such a character string is processed, operations such as copying are performed in word units, that is, in 4-byte units in this example.

Each time data is read during such an operation, it is checked, one word at a time, whether the data is zero. This is how the end of the character string is recognized.

In the example in the figure, after the word containing i and j is copied, one word (4 bytes) of ¥0 data is read. When this word is determined to be all zeros, the copy processing ends.
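The word-by-word end test for this example can be sketched as follows. This is an illustrative Python fragment under the assumption that reading starts at the first character; `find_terminating_word` is a hypothetical helper name, and big-endian byte order is chosen arbitrarily because it does not affect a comparison with zero.

```python
WORD = 4  # bytes per word in this example


def find_terminating_word(data: bytes, word: int = WORD) -> int:
    """Return the index of the first all-zero word, reading one word at a time."""
    for index, pos in enumerate(range(0, len(data) - word + 1, word)):
        # One comparison per word: the whole word is treated as a single integer.
        if int.from_bytes(data[pos:pos + word], "big") == 0:
            return index
    raise ValueError("no all-zero word found")


data = b"abcdefghij" + bytes(7)
words = [data[pos:pos + WORD] for pos in range(0, len(data), WORD)]
end_index = find_terminating_word(data)
```

Here the third word holds i, j, and two zero bytes, and the fourth word is the all-zero word at which processing stops.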

FIG. 3 shows a flowchart of a specific operation of the method of the present invention.

This example describes an operation that copies a character string of arbitrary length structured as shown in FIG. 1 and, once copying is complete, appends a zero bit string as shown in FIG. 1.

First, in step S1, the start address of the copy-source character string is loaded into a first register (not shown). This first register specifies the address from which the copy-source character string is read one word at a time.

Next, in step S2, the copy-destination address is loaded into a second register (not shown).

The copy destination is a memory or the like (not shown), and the second register specifies its write address.

Next, in step S3, one word of the character string is loaded from the address held in the first register into a third register (not shown). This third register is used as a buffer for copying the character string.

Then, in step S4, it is determined whether the contents of the third register are all zeros.

If the contents of the third register are not all zeros, the word is considered to contain characters that need to be copied. In step S5, therefore, the word in the third register is copied to the address held in the second register. Then 4 is added to the contents of the first register and of the second register (step S6). This advances both addresses by one word, and processing moves on to reading and copying the next word.

Data made up of a large number of characters is copied by repeating steps S3 through S6.

When the contents of the third register are determined in step S4 to be all zeros, processing moves to step S7.

In step S7, seven bytes of ¥0 data are appended to the end of the copy-destination character string. That is, in step S4 the data to be copied is monitored one word at a time, and when a word is an all-zero bit string it is recognized as the end of the character string and post-processing is executed. Here, the post-processing appends seven bytes of ¥0 data.
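Steps S1 through S7 can be sketched in Python as below. This is only an illustration under stated assumptions: memory is modeled as a `bytearray`, the three registers as local variables `r1`, `r2`, and `r3`, and the function name `copy_string` is hypothetical. The sketch also assumes the buffer is large enough and that a zero word is present, as the method guarantees.

```python
WORD = 4  # bytes per word
PAD = 7   # zero bytes appended in post-processing


def copy_string(memory: bytearray, src: int, dst: int) -> int:
    """Copy a zero-word-terminated string inside memory; return the end address."""
    r1 = src                                                # S1: source address
    r2 = dst                                                # S2: destination address
    while True:
        r3 = int.from_bytes(memory[r1:r1 + WORD], "big")    # S3: load one word
        if r3 == 0:                                         # S4: all bits zero?
            break
        memory[r2:r2 + WORD] = r3.to_bytes(WORD, "big")     # S5: store the word
        r1 += WORD                                          # S6: advance both
        r2 += WORD                                          #     addresses by one word
    memory[r2:r2 + PAD] = bytes(PAD)                        # S7: append 7 zero bytes
    return r2


memory = bytearray(b"\xff" * 64)
memory[0:17] = b"abcdefghij" + bytes(7)
end = copy_string(memory, 0, 32)
```

In this run three words are copied (the last one carrying i, j, and two zero bytes), the fourth word read is all zeros, and seven zero bytes are then written after the copied words.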

In the method of the present invention, a zero bit string of fixed length is always appended to the end of a character string.

The above method could be carried out by appending an arbitrary zero bit string that is sufficiently long. That, however, would make the data needlessly long and waste memory.

The present invention therefore appends a zero bit string of the minimum necessary fixed length.

FIG. 4 shows explanatory diagrams of various operations of the method of the present invention.

In the method of the present invention, the end of a character string of indefinite length is recognized in units of one word rather than one byte. To this end, a zero bit string is appended to the end of the character string with a fixed length sufficient to guarantee that, when the character string is divided into one-word lengths, at least one word of data in which all bits are zero exists at the end of the character string regardless of its length.

In the above embodiment, one word is 32 bits and one-byte and two-byte characters are mixed, so the zero bit string at the end of the character string must be at least 7 bytes long.

FIG. 4 shows explanatory diagrams of various operations of the method of the present invention that illustrate the reason.

FIGS. 4(a) through 4(d) show the state near the end of the character string when copying is started from an arbitrary position in the character string shown in FIG. 1.

In FIG. 4(a), the last word of the character string consists of the character codes g, h, i, and j, followed by seven bytes of ¥0 data. In this case, the end of the character string is recognized from the one word (4 bytes) of ¥0 that follows the character j, and the remaining three bytes of ¥0 are ignored.

FIG. 4(b) shows the state in which the last character j of the character string and three bytes of ¥0 together form one word, followed by a further four bytes of ¥0. In this state, the end of the character string is recognized from the final one word (4 bytes) of ¥0.

Comparing the two, FIG. 4(a) wastes the maximum of three bytes of ¥0, whereas FIG. 4(b) wastes none.

FIGS. 4(c) and 4(d) show intermediate states: in FIG. 4(c) one byte of ¥0 is wasted, and in FIG. 4(d) two bytes of ¥0 are wasted. In every case, however, the presence of one word (4 bytes) of ¥0 is guaranteed, so the end of the character string can always be recognized.
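The four alignment cases can be checked with a short Python sketch. It is illustrative only: `wasted_padding` is a hypothetical helper, and the start offsets 0 to 3 are chosen simply because they reproduce the four wasted-byte counts (1, 0, 3, 2); the positions actually drawn in FIG. 4 are not known from the text.

```python
WORD = 4  # bytes per word


def wasted_padding(data: bytes, start: int, word: int = WORD):
    """Zero bytes left over after the first all-zero word, reading from start.

    Returns None when no complete all-zero word exists.
    """
    for pos in range(start, len(data) - word + 1, word):
        if data[pos:pos + word] == bytes(word):
            return len(data) - (pos + word)
    return None


padded7 = b"abcdefghij" + bytes(7)   # padding used in the embodiment
padded6 = b"abcdefghij" + bytes(6)   # one byte fewer, for comparison
wasted = [wasted_padding(padded7, start) for start in range(4)]
missing = wasted_padding(padded6, 1)
```

With seven zero bytes every start offset finds a complete zero word, while with six the case in which j is the first byte of its word leaves only three zero bytes after that word, so no terminating word is found.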

It follows that, in the above embodiment, at least seven bytes of ¥0 must be appended to the end of the character string. The number of zero bytes to append depends on the word length handled by the information processing device.

If one word is 2 bytes, or 6 bytes, 8 bytes, and so on, the length of the zero bit string differs accordingly.
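The text gives only the value 7 for a 4-byte word. As an inference that is not stated in the source, if strings may be any number of bytes long, the worst case is a final character occupying the first byte of a word, which suggests a minimum padding of 2W − 1 bytes for a word of W bytes. The Python sketch below checks this by brute force; `guarantees_zero_word` and `min_padding` are hypothetical names.

```python
def guarantees_zero_word(word: int, pad: int) -> bool:
    """True if pad zero bytes ensure a full all-zero word for every string length tried."""
    for length in range(1, 4 * word + 1):
        data = b"x" * length + bytes(pad)
        found = any(
            data[pos:pos + word] == bytes(word)
            for pos in range(0, len(data) - word + 1, word)
        )
        if not found:
            return False
    return True


def min_padding(word: int) -> int:
    """Smallest number of zero bytes that passes the check (assumed model, not from the patent)."""
    pad = 0
    while not guarantees_zero_word(word, pad):
        pad += 1
    return pad


results = {word: min_padding(word) for word in (2, 4, 6, 8)}
```

Under this assumption the result for a 4-byte word agrees with the seven bytes used in the embodiment; the values for other word lengths are this sketch's output, not figures given in the patent.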

Depending on the hardware configuration of the information processing device and similar factors, a bit of value zero may freely be made to correspond to either a high-level signal or a low-level signal.

(Effects of the Invention) According to the character string processing method of the present invention described above, in an information processing device that handles character strings in units of one word, a zero bit string is appended so that at least one word of data in which all bits are zero always exists at the end of the character string. The data can therefore be processed one word at a time and its contents checked one word at a time to recognize the end of the character string. Compared with the conventional method of recognizing the end one byte at a time, processing is thus simpler and faster. This makes it possible to execute word-unit operations on character strings, such as copying and comparison, at higher speed.

[Brief Description of the Drawings]

FIG. 1 is an explanatory diagram of the operation of the method of the present invention, FIG. 2 is an explanatory diagram of the operation of the conventional method, FIG. 3 is a flowchart of the method of the present invention, and FIG. 4 shows explanatory diagrams of various operations of the method of the present invention. 11: character string; 13: zero bit string.

Claims

A character string processing method characterized in that a zero bit string of fixed length is appended so that, when a character string of indefinite length is divided into one-word units, at least one word of data in which all bits are zero exists at the end of the character string, and in that, when one word near the end of the character string is all zeros, that word is recognized as the end of the indefinite-length character string.