JPH07239772A

JPH07239772A - Character code conversion device

Info

Publication number: JPH07239772A
Application number: JP6051118A
Authority: JP
Inventors: Yoshiyuki Sano; 義幸佐野
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1994-02-25
Filing date: 1994-02-25
Publication date: 1995-09-12

Abstract

PURPOSE: To enable conversion into correct print data even in an environment where different code systems are mixed, by treating character data that was held while determination was pending as belonging to the character code system identified afterwards, and converting the held data accordingly. CONSTITUTION: When a code system discriminating means 11 determines the character code system of input character data, the data is converted into a specific character code and sent to a printer 15, and the type of the determined code system is stored in a storage means 12. When the character code system of the input character data cannot be identified, an adoption means 111 assumes the data to be in the character code system stored in the storage means 12 and has a conversion means 13 convert the character data. When, after determination has been put on hold, the character data is determined to be in a specific character code system, an on-hold processing means 16 judges that the character data held in the meantime is in that code system and has the conversion means convert the character data in a buffer means 17.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application]

The present invention relates to a character code conversion device that converts input character data according to a predetermined character code system.

[0002]

[Prior Art]

In recent years, network environments in which a plurality of information processing devices are connected have come into use, and a printer connected to the network has become available as a shared resource to the users of those devices. Moreover, in recent network systems it has become common to connect machines running different operating systems to the same network, for example UNIX machines (information processing devices running the operating system researched and developed at AT&T Bell Laboratories; UNIX is a trademark) and MSDOS machines (information processing devices running the operating system researched and developed by Microsoft; MSDOS is a trademark). Because UNIX machines and MSDOS machines are widely available and well known, these two machine names are used below as examples of information processing devices running different operating systems. In an environment where UNIX machines and MSDOS machines are connected, a shared printer requires text file data to be exchanged between different machines. Japanese character data in text files stored on a UNIX machine normally uses the extended Unix code system (labeled "EUC code" in the drawings). Japanese character data in text files stored on an MSDOS machine uses the Shift JIS code system (labeled "SJIS code" in the drawings).

[0003]

FIG. 16 is a diagram illustrating a state in which a UNIX machine and an MSDOS machine are connected to a network. In FIG. 16, an MSDOS machine 162, a UNIX machine 163, and printers 164 and 165 are connected to a network 161. An MSDOS machine 168 is connected to the printer 164, and a UNIX machine 169 is connected to the printer 165. Further, a Shift JIS code file 166 is connected to the MSDOS machine 162, and an extended Unix code file 167 is connected to the UNIX machine 163. For example, in a network system in which the MSDOS machine 162 and the UNIX machine 163 are connected to the same network 161 as shown in FIG. 16, suppose a file stored in the extended Unix code file 167 of the UNIX machine 163 is printed from the MSDOS machine 162. The file, which was created in the extended Unix code system (EUC code system) on the UNIX machine 163, is judged to be in the Shift JIS code system, converted with the wrong code system, and sent to the printer, so it cannot be printed correctly.

[0004]

For this reason, to print a text file the user had to check the character codes in the file with an editor or other tool and then send the file to the printer through the corresponding character code conversion tool. The "Japanese character code conversion method" of Japanese Unexamined Patent Publication No. 4-273520 was an attempt to solve this problem. The method described in that publication can automatically determine whether data is in the JIS code system or the extended Unix code system, and the data is converted according to the determination result.

[0005]

[Problems to Be Solved by the Invention]

However, even with a conversion method that can determine the character code system automatically, there are character codes for which it is impossible to tell in which code system the text data is written, as is the case between the Shift JIS code system and the extended Unix code system. For example, hexadecimal "A4A2" in the Shift JIS code system is the two half-width katakana characters "､｢". In the extended Unix code system, the same code is the single full-width hiragana character "あ". The code is recognized as valid characters in both code systems. Consequently, for a file consisting only of hexadecimal "A4A2", it cannot be determined whether the character codes are in the Shift JIS code system or the extended Unix code system. Furthermore, as described in "Japanese Code Systems and Their Characteristics" (Nikkei Infobase UNIX '92 edition, Nihon Keizai Shimbun, pp. 336 to 341), as network environments spread, problems caused by differences among the various Japanese codes, such as the handling of different Japanese codes in e-mail and network file systems, have come to the surface.
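The ambiguity can be reproduced with a short Python sketch. It assumes Python's built-in `shift_jis` and `euc_jp` codecs as stand-ins for the two code systems; the variable names are illustrative and do not come from the patent.

```python
# Decode the same two bytes under both code systems (codec names are
# Python's, used here as stand-ins for Shift JIS and extended Unix code).
data = bytes.fromhex("A4A2")

as_sjis = data.decode("shift_jis")  # read as two single-byte half-width characters
as_euc = data.decode("euc_jp")      # read as one double-byte character

print(len(as_sjis), [f"U+{ord(c):04X}" for c in as_sjis])
print(len(as_euc), as_euc)
```

Both decodes succeed, so validity alone cannot tell the two systems apart for this input.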

[0006]

The present invention is intended to solve the above problems. Its object is to provide a character code conversion device that automatically determines the code system of files written in different code systems and, regardless of the code system, converts them into print data that is as correct as possible and sends it to a printer.

[0007]

[Means for Solving the Problems]

(First Invention) To achieve the above object, the character code conversion device of the present invention comprises: code system discriminating means (11 in FIG. 1) for determining which of a plurality of character code systems input character data is in; character code conversion means (13, 14 in FIG. 1) for converting the character data into a predetermined character code based on the determination result; storage means (12 in FIG. 1) for storing the type of character code system given by the determination result; adoption means (111 in FIG. 1) for adopting the determination result held in the storage means (12) when the code system discriminating means (11) cannot identify the type of character code system; buffer means (17, 18 in FIG. 1) for storing character data input while determination is on hold, when the adoption means (111) cannot adopt a past determination result; and on-hold processing means (16 in FIG. 1) which, when the code system discriminating means (11) determines after the hold begins that the character data is in a specific character code system, has the conversion means (13, 14) convert the character data in the buffer means (17, 18) as being in the identified character code system.

[0008]

(Second Invention) The character code conversion device of the present invention is further characterized by comprising code system discriminating means (11) which, when determining which of a plurality of character code systems input Japanese character data is in, and when the character code system of that data cannot be identified, excludes (exclusion means 112 in FIG. 1) any type of character code that would yield a character combination not permissible as Japanese, and takes the remaining type of character code as the determination result.

[0009]

[Operation]

(First Invention) The code system discriminating means determines which of a plurality of character code systems the input character data is in. When the character code system of the input character data has been determined, the character code conversion means converts the data into a predetermined character code, and the converted character code is sent to the printer. The type of code system determined by the code system discriminating means is stored in the storage means. When the character code system of the input character data cannot be identified, the adoption means has the conversion means convert the character data as being in the character code system stored in the storage means. When the code system discriminating means cannot identify the character code system of the input character data and the adoption means cannot adopt an identified character code system from the storage means, the character data input while determination is on hold is stored in the buffer means. When, after the hold begins, the code system discriminating means determines that the character data is in a specific character code system, the on-hold processing means judges that the held character data is in that character code system and has the conversion means convert the character data in the buffer means.

[0010]

(Second Invention) The code system discriminating means determines which of a plurality of character code systems the input Japanese character data is in. When the character code system of the input Japanese character data cannot be identified, the code system discriminating means refers to, for example, the kanji code regions of the Japanese codes. If one type of character code would yield a character combination that is not permissible as Japanese, the exclusion means within the code system discriminating means excludes it, the data is judged to be in the remaining type of character code, and processing is performed to have the conversion means convert the character data. As described above, the character code conversion device of the present invention can automatically determine the code system of files written in different character code systems and, regardless of the character code system, convert them into print data that is as correct as possible and send it to the printer.

[0011]

[Embodiment]

FIG. 1 is a schematic block diagram for explaining one embodiment of the present invention. In FIG. 1, the character code conversion device comprises: code system discriminating means 11 for determining the character code system of input data; code system storage means 12 in which the type of code system determined by the code system discriminating means 11 is stored; Shift JIS code conversion means 13 for converting input data according to the Shift JIS code system; extended Unix code conversion means 14 for likewise converting input data according to the extended Unix code system; a printer 15 for printing unconverted or converted character codes; on-hold processing means 16 which, when the code system discriminating means 11 cannot identify the input data and cannot adopt the type of character code determined so far, holds the character data until the character code system is identified and, once it is identified, processes the held character data according to that character code system; a ring buffer memory 17 used when determining the character code system and for holding character data; and a buffer memory 18 for temporarily saving the character data stored in the ring buffer memory 17. The code system discriminating means 11 further includes adoption means 111 which, when the type of character code system cannot be identified, adopts the determination result obtained so far, and exclusion means 112 which, when the character code system of input Japanese character data cannot be identified, excludes characters that would form a character combination not permissible as Japanese.

[0012]

Before the operation of the schematic block diagram of FIG. 1 is described, the character code systems are explained. FIG. 2 is a diagram for explaining the kanji code regions of the Japanese character codes. Japanese character codes include codes represented by one byte, such as alphanumeric characters, and codes represented by two bytes, such as kanji. FIG. 2 shows the regions of each character code system with the first byte on the vertical axis and the second byte on the horizontal axis. For example, when distinguishing between character codes of the Shift JIS code system and the extended Unix code system (EUC code) in FIG. 2, a byte below hexadecimal "80" can be recognized at that point as the same alphanumeric character code in both systems, so the following byte need not be examined. To determine a Japanese character code, however, at least the first and second bytes of the input must be examined to check the kanji code. When two bytes are read in sequence in this way, the character codes that can occur in each character code system appear as code regions, as shown in FIG. 2.
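The reading rule described in this paragraph can be sketched as follows. This is a simplification that, like the paragraph, treats every byte at or above hexadecimal 80 as the first byte of a two-byte unit; `split_units` is a hypothetical name, not part of the patent.

```python
from typing import Iterator


def split_units(data: bytes) -> Iterator[bytes]:
    """Yield one-byte units for bytes below 0x80 and two-byte units otherwise.

    Simplifying assumption: any byte >= 0x80 starts a two-byte unit, which is
    how the paragraph describes reading the first and second bytes together.
    """
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            # Alphanumeric byte: identical in both systems, no look-ahead needed.
            yield data[i:i + 1]
            i += 1
        else:
            # Candidate Japanese character: examine first and second byte together.
            yield data[i:i + 2]
            i += 2


print(list(split_units(b"A\xa4\xa2B")))
```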

[0013]

In FIG. 2, the region enclosed by the broken line denoted by reference numeral 21 is the alphanumeric part, in which the first byte is below hexadecimal "80", and the second byte need not be examined. As FIG. 2 shows, there are regions 22 to 27 that can be represented only in the Shift JIS code system (hatched from upper left to lower right), a region 28 that can be represented only in the extended Unix code system (hatched from upper right to lower left), and regions 29 and 30 where the two overlap. Regions 29 and 30 are character code regions in which the character code system cannot be determined. However, even when character data is read in sequence and a character code in such an undeterminable region is found, determination is not always impossible. The incoming character data must be written in at least one character code system. Therefore, if a character code located in a region possible only in the Shift JIS code system, or only in the extended Unix code system (EUC code), has already been analyzed before a character code in the undeterminable region is reached, then even the undeterminable character code can be processed as being in the character code system that is already known.
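The region lookup can be sketched as a classifier over a first and second byte. The exact boundaries of regions 22 to 30 are not given in the text, so the byte ranges below are assumptions taken from the commonly published Shift JIS and EUC-JP layouts; three-byte EUC sequences (lead byte 0x8F) are ignored, and all function names are hypothetical.

```python
def sjis_pair_ok(b1: int, b2: int) -> bool:
    """Can (b1, b2) be read as Shift JIS? Byte ranges are assumed, not from FIG. 2."""
    if 0x81 <= b1 <= 0x9F or 0xE0 <= b1 <= 0xFC:
        # b1 is a double-byte lead byte, so b2 must be a valid trail byte.
        return 0x40 <= b2 <= 0x7E or 0x80 <= b2 <= 0xFC
    if 0xA1 <= b1 <= 0xDF:
        # b1 is a single-byte half-width katakana; b2 must be able to start
        # another character (ASCII, lead byte, or half-width katakana).
        return b2 < 0x80 or 0x81 <= b2 <= 0x9F or 0xA1 <= b2 <= 0xFC
    return False


def euc_pair_ok(b1: int, b2: int) -> bool:
    """Can (b1, b2) be read as extended Unix code? Byte ranges are assumed."""
    if b1 == 0x8E:
        # Single-shift prefix followed by a half-width katakana byte.
        return 0xA1 <= b2 <= 0xDF
    return 0xA1 <= b1 <= 0xFE and 0xA1 <= b2 <= 0xFE


def classify(b1: int, b2: int) -> str:
    """Return 'ascii', 'sjis_only', 'euc_only', 'ambiguous', or 'invalid'."""
    if b1 < 0x80:
        return "ascii"  # region 21: the second byte is not examined
    sjis, euc = sjis_pair_ok(b1, b2), euc_pair_ok(b1, b2)
    if sjis and euc:
        return "ambiguous"  # corresponds to the overlapping regions
    if sjis:
        return "sjis_only"
    if euc:
        return "euc_only"
    return "invalid"


for pair in [(0x41, 0x42), (0x82, 0xA0), (0xB0, 0xFE), (0xA4, 0xA2)]:
    print([f"{b:02X}" for b in pair], classify(*pair))
```

Under these assumed ranges, a single `sjis_only` or `euc_only` pair anywhere in the input is enough to settle the system for the ambiguous pairs around it.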

[0014]

For example, as shown in FIG. 1, the code system discriminating means 11 determines the code system of the input data. When the code system discriminating means 11 judges that the first byte of a character code in the input data is below hexadecimal 80, the code is determined to be an alphanumeric character code and is sent to the printer 15 without conversion. When, in determining a character code of the input data, the code system discriminating means 11 finds that the first byte is hexadecimal 80 or above, it examines the second byte, and if the code can be determined to be in the Shift JIS code system, the code is converted by the Shift JIS code conversion means 13. Likewise, when the code system discriminating means 11 examines a character code of the input data in the same way and determines that it is in the extended Unix code system, the code is converted by the extended Unix code (EUC code) conversion means 14. The character codes converted by the conversion means 13 and 14 are sent to the printer 15 and printed. The code system discriminating means 11 stores the determined code system in the code system storage means 12, and subsequent code system determinations are decided by the value in the code system storage means 12.
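The flow in this paragraph can be sketched as a small dispatcher that remembers the first definite determination. Trial decoding with Python's `shift_jis` and `euc_jp` codecs is swapped in for the region lookup of FIG. 2, which is an assumption made to keep the sketch short; `Dispatcher` and `detect` are hypothetical names.

```python
from typing import Callable, Optional


def detect(chunk: bytes) -> Optional[str]:
    """Return the one codec that decodes chunk, or None if zero or both do.

    Trial decoding stands in for the patent's region lookup (an assumption).
    """
    ok = []
    for codec in ("shift_jis", "euc_jp"):
        try:
            chunk.decode(codec)
            ok.append(codec)
        except UnicodeDecodeError:
            pass
    return ok[0] if len(ok) == 1 else None


class Dispatcher:
    def __init__(self, detect_fn: Callable[[bytes], Optional[str]]):
        self.detect_fn = detect_fn
        self.stored: Optional[str] = None  # plays the role of storage means 12

    def convert(self, chunk: bytes) -> str:
        if all(b < 0x80 for b in chunk):
            return chunk.decode("ascii")  # alphanumeric data passes unconverted
        # Once a system is stored, later determinations follow the stored value.
        system = self.stored or self.detect_fn(chunk)
        if system is None:
            raise LookupError("code system undetermined; data must be held")
        self.stored = system
        return chunk.decode(system)


d = Dispatcher(detect)
print(d.convert(b"abc"))
print(d.convert(b"\x82\xa0"))             # valid only as Shift JIS
print(d.convert(bytes.fromhex("A4A2")))   # ambiguous, decoded with stored system
```

Raising `LookupError` marks the point where the on-hold processing would take over; the sketch leaves that part out.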

[0015]

When the code system discriminating means 11 examines the character codes of the input data and reaches a character code in an undeterminable region of FIG. 2 without having already analyzed a character code located in a region convertible in the Shift JIS code system or in a region convertible in the extended Unix code system, the on-hold processing means 16 accumulates the bytes of the input character codes in sequence in the ring buffer memory 17. The code system discriminating means 11 continually judges whether each input character lies in regions 22 to 27, which are convertible only in the Shift JIS code system, or in region 28, which is convertible only in the extended Unix code system. Once the code system discriminating means 11 has been able to judge whether the data is in the Shift JIS code system or the extended Unix code system, the on-hold processing means 16 converts the character codes of the input data sequence accumulated in the ring buffer memory 17.

[0016]

In this way, if the input data contains even one character code located in a region convertible in the Shift JIS code system or in a region convertible in the extended Unix code system, the character code system can be reliably determined and the character codes converted. However, from the standpoint of conserving memory resources, the ring buffer memory 17 cannot be allowed to consume memory without limit for character code conversion. Therefore, when input data exceeding the predetermined size of the ring buffer memory 17 has accumulated, the on-hold processing means 16 saves it in the separate buffer memory 18. As described above, this embodiment processes the great majority of Japanese character code conversions at high speed without wasting the computer's memory resources, and when the character code system remains hard to determine, the use of the buffer memory 18 allows reliable conversion.
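The holding behaviour of the two paragraphs above can be sketched as a fixed-size ring area that is evacuated to a second buffer when it fills. The eviction policy (move the whole ring when full) and the class and method names are assumptions; the patent only says that data beyond the ring's predetermined size is saved in buffer memory 18.

```python
class HoldBuffer:
    """Holds undetermined bytes until the code system is known (sketch)."""

    def __init__(self, ring_size: int = 8):
        self.ring_size = ring_size
        self.ring = bytearray()   # stands in for ring buffer memory 17
        self.saved = bytearray()  # stands in for buffer memory 18

    def hold(self, data: bytes) -> None:
        for b in data:
            if len(self.ring) == self.ring_size:
                # Ring is full: evacuate its contents to the second buffer.
                self.saved += self.ring
                self.ring.clear()
            self.ring.append(b)

    def release(self, codec: str) -> str:
        """Decode everything held so far, in arrival order, and empty both buffers."""
        data = bytes(self.saved + self.ring)
        self.saved.clear()
        self.ring.clear()
        return data.decode(codec)


hb = HoldBuffer(ring_size=4)
hb.hold(bytes.fromhex("A4A2A4A4A4A6"))  # six ambiguous bytes
print(len(hb.saved), len(hb.ring))
print(hb.release("euc_jp"))
```

Keeping the ring small keeps the common case cheap, while the second buffer covers inputs that stay ambiguous for longer.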

[0017]

Even with processing by the above means, the extended Unix code system overlaps the Shift JIS code system in many regions, as shown in FIG. 2, and determination by the code system discriminating means 11 is close to impossible. Therefore the code system discriminating means 11 stores, in addition to the code regions shown in FIG. 2, information such as the characteristics of Japanese characters. Referring to these, if a character code unique to the Shift JIS code system is found while the character codes are being scanned, the data is determined to be in the Shift JIS code system; otherwise it is treated as undeterminable and converted in the extended Unix code system. For example, as shown in FIG. 2, a character code in which both the first byte and the second byte have values from hexadecimal A1 to DF is represented in the Shift JIS code system as two half-width katakana characters. The same character code, with both bytes from hexadecimal A1 to DF, is represented in the extended Unix code system as one full-width kanji character.

[0018]

However, even if a code is two half-width katakana characters in the Shift JIS code system, when the resulting string is not permissible as Japanese, the character code system of the character data can be judged to be the extended Unix code system. For example, hexadecimal "A4AF" is the single full-width character "く" in the extended Unix code system, but in the Shift JIS code system it becomes the half-width characters "､ｯ". Because the geminate consonant mark "ッ" never appears after a comma in Japanese, this character code can be judged to be in the extended Unix code system.
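The exclusion step can be sketched with the single rule given in this paragraph. A real device would presumably hold many such rules; only the comma followed by small "ッ" case comes from the text, and the function names and the use of Python's `shift_jis` and `euc_jp` codecs are assumptions.

```python
from typing import Optional

HALFWIDTH_COMMA = "\uff64"      # half-width ideographic comma
HALFWIDTH_SMALL_TSU = "\uff6f"  # half-width small katakana tsu

# Character pairs that are not permissible as Japanese (only one rule is
# taken from the text; the list itself is a hypothetical structure).
FORBIDDEN_PAIRS = [HALFWIDTH_COMMA + HALFWIDTH_SMALL_TSU]


def plausible_as_sjis(data: bytes) -> bool:
    """False if the Shift JIS reading contains a forbidden character pair."""
    text = data.decode("shift_jis")
    return not any(pair in text for pair in FORBIDDEN_PAIRS)


def resolve_ambiguous(data: bytes) -> Optional[str]:
    """Exclude Shift JIS when its reading is implausible; otherwise stay undecided."""
    if not plausible_as_sjis(data):
        return "euc_jp"
    return None


code = bytes.fromhex("A4AF")
print(resolve_ambiguous(code), code.decode("euc_jp"))
print(resolve_ambiguous(bytes.fromhex("A4A2")))
```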

[0019]

FIG. 3 is a block diagram for explaining another embodiment of the present invention. FIG. 3 differs from FIG. 1 in that it additionally provides: code discrimination history information storage means 31 for storing history information that represents the determination status of the code system of the input data or the hold status of the buffer memory and the like; a temporary file 32 for temporarily storing input data that cannot be held in the buffer memory 18; default code system processing means 33 for converting the data into JIS code using the code system desired by the user when the code system discriminating means 11 cannot determine a specific code system; and default code system setting means 34 with which the user sets the desired code system.

[0020]

In FIG. 3, arrows drawn with solid lines represent the flow of print data, and arrows drawn with broken lines represent control of operation. The code system of input data sent from an application program is determined by the code system discriminating means 11. The code system discriminating means 11 refers to the kanji code regions of the Japanese codes (FIG. 2) stored in it. For print data whose code is hexadecimal 7F or below, it arranges for the data to be sent to the printer as is, without conversion, whether the data is JIS code, Shift JIS code, or extended Unix code (EUC). When any other input data arrives, the code system discriminating means 11, referring to the kanji code regions of the Japanese codes, arranges for the data to be saved in the ring buffer memory 17, the buffer memory 18, or the temporary file 32 until the code system can be determined.

[0021]

Once the code system of the input data can be determined, the code system discriminating means 11 arranges for the input data saved in the ring buffer memory 17, the buffer memory 18, or the temporary file 32 to be converted into JIS code through the Shift JIS code conversion means 13 or the extended Unix code (EUC) conversion means 14 and sent to the printer 15. The code system discriminating means 11 also arranges for the size of the input data saved in the ring buffer memory 17, the buffer memory 18, or the temporary file 32 to be stored in the code discrimination history information storage means 31 as code discrimination history information. Because the code discrimination history information, which is built from the determination results of the code system discriminating means 11 so far, changes dynamically, the buffer memory 18 and the temporary file 32 can be resized automatically to an optimal size. As one example, the average and the maximum of the byte lengths required, in the automatic Japanese character code conversion processes so far, from the discovery of an undeterminable character until the character code was settled are stored in the code discrimination history information storage means 31 as code discrimination history information, and the character code conversion device can automatically change the size of the buffer memory 18 and of the temporary file 32 to optimal values based on this information.
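The history bookkeeping in this paragraph can be sketched as follows. The text names the two statistics (average and maximum held length) but not how they map to sizes, so the sizing policy in `suggest_sizes` is purely an assumption, as are the class and method names.

```python
import math
from typing import List, Tuple


class DiscriminationHistory:
    """Records how many bytes were held before each determination (sketch)."""

    def __init__(self) -> None:
        self.held_lengths: List[int] = []

    def record(self, held_bytes: int) -> None:
        self.held_lengths.append(held_bytes)

    @property
    def average(self) -> float:
        if not self.held_lengths:
            return 0.0
        return sum(self.held_lengths) / len(self.held_lengths)

    @property
    def maximum(self) -> int:
        return max(self.held_lengths, default=0)

    def suggest_sizes(self) -> Tuple[int, int]:
        """Hypothetical policy: size the buffer for the average case and the
        temporary file for the worst case seen so far."""
        return math.ceil(self.average), self.maximum


history = DiscriminationHistory()
for held in (10, 30, 20):
    history.record(held)
print(history.average, history.maximum, history.suggest_sizes())
```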

【００２２】[0022] If the code system discriminating means 11 still cannot identify the code system even after referring to the kanji code regions of the Japanese codes stored in it, the input data is converted into JIS code under the control of the default code system processing means 33. This default code system can be changed dynamically with the default code system setting means 34, which lets the user have the data converted into JIS code according to the desired code system. In other words, however thoroughly the character code of the input data is examined, the character code system cannot be identified correctly for text that consists solely of undeterminable character codes. In that case, code conversion follows one of the code systems registered in the system. If that turns out not to be the character code the user intended, the default code system setting means 34 allows the conversion to be performed in the code system the user wants.
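A minimal sketch of the default fallback follows. The class `DefaultCodeSystem` and its method names are hypothetical, and Python's `shift_jis`, `euc_jp`, and `iso2022_jp` codecs are assumed here as stand-ins for the converting means and for the JIS code output; the description does not name any implementation.

```python
from typing import Optional


class DefaultCodeSystem:
    """Hypothetical stand-in for means 33 (processing) and 34 (setting)."""

    SUPPORTED = ("shift_jis", "euc_jp")  # Python codec names, used as labels

    def __init__(self, default: str = "shift_jis") -> None:
        self.set_default(default)

    def set_default(self, code_system: str) -> None:
        """Change the fallback at run time (the role of setting means 34)."""
        if code_system not in self.SUPPORTED:
            raise ValueError(f"unsupported code system: {code_system}")
        self.default = code_system

    def to_jis(self, data: bytes, detected: Optional[str]) -> bytes:
        """Convert to JIS, using the default when discrimination failed."""
        source = detected if detected is not None else self.default
        return data.decode(source).encode("iso2022_jp")


fallback = DefaultCodeSystem()
fallback.set_default("euc_jp")       # the user knows the input is EUC
ambiguous = "あい".encode("euc_jp")   # these bytes are also valid Shift JIS
jis_bytes = fallback.to_jis(ambiguous, detected=None)
```

The example input is exactly the kind of text the paragraph describes: every byte pair is valid under both code systems, so only the user's setting can decide.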

【００２３】[0023] Next, the flow of processing for discriminating the character code in this embodiment is described in detail. FIG. 4 is a flowchart of the code discriminating means according to an embodiment of the present invention. FIG. 5 is a flowchart of the code discriminating means according to an embodiment of the present invention, connected at the points marked a-a′, b-b′, c-c′, and d-d′. FIG. 6 is a flowchart of code discrimination according to an embodiment of the present invention. FIG. 7 is a flowchart of code discrimination according to an embodiment of the present invention, connected at the points marked e-e′, f-f′, and g-g′.

【００２４】[0024] As an initial setting, the code system discriminating means 11 (FIG. 1 or FIG. 3) sets the "code system" variable to "undetermined" (step 411). It then initializes the ring buffer memory 17 (step 412). Referring to the contents of the code discrimination history information storage means 31, it sets the save size used when input data is saved to the buffer memory 18 (step 413). It reads the input data into the ring buffer memory 17 one byte at a time (step 414). It checks whether the end of the file has been reached and, if so, terminates the processing (steps 415, 416).

【００２５】[0025] The code system discriminating means 11 checks whether the byte read in step 414 is hexadecimal 7F or less (that is, whether the first byte lies below hexadecimal 80 in the kanji code regions of the Japanese codes shown in FIG. 2) and whether no input data still under analysis remains in the ring buffer memory 17 (step 417). If the byte is hexadecimal 7F or less and no input data under analysis remains in the ring buffer memory 17, the input data is sent to the printer 15 as is, without conversion (step 418). If the byte read is hexadecimal 80 or greater, the code system discriminating means 11 checks whether the code system is "undetermined" (step 419).

【００２６】[0026] When the code system is "undetermined", code conversion is required because the characters that the bytes denote differ between the Shift JIS code system and the Extended Unix Code (EUC) system. If the code system discriminating means 11 finds that the "code system" variable is "undetermined", it stores the one byte of input data in the ring buffer memory 17 (steps 420, 421). It then stores the following byte in the ring buffer memory 17 (step 422). Next, it performs code discrimination by referring to the kanji code regions of the Japanese codes stored in it (step 423), and checks whether the code system of the input data can be discriminated (step 424). If it judges that the code system of the input data can be discriminated, it saves the bytes in the ring buffer memory 17 that are no longer needed to the buffer memory 18 or the temporary file 32, then returns to step 414 and reads the next byte (step 425).

【００２７】[0027] If the code system discriminating means 11 judges that the code system of the input data cannot be discriminated, it checks whether the code system is the Extended Unix Code (EUC) system (step 426). If it judges the code system to be EUC, it converts the data remaining in the ring buffer memory 17, the buffer memory 18, and the temporary file 32 according to the EUC system (step 427). On doing so, it changes the "code system" variable to EUC and stores it in the code discrimination history information storage means 31 (step 428). If it judges that the code system is not EUC, it converts the data remaining in the ring buffer memory 17, the buffer memory 18, and the temporary file 32 according to the Shift JIS code system (step 429). On doing so, it changes the "code system" variable to Shift JIS and stores it in the code discrimination history information storage means 31 (step 430). After the code system variable has been changed, processing returns to step 414, the next byte is read, and the same processing is repeated.

【００２８】[0028] If, in step 419, the code system discriminating means 11 finds that the "code system" variable is not "undetermined", it checks whether the character code of the input data is the Extended Unix Code (EUC) system (step 431). If it is EUC, the input byte is converted by the EUC converting means 14 (step 432). If it is not EUC, the input byte is converted by the Shift JIS code converting means 13 (step 433). The data converted into JIS code by the Shift JIS code converting means 13 or the EUC converting means 14 is sent to the printer 15 and printed, and processing returns to step 414 so that the next byte is handled in the same way (step 434).
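The overall loop (read a byte, pass plain bytes through, hold bytes while the code system is undetermined, convert the held bytes once it is settled, then convert directly) can be sketched as below. This is a simplified illustration, not the flowchart of FIGS. 4 and 5: Python's incremental `shift_jis` and `euc_jp` decoders are swapped in for the kanji code region table and for converting means 13 and 14, a plain `bytearray` replaces the ring buffer, save buffer, and temporary file, and `iso2022_jp` stands in for the JIS code output. All function names are hypothetical.

```python
import codecs
from typing import Optional

CANDIDATES = ("shift_jis", "euc_jp")


def plausible(held: bytes, code_system: str) -> bool:
    """True if `held` contains no sequence that is invalid in `code_system`.

    An incremental decoder is used so that a character cut off at the end
    of `held` counts as "not yet known" rather than as an error.
    """
    try:
        codecs.getincrementaldecoder(code_system)().decode(held)
        return True
    except UnicodeDecodeError:
        return False


def discriminate(held: bytes) -> Optional[str]:
    """Return the only candidate that fits `held`, or None if undecided."""
    fitting = [cs for cs in CANDIDATES if plausible(held, cs)]
    return fitting[0] if len(fitting) == 1 else None


def convert_to_jis(data: bytes, default: str = "shift_jis") -> bytes:
    code_system: Optional[str] = None      # "undetermined" (step 411)
    held = bytearray()                     # bytes whose conversion is suspended
    decoder = None
    text = []
    for byte in data:                      # one byte at a time (step 414)
        if code_system is None:
            if byte <= 0x7F and not held:  # plain byte, nothing pending (417-418)
                text.append(chr(byte))
                continue
            held.append(byte)              # keep the byte for analysis (420-422)
            code_system = discriminate(bytes(held))     # steps 423-424
            if code_system is None:
                continue                   # still undecided: keep holding
            decoder = codecs.getincrementaldecoder(code_system)()
            text.append(decoder.decode(bytes(held)))    # held data (427-430)
            held.clear()
        else:
            text.append(decoder.decode(bytes([byte])))  # known system (431-433)
    if held:                               # never settled: assumed default fallback
        text.append(bytes(held).decode(default, errors="replace"))
    return "".join(text).encode("iso2022_jp", errors="replace")
```

For example, `convert_to_jis("ABC漢字".encode("euc_jp"))` passes `ABC` through unchanged, holds the next four bytes until the last one rules out Shift JIS, and then converts the held bytes as EUC.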

【００２９】[0029] Next, the method of discriminating the code system is described with reference to FIGS. 6 and 7 and FIGS. 8 to 15. FIG. 8 shows the initial state of the ring buffer memory used in the embodiment of the present invention. FIG. 9 shows the state in which two bytes of data have been read into the ring buffer memory. FIG. 10 shows the state of the ring buffer memory after code discrimination. FIG. 11 shows the state in which two bytes of data have been read into the ring buffer memory. FIG. 12 shows the state of the ring buffer memory after code discrimination. FIG. 13 shows the state in which two bytes of data have been read into the ring buffer memory. FIG. 14 shows the ring buffer memory in its full state. FIG. 15 shows the state of the ring buffer memory after saving. The code system discriminating means 11 takes two bytes from the Shift JIS analysis byte pointer in the ring buffer memory 17 (step 611). The ring buffer memory 17 is then in the state shown in FIG. 9. Referring to the kanji code regions of the Japanese codes stored in it (see FIG. 2), the code system discriminating means 11 inspects the code region and checks whether the code can be interpreted only as the Shift JIS code system (step 612). For example, a code that lies in the region of FIG. 2 marked only with hatching running from upper left to lower right, which can be identified from the Shift JIS code region alone, is determined to be Shift JIS (step 613).

【００３０】[0030] If the code system discriminating means 11 determines that the extracted code is not one that can be interpreted only as Shift JIS code, it checks whether the code lies in a code region that is not permitted even as Shift JIS code (the region with hatching running from upper left to lower right) (step 614). If the code is not permitted as Shift JIS, the next two bytes are taken from the Extended Unix Code (EUC) analysis byte pointer in the ring buffer memory 17 (step 623), and it is checked whether they belong to the EUC system (step 624). If the code system discriminating means 11 determines that the input data is not EUC, it can conclude that an invalid code has been detected (step 625). If it determines that the input data is EUC, the input data can be identified as EUC (step 626). If the input data is found to be permissible as Shift JIS as well, the Shift JIS pointer in the ring buffer memory 17 is advanced by the length of the character analyzed, in order to finish the Shift JIS analysis. At this point the code system discriminating means 11 refers to the kanji code regions of the Japanese codes stored in it and checks whether the code lies in the region denoted by reference numeral 30 in FIG. 2 (step 615).

【００３１】[0031] If the code system discriminating means 11 determines that the code of the input data lies in region 30, then under the Shift JIS code system the bytes are a half-width katakana character followed by the first byte of a full-width kanji, so in this case the Shift JIS pointer is advanced by only one (step 616). The ring buffer memory 17 is then in the state shown in FIG. 12. If the code of the input data does not lie in region 30, the Shift JIS pointer is advanced by two bytes (step 617). The ring buffer memory 17 is then in the state shown in FIG. 11. Next, to determine whether the character is Extended Unix Code (EUC), the code system discriminating means 11 takes the next two bytes from the EUC pointer of the ring buffer memory 17 (step 618).
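The pointer rule of steps 615 to 617 can be sketched as a small function. The byte ranges below are assumptions: FIG. 2 is not reproduced in the text, so region 30 is modelled from its verbal description (a half-width katakana byte followed by the lead byte of a two-byte character), using the standard Shift JIS ranges for JIS X 0201 katakana and JIS X 0208 lead bytes. The function names are hypothetical.

```python
def is_halfwidth_katakana(byte: int) -> bool:
    """Single-byte half-width katakana range in Shift JIS."""
    return 0xA1 <= byte <= 0xDF


def is_sjis_lead(byte: int) -> bool:
    """Lead byte of a two-byte JIS X 0208 character in Shift JIS."""
    return 0x81 <= byte <= 0x9F or 0xE0 <= byte <= 0xEF


def in_region_30(first: int, second: int) -> bool:
    """Assumed reading of region 30: katakana byte, then a kanji lead byte."""
    return is_halfwidth_katakana(first) and is_sjis_lead(second)


def sjis_advance(first: int, second: int) -> int:
    """Bytes to move the Shift JIS pointer after analysing a pair (steps 615-617)."""
    return 1 if in_region_30(first, second) else 2
```

Advancing by one in the region 30 case keeps the Shift JIS pointer on a character boundary, because the second byte of the pair starts a new two-byte character rather than ending one.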

【００３２】[0032] The code system discriminating means 11 determines whether the code system of the input data is the Extended Unix Code (EUC) system (step 619). If it determines that the code system is EUC, it cannot tell which of the two code systems applies. Therefore, to finish the EUC analysis, it advances the EUC pointer of the ring buffer memory 17 by two bytes (step 620). Furthermore, because the code system cannot be determined, the input data in the ring buffer memory 17 shown in FIG. 14 is temporarily saved, as shown in FIG. 15, to a save buffer such as the buffer memory 18 shown in FIG. 1 (step 621). If, on the other hand, the code system discriminating means 11 determines in step 619 that the code system is not EUC, the input data can be identified as Shift JIS (step 622).
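The four possible outcomes of steps 611 to 626 (Shift JIS, EUC, undeterminable, invalid) can be illustrated with a classifier over a single byte pair. This is a simplification under stated assumptions: the real procedure reads the pair for each code system from its own pointer, and the byte ranges used here are the standard Shift JIS (JIS X 0201 katakana, JIS X 0208) and EUC-JP ranges rather than the regions of FIG. 2, which the text does not reproduce. `Verdict` and the function names are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    SHIFT_JIS = "shift_jis"
    EUC = "euc_jp"
    AMBIGUOUS = "ambiguous"   # valid under both: conversion stays suspended
    INVALID = "invalid"       # valid under neither


def valid_as_sjis(b1: int, b2: int) -> bool:
    if 0xA1 <= b1 <= 0xDF:                       # half-width katakana, one byte
        return True
    lead = 0x81 <= b1 <= 0x9F or 0xE0 <= b1 <= 0xEF
    trail = 0x40 <= b2 <= 0xFC and b2 != 0x7F
    return lead and trail


def valid_as_euc(b1: int, b2: int) -> bool:
    if b1 == 0x8E:                               # prefix for half-width katakana
        return 0xA1 <= b2 <= 0xDF
    return 0xA1 <= b1 <= 0xFE and 0xA1 <= b2 <= 0xFE


def classify_pair(b1: int, b2: int) -> Verdict:
    sjis, euc = valid_as_sjis(b1, b2), valid_as_euc(b1, b2)
    if sjis and euc:
        return Verdict.AMBIGUOUS                 # steps 620-621: save and go on
    if sjis:
        return Verdict.SHIFT_JIS                 # steps 613, 622
    if euc:
        return Verdict.EUC                       # step 626
    return Verdict.INVALID                       # step 625
```

Under these ranges an EUC hiragana such as あ (bytes A4 A2) is also a valid pair of Shift JIS half-width characters, which is why a single pair is often not enough and the data has to be held until a later byte settles the question.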

【００３３】[0033] Next, the ring buffer memory 17 is described in detail with reference to FIGS. 8 to 15. As noted above, character boundaries do not necessarily fall in the same places in the Shift JIS code system and the Extended Unix Code (EUC) system. The byte sequence under analysis therefore has to be buffered in memory and managed there. This embodiment uses the ring buffer memory 17 to solve this problem. FIG. 8 shows the initial state of the ring buffer memory 17, with the start pointer, the end pointer, the EUC pointer, and the Shift JIS pointer shown at fixed positions. FIG. 9 shows the state in which two bytes of input data have been read into the ring buffer memory 17: input data A1 and A2 are stored in the ring buffer memory 17, and the end pointer has moved by two.

【００３４】[0034] FIG. 10 shows the state after the code system of bytes A1 and A2 has been discriminated. For example, if the code formed by bytes A1 and A2 does not lie in the region denoted by reference numeral 30 in FIG. 2, a character boundary falls after these two bytes, so the EUC pointer and the Shift JIS pointer both move by two. FIG. 11 shows the state in which two more bytes of input data have been read: A3 and A4 are stored in the ring buffer memory 17 after A1 and A2, and the end pointer moves by two. FIG. 12 shows the state after the code system of bytes A3 and A4 has been discriminated using the EUC pointer and the Shift JIS pointer.

【００３５】[0035] The code determination shown in FIG. 12 illustrates the case in which bytes A3 and A4 lie in the region denoted by reference numeral 30 in FIG. 2. The EUC pointer moves by two, but the Shift JIS pointer moves by only one, because only byte A3 is recognized as ending a character. FIG. 13 shows the case in which two more bytes of input data have been read; the bytes analyzed from the Shift JIS pointer are now A4 and A5. As input data continues to be buffered in the ring buffer memory 17 in this way, the ring buffer memory 17 becomes full, as shown in FIG. 14. At that point, as shown in FIG. 15, the contents of the ring buffer memory 17 up to the position of the Shift JIS pointer or the EUC pointer are saved to the buffer memory 18 or the temporary file 32.
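The ring buffer with its four pointers can be sketched as below, replaying the sequence of FIGS. 9 to 15. This is a minimal sketch under assumptions: the class `ParseRing` and its methods are hypothetical, the capacity of eight follows the embodiment, and saving up to whichever parse pointer is further behind is one reading of "up to the position of the Shift JIS pointer or the EUC pointer".

```python
class ParseRing:
    """Ring buffer with a write pointer and two independent parse pointers."""

    def __init__(self, capacity: int = 8) -> None:
        self.capacity = capacity
        self.cells = bytearray(capacity)
        # All pointers are ever-increasing byte offsets into the input;
        # the physical cell for an offset is offset % capacity.
        self.start = self.end = self.sjis = self.euc = 0

    def is_full(self) -> bool:
        return self.end - self.start == self.capacity

    def push(self, byte: int) -> None:
        if self.is_full():
            raise BufferError("ring buffer full; evacuate first")
        self.cells[self.end % self.capacity] = byte
        self.end += 1

    def peek(self, pointer: int, count: int = 2) -> bytes:
        """Return up to `count` stored bytes starting at `pointer`."""
        stop = min(pointer + count, self.end)
        return bytes(self.cells[i % self.capacity] for i in range(pointer, stop))

    def evacuate(self) -> bytes:
        """Release bytes that both analyses have passed (assumed policy)."""
        limit = min(self.sjis, self.euc)
        saved = self.peek(self.start, limit - self.start)
        self.start = limit
        return saved


ring = ParseRing()
for b in (1, 2):                      # FIG. 9: A1, A2 read
    ring.push(b)
ring.sjis += 2                        # FIG. 10: both pointers move by two
ring.euc += 2
for b in (3, 4):                      # FIG. 11: A3, A4 read
    ring.push(b)
ring.sjis += 1                        # FIG. 12: region 30, Shift JIS moves by one
ring.euc += 2
for b in (5, 6):                      # FIG. 13: two more bytes read
    ring.push(b)
pair_for_sjis = ring.peek(ring.sjis)  # bytes A4, A5
for b in (7, 8):                      # FIG. 14: buffer full
    ring.push(b)
saved = ring.evacuate()               # FIG. 15: A1 to A3 leave the ring
```

Here the values 1 to 8 stand for bytes A1 to A8. The saved bytes would go to the buffer memory 18 or the temporary file 32 until the code system is settled.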

【００３６】[0036] This embodiment has been described in detail above, but the invention is not limited to it. Various design changes are possible without departing from the invention set out in the claims. For example, the ring buffer memory may be any kind of memory as long as it does not depart from the gist of this embodiment, and it need not be limited to eight items of input data. Although not specifically shown in the embodiment, each means of the present invention is constituted by publicly known or well-known logic circuits. Circuits and devices that an ordinary printer has, such as a control circuit that controls the character code conversion device, are omitted from this embodiment. The embodiment has been described for the Shift JIS code system and the Extended Unix Code (EUC) system, but any code systems may of course be used. Furthermore, although the embodiment uses the kanji code regions of the Japanese character codes shown in FIG. 2 to discriminate the code system, it is not limited to this: the character codes may be looked up, or something like a table for each code system may be prepared and the processing based on it.

【００３７】[0037]

[Effects of the Invention] According to the present invention, when the code system discriminating means cannot identify the character code system of the input character data, or cannot adopt the character code system used up to that point, conversion is suspended until the character code system of the character data is identified. When the character data is subsequently judged to be of a specific character code system, the character data held during the suspension can be converted as being of that character code system. As a result, even in an environment where different code systems coexist, the data can be converted into print data that is as correct as possible, regardless of the code system.
Also, according to the present invention, when the code system is discriminated, attention is paid to combinations of characters that are not permitted in Japanese: the types of character code that would produce such combinations are excluded and the discrimination is made among the remaining types. Therefore, even in an environment where different code systems coexist, the data can be converted into print data with few errors.

[Brief description of drawings]

FIG. 1 is a schematic block diagram for explaining an embodiment of the present invention.

FIG. 2 is a diagram for explaining the kanji code regions of Japanese character codes.

FIG. 3 is a block diagram for explaining another embodiment of the present invention.

FIG. 4 is a flowchart of the code discriminating means according to an embodiment of the present invention.

FIG. 5 is a flowchart of the code discriminating means according to an embodiment of the present invention, connected at the points marked a-a′, b-b′, c-c′, and d-d′.

FIG. 6 is a flowchart of code discrimination according to an embodiment of the present invention.

FIG. 7 is a flowchart of code discrimination according to an embodiment of the present invention, connected at the points marked e-e′, f-f′, and g-g′.

FIG. 8 is a diagram for explaining the initial state of the ring buffer memory used in the embodiment of the present invention.

FIG. 9 is a diagram for explaining the state in which two bytes of data have been read into the ring buffer memory used in the embodiment of the present invention.

FIG. 10 is a diagram for explaining the state of the ring buffer memory used in the embodiment of the present invention after code discrimination.

FIG. 11 is a diagram for explaining the state in which two bytes of data have been read into the ring buffer memory used in the embodiment of the present invention.

FIG. 12 is a diagram for explaining the state of the ring buffer memory used in the embodiment of the present invention after code discrimination.

FIG. 13 is a diagram for explaining the state in which two bytes of data have been read into the ring buffer memory used in the embodiment of the present invention.

FIG. 14 is a diagram for explaining the full state of the ring buffer memory used in the embodiment of the present invention.

FIG. 15 is a diagram for explaining the state of the ring buffer memory used in the embodiment of the present invention after saving.

FIG. 16 is a diagram for explaining a state in which a UNIX machine and an MS-DOS machine are connected to a network.

[Explanation of symbols]

11 ... code system discriminating means
12 ... code system storage means
13 ... Shift JIS code converting means
14 ... Extended Unix Code (EUC) converting means
15 ... printer
16 ... hold-time processing means
17 ... ring buffer memory
18 ... buffer memory
31 ... code discrimination history information storage means
32 ... temporary file
33 ... default code system processing means
34 ... default code system setting means
111 ... adopting means
112 ... excluding means

Claims

[Claims]

1. A character code conversion device comprising code system discriminating means for discriminating which of a plurality of types of character code systems input character data belongs to, and character code converting means for converting the character data into a predetermined character code based on the discrimination result, the device further comprising: storage means for storing the type of character code system of the discrimination result; adopting means for adopting a past discrimination result; buffer means for storing the character data input while discrimination is suspended, when the adopting means cannot adopt a past discrimination result; and hold-time processing means for, when the code system discriminating means discriminates, after the suspension of discrimination has started, that character data is of a specific character code system, causing the converting means to convert the character data in the buffer means as being of that specific character code system.

2. The character code conversion device according to claim 1, wherein, in discriminating which of a plurality of types of character code systems input Japanese character data belongs to, when the character code system of the input Japanese character data cannot be identified, the code system discriminating means excludes the types of character code that would yield combinations of characters not permitted in Japanese and takes the remaining type of character code as the discrimination result.