JP5150979B2

JP5150979B2 - Code conversion system, code conversion device, code conversion method, and code conversion program

Info

Publication number: JP5150979B2
Application number: JP2009149619A
Authority: JP
Inventors: 俊彦田原
Original assignee: NEC System Technologies Ltd
Current assignee: NEC Solution Innovators Ltd
Priority date: 2009-06-24
Filing date: 2009-06-24
Publication date: 2013-02-27
Anticipated expiration: 2029-06-24
Also published as: JP2011008388A

Description

The present invention relates to a code conversion system, a code conversion device, a code conversion method, and a code conversion program, and in particular to a code conversion system, device, method, and program for converting Japanese character codes into Unicode.

In Unicode, the international standard for coded character sets defined by the Unicode Consortium, the character set defined by UCS-2 (Universal Character Set coded in 2 octets), one of its encoding schemes, represents every character in 16 bits (2 bytes) and is intended to enable multilingual processing with a single character code system.
However, the character codes [0000] to [FFFF] (hexadecimal; 16 × 16 × 16 × 16 = 65,536 characters) cannot represent all characters of every writing system. For this reason, UTF-16 (UCS Transformation Format for 16 Planes of Group 00), a newer Unicode encoding scheme, pairs two 2-byte codes, that is, uses 32 bits (4 bytes), to represent one character as a surrogate pair character.

Specifically, the range [D800] to [DFFF] in the Unicode Basic Multilingual Plane (BMP) is defined as the surrogate pair character area. A 4-byte combination of a high 2-byte code in [D800] to [DBFF] and a low 2-byte code in [DC00] to [DFFF] makes it possible to handle a total of 1,024 × 1,024 = 1,048,576 characters as surrogate pair characters. These characters are managed across 16 planes of 65,536 characters each, where each plane is addressed by 256 × 256 rows and cells.
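
The arithmetic behind the surrogate ranges above can be sketched as follows. This is a minimal illustration of the standard UTF-16 surrogate calculation, not code taken from the patent; the function names `to_surrogate_pair` and `from_surrogate_pair` are hypothetical.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point in U+10000..U+10FFFF into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary planes")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # upper 10 bits -> D800..DBFF
    low = 0xDC00 + (offset & 0x3FF)      # lower 10 bits -> DC00..DFFF
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# 1,024 high codes x 1,024 low codes = 16 planes x 65,536 characters
PAIR_COUNT = (0xDBFF - 0xD800 + 1) * (0xDFFF - 0xDC00 + 1)
```

For example, `to_surrogate_pair(0x2000B)` yields the pair D840h, DC0Bh, and `PAIR_COUNT` equals 1,048,576.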

Furthermore, computers that support Japanese have recently begun to use Unicode (version 3.2 or later) that supports JIS X 0213:2004, the JIS standard for coded kanji sets for information interchange, which reflects the printing standard glyphs established by the Japanese Language Subcommittee of the Council for Cultural Affairs. As a result, conversion between the Japanese 2-byte codes of various vendors and Unicode supporting JIS X 0213:2004 has become necessary.

JIS X 0213:2004 adds a large number of characters to its predecessor standards JIS X 0208 and JIS X 0212. In Unicode, the characters that did not fit in the Basic Multilingual Plane (303 characters, for example) were therefore placed in the additional kanji plane. Characters in the additional kanji plane are represented in the UTF-16 encoding form by a 4-byte code (a surrogate pair). Consequently, a JIS X 0213:2004 character string expressed in UTF-16 mixes 2-byte character codes for the Basic Multilingual Plane with 4-byte character codes for the additional kanji plane.

Conventional general-purpose code conversion devices only had to handle the characters of JIS X 0208, so even in the UTF-16 encoding form no surrogate pairs were needed and every character fit in 2 bytes.
Converting a 2-byte kanji code such as JIS X 0208 to Unicode therefore required only a one-to-one code conversion table from 2 bytes to 2 bytes.

However, for a code conversion device to support Unicode corresponding to JIS X 0213:2004 as a conversion target, it needs a mechanism for converting 2 bytes to 4 bytes in addition to the mechanism for converting 2 bytes to 2 bytes. In other words, a code conversion device that converts JIS X 0213:2004 codes into UTF-16 must perform both conversion to 2-byte data and conversion to 4-byte data.

One conceivable way to realize this conversion is to follow the general-purpose approach and create a one-to-one code conversion table from 2-byte codes to 4-byte codes.
Specifically, to convert into a code in which 2-byte and 4-byte codes are mixed, it is common to widen each element of the code conversion table T1 of FIG. 11 to 4 bytes.

FIG. 11 shows a general code conversion table for converting a 2-byte code to a 2-byte code; it is an example in which the destination 2-byte code corresponding to each source 2-byte code value is stored in an array. FIG. 12 shows an example in which each element of the code conversion table T1 of FIG. 11 is widened to 4 bytes.

In code conversion using the code conversion table T2 of FIG. 12, a table element is looked up with the 2-byte code as the key. If the upper 2 bytes of the element are 0000h, the lower 2 bytes are taken as the UTF-16 2-byte code of a Basic Multilingual Plane character; if the upper 2 bytes are not 0000h, the element is taken as the surrogate pair (UTF-16 4-byte code) of an additional kanji plane character. Both 2-byte and 4-byte conversion results can be obtained in this way.
Because 4 bytes of information are needed per character, however, the code conversion table must be twice the size of the general-purpose one.
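
The lookup rule described for table T2 can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary `T2`, the function `convert_with_t2`, and the two sample mappings are placeholders chosen for the example, not entries from the patent's figures, and big-endian output is an arbitrary choice.

```python
import struct

# Hypothetical T2-style table: kanji 2-byte code -> one 32-bit element.
# The sample entries are placeholders for illustration only.
T2 = {
    0x3021: 0x00004E9C,  # upper 2 bytes are 0000h: a BMP character
    0x2E22: 0xD840DC0B,  # upper 2 bytes are not 0000h: a surrogate pair
}


def convert_with_t2(kanji_code: int) -> bytes:
    """Return the UTF-16 (big-endian) bytes for one kanji 2-byte code."""
    element = T2[kanji_code]
    upper, lower = element >> 16, element & 0xFFFF
    if upper == 0x0000:
        # Only the lower 2 bytes carry information.
        return struct.pack(">H", lower)
    # Both halves form the surrogate pair.
    return struct.pack(">HH", upper, lower)
```

Every element occupies 4 bytes even though most entries take the first branch, which is the cost the text goes on to criticize.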

Examples of related technology for systems that perform code conversion supporting surrogate pairs are described in Patent Document 1, Patent Document 2, and elsewhere.

In the code conversion of Patent Document 1, both the conversion input and the conversion output are Unicode. Patent Document 1 converts a surrogate pair (4-byte code) into a 2-byte code and stores it in a database. The code conversion means described there replaces the surrogate pair with a 2-byte code in the private use area before storing it in the database, and when data retrieved from the database contains a private use area code, the code reverse conversion means restores the corresponding surrogate pair.

FIG. 13 shows an example of the code conversion table referred to by the code reverse conversion means of Patent Document 1. According to Patent Document 1, when the input 2-byte code is a private use area character code, the code reverse conversion means searches the code conversion table T3 of FIG. 13 using that code as the key and obtains the corresponding surrogate pair character code.

In Patent Document 2, as disclosed in its paragraphs [0012] to [0014], the character data input unit includes surrogate pair character detection means that cuts the input character code string into 2-byte units, determines whether each 2-byte code is a code used to specify a surrogate pair character, and obtains the 4-byte code that specifies the surrogate pair character.
The character data input unit further includes code replacement means that refers to a conversion table to obtain the 2-byte code associated in advance with the 4-byte code found by the surrogate pair character detection means, and replaces the 4-byte code with that 2-byte code.

Furthermore, in Patent Document 2 the code replacement means converts the 4-byte code specifying a surrogate pair character into a 2-byte code that specifies the same character and has been prepared in advance in the external character area of a font file of the character processing system. Among the surrogate pair characters specified by pairs of a high 2-byte code and a low 2-byte code in the surrogate pair character area [D800] to [DFFF] set aside in the Unicode BMP, the character data input unit registers in advance several frequently occurring ones, or a set defined according to need, in the external character area [E000] to [F8FF] of the BMP as external characters (surrogate characters) that can be specified by a 2-byte code.

Also in Patent Document 2, the character data input unit registers in the conversion table, in association with each other, the 2-byte code specifying each external character (surrogate character) registered in the external character area of the font file and the 4-byte code specifying the corresponding surrogate pair character.
The elements of the conversion table of Patent Document 2 therefore require 4 bytes of data.

Patent Document 1: JP 2008-242992 A
Patent Document 2: JP 2009-42980 A

However, in the method of FIG. 12 based on the related technology described above, the storage area of the code conversion table is twice that required when the code conversion table T1 of FIG. 11 is used, because one element of T1 is 2 bytes whereas one element of the code conversion table T2 of FIG. 12 is 4 bytes.

Some devices that implement code conversion, such as embedded devices, limit the size of the area that can be allocated to a code conversion table, so the table used for conversion to Unicode corresponding to JIS X 0213:2004 must be kept to the minimum necessary size.
With the method of FIG. 12, the conversion table needs twice the storage of the general-purpose technology. The storage area becomes so large that the table cannot be installed in a device with comparatively little memory, such as an embedded device.

In JIS X 0213:2004, 303 kanji are placed in the additional kanji plane (Plane 2), so only about 303 patterns convert a 2-byte kanji code into a 4-byte Unicode code.
When the table has 8,836 elements (94 × 94), as in the code conversion table T2 of FIG. 12, only about 4% of the characters need conversion to a 4-byte code. If 303 characters convert to additional plane characters, the upper 2 bytes of the remaining 96% of the elements are unnecessary information, which is highly wasteful.
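
The figures quoted above can be checked with a short calculation. The variable names are illustrative; only the counts (94 × 94 elements, 303 characters, 2-byte and 4-byte elements) come from the text. The exact share works out to roughly 3.4%, which the text rounds to about 4%.

```python
ROWS, CELLS = 94, 94
ELEMENTS = ROWS * CELLS          # 8,836 table elements
FOUR_BYTE_CHARS = 303            # characters that need a surrogate pair

t1_bytes = ELEMENTS * 2          # T1: 2-byte elements
t2_bytes = ELEMENTS * 4          # T2: 4-byte elements, twice the size of T1

# Share of elements whose upper 2 bytes actually carry information.
four_byte_share = FOUR_BYTE_CHARS / ELEMENTS

# Bytes in T2 that hold nothing but 0000h in the upper half.
unused_upper_bytes = (ELEMENTS - FOUR_BYTE_CHARS) * 2
```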

Furthermore, suppose the mechanism of FIG. 13 of Patent Document 1 is used to convert kanji 2-byte codes into surrogate pairs (UTF-16 4-byte codes), while the code conversion table T1 of FIG. 11 is used to convert kanji 2-byte codes into UTF-16 2-byte codes. This combination has the following disadvantages.
The source code range of the code conversion table T3 of FIG. 13 is limited to the Unicode private use area (E000h to F8FFh), so the code range used by kanji 2-byte codes (2121h to 7E7Eh, for example) is not covered.
To convert kanji 2-byte codes into surrogate pairs with the code conversion table T3 of FIG. 13, the key values of T3 would therefore have to span the entire range of kanji 2-byte codes.

Alternatively, as shown in FIG. 14, the private use area character code column of the code conversion table T3 of Patent Document 1 (FIG. 13) could be replaced with kanji 2-byte codes.
Converting to the surrogate pair corresponding to a kanji 2-byte code with such a code conversion table T4 of FIG. 14 has the following disadvantage.

When the code conversion table T4 of FIG. 14 is used, T4 must be searched every time a character is converted, because the conditions under which T4 needs to be searched cannot be narrowed down.
The kanji 2-byte codes concerned are not limited to JIS codes; they include any kanji code that can be expressed in 2 bytes, such as vendor-specific kanji codes. If the source codes that may convert to a surrogate pair can lie anywhere in the kanji 2-byte code range, the code conversion table T4 of FIG. 14 must be searched for every character to decide whether that character converts to a surrogate pair, which takes a great deal of time.
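
The per-character search cost described above can be sketched as follows. This is an illustration only: the table contents are placeholders, the names `T1`, `T4_KEYS`, `T4_PAIRS`, `convert`, and `convert_text` are hypothetical, and binary search is an assumed search strategy (the text does not say how T4 is searched).

```python
from bisect import bisect_left

# Placeholder tables for illustration; not taken from the patent's figures.
T1 = {0x3021: 0x4E9C, 0x3022: 0x5516}            # 2-byte -> 2-byte
T4_KEYS = [0x2E22, 0x2F42]                       # sorted kanji 2-byte codes
T4_PAIRS = [(0xD840, 0xDC0B), (0xD840, 0xDC89)]  # matching surrogate pairs


def convert(kanji_code, stats):
    """Convert one code; T4 is searched first because nothing rules it out."""
    stats["t4_searches"] += 1
    i = bisect_left(T4_KEYS, kanji_code)
    if i < len(T4_KEYS) and T4_KEYS[i] == kanji_code:
        return T4_PAIRS[i]
    return (T1[kanji_code],)


def convert_text(codes):
    """Convert a sequence of codes and report how often T4 was searched."""
    stats = {"t4_searches": 0}
    units = []
    for code in codes:
        units.extend(convert(code, stats))
    return units, stats
```

Converting three characters triggers three searches of T4 even when only one of them maps to a surrogate pair.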

Thus, even if the code conversion table of Patent Document 1 were adapted, code conversion from 2 bytes to 4 bytes would be slower than code conversion from 2 bytes to 2 bytes.
Moreover, Patent Document 1 requires 4-byte data in the elements of the conversion table. The table and its storage area become so large that the table cannot be installed in a device with comparatively little memory, such as an embedded device.

Patent Document 2 likewise requires 4-byte data in the elements of the conversion table, so the table and its storage area become so large that the table cannot be installed in a device with comparatively little memory, such as an embedded device.

(Object of the invention)
An object of the present invention is to solve the disadvantages of the related technology described above, and to provide a code conversion system, a code conversion device, a code conversion method, and a code conversion program that minimize the storage area of the conversion table so that it can be installed in devices implementing code conversion, such as embedded devices, while also shortening the processing time required for code conversion.

To achieve the above object, the code conversion system of the present invention comprises: an input device that inputs a first code having a first information amount, that is, the number of bytes allocated in advance to one piece of character information; a code conversion device that, by referring to a code conversion table provided in advance, converts the input first code into a second code of a code system different from that of the first code, the second code having either the first information amount or a second information amount of more bytes for that character information; a storage device provided alongside the code conversion device for its use, holding a basic multilingual plane conversion table, in which first codes of the first information amount are associated with second codes of the first information amount and whose elements are values of the first information amount, and an additional plane conversion table, in which some of the values of the first information amount are associated with first intermediate codes of a third information amount consisting of fewer bytes than the second information amount and whose elements are values of the third information amount; and an output device that outputs the second code of the first or second information amount produced by the code conversion device.

The code conversion device has a second code generation and output function that refers to the basic multilingual plane conversion table and, when the value of the first information amount lies within the above partial range of values of the first information amount, refers to the additional plane conversion table to output the first intermediate code of the third information amount, then applies a predetermined arithmetic operation to that first intermediate code to generate and output the second code of the second information amount.
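
The two-table lookup described above can be sketched as follows. This section does not specify the table layouts or the "predetermined arithmetic operation", so the sketch rests on loudly stated assumptions: values of the basic multilingual plane table that fall in D800h to DFFFh (never valid characters on their own) act as the reserved partial range and index the additional plane table; the first intermediate code is 2 bytes holding the low 16 bits of a Plane 2 code point; and the arithmetic operation restores the plane number and derives the surrogate pair. All names and sample entries are hypothetical.

```python
import struct

# ASSUMPTION: BMP-table values in D800h..DFFFh are markers, not characters.
MARKER_BASE = 0xD800
MARKER_LAST = 0xDFFF
# ASSUMPTION: every intermediate code belongs to Plane 2 (U+20000..U+2FFFF).
PLANE_BASE = 0x20000

# Placeholder tables for illustration only.
BMP_TABLE = {
    0x3021: 0x4E9C,            # ordinary 2-byte result
    0x2E22: MARKER_BASE + 0,   # marker: entry 0 of the additional table
}
ADDITIONAL_TABLE = [0x000B]    # 2-byte first intermediate codes


def convert(first_code: int) -> bytes:
    """Return UTF-16 (big-endian) bytes for one 2-byte first code."""
    value = BMP_TABLE[first_code]
    if not MARKER_BASE <= value <= MARKER_LAST:
        return struct.pack(">H", value)            # 2-byte second code
    intermediate = ADDITIONAL_TABLE[value - MARKER_BASE]
    # Assumed arithmetic step: rebuild the code point, then split it.
    offset = (PLANE_BASE | intermediate) - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return struct.pack(">HH", high, low)           # 4-byte second code
```

Under these assumptions both tables hold only 2-byte elements, and the additional table is consulted only when the first lookup returns a marker, so ordinary characters cost a single array access.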

To achieve the above object, the code conversion device of the present invention receives a first code of a first information amount allocated in advance to one piece of character information and, by referring to one or both of a basic multilingual plane conversion table and an additional plane conversion table, converts the first code into a second code of a code system different from that of the first code, which may have either the first information amount or a larger second information amount for that character information, and outputs it. The basic multilingual plane conversion table has a data structure in which first codes of the first information amount are associated with second codes of the first information amount and whose elements are values of the first information amount. The additional plane conversion table has a data structure in which some of those elements are associated with first intermediate codes of a third information amount smaller than the second information amount and whose elements are values of the third information amount.

The device has a function that, when reference to the basic multilingual plane conversion table shows that the first code falls within the range of that subset of elements, refers to the additional plane conversion table and outputs the first intermediate code of the third information amount corresponding to the first code, and a function that applies a predetermined arithmetic operation to that first intermediate code to generate and output the second code of the second information amount.

To achieve the above object, the code conversion method of the present invention is used in a code conversion system comprising: a code conversion device that, when a first code having a first information amount (the number of bytes allocated in advance to one piece of character information) is input, converts it, by referring to a code conversion table provided in advance, into a second code of a code system different from that of the first code, the second code having either the first information amount or a second information amount of more bytes for that character information; and, provided alongside the code conversion device for its use, a basic multilingual plane conversion table, in which first codes of the first information amount are associated with second codes of the first information amount and whose elements are values of the first information amount, and an additional plane conversion table, in which some of the values of the first information amount are associated with first intermediate codes of a third information amount consisting of fewer bytes than the second information amount and whose elements are values of the third information amount.
In this method, the code conversion device converts the input first code, by referring to one or both of the basic multilingual plane conversion table and the additional plane conversion table, into a second code of a different code system that may have either the first information amount or a larger second information amount for that character information, and the converted second code of the first or second information amount is then output externally.
In the conversion, the code conversion device first refers to the basic multilingual plane conversion table; when the reference result falls within the range of the subset of elements, it next refers to the additional plane conversion table to output the first intermediate code of the third information amount, and applies a predetermined arithmetic operation to that first intermediate code to generate the second code of the second information amount.

To achieve the above object, the code conversion program of the present invention is used in a code conversion system comprising: a code conversion device that, when a first code having a first information amount (the number of bytes allocated in advance to one piece of character information) is input, converts it into a second code of a code system different from that of the first code by referring to a code conversion table provided in advance; and, provided alongside the code conversion device for its use, a basic multilingual plane conversion table, in which first codes of the first information amount are associated with second codes of the first information amount and whose elements are values of the first information amount, and an additional plane conversion table, in which some of the values of the first information amount are associated with first intermediate codes of a third information amount consisting of fewer bytes than the second information amount and whose elements are values of the third information amount.
The program provides a code conversion function that converts the input first code of the first information amount, by referring to one or both of the basic multilingual plane conversion table and the additional plane conversion table, into a second code of a different code system whose number of bytes for that character information is either the first information amount or a larger second information amount, and an output processing function that outputs the converted second code of the first or second information amount.
The code conversion function comprises a first intermediate code output processing function that, when converting the first code, refers to the basic multilingual plane conversion table and, when the reference result falls within the range of the subset of elements, refers to the additional plane conversion table and outputs the first intermediate code of the third information amount, and a second code generation processing function that applies a predetermined arithmetic operation to the output first intermediate code to generate and output the second code of the second information amount.
The program causes a computer provided in the code conversion device to execute each of these processing functions.

According to the present invention, the additional plane conversion table is composed of elements of the third information amount, which is smaller than the second information amount, and in a code conversion that generates a second code of the second information amount, the first intermediate code of the third information amount is taken from the additional plane conversion table and a predetermined arithmetic operation is applied to it to generate the second code. Unlike the related technology described above, no conversion table with elements of the second information amount is needed. The storage area of the conversion table is thus minimized so that the table can be installed in devices implementing code conversion, such as embedded devices, while the processing time for code conversion is shortened and conversion is accelerated. The invention can thereby provide a code conversion system, code conversion device, code conversion method, and code conversion program with advantages not found in the related technology.
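
The storage saving claimed above can be estimated with a short calculation. This is a rough sketch, assuming 94 × 94 two-byte elements in the basic multilingual plane table and 303 two-byte intermediate codes in the additional plane table; the 2-byte intermediate size and the function name `table_sizes` are assumptions, since this section only states that the third information amount is smaller than the second.

```python
ELEMENTS = 94 * 94     # elements in the basic multilingual plane table
ADDITIONAL = 303       # entries in the additional plane table


def table_sizes(bmp_element_bytes=2, intermediate_bytes=2, wide_element_bytes=4):
    """Return (two-table size, single wide-table size) in bytes."""
    two_tables = ELEMENTS * bmp_element_bytes + ADDITIONAL * intermediate_bytes
    wide_table = ELEMENTS * wide_element_bytes
    return two_tables, wide_table
```

With the default sizes the two tables together need 18,278 bytes, against 35,344 bytes for a single table of 4-byte elements.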

FIG. 1 is a block diagram showing an example of the overall configuration of the code conversion system according to the first embodiment of the present invention.
FIG. 2 is an explanatory diagram showing an example of the data structure of the basic multilingual plane conversion table in the code conversion system of the embodiment disclosed in FIG. 1.
FIG. 3 is an explanatory diagram showing an example of the data structure of the additional plane conversion table in the code conversion system shown in FIG. 1.
FIG. 4 is a flowchart showing an example of the operation of the code conversion system of the embodiment disclosed in FIG. 1.
FIG. 5 is a flowchart showing the overall operation of the code conversion system of the embodiment disclosed in FIG. 1.
FIG. 6 is a block diagram showing an example of the overall configuration of the code conversion system in the second embodiment of the present invention.
FIG. 7 is a flowchart showing an example of the operation of the code conversion system of the second embodiment disclosed in FIG. 6.
FIG. 8 is a block diagram showing an example of the overall configuration of the code conversion system in the third embodiment of the present invention.
FIG. 9 is an explanatory diagram showing an example of the data structure of the additional plane N conversion table in the code conversion system of the third embodiment disclosed in FIG. 8.
FIG. 10 is a flowchart showing an example of the operation of the code conversion system of the third embodiment disclosed in FIG. 8.
FIG. 11 is an explanatory diagram showing an example of the data structure of a code conversion table in the related art.
FIG. 12 is an explanatory diagram showing an example of another data structure of a code conversion table in the related art.
FIG. 13 is an explanatory diagram showing an example of yet another data structure of a code conversion table in the related art.
FIG. 14 is an explanatory diagram showing an example of the data structure of a code conversion table (modified example) in the related art.

A code conversion system according to a first embodiment of the present invention is described below with reference to FIGS. 1 to 5.
The code conversion system 1 of the first embodiment is configured to convert kanji 2-byte codes into Unicode (a mixture of 2-byte and 4-byte codes) at high speed with the minimum necessary memory capacity.

To achieve this, the first embodiment converts kanji 2-byte codes into Unicode using a basic multilingual plane conversion unit 21 and an additional plane conversion unit 22. Each conversion unit is paired with its own storage: a basic multilingual plane conversion table storage unit 31 and an additional plane conversion table storage unit 32, respectively.

The basic multilingual plane conversion table storage unit 31 stores in advance the Unicode values (2-byte codes in the UTF-16 encoding form) that correspond to the kanji 2-byte codes. For a character whose conversion destination is a Unicode code (4-byte code) on the kanji additional plane, the storage unit instead holds a 2-byte value from the code range used for surrogate pairs (D800h to DFFFh), to which no character is ever assigned.

The additional plane conversion unit 22 is used only when the basic multilingual plane conversion unit 21 obtains a value in the range D800h to DFFFh. During conversion it refers to the additional plane conversion table storage unit 32, which holds 2 bytes for each code from D800h to DFFFh and can be indexed directly by that value.

From the 2 bytes obtained by the additional plane conversion unit 22, the Unicode plane/row-cell generation means 112 and the Unicode UTF-16 encoding means 113 produce and output the Unicode code (4 bytes) of the kanji additional plane.
This is described in detail below.

[First Embodiment]
First, the basic configuration of the code conversion system in this embodiment is described.
As shown in FIG. 1, the code conversion system 1 of the first embodiment includes an input device 10 that inputs a first code of a first information amount, that is, the number of bytes allocated in advance to one piece of character information.

The code conversion system 1 also includes a code conversion device 20A. By referring to one or both of the basic multilingual plane conversion table 31a and the additional plane conversion table 32a, the device converts the first code input by the input device 10 into a second code. The second code belongs to a code system different from that of the first code and, for a given piece of character information, may have either the first information amount or a larger second information amount.

The code conversion system 1 further includes a storage device 30. The storage device 30 has a basic multilingual plane conversion table storage unit 31 holding the basic multilingual plane conversion table 31a, in which first codes of the first information amount are associated with second codes of the first information amount and whose elements are values of the first information amount. The storage device 30 also has an additional plane conversion table storage unit 32 holding the additional plane conversion table 32a, in which some of those elements are associated with first intermediate codes of a third information amount smaller than the second information amount and whose elements are values of the third information amount.
The code conversion system 1 also includes an output device 40 that outputs the second code, of the first or second information amount, produced by the code conversion device 20A.

When the result of referring to the basic multilingual plane conversion table 31a falls within that partial range of elements, the code conversion device 20A next refers to the additional plane conversion table 32a to obtain the first intermediate code of the third information amount, applies predetermined arithmetic processing to that intermediate code to generate the second code of the second information amount, and outputs it.

With this basic configuration, the additional plane conversion table 32a is built from elements of the third information amount, which is smaller than the second information amount. To generate a second code of the second information amount, the system refers to the additional plane conversion table 32a, outputs the first intermediate code of the third information amount, and applies predetermined arithmetic processing to it. A conversion table of the second information amount, as used in the related art, is therefore unnecessary. The storage area for the conversion tables is kept to a minimum, so the tables can be installed in devices that implement code conversion, such as embedded devices, while the processing time for code conversion is shortened and the conversion is sped up.

(Overall configuration of the code conversion system)
Next, the specific configuration of the code conversion system of the first embodiment is described with reference to FIG. 1, starting with the overall configuration and continuing with the detailed configuration of each unit.
FIG. 1 is a block diagram showing the entire code conversion system of the first embodiment.

As shown in FIG. 1, the code conversion system 1 of the first embodiment includes an input device 10 such as a keyboard, a code conversion device 20A that operates under program control, a storage device 30 that stores information describing the code conversion correspondence, and an output device 40 such as a display device or a printing device.

The storage device 30 includes a basic multilingual plane conversion table storage unit 31 that stores the basic multilingual plane conversion table 31a shown in FIG. 2, and an additional plane conversion table storage unit 32 that stores the additional plane conversion table 32a shown in FIG. 3.

(Data structure of the basic multilingual plane conversion table)
As shown in FIG. 2, the basic multilingual plane conversion table 31a is a matrix indexed by the source kanji 2-byte code, taking as an example codes whose upper byte and lower byte each range from 21h to 7Eh. Each element stores the destination Unicode value (UTF-16).

When the destination Unicode character for a source code lies on the additional plane, a value from D800h to DFFFh is stored instead.
A value from D800h to DFFFh in the basic multilingual plane conversion table 31a uniquely identifies an additional-plane character. It is not treated as the destination UTF-16 value; it serves as the key with which the additional plane conversion unit 22 refers to the additional plane conversion table storage unit 32.
When the table is implemented in the storage device 30, its 2-byte elements are stored contiguously in ascending order of the source 2-byte code.
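The contiguous layout described above implies that an element can be located by arithmetic alone. The following is a minimal sketch of that idea, assuming big-endian 2-byte elements and the 21h to 7Eh range for both bytes; the names `bmp_offset` and `bmp_lookup` and the byte order are illustrative assumptions, not details given in the specification.

```python
import struct

FIRST, LAST = 0x21, 0x7E          # valid range of each source byte
SPAN = LAST - FIRST + 1           # 94 values per byte


def bmp_offset(code: int) -> int:
    """Byte offset of the 2-byte element for a kanji code such as 0x2121."""
    hi, lo = code >> 8, code & 0xFF
    if not (FIRST <= hi <= LAST and FIRST <= lo <= LAST):
        raise ValueError(f"code {code:#06x} is outside 21h-7Eh x 21h-7Eh")
    # Elements are stored row by row in ascending order of the source code.
    return ((hi - FIRST) * SPAN + (lo - FIRST)) * 2


def bmp_lookup(table: bytes, code: int) -> int:
    """Read the 2-byte element (big-endian is an assumption of this sketch)."""
    return struct.unpack_from(">H", table, bmp_offset(code))[0]


# Toy table holding only the 2121h -> 3000h pair mentioned in the text.
toy_table = bytearray(SPAN * SPAN * 2)
struct.pack_into(">H", toy_table, bmp_offset(0x2121), 0x3000)
```

Because the offset is computed directly from the two source bytes, no search over the table is needed.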

The basic multilingual plane conversion table storage unit 31, which the basic multilingual plane conversion unit 21 refers to when converting a kanji 2-byte code into Unicode, stores in advance the Unicode values (2-byte codes in the UTF-16 encoding form) corresponding to the kanji 2-byte codes. For a character whose conversion destination is a Unicode code (4-byte code) on the kanji additional plane, it stores a 2-byte value from the code range used for surrogate pairs (D800h to DFFFh), to which no character is ever assigned.

In the basic multilingual plane conversion table 31a, first codes of the first number of bytes (for example, 2 bytes), such as 2121h, are associated with second codes of the first number of bytes, such as 3000h. The second codes comprise in-range codes that fall within a predetermined range (for example, D800h to DFFFh: first area) and out-of-range codes that fall outside it (anything other than D800h to DFFFh: second area).

More specifically, to convert the first code into a second code of the first number of bytes, the basic multilingual plane conversion table 31a treats the first code of the first number of bytes (for example, 2 bytes) as split into at least an upper part and a lower part: one decomposed first code made up of the upper bytes (for example, the upper 1 byte, such as 74h) and another made up of the lower bytes (for example, the lower 1 byte, such as 22h). The table associates this pair with the second code of the first number of bytes (for example, E001h).

That is, the basic multilingual plane conversion table 31a has a data structure in which a first code (such as 2121h) of the first information amount (for example, 2 bytes as the first number of bytes) is associated with a second code (such as 3000h) of the first information amount, and whose elements are values of the first information amount.

(Data structure of the additional plane conversion table)
As shown in FIG. 3, the additional plane conversion table 32a is a matrix indexed by a 2-byte value whose upper byte ranges from D8h to DFh and whose lower byte ranges from 00h to FFh. Each element stores the row-cell (ku-ten) information of the destination Unicode character, that is, the lower 2 bytes of its UTF-32 value. Because the kanji additional plane in Unicode is fixed to plane 2, the plane information (the upper 2 bytes of UTF-32) is omitted.

The additional plane conversion table 32a associates specific range codes with first intermediate codes. The specific range codes (such as D800h to DFFFh) are those members of the group of second codes of the first information amount in the basic multilingual plane conversion table 31a (the group that also contains codes such as 3000h) that fall within the predetermined range. The first intermediate codes (such as 000Bh) have a third information amount smaller than the second information amount. In the first embodiment the third information amount equals the first information amount, namely 2 bytes as the first number of bytes. In other words, the additional plane conversion table 32a is built with a third information amount identical to the first information amount.

More specifically, the additional plane conversion table 32a takes the in-range second codes of the first number of bytes (such as D800h to DFFFh) from the basic multilingual plane conversion table 31a and treats each as split into at least an upper part and a lower part: one decomposed specific range second code made up of the upper bytes (for example, the upper 1 byte, such as D8h) and another made up of the lower bytes (for example, the lower 1 byte, such as 01h). The table associates this pair with the destination first intermediate code of the first number of bytes (such as 0088h).

Put another way, the additional plane conversion table 32a has a data structure in which some of the elements of the basic multilingual plane conversion table 31a (such as D800h to DFFFh) are associated with first intermediate codes (such as 000Bh) of the third information amount smaller than the second information amount, and whose elements are values of the third information amount.

When implemented in the storage device 30, the additional plane conversion table 32a stores its 2-byte elements contiguously in ascending order of the source 2-byte value. If few characters map to the kanji additional plane, the number of elements can be reduced accordingly.
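A rough size estimate illustrates the storage argument. The sketch below assumes the 94 x 94 source range of the first embodiment and compares a hypothetical single table with 4-byte elements against the two tables with 2-byte elements. The function names are illustrative, and the resulting figures are computed here rather than quoted from the specification.

```python
SOURCE_CODES = (0x7E - 0x21 + 1) ** 2     # 94 x 94 source kanji codes
SURROGATE_KEYS = 0xDFFF - 0xD800 + 1      # 2048 possible keys


def uniform_table_bytes(entry_bytes: int = 4) -> int:
    """Single table whose every element is wide enough for a 4-byte code."""
    return SOURCE_CODES * entry_bytes


def split_table_bytes(additional_entries: int = SURROGATE_KEYS,
                      entry_bytes: int = 2) -> int:
    """Basic table plus additional table, both with 2-byte elements."""
    return SOURCE_CODES * entry_bytes + additional_entries * entry_bytes


print(uniform_table_bytes())                      # 35344
print(split_table_bytes())                        # 21768
print(split_table_bytes(additional_entries=256))  # 18184
```

The last call reflects the remark that the additional table can hold fewer elements when few characters map to the additional plane.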

(Specific configuration of the code conversion device)
As shown in FIG. 1, the code conversion device 20A converts a kanji 2-byte code, an example of the first code, into Unicode (a mixture of 2-byte and 4-byte codes), an example of the second code, at high speed with the minimum necessary memory capacity. It includes a basic multilingual plane conversion unit 21, an additional plane conversion unit 22, a Unicode plane/row-cell generation unit 23, and a Unicode UTF-16 encoding unit 24.

That is, the code conversion device 20A has a function of outputting the Unicode code (4 bytes) of the kanji additional plane from the 2 bytes obtained by the additional plane conversion unit 22, using the Unicode plane/row-cell generation unit 23 and the Unicode UTF-16 encoding unit 24.
In other words, the code conversion device 20A receives a first code of the first information amount (2 bytes, as an example of the first number of bytes) allocated in advance to one piece of character information. By referring to one or both of the basic multilingual plane conversion table 31a and the additional plane conversion table 32a, it converts the first code into a second code and outputs it. The second code belongs to a different code system and, for a given piece of character information, may have either the first information amount or a larger second information amount (4 bytes, as an example of the second number of bytes).

The basic multilingual plane conversion unit 21 has a function of retrieving the value stored in the basic multilingual plane conversion table storage unit 31, using the 2-byte value supplied by the input device 10 as the key.

The additional plane conversion unit 22 has a function of retrieving the value (first intermediate code) stored in the additional plane conversion table storage unit 32, using a value from D800h to DFFFh as the key. It refers to the additional plane conversion table storage unit 32 whenever it performs code conversion.
The additional plane conversion unit 22 is used only when the basic multilingual plane conversion unit 21 obtains a value from D800h to DFFFh. The additional plane conversion table storage unit 32 holds 2 bytes for each code from D800h to DFFFh and can be indexed directly by that value.

The Unicode plane/row-cell generation unit 23 has a function of producing a second intermediate code: it takes the value referenced by the additional plane conversion unit 22 (the first intermediate code) as the lower 2 bytes of a UTF-32 value and adds a fixed value indicating the kanji additional plane as the upper 2 bytes.
The Unicode UTF-16 encoding unit 24 has a function of converting the UTF-32 value generated by the Unicode plane/row-cell generation unit 23 (the second intermediate code) into a UTF-16 surrogate pair (4 bytes), which is the second code.
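The two functions just described can be sketched as follows. The plane value 0002h follows from the statement that the kanji additional plane is fixed to plane 2, and the surrogate arithmetic is the standard UTF-16 encoding rule. The function names `to_utf32` and `to_surrogate_pair` are hypothetical labels for units 23 and 24, and the code point U+2000B is simply what the intermediate code 000Bh yields under these assumptions.

```python
KANJI_ADDITIONAL_PLANE = 0x0002   # fixed upper 2 bytes (plane 2)


def to_utf32(first_intermediate: int,
             plane: int = KANJI_ADDITIONAL_PLANE) -> int:
    """Prefix the plane to the 2-byte row-cell value (second intermediate code)."""
    return (plane << 16) | (first_intermediate & 0xFFFF)


def to_surrogate_pair(code_point: int) -> tuple:
    """Encode a supplementary-plane code point as a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("surrogate pairs encode only U+10000..U+10FFFF")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)      # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # lower 10 bits
    return high, low


code_point = to_utf32(0x000B)                            # 0x2000B
print([hex(u) for u in to_surrogate_pair(code_point)])   # ['0xd840', '0xdc0b']
```

Only shifts, masks, and additions are involved, which is why the additional table can keep 2-byte elements and still yield a 4-byte result.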

The basic multilingual plane conversion unit 21 also has a function of determining whether the second code of the first information amount obtained from the basic multilingual plane conversion table 31a falls within the predetermined range. When the code is determined to be outside the range, the unit converts the first code into that second code of the first information amount and outputs it.

The additional plane conversion unit 22 operates when the basic multilingual plane conversion unit 21 determines that the code is within the range. It refers to the additional plane conversion table 32a and converts the specific range code into the first intermediate code of the third information amount.

The Unicode plane/row-cell generation unit 23 and the Unicode UTF-16 encoding unit 24 together constitute the post-conversion processing means 29a.
The post-conversion processing means 29a has a function of applying predetermined arithmetic processing to the first intermediate code of the third information amount to generate the second code of the second information amount and output it.

The additional plane conversion unit 22 also has a function of referring to the additional plane conversion table 32a and converting the specific range code into a first intermediate code of the first information amount.
In this case, the Unicode plane/row-cell generation unit 23 has a function of applying predetermined arithmetic processing to the first intermediate code of the first information amount obtained by the additional plane conversion unit 22 to generate a second intermediate code of the second information amount.

The Unicode UTF-16 encoding unit 24, which serves as the first encoding function of the encoding unit, has a function of encoding the second intermediate code of the second information amount generated by the Unicode plane/row-cell generation unit 23 in a first encoding form whose result has the second information amount, and thereby outputting the second code.

The additional plane conversion unit 22 further has a function of calculating, from the value of the upper-byte decomposed specific range second code and the value of the lower-byte decomposed specific range second code, the position at which the corresponding element of the additional plane conversion table 32a is stored, and of obtaining the value of the first intermediate code from that position.
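One plausible form of this position calculation is sketched below, assuming the table is a contiguous run of big-endian 2-byte elements ordered by key from D800h upward. The names `additional_offset` and `additional_lookup` and the byte order are assumptions of the sketch; the D801h to 0088h pair is the example given in the description of the table.

```python
import struct

KEY_HI_FIRST, KEY_HI_LAST = 0xD8, 0xDF   # upper byte range of the key


def additional_offset(key: int) -> int:
    """Byte offset of the element addressed by a key in D800h-DFFFh."""
    hi, lo = key >> 8, key & 0xFF
    if not KEY_HI_FIRST <= hi <= KEY_HI_LAST:
        raise ValueError(f"key {key:#06x} is outside D800h-DFFFh")
    # 256 elements per upper-byte row, 2 bytes per element.
    return ((hi - KEY_HI_FIRST) * 256 + lo) * 2


def additional_lookup(table: bytes, key: int) -> int:
    """Read the first intermediate code (big-endian is an assumption)."""
    return struct.unpack_from(">H", table, additional_offset(key))[0]


# Toy table containing only the D801h -> 0088h pair from the text.
toy_additional = bytearray(2048 * 2)
struct.pack_into(">H", toy_additional, additional_offset(0xD801), 0x0088)
print(hex(additional_lookup(bytes(toy_additional), 0xD801)))  # 0x88
```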

The Unicode plane/row-cell generation unit 23 has a function of taking the first intermediate code obtained by the additional plane conversion unit 22 as the lower bytes and prepending a predetermined first fixed value to it as the upper bytes.

When the first code is a kanji code and the second code is Unicode in the first encoding form, the first fixed value is a value indicating the kanji additional plane of Unicode in a third encoding form that differs from the first encoding form.
The first encoding form is UTF (Universal multi octet coded character set Transformation Format)-16, and the third encoding form is UTF-32.

(Operation procedure)
(Overall operation)
Next, the overall operation of the code conversion system is described with reference to FIGS. 1 to 5.

As shown in FIG. 1, in the code conversion system 1 of this embodiment the storage device 30 holds two tables in advance. The basic multilingual plane conversion table 31a associates first codes of the first information amount, allocated to one piece of character information, with second codes of the first information amount, and its elements are values of the first information amount. The additional plane conversion table 32a associates some of those elements with first intermediate codes of a third information amount, which has fewer bytes than the second information amount, and its elements are values of the third information amount.

In the code conversion operation of the code conversion system 1, the input device 10 first inputs the first code of the first information amount (FIG. 4, step S101: input processing step, input processing function).

Next, by referring to one or both of the basic multilingual plane conversion table 31a and the additional plane conversion table 32a, the code conversion device 20A converts the input first code into a second code, which belongs to a different code system and which, for a given piece of character information, may have either the first information amount or a second information amount with more bytes (FIG. 4, step S102: code conversion step, code conversion function).

The output device 40 then outputs the converted second code of the first or second information amount (FIG. 4, step S103: output processing step, output processing function).

During this conversion, when the result of referring to the basic multilingual plane conversion table 31a falls within the partial range of elements, the additional plane conversion table 32a is referred to next and the first intermediate code of the third information amount is output. Predetermined arithmetic processing is then applied to this first intermediate code to generate the second code of the second information amount, which is output (FIG. 5, steps S111 to S115).

In this procedure, suppose the additional plane conversion table 32a has a data structure that associates the specific range codes, which are the second codes of the first information amount in the basic multilingual plane conversion table 31a that fall within the predetermined range, with first intermediate codes of the third information amount, which has fewer bytes than the second information amount. The basic multilingual plane conversion unit 21 first determines whether the second code of the first information amount obtained from the basic multilingual plane conversion table 31a is within the range. When the code is outside the range, the unit performs a first conversion process that converts the first code into that second code of the first information amount and outputs it (FIG. 5, steps S111 to S112: first conversion processing step, first conversion function).

If the first conversion process determines that the value lies within the range, the additional plane conversion unit 22 executes a second conversion process that refers to the additional plane conversion table 32a and converts the specific-range code into the first intermediate code of the third information amount (FIG. 5, step S113; second conversion processing step, second conversion function).

The post-conversion processing means 29a then executes a post-conversion process: it applies a predetermined arithmetic operation to the first intermediate code of the third information amount, generates the second code of the second information amount, and outputs it (FIG. 5, steps S114 to S115; post-conversion processing step, post-conversion processing function).

Further, in this operating procedure, when the third information amount in the additional plane conversion table 32a equals the first information amount, the second conversion process refers to the additional plane conversion table 32a and converts the specific-range code into a first intermediate code of the first information amount.

In the post-conversion process, the post-conversion processing means 29a first executes a first generation process that applies a predetermined arithmetic operation to the first intermediate code of the first information amount obtained in the second conversion process and generates a second intermediate code of the second information amount (FIG. 5, step S114; first generation processing step, first generation function).

The post-conversion processing means 29a then executes an encoding process that encodes the second intermediate code of the second information amount generated in the first generation process in a preset encoding format and outputs the second code (FIG. 5, step S115; encoding processing step, first encoding function serving as the encoding processing function).

Further, in this operating procedure, the encoding process is carried out as a first encoding process: the post-conversion processing means 29a encodes the second intermediate code of the second information amount generated in the first generation process in a first encoding format whose encoded result has the second information amount, and outputs the second code.

(Detailed operation)
This operating procedure is described in more detail below.
First, the kanji character string supplied from the input device 10 is split into 2-byte values, which are supplied to the basic multilingual plane conversion unit 21.
The basic multilingual plane conversion unit 21 splits each supplied 2-byte value into an upper byte and a lower byte, calculates from these two values the position in the basic multilingual plane conversion table storage unit 31 at which the corresponding element (the destination UTF-16 value for the source 2-byte code) is stored, and then fetches that element (FIG. 5, step S111; first byte second code acquisition step, first byte second code acquisition function).
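The position calculation in step S111 can be sketched as follows. The text does not give the formula, so the layout used here (94 rows of 94 cells, byte values starting at 21h, 2-byte big-endian elements) and the names `element_offset` and `lookup` are assumptions made purely for illustration.

```python
ROW_BASE = 0x21      # assumed first valid byte value (not stated in the source)
CELLS_PER_ROW = 94   # assumed number of cells per row
ELEMENT_SIZE = 2     # each table element is a 2-byte UTF-16 value


def element_offset(code: int) -> int:
    """Byte offset of the element for a 2-byte source code (hypothetical layout)."""
    hi, lo = code >> 8, code & 0xFF  # split into upper and lower byte
    index = (hi - ROW_BASE) * CELLS_PER_ROW + (lo - ROW_BASE)
    return index * ELEMENT_SIZE


def lookup(table: bytes, code: int) -> int:
    """Fetch the 2-byte element by computed position; no search is involved."""
    off = element_offset(code)
    return int.from_bytes(table[off:off + ELEMENT_SIZE], "big")
```

Because the offset is computed arithmetically, the cost of a lookup does not depend on the number of entries in the table.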

Next, it is checked whether the value obtained by the basic multilingual plane conversion unit 21 lies in the range D800h to DFFFh (FIG. 5, step S112; range determination step, range determination function).
If the value checked in step S112 of FIG. 5 lies outside the range D800h to DFFFh, the value itself is the Unicode (UTF-16) conversion result, and the conversion is complete.
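The range determination of step S112 amounts to a single comparison. A minimal sketch, in which the function name `needs_additional_plane` is hypothetical:

```python
SURROGATE_FIRST = 0xD800  # lower bound of the range named in the text
SURROGATE_LAST = 0xDFFF   # upper bound of the range named in the text


def needs_additional_plane(value: int) -> bool:
    """True when the table element is a marker rather than a final UTF-16 value."""
    return SURROGATE_FIRST <= value <= SURROGATE_LAST
```

Values outside the range are returned to the caller unchanged as the UTF-16 result.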

If, on the other hand, the value obtained by the basic multilingual plane conversion unit 21 lies within the range D800h to DFFFh, the value is supplied to the additional plane conversion unit 22.
The additional plane conversion unit 22 splits the supplied 2 bytes into an upper byte and a lower byte, calculates from them the position in the additional plane conversion table storage unit 32 at which the corresponding element (the lower 2 bytes of the destination UTF-32 value for the source 2 bytes) is stored, and then fetches that element (FIG. 5, step S113; first intermediate code acquisition step, first intermediate code acquisition function).

The value obtained by the additional plane conversion unit 22 is then supplied to the Unicode plane-row-cell generation unit 23.
The Unicode plane-row-cell generation unit 23 takes the supplied 2 bytes as the lower 2 bytes of a UTF-32 value and adds 0002h (a fixed value), which indicates the additional kanji plane, as the upper 2 bytes, thereby generating a value that represents the Unicode plane, row, and cell (UTF-32 encoding format) (FIG. 5, step S114; second intermediate code generation step, second intermediate code generation function).
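Step S114 is a shift and an OR. The sketch below follows the text (fixed upper half 0002h, table element as the lower half); the function name `to_utf32_plane2` is hypothetical.

```python
KANJI_ADDITIONAL_PLANE = 0x0002  # fixed upper 2 bytes given in the text


def to_utf32_plane2(low16: int) -> int:
    """Combine the fixed plane value with the 2-byte table element."""
    if not 0 <= low16 <= 0xFFFF:
        raise ValueError("table element must fit in 2 bytes")
    return (KANJI_ADDITIONAL_PLANE << 16) | low16
```

For example, a table element of 000Bh yields the UTF-32 value 0002000Bh.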

The value generated by the Unicode plane-row-cell generation unit 23 is then supplied to the Unicode UTF-16 encoding unit 24.
The Unicode UTF-16 encoding unit 24 encodes the supplied 4-byte UTF-32 value into the 4 bytes of a UTF-16 surrogate pair (FIG. 5, step S115; second byte second code generation step, second byte second code generation function).
The result of this encoding is the conversion result, and the conversion is complete.
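The encoding in step S115 is the standard UTF-16 surrogate-pair computation. The patent does not spell it out, so the sketch below uses the general Unicode algorithm; the function name is hypothetical.

```python
from typing import Tuple


def utf32_to_surrogate_pair(cp: int) -> Tuple[int, int]:
    """Encode a supplementary-plane code point as a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    v = cp - 0x10000               # 20-bit offset into the supplementary planes
    high = 0xD800 | (v >> 10)      # upper 10 bits go into the high surrogate
    low = 0xDC00 | (v & 0x3FF)     # lower 10 bits go into the low surrogate
    return high, low
```

With the UTF-32 value 0002000Bh, this gives the pair D840h, DC0Bh.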

In this way, the additional plane conversion unit 22 operates only when the conversion destination is a Unicode character in the additional kanji plane (4 bytes), which keeps the execution cost low. Moreover, the additional plane conversion unit 22 obtains the destination value by directly referencing the additional plane conversion table storage unit 32 rather than searching it, so its own execution cost is also kept low.

Here, the table size of the basic multilingual plane conversion table storage unit 31 is the same as that of the code conversion table T1 in FIG. 10 of the related art, which is generally used for conversion from a 2-byte code to a 2-byte code. The table size of the additional plane conversion table storage unit 32 is the number of kanji in the additional kanji plane × 2 bytes, so the increase in storage used on the storage device 30 is kept small.

Next, the effects of the first embodiment are described in detail.
The first effect is that the code conversion table for converting 2-byte kanji codes into Unicode (a mixture of 2-byte and 4-byte codes) is about 50% smaller than when a general code conversion table is used.

The reason is that, in a general code conversion table, the element corresponding to one source kanji character is 4 bytes, whereas in the code conversion table of the first embodiment each element is 2 bytes.
That is, in the first embodiment the table in the basic multilingual plane conversion table storage unit 31 consists of a sequence of 2-byte elements, as in the related art shown in FIG. 10. Compared with a table built from a sequence of 4-byte elements as in FIG. 11, the table size (that is, the storage capacity) of the basic multilingual plane conversion table storage unit 31 is halved.

Specifically, the code conversion table additionally prepared for conversion to kanji in the Unicode additional kanji plane needs only (number of kanji in the additional plane) × 2 bytes.

For example, suppose there are 17369 (94 × 94 × 2) source 2-byte kanji codes, of which 303 characters are converted to Unicode in the additional kanji plane and the rest to Unicode in the basic multilingual plane. A general code conversion table then requires 17369 × 4 = 69476 bytes.
With the code conversion table of the first embodiment, the requirement is 17369 × 2 + 303 × 2 = 35344 bytes, which is about 50% of the size of the general code conversion table.
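The size comparison can be reproduced with a few lines of arithmetic. The counts are the ones quoted in the text; the variable names are illustrative.

```python
SOURCE_CODES = 17369            # number of source 2-byte kanji codes (from the text)
ADDITIONAL_PLANE_CHARS = 303    # characters mapped to the additional kanji plane

general_table = SOURCE_CODES * 4                                # 4 bytes per element
proposed_tables = SOURCE_CODES * 2 + ADDITIONAL_PLANE_CHARS * 2  # two 2-byte tables
ratio = proposed_tables / general_table

print(general_table, proposed_tables, f"{ratio:.1%}")  # 69476 35344 50.9%
```

The second table adds only 606 bytes, which is why the total stays close to half of the 4-byte layout.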

The second effect is that conversion to a character code in the basic multilingual plane runs at the same speed as conversion from a 2-byte kanji code to fixed-length 2-byte Unicode.

The reason is that conversion processing for the additional kanji plane is performed only when conversion to Unicode in the additional kanji plane is actually required.
The related art also includes a technique that reduces table size by separating the code conversion table for the basic multilingual plane from that for the additional plane. In that technique, however, each character conversion first searches the code conversion table for the additional kanji plane to confirm that the source character code is not a target of additional-plane conversion, and only then refers to the basic multilingual plane table. Conversion is therefore slowed even for characters that map to the basic multilingual plane.

That is, in the first embodiment, the additional plane conversion unit 22, the Unicode plane-row-cell generation unit 23, and the Unicode UTF-16 encoding unit 24, which convert to character codes in the additional kanji plane, operate only when converting to an additional-plane character code (4 bytes).
Because the basic multilingual plane conversion unit 21 can use the same conversion function as the related art, conversion to UTF-16 (2 bytes) runs at the same speed as in the related art.

The third effect is that conversion is fast even when the destination is a character code in the additional kanji plane.
The reason is that such a conversion yields its result with one reference to the basic multilingual plane conversion table 31a, one reference to the additional plane conversion table 32a, one plane-row-cell generation, and one Unicode encoding.

The plane-row-cell generation and the Unicode encoding can both be implemented as simple operations. In the related art, by contrast, code conversion requires searching the additional plane conversion table with the source 2-byte kanji code as the key, which slows conversion.
In the first embodiment, the additional plane conversion unit 22 does not search the additional plane conversion table storage unit 32; it obtains the value by referencing the table in the same manner as the basic multilingual plane conversion unit 21. The conversion result is therefore obtained faster than when the table is searched by key.
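The difference between searching and referencing can be illustrated with a short sketch. Both functions and the sample values in the usage note are hypothetical; in particular, indexing the table by `marker - 0xD800` is an assumption, since the text only says the position is calculated from the upper and lower bytes of the marker.

```python
from bisect import bisect_left
from typing import List, Optional


def search_lookup(keys: List[int], values: List[int], code: int) -> Optional[int]:
    """Related-art style: binary search keyed by the source kanji code."""
    i = bisect_left(keys, code)
    if i < len(keys) and keys[i] == code:
        return values[i]
    return None


def direct_lookup(table: List[int], marker: int) -> int:
    """Embodiment style: the marker in D800h-DFFFh is itself the table position."""
    return table[marker - 0xD800]  # assumed indexing scheme
```

With placeholder data such as `keys = [0x2E22, 0x2F42]` and `values = [0x000B, 0x0089]`, the search version needs a comparison loop, while `direct_lookup(values, 0xD801)` reaches the same element with one subtraction.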

In addition, in the first embodiment, the element recorded in the additional plane conversion table storage unit 32 needs only 2 bytes per additional-plane character. The table size (storage capacity) of the additional plane conversion table storage unit 32 is therefore 1/3 that of the code conversion table T4 shown in FIG. 13 (related art).

As described above, the additional plane conversion table 32a is composed of elements of the third information amount, which is smaller than the second information amount. In code conversion that generates a second code of the second information amount, the additional plane conversion table 32a is referenced to output a first intermediate code of the third information amount, and a predetermined arithmetic operation is applied to it to generate the second code of the second information amount. A conversion table of the second information amount, as in the related art, is therefore unnecessary. When Japanese codes conforming to JIS X 0213:2004 are converted to Unicode with the surrogate-pair extension, the storage area for the conversion tables is minimized, so the tables can be installed on devices with relatively little memory, such as embedded devices, while the processing time for code conversion is shortened and conversion is made faster.

Further, for conversion from 2-byte kanji codes to Unicode, the basic multilingual plane conversion table storage unit 31 referenced by the basic multilingual plane conversion unit 21 stores in advance the Unicode value (a 2-byte code in the UTF-16 encoding format) corresponding to each 2-byte kanji code. For a character whose destination is Unicode in the additional kanji plane (a 4-byte code), the table instead stores a 2-byte value assigned from the code range used for surrogate pairs (D800h to DFFFh), to which no character is ever assigned.

The additional plane conversion unit 22 is used only when the basic multilingual plane conversion unit 21 obtains a value in the range D800h to DFFFh. In that case, the additional plane conversion unit 22 refers to the additional plane conversion table storage unit 32, which holds 2 bytes per code in D800h to DFFFh and can be referenced directly by that value. From the 2 bytes obtained by the additional plane conversion unit 22, the Unicode plane-row-cell generation unit 23 and the Unicode UTF-16 encoding unit 24 output Unicode in the additional kanji plane (4 bytes).

As a result, the surrogate-pair code range of the Unicode basic multilingual plane (D800h to DFFFh) can be used as values that point into the additional plane conversion table 32a, the table can be built from 2-byte elements, and the increase in memory needed to support conversion from 2-byte kanji codes to Unicode (JIS X 0213:2004) is reliably kept small. Because the additional plane conversion table 32a is referenced rather than searched, conversion to Unicode in the additional kanji plane is also fast.
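Putting steps S111 to S115 together, the whole first-embodiment flow fits in one short function. This is a sketch under stated assumptions: tables are modeled as Python lists of 16-bit integers, the source code is assumed to be already reduced to a table index, and the additional plane table is assumed to be indexed by `marker - 0xD800`. The function name and the sample table contents in the usage note are hypothetical.

```python
from typing import List


def convert_to_utf16(code_index: int, bmp_table: List[int], add_table: List[int]) -> List[int]:
    """Return one UTF-16 unit, or a surrogate pair for additional-plane characters."""
    value = bmp_table[code_index]              # S111: reference table 31a
    if not 0xD800 <= value <= 0xDFFF:          # S112: range determination
        return [value]                         # value is already the UTF-16 result
    low16 = add_table[value - 0xD800]          # S113: reference table 32a (assumed indexing)
    cp = (0x0002 << 16) | low16                # S114: prepend the fixed plane value 0002h
    v = cp - 0x10000                           # S115: standard surrogate-pair encoding
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]
```

With `bmp_table = [0x4E9C, 0xD800]` and `add_table = [0x000B]`, index 0 returns the single unit 4E9Ch and index 1 returns the pair D840h, DC0Bh. The marker D800h stored in the first table is unrelated to the surrogate values in the output; it only selects an entry in the second table.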

The above description assumes that each means and unit shown as a component in the block diagrams is hardware such as an electronic circuit block. However, some or all of these components may instead be software modules, that is, functions realized by a program executable by a computer provided in the code conversion device 20A.

The hardware configuration in this case includes a processor serving as the control unit. That is, the physical configuration is, for example, one or more processors and one or more memories, while the software configuration expresses the functions that the processor performs under program control as the respective components.

Expressed functionally, in the dynamic state in which the processor is running under the program (that is, executing each procedure that makes up the program), the components corresponding to the executing portions are formed within the processor. In the static state in which the program is not being executed, the entire program that realizes the configuration of each means (or each part of the program included in that configuration) is stored in a storage area such as a memory.

Each unit (means) described above may be implemented as a computer functionalized by a program, together with the functions of that program, or as a device consisting of a plurality of electronic circuit blocks permanently functionalized by dedicated hardware.

In the above description, the operations of each step, the components of each unit, and the functions they provide may be implemented as a program (software program) and executed by a computer. The method described above can also be realized by a computer reading the program from a recording medium and executing it. That is, the program may be recorded on an information recording medium.

[Second Embodiment]
Next, a second embodiment of the code conversion system according to the present invention is described with reference to FIGS. 6 and 7.
FIG. 6 is a block diagram showing the code conversion system according to the second embodiment.

In FIG. 6, the code conversion device 20B of the code conversion system 100 in the second embodiment differs from the code conversion device 20A of the first embodiment in that a Unicode plane-0 row-cell generation unit 25 is added and the Unicode UTF-16 encoding unit 24 is replaced by a Unicode UTF-8 encoding unit 26.

(Detailed configuration)
The Unicode plane-0 row-cell generation unit 25 of the code conversion system takes the value referenced by the basic multilingual plane conversion unit 21 as the lower 2 bytes of a UTF-32 value and adds a fixed value indicating the basic multilingual plane as the upper 2 bytes.

The Unicode UTF-8 encoding unit 26 converts the plane-row-cell information (UTF-32 encoding format) generated by the Unicode plane-row-cell generation unit 23 or the Unicode plane-0 row-cell generation unit 25 into a UTF-8 value (1 to 4 bytes).

Further, the Unicode UTF-8 encoding unit 26, which is part of the encoding unit and provides the second encoding function, encodes the second intermediate code of the second information amount generated by the Unicode plane-row-cell generation unit 23 (the first generation unit) in the second encoding format (UTF-8), whose encoded result has a variable length, and outputs the second code.

Here, the Unicode plane-row-cell generation unit 23, the Unicode plane-0 row-cell generation unit 25, and the Unicode UTF-8 encoding unit 26 constitute the post-conversion processing means 29b.
The post-conversion processing means 29b includes the Unicode plane-0 row-cell generation unit 25 as a third generation unit, which applies a predetermined arithmetic operation to the out-of-range second code of the first information amount obtained from the basic multilingual plane conversion unit 21 and generates a third intermediate code of the second information amount.

The Unicode UTF-8 encoding unit 26 further has a function of encoding the third intermediate code of the second information amount generated by the Unicode plane-0 row-cell generation unit 25 in the second encoding format, whose encoded result has a variable length, and outputting the second code.

Further, the Unicode plane-0 row-cell generation unit 25, as the third generation unit, takes the second code of the first byte count obtained by the basic multilingual plane conversion unit 21 as the lower bytes and adds a predetermined second fixed value to it as the upper bytes.
Here, the second fixed value is the value that indicates the Unicode basic multilingual plane in the third encoding format (UTF-32).

The Unicode UTF-8 encoding unit 26 also converts the resulting value in the third encoding format (the second code of the first byte count as the lower bytes, with the predetermined second fixed value added as the upper bytes) into Unicode in the second encoding format, whose encoded result has a variable length, and outputs it.

The Unicode plane-row-cell generation unit 23, as the first generation unit, takes the first intermediate code obtained by the additional plane conversion unit 22 as the lower bytes and adds a predetermined first fixed value to it as the upper bytes.

Here, the first code is a kanji code, the second code is Unicode in the first or second encoding format, and the first fixed value is the value that indicates the Unicode additional kanji plane in a third encoding format (UTF-32) different from the first and second encoding formats.
The first encoding format is UTF-16 (UTF: Universal multi-octet coded character set Transformation Format), the second encoding format is UTF-8, and the third encoding format is UTF-32.

(Operation procedure)
Next, the operation of the second embodiment is described with reference to the flowchart of FIG. 7.

First, in the operating procedure of the code conversion system according to the second embodiment, the encoding process of the basic procedure of the first embodiment (FIG. 5, steps S111 to S115) is carried out as a second encoding process: the second intermediate code of the second information amount generated in the first generation process is encoded in the second encoding format, whose encoded result has a variable length, and the second code is output (FIG. 7, step S206; second encoding processing step, second encoding function).
Here, steps S201 to S204 of FIG. 7 are the same as steps S111 to S115 of FIG. 5 described above.

In the operating procedure of the code conversion system according to the second embodiment, the post-conversion process of the basic procedure of the first embodiment additionally executes a third generation process, which applies a predetermined arithmetic operation to the out-of-range second code of the first information amount obtained from the first conversion process and generates a third intermediate code of the second information amount (FIG. 7, step S205; third generation processing step, third generation function).

In the second encoding process, the third intermediate code of the second information amount generated in the third generation process is likewise encoded in the second encoding format, whose encoded result has a variable length, and the second code is output.

This is described in more detail below.
First, when the value obtained by the basic multilingual plane conversion unit 21 lies within the range D800h to DFFFh, it is supplied to the additional plane conversion unit 22, as in the first embodiment.
When the value obtained by the basic multilingual plane conversion unit 21 lies outside the range D800h to DFFFh ("No" at step S201 shown in FIG. 7), it is supplied to the Unicode plane-0 row-cell generation unit 25.

Next, the Unicode plane-0 row-cell generation unit 25 takes the supplied 2 bytes as the lower 2 bytes of a UTF-32 value and adds 0000h, which indicates the basic multilingual plane, as the upper 2 bytes, generating a value that represents the Unicode plane, row, and cell (UTF-32 encoding format) (FIG. 7, step S205).

The value generated by the Unicode plane 0 row-cell generation unit 25 is then supplied to the Unicode UTF-8 encoding unit 26. The Unicode UTF-8 encoding unit 26 encodes the supplied 4 bytes of UTF-32 into UTF-8 (variable length, 1 to 4 bytes) and outputs the result (FIG. 7, step S206).
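Steps S205 and S206 can be illustrated with a short Python sketch. It assumes big-endian byte order for the UTF-32 value, which the text does not specify, and the function names `plane0_to_utf32` and `utf32_to_utf8` are hypothetical. The bit layout used for UTF-8 is the standard one.

```python
def plane0_to_utf32(value: int) -> bytes:
    """Step S205 (sketch): prefix a 2-byte plane 0 value with 0000h."""
    if not 0 <= value <= 0xFFFF or 0xD800 <= value <= 0xDFFF:
        raise ValueError("expected a 2-byte value outside D800h to DFFFh")
    # Upper 2 bytes 0000h indicate the basic multilingual plane (assumed big-endian).
    return b"\x00\x00" + value.to_bytes(2, "big")


def utf32_to_utf8(utf32: bytes) -> bytes:
    """Step S206 (sketch): encode one 4-byte UTF-32 value as 1 to 4 UTF-8 bytes."""
    if len(utf32) != 4:
        raise ValueError("expected exactly 4 bytes")
    cp = int.from_bytes(utf32, "big")
    if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

For example, the table value 3042h becomes the UTF-32 bytes 00 00 30 42 and then the UTF-8 bytes E3 81 82. Because the encoder takes a full UTF-32 value, the same function also serves values produced on the additional plane path.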

Next, the effects of the second embodiment are described.
The second embodiment is configured so that the values acquired by the basic multilingual plane conversion unit 21 and the additional plane conversion unit 22 are converted to UTF-32 values and then supplied to the Unicode UTF-8 encoding unit 26. This makes it possible to convert 2-byte kanji codes to Unicode (UTF-8).

Further, in the second embodiment, the basic multilingual plane conversion table storage unit 31 and the additional plane conversion table storage unit 32 stored on the storage device 30, and the basic multilingual plane conversion unit 21 and the additional plane conversion unit 22 that refer to them, have the same configuration as for the Unicode (UTF-16) conversion of the first embodiment.
Therefore, the conversion units and conversion tables can be shared between the UTF-16 and UTF-8 conversions, and UTF-8 conversion can be performed without adding code conversion tables.

The other configurations, steps, functions, and effects are the same as in the first embodiment described above.
In the above description, the operation of each step, the constituent elements of each unit, and the functions they provide may also be implemented as a software program and executed by a computer.

[Third Embodiment]
Next, a third embodiment of the present invention is described with reference to FIGS. 8 to 10.

In FIG. 8, the code conversion system 200 according to the third embodiment differs from the first embodiment in its code conversion device 20C. Relative to the code conversion device 20A of the first embodiment disclosed in FIG. 1, the additional plane conversion unit 22 is replaced with an additional plane N conversion unit 27, and the Unicode plane-row-cell generation unit 23 is replaced with a Unicode plane N row-cell generation unit 28. Further, as shown in FIG. 8, the storage device 30 differs in that the additional plane conversion table storage unit 32 of the first embodiment disclosed in FIG. 1 is replaced with an additional plane N conversion table storage unit 38.

When executing code conversion, the additional plane N conversion unit 27 uses a value from D800h to DFFFh as a key to refer to the value stored in the additional plane N conversion table storage unit 38.
The Unicode plane N row-cell generation unit 28 receives a 3-byte value, uses it as the lower 3 bytes of a UTF-32 value, and adds 00h as the upper 1 byte.

Here, as shown in FIG. 9, the additional plane N conversion table 38a in the additional plane N conversion table storage unit 38 is a matrix table indexed by 2-byte values whose upper 1 byte ranges from D8h to DFh and whose lower 1 byte ranges from 00h to FFh. Each element stores the plane, row, and cell information of the destination Unicode character (the lower 3 bytes of its UTF-32 value).
In the storage device 30, the table is laid out as the 3-byte elements stored contiguously in ascending order of the 2-byte source value.
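The contiguous layout described above can be sketched in Python as follows. This is an illustration under assumptions that the text does not state: each 3-byte element is stored big-endian, unassigned entries are filled with zero bytes, and the name `pack_additional_plane_n_table` is hypothetical.

```python
ELEMENT_SIZE = 3                        # bytes per table element
FIRST_KEY = 0xD800                      # lowest 2-byte source value
LAST_KEY = 0xDFFF                       # highest 2-byte source value
KEY_COUNT = LAST_KEY - FIRST_KEY + 1    # 8 rows x 256 columns = 2048 elements


def pack_additional_plane_n_table(mapping: dict[int, int]) -> bytes:
    """Serialize {2-byte key: code point} into contiguous 3-byte elements."""
    table = bytearray(KEY_COUNT * ELEMENT_SIZE)  # zero-filled (assumption)
    for key, code_point in mapping.items():
        if not FIRST_KEY <= key <= LAST_KEY:
            raise ValueError("key must be within D800h to DFFFh")
        if not 0 <= code_point <= 0x10FFFF:
            raise ValueError("code point must fit the Unicode range")
        # Elements are ordered by ascending key, so the offset is linear in the key.
        offset = (key - FIRST_KEY) * ELEMENT_SIZE
        table[offset:offset + ELEMENT_SIZE] = code_point.to_bytes(3, "big")
    return bytes(table)
```

With this layout the whole table occupies 2048 × 3 = 6144 bytes, and the element for key D800h sits at offset 0.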

Here, in the additional plane N conversion table 38a, the third information amount (3 bytes) is larger than the first information amount (2 bytes, an example of the first number of bytes) and smaller than the second information amount (4 bytes, an example of the second number of bytes).

In more detail, the additional plane N conversion table 38a, which serves as a third conversion table and is an example of the additional plane conversion table, has the following data structure. A specific-range second code of the first number of bytes, that is, a second code of the first number of bytes in the basic multilingual plane conversion table 31 that falls within a predetermined range, is split into at least an upper part and a lower part. One part is the split specific-range second code of the upper byte count (the upper 1 byte, for example D8h), and the other is the split specific-range second code of the lower byte count (the lower 1 byte, for example 00h). These two parts are associated with the conversion destination, a fourth intermediate code (for example 02000Bh). The fourth intermediate code is an example of the first intermediate code and has a third number of bytes (for example 3 bytes) that lies between the first number of bytes and the second number of bytes.

The additional plane N conversion unit 27 described above has a function of referring to the additional plane N conversion table 38a and converting the specific range code into the first intermediate code of the third information amount (3 bytes).

Here, the Unicode plane N row-cell generation unit 28 and the Unicode UTF-16 encoding unit 24 constitute post-conversion processing means 29c.
The post-conversion processing means 29c includes the Unicode plane N row-cell generation unit 28 as a second generation unit, which applies a predetermined arithmetic operation to the first intermediate code of the third information amount obtained by the additional plane N conversion unit 27 and generates a second intermediate code of the second information amount.

The post-conversion processing means 29c further includes the Unicode UTF-16 encoding unit 24 as an encoding unit, which encodes the second intermediate code of the second information amount generated by the Unicode plane N row-cell generation unit 28 (second generation unit) in a preset encoding format and outputs the second code.
The other configurations are the same as in the first embodiment described above.

(Operation procedure)
Next, the operation of the third embodiment is described with reference to FIG. 10.
First, in the third embodiment, the second conversion process of the basic operation procedure of the first embodiment is executed as follows. When the third information amount in the additional plane N conversion table 38a is larger than the first information amount and smaller than the second information amount, the specific range code is converted into the first intermediate code of the third information amount by referring to the additional plane N conversion table 38a (FIG. 10, step S303; second conversion processing step, second conversion function).

Further, when the post-conversion process is executed, a second generation process is executed first. It applies a predetermined arithmetic operation to the first intermediate code of the third information amount obtained in the second conversion process and generates the second intermediate code of the second information amount (FIG. 10, step S304; second generation processing step, second generation function).

Subsequently, an encoding process is executed that encodes the second intermediate code of the second information amount generated in the second generation process in a preset encoding format and outputs the second code (FIG. 10, step S305; encoding processing step, encoding processing function).

This is described in detail below.
First, when the value acquired by the basic multilingual plane conversion unit 21 is within the range D800h to DFFFh, the value is supplied to the additional plane N conversion unit 27.

Next, the additional plane N conversion unit 27 splits the supplied 2 bytes into an upper 1 byte and a lower 1 byte. From these two values it calculates the position in the additional plane N conversion table storage unit 38 at which the element is stored (the lower 3 bytes of the destination UTF-32 value corresponding to the 2 source bytes), and then refers to and acquires the element (FIG. 10, step S303).
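The position calculation of step S303 can be sketched as below. The sketch assumes a table of 2048 contiguous 3-byte big-endian elements in ascending key order; the byte order and the function name `lookup_additional_plane_n` are assumptions made for illustration.

```python
def lookup_additional_plane_n(table: bytes, code: int) -> int:
    """Step S303 (sketch): fetch the 3-byte element for a key in D800h to DFFFh."""
    if not 0xD800 <= code <= 0xDFFF:
        raise ValueError("key must be within D800h to DFFFh")
    upper = code >> 8      # upper 1 byte, D8h to DFh, selects the matrix row
    lower = code & 0xFF    # lower 1 byte, 00h to FFh, selects the matrix column
    row = upper - 0xD8
    # 256 elements per row, 3 bytes per element.
    offset = (row * 256 + lower) * 3
    element = table[offset:offset + 3]
    if len(element) != 3:
        raise ValueError("table is shorter than expected")
    return int.from_bytes(element, "big")
```

Because every element has the same fixed size, the lookup needs only one multiplication and one slice, with no search.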

The value acquired by the additional plane N conversion unit 27 is supplied to the Unicode plane N row-cell generation unit 28. The Unicode plane N row-cell generation unit 28 then uses the supplied 3 bytes (plane, row, and cell information) as the lower 3 bytes of a UTF-32 value and adds 00h (a fixed value) as the upper 1 byte to generate the Unicode UTF-32 value (FIG. 10, step S304).
The value generated by the Unicode plane N row-cell generation unit 28 is then supplied to the Unicode UTF-16 encoding unit 24.
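Step S304 and the subsequent UTF-16 encoding can be sketched in Python as follows. The function names `plane_n_to_utf32` and `utf32_to_utf16` are hypothetical, big-endian byte order is assumed, and the surrogate pair arithmetic is the standard UTF-16 rule rather than anything specific to this patent.

```python
def plane_n_to_utf32(element: bytes) -> bytes:
    """Step S304 (sketch): prefix the 3-byte element with the fixed value 00h."""
    if len(element) != 3:
        raise ValueError("expected exactly 3 bytes")
    return b"\x00" + element


def utf32_to_utf16(utf32: bytes) -> bytes:
    """Encode one 4-byte UTF-32 value as big-endian UTF-16 (2 or 4 bytes)."""
    if len(utf32) != 4:
        raise ValueError("expected exactly 4 bytes")
    cp = int.from_bytes(utf32, "big")
    if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x10000:
        return cp.to_bytes(2, "big")       # plane 0: a single 16-bit unit
    v = cp - 0x10000                       # 20-bit offset into planes 1 to 16
    high = 0xD800 | (v >> 10)              # high surrogate carries the top 10 bits
    low = 0xDC00 | (v & 0x3FF)             # low surrogate carries the bottom 10 bits
    return high.to_bytes(2, "big") + low.to_bytes(2, "big")
```

For the example element 02000Bh, the UTF-32 value is 00 02 00 0B and the UTF-16 output is the surrogate pair D840 DC0B. Since the element carries the plane number, the same path handles any plane from 1 to 16.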

Next, the effects of the third embodiment are described.
In the third embodiment, the additional plane N conversion table 38a in the additional plane N conversion table storage unit 38 has a structure in which plane information is added to each element of the additional plane conversion table 32a.
Consequently, the conversion destination is not limited to Unicode characters in the supplementary ideographic plane (plane 2); conversion to other characters contained in the additional planes is also possible.

The other configurations, steps, functions, and effects are the same as in the first embodiment described above.
In the above description, the operation of each step, the constituent elements of each unit, and the functions they provide may also be implemented as a software program and executed by a computer.

[Other Variations]
The device and method according to the present invention have been described with reference to several specific embodiments, but various modifications can be made to the embodiments described in this specification within the spirit and scope of the present invention.

Such a device or system may exist on its own or may be used while incorporated in some apparatus (for example, as a function of an electronic device). The idea of the invention is not limited to these forms and covers various aspects.
Furthermore, an electronic device equipped with the code conversion system may operate under program control and may have network-related communication functions. It may be any computer, such as a desktop or laptop computer, another information device with wireless or wired communication functions, an information appliance (television, portable music player, game console), or a similar computer, whether mobile or stationary.

Furthermore, the present invention relates to character code conversion. It can be used in particular for a method of converting a character string of 2-byte kanji codes into Unicode conforming to JIS X 0213:2004, and for storing, as 2-byte fixed-length values, the conversion table used when converting to the Unicode additional planes (4-byte codes).

The scope of the present invention is not limited to the illustrated examples. The above embodiments include various stages, and various inventions can be extracted through appropriate combinations of the disclosed constituent features. That is, the invention also covers combinations of the above embodiments with one another, and of any of them with any of the variations.
The present invention has been described above with reference to the embodiments, but it is not limited to those embodiments.

The present invention can be used in computer systems in general.

[Description of Symbols]
1 Code conversion system
10 Input device
20A, 20B, 20C Code conversion device
21 Basic multilingual plane conversion unit
22 Additional plane conversion unit
23 Unicode plane-row-cell generation unit (first generation unit)
24 Unicode UTF-16 encoding unit (encoding unit, first encoding function)
25 Unicode plane 0 row-cell generation unit (third generation unit)
26 Unicode UTF-8 encoding unit (encoding unit, second encoding function)
27 Additional plane N conversion unit
28 Unicode plane N row-cell generation unit (second generation unit)
29a, 29b, 29c Post-conversion processing means
30 Storage device
31 Basic multilingual plane conversion table storage unit
31a Basic multilingual plane conversion table
32 Additional plane conversion table storage unit
32a Additional plane conversion table
38 Additional plane N conversion table storage unit
38a Additional plane N conversion table
40 Output device

Claims

A code conversion system comprising:
an input device for inputting a first code of a first information amount, the first information amount being a number of bytes allocated in advance to one piece of character information;
a code conversion device that, by referring to code conversion tables provided in advance, converts the input first code into a second code of a code system different from that of the first code, the second code having, for the character information, either the first information amount or a second information amount with a larger number of bytes;
a storage device attached to the code conversion device and comprising a basic multilingual plane conversion table, in which the first code of the first information amount is associated with the second code of the first information amount and whose elements are values of the first information amount, and an additional plane conversion table, in which a partial range of the values of the first information amount is associated with a first intermediate code of a third information amount having fewer bytes than the second information amount and whose elements are values of the third information amount; and
an output device that outputs the second code of the first information amount or the second information amount converted by the code conversion device,
wherein the code conversion device has a second-code generation and output function that refers to the basic multilingual plane conversion table and, when the reference result falls within the partial range of the values of the first information amount, refers to the additional plane conversion table to output the first intermediate code of the third information amount, and applies a predetermined arithmetic operation to the first intermediate code of the third information amount to generate and output the second code of the second information amount.

The code conversion system according to claim 1, wherein
the additional plane conversion table of the storage device has a data structure in which a specific range code, being a code within a predetermined range of the group of second codes of the first information amount in the basic multilingual plane conversion table, is associated with the first intermediate code of the third information amount, which has fewer bytes than the second information amount, and
the code conversion device comprises:
a basic multilingual plane conversion unit that determines whether the second code of the first information amount obtained by referring to the basic multilingual plane conversion table is within the range and, when it is determined to be outside the range, converts the first code into the second code of the first information amount and outputs the second code;
an additional plane conversion unit that operates when the basic multilingual plane conversion unit determines that the second code of the first information amount is within the range, and that refers to the additional plane conversion table to convert the specific range code into the first intermediate code of the third information amount; and
post-conversion processing means for applying a predetermined arithmetic operation to the first intermediate code of the third information amount to generate and output the second code of the second information amount.

The code conversion system according to claim 2, wherein
in the additional plane conversion table, the third information amount is the same as the first information amount,
the additional plane conversion unit has a function of referring to the additional plane conversion table to convert the specific range code into the first intermediate code of the first information amount, and
the post-conversion processing means comprises:
a first generation unit that applies a predetermined arithmetic operation to the first intermediate code of the first information amount obtained by the additional plane conversion unit to generate a second intermediate code of the second information amount; and
an encoding unit that encodes the second intermediate code of the second information amount generated by the first generation unit in a preset encoding format and outputs the second code.

The code conversion system according to claim 2, wherein
in the additional plane conversion table, the third information amount is larger than the first information amount and smaller than the second information amount,
the additional plane conversion unit has a function of referring to the additional plane conversion table to convert the specific range code into the first intermediate code of the third information amount, and
the post-conversion processing means comprises:
a second generation unit that applies a predetermined arithmetic operation to the first intermediate code of the third information amount obtained by the additional plane conversion unit to generate a second intermediate code of the second information amount; and
an encoding unit that encodes the second intermediate code of the second information amount generated by the second generation unit in a preset encoding format and outputs the second code.

The code conversion system according to claim 3, wherein
the encoding unit has a first encoding function of encoding the second intermediate code of the second information amount generated by the first generation unit in a first encoding format, in which the encoding result has the second information amount, and outputting the second code.

The code conversion system according to claim 3, wherein
the encoding unit has a second encoding function of encoding the second intermediate code of the second information amount generated by the first generation unit in a second encoding format, in which the information amount of the encoding result is variable length, and outputting the second code.

The code conversion system according to claim 6, wherein
the post-conversion processing means comprises a third generation unit that applies a predetermined arithmetic operation to the out-of-range second code of the first information amount obtained from the basic multilingual plane conversion unit to generate a third intermediate code of the second information amount, and
the second encoding function further includes a function of encoding the third intermediate code of the second information amount generated by the third generation unit in the second encoding format, in which the information amount of the encoding result is variable length, and outputting the second code.

The code conversion system according to claim 7, wherein
the first information amount is a first number of bytes,
the second information amount is a second number of bytes,
the additional plane conversion table has a matrix-table data structure in which a specific range second code of the first number of bytes, being a code within a predetermined range of the group of second codes of the first number of bytes in the basic multilingual plane conversion table, is split into at least an upper byte and a lower byte, and the split upper-byte part and the split lower-byte part of the specific range second code are associated with the first intermediate code of the conversion destination, and
the additional plane conversion unit has a function of calculating, from the value of the upper-byte part and the value of the lower-byte part, the position at which the element of the additional plane conversion table is stored, and acquiring the value of the first intermediate code.

The code conversion system according to claim 8, wherein
the first generation unit has a function of using the first intermediate code obtained by the second conversion means as lower bytes and adding a predetermined first fixed value to the lower bytes as upper bytes.

The code conversion system according to claim 9, wherein
the third generation unit has a function of using the second code of the first number of bytes obtained by the first conversion means as lower bytes and adding a predetermined second fixed value to the lower bytes as upper bytes.

The code conversion system according to claim 10, wherein
the first code is a kanji code,
the second code is Unicode in either the first encoding format or the second encoding format, and
the first fixed value is a value indicating an additional plane of Unicode characters in a third encoding format different from the first encoding format and the second encoding format.

The code conversion system according to claim 11, wherein
the second fixed value is a value indicating the basic multilingual plane of Unicode in the third encoding format.

The code conversion system according to claim 12, wherein
the second encoding function of the encoding unit has a function of converting a value of the third encoding format, obtained by using the second code of the first number of bytes as lower bytes and adding the predetermined second fixed value to the lower bytes as upper bytes, into Unicode of the second encoding format, in which the information amount of the encoding result is variable length, and outputting the result.

The code conversion system according to claim 13, wherein
the first encoding format is UTF-16 (UTF: Universal multi-octet coded character set Transformation Format), the second encoding format is UTF-8, and the third encoding format is UTF-32.

A code conversion device that receives a first code of a first information amount allocated in advance to one piece of character information and, by referring to one or both of a basic multilingual plane conversion table and an additional plane conversion table, converts the first code into a second code of a code system different from that of the first code, the second code having, for the character information, either the first information amount or a larger second information amount, and outputs the second code, wherein
the basic multilingual plane conversion table has a data structure in which the first code of the first information amount is associated with the second code of the first information amount and whose elements are values of the first information amount,
the additional plane conversion table has a data structure in which a partial range of the elements is associated with a first intermediate code of a third information amount smaller than the second information amount and whose elements are values of the third information amount, and
the code conversion device is configured so that, when the result of referring to the basic multilingual plane conversion table for the first code falls within the partial range of the elements, it refers to the additional plane conversion table to output the first intermediate code of the third information amount corresponding to the first code, and applies a predetermined arithmetic operation to the first intermediate code of the third information amount to generate and output the second code of the second information amount.

A code conversion method for a code conversion system comprising: a code conversion device that, when a first code of a first information amount, being a number of bytes allocated in advance to one piece of character information, is input, refers to code conversion tables provided in advance and converts the first code into a second code of a code system different from that of the first code, the second code having, for the character information, either the first information amount or a second information amount with a larger number of bytes; a basic multilingual plane conversion table attached to the code conversion device, in which the first code of the first information amount is associated with the second code of the first information amount and whose elements are values of the first information amount; and an additional plane conversion table in which a partial range of the values of the first information amount is associated with a first intermediate code of a third information amount having fewer bytes than the second information amount and whose elements are values of the third information amount, the method comprising:
converting, by the code conversion device and by referring to one or both of the basic multilingual plane conversion table and the additional plane conversion table, the input first code into the second code of a code system different from that of the first code, the second code having, for the character information, either the first information amount or the larger second information amount; and
thereafter outputting the converted second code of the first information amount or the second information amount to the outside,
wherein, in the conversion by the code conversion device,
the basic multilingual plane conversion table is referred to first and, when the reference result falls within the partial range of the elements, the additional plane conversion table is then referred to so as to output the first intermediate code of the third information amount, and a predetermined arithmetic operation is applied to the first intermediate code of the third information amount to generate the second code of the second information amount.

The code conversion method according to claim 16, wherein
the additional plane conversion table is provided in advance with a data structure in which a specific range code, being a code within a predetermined range of the group of second codes of the first information amount in the basic multilingual plane conversion table, is associated with the first intermediate code of the third information amount, which is smaller than the second information amount, and
in the conversion,
it is first determined whether the second code of the first information amount obtained by referring to the basic multilingual plane conversion table is within the range and, when it is determined to be outside the range, a basic multilingual plane conversion unit of the code conversion device operates to convert the first code into the second code of the first information amount and output the second code.

The code conversion method according to claim 16, wherein
The additional plane conversion table includes a specific range code within a predetermined range in the second code group of the first information amount in the basic multilingual plane conversion table, and a third less than the second information amount. A data structure associated with the first intermediate code of the information amount;
For the conversion,
First, it is determined whether or not the second code of the first information amount obtained by referring to the basic multilingual plane conversion table is within the range, and if it is determined to be within the range, an additional plane conversion unit of the code conversion device is activated to perform conversion processing that converts the specific range code into the first intermediate code of the third information amount with reference to the additional plane conversion table,
Post-conversion processing is then executed in which post-conversion processing means of the code conversion device performs a predetermined arithmetic process on the first intermediate code of the third information amount to generate and output the second code of the second information amount. A code conversion method characterized by that.
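
The two-stage lookup of claims 17 and 18 can be sketched as follows. This is a minimal Python illustration, not the claimed implementation: the table entries, the specific range 0xE000 to 0xE0FF, the 3-byte width of the first intermediate code, the zero-extension used as the "predetermined arithmetic process", and the names `BMP_TABLE`, `ADDITIONAL_PLANE_TABLE`, `convert`, and `post_convert` are all assumptions made for the example.

```python
# Hypothetical tables with two illustrative entries; a real table would
# cover the whole input code set.
BMP_TABLE = {
    0x88EA: 0x4E00,  # result outside the specific range: already the second code
    0xFB40: 0xE000,  # result inside the specific range: needs the second table
}
ADDITIONAL_PLANE_TABLE = {
    0xE000: 0x02000B,  # specific range code -> 3-byte first intermediate code
}
RANGE_LO, RANGE_HI = 0xE000, 0xE0FF  # assumed "predetermined range"


def post_convert(first_intermediate: int) -> bytes:
    """Assumed arithmetic: zero-extend the 3-byte value to 4 bytes."""
    return first_intermediate.to_bytes(4, "big")


def convert(first_code: int) -> bytes:
    """Return the second code as 2 bytes (claim 17) or 4 bytes (claim 18)."""
    second_code = BMP_TABLE[first_code]
    if not (RANGE_LO <= second_code <= RANGE_HI):
        # Outside the range: the basic multilingual plane result is output as is.
        return second_code.to_bytes(2, "big")
    # Inside the range: look up the additional plane table, then post-process.
    first_intermediate = ADDITIONAL_PLANE_TABLE[second_code]
    return post_convert(first_intermediate)
```

In this sketch only the entries that fall in the specific range cost a second lookup; every other character is resolved by the first table alone.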

The code conversion method according to claim 18, wherein
In executing the conversion process,
When the third information amount in the additional plane conversion table is the same as the first information amount, the specific range code is converted into the first intermediate code of the first information amount with reference to the additional plane conversion table,
When executing the post-conversion processing,
First, a first generation process is executed in which a predetermined arithmetic process is performed on the first intermediate code of the first information amount obtained in the second conversion process to generate a second intermediate code of the second information amount,
Subsequently, an encoding process is executed in which the second intermediate code of the second information amount generated in the first generation process is encoded in a preset encoding format and the second code is output. A code conversion method characterized by that.

The code conversion method according to claim 18, wherein
When executing the conversion process,
When the third information amount in the additional plane conversion table is larger than the first information amount and smaller than the second information amount, the specific range code is converted into the first intermediate code of the third information amount with reference to the additional plane conversion table,
In executing the post-conversion processing,
First, a second generation process is executed in which a predetermined arithmetic process is performed on the first intermediate code of the third information amount obtained in the second conversion process to generate a second intermediate code of the second information amount,
Subsequently, an encoding process is executed in which the second intermediate code of the second information amount generated in the second generation process is encoded in a predetermined encoding format and the second code is output. A code conversion method characterized by that.
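
Claims 19 and 20 differ only in the width of the table entry that feeds the generation process. The sketch below assumes, purely for illustration, that a 2-byte entry stores the low 16 bits of a code point in one fixed plane (here plane 2, offset 0x20000) and that a 3-byte entry stores the full code point. Neither the offset nor the function names `first_generation` and `second_generation` come from the claims.

```python
PLANE_OFFSET = 0x20000  # assumed: every 2-byte entry belongs to plane 2


def first_generation(first_intermediate: bytes) -> bytes:
    """Claim 19 case: 2-byte entry -> 4-byte second intermediate code."""
    if len(first_intermediate) != 2:
        raise ValueError("expected a 2-byte first intermediate code")
    value = PLANE_OFFSET + int.from_bytes(first_intermediate, "big")
    return value.to_bytes(4, "big")


def second_generation(first_intermediate: bytes) -> bytes:
    """Claim 20 case: 3-byte entry -> 4-byte second intermediate code."""
    if len(first_intermediate) != 3:
        raise ValueError("expected a 3-byte first intermediate code")
    # Assumed arithmetic: zero-extend by one leading byte.
    return b"\x00" + first_intermediate
```

Under these assumptions both paths yield the same 4-byte value for the same character. The 2-byte layout makes each table entry one byte smaller, while the 3-byte layout can address any plane.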

The code conversion method according to claim 19, wherein
In executing the encoding process,
A first encoding process is executed in which the second intermediate code of the second information amount generated in the first generation process is encoded in a first encoding format whose encoding result has the second information amount, and the second code is then output. A code conversion method characterized by that.
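
One encoding whose result is always 4 bytes for a supplementary-plane character is the UTF-16 surrogate pair described in the background section. Treating that as the first encoding format is an assumption of this sketch, as are the function name `encode_surrogate_pair` and the big-endian byte order.

```python
def encode_surrogate_pair(second_intermediate: bytes) -> bytes:
    """Encode a 4-byte code point value as a 4-byte UTF-16 surrogate pair."""
    code_point = int.from_bytes(second_intermediate, "big")
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # lower 10 bits
    return high.to_bytes(2, "big") + low.to_bytes(2, "big")
```

The input and the output both occupy 4 bytes, which matches the condition that the encoding result has the second information amount.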

The code conversion method according to claim 19, wherein
In executing the encoding process,
A second encoding process is executed in which the second intermediate code of the second information amount generated in the first generation process is encoded in a second encoding format whose encoding result has a variable-length information amount, and the second code is then output. A code conversion method characterized by that.

The code conversion method according to claim 22,
In executing the post-conversion processing,
A third generation process is executed in which a predetermined arithmetic process is performed on the second code of the first information amount outside the range, obtained in the first conversion process, to generate a third intermediate code of the second information amount,
In executing the second encoding process,
The third intermediate code of the second information amount generated in the third generation process is encoded in the second encoding format whose encoding result has a variable-length information amount, and the second code is then output. A code conversion method characterized by that.
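
For the variable-length case of claims 22 and 23, UTF-8 is a natural candidate, although the claims do not name it. In the sketch below, `third_generation` zero-extends an out-of-range 2-byte second code to 4 bytes (an assumed form of the "predetermined arithmetic processing"), and `encode_variable_length` is a hand-written UTF-8 encoder. Both names are hypothetical.

```python
def third_generation(second_code: bytes) -> bytes:
    """Assumed arithmetic: zero-extend a 2-byte code to 4 bytes."""
    if len(second_code) != 2:
        raise ValueError("expected a 2-byte second code")
    return b"\x00\x00" + second_code


def encode_variable_length(intermediate: bytes) -> bytes:
    """Encode a 4-byte code point value as UTF-8 (1 to 4 bytes)."""
    cp = int.from_bytes(intermediate, "big")
    if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

Widening every code to the same 4-byte intermediate form first lets one encoder serve both the out-of-range codes of claim 23 and the additional-plane codes of claim 22.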

In a code conversion system comprising: a code conversion device which, when a first code having a first information amount, being the number of bytes allocated in advance to one piece of character information, is input, converts the first code into a second code of a code system different from that of the first code with reference to code conversion tables provided in advance; and, provided for the code conversion device, a basic multilingual plane conversion table in which the first code of the first information amount is associated with the second code of the first information amount and whose elements are values of the first information amount, and an additional plane conversion table in which some of the values of the first information amount are associated with a first intermediate code of a third information amount, whose number of bytes is smaller than a second information amount, and whose elements are values of the third information amount,
A code conversion function for converting the input first code of the first information amount, with reference to one or both of the basic multilingual plane conversion table and the additional plane conversion table, into a second code which differs from the first code and in which the number of bytes of the character information is the first information amount or the second information amount larger than the first information amount, and an output processing function for outputting the converted second code of the first information amount or the second information amount, are provided,
The code conversion function includes:
A first intermediate code output processing function which, in the conversion of the first code, first refers to the basic multilingual plane conversion table and, when the reference result falls within a partial range of the elements, then refers to the additional plane conversion table and outputs the first intermediate code of the third information amount; and a second code generation processing function which performs a predetermined arithmetic process on the output first intermediate code to generate and output the second code of the second information amount,
A code conversion program characterized by causing a computer provided in the code conversion device to execute each of the above processing functions.

In the code conversion program according to claim 24,
The additional plane conversion table is provided in advance with a data structure that associates a specific range code, lying within a predetermined range of the second code group of the first information amount in the basic multilingual plane conversion table, with the first intermediate code of a third information amount smaller than the second information amount,
In the code conversion function,
A first conversion function that determines whether or not the second code of the first information amount obtained by referring to the basic multilingual plane conversion table is within the range and, when it is determined to be outside the range, operates to convert the first code into the second code of the first information amount and output it,
A second conversion function that operates when the first conversion function determines that the second code is within the range, and converts the specific range code into the first intermediate code of the third information amount with reference to the additional plane conversion table,
And a post-conversion processing function that performs a predetermined arithmetic process on the first intermediate code of the third information amount to generate and output the second code of the second information amount, are provided,
A code conversion program characterized by causing the computer to execute these functions.

The code conversion program according to claim 25,
In the second conversion function,
When the third information amount in the additional plane conversion table is the same as the first information amount, the specific range code is converted into the first intermediate code of the first information amount with reference to the additional plane conversion table,
In the post-conversion processing function,
A first generation function that generates a second intermediate code of the second information amount by performing a predetermined arithmetic process on the first intermediate code of the first information amount obtained by the second conversion function, and an encoding processing function that encodes the second intermediate code of the second information amount generated by the first generation function in a preset encoding format and outputs the second code, are provided,
A code conversion program for causing a computer to execute these.

The code conversion program according to claim 25,
In the second conversion function,
When the third information amount in the additional plane conversion table is larger than the first information amount and smaller than the second information amount, the specific range code is converted into the first intermediate code of the third information amount with reference to the additional plane conversion table,
In the post-conversion processing function,
A second generation function that generates a second intermediate code of the second information amount by performing a predetermined arithmetic process on the first intermediate code of the third information amount obtained by the second conversion function, and an encoding processing function that encodes the second intermediate code of the second information amount generated by the second generation function in a preset encoding format and outputs the second code, are provided,
A code conversion program for causing a computer to execute these.

The code conversion program according to claim 26, wherein
In the encoding processing function,
A first encoding function is provided that encodes the second intermediate code of the second information amount generated by the first generation function in a first encoding format whose encoding result has the second information amount, and outputs the second code,
A code conversion program for causing a computer to execute this.

The code conversion program according to claim 26, wherein
In the encoding processing function,
A second encoding function is provided that encodes the second intermediate code generated by the first generation function in a second encoding format whose encoding result has a variable-length information amount, and outputs the second code,
A code conversion program for causing a computer to execute this.

The code conversion program according to claim 29,
The post-conversion processing function includes a third generation function that performs a predetermined arithmetic process on the second code of the first information amount outside the range, obtained by the first conversion function, to generate a third intermediate code of the second information amount,
The second encoding function includes a function of encoding the third intermediate code of the second information amount generated by the third generation function in the second encoding format whose encoding result has a variable-length information amount, and outputting the second code,
A code conversion program for causing a computer to execute these.