JP6491438B2

JP6491438B2 - Migration support device

Info

Publication number: JP6491438B2
Application number: JP2014174745A
Authority: JP
Inventors: 孝介坂井; 佳範城代; 厚志粟河
Original assignee: Hitachi Social Information Services Ltd
Current assignee: Hitachi Social Information Services Ltd
Priority date: 2014-08-29
Filing date: 2014-08-29
Publication date: 2019-03-27
Anticipated expiration: 2034-08-29
Also published as: JP2016051235A; WO2016031959A1; CN106663020B; CN106663020A

Description

The present invention relates to so-called legacy migration (hereinafter sometimes simply called “migration”), and more particularly to migration that involves switching character code systems.

In recent years, many companies, local governments, and similar organizations have sought migration services for moving a business system (legacy system) that has run on a current computer to a new computer. One form of migration is, for example, migration from a general-purpose host computer (or office computer) to an open-system server computer running an OS (Operating System) such as WINDOWS (registered trademark), UNIX (registered trademark), or LINUX (registered trademark). Many migration techniques have been published; one example is disclosed in Patent Document 1.

However, a host computer that handles data in a given character code system (e.g., EBCDIK (Extended Binary Coded Decimal Interchange Kana Code), KEIS (Kanji processing Extended Information System), JIS8, or SJIS (Shift JIS); hereinafter sometimes called the “old character code system”) may have registered a large number of user-defined characters (gaiji) that are not registered as standard in that character code system (the user-defined character area of the host computer holds 9024 characters). In that case, migration to a server computer running an OS that can provide only a small user-defined character area (the area provided by WINDOWS holds 1880 characters) cannot be achieved.

In addition, many companies, local governments, and the like have recently requested that the new computer support more characters than the current computer, on which the number of usable characters is limited. Specifically, with internationalization, there are requests to be able to represent not only kanji but also foreign characters such as simplified Chinese characters and Hangul, and requests to be able to represent old-form kanji so that individuals' names can be written correctly.

As a countermeasure, migration to a new computer that handles a larger character code system, such as UTF (Unicode Transformation Format)-8 or UTF-16, as the new character code system is conceivable.

Migration mainly involves (1) migrating the existing data on the business system and (2) migrating the existing programs that run on the business system and access that data. The character codes of the existing character data to be migrated must therefore be converted to match the new character code system. Existing programs (for example, programs written in COBOL) must likewise be converted so that they can read the character data whose character codes have been converted.

In the prior art, however, converting the programs is far more complicated and difficult than converting the character codes assigned to the character data. This problem arises from two facts: depending on the combination of old and new character code systems, the number of bytes in the byte string representing a character may differ between the two systems even for the same character; and the areas in memory that an existing program designates for storing the byte strings of characters have a fixed length. When a program is converted, its description must be modified as appropriate with these facts in mind (without modification, character data overflows or shifts position, and the program acquires character data different from the intended data). However, because the modification pattern depends on the byte strings of the characters stored in each area, the modification is very complicated and difficult work. The prior art, including the technique of Patent Document 1, offers no improvement for this work.
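The misalignment described above can be reproduced with a small sketch. It assumes a hypothetical record of two fixed-length fields (3 bytes, then 6 bytes, sized for 1-byte and 2-byte legacy codes) and shows what an unmodified program would read if the same six characters arrived in UTF-8. The field names and sample characters are illustrative choices, not taken from any real system.

```python
# Legacy record layout (hypothetical): field 1 holds 3 half-width characters
# at 1 byte each, field 2 holds 3 full-width characters at 2 bytes each.
LEGACY_LAYOUT = [("DATA-A1", 0, 3), ("DATA-A2", 3, 9)]

def slice_fields(record: bytes) -> dict:
    """Cut a record at the legacy byte offsets, as an unmodified program would."""
    return {name: record[start:end] for name, start, end in LEGACY_LAYOUT}

# The same six characters after conversion to UTF-8: each one is now 3 bytes.
record_utf8 = "ｱｲｳ漢字列".encode("utf-8")
fields = slice_fields(record_utf8)

data_a1 = fields["DATA-A1"].decode("utf-8")  # only the first kana fits
data_a2 = fields["DATA-A2"].decode("utf-8")  # the remaining kana spill in here

print(len(record_utf8))  # 18 bytes, although the layout covers only 9
```

Here `data_a1` ends up holding one half-width kana instead of three, and `data_a2` holds the two remaining kana instead of the three kanji, which is the "different character data" the text warns about.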

Patent Document 1: Japanese Patent No. 4405571

The present invention has been made in view of these circumstances, and its object is to facilitate the conversion of the programs to be migrated in a migration that involves switching to a different character code system.

To achieve the above object, the present invention provides
a migration support device that supports migration from a first computer to a second computer, comprising:
a character code conversion unit that, referring to a character code conversion table held in a storage unit, converts a first character code assigned to character data in a first document file of the first computer into a second character code assigned to character data in a second document file of the second computer;
a program conversion unit that converts a first program of the first computer, which processes the first document file, into a second program of the second computer, which processes the second document file;
an exchange information generation unit that causes the second program to read the character data to which the second character code is assigned and thereby generates exchange information that sets, for the read character data, the number of areas in memory designated by the second program equal to the number of bytes in the byte string representing the first character code that was assigned to the character data; and
an area storage unit that stores one second character code assigned to the read character data in the areas whose number is set by the exchange information.
Other means will be described later.

The first program, as a legacy program, treated the size of character data (the length of an item) as the number of bytes in its byte string and stored the byte string by designating that many areas in memory. That is, as is conventional, the first program treated each area designated in memory as an area for storing one byte of data and processed character data in units of bytes. The source code of the first program was written to match this processing.
In contrast, when the converted second program processes character data whose per-character byte count has changed as a result of the character code conversion, it can refer to the exchange information and use the same number of areas as the first program used. That is, the second program can treat the areas designated in memory as one or more areas for storing the data of one character and can process character data in units of characters. The logic of the second program can therefore be kept the same as the logic of the first program, and the logic-related parts of the second program's source code (for example, the number of digits in COBOL) need not be modified.
Consequently, in a migration that involves switching to a different character code system, the conversion of the programs to be migrated can be facilitated.

According to the present invention, the conversion of the programs to be migrated can be facilitated in a migration that involves switching to a different character code system.

FIG. 1 is a diagram showing the functional configuration of the migration support device of the present embodiment.
FIG. 2 is a diagram showing the data structure of the exchange information.
FIG. 3 is a flowchart showing the processing of the migration support device of the present embodiment.
FIG. 4 is a diagram explaining, as a comparative example, that the source code description must be modified when a COBOL program is converted to accompany the conversion from EBCDIK + KEIS codes to UTF-8 codes.
FIG. 5 is a diagram explaining, as the present example, that no modification of the source code description is needed when a COBOL program is converted to accompany the conversion from EBCDIK + KEIS codes to UTF-8 codes.

As shown in FIG. 1, the work PC 1 is a computer operated by a worker in charge of the migration from the current computer 2 to the new computer 3, and is the migration support device of the present embodiment. The work PC 1 acquires the input file 21 and the input program 22 from the current computer 2, applies predetermined conversions (described in detail later), and outputs the results to the new computer 3 as the output file 31 and the output program 32.

The current computer 2 (first computer) is a general-purpose host computer.
The new computer 3 (second computer) is an open-system server computer.

The input file 21 (first document file) is a document file containing character data and is a legacy asset of the current computer 2. The character data in the input file 21 follows the character code system handled by the current computer 2. That system is EBCDIK for the character data of half-width alphanumeric characters, half-width symbols, and half-width kana, and KEIS for the character data of full-width characters. In the present embodiment, the character code assigned to the character data in the input file 21 is sometimes called the “EBCDIK + KEIS code”.

EBCDIK represents each half-width alphanumeric character, half-width symbol, and half-width kana character with 1 byte (number of bytes = 1). KEIS represents each full-width character with 2 bytes (number of bytes = 2).

The input program 22 (first program) is a program for processing the input file 21 and is a legacy asset of the current computer 2. The input program 22 is written in COBOL, and its description conforms to the character code system made up of EBCDIK and KEIS.

The output file 31 (second document file) is a document file containing character data. The character data in the output file 31 follows the character code system handled by the new computer 3. That system is UTF-8 for the character data of all characters: half-width alphanumeric characters, half-width symbols, half-width kana, and full-width characters. In the present embodiment, the character code assigned to the character data in the output file 31 is sometimes called the “UTF-8 code”.

UTF-8 represents each half-width alphanumeric character and half-width symbol with 1 byte (number of bytes = 1), and each half-width kana character and full-width character with 3 bytes (number of bytes = 3).
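The UTF-8 byte counts stated above can be checked directly with Python's built-in encoder. The legacy counts in this sketch are simply the figures given in the text for EBCDIK and KEIS (Python ships no codec for either), and the sample characters are arbitrary representatives of each class.

```python
# Legacy byte counts as stated in the text (EBCDIK: 1 byte, KEIS: 2 bytes).
LEGACY_BYTES = {"half_alnum_symbol": 1, "half_kana": 1, "full_width": 2}

# One arbitrary sample character per class.
SAMPLES = {"half_alnum_symbol": "A", "half_kana": "ｱ", "full_width": "漢"}

def utf8_len(ch: str) -> int:
    """Number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

for kind, ch in SAMPLES.items():
    # Prints the class, the legacy byte count, and the UTF-8 byte count.
    print(kind, LEGACY_BYTES[kind], utf8_len(ch))
```

Only the first class keeps its size; half-width kana grow from 1 to 3 bytes and full-width characters from 2 to 3 bytes.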

The output program 32 (second program) is a program for processing the output file 31. In the present embodiment, the output program 32 is assumed to be written in COBOL. However, by applying well-known formal rewriting, the output program 32 can also be written in the JAVA (registered trademark) language.

The work PC 1 includes hardware such as an input unit, an output unit, a control unit, and a storage unit. For example, when the control unit is a CPU (Central Processing Unit), the information processing performed by the computer containing that control unit is realized by the CPU executing programs. The storage unit of that computer stores, at the direction of the CPU, the programs that implement the functions of the computer. Software and hardware thereby work together. The programs are provided on a recording medium or via a network.

As shown in FIG. 1, the work PC 1 has the following functional units: a character code conversion unit 11, a program conversion unit 12, an exchange information generation unit 13, and an area storage unit 14. It stores a character code conversion table T and exchange information E in its storage unit.

The character code conversion unit 11 refers to the character code conversion table T and converts the EBCDIK + KEIS code (first character code) assigned to the character data in the input file 21 into the UTF-8 code (second character code) assigned to the character data in the output file 31.

The program conversion unit 12 converts the input program 22 into the output program 32 so that it matches the character code conversion performed by the character code conversion unit 11. The program conversion unit 12 can keep the language of the output program 32 the same as that of the input program 22 (e.g., COBOL → COBOL) or change it (e.g., COBOL → JAVA).

The exchange information generation unit 13 causes the output program 32 to read character data to which UTF-8 codes are assigned and thereby generates exchange information E, which sets, for the read character data, the number of areas in memory designated by the output program 32 equal to the number of bytes in the byte string representing the EBCDIK + KEIS code that was assigned to that character data.
The UTF-8-coded character data that the output program 32 reads is, for example, character data extracted from the output file 31.

The area storage unit 14 stores one UTF-8 code assigned to the character data read by the output program 32 in the areas whose number is set by the exchange information E.

For each character in a predetermined character set (for example, the set of characters defined in the character code system made up of EBCDIK and KEIS that the current computer 2 handles), the character code conversion table T associates the EBCDIK + KEIS code assigned to that character with its UTF-8 code. The details of the association are well known and are omitted here.

For each piece of UTF-8-coded character data, the exchange information E generated by the exchange information generation unit 13 associates the number of bytes that is the size (item length) of that character data with the number of areas in memory designated by the output program 32.
As shown in FIG. 2, the UTF-8 codes assigned to the various character data can be classified into character codes for half-width alphanumeric and symbol characters (half-width alphanumeric characters plus half-width symbols), character codes for half-width kana, and character codes for full-width characters. The “number of bytes” and “number of areas” described above are determined for each class.

For character codes representing half-width alphanumeric and symbol characters, UTF-8 represents each character with 1 byte, as noted above, so the “number of bytes” is “1”. EBCDIK also represents each half-width alphanumeric character and half-width symbol with 1 byte, so the exchange information generation unit 13 sets the “number of areas” to “1”.

For character codes representing half-width kana, UTF-8 represents each character with 3 bytes, as noted above, so the “number of bytes” is “3”. EBCDIK represents each half-width kana character with 1 byte, so the exchange information generation unit 13 sets the “number of areas” to “1”.

For character codes representing full-width characters, UTF-8 represents each character with 3 bytes, as noted above, so the “number of bytes” is “3”. KEIS represents each full-width character with 2 bytes, so the exchange information generation unit 13 sets the “number of areas” to “2”.

The content of the exchange information E is determined by the combination of the character code system handled by the current computer 2 and the character code system handled by the new computer 3.
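One way to picture the exchange information E for the EBCDIK + KEIS to UTF-8 case is as a three-row lookup table. The sketch below is an illustration only: the names `ExchangeEntry`, `EXCHANGE_INFO_E`, `classify`, and `areas_for` are hypothetical, and classifying characters by Unicode code point range is an assumption of this sketch, not a method stated in the specification.

```python
from typing import NamedTuple

class ExchangeEntry(NamedTuple):
    utf8_bytes: int  # "number of bytes": size of the character in UTF-8
    areas: int       # "number of areas": byte count under EBCDIK or KEIS

# The three classes and values described in the text for FIG. 2.
EXCHANGE_INFO_E = {
    "half_alnum_symbol": ExchangeEntry(utf8_bytes=1, areas=1),
    "half_kana": ExchangeEntry(utf8_bytes=3, areas=1),
    "full_width": ExchangeEntry(utf8_bytes=3, areas=2),
}

def classify(ch: str) -> str:
    """Assumed classification rule based on Unicode code point ranges."""
    cp = ord(ch)
    if cp < 0x80:                 # ASCII letters, digits, symbols
        return "half_alnum_symbol"
    if 0xFF61 <= cp <= 0xFF9F:    # halfwidth katakana and punctuation
        return "half_kana"
    return "full_width"           # everything else is treated as full-width

def areas_for(text: str) -> int:
    """Total number of areas a string occupies under exchange information E."""
    return sum(EXCHANGE_INFO_E[classify(ch)].areas for ch in text)
```

For the string `"Aｱ漢"`, `areas_for` returns 4 (1 + 1 + 2), which matches the 4 bytes the same string would occupy under the legacy codes, even though its UTF-8 form is 7 bytes long.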

≪Processing≫
The processing of the present embodiment will now be described. The processing is performed by the control unit of the work PC 1, but the term “control unit” is omitted for convenience.
As shown in FIG. 3, the work PC 1 starts processing from step S1 when migrating from the current computer 2 to the new computer 3.

In step S1, the work PC 1 acquires the input file 21 and the input program 22 from the current computer 2. After step S1, the processing proceeds to step S2.

In step S2, the work PC 1 uses the character code conversion unit 11 to convert the character codes of the character data in the acquired input file 21 from EBCDIK + KEIS codes to UTF-8 codes and generates the output file 31. After step S2, the processing proceeds to step S3.

In step S3, the work PC 1 uses the program conversion unit 12 to convert the acquired input program 22 into the output program 32. After step S3, the processing proceeds to step S4.

In step S4, the work PC 1 has the output program 32 read character data to which UTF-8 codes are assigned. After step S4, the processing proceeds to step S5.

In step S5, the work PC 1 uses the exchange information generation unit 13 to generate the exchange information E for the character data read in step S4. After step S5, the processing proceeds to step S6.

In step S6, the work PC 1 uses the area storage unit 14 to store the corresponding UTF-8 code, that is, the UTF-8 code assigned to the character data read in step S4, in the areas (areas in memory designated by the output program 32) whose number is set by the exchange information E. After step S6, the processing of FIG. 3 ends.
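Steps S1 to S6 can be traced on a toy input as follows. This is a sketch under loud assumptions: the conversion table uses placeholder byte values that are not real EBCDIK or KEIS code points, the rule for spotting 2-byte codes is invented for the toy encoding, program conversion (S3) is not modelled, and representing a multi-area character as one slot holding the code followed by `None` slots is an illustrative choice that the specification does not prescribe.

```python
# Toy conversion table T: legacy byte sequence -> character.
# Placeholder byte values for illustration; NOT real EBCDIK/KEIS code points.
TABLE_T = {b"\x01": "A", b"\x02": "ｱ", b"\x10\x01": "漢"}

def convert_file(legacy: bytes) -> list:
    """S2: decode legacy bytes via table T into (char, legacy_byte_count) pairs."""
    out, i = [], 0
    while i < len(legacy):
        # Placeholder rule: a lead byte >= 0x10 starts a 2-byte code.
        n = 2 if legacy[i] >= 0x10 else 1
        out.append((TABLE_T[legacy[i:i + n]], n))
        i += n
    return out

def make_exchange_info(chars: list) -> list:
    """S5: per character, (UTF-8 byte count, number of areas = legacy byte count)."""
    return [(len(ch.encode("utf-8")), n) for ch, n in chars]

def store_in_areas(chars: list, exchange: list) -> list:
    """S6: give each UTF-8 code as many areas as the exchange information says."""
    areas = []
    for (ch, _), (_, n_areas) in zip(chars, exchange):
        areas.append(ch.encode("utf-8"))      # first area carries the whole code
        areas.extend([None] * (n_areas - 1))  # remaining areas of the same character
    return areas

legacy_input = b"\x01\x02\x10\x01"        # S1: "A", "ｱ", "漢" in the toy encoding
chars = convert_file(legacy_input)         # S2 (S3, program conversion, not modelled)
exchange = make_exchange_info(chars)       # S4 and S5
areas = store_in_areas(chars, exchange)    # S6

print(len(legacy_input), len(areas))       # 4 4
```

The area count after S6 equals the legacy byte count, which is the property that lets the converted program keep its original field lengths.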

The output file 31, the output program 32, and the exchange information E generated by the work PC 1 are output to the new computer 3. Consider the case where, on the new computer 3, the output program 32 opens the output file 31 in order to execute a given business process. In this case, the output program 32 refers to the exchange information E and accesses the UTF-8 codes stored in the areas in memory that it designates, in the order that the output program 32 defines.

The input program 22 treated the size of the character data in the input file 21 (the length of an item) as the number of bytes in its byte string and stored the byte string by designating that many areas in memory. That is, as is conventional, on the current computer 2 the input program 22 treated each area designated in memory as an area for storing one byte of data and, by processing the character data in the input file 21 in units of bytes, in effect processed the character data one character at a time in order.

For character data whose byte count has changed because its character code was converted from an EBCDIK + KEIS code to a UTF-8 code, the exchange information E allows the number of areas that the output program 32 designates in memory to equal the number of areas that the input program 22 designated. For example, for full-width character data, whose byte count changes from “2” to “3” on conversion from an EBCDIK + KEIS code to a UTF-8 code, the output program 32 can refer to the exchange information E and set the number of areas designated in memory to “2” rather than “3” as in the prior art. The area storage unit 14 stores the one UTF-8 code assigned to that full-width character in two (consecutive) areas.

The output program 32 can therefore treat the areas designated in memory as areas for storing the data of one character rather than one byte of data, and can process the character data in the output file 31 in units of characters. As a result, just as the input program 22 processed the character data in the input file 21 one character at a time in order, the output program 32 on the new computer 3 can process the character data in the output file 31 one character at a time in order. That is, even after a migration involving a character code conversion that changes the size of the character data, the logic of the output program 32 can remain the same as the logic of the input program 22. The worker performing the migration does not need to modify the logic-related parts of the source code of the output program 32.

The work PC 1 can also perform control so that the output program 32 separately designates in memory a prescribed number of areas (for example, three for a full-width character), each an area for storing one byte of data, that hold the byte string of the UTF-8-coded character data one byte at a time. The work PC 1 then performs control so that the one or two areas in which the area storage unit 14 stores one UTF-8 code are linked to that prescribed number of areas. Thus, on the new computer 3, when the output program 32 accesses a UTF-8 code stored by the area storage unit 14, it can process the target character data by accessing the byte string stored in the linked areas.
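The linking of character-sized areas to separately designated one-byte areas might be modelled as below. This is only a sketch: the specification does not say how the link is represented, so the `(offset, length)` tuples, the function names, and the code point test used to decide legacy width are all assumptions made for illustration.

```python
def legacy_bytes_of(ch: str) -> int:
    """Assumed legacy width: 1 for ASCII and halfwidth kana, 2 otherwise."""
    cp = ord(ch)
    return 1 if cp < 0x80 or 0xFF61 <= cp <= 0xFF9F else 2

def build_linked_storage(text: str):
    """Store raw UTF-8 bytes in one-byte areas and link legacy-sized areas to them."""
    byte_areas = bytearray()  # one-byte areas holding the UTF-8 byte strings
    char_areas = []           # areas laid out like the legacy program's field
    for ch in text:
        code = ch.encode("utf-8")
        link = (len(byte_areas), len(code))  # (offset, length) into byte_areas
        byte_areas += code
        # Every area belonging to this character carries the same link.
        char_areas.extend([link] * legacy_bytes_of(ch))
    return char_areas, byte_areas

def read_char(char_areas, byte_areas, index: int) -> str:
    """Follow the link stored in an area to recover the character it belongs to."""
    offset, length = char_areas[index]
    return bytes(byte_areas[offset:offset + length]).decode("utf-8")

char_areas, byte_areas = build_linked_storage("Aｱ漢")
print(len(char_areas), len(byte_areas))  # 4 7
```

With this layout the program still addresses four areas, as the legacy program did, while the seven UTF-8 bytes live in the linked one-byte areas.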

≪Specific Example≫
With reference to FIGS. 4 and 5, a specific example of converting a program in a migration that involves switching character code systems will be described. In this example, both the pre-conversion program (corresponding to the input program 22) and the post-conversion program (corresponding to the output program 32) are written in COBOL. The pre-conversion program handles EBCDIK + KEIS codes, and the post-conversion program handles UTF-8 codes.

FIG. 4 shows a comparative example representing the prior art. The upper part of FIG. 4(a) shows an example of the WORKING-STORAGE SECTION of the DATA DIVISION in the source code of the pre-conversion program. Within the group item DATA-A, the variables (items) DATA-A1 and DATA-A2 are declared in this order.
For DATA-A1, “PIC X” indicates that storage areas for data of 1 byte per character (EBCDIK) are reserved in memory, and “(03)” indicates that there are three such areas (the number of digits is 3). DATA-A1 can therefore hold three (half-width) characters.
For DATA-A2, “PIC N” indicates that storage areas for data of 2 bytes per character (KEIS) are reserved in memory, and “(03)” indicates that there are three such areas (the number of digits is 3). DATA-A2 can therefore hold three (full-width) characters.
Note that COBOL declares variables with fixed lengths.

The lower part of FIG. 4(a) shows a schematic diagram of the areas that result from the above description example. Each box represents one area, namely a 1-byte data storage area. According to this diagram, the pre-conversion program designates 3 bytes of area on the memory for DATA-A1, so that data for three characters can be input to DATA-A1. Likewise, it designates 6 bytes (2 bytes × 3) of area on the memory for DATA-A2, so that data for three characters can be input to DATA-A2. In this way, the pre-conversion program designates the area in which the byte string of character data is stored one byte at a time, as is conventional, and processes character data in units of bytes (it accesses the bytes in the boxes one at a time, in order from the left).
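The byte-unit layout of FIG. 4(a) can be sketched as follows. This is only an illustrative model in Python, not part of the patent: the `Item` class, the `layout` function, and the `BYTES_PER_DIGIT` table are hypothetical names, and the sketch assumes only what the text states, namely that PIC X reserves 1 byte per digit and PIC N reserves 2 bytes per digit.

```python
from dataclasses import dataclass

# Assumption taken from the text: PIC X holds 1 byte per character (EBCDIK),
# PIC N holds 2 bytes per character (KEIS).
BYTES_PER_DIGIT = {"X": 1, "N": 2}


@dataclass
class Item:
    """Hypothetical model of one fixed-length COBOL item."""
    name: str
    pic: str      # "X" or "N"
    digits: int   # the number in parentheses, e.g. 3 for (03)

    @property
    def size(self) -> int:
        return BYTES_PER_DIGIT[self.pic] * self.digits


def layout(items):
    """Return {name: (start_byte, end_byte)} for items declared in order."""
    offset = 0
    result = {}
    for item in items:
        result[item.name] = (offset, offset + item.size)
        offset += item.size
    return result


DATA_A = [Item("DATA-A1", "X", 3), Item("DATA-A2", "N", 3)]
print(layout(DATA_A))
```

Under these assumptions DATA-A1 occupies bytes 0 to 2 and DATA-A2 occupies bytes 3 to 8, which matches the three boxes and six boxes drawn in the figure.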

When a migration converts the character codes and also converts the program, the number of bytes in the byte string representing one character changes. In the conventional technique, the logic of the post-conversion program therefore had to be corrected by hand so that such character data is processed without error (that is, so that the intended character data is reliably read out).

The upper part of FIG. 4(b) shows an example of the working-storage section of the data division in the source code of the post-conversion program. To keep the logic the same before and after conversion, the description example of FIG. 4(a) must be corrected by adding the descriptions underlined in the figure.
As one such correction, the number of digits of DATA-A1 is changed from 3 to 9. The reason is that EBCDIK represents one half-width kana character in 1 byte, whereas UTF-8 represents one half-width kana character in 3 bytes. DATA-A1 is therefore given a 9-byte area (3 bytes × 3 characters) so that it can accept a byte string for three half-width kana characters (that is, so that the data does not overflow).
As another such correction, the number of digits of DATA-A2 is changed from 3 to 5. The reason is that KEIS represents one full-width character in 2 bytes, whereas UTF-8 represents one full-width character in 3 bytes. DATA-A2 must therefore have an area of at least 9 bytes (3 bytes × 3 characters) so that it can accept a byte string for three full-width characters. In the example of FIG. 4(b), setting the number of digits of DATA-A2 to 5 gives DATA-A2 a 10-byte area.
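The arithmetic behind the manual correction in FIG. 4(b) can be written out as a short Python sketch. The function name `widened_digits` and the constants are hypothetical; the only inputs taken from the text are that UTF-8 needs 3 bytes for a half-width kana or full-width character, that PIC X reserves 1 byte per digit, and that PIC N reserves 2 bytes per digit.

```python
import math

UTF8_BYTES_PER_CHAR = 3             # worst case named in the text
BYTES_PER_DIGIT = {"X": 1, "N": 2}  # bytes reserved per digit of PIC X / PIC N


def widened_digits(pic: str, digits: int) -> int:
    """Digits needed so that `digits` characters still fit after conversion to UTF-8."""
    needed_bytes = UTF8_BYTES_PER_CHAR * digits
    return math.ceil(needed_bytes / BYTES_PER_DIGIT[pic])


print(widened_digits("X", 3))  # DATA-A1
print(widened_digits("N", 3))  # DATA-A2

# The 3-byte figure can be checked directly against Python's UTF-8 encoder.
print(len("ｱｲｳ".encode("utf-8")), len("日本語".encode("utf-8")))
```

For PIC N the 9 required bytes are not a multiple of 2, so the digit count rounds up to 5 and the field ends up with 10 bytes, one more than strictly needed.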

The lower part of FIG. 4(b) shows a schematic diagram of the areas that result from the corrected description example. As in FIG. 4(a), each box in FIG. 4(b) represents a 1-byte data storage area. By increasing the number of boxes, the correction preserves in the post-conversion program the property of the pre-conversion program that data for three characters can be input to DATA-A1 and data for three characters can be input to DATA-A2. However, correcting the program logic to add boxes in this way must be done for every variable in the program, and so requires an enormous amount of work.

FIG. 5 shows the present example. FIG. 5(a) is the same as FIG. 4(a). That is, data for three characters can be input to the variable DATA-A1, and data for three characters can be input to the variable DATA-A2.
The upper part of FIG. 5(b) shows an example of the working-storage section of the data division in the source code of the post-conversion program. When the program is converted in the present example, the exchange information E described above is used.

As already described, the exchange information E causes each area that the post-conversion program designates on the memory to function as an area for storing data of one character, not as an area for storing 1 byte of data. As shown in the lower part of FIG. 5(b), this is equivalent to each box representing a data storage area for one half-width alphanumeric/symbol/kana character. Here, “half-width alphanumeric/symbol/kana character” is a collective term for half-width alphanumeric characters, half-width symbols, and half-width kana characters. Two such data storage areas placed side by side represent a data storage area for one full-width character.

Therefore, in the description example of FIG. 5(b), “PIC X(03)” of DATA-A1 can indicate that three data (UTF-8) storage areas, each for one half-width alphanumeric/symbol/kana character, are secured in memory. This means that data for three characters (character data to which UTF-8 codes are assigned) can be input to the variable DATA-A1 without increasing the number of digits as in FIG. 4(b), that is, without correcting the logic.

Similarly, “PIC N(03)” of DATA-A2 can indicate that three data (UTF-8) storage areas, each for one full-width character, are secured in memory. This means that data for three characters (character data to which UTF-8 codes are assigned) can be input to the variable DATA-A2 without increasing the number of digits as in FIG. 4(b), that is, without correcting the logic.

As already described, one UTF-8 code is stored in one or two data storage areas, each sized for one half-width alphanumeric/symbol/kana character. Therefore, when executing a given business process, the post-conversion program can process character data in units of characters by accessing the UTF-8 codes stored in the areas in a predetermined order.
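The character-unit areas of FIG. 5(b) can be modeled roughly as below. This Python sketch is not the patented implementation: `areas_needed` is a hypothetical stand-in for the exchange information E that approximates "half-width" versus "full-width" with `unicodedata.east_asian_width`, and the choice to put the UTF-8 code in the first area and mark the second area of a full-width character with `None` is an assumption made only for illustration.

```python
import unicodedata
from typing import List, Optional


def areas_needed(ch: str) -> int:
    """Hypothetical stand-in for exchange information E: 1 area for a
    half-width character, 2 areas for a full-width character."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def store(text: str, digits: int, areas_per_digit: int) -> List[Optional[bytes]]:
    """Fill a fixed-length item whose areas hold whole UTF-8 codes, not single bytes."""
    total = digits * areas_per_digit
    areas: List[Optional[bytes]] = []
    for ch in text:
        n = areas_needed(ch)
        if len(areas) + n > total:
            raise ValueError("item overflow")
        areas.append(ch.encode("utf-8"))   # one UTF-8 code per character
        areas.extend([None] * (n - 1))     # assumed marker for the paired second area
    areas.extend([b" "] * (total - len(areas)))  # pad the fixed-length item
    return areas


def read_chars(areas: List[Optional[bytes]]) -> List[str]:
    """Access the stored codes in order, one character at a time."""
    return [a.decode("utf-8") for a in areas if a is not None]


data_a1 = store("ｱｲｳ", digits=3, areas_per_digit=1)     # PIC X(03)
data_a2 = store("日本語", digits=3, areas_per_digit=2)  # PIC N(03)
print(len(data_a1), len(data_a2))
print(read_chars(data_a2))
```

In this model the item sizes stay at 3 and 6 areas, the same counts as the byte areas of the pre-conversion program, even though each stored UTF-8 code is 3 bytes long.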

In this way, using the exchange information E changes how the areas designated on the memory by the post-conversion program are handled, which eliminates the enormous amount of work otherwise needed to correct the program logic.

≪Summary≫
According to the present embodiment, when the converted output program 32 processes character data whose number of bytes per character has changed as a result of character code conversion, it can refer to the exchange information E and thereby use the same number of areas as the input program 22 used. That is, the output program 32 can treat the areas it designates on the memory as one or more areas for storing data of one character, and can process character data in units of characters. The logic of the output program 32 can therefore be kept the same as the logic of the input program 22, and the logic-related parts of the source code of the output program 32 do not need to be corrected.
Consequently, in a migration that involves switching to a different character code system, conversion of the programs to be migrated is made easier.

≪Others≫
The present embodiment has described a migration that involves switching from a character code system using EBCDIK and KEIS to a character code system using UTF-8. However, the present invention can also be applied to a migration that involves switching from a character code system using JIS8 and SJIS to a character code system using UTF-8.

Note that JIS8 represents each half-width alphanumeric character, half-width symbol, and half-width kana character in 1 byte (number of bytes = 1). SJIS represents each full-width character in 2 bytes (number of bytes = 2).
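These byte counts can be checked with a short Python sketch. It assumes that Python's built-in `shift_jis` codec is an acceptable stand-in for the JIS8 + SJIS combination named in the text (the codec encodes half-width kana as single bytes and kanji as two bytes); the sample characters and the `byte_counts` helper are hypothetical.

```python
# Sample characters chosen for illustration only.
SAMPLES = {
    "half-width alphanumeric": "A",
    "half-width kana": "ｱ",
    "full-width": "漢",
}


def byte_counts(ch: str):
    """Return (bytes in Shift JIS, bytes in UTF-8) for one character."""
    return len(ch.encode("shift_jis")), len(ch.encode("utf-8"))


for label, ch in SAMPLES.items():
    old, new = byte_counts(ch)
    print(f"{label}: {old} byte(s) before, {new} byte(s) in UTF-8")
```

The first number in each pair is the byte count in the old code system, which is also the number of areas that claim 2 assigns to the character (one for half-width, two for full-width).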

In the present embodiment, the area storage unit 14 stores UTF-8 codes in the areas on the memory designated by the output program 32. However, instead of UTF-8 codes, it is also possible to store data in any format that can identify the corresponding character data.

In the present embodiment, when the exchange information generation unit 13 generates the exchange information E, the character data with UTF-8 codes assigned that the output program 32 reads is, for example, character data extracted from the output file 31. Alternatively, so that the exchange information E is generated for every character in a predetermined character set (for example, in a migration to an open-system server computer that handles UTF-8, a character set consisting of all existing characters), the work PC 1 may acquire character data with UTF-8 codes assigned from an external source in advance and have the output program 32 read the acquired character data.

It is also possible to realize a technique that combines, as appropriate, the various techniques described in the present embodiment.
The software described in the present embodiment can be realized as hardware, and the hardware can be realized as software.
In addition, the hardware, software, flowcharts, and the like can be changed as appropriate without departing from the spirit of the present invention.

1 Work PC (migration support apparatus)
11 Character code conversion unit
12 Program conversion unit
13 Exchange information generation unit
14 Area storage unit
2 Current computer (first computer)
21 Input file (first document file)
22 Input program (first program)
3 New computer (second computer)
31 Output file (second document file)
32 Output program (second program)
T Character code conversion table
E Exchange information

Claims

A migration support apparatus that supports migration from a first computer to a second computer, comprising:
a character code conversion unit that, with reference to a character code conversion table in a storage unit, converts a first character code assigned to character data in a first document file of the first computer into a second character code assigned to character data in a second document file of the second computer;
a program conversion unit that converts a first program, which is included in the first computer and processes the first document file, into a second program, which is included in the second computer and processes the second document file;
an exchange information generation unit that, by causing the second program to read the character data to which the second character code is assigned, generates exchange information that sets the number of areas on a memory designated by the second program for the read character data to be the same as the number of bytes of the byte string representing the first character code assigned to that character data; and
an area storage unit that stores one second character code assigned to the read character data in the number of areas determined by the exchange information.

The migration support apparatus according to claim 1, wherein
when the character data is character data of a half-width alphanumeric character, a half-width symbol, or a half-width kana character, the number of areas determined by the exchange information is one, and
when the character data is character data of a full-width character, the number of areas determined by the exchange information is two.