CN106663020B - Migration support device - Google Patents

Info

Publication number
CN106663020B
CN106663020B CN201580046561.8A CN201580046561A CN106663020B CN 106663020 B CN106663020 B CN 106663020B CN 201580046561 A CN201580046561 A CN 201580046561A CN 106663020 B CN106663020 B CN 106663020B
Authority
CN
China
Prior art keywords
character
program
data
code
character data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201580046561.8A
Other languages
Chinese (zh)
Other versions
CN106663020A (en)
Inventor
坂井孝介
城代佳范
粟河厚志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Social Information Services Ltd
Original Assignee
Hitachi Government and Public Sector System Engineering Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Government and Public Sector System Engineering Ltd
Publication of CN106663020A
Application granted
Publication of CN106663020B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Stored Programmes (AREA)
  • Devices For Executing Special Programs (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The job PC1 (migration support device) includes: a character code conversion unit (11) that converts EBCDIK + KEIS codes into UTF-8 codes; a program conversion unit (12) that converts an input program (22) into an output program (32); an exchange information generation unit (13) that causes the output program (32) to read character data to which a UTF-8 code is assigned and generates, for the read character data, exchange information (E) that sets the number of areas designated on the memory by the output program (32) equal to the number of bytes in the byte sequence of the EBCDIK + KEIS code assigned to that character data; and an area storage unit (14) that stores the one UTF-8 code assigned to the read character data in the number of areas determined by the exchange information (E).

Description

Migration support device
Technical Field
The present invention relates to a technique of so-called legacy system migration (hereinafter sometimes simply referred to as "migration"), and more particularly to a migration technique accompanied by character code system switching.
Background
In recent years, many enterprises, local governments, and the like have needed migration work that moves a business system (legacy system) currently running on one computer to a new computer. One form of migration is from a general-purpose host computer (or office computer) to an open server computer running an OS (Operating System) such as WINDOWS (registered trademark), UNIX (registered trademark), or LINUX (registered trademark). Many techniques related to migration have been disclosed, for example in patent document 1.
However, a host computer that processes data using a given character code system (e.g., EBCDIK (Extended Binary Coded Decimal Interchange Kana Code), KEIS (Kanji processing Extended Information System), JIS8, or SJIS (Shift JIS); hereinafter sometimes referred to as the "old character code system") may have a large number of registered non-standard characters, that is, characters not defined in the standard for its character code system (the non-standard character area of the host computer holds 9024 characters). In that case, migration to a server computer whose OS provides only a small non-standard character area (the area provided by WINDOWS holds 1880 characters) cannot be achieved.
In recent years, most enterprises, local governments, and the like have also wanted the new computer to handle more characters than the current computer, whose usable character set is limited. Specifically, with internationalization, there is a desire to express not only kanji but also foreign characters such as simplified Chinese characters and Hangul, and to express old-form kanji and the like so that an individual can be represented accurately.
Therefore, as a countermeasure, migration is being considered to a new computer that handles a larger character code system, such as UTF-8 or UTF-16 (UTF: Unicode Transformation Format), as the new character code system.
Migration mainly consists of (1) migrating the existing data on the business system and (2) migrating the existing programs that run on the business system and access that data. The migrated character data must therefore be converted to correspond to the new character code system. In addition, an existing program (for example, a program written in COBOL) must be converted so that it can read the character data whose character codes have been converted.
However, the conventional technique has a problem: converting the program is more complicated and difficult than converting the character codes assigned to the character data. The problem arises because, depending on the combination of old and new character code systems, the number of bytes in the byte sequence that expresses the same character differs between the two systems, while the length of the memory area that the existing program designates to store that byte sequence is fixed. When converting a program, its description must be corrected with these circumstances in mind (otherwise character data overflows or shifts position, and the program obtains character data other than the intended data). Because the required correction differs depending on the byte sequence of the characters stored in each area, the correction is very laborious and difficult. The prior art, including the technique of patent document 1, offers no improvement for this work.
Documents of the prior art
Patent document
Patent document 1: Japanese Patent No. 4405571
Disclosure of Invention
Problems to be solved by the invention
The present invention has been made in view of such circumstances, and an object thereof is to facilitate conversion of a program to be migrated in a migration involving switching to a different character code system.
Means for solving the problems
In order to achieve the above object, the present invention provides a migration support device for supporting migration from a 1st computer to a 2nd computer, the migration support device including: a character code conversion unit that, with reference to a character code conversion table held in a storage unit, converts a 1st character code assigned to character data in a 1st document file provided to the 1st computer into a 2nd character code assigned to character data in a 2nd document file provided to the 2nd computer; a program conversion unit that converts a 1st program, which the 1st computer has for processing the 1st document file, into a 2nd program, which the 2nd computer has for processing the 2nd document file; an exchange information generation unit that causes the 2nd program to read character data to which the 2nd character code is assigned and generates, for the read character data, exchange information that sets the number of areas designated on the memory by the 2nd program equal to the number of bytes in the byte sequence of the 1st character code assigned to that character data; and an area storage unit that stores the one 2nd character code assigned to the read character data in the number of areas determined by the exchange information.
Other means will be described later.
The 1st program, which is legacy software, treats the size of the character data (the length of the item) as the number of bytes of the byte sequence, designates that many areas on the memory, and stores the byte sequence of the character data in them. That is, as in the conventional art, the 1st program treats each area designated on the memory as an area that stores 1 byte of data, and processes character data in units of bytes. The source code of the 1st program is written to match this processing.
In contrast, when the converted 2nd program processes character data for which the character code conversion has changed the number of bytes in the byte sequence expressing 1 character, it can refer to the exchange information and use the same number of areas as the 1st program used. That is, the 2nd program can treat the areas designated on the memory as 1 or more areas that store the data of 1 character, and process character data in units of characters. The logic of the 2nd program can therefore be kept the same as the logic of the 1st program, and the logic-related parts of the source code of the 2nd program (for example, the number of digits in COBOL) need not be corrected.
Therefore, in the migration accompanying the switching to a different character code system, the conversion of the program to be migrated can be facilitated.
Effects of the invention
According to the present invention, in migration involving switching to a different character code system, it is possible to facilitate conversion of a program to be migrated.
Drawings
Fig. 1 is a diagram showing a functional configuration of a migration support apparatus according to the present embodiment.
Fig. 2 is a diagram showing a data structure of exchange information.
Fig. 3 is a flowchart showing the processing of the migration support apparatus according to the present embodiment.
Fig. 4 is a diagram explaining, as a comparative example, a case in which the source code must be corrected when a COBOL program is converted along with the conversion from the EBCDIK + KEIS code to the UTF-8 code, where (a) shows an example description of the WORKING-STORAGE SECTION of the DATA DIVISION in the source code of the program before conversion, together with a schematic diagram of the areas that the description secures, and (b) shows an example description of the same section in the source code of the program after conversion, with the predetermined corrections applied, together with a schematic diagram of the areas that the description secures.
Fig. 5 is a diagram explaining, as the present embodiment, a case in which the source code need not be corrected when a COBOL program is converted along with the conversion from the EBCDIK + KEIS code to the UTF-8 code, where (a) shows an example description of the WORKING-STORAGE SECTION of the DATA DIVISION in the source code of the program before conversion, together with a schematic diagram of the areas that the description secures, and (b) shows an example description of the same section in the source code of the program after conversion, which requires no predetermined correction, together with a schematic diagram of the areas that the description secures.
Detailed Description
As shown in fig. 1, the job PC1 is a computer operated by an operator who is in charge of migration from the current computer 2 to the new computer 3, and is a migration support apparatus according to the present embodiment. The job PC1 acquires the input file 21 and the input program 22 from the current computer 2, performs predetermined conversion (details will be described later), and outputs the converted input file and the converted input program as the output file 31 and the output program 32 to the new computer 3.
The current computer 2 (1st computer) is a general-purpose host computer.
The new computer 3 (2 nd computer) is an open server computer.
The input file 21 (1st document file) is a document file containing character data, and is legacy software of the current computer 2. The character data in the input file 21 follows the character code system processed by the current computer 2: EBCDIK for half-width alphanumeric characters, half-width symbols, and half-width kana characters, and KEIS for full-width characters. In the present embodiment, the character code assigned to the character data in the input file 21 is sometimes referred to as the "EBCDIK + KEIS code".
EBCDIK expresses 1 character in 1 byte for half-width alphanumeric characters, half-width symbols, and half-width kana characters. KEIS expresses 1 character in 2 bytes for full-width characters.
The input program 22 (1st program) is a program for processing the input file 21, and is legacy software of the current computer 2. The input program 22 is written in COBOL, and its description corresponds to the EBCDIK and KEIS character code system.
The output file 31 (2nd document file) is a document file containing character data. The character data in the output file 31 follows the character code system processed by the new computer 3, which is UTF-8 for all of half-width alphanumeric characters, half-width symbols, half-width kana characters, and full-width characters. In the present embodiment, the character code assigned to the character data in the output file 31 is sometimes referred to as the "UTF-8 code".
UTF-8 expresses 1 character in 1 byte for half-width alphanumeric characters and half-width symbols, and expresses 1 character in 3 bytes for half-width kana characters and full-width characters.
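The byte counts stated above can be checked with a short Python sketch. Python ships no EBCDIK or KEIS codec, so the legacy byte counts are hard-coded from the description rather than computed; the sample characters and the names `LEGACY_BYTES`, `SAMPLES`, and `utf8_len` are illustrative assumptions, not part of the patent.

```python
# Legacy byte counts per character class, copied from the description
# (EBCDIK: 1 byte, KEIS: 2 bytes). Hard-coded because Python has no
# EBCDIK/KEIS codec; this table is an assumption for illustration.
LEGACY_BYTES = {
    "half-width alphanumeric/symbol": 1,
    "half-width kana": 1,
    "full-width": 2,
}

# One sample character per class (illustrative choices).
SAMPLES = {
    "half-width alphanumeric/symbol": "A",
    "half-width kana": "\uff71",  # HALFWIDTH KATAKANA LETTER A
    "full-width": "\u6f22",       # CJK ideograph U+6F22
}


def utf8_len(ch: str) -> int:
    """Number of bytes in the UTF-8 encoding of a single character."""
    return len(ch.encode("utf-8"))


for kind, ch in SAMPLES.items():
    print(f"{kind}: legacy={LEGACY_BYTES[kind]} byte(s), UTF-8={utf8_len(ch)} byte(s)")
```

Only the UTF-8 lengths are computed here; they differ from the legacy lengths for half-width kana and full-width characters, which is the mismatch the rest of the description addresses.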
The output program 32 (2 nd program) is a program for processing the output file 31. In the present embodiment, the output program 32 is described in COBOL language. However, by applying the description in a known form, the output program 32 can be described in JAVA (registered trademark) language.
The job PC1 includes hardware such as an input unit, an output unit, a control unit, and a storage unit. For example, when the control unit is a CPU (Central Processing Unit), the information processing of the computer containing the control unit is realized by the CPU executing programs. The storage unit of the computer stores the programs that the CPU executes to realize the functions of the computer. Software and hardware thereby cooperate. The programs are recorded on a recording medium or provided via a network.
As shown in fig. 1, the job PC1 includes functional units such as the character code conversion unit 11, the program conversion unit 12, the exchange information generation unit 13, and the area storage unit 14, and stores the character code conversion table T and the exchange information E in the storage unit.
The character code conversion unit 11 converts the EBCDIK + KEIS code (1 st character code) assigned to the character data in the input file 21 into the UTF-8 code (2 nd character code) assigned to the character data in the output file 31 with reference to the character code conversion table T.
The program conversion unit 12 converts the input program 22 into the output program 32 in a manner consistent with the character code conversion performed by the character code conversion unit 11. The output program 32 may be written in the same language as the input program 22 (e.g., COBOL → COBOL) or in a different language (e.g., COBOL → JAVA).
The exchange information generation unit 13 causes the output program 32 to read character data to which a UTF-8 code is assigned, and generates, for the read character data, the exchange information E, which sets the number of areas designated on the memory by the output program 32 equal to the number of bytes in the byte sequence of the EBCDIK + KEIS code assigned to that character data.
The character data to which the UTF-8 code is assigned, which is read in by the output program 32, is, for example, character data extracted from the output file 31.
The area storage unit 14 stores the one UTF-8 code assigned to the character data read by the output program 32 in an area made up of the number of areas determined by the exchange information E.
The character code conversion table T associates the EBCDIK + KEIS code and the UTF-8 code assigned to each character in a predetermined character set (for example, the set of characters defined in the character code system, including EBCDIK and KEIS, processed by the current computer 2). The details of the correspondence are well known, and their description is omitted.
The exchange information E generated by the exchange information generation unit 13 associates, for each character data to which the UTF-8 code is assigned, the number of bytes, which is the size (the length of the item) of the character data, with the number of areas on the memory designated by the output program 32.
As shown in fig. 2, the UTF-8 codes assigned to character data can be classified into character codes representing half-width alphanumeric and symbol characters (half-width alphanumeric characters + half-width symbols), character codes representing half-width kana characters, and character codes representing full-width characters. A "number of bytes" and a "number of areas" are determined for each class.
For a character code representing a half-width alphanumeric or symbol character, UTF-8 expresses the corresponding 1 character in 1 byte as described above, so the "number of bytes" is "1". Since EBCDIK also expresses 1 half-width alphanumeric character or half-width symbol in 1 byte, the exchange information generation unit 13 sets the "number of areas" to "1".
For a character code representing a half-width kana character, UTF-8 expresses the corresponding 1 character in 3 bytes as described above, so the "number of bytes" is "3". Since EBCDIK expresses 1 half-width kana character in 1 byte, the exchange information generation unit 13 sets the "number of areas" to "1".
For a character code representing a full-width character, UTF-8 expresses the corresponding 1 character in 3 bytes as described above, so the "number of bytes" is "3". Since KEIS expresses 1 full-width character in 2 bytes, the exchange information generation unit 13 sets the "number of areas" to "2".
The content of the exchange information E is determined according to the combination of the character code system processed by the current computer 2 and the character code system processed by the new computer 3.
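As a rough illustration, the three rows described for fig. 2 can be written as a lookup table. The sketch below is in Python and assumes that a character's class can be derived from its Unicode East Asian Width property; that classification rule, and the names `ExchangeEntry`, `EXCHANGE_INFO`, `classify`, and `lookup`, are hypothetical and are not taken from the patent.

```python
import unicodedata
from typing import NamedTuple


class ExchangeEntry(NamedTuple):
    byte_count: int  # bytes in the UTF-8 code of one character
    area_count: int  # areas designated on the memory (= legacy byte count)


# Values follow the three classes described for fig. 2.
EXCHANGE_INFO = {
    "alnum_symbol": ExchangeEntry(byte_count=1, area_count=1),
    "half_kana": ExchangeEntry(byte_count=3, area_count=1),
    "full_width": ExchangeEntry(byte_count=3, area_count=2),
}


def classify(ch: str) -> str:
    """Assumed classification rule based on East Asian Width."""
    width = unicodedata.east_asian_width(ch)
    if width in ("W", "F"):   # wide or fullwidth
        return "full_width"
    if width == "H":          # halfwidth forms such as half-width katakana
        return "half_kana"
    return "alnum_symbol"


def lookup(ch: str) -> ExchangeEntry:
    """Return the exchange-information entry for one character."""
    return EXCHANGE_INFO[classify(ch)]


print(lookup("A"))       # half-width alphanumeric
print(lookup("\uff71"))  # half-width kana
print(lookup("\u6f22"))  # full-width character
```

A different pair of character code systems would need a different table, which matches the statement that the content of the exchange information E depends on the combination of the two systems.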
Processing
The processing of the present embodiment will now be described. The processing is performed by the control unit of the job PC1, but for convenience of description the words "control unit" are omitted.
As shown in fig. 3, the job PC1 starts processing from step S1 each time a migration from the current computer 2 to the new computer 3 is performed.
In step S1, the job PC1 acquires the input file 21 and the input program 22 from the current computer 2. After step S1, the process proceeds to step S2.
In step S2, the job PC1 converts the character code from the EBCDIK + KEIS code to the UTF-8 code for the character data in the acquired input file 21 by the character code conversion unit 11, and generates the output file 31. After step S2, the process proceeds to step S3.
In step S3, the job PC1 converts the acquired input program 22 into the output program 32 by the program conversion unit 12. After step S3, the process proceeds to step S4.
In step S4, the job PC1 reads in character data to which the UTF-8 code is assigned, via the output program 32. After step S4, the process proceeds to step S5.
In step S5, the exchange information generating unit 13 of the job PC1 generates exchange information E for the character data read in step S4. After step S5, the process proceeds to step S6.
In step S6, the job PC1 uses the area storage unit 14 to store the UTF-8 code assigned to the character data read in step S4 in the number of areas specified by the exchange information E (areas on the memory designated by the output program 32). After step S6, the process of fig. 3 ends.
The output file 31, the output program 32, and the exchange information E generated by the job PC1 are output to the new computer 3. Consider the case where the output program 32 opens the output file 31 so that the new computer 3 can execute a predetermined business process. In this case, the output program 32 refers to the exchange information E and accesses the UTF-8 codes stored in the areas it designated on the memory, in the order the output program 32 specifies.
The input program 22 treats the size of the character data (the length of the item) in the input file 21 as the number of bytes of the byte sequence, designates that many areas on the memory, and stores the byte sequence of the character data in them. That is, on the current computer 2, the input program 22 treats each area designated on the memory as an area that stores 1 byte of data, as in the conventional art, and processes the character data in the input file 21 in units of bytes, which in effect processes the character data one character at a time in sequence.
For character data whose byte sequence changes length when the character code is converted from the EBCDIK + KEIS code to the UTF-8 code, the exchange information E makes the number of areas that the output program 32 designates on the memory equal to the number of areas that the input program 22 designated. For example, for a full-width character, whose byte sequence grows from 2 bytes to 3 bytes in the conversion, the output program 32 can refer to the exchange information E and designate 2 areas on the memory rather than 3 as in the conventional technique. The area storage unit 14 stores the one UTF-8 code assigned to the full-width character in those 2 (consecutive) areas.
Therefore, the output program 32 can treat each area designated on the memory as an area that stores the data of 1 character, rather than 1 byte of data, and handle the character data in the output file 31 in units of characters. As a result, on the new computer 3, the output program 32 can process the character data in the output file 31 one character at a time in sequence, just as the input program 22 processes the character data in the input file 21 one character at a time in sequence. That is, even when the migration involves a character code conversion that changes the size of the character data, the logic of the output program 32 can be kept the same as the logic of the input program 22. The migration worker does not need to correct the logic-related parts of the source code of the output program 32.
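One way to picture the area storage described above is the following Python sketch. The description does not say how one UTF-8 code is laid out across 2 consecutive areas, so this sketch makes an assumption: the first area holds the whole code and any further areas hold a `None` placeholder. The function names `store_in_areas` and `read_characters` are hypothetical, and the per-character area counts are passed in as they would be supplied by the exchange information E.

```python
from typing import Iterable, List, Optional, Tuple

Area = Optional[bytes]  # one designated area; None marks a reserved placeholder


def store_in_areas(chars_with_areas: Iterable[Tuple[str, int]]) -> List[Area]:
    """Store each character's UTF-8 code in its designated number of areas.

    Assumed layout: the first area holds the complete UTF-8 code and the
    remaining areas of that character are placeholders.
    """
    areas: List[Area] = []
    for ch, area_count in chars_with_areas:
        areas.append(ch.encode("utf-8"))
        areas.extend([None] * (area_count - 1))
    return areas


def read_characters(areas: List[Area]) -> str:
    """Read the field back one character at a time, skipping placeholders."""
    return "".join(a.decode("utf-8") for a in areas if a is not None)


# (character, number of areas): 1 for half-width characters, 2 for full-width.
field = store_in_areas([("A", 1), ("\uff71", 1), ("\u6f22", 2)])

print(len(field))              # number of areas used, same as the legacy byte count
print(read_characters(field))  # the original three characters
```

The field occupies 4 areas, the same count the legacy program would have used for 1 + 1 + 2 bytes, even though the UTF-8 data is 7 bytes long.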
The job PC1 may also control the output program 32 so that it separately designates, on the memory, a predetermined number of areas (each storing 1 byte of data; for example, 3 areas for a full-width character) that hold, byte by byte, the byte sequence of the character data to which the UTF-8 code is assigned. The job PC1 controls the area storage unit 14 so that the 1 or 2 areas that store the one UTF-8 code are associated with this predetermined number of areas. Then, on the new computer 3, when the output program 32 accesses a UTF-8 code stored by the area storage unit 14, it can process the target character data by accessing the byte sequence stored in the associated areas.
Specific example
A specific example of converting a program in a migration that involves switching the character code system will be described with reference to fig. 4 and 5. In this example, both the pre-conversion program (corresponding to the input program 22) and the post-conversion program (corresponding to the output program 32) are written in COBOL. The pre-conversion program processes the EBCDIK + KEIS code, and the post-conversion program processes the UTF-8 code.
A comparative example representing the prior art is shown in fig. 4. The upper part of fig. 4 (a) shows an example description of the WORKING-STORAGE SECTION of the DATA DIVISION in the source code of the program before conversion. Within the group item DATA-A, the variables (items) DATA-A1 and DATA-A2 are declared in that order.
For DATA-A1, "PIC X" indicates that data storage areas of 1 byte per character (EBCDIK) are secured on the memory, and "(03)" indicates that there are 3 such areas (the number of digits is 3). DATA-A1 can therefore hold 3 characters of data (half-width characters).
For DATA-A2, "PIC N" indicates that data storage areas of 2 bytes per character (KEIS) are secured on the memory, and "(03)" indicates that there are 3 such areas (the number of digits is 3). DATA-A2 can therefore hold 3 characters of data (full-width characters).
In addition, the COBOL language declares variables with a fixed length.
The lower part of fig. 4 (a) is a schematic diagram of the areas that the above description secures. Each box represents 1 area, that is, a 1-byte data storage area. According to this diagram, the pre-conversion program designates 3 bytes of area on the memory for DATA-A1, so that 3 characters of data can be input to DATA-A1. It also designates 6 bytes of area (2 bytes × 3) on the memory for DATA-A2, so that 3 characters of data can be input to DATA-A2. In this way, the pre-conversion program designates areas that each store 1 byte of the byte sequence of the character data, as in the conventional art, and processes the character data in units of bytes (the bytes in the boxes are accessed in order from the left).
Here, when the program is converted along with the character codes during migration, the logic of the converted program must be corrected manually so that character data whose byte sequence for 1 character has a different number of bytes is processed correctly (so that the intended character data is read reliably).
The upper part of fig. 4 (b) shows an example description of the WORKING-STORAGE SECTION of the DATA DIVISION in the source code of the converted program. To keep the logic the same before and after conversion, the description example of fig. 4 (a) must be corrected as indicated by the underlined parts in the figure.
As one correction, the number of digits of DATA-A1 is changed from 3 to 9. The reason is that EBCDIK expresses 1 half-width kana character in 1 byte whereas UTF-8 expresses it in 3 bytes, so DATA-A1 needs an area of 9 bytes (3 bytes × 3 characters) to accept a byte sequence of 3 half-width kana characters (to prevent data overflow).
As another correction, the number of digits of DATA-A2 is changed from 3 to 5. The reason is that KEIS expresses 1 full-width character in 2 bytes whereas UTF-8 expresses it in 3 bytes, so DATA-A2 needs an area of at least 9 bytes (3 bytes × 3 characters) to accept a byte sequence of 3 full-width characters. In the example of fig. 4 (b), setting the number of digits of DATA-A2 to 5 gives DATA-A2 an area of 10 bytes (2 bytes × 5).
The lower part of fig. 4 (b) is a schematic diagram of the areas that the corrected description secures. Each box in fig. 4 (b) represents a 1-byte data storage area, as in fig. 4 (a). By increasing the number of boxes, the correction preserves in the post-conversion program the properties of the pre-conversion program, namely that 3 characters of data can be input to DATA-A1 and 3 characters of data can be input to DATA-A2. However, correcting the logic of the program by adding boxes in this way must be done for every variable in the program, and therefore requires an enormous amount of work.
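The digit counts in this comparative example follow from simple arithmetic, reproduced in the Python sketch below. The function name `corrected_digits` and the constant `UTF8_WORST_CASE_BYTES` are illustrative; the 3-byte worst case is the figure the description gives for half-width kana and full-width characters, not a general bound for UTF-8.

```python
import math

# Per the description, half-width kana and full-width characters each take
# 3 bytes in UTF-8, so 3 is the worst case considered in this example.
UTF8_WORST_CASE_BYTES = 3


def corrected_digits(chars: int, bytes_per_digit: int) -> int:
    """Digits a byte-oriented field needs to hold `chars` characters of UTF-8.

    bytes_per_digit is 1 for PIC X and 2 for PIC N in this example.
    """
    return math.ceil(chars * UTF8_WORST_CASE_BYTES / bytes_per_digit)


x_digits = corrected_digits(3, bytes_per_digit=1)  # DATA-A1: PIC X(03) -> PIC X(09)
n_digits = corrected_digits(3, bytes_per_digit=2)  # DATA-A2: PIC N(03) -> PIC N(05)

print(x_digits, x_digits * 1)  # digits and bytes for DATA-A1
print(n_digits, n_digits * 2)  # digits and bytes for DATA-A2
```

For DATA-A2 the 9 required bytes do not divide evenly into 2-byte digits, which is why the corrected field ends up with 10 bytes rather than 9.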
Fig. 5 shows the present embodiment. Fig. 5 (a) is the same as fig. 4 (a). That is, DATA of 3 characters can be input to the variable DATA-a1, and DATA of 3 characters can be input to the variable DATA-a 2.
Fig. 5 (b) shows an upper part of fig. 5 (b) as an example of a description of a data section work section in the source code of the converted program. In the present embodiment, the exchange information E described above may be used to convert the program.
As described above, according to the exchange information E, an area that the converted program designates on the memory functions as an area for storing the data of 1 character, not as an area for storing 1 byte of data. As shown in the lower part of Fig. 5 (b), this is equivalent to treating each box as a storage area for the data of 1 half-width alphanumeric, symbol, or kana character. Here, "half-width alphanumeric, symbol, or kana character" is a collective term for half-width alphanumeric characters, half-width symbols, and half-width kana characters. Two such storage areas arranged side by side represent the storage area for the data of 1 full-width character.
Therefore, in the description example of Fig. 5 (b), "PIC X (03)" of DATA-A1 indicates that storage areas for the data (UTF-8) of 3 half-width alphanumeric, symbol, or kana characters are secured on the memory. This means that data of 3 characters (character data to which UTF-8 codes are assigned) can be input to the variable DATA-A1 without increasing the number of digits as in Fig. 4 (b), that is, without correcting the logic.
Likewise, "PIC N (03)" of DATA-A2 indicates that storage areas for the data (UTF-8) of 3 full-width characters are secured on the memory. This means that data of 3 characters (character data to which UTF-8 codes are assigned) can be input to the variable DATA-A2 without increasing the number of digits as in Fig. 4 (b), that is, without correcting the logic.
As already explained, 1 UTF-8 code is stored in the storage area for 1 character, which consists of 1 or 2 of the half-width character areas. Therefore, when a predetermined business process is performed, the converted program can process the character data in units of characters by accessing the UTF-8 codes stored in the areas in a predetermined order.
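The character-area model described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the table `EXCHANGE_INFO`, the functions `store` and `read_characters`, and the use of `None` to mark the second area of a full-width character are all hypothetical and are not taken from the patent. The capacity of 6 areas for "PIC N (03)" assumes that each full-width character occupies 2 areas.

```python
from typing import Dict, List, Optional

# Hypothetical exchange information E: character -> number of areas it occupies.
EXCHANGE_INFO: Dict[str, int] = {
    "A": 1, "1": 1, "\uff71": 1,   # half-width letter, digit, and kana (U+FF71)
    "日": 2, "本": 2, "語": 2,      # full-width characters
}


def store(text: str, capacity: int) -> List[Optional[bytes]]:
    """Store one UTF-8 code per character in the areas reserved for that character."""
    areas: List[Optional[bytes]] = []
    for ch in text:
        count = EXCHANGE_INFO[ch]
        if len(areas) + count > capacity:
            raise ValueError("character data does not fit in the declared field")
        areas.append(ch.encode("utf-8"))    # the whole UTF-8 code goes in once
        areas.extend([None] * (count - 1))  # remaining areas of this character
    return areas


def read_characters(areas: List[Optional[bytes]]) -> List[str]:
    """Visit the areas in order and return the stored data character by character."""
    return [code.decode("utf-8") for code in areas if code is not None]


pic_x_03 = store("A1\uff71", capacity=3)   # stands in for DATA-A1, PIC X (03)
pic_n_03 = store("日本語", capacity=6)      # stands in for DATA-A2, PIC N (03)
print(read_characters(pic_x_03), read_characters(pic_n_03))
```

In this sketch the half-width kana occupies a single area even though its UTF-8 code is 3 bytes long, which is the property that lets the declared lengths stay at 3.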
In this way, by using the exchange information E, the converted program changes how it handles the areas designated on the memory, which eliminates the extremely large amount of work otherwise needed to correct the logic built into the program.
Summary of the invention
According to the present embodiment, when the character code conversion changes the number of bytes in the byte sequence representing 1 character, the converted output program 32 can, by referring to the exchange information E, process the character data using the same number of areas as the input program 22 uses. That is, the output program 32 can treat the area designated on the memory as 1 or more areas that each store the data of 1 character, and can process the character data in units of characters. Therefore, the logic built into the output program 32 can be the same as the logic built into the input program 22, and the logic-related portions of the source code of the output program 32 need not be corrected.
Therefore, in a migration that involves switching to a different character code system, conversion of the program to be migrated is made easier.
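The following Python sketch illustrates why byte-based logic stops working after the character code is converted, while character-based logic carries over unchanged. Shift_JIS is used as a stand-in for KEIS, on the assumption that both express a full-width character in 2 bytes. The sample string and the helper `decodes_cleanly` are illustrative only.

```python
field = "日本語"                           # three full-width characters
legacy_bytes = field.encode("shift_jis")   # stand-in for the pre-migration code
utf8_bytes = field.encode("utf-8")         # post-migration code


def decodes_cleanly(data: bytes, codec: str) -> bool:
    """Return True if data is a complete, valid byte sequence in codec."""
    try:
        data.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


# Logic written as "take the first 2 characters = take the first 4 bytes".
legacy_ok = decodes_cleanly(legacy_bytes[:4], "shift_jis")  # 2 whole characters
utf8_ok = decodes_cleanly(utf8_bytes[:4], "utf-8")          # cuts a character

# The same logic expressed in units of characters needs no change.
by_character = field[:2]
print(legacy_ok, utf8_ok, by_character)
```

The 4-byte slice holds two whole characters in the 2-byte code but ends in the middle of the second character in UTF-8, whereas the character-unit slice yields the same two characters under either code.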
(others)
The present embodiment describes a migration from a character code system using EBCDIK and KEIS to one using UTF-8. However, the present invention is also applicable to a migration that involves switching from a character code system using JIS8 and SJIS to one using UTF-8.
Note that JIS8 expresses 1 character in 1 byte for half-width alphanumeric characters, half-width symbols, and half-width kana characters. SJIS expresses 1 character in 2 bytes for full-width characters.
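The byte counts mentioned above can be compared with UTF-8 in a few lines of Python. This is a rough sketch: Python's `shift_jis` codec is used here to stand in for the JIS8 and SJIS pair, since it encodes half-width characters in 1 byte and full-width characters in 2 bytes. The sample characters and the function name `byte_counts` are illustrative.

```python
from typing import Tuple

SAMPLES = {
    "half-width letter": "A",
    "half-width kana": "\uff71",   # U+FF71, half-width katakana A
    "full-width kanji": "日",
}


def byte_counts(ch: str) -> Tuple[int, int]:
    """Return (bytes before migration, bytes in UTF-8) for a single character."""
    return len(ch.encode("shift_jis")), len(ch.encode("utf-8"))


for label, ch in SAMPLES.items():
    before, after = byte_counts(ch)
    print(f"{label}: {before} byte(s) before migration, {after} byte(s) in UTF-8")
```

The half-width kana grows from 1 byte to 3 bytes and the full-width character from 2 bytes to 3 bytes, so the same mismatch arises as in the EBCDIK and KEIS case.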
In the present embodiment, the area storage unit 14 stores the UTF-8 code in an area on the memory designated by the output program 32. However, what is stored need not be the UTF-8 code; any form of data from which the corresponding character data can be identified may be stored.
In the present embodiment, when the exchange information generation unit 13 generates the exchange information E, the character data to which the UTF-8 code is assigned and which the output program 32 reads is, for example, character data extracted from the output file 31. Alternatively, in order to generate the exchange information E for all characters in a predetermined character set (for example, the set of all characters that exist when migrating to an open server computer that handles UTF-8), the job PC 1 may acquire character data to which UTF-8 codes are assigned from outside in advance and cause the output program 32 to read the acquired character data.
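Generating the exchange information E in advance for a whole character set could look roughly like the following Python sketch. It is not the patent's implementation: the function `build_exchange_info`, the chosen code point ranges, and the use of Shift_JIS as the pre-migration code are assumptions made for illustration. The number of areas is taken to be the number of bytes of the pre-migration code, following the definition used for the exchange information.

```python
from itertools import chain
from typing import Dict, Iterable


def build_exchange_info(characters: Iterable[str],
                        legacy_codec: str = "shift_jis") -> Dict[str, int]:
    """Map each character to the number of bytes of its pre-migration code."""
    info: Dict[str, int] = {}
    for ch in characters:
        try:
            info[ch] = len(ch.encode(legacy_codec))
        except UnicodeEncodeError:
            continue  # the character has no code in the pre-migration system
    return info


# Illustrative character set: printable ASCII, half-width katakana,
# hiragana, and the basic CJK ideograph block.
code_points = chain(range(0x20, 0x7F), range(0xFF61, 0xFFA0),
                    range(0x3041, 0x3094), range(0x4E00, 0xA000))
exchange_info = build_exchange_info(chr(cp) for cp in code_points)
print(exchange_info["A"], exchange_info["\uff71"], exchange_info["日"])
```

Building the table once for the whole set means no character encountered later lacks an entry, at the cost of including characters the files may never contain.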
Further, various techniques described in this embodiment can be combined as appropriate.
The software described in this embodiment can be implemented as hardware, and the hardware can also be implemented as software.
Further, hardware, software, flowcharts, and the like may be modified as appropriate without departing from the scope of the present invention.
Description of the symbols
1 job PC (migration support device);
11 character code conversion unit;
12 program converting section;
13 exchange information generation unit;
14 area storage unit;
2 current computer (1st computer);
21 input file (1st document file);
22 input program (1st program);
3 new computer (2nd computer);
31 output file (2nd document file);
32 output program (2nd program);
T character code conversion table;
E exchange information.

Claims (2)

1. A migration support device for supporting migration from a 1st computer to a 2nd computer, the migration support device comprising:
a character code conversion unit that, with reference to a character code conversion table provided in a storage unit, converts a 1st character code assigned to character data in a 1st document file provided to the 1st computer into a 2nd character code assigned to character data in a 2nd document file provided to the 2nd computer;
a program converting section that converts a 1st program, which the 1st computer has for processing the 1st document file, into a 2nd program, which the 2nd computer has for processing the 2nd document file;
an exchange information generation unit that causes the 2nd program to read character data to which the 2nd character code is assigned and generates, for the read character data, exchange information that determines the number of areas on the memory designated by the 2nd program to be the same as the number of bytes of the byte sequence expressing the 1st character code assigned to the character data; and
an area storage unit that stores one 2nd character code assigned to the read character data in the number of areas determined by the exchange information.
2. The migration support device according to claim 1, wherein
when the character data is character data of a half-width alphanumeric character, a half-width symbol, or a half-width kana character, the number of areas determined by the exchange information is 1, and
when the character data is character data of a full-width character, the number of areas determined by the exchange information is 2.
CN201580046561.8A 2014-08-29 2015-08-28 Migration support device Active CN106663020B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2014-174745 2014-08-29
JP2014174745A JP6491438B2 (en) 2014-08-29 2014-08-29 Migration support device
PCT/JP2015/074401 WO2016031959A1 (en) 2014-08-29 2015-08-28 Migration support device

Publications (2)

Publication Number Publication Date
CN106663020A CN106663020A (en) 2017-05-10
CN106663020B true CN106663020B (en) 2020-05-01

Family

ID=55399842

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580046561.8A Active CN106663020B (en) 2014-08-29 2015-08-28 Migration support device

Country Status (3)

Country Link
JP (1) JP6491438B2 (en)
CN (1) CN106663020B (en)
WO (1) WO2016031959A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6720993B2 (en) * 2018-03-07 2020-07-08 オムロン株式会社 Support devices and programs
CN117270961B (en) * 2023-11-21 2024-04-12 武汉蜂鸟龙腾软件有限公司 Method for analyzing and loading MFC character resources in Linux environment

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1035195A (en) * 1987-12-11 1989-08-30 骆守昌 Character recognition device
CN1075563A (en) * 1992-02-18 1993-08-25 国际商业机器公司 Improving one's methods of the exchange code conversion of multi-byte character string characters
JPH11203279A (en) * 1998-01-19 1999-07-30 Toshiba Corp Kana-kanji conversion device and method and storage medium
CN1235309A (en) * 1998-05-11 1999-11-17 日本先锋公司 Production of document data including dynamic character refresentation
JP2000105765A (en) * 1998-09-28 2000-04-11 Toshiba Corp Data converting device
CN1321362A (en) * 1999-07-13 2001-11-07 索尼公司 Method of generating distribution content, method and apparatus for content distribution, and method of code conversion
CN1324031A (en) * 2000-02-25 2001-11-28 株式会社东芝 Character code transition system under multiple platform condition, and computer readable recording medium
CN1722221A (en) * 2004-07-15 2006-01-18 索尼株式会社 Character information conversion device and method
CN101079023A (en) * 2003-01-24 2007-11-28 株式会社理光 Character string processing apparatus, character string processing method, and image-forming apparatus
CN101553810A (en) * 2006-08-10 2009-10-07 夏普株式会社 Character converting device and character converting device control method
WO2014002281A1 (en) * 2012-06-29 2014-01-03 株式会社エス・ケイ・ケイ Document processing system, electronic document, document processing method, and program

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2507980B2 (en) * 1993-06-11 1996-06-19 日本電気株式会社 Data conversion program automatic generation method
JP2008226010A (en) * 2007-03-14 2008-09-25 Hitachi Ltd Compile method and compile device
JP2010224656A (en) * 2009-03-19 2010-10-07 Ns Solutions Corp Source code generation device, program and source code generation method

Also Published As

Publication number Publication date
WO2016031959A1 (en) 2016-03-03
JP2016051235A (en) 2016-04-11
JP6491438B2 (en) 2019-03-27
CN106663020A (en) 2017-05-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1234839; Country of ref document: HK)
CB02 Change of applicant information (Address after: Tokyo, Japan; Applicant after: Hitachi Social Information Service Co., Ltd.; Address before: Tokyo, Japan; Applicant before: HITACHI GOVERNMENT & PUBLIC CORPORATION SYSTEM ENGINEERING, LTD.)
GR01 Patent grant