Embodiment
The present inventor has observed that an eight-bit byte can take at most 256 values, that these values can be divided into 4 groups of 64 values each, and that each group can be represented by a special character. In addition, the first two bits of any byte must be one of the following four combinations: 00, 01, 10, 11. Therefore, if a byte is split into two parts, one consisting of the first two bits and the other of the remaining 6 bits, the byte can be regarded as a 2-bit prefix (00, 01, 10 or 11) followed by a 6-bit part.
Because a 6-bit value can be converted into a printable character by the BASE64 encoding scheme, assigning a special printable character to the prefix allows any eight-bit byte to be represented by one special printable character and one BASE64 character. The encoding scheme of the present invention uses the BASE64 character set and four special characters to represent an eight-bit byte. These special characters may be any printable characters, that is, characters whose values in the standard 128-character ASCII set are 32 to 126, except those already used by the BASE64 character set. The special characters may include, for example, @, #, $, %, &, *, (, ), ^, !, -, ?, <, >, ", ', :, ;, [, ], {, }, | and so on.
Because each eight-bit byte can be composed of a prefix and a 6-bit part, the method of the present invention can process the input data byte by byte, rather than in groups of three bytes as in BASE64.
An eight-bit byte has 256 possible values. These 256 values are mapped to the 64 characters of the BASE64 character set in one of 4 groups. Each group is represented by one of 4 special characters selected from @, #, $, %, &, *, (, ), ^, !, -, ?, <, >, ", ', :, ;, [, ], {, }, |. For example, with 00=@, 01=#, 10=$, 11=%, a byte can be converted into a prefix, represented by one of the four characters @, #, $, %, and a BASE64 character.
The mapping between bytes and printable characters (special characters and BASE64 characters) can be set up as follows:
Mapping example 1:
Alternatively, the mapping can be expressed as 'prefix' + BASE64. That is to say, any eight-bit byte, whether its value lies in the range 0-63 or in the range 192-255, can be expressed as a 2-bit prefix and 6 bits corresponding to a decimal number 0-63. For the 2-bit prefix, @ represents 00, # represents 01, $ represents 10, and % represents 11.
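By way of illustration only, the 'prefix' + BASE64 representation of a single byte may be sketched in Python as follows. The names BASE64_ALPHABET, PREFIX_CHARS, split_byte and encode_byte are hypothetical and do not appear in the description; the sketch assumes the standard BASE64 alphabet (value 0 is A, value 63 is /) and the prefix assignment of mapping example 1.

```python
import string

# Assumption: the standard BASE64 alphabet, value 0 -> 'A', ..., value 63 -> '/'.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

# Prefix characters of mapping example 1: 00 -> @, 01 -> #, 10 -> $, 11 -> %.
PREFIX_CHARS = "@#$%"


def split_byte(value):
    """Return (2-bit prefix, 6-bit part) of an eight-bit value."""
    if not 0 <= value <= 255:
        raise ValueError("expected an eight-bit value")
    return value >> 6, value & 0x3F


def encode_byte(value):
    """Return the prefix character followed by the BASE64 character."""
    prefix, six_bits = split_byte(value)
    return PREFIX_CHARS[prefix] + BASE64_ALPHABET[six_bits]
```

For instance, encode_byte(0xFB) splits 11111011 into prefix 11 and 6-bit part 111011 (decimal 59) and yields %7.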
Importantly, however, because the special characters are selected from the character group @, #, $, %, &, *, (, ), ^, !, -, ?, <, >, ", ', :, ;, [, ], {, }, |, other mappings can also be used, such as:
Mapping example 2:
Here % represents 00, $ represents 01, # represents 10, and @ represents 11.
In fact, special characters other than @, #, $, % can also be used to represent 00, 01, 10, 11 respectively.
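As a hedged sketch of how the choice of special characters might be made configurable, the following Python fragment builds a per-byte encoder from any four prefix characters. The function name make_byte_encoder and its validation rules are assumptions made for illustration; the checks simply restate the constraints given above (printable ASCII 32 to 126, not already in the BASE64 character set).

```python
import string

# Assumption: the standard BASE64 alphabet, value 0 -> 'A', ..., value 63 -> '/'.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)


def make_byte_encoder(prefix_chars):
    """Build an encoder in which prefix_chars[i] represents the 2-bit prefix i."""
    if len(prefix_chars) != 4 or len(set(prefix_chars)) != 4:
        raise ValueError("exactly four distinct prefix characters are required")
    for ch in prefix_chars:
        if not 32 <= ord(ch) <= 126 or ch in BASE64_ALPHABET:
            raise ValueError(f"unsuitable prefix character: {ch!r}")

    def encode_byte(value):
        return prefix_chars[value >> 6] + BASE64_ALPHABET[value & 0x3F]

    return encode_byte


encode_example_1 = make_byte_encoder("@#$%")  # mapping example 1
encode_example_2 = make_byte_encoder("%$#@")  # mapping example 2
```

Under mapping example 2 the byte 11111011 would be written @7 instead of %7, since @ then stands for the prefix 11.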
Because there are only 4 special characters, two successive bytes in a binary data stream may have the same prefix; in other words, they may belong to the same group of the BASE64 character set. A prefix is therefore used, or output, only when the byte is not in the same group (does not have the same prefix) as the immediately preceding byte.
The present invention is now explained by way of example.
Fig. 1 is a flowchart showing the method of the invention. First, an input byte is assigned to one of four groups according to a mapping such as mapping example 1 (step 102).
Then the class of the code, that is, the prefix designation of this byte, is determined according to mapping example 1 (step 103).
Then this prefix is compared with the prefix of the preceding byte (step 104a). If it represents the same group as the prefix of the preceding byte, or in other words, if it and the prefix of the preceding byte are represented by the same special character, this prefix is ignored (step 104b).
At the same time, the byte is divided into a 2-bit prefix and a 6-bit part (step 104). The 6-bit part is converted into a BASE64 character in the conventional manner (step 105). That is to say, the matching is completed by finding, in the BASE64 character set, the BASE64 character that corresponds to the 6-bit part. For example, if the 6-bit part is 011001, it represents the capital letter Z in the BASE64 character set.
Finally, the BASE64 character (for example, Z) is output together with the prefix, for example for transmission (step 106), provided that the prefix of the preceding byte does not represent the group of Z. If, however, the prefix is identical to the prefix of the preceding byte, only the BASE64 character is output and the prefix is ignored.
As shown in Fig. 2, it is also possible not to group the byte first, but to divide it directly into a 2-bit prefix and a 6-bit part (step 202). The special character corresponding to the prefix is then looked up (step 203a). If the prefix is 00, it is represented by @; if the prefix is 01, by #; if the prefix is 10, by $; and if the prefix is 11, by %.
Then the represented prefix is compared with the prefix of the preceding byte (step 204). If it represents the same group as the preceding byte, or in other words, if it is represented by the same special character as the prefix of the immediately preceding byte, the represented prefix is omitted (step 205).
If the represented prefix stands for a different group, that is, if it differs from the representation of the preceding byte's prefix, the prefix is output.
At the same time, the 6-bit part is converted into a BASE64 character in the conventional manner (step 203b), that is, by finding in the BASE64 character set the BASE64 character that corresponds to the 6-bit part. For example, if the 6-bit part is 011001, it is represented by 'Z' in the BASE64 character set.
Finally, the BASE64 character (for example, Z) is output together with the represented prefix, provided that the prefix of the preceding byte does not represent the group of Z (step 206). If, however, the represented prefix is the same as the prefix of the preceding byte, only the BASE64 character is output and the prefix is omitted.
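The procedure of Fig. 2 may be sketched, under stated assumptions, as the following Python function. The name encode and its parameters are hypothetical; the sketch assumes the standard BASE64 alphabet and mapping example 1, and it treats the first byte of a stream as having no preceding prefix, so that its prefix is always output.

```python
import string

# Assumption: the standard BASE64 alphabet, value 0 -> 'A', ..., value 63 -> '/'.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)


def encode(data, prefix_chars="@#$%"):
    """Encode bytes byte by byte, omitting a prefix that repeats the previous one."""
    out = []
    previous_prefix = None  # no preceding byte yet: the first prefix is output
    for value in data:
        prefix, six_bits = value >> 6, value & 0x3F  # step 202
        if prefix != previous_prefix:                # step 204
            out.append(prefix_chars[prefix])         # prefix differs: output it
            previous_prefix = prefix
        out.append(BASE64_ALPHABET[six_bits])        # step 203b
    return "".join(out)
```

As a usage example, encode(b"lad") produces #shk: the three bytes share the prefix 01, so # is written only once.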
For the hexadecimal data sequence 0x14fb9c03d9, the 8-bit binary representation is 00010100 11111011 10011100 00000011 11011001.
According to the present invention, in the binary-to-text conversion of this data, the first byte is taken first and divided into a prefix 00 and a 6-bit part 010100. If mapping example 1 is used, the prefix 00 is represented by the special character @.
This prefix is then compared with the prefix of the immediately preceding byte. In this case, because 00010100 is the first byte in the data stream, the prefix is retained for output.
The 6-bit part 010100 (decimal 20) is converted into the BASE64 character U, which is then output together with the prefix @ in the form @U.
The second byte 11111011 is processed next. It is divided into a prefix 11 and a 6-bit part 111011. The prefix 11 corresponds to the special character %. It is compared with the prefix of the preceding byte 00010100, that is, with the special character @ representing 00. Because the two prefixes are not in the same group, the prefix % is output for transmission.
The 6-bit part 111011 is converted into the BASE64 character 7, which is then output together with the prefix % in the form %7.
The third byte 10011100 is processed similarly. It is divided into a prefix 10 and a 6-bit part 011100. The prefix 10 corresponds to the special character $. It is compared with the prefix of the preceding byte 11111011, that is, with the special character % representing 11. Because the two prefixes are not in the same group, the prefix $ is output for transmission.
The 6-bit part 011100 is converted into the BASE64 character c, which is then output together with the prefix $ in the form $c.
The fourth byte 00000011 is also processed similarly. It is divided into a prefix 00 and a 6-bit part 000011. The prefix 00 corresponds to the special character @, and the 6-bit part 000011 is converted into the BASE64 character D.
Although the prefix @ is identical to the prefix of the first byte, it differs from the prefix of the immediately preceding byte, 10011100, so the prefix @ is output together with the BASE64 character representing 000011.
Similarly, the fifth byte 11011001 is divided into a prefix 11 and a 6-bit part 011001. The prefix 11 is represented by the character %, and the 6-bit part 011001 is converted into the letter Z.
Although the prefix % is in the same group as the prefix of the second byte, it is not in the same group as the immediately preceding byte, 00000011, so the prefix % is output together with the letter Z representing 011001.
The encoding process is carried out byte by byte.
In the decoding process, the procedure is reversed. That is, using the mapping relations and the BASE64 scheme, the printable ASCII characters are converted back into 8-bit binary data. For example, @U is first divided into @ and U; @ is converted into 00 and U into 010100, so @U is converted into 00010100.
Similarly, %7, $c, @D and %Z are converted back into 11111011, 10011100, 00000011 and 11011001 respectively. Table 1 and Table 2 show the result of converting the data (hexadecimal) 0x14fb9c03d9 into ASCII characters.
Table 1
Input data: | 0x14fb9c03d9 | | | | |
HEX: | 14 | fb | 9c | 03 | d9 |
Binary (8-bit): | 00010100 | 11111011 | 10011100 | 00000011 | 11011001 |
Decimal: | 20 | 251 | 156 | 3 | 217 |
Output: | @U | %7 | $c | @D | %Z |
Table 2
Input data: | 0x14fb9c03d9 | | | | |
HEX: | 14 | fb | 9c | 03 | d9 |
Binary (8-bit): | 00010100 | 11111011 | 10011100 | 00000011 | 11011001 |
Prefix: | @ | % | $ | @ | % |
6-bit value (decimal): | 20 | 59 | 28 | 3 | 25 |
Output: | @U | %7 | $c | @D | %Z |
In the encoded data, characters that are not in the BASE64 character set, such as carriage returns and other whitespace characters, may indicate a transmission error.
In the decoding process, characters that do not match any character in the BASE64 character set, or that lie outside it, are ignored and not decoded. Carriage returns and other characters that appear neither in the BASE64 character set nor in the list of special prefix characters are likewise ignored.
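A minimal decoding sketch in Python, given for illustration only, might look as follows. The names decode and BASE64_INDEX are hypothetical, the standard BASE64 alphabet and mapping example 1 are assumed, and the handling of a BASE64 character that arrives before any prefix (it is skipped) is an assumption not stated in the description.

```python
import string

# Assumption: the standard BASE64 alphabet, value 0 -> 'A', ..., value 63 -> '/'.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)
BASE64_INDEX = {ch: i for i, ch in enumerate(BASE64_ALPHABET)}


def decode(text, prefix_chars="@#$%"):
    """Convert prefix/BASE64 text back into bytes, ignoring unknown characters."""
    out = bytearray()
    current_prefix = None
    for ch in text:
        if ch in prefix_chars:
            # A special character switches the current group.
            current_prefix = prefix_chars.index(ch)
        elif ch in BASE64_INDEX and current_prefix is not None:
            # Recombine the remembered 2-bit prefix with the 6-bit value.
            out.append((current_prefix << 6) | BASE64_INDEX[ch])
        # Any other character (carriage return, whitespace, ...) is ignored.
    return bytes(out)
```

For example, decode("%7$c@D%Z") returns the four bytes 0xfb, 0x9c, 0x03, 0xd9, and a line break inserted into the text does not change the result.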
Referring now to Fig. 3, Fig. 3 is a block diagram of one embodiment of a system embodying the present invention.
The system according to the present invention comprises: a buffer (301); a byte separator (302) connected to the buffer; a prefix generator (303) and a BASE64 generator (304), each connected to the byte separator; and an ASCII character generator (305) connected to the outputs of the prefix generator (303) and the BASE64 generator (304).
First, a binary data stream is fed into the byte separator or a mechanism performing a similar function. If the data is hexadecimal, it must be converted into eight-bit bytes. The separating mechanism divides each byte into two parts, namely a 2-bit prefix and the remaining 6-bit part.
The 2-bit prefix is matched with one of the four predetermined special printable characters and represented by that character. This is done in a device that performs this function, such as the prefix generator.
The 6-bit part is converted into a BASE64 character in the conventional BASE64 manner. This is done in a device that performs this function, such as the BASE64 generator.
Then, in a device that produces ASCII characters, such as the ASCII generator, the generated prefix is compared with the prefix of the immediately preceding byte to determine whether the current byte and the immediately preceding byte belong to the same group. If they belong to the same group, that is, if the generated prefix is identical to the prefix of the immediately preceding byte, only the BASE64 character is produced and output as an ASCII character. If they differ, the generated prefix and the BASE64 character are both output as ASCII characters for transmission or further use.
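Purely as an illustration of how the components of Fig. 3 might cooperate in software, the following Python sketch models each component as a small function or generator. The function names mirror the component names but are otherwise hypothetical, io.BytesIO stands in for the buffer (301), and the chunk_size parameter is an assumption; the description does not prescribe any particular implementation.

```python
import io
import string

# Assumption: the standard BASE64 alphabet, value 0 -> 'A', ..., value 63 -> '/'.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)
PREFIX_CHARS = "@#$%"  # mapping example 1


def byte_separator(buffer, chunk_size=4096):
    """Read from the buffer and yield (2-bit prefix, 6-bit part) pairs."""
    while True:
        chunk = buffer.read(chunk_size)
        if not chunk:
            break
        for value in chunk:
            yield value >> 6, value & 0x3F


def prefix_generator(prefix):
    """Represent a 2-bit prefix by its special printable character."""
    return PREFIX_CHARS[prefix]


def base64_generator(six_bits):
    """Convert a 6-bit part into its BASE64 character."""
    return BASE64_ALPHABET[six_bits]


def ascii_generator(pairs):
    """Yield output characters, dropping a prefix equal to the preceding one."""
    previous = None
    for prefix, six_bits in pairs:
        prefix_char = prefix_generator(prefix)
        if prefix_char != previous:
            yield prefix_char
            previous = prefix_char
        yield base64_generator(six_bits)


buffer = io.BytesIO(bytes.fromhex("fb9c03d9"))
encoded = "".join(ascii_generator(byte_separator(buffer)))
```

Because each stage consumes one byte at a time, the sketch processes a stream of any length without holding more than one chunk in memory.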
It should be noted that those skilled in the art can implement the present invention in various ways or by various methods, and not only in the manner described above. For example, a function may be realized not by a single device but by, for example, a combination of two devices.
Fig. 4 is a schematic block diagram showing another embodiment of the system of the present invention. In Fig. 4, the byte separator performs an adaptive, "dynamic" mapping rather than a "static" mapping, in which the mapping between eight-bit bytes and printable characters is predetermined as described above.
In the embodiment shown in Fig. 3, the byte separator performs a so-called "static" mapping, because the mapping between eight-bit bytes and the 4 BASE64 groups is predetermined or fixed. The mapping can, however, also be established dynamically. For example, the bytes in a data stream can be grouped based on the occurrence of byte values as follows:
Mapping example 3 (dynamic mapping 1):
In this case, the first 64 bytes in the data stream are represented by @, the following 64 bytes by #, and so on. That is to say, the bytes are grouped in the order in which they are transmitted, regardless of their values. The mapping between bytes and BASE64 characters can be established with a mapping table such as the following:
Byte 1 | Byte 2 | Byte 3 | Byte 4 | Byte 5 | Byte 6 | Byte 7 | Byte .. | Byte 64 |
A | B | C | D | B | F | G | ... | / |
Bytes with the same value are mapped to the same character. For example, because byte 2 and byte 5 have the same value, they are mapped to the same symbol, for example the capital letter B.
The following is an example of a "dynamic" mapping. For the input string "I'm glad to see you....", if the dynamic mapping of mapping example 3 is used, the conversion is:
Text: | I | ' | m | sp | g | l | a | d | sp | t | o | sp | s | ... |
Decimal: | 73 | 39 | 109 | 32 | 103 | 108 | 97 | 100 | 32 | 116 | 111 | 32 | 115 | ... |
Hexadecimal: | 49 | 27 | 6D | 20 | 67 | 6C | 61 | 64 | 20 | 74 | 6F | 20 | 73 | ... |
Binary: | 01001001 | 00100111 | 01101101 | 00100000 | 01100111 | 01101100 | 01100001 | 01100100 | 00100000 | 01110100 | 01101111 | 00100000 | 01110011 | ... |
Static 1: | #J | @n | #t | @g | #n | #s | #h | #k | @g | #0 | #v | @g | #z | ... |
Dynamic 1: | A | B | C | D | E | F | G | H | D | I | J | D | K | |
With the "static" mapping described above, the output is: #J@n#t@g#nshk@g#0v@g#z...
With the "dynamic" mapping, because the first 64 bytes all fall in the @+BASE64 group, @ is omitted and the output is: ABCDEFGHDIJDK...
In this case, the "dynamic" mapping table is:
Hexadecimal: | 49 | 27 | 6D | 20 | 67 | 6C | 61 | 64 | 74 | 6F | 73 | ... |
Dynamic: | A | B | C | D | E | F | G | H | I | J | K | |
In some applications, such as e-mail text, fewer than 64 distinct byte values are used. In such cases the "dynamic" mapping of ASCII has an advantage over the "static" arrangement: because there is no switching between groups, the output is smaller.
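One possible reading of mapping example 3 is sketched below in Python, under several assumptions that the description leaves open. Distinct byte values are numbered in order of first appearance; the first 64 distinct values form the @ group, the next 64 the # group, and so on. The stream is assumed to start in the @ group, so that a leading @ is omitted as in the example output above. The function name dynamic_encode is hypothetical, and the sketch simply returns the mapping table, since the description does not state how the table reaches the decoder.

```python
import string

# Assumption: the standard BASE64 alphabet, value 0 -> 'A', ..., value 63 -> '/'.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)


def dynamic_encode(data, prefix_chars="@#$%"):
    """Encode bytes with a table built in order of first appearance."""
    table = {}         # byte value -> index (0..255) in order of first appearance
    out = []
    current_group = 0  # assumption: the stream starts in the @ group
    for value in data:
        index = table.setdefault(value, len(table))
        group, position = divmod(index, 64)
        if group != current_group:
            out.append(prefix_chars[group])  # group changed: output its prefix
            current_group = group
        out.append(BASE64_ALPHABET[position])
    return "".join(out), table
```

For the 13 bytes of "I'm glad to s" the sketch returns ABCDEFGHDIJDK, against the 22 characters of the "static" output, which illustrates the saving when fewer than 64 distinct byte values occur.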
The present invention has been described above by way of example. It should be noted that the invention is not limited to the examples described; those skilled in the art can make various modifications to it without departing from the spirit of the invention.