CN101692254B

CN101692254B - Method and device for displaying multi-Unicode language character codes

Info

Publication number: CN101692254B
Application number: CN200910210131A
Authority: CN
Inventors: 张更; 毛斌利; 桑广莲
Original assignee: ZTE Corp
Current assignee: Dong Guoqiang
Priority date: 2009-10-27
Filing date: 2009-10-27
Publication date: 2012-09-05
Anticipated expiration: 2029-10-27
Also published as: CN101692254A; WO2010145135A1

Abstract

The invention discloses a method and a device for displaying multi-Unicode language character codes. The method comprises the following steps: presetting a Unicode extension character library comprising lattice information of common language characters and an extension Unicode code sequence corresponding to each language character; acquiring the original Unicode code sequence of a character string to be displayed; converting the original Unicode code sequence of each character in the character string to acquire the extension Unicode code sequence corresponding to each character; and using the obtained extension Unicode code sequence to find the lattice information corresponding to the character string in the extension character library, and displaying the character string according to the lattice information. With the invention, when a character string is displayed, its lattice information can be acquired directly from the Unicode extension character library by means of the extension Unicode code sequence and output for display. The display position and deformation result of each character need not be obtained by calculation, and each character can be displayed by reading the character library system only once, so that the display speed is greatly improved.

Description

Method and device for displaying multi-Unicode language character codes

Technical field

The present invention relates to the field of character library systems and character code display for multi-Unicode languages, and in particular to a method and device for displaying multi-Unicode language character codes.

Background technology

The world has a great many languages, and most minority languages are multi-Unicode languages: the Unicode standard (Unicode Standard 5.1) contains only the Unicode codes of their letters and no Unicode code for each character, so each character must be represented by a sequence of an indefinite number of letter Unicode codes. In other words, the main feature of these languages is that a character is composed of several Unicode codes and the number of codes is not fixed. Take the language of a certain country as an example. Unicode Standard 5.1 contains only the Unicode codes (0x1780~0x17FF) of its basic language elements such as consonants, vowels, and tone marks, whereas about 4000 characters are in actual use in that language. Each character is composed of consonants, vowels, tone marks, and so on, and a complex character may consist of 7 or 8 Unicode codes.

With the rapid development of electronic information technology, computers and other electronic products have gradually become widespread, and digitization that integrates data processing, Internet access, and communication functions has become a trend. With the development of embedded systems, the processing of multi-Unicode languages is no longer confined to computers: small electronic products with embedded systems, such as mobile phones, personal digital assistants (PDAs), and MP3 players, also need to input and process multi-Unicode languages. Because of the inherent complexity of these languages, together with the real-time requirements of embedded systems, this processing places a heavy burden on the CPU.

Take the language of a certain country as an example. When Windows displays the script of this language, it relies on the combining display behaviour of the script: the character library contains only the lattice information of the standard Unicode characters, such as vowels, consonants, and tone marks. To display a character, the Unicode sequence of the character is read code by code, and the output position and deformation result of each Unicode code are then calculated. This reduces the complexity of the character library but places a huge burden on the display routine. Especially under the limited resources of an embedded system, this display method becomes a bottleneck. On a NOKIA mobile phone that supports this language, for example, one full-screen refresh (200*160 dot matrix) takes more than 1 s; that is, the user sees the contents of a menu only 1 second after selecting it.

Summary of the invention

Embodiments of the invention provide a method and device for displaying multi-Unicode language character codes, in order to solve the problem in the prior art that calculating the output position and deformation result of each Unicode code is highly complex and places a heavy burden on the display routine.

A method for displaying multi-Unicode language character codes, in which a Unicode extension character library is set in advance, the Unicode extension character library comprising the lattice information of commonly used language characters and the extension Unicode code sequence corresponding to each language character, the method comprising:

Obtaining the original Unicode code sequence of a character string to be displayed;

Converting the original Unicode code sequence of each character in the character string to obtain the extension Unicode code sequence corresponding to each character;

Using the obtained extension Unicode code sequence to look up, in the extension character library, the lattice information corresponding to the character string, and displaying the character string according to this lattice information.

A device for displaying multi-Unicode language character codes, comprising:

A presetting module, used to set a Unicode extension character library in advance, the Unicode extension character library comprising the lattice information of commonly used language characters and the extension Unicode code sequence corresponding to each language character;

An original Unicode code sequence acquisition module, used to obtain the original Unicode code sequence of a character string to be displayed;

An extension Unicode code conversion module, used to convert the original Unicode code sequence of each character in the character string and obtain the extension Unicode code sequence corresponding to each character;

A display output module, used to look up, in the presetting module, the lattice information corresponding to the character string by means of the obtained extension Unicode code sequence, and to display the character string according to this lattice information.

With the embodiments of the invention, when a character string is displayed, its lattice information can be obtained directly from the Unicode extension character library by means of the extension Unicode sequence and output for display. The display position and deformation result of each character no longer need to be obtained by calculation, and displaying each character requires reading the character library system only once, so the display speed is greatly improved.

Description of drawings

Fig. 1 is a flow chart of a method for displaying multi-Unicode language character codes according to an embodiment of the invention;

Fig. 2 is a structural diagram of a device for displaying multi-Unicode language character codes according to an embodiment of the invention;

Fig. 3 is a structural diagram of the extension Unicode code conversion module in a device for displaying multi-Unicode language character codes according to an embodiment of the invention.

Embodiment

Embodiments of the invention provide a method and device for displaying multi-Unicode language character codes. The method comprises: setting a Unicode extension character library in advance, the Unicode extension character library comprising the lattice information of commonly used language characters and the extension Unicode code sequence corresponding to each language character; obtaining the original Unicode code sequence of a character string to be displayed; converting the original Unicode code sequence of each character in the character string to obtain the extension Unicode code sequence corresponding to each character; and using the obtained extension Unicode code sequence to look up, in the extension character library, the lattice information corresponding to the character string, and displaying the character string according to this lattice information.

Specific embodiments of the invention are described in detail below with reference to the accompanying drawings.

As shown in Fig. 1, a method for displaying multi-Unicode language character codes according to an embodiment of the invention comprises the following steps:

Step 101: set a Unicode extension character library, which comprises the lattice information of commonly used language characters and the extension Unicode code sequence corresponding to each language character.

The Unicode extension character library can be set by the following method:

First step: according to the grammatical combination rules of the language, permute and combine all language elements and list all possible combinations;

Second step: remove illegal characters from the permutation and combination results, keeping all valid character combinations;

Third step: assign extension Unicode codes to all valid characters, each valid character corresponding to a unique extension Unicode code sequence;

Fourth step: scan to obtain the lattice information of each valid character, and set the correspondence between this lattice information and the extension Unicode code;

Fifth step: generate a font file that contains the lattice information of the standard Unicode characters and of the extension Unicode characters.
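The five steps above can be sketched as follows. This is only a minimal illustration under simplifying assumptions, not the implementation of the embodiment: a character is taken to be one consonant followed by one vowel, `is_valid` is a hypothetical callback standing in for the grammatical combination rules, and `scan_glyph` is a hypothetical callback that returns the lattice bitmap of a combination. The starting code 0xA000 is the start of the single-consonant range given in Table 1 of this description.

```python
from itertools import product


def build_extension_library(consonants, vowels, is_valid, scan_glyph,
                            first_code=0xA000):
    """Return (mapping, font) for all valid consonant+vowel combinations.

    mapping: original code sequence (tuple) -> extension code
    font:    extension code -> lattice bitmap
    """
    mapping = {}
    font = {}
    next_code = first_code
    # Steps 1-2: enumerate every combination and drop the invalid ones.
    for combo in product(consonants, vowels):
        if not is_valid(combo):
            continue
        # Step 3: give each valid character its own extension code.
        mapping[combo] = next_code
        # Step 4: associate the character's lattice data with that code.
        font[next_code] = scan_glyph(combo)
        next_code += 1
    # Step 5 (writing the font file) is omitted from this sketch.
    return mapping, font
```

Because codes are handed out sequentially, each valid combination receives a unique code, which is the property the third step requires.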

Step 102: obtain the original Unicode code sequence of the character string to be displayed.

Step 103: convert the Unicode code sequence of each character in the original Unicode code sequence of the character string, obtaining the extension Unicode code sequence corresponding to each character.

This step can be implemented as follows:

According to the preset correspondence between each character and its number of Unicode codes, divide the Unicode code sequence of the character string to obtain the Unicode code sequence corresponding to each character in the string;

Using the Unicode code sequence of each character, obtain the extension Unicode code sequence corresponding to each character from a preset database of correspondences between Unicode code sequences and extension Unicode code sequences.
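The division and lookup in step 103 might be sketched as below. The description does not specify a division algorithm, so greedy longest-match against the known sequences is an assumption made here for illustration; the function names and the `max_len` limit of 8 (suggested by the "7 or 8 Unicode codes" remark in the background) are likewise hypothetical. The code values in `MAPPING` are those of the Lao example given later in this description.

```python
# Original sequence -> extension code (values from the Lao example in this description).
MAPPING = {
    (0x0EC0, 0x0E99, 0x0EB7, 0x0EAD): 0xA3AE,
    (0x0EC2, 0x0E9B, 0x0EB0): 0xA02C,
    (0x0EC0, 0x0E9F, 0x0EBB, 0x0EB2): 0xB135,
    (0x0EC0, 0x0EA2, 0x0EB5, 0x0EC9): 0xA298,
}


def segment(codes, mapping, max_len=8):
    """Split a code sequence into per-character sequences (greedy longest match)."""
    words = []
    i = 0
    while i < len(codes):
        for n in range(min(max_len, len(codes) - i), 0, -1):
            candidate = tuple(codes[i:i + n])
            # A single unmatched code is kept as a one-code character.
            if n == 1 or candidate in mapping:
                words.append(candidate)
                i += n
                break
    return words


def map_words(words, mapping):
    """Look up the extension code of each character; None if it has none."""
    return [mapping.get(word) for word in words]
```

For example, `segment` applied to the 15-code Lao sequence yields four per-character sequences of lengths 4, 3, 4, and 4, and `map_words` then returns their four extension codes.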

Step 104: use the obtained extension Unicode code sequence to look up, in the extension character library, the lattice information corresponding to this extension Unicode code sequence, and display the character string according to this lattice information.

In step 103, when the Unicode code sequence of each character in the original Unicode code sequence of the character string is converted, a Unicode space of the same size as the original Unicode code sequence is also created to store the converted extension Unicode code sequence; at display time, the converted Unicode code sequence is used to obtain the display information.

In accordance with the above method, an embodiment of the invention also provides a device for displaying multi-Unicode language character codes, comprising a data storage module 201, an original Unicode code sequence acquisition module 202 (the display acquisition module), an extension Unicode code conversion module 203 (the display conversion module), and a display output module 204:

The data storage module 201 is used to store the lattice information of commonly used language characters and the extension Unicode code sequence corresponding to each language character;

In addition, the data storage module also stores the lattice information and Unicode codes of all standard characters of the language in Unicode Standard 5.1, as found in the standard Unicode character library.

The original Unicode code sequence acquisition module 202 is used to obtain the original Unicode code sequence of the character string to be displayed;

The extension Unicode code conversion module 203 is used to convert the original Unicode code sequence of each character in the character string and obtain the extension Unicode code sequence corresponding to each character;

The display output module 204 is used to look up, in the data storage module, the lattice information corresponding to the character string by means of the obtained extension Unicode code sequence, and to display the character string according to this lattice information.

As shown in Fig. 3, the extension Unicode code conversion module 203 comprises a word segmentation unit 301 and a Unicode mapping unit 302:

The word segmentation unit 301 is used to divide the Unicode code sequence of the character string according to the preset correspondence between each character and its number of Unicode codes, obtaining the Unicode code sequence corresponding to each character in the string;

The Unicode mapping unit 302 is used to obtain, by means of the Unicode code sequence of each character, the extension Unicode code sequence corresponding to each character from the preset database of correspondences between Unicode code sequences and extension Unicode code sequences.

In addition, so as not to change the original Unicode code sequence of the input character string, the extension Unicode code conversion module 203 is also used, when converting the Unicode code sequence of each character in the original Unicode code sequence of the character string, to create a Unicode space of the same size as the original Unicode code sequence and store the converted extension Unicode code sequence in it; at display time, the converted Unicode code sequence is used to obtain the display information.

A specific embodiment of the technical scheme of the method is described in further detail below, taking the Lao language as an example:

In implementing the invention, a one-to-one correspondence between Lao characters (original Unicode code sequences) and their extension Unicode codes is first established. This correspondence is used for the conversion of Unicode sequences. In a practical design environment, different designers may define different correspondences depending on the order in which Lao consonants and vowels are permuted and combined. Here the design takes the consonant as the root, and the extension Unicode correspondence for Lao characters is designed as shown in Table 1:

Lao type	Unicode range	Unicode type
Lao letters	0x0E80~0x0EFF	Standard
Single-consonant characters	0xA000~0xB45F	Extension
Double-consonant characters	0xC000~0xCDBF	Extension
Special characters	0xCE00~0xCEFF	Extension
Deformed tone marks	0xCD02, 0xCD03	Extension

Table 1
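As a rough illustration of how the ranges in Table 1 partition the code space, the lookup below classifies a code value. The function name and labels are hypothetical. Note that the two deformed tone mark codes fall numerically inside the double-consonant range as the table is printed, so this sketch checks them first; that ordering is an assumption, not something the description states.

```python
# Ranges copied from Table 1; labels are illustrative only.
DEFORMED_TONE_MARKS = {0xCD02, 0xCD03}
RANGES = [
    (0x0E80, 0x0EFF, "standard: Lao letter"),
    (0xA000, 0xB45F, "extension: single-consonant character"),
    (0xC000, 0xCDBF, "extension: double-consonant character"),
    (0xCE00, 0xCEFF, "extension: special character"),
]


def classify(code):
    """Return the Table 1 category of a code, or 'unassigned'."""
    # Checked first because these codes also lie inside the double-consonant range.
    if code in DEFORMED_TONE_MARKS:
        return "extension: deformed tone mark"
    for low, high, label in RANGES:
        if low <= code <= high:
            return label
    return "unassigned"
```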

Because the character library contains a large number of Lao characters, their specific correspondences are not listed one by one.

The display acquisition module obtains the Unicode sequence of the character string to be displayed. The Unicode sequence can be obtained from string resources or from dynamic text. The Unicode sequence obtained here is the original Unicode sequence of the character string.

The display conversion module converts the original Unicode sequence of the character string into a Unicode sequence used for display; the converted sequence is the extension Unicode code sequence, which carries the display information. If the original Unicode sequence of a character in the string contains N codes, the converted sequence still contains N codes: the first is the extension Unicode code of the character (identical to the extension Unicode code of this character in the character library), and the other N-1 codes are padded with a filler Unicode code, chosen here as 0xDFFF. The extension code is used to display the character; the filler code is not displayed and only ensures that the number of Unicode codes of a character does not change.

Take a Lao character string as an example. The string contains four characters, and its original Unicode sequence is:

0x0EC0, 0x0E99, 0x0EB7, 0x0EAD, 0x0EC2, 0x0E9B, 0x0EB0, 0x0EC0, 0x0E9F, 0x0EBB, 0x0EB2, 0x0EC0, 0x0EA2, 0x0EB5, 0x0EC9;

The extension Unicode sequence after conversion is:

0xA3AE, 0xDFFF, 0xDFFF, 0xDFFF, 0xA02C, 0xDFFF, 0xDFFF, 0xB135, 0xDFFF, 0xDFFF, 0xDFFF, 0xA298, 0xDFFF, 0xDFFF, 0xDFFF;

Here 0xA3AE, 0xA02C, 0xB135, and 0xA298 are the extension Unicode codes, in the character library system, of the four Lao characters of the string.
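The padding rule can be checked against this example with a short sketch. It assumes the string has already been divided into per-character sequences; the function name `convert` and the pass-through of unmapped codes are illustrative choices, while the filler value 0xDFFF and all code values are those given above.

```python
FILLER = 0xDFFF  # filler code chosen in this description


def convert(words, mapping):
    """Build a new sequence of the same length as the original.

    Each mapped character becomes its extension code followed by N-1 fillers;
    the input sequences are left untouched.
    """
    out = []
    for word in words:
        ext = mapping.get(word)
        if ext is None:
            out.extend(word)  # no extension code: keep the original codes
        else:
            out.append(ext)
            out.extend([FILLER] * (len(word) - 1))
    return out


words = [
    (0x0EC0, 0x0E99, 0x0EB7, 0x0EAD),
    (0x0EC2, 0x0E9B, 0x0EB0),
    (0x0EC0, 0x0E9F, 0x0EBB, 0x0EB2),
    (0x0EC0, 0x0EA2, 0x0EB5, 0x0EC9),
]
mapping = dict(zip(words, [0xA3AE, 0xA02C, 0xB135, 0xA298]))
converted = convert(words, mapping)
```

Writing into a fresh list rather than overwriting the input corresponds to the equal-size Unicode space described in step 103, so the original sequence remains available unchanged.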

The display output module completes the display output of the character string. When displaying the above string, it takes the lattice information of 0xA3AE, 0xA02C, 0xB135, and 0xA298 from the character library system in turn and displays it; 0xDFFF is not displayed.

In addition, a string resource library 4 also cooperates in completing the display function. This library contains the string resources to be displayed, that is, the Unicode code values of the strings to be displayed: this data indicates what needs to be displayed, while the character library system decides how it is displayed. The library contains the character strings that may be used in the system, and the content may also be dynamic display content. The character library system 5 contains the lattice information and Unicode codes of all language characters; at display time, the display module obtains the lattice information of each character from the character library system 5 and outputs it for display.

When the Unicode sequence of the input character string is converted, the word segmentation unit in the display conversion module obtains the Unicode sequence of each Lao character from the Unicode sequence of the string, and the Unicode mapping unit maps the original Unicode sequence of each character to the extension Unicode code of that character in the character library system 5.

For example, the original Unicode sequence of a Lao character is {0x0EC0, 0x0EA2, 0x0EB5, 0x0EC9}; after mapping, its Unicode code is 0xA298, which is identical to the Unicode code of this character in the character library system.

The display module processes the converted Unicode sequence code by code. If a code is the filler Unicode code, it is skipped and not displayed; if a code is an extension Unicode code or a standard Unicode code, the lattice information of the character is obtained from the character library system and displayed.
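The display rule just described reduces to a single pass over the converted sequence. In the sketch below, `font` is a hypothetical dictionary standing in for the character library system (code to lattice bitmap), and the function returns the bitmaps in display order instead of drawing them, since the description does not specify a drawing interface.

```python
FILLER = 0xDFFF  # filler code chosen in this description


def glyphs_to_display(converted, font):
    """Return the lattice bitmaps to draw, in order, skipping filler codes."""
    glyphs = []
    for code in converted:
        if code == FILLER:
            continue  # filler codes only preserve length; nothing is drawn
        # Extension and standard codes are looked up the same way,
        # with one read of the character library per displayed character.
        glyphs.append(font[code])
    return glyphs
```

Standard codes, such as those of English text, pass through the same lookup, which is consistent with the mixed-language display discussed in this description.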

With the character library system and display method provided by the embodiments of the invention, the character library system contains the lattice information of each character, so at display time the lattice information can be obtained directly from the character library and output for display. The display position and deformation result of each character no longer need to be obtained by calculation, and displaying each character requires reading the character library system only once, so the display speed is greatly improved.

In addition, because the extension Unicode codes allocated by the character library system are completely independent of the standard Unicode codes of English, the language can be displayed mixed with English without affecting the display result. If mixed display with other languages is required, it is only necessary to redefine the range of the extension Unicode codes. Compared with existing display techniques, this display method only needs to add a display conversion layer to an existing display method in order to generate the Unicode sequence carrying the display information, so it is easy to apply to existing display systems. Moreover, the display conversion layer does not change the Unicode sequence of the original display content, so full compatibility is maintained when communicating with other embedded systems.

The method of the invention is not limited to the embodiments described above; other embodiments derived by those skilled in the art from the technical scheme of the invention likewise fall within the scope of technical innovation of the invention. Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them as well.

Claims

1. A method for displaying multi-Unicode language character codes, characterized in that a Unicode extension character library is set in advance, the Unicode extension character library comprising the lattice information of commonly used language characters and the extension Unicode code sequence corresponding to each language character, the method comprising:

2. The method of claim 1, characterized by comprising:

When the Unicode code sequence of each character in the original Unicode code sequence of the character string to be displayed is converted, creating a Unicode space of the same size as the original Unicode code sequence of the character string to be displayed, storing the converted extension Unicode code sequence, and using the converted Unicode code sequence at display time to obtain the display information.

3. The method of claim 1, characterized in that setting the Unicode extension character library comprises:

Permuting and combining all language elements according to the grammatical combination rules of the language, and keeping the valid character combinations;

Assigning extension Unicode codes to the retained valid characters, each valid character corresponding to a unique extension Unicode code sequence;

Scanning to obtain the lattice information of each valid character, and setting the correspondence between this lattice information and the extension Unicode code.

4. The method of claim 1, characterized in that converting the original Unicode code sequence of each character in the character string to obtain the extension Unicode code sequence corresponding to each character comprises:

Dividing the original Unicode code sequence of the character string to be displayed according to the preset correspondence between each character and its number of original Unicode codes, to obtain the original Unicode code sequence corresponding to each character in the string;

Using the original Unicode code sequence of each character to obtain, from a preset database of correspondences between original Unicode code sequences and extension Unicode code sequences, the extension Unicode code sequence corresponding to each character in the character string to be displayed.

5. A device for displaying multi-Unicode language character codes, characterized by comprising:

6. The device of claim 5, characterized in that the extension Unicode code conversion module is also used, when converting the Unicode code sequence of each character in the original Unicode code sequence of the character string to be displayed, to create a Unicode space of the same size as the original Unicode code sequence of the character string to be displayed and to store the converted extension Unicode code sequence;

The display output module uses the converted Unicode code sequence at display time to obtain the display information.

7. The device of claim 5, characterized in that the extension Unicode code conversion module comprises:

A word segmentation unit, used to divide the original Unicode code sequence of the character string to be displayed according to the preset correspondence between each character and its number of original Unicode codes, to obtain the original Unicode code sequence corresponding to each character in the string;

A Unicode mapping unit, used to obtain, by means of the original Unicode code sequence of each character, the extension Unicode code sequence corresponding to each character in the character string to be displayed from a preset database of correspondences between original Unicode code sequences and extension Unicode code sequences.