CN103885607A - Method for judging and storing concatenation of Uyghur based on embedded system - Google Patents
Method for judging and storing concatenation of Uyghur based on embedded system
- Publication number
- CN103885607A
- Authority
- CN
- China
- Prior art keywords
- uyghur
- word
- uighur
- character
- storing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Controls And Circuits For Display Device (AREA)
Abstract
The invention discloses a method for judging and storing Uyghur concatenation (joined writing) based on an embedded system. According to the language characteristics of Uyghur and based on the Uyghur Unicode extended codes, the method forms character sets for the word-initial, word-medial and word-final letter forms; judges, from the variation of a character at the head, middle or tail of a word, whether the character is joined to the preceding one, to the following one, to both, or stands alone; and transforms it accordingly. Taking a standard 8 × 16 font library as the benchmark, the Uyghur characters actually displayed in the intelligent terminal interface are extracted to generate a new, reduced Uyghur font library. The method addresses the low input efficiency, slow storage speed and large storage requirements of current Uyghur processing, and provides a spelling and storage method suited to the characteristics of Uyghur.
Description
Technical field
The present invention relates to the field of Uyghur information technology, and in particular to a method for judging and storing Uyghur concatenation (joined writing) based on an embedded system.
Background technology
In recent years, with the development of informatization and automation among ethnic minorities, intelligent devices based on embedded systems have come into wider use among ethnic minorities in Xinjiang. However, educational levels differ greatly between regions and between ethnic groups, which makes intelligent terminals difficult for ethnic-minority users to operate.
Summary of the invention
The object of the present invention is to provide a method for judging and storing Uyghur concatenation based on an embedded system. The method addresses the low input efficiency, slow storage speed and large storage requirements of current Uyghur processing, and proposes a spelling and storage method suited to the features of the Uyghur language.
The object is achieved as follows. According to the features of the Uyghur language and based on the Uyghur Unicode extended codes, character sets are formed for the word-initial, word-medial and word-final letter forms. Because a character's form differs with its position at the beginning, middle or end of a word, the method judges whether the character is joined to the preceding character, joined to the following character, joined on both sides (medial), or independent, and transforms it accordingly. Taking the standard 8 × 16 font library as the benchmark, the Uyghur characters actually used in the intelligent terminal's interface display are extracted to generate a new, reduced Uyghur font library.
Uyghur transformation rules: the Uyghur script belongs to the Arabic script family. Arabic script spread widely under the influence of Islam; Persian, as well as the Uyghur, Kazakh and Kyrgyz languages of Xinjiang, China, use the Arabic alphabet. Uyghur letters have no upper and lower case, but printed and handwritten forms differ. Apart from five letters, the other 27 letters can be joined to the following letter, and the glyph of a letter changes with its position at the beginning, middle or end of a word.
Unlike Chinese, Uyghur is written horizontally from right to left, so Uyghur books and periodicals open from the right. At a line break, the program must check whether a whole word fits and wrap the whole word; a word must not be split into two parts. Numerals within Uyghur text are still displayed from left to right.
Following the classification above, each character takes one of four forms: first, last, middle or alone. The program judges whether the character is joined to the preceding character (the preceding character is in Set 1), joined to the following character (the following character is in Set 2), medial (joined on both sides: the preceding character is in Set 1 and the following character is in Set 2), or independent, and transforms it accordingly. Based on this analysis, an array of transformed forms is given, one column for each of the four cases. A character that is not in the array is left unchanged. For example, the program of the invention proceeds as follows:
const WORD Arbic_Position[][4] =    // columns: first, last, middle, alone
{
    { 0xfe90, 0xfe91, 0xfe92, 0xfe8f },    // 0x628
    { 0xfe94, 0xfe93, 0xfe93, 0xfe93 },    // 0x629
    { 0xfe96, 0xfe97, 0xfe98, 0xfe95 },    // 0x62A
    { 0xfe9a, 0xfe9b, 0xfe9c, 0xfe99 },    // 0x62B
    { 0xfe9e, 0xfe9f, 0xfea0, 0xfe9d },    // 0x62C
    { 0xfea2, 0xfea3, 0xfea4, 0xfea1 },    // 0x62D
    { 0xfea6, 0xfea7, 0xfea8, 0xfea5 },    // 0x62E
    { 0xfeaa, 0xfea9, 0xfeaa, 0xfea9 },    // 0x62F
    ……
};
To judge whether a character is joined to the preceding one, the program examines the character before it and checks whether that character is in Set 1. If it is, the character is joined to the preceding one. Set 1 is as follows:
static U16 theSet1[23]={
0x62c, 0x62d, 0x62e, 0x647, 0x639, 0x63a, 0x641, 0x642,
0x62b, 0x635, 0x636, 0x637, 0x643, 0x645, 0x646, 0x62a,
0x644, 0x628, 0x64a, 0x633, 0x634, 0x638, 0x626};
To judge whether a character is joined to the following one, the program examines the character after it and checks whether that character is in Set 2. If it is, the character is joined to the following one. Set 2 is as follows:
static U16 theSet2[35]={
0x62c, 0x62d, 0x62e, 0x647, 0x639, 0x63a, 0x641, 0x642,
0x62b, 0x635, 0x636, 0x637, 0x643, 0x645, 0x646, 0x62a,
0x644, 0x628, 0x64a, 0x633, 0x634, 0x638, 0x626,
0x627, 0x623, 0x625, 0x622, 0x62f, 0x630, 0x631, 0x632,
0x648, 0x624, 0x629, 0x649};
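The two membership tests above can be combined into a single form-selection step. The following is a minimal sketch, not the patent's own program: the names in_set, select_form and the FORM_* constants are hypothetical, U16 is assumed to be a 16-bit unsigned type, and a neighbour value of 0 is assumed to mean "no neighbouring character". The returned value is the column index into a table laid out like Arbic_Position.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t U16;   /* assumption: U16 is a 16-bit unsigned integer */

/* Hypothetical names for the four columns: first, last, middle, alone. */
enum { FORM_FIRST = 0, FORM_LAST = 1, FORM_MIDDLE = 2, FORM_ALONE = 3 };

/* Set 1 and Set 2, copied from the text above. */
static const U16 theSet1[23] = {
    0x62c, 0x62d, 0x62e, 0x647, 0x639, 0x63a, 0x641, 0x642,
    0x62b, 0x635, 0x636, 0x637, 0x643, 0x645, 0x646, 0x62a,
    0x644, 0x628, 0x64a, 0x633, 0x634, 0x638, 0x626};

static const U16 theSet2[35] = {
    0x62c, 0x62d, 0x62e, 0x647, 0x639, 0x63a, 0x641, 0x642,
    0x62b, 0x635, 0x636, 0x637, 0x643, 0x645, 0x646, 0x62a,
    0x644, 0x628, 0x64a, 0x633, 0x634, 0x638, 0x626,
    0x627, 0x623, 0x625, 0x622, 0x62f, 0x630, 0x631, 0x632,
    0x648, 0x624, 0x629, 0x649};

/* Linear search: returns 1 if ch occurs in set[0..n-1], otherwise 0. */
static int in_set(const U16 *set, size_t n, U16 ch)
{
    for (size_t i = 0; i < n; i++) {
        if (set[i] == ch) {
            return 1;
        }
    }
    return 0;
}

/* Choose the table column for a character from its two neighbours.
 * prev and next are the neighbouring code points; 0 means no neighbour. */
int select_form(U16 prev, U16 next)
{
    int joined_before = in_set(theSet1, sizeof theSet1 / sizeof theSet1[0], prev);
    int joined_after  = in_set(theSet2, sizeof theSet2 / sizeof theSet2[0], next);

    if (joined_before && joined_after) {
        return FORM_MIDDLE;
    }
    if (joined_before) {
        return FORM_FIRST;
    }
    if (joined_after) {
        return FORM_LAST;
    }
    return FORM_ALONE;
}
```

In this sketch the selection depends only on the neighbours. Letters that cannot join to the following letter are handled by the table itself: the 0x62F row, for instance, repeats its two available glyphs across the four columns.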
A ligature is formed when a character string begins with 0x644 and the following character is 0x622, 0x623, 0x625 or 0x627. Depending on the context, column 0 or column 1 of the array below is taken: if the character preceding 0x644 is in Set 1 (the same Set 1 as above), column 1 is taken; otherwise column 0 is taken.
The array is as follows:
static U16 arabic_specs[][2] = {
    {0xFEF5, 0xFEF6},
    {0xFEF7, 0xFEF8},
    {0xFEF9, 0xFEFA},
    {0xFEFB, 0xFEFC},
};
For example, take the sequence 0x064A, 0x0644, 0x0622.
The character following 0x064A is 0x0644, which is in Set 2, so by coding rule 1 the character 0x064A is joined to the following character (last) and is converted to 0xFEF3. Because 0x064A is in Set 1, the two codes 0x0644 0x0622 are replaced by 0xFEF6.
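The ligature rule and the example above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the patent's program: the function name lam_alef and the convention of returning 0 when no ligature applies are hypothetical, U16 is assumed to be a 16-bit unsigned type, and the rows of arabic_specs are assumed to correspond, in order, to the second characters 0x622, 0x623, 0x625 and 0x627 listed in the text.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t U16;   /* assumption: U16 is a 16-bit unsigned integer */

/* Set 1, copied from the text above. */
static const U16 theSet1[23] = {
    0x62c, 0x62d, 0x62e, 0x647, 0x639, 0x63a, 0x641, 0x642,
    0x62b, 0x635, 0x636, 0x637, 0x643, 0x645, 0x646, 0x62a,
    0x644, 0x628, 0x64a, 0x633, 0x634, 0x638, 0x626};

/* Ligature codes: column 0 when the character before 0x644 is not in
 * Set 1, column 1 when it is. */
static const U16 arabic_specs[4][2] = {
    {0xFEF5, 0xFEF6},   /* 0x644 followed by 0x622 */
    {0xFEF7, 0xFEF8},   /* 0x644 followed by 0x623 */
    {0xFEF9, 0xFEFA},   /* 0x644 followed by 0x625 */
    {0xFEFB, 0xFEFC},   /* 0x644 followed by 0x627 */
};

/* Second characters that trigger a ligature, in row order (assumed). */
static const U16 ligature_seconds[4] = {0x622, 0x623, 0x625, 0x627};

/* Returns the ligature code that replaces the pair (cur, next), or 0 if
 * the pair does not form a ligature. prev is the character before cur;
 * 0 means there is none. */
U16 lam_alef(U16 prev, U16 cur, U16 next)
{
    if (cur != 0x644) {
        return 0;
    }
    for (size_t row = 0; row < 4; row++) {
        if (ligature_seconds[row] == next) {
            int column = 0;
            for (size_t i = 0; i < sizeof theSet1 / sizeof theSet1[0]; i++) {
                if (theSet1[i] == prev) {
                    column = 1;   /* previous character joins forward */
                    break;
                }
            }
            return arabic_specs[row][column];
        }
    }
    return 0;
}
```

For the sequence in the example, lam_alef(0x064A, 0x0644, 0x0622) selects row 0, column 1, which is 0xFEF6.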
To save storage space, a font-extraction program was designed. Taking the standard 8 × 16 font library as the benchmark, it extracts the Uyghur characters actually used in the intelligent terminal's interface display and generates a new, reduced Uyghur font library. For example, to display the Uyghur word for "voltage", whose extended codes are FE97, FEEE, FED9, FE91, FBE7, FEB4, FEE4, FEF0, the dot-matrix glyph of each character is looked up in the font library generated from the Uyghur extended codes. The glyph address is computed as Uaddr = zkfile + uiger_length × 2 + x × 16, where zkfile is the start address of the Uyghur font file in memory, uiger_length is the total number of Uyghur characters in the font library, and x is the character's position in the font library. The invention addresses the low input efficiency, slow storage speed and large storage requirements of current Uyghur processing, and proposes a spelling and storage method suited to the features of the Uyghur language.
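The address formula can be illustrated with a short sketch. The file layout assumed here is an inference from the formula, not something the text states: the font file is taken to begin with an index of uiger_length two-byte extended codes, followed by one 16-byte glyph per character. The function names glyph_offset, glyph_address and find_index are hypothetical, as is the big-endian byte order of the index entries.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    INDEX_ENTRY_BYTES = 2,   /* the "uiger_length x 2" term of the formula */
    GLYPH_BYTES       = 16   /* the "x x 16" term: one 8 x 16 glyph */
};

/* Byte offset of glyph number x from the start of the font file:
 * uiger_length * 2 + x * 16. */
size_t glyph_offset(size_t uiger_length, size_t x)
{
    return uiger_length * INDEX_ENTRY_BYTES + x * GLYPH_BYTES;
}

/* Uaddr = zkfile + uiger_length * 2 + x * 16 */
const uint8_t *glyph_address(const uint8_t *zkfile, size_t uiger_length, size_t x)
{
    return zkfile + glyph_offset(uiger_length, x);
}

/* Find the position x of an extended code in the index at the start of
 * the font file. Assumes big-endian two-byte entries (an assumption).
 * Returns -1 if the code is not in the reduced font library. */
long find_index(const uint8_t *zkfile, size_t uiger_length, uint16_t code)
{
    for (size_t i = 0; i < uiger_length; i++) {
        uint16_t entry = (uint16_t)((zkfile[2 * i] << 8) | zkfile[2 * i + 1]);
        if (entry == code) {
            return (long)i;
        }
    }
    return -1;
}
```

With a library of 100 characters, for instance, the glyph at position 3 starts 100 × 2 + 3 × 16 = 248 bytes after zkfile.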
Brief description of the drawings
Below in conjunction with accompanying drawing, the invention will be further described.
Fig. 1 is the program flowchart.
Embodiment
A method for judging and storing Uyghur concatenation based on an embedded system is shown in Fig. 1. According to the features of the Uyghur language and based on the Uyghur Unicode extended codes, character sets are formed for the word-initial, word-medial and word-final letter forms. Because a character's form differs with its position at the beginning, middle or end of a word, the method judges whether the character is joined to the preceding character, joined to the following character, joined on both sides (medial), or independent, and transforms it accordingly. Taking the standard 8 × 16 font library as the benchmark, the Uyghur characters actually used in the intelligent terminal's interface display are extracted to generate a new, reduced Uyghur font library.
The hardware of the invention mainly comprises an 8051 low-power single-chip microcontroller as the core, a memory, and a 1602 LCD. The microcontroller is an Atmel AT89S51, which issues control commands, processes font data, and exchanges data with the LCD. It is a low-power, high-performance CMOS 8-bit microcontroller containing 4 KB of ISP (in-system programmable) Flash program memory that can be rewritten 1000 times. The device is manufactured with Atmel's high-density non-volatile memory technology and is compatible with the standard MCS-51 instruction set and the 80C51 pinout. The chip integrates a general-purpose 8-bit CPU and ISP Flash memory, so the AT89S51 offers a cost-effective solution for many embedded control applications. The AT89S51 has the following features: 40 pins, 4 KB of on-chip Flash program memory, 128 bytes of random-access data memory (RAM), 32 external bidirectional input/output (I/O) lines, five interrupt priority levels with two-level interrupt nesting, two 16-bit programmable timer/counters, 2 full-duplex serial communication ports, a watchdog timer (WDT) circuit, and an on-chip clock oscillator.
The character LCD module SMC1602 consists of an 8 × 16 dot-matrix LCD screen, the control chip HD44780 and its auxiliary circuitry. It can display letters, digits, symbols and so on. Its display capacity is 16 × 2 characters, the chip operating voltage is 4.5 to 5.5 V, the operating current is 2 mA (at 5 V), the optimum module operating voltage is 5 V, and the character size is 4.95 × 7.95 mm (W × H). The 1602 LCD module has 16 interface signal lines, including 8 tri-state data lines, the enable line E, the read/write select line R/W, and the command/data select line RS. The 8 tri-state data lines are connected to port P0 of the microcontroller. VL is the reference supply of the LCD, and an external adjustable resistor can be used to adjust the contrast of the screen. R/W is the read/write select signal: R/W=1 is the read state and R/W=0 is the write state. RS is the register select signal: RS=1 selects the data register and RS=0 selects the instruction register. E is the enable signal: a read is valid while E is high, and a write takes effect on the falling edge of a high pulse. The master CPU uses these three control lines to access the module's internal controller, the HD44780.
The interface program between the microcontroller and the LCD is written in C under the Keil C51 software. This software supports the various functions of the 89S51 microcontroller and is convenient for coding and debugging.
Each Uyghur character occupies 16 bytes; the bytes on the left are numbered 1, 3, 5, ... and those on the right 2, 4, 6, .... From the row and column at which display starts and the number of columns per row, the corresponding display RAM address on the LCD can be determined. The cursor is set and the first byte of the Uyghur character to be displayed is sent; the cursor position is incremented by 1 and the second byte is sent; the display moves to the next line, aligned by column, and the third byte is sent; and so on until all 16 bytes have been displayed, giving a complete character on the LCD.
Claims (1)
1. A method for judging and storing Uyghur concatenation based on an embedded system, wherein: according to the features of the Uyghur language and based on the Uyghur Unicode extended codes, character sets are formed for the word-initial, word-medial and word-final letter forms; because a character's form differs with its position at the beginning, middle or end of a word, it is judged whether the character is joined to the preceding character, joined to the following character, joined on both sides (medial), or independent, and the character is transformed accordingly; and, taking the standard 8 × 16 font library as the benchmark, the Uyghur characters actually used in the intelligent terminal's interface display are extracted to generate a new, reduced Uyghur font library.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210553890.XA CN103885607A (en) | 2012-12-19 | 2012-12-19 | Method for judging and storing concatenation of Uyghur based on embedded system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210553890.XA CN103885607A (en) | 2012-12-19 | 2012-12-19 | Method for judging and storing concatenation of Uyghur based on embedded system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103885607A true CN103885607A (en) | 2014-06-25 |
Family
ID=50954540
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210553890.XA Pending CN103885607A (en) | 2012-12-19 | 2012-12-19 | Method for judging and storing concatenation of Uyghur based on embedded system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103885607A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101866417A (en) * | 2010-06-18 | 2010-10-20 | 西安电子科技大学 | A Recognition Method for Handwritten Uighur Characters |
CN102446275A (en) * | 2010-09-30 | 2012-05-09 | 汉王科技股份有限公司 | Identification method and device for Arabic character |
CN102622610A (en) * | 2012-03-05 | 2012-08-01 | 西安电子科技大学 | Handwritten Uyghur character recognition method based on classifier integration |
Non-Patent Citations (3)
Title |
---|
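Yuan Baoshe (袁保社) et al.: "A handwritten Uyghur letter recognition algorithm", Computer Engineering (《计算机工程》) * |
Yuan Baoshe (袁保社) et al.: "Design and implementation of a Uyghur OpenType font library", Computer Knowledge and Technology (《电脑知识与技术》) * |
Alimujiang Aisha (阿力木江艾沙) et al.: "Research on SVM-based Uyghur text classification", Computer Engineering and Science (《计算机工程与科学》) * |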
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20140625 |