CN103984420B - An intelligent Tibetan input method based on pinyin (phonetic spelling)
- Publication number
- CN103984420B (application number CN201410142863.2A)
- Authority
- CN
- China
- Prior art keywords
- tibetan language
- input method
- phonetic
- input
- syllable
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Document Processing Apparatus (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses an intelligent Tibetan input method based on pinyin. The method comprises: 1) assigning a key code to each Tibetan consonant and each Tibetan vowel; 2) assigning each syllable a corresponding pinyin code according to the spelling order of the Tibetan syllable, and storing it in an input method character library; 3) building a relation tree between pinyin codes and key codes; 4) building an input method engine on the input method character library, where the engine traverses the relation tree according to the entered key codes to obtain the corresponding pinyin code, then queries the character library with that pinyin code and returns the corresponding Tibetan text. Compared with the prior art, the invention produces few duplicate codes, is easy to implement, makes it easy to build and extend the lexicon, and matches the natural thinking of Tibetan writing, making Tibetan input more convenient, faster, and more flexible.
Description
Technical field
The present invention relates to input methods, and in particular to an intelligent Tibetan input method based on pinyin.
Background art
Since its creation, the Tibetan script has served as the main carrier of the nation's cultural heritage, as the main tool for spreading scientific and technical knowledge in Tibetan areas today, and as a principal marker of a nationality in the information society. Its unique cultural value and the role it plays across the Tibetan areas are immeasurable.
Since Tibetan entered the information age in recent decades, Tibetan information processing on computers has developed considerably in many respects and produced many results, from Tibetan typing to Tibetan typesetting, sending and receiving Tibetan e-mail, building Tibetan websites, developing Tibetan application software, and producing Tibetan courseware.
Tibetan is an alphabetic script whose structure is written both horizontally and vertically. A phrase or sentence is made up of individual syllables (also called characters). Each syllable corresponds to one sound and is in turn composed of several Tibetan letters, so at first glance the script looks much like English. Within a single syllable, however, the letters are built around one root letter (base letter), to which a superscript letter, subscript letter, prefix letter, and suffix letter are stacked or attached, so the script also has the character of a two-dimensional writing system. In other words, a Tibetan syllable takes one letter as its core, called the "root letter"; the remaining letters are attached before and after it or stacked above and below it to form a complete syllable structure, and each letter is named after its position relative to the root letter, as shown in Fig. 1.
Any of the 30 Tibetan consonants can serve as the root letter. However, the letters that may serve as prefix, suffix, superscript, or subscript letters are all prescribed by the grammar and are limited in number.
Tibetan pronunciation is likewise centered on the root consonant. A syllable has only one vowel (the vowel a may be omitted), so one syllable corresponds to one sound. When a Tibetan syllable is spelled out, the order starts from the leftmost consonant: 1) prefix letter, 2) superscript letter, 3) root letter, 4) subscript letter, 5) vowel, 6) suffix letter, 7) second suffix letter.
Tibetan is written syllable by syllable, horizontally from left to right, with syllables separated by a dot (the tsheg). The order in which a syllable is written matches the order in which it is spelled, and most input methods enter Tibetan codes in this order. The actual input schemes, however, are rather complex and have serious drawbacks. Because some letters must change shape when written as superscript or subscript letters, the international Unicode standard defines 211 Tibetan characters in total, including ordinary characters, stacking characters, numerals, and astronomical and calendrical symbols. OpenType font features are then used to combine the ordinary characters with the stacking characters. This capability belongs to the font rather than to the input method: the input method produces character codes from the user's input, and the font uses its OpenType features to render the Tibetan syllable from those codes.
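To make the division of labor between input method and font concrete, the following minimal sketch lists the code points an input method would emit for one stacked syllable. The syllable chosen here (superscript SA, root GA, subscript RA) is an illustrative assumption and does not come from the patent; the character names are read from Python's standard `unicodedata` module.

```python
import unicodedata

# Illustrative stacked syllable (an assumption, not an example from the patent):
# superscript SA + root GA + subscript RA, stored as three code points.
STACKED = "\u0f66\u0f92\u0fb2"

def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode character name) for every character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code_point, name in describe(STACKED):
    print(code_point, name)
```

The input method only produces this sequence of codes; drawing the three letters as one vertical stack is left to the font.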
At present, Tibetan input speed still lags noticeably behind that of some other languages such as Chinese, especially on mobile terminals, mainly because efficient intelligent input methods are lacking. Only a few existing input methods offer phrase input. Foreign Tibetan input methods such as Microsoft's Himalaya input method support neither phrase input nor intelligent input. The domestic Banzhida intelligent input method, which does offer phrase input, uses a phrase encoding scheme of root letter plus suffix letter, but it is unnatural, hard to memorize, hard to use, and produces many duplicate codes, and it lets users enter arbitrary character combinations that violate Tibetan grammar. There is therefore an urgent need for an intelligent input scheme that is easy to use, natural, general-purpose, and low in duplicate codes, in order to raise Tibetan input speed.
Summary of the invention
To overcome the technical problems in the prior art, the object of the present invention is to provide a Tibetan input method based on pinyin search. Drawing on the script structure, pronunciation features, and spelling method of Tibetan, the invention uses a set of letters as pinyin characters to identify a specific syllable, without representing how the letters of the syllable are stacked, and thereby realizes pinyin input; the pinyin input method proposed here is built on this idea. Specifically, it exploits the spelling rules of Tibetan: the pinyin of each Tibetan syllable and the correspondence between them are stored in a character library, the input method forms a pinyin code, and the input method engine then returns the set of target words. The invention therefore produces few duplicate codes, is easy to implement, makes it easy to build and extend the lexicon, matches the natural thinking of Tibetan writing, and is easy to understand and use.
The object of the invention is achieved by the following technical solution:
An intelligent Tibetan input method based on pinyin, comprising the steps of:
1) assigning a key code to each Tibetan consonant and each Tibetan vowel;
2) assigning each syllable a corresponding pinyin code according to the spelling order of the Tibetan syllable, and storing it in an input method character library;
3) building a relation tree between pinyin codes and key codes;
4) building an input method engine on the input method character library; the engine traverses the relation tree according to the entered key codes to obtain the corresponding pinyin code, then queries the input method character library with that pinyin code and returns the corresponding Tibetan text.
Further, a pinyin is assigned to each syllable as follows: for a single-character Tibetan syllable, the pinyin is the syllable itself; for a multi-character Tibetan syllable without vertical stacking, the pinyin is the syllable itself; for a multi-character Tibetan syllable with vertical stacking, the pinyin is set according to the spelling order of the syllable.
Further, the same pinyin code corresponds to one or more syllables.
Further, the input method engine looks up the matching pinyin according to the pinyin code, displays in the candidate area of the input method all Tibetan text whose pinyin matches it or begins with it, and sorts the candidates by word frequency.
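The candidate behavior described above can be sketched as a prefix filter followed by a frequency sort. This is only an illustration: the entries, their pinyin codes, and the frequency numbers below are made-up placeholders, and the names `ENTRIES` and `candidates` are hypothetical.

```python
# Hypothetical entries: (syllable, pinyin code, usage frequency).
# The frequencies are invented for illustration only.
ENTRIES = [
    ("\u0f66\u0f92\u0fb2", "\u0f66\u0f42\u0f62", 12),
    ("\u0f66\u0f42", "\u0f66\u0f42", 40),
    ("\u0f66", "\u0f66", 7),
    ("\u0f40", "\u0f40", 99),
]

def candidates(typed_pinyin, entries=ENTRIES):
    """Return syllables whose pinyin equals or begins with the typed pinyin,
    most frequent first."""
    matches = [e for e in entries if e[1].startswith(typed_pinyin)]
    matches.sort(key=lambda e: e[2], reverse=True)
    return [syllable for syllable, _, _ in matches]

print(candidates("\u0f66"))
```

Because `str.startswith` is also true for an exact match, one test covers both the "matches" and the "begins with" cases.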
Further, on mobile devices a full-keyboard mode or a nine-key (3×3 grid) mode is used as the input interface for the Tibetan consonant and vowel letters.
Further, on a PC the key layout of the Himalaya input method is used as the input interface for the Tibetan consonants and vowels.
The flow of the pinyin-based intelligent Tibetan input method of the invention is shown in Fig. 2. The specific steps are as follows:
First, the 30 Tibetan consonants and 4 vowels are defined as the pinyin characters, and they are combined in Tibetan spelling order to form the pinyin of each syllable.
Table 1 lists the Tibetan consonants:
Table 1: Tibetan consonants
Table 2 lists the Tibetan vowels:
Table 2: Tibetan vowels
The pinyin of each syllable is defined according to Tibetan spelling order, as follows:
1. The pinyin of a single-character syllable is the syllable itself, as in Table 3:
Table 3: Single-character syllables
2. The pinyin of a multi-character syllable with no vertical stacking (other than a stacked vowel sign) is the syllable itself, as in Table 4:
Table 4: Multi-character syllables without vertical stacking
3. The pinyin of a multi-character syllable with stacking is determined by the spelling order of the syllable, as in Table 5:
Table 5: Multi-character syllables with vertical stacking
With these three rules the pinyin of essentially every syllable can be determined, and it also turns out that one pinyin may correspond to several syllables.
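One possible reading of the three rules is sketched below. It assumes that "spelling order" for a stacked syllable means listing its letters in stored order with every stacked (subjoined) letter replaced by its ordinary form. That interpretation, the function name `to_pinyin`, and the sample syllables are assumptions made for illustration, not details confirmed by the patent. The sketch relies on the Unicode layout in which a standard subjoined consonant sits 0x50 above its ordinary letter.

```python
# Assumed range of standard subjoined consonants in the Unicode Tibetan block.
SUBJOINED_FIRST = 0x0F90  # TIBETAN SUBJOINED LETTER KA
SUBJOINED_LAST = 0x0FB8   # TIBETAN SUBJOINED LETTER A
BASE_OFFSET = 0x50        # subjoined code point minus its ordinary letter

def to_pinyin(syllable: str) -> str:
    """Flatten a syllable into a sequence of ordinary letters and vowel signs."""
    letters = []
    for ch in syllable:
        cp = ord(ch)
        if SUBJOINED_FIRST <= cp <= SUBJOINED_LAST:
            cp -= BASE_OFFSET  # replace the stacked form with the plain letter
        letters.append(chr(cp))
    return "".join(letters)

# Rules 1 and 2: unstacked syllables are their own pinyin.
print(to_pinyin("\u0f40") == "\u0f40")
# Rule 3: a stacked syllable is flattened in spelling order.
print(to_pinyin("\u0f42\u0fb2") == "\u0f42\u0f62")
```

Under this reading, a stacked syllable and an unstacked sequence of the same letters share one pinyin, which is consistent with the observation that one pinyin can correspond to several syllables.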
Second, the character library is created, and each pinyin and its corresponding syllables are added to it with the appropriate relation.
Target syllables and their pinyin are stored in the character library. In terms of data structure, the stored pairs have two kinds of relation: one-to-one, where a pinyin string represents only one syllable, and many-to-one, where one pinyin string represents several syllables. Owing to the particular nature of Tibetan, one pinyin corresponds to at most three syllables.
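Both relations fit in a single mapping from pinyin string to a list of syllables, where a one-to-one pinyin simply has a list of length one. The sketch below is an assumption about how such an index might be built; the names `build_index` and `max_fanout` and the sample pairs are hypothetical.

```python
from collections import defaultdict

def build_index(pairs):
    """Group (pinyin, syllable) pairs into a pinyin -> [syllables] mapping."""
    index = defaultdict(list)
    for pinyin, syllable in pairs:
        if syllable not in index[pinyin]:
            index[pinyin].append(syllable)
    return dict(index)

def max_fanout(index):
    """Largest number of syllables that share one pinyin."""
    return max((len(syllables) for syllables in index.values()), default=0)

# Hypothetical sample data: two syllables share the second pinyin.
PAIRS = [
    ("\u0f40", "\u0f40"),
    ("\u0f42\u0f62", "\u0f42\u0f62"),
    ("\u0f42\u0f62", "\u0f42\u0fb2"),
]
INDEX = build_index(PAIRS)
print(max_fanout(INDEX))
```

A check such as `max_fanout(index) <= 3` could be used to verify the stated upper bound against a real character library.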
Third, the input method engine
The input method engine is the core of intelligent input. It provides an adapter for the input method: the adapter receives the code values entered by the user, finds the pinyin code corresponding to those code values, searches the character library with the pinyin code, and returns the search results to the user, completing the input.
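The adapter role can be sketched as two small steps: translate key codes to a pinyin code, then look that code up in the library. The key layout in `KEY_TO_LETTER`, the class name `Adapter`, and the one-entry library are all hypothetical; the patent does not give a concrete key assignment in the text reproduced here.

```python
# Hypothetical key layout: Latin key -> Tibetan pinyin character.
KEY_TO_LETTER = {
    "k": "\u0f40",
    "g": "\u0f42",
    "r": "\u0f62",
    "s": "\u0f66",
    "o": "\u0f7c",
}

class Adapter:
    """Sits between the user's keystrokes and the character library."""

    def __init__(self, key_map, library):
        self.key_map = key_map
        self.library = library  # pinyin code -> list of syllables

    def pinyin_for(self, keys):
        """Translate key codes to a pinyin code (raises KeyError on an unknown key)."""
        return "".join(self.key_map[k] for k in keys)

    def lookup(self, keys):
        """Return the syllables stored under the pinyin code for these keys."""
        return self.library.get(self.pinyin_for(keys), [])

adapter = Adapter(KEY_TO_LETTER, {"\u0f66\u0f42\u0f62": ["\u0f66\u0f92\u0fb2"]})
print(adapter.lookup("sgr"))
```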
Compared with existing input methods, the invention has the following beneficial effects:
1) General applicability
The character encoding of the invention is based on the international Unicode standard, so it is easy to implement on different devices.
2) Flexible key layout
Because the invention has few pinyin characters, only 34 keys are needed. On a PC the key layout of the widely used Himalaya input method can therefore be adopted as is, and on terminals such as mobile phones and tablets a full-keyboard mode or a nine-key mode can be implemented neatly; existing mobile Tibetan input methods do not yet offer a nine-key keyboard mode.
3) High input rate and few duplicate codes
The invention uses search-based pinyin input, and few pinyin map to more than one word, so duplicate codes are rare and the preferred word can be determined accurately, which raises input speed. For one example phrase, an ordinary input method needs 11 keystrokes, whereas this input method needs only 8 or even fewer. For shorter words, the preferred word can be determined accurately from the first keystroke.
4) Consistent with Tibetan writing habits, easy to learn and use
The invention is a pinyin-based input method, and Tibetan is itself an alphabetic script. Writing the pinyin fully matches the thinking behind writing Tibetan, so anyone with a basic knowledge of written Tibetan can pick up this input method step by step.
5) Easy to build and extend the lexicon
The invention defines a basic system character library that contains all Tibetan syllables, and the user lexicon is built on top of it. The lexicon needs not only a good data structure but also good extensibility and compatibility. With the encoding scheme of the invention, all kinds of Tibetan vocabulary found on the network can be inserted into the lexicon, so extending it is straightforward.
6) Word frequency recording for faster input
Like today's mainstream Chinese input methods, this input method automatically counts and adjusts word frequencies so that input adapts to each user's habits. It can also remember user phrases, making Tibetan input more convenient, faster, and more flexible.
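The frequency adjustment mentioned in item 6 could work roughly as follows: each time the user picks a candidate its count goes up, and later candidate lists are ordered by count. The patent does not specify the counting scheme, so the `FrequencyTracker` class and its simple increment-by-one policy are assumptions for illustration.

```python
from collections import Counter

class FrequencyTracker:
    """Counts how often each word is selected and ranks candidates by count."""

    def __init__(self, initial=None):
        self.counts = Counter(initial or {})

    def record_selection(self, word):
        """Call when the user picks a word from the candidate area."""
        self.counts[word] += 1

    def rank(self, candidates):
        """Most frequently selected first; ties keep their incoming order."""
        return sorted(candidates, key=lambda w: -self.counts[w])

tracker = FrequencyTracker({"\u0f42\u0f62": 2, "\u0f42\u0fb2": 2})
tracker.record_selection("\u0f42\u0fb2")
print(tracker.rank(["\u0f42\u0f62", "\u0f42\u0fb2"]))
```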
Brief description of the drawings
The invention is described in further detail below with reference to the accompanying drawings:
Fig. 1 is a schematic diagram of the structure of a Tibetan syllable;
Fig. 2 is a flow chart of the method of the invention;
Fig. 3 is a block diagram of the input method in one specific embodiment of the invention.
Detailed description of the embodiments
The invention is further described below with reference to the drawings. The pinyin-based intelligent input scheme is implemented through the following steps:
1. Build the input method character library according to the given pinyin rules. Every efficient input method necessarily relies on a character library with its own specific structure. In the present invention the cross-platform Unicode encoding is used, and the character library is structured as shown in Table 6: it consists of four parts, an identifier (ID), the syllable (Value), the pinyin code (PinyinCode), and the frequency (Frequency).
Table 6: Input method character library
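A record with the four parts named above might look like the following sketch. Only the four field names come from the text; the Python types, the default frequency of zero, and the two sample rows are assumptions.

```python
from dataclasses import dataclass

@dataclass
class LibraryEntry:
    """One row of the character library."""
    id: int             # ID
    value: str          # Value: the Tibetan syllable itself
    pinyin_code: str    # PinyinCode: the pinyin assigned to the syllable
    frequency: int = 0  # Frequency: usage count, assumed to start at zero

# Hypothetical sample rows: two syllables stored under the same pinyin code.
SAMPLE = [
    LibraryEntry(1, "\u0f42\u0f62", "\u0f42\u0f62", 5),
    LibraryEntry(2, "\u0f42\u0fb2", "\u0f42\u0f62", 9),
]

for entry in SAMPLE:
    print(entry.id, entry.value, entry.pinyin_code, entry.frequency)
```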
The character library is built in the following steps:
A. Collect the syllable material for the character library by consulting relevant glossaries, dictionaries, and other sources. Because all pinyin characters are specific to this invention, the pinyin must be obtained manually or entered by a purpose-written program.
B. Integrate the collected pinyin and syllables into the input method character library; a dedicated program module can be designed for this work.
2. Build the input method engine on the character library
The character-library-based input method engine mainly provides an adapter between the user and the character library, as shown in Fig. 3. Its functions include matching the key codes entered by the user to the corresponding pinyin value, outputting the pinyin value and querying the character library, and returning the query results.
The input method defines a relation tree of all pinyin values and key codes. The adapter traverses this tree to obtain the pinyin values that match the key codes the user has entered so far, searches the character library with those pinyin values, and returns the results. As soon as the user starts typing, all words whose pinyin matches the input or begins with it are displayed in the candidate area of the input method, sorted by word frequency. The user can then either select the target word in the candidate area or finish typing the full pinyin and then select the word. The input method can also automatically append the required terminating mark according to Tibetan grammar; Tibetan has four terminating marks, whose use is governed by the corresponding grammar rules.
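The relation tree can be pictured as a prefix tree (trie) keyed by key codes, with a pinyin value stored at each node where a complete pinyin ends. The patent does not describe the tree's internal layout, so the `Node` and `RelationTree` classes and the Latin key codes below are assumptions used to illustrate the traversal.

```python
class Node:
    def __init__(self):
        self.children = {}  # key code -> child Node
        self.pinyin = None  # pinyin value that ends at this node, if any

class RelationTree:
    """Prefix tree from key-code sequences to pinyin values."""

    def __init__(self):
        self.root = Node()

    def insert(self, key_codes, pinyin):
        node = self.root
        for code in key_codes:
            node = node.children.setdefault(code, Node())
        node.pinyin = pinyin

    def matches(self, typed):
        """All pinyin values whose key codes equal or begin with `typed`."""
        node = self.root
        for code in typed:
            node = node.children.get(code)
            if node is None:
                return []
        found, stack = [], [node]
        while stack:  # walk the whole subtree below the typed prefix
            current = stack.pop()
            if current.pinyin is not None:
                found.append(current.pinyin)
            stack.extend(current.children.values())
        return sorted(found)

tree = RelationTree()
tree.insert("s", "\u0f66")
tree.insert("sg", "\u0f66\u0f42")
tree.insert("sgr", "\u0f66\u0f42\u0f62")
tree.insert("k", "\u0f40")
print(tree.matches("sg"))
```

Each additional key narrows the subtree that is walked, which is how the candidate list shrinks as the user types. The pinyin values returned here would then be used to query the character library.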
An example of the Tibetan input process:
A. Assume the keyboard layout is a full keyboard and the user intends to enter a word with a known pinyin. When the first character of that pinyin is entered, the input method returns to the user interface all words whose pinyin begins with that character. When the user enters the second character, the input method returns the words whose pinyin begins with those two characters, and so on; the candidates are narrowed step by step until the desired word is obtained.
B. In the invention, pinyin and syllables (characters) are related mostly one-to-one. As far as the search process is concerned, a result is usually found without typing the whole pinyin, and the input method engine returns the best results to the user during input, so the user only has to pick from them.
Claims (5)
1. An intelligent Tibetan input method based on pinyin, comprising the steps of:
1) assigning a key code to each Tibetan consonant and each Tibetan vowel, the key code being the pinyin character value;
2) assigning each syllable a corresponding pinyin code according to the spelling order of the Tibetan syllable, and storing it in an input method character library;
3) building a relation tree between pinyin codes and key codes;
4) building an input method engine on the input method character library, the input method engine traversing the relation tree according to the entered key codes to obtain the corresponding pinyin code, then querying the input method character library with that pinyin code and returning the corresponding Tibetan text;
wherein a pinyin code is assigned to each syllable as follows: for a single-character Tibetan syllable, the pinyin is the syllable itself; for a multi-character Tibetan syllable without vertical stacking, the pinyin is the syllable itself; for a multi-character Tibetan syllable with vertical stacking, the pinyin is set according to its spelling order.
2. The input method of claim 1, characterized in that the same pinyin code corresponds to one or more syllables.
3. The input method of claim 1 or 2, characterized in that the input method engine looks up the matching pinyin according to the pinyin code, displays in the candidate area of the input method all Tibetan text whose pinyin matches it or begins with it, and sorts the candidates by word frequency.
4. The input method of claim 1, characterized in that on mobile devices a full-keyboard mode or a nine-key mode is used as the input interface for Tibetan consonants and vowels.
5. The input method of claim 1, characterized in that on a PC the key layout of the Himalaya input method is used as the input interface for Tibetan consonants and vowels.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410142863.2A CN103984420B (en) | 2014-04-10 | 2014-04-10 | A kind of Tibetan language intelligent input method based on phonetic |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103984420A (en) | 2014-08-13 |
CN103984420B (en) | 2017-11-14 |
Family
ID=51276430
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410142863.2A Expired - Fee Related CN103984420B (en) | 2014-04-10 | 2014-04-10 | A kind of Tibetan language intelligent input method based on phonetic |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103984420B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104408037A (en) * | 2014-12-05 | 2015-03-11 | 才智杰 | Tibetan text vector model representation method |
CN104615269B (en) * | 2015-02-04 | 2018-01-16 | 史晓东 | A kind of Tibetan language Latin simple double spelling coding method and its intelligent input system entirely |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1696880A (en) * | 2005-05-08 | 2005-11-16 | 卢亚军 | General keyboard layout of Tibetan computer, and input method |
CN1737739A (en) * | 2005-07-16 | 2006-02-22 | 西北民族大学 | Tibetan input method based on English keyboard |
CN101751140A (en) * | 2008-12-22 | 2010-06-23 | 青海师范大学 | Input method leading modern Tibetan scripts to correspond to fingerboard key maps one by one |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20171114 Termination date: 20190410 |