CN105955986A - Character converting method and apparatus - Google Patents
Character converting method and apparatus
- Publication number
- CN105955986A, CN201610243297.3A
- Authority
- CN
- China
- Prior art keywords
- node
- converted
- word
- phonetic
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Document Processing Apparatus (AREA)
Abstract
The invention discloses a character conversion method and apparatus. The method comprises: receiving a text to be converted; determining, in a ternary search tree, a target node corresponding to the text to be converted, where correspondences between words and pinyin are stored in the nodes of the ternary search tree in advance; extracting, from the target node, the word or pinyin corresponding to the text to be converted; and outputting that word or pinyin. While the target node is being determined, each lookup of a node in the ternary search tree can cut the remaining query workload in half. The solution can therefore quickly find the target node corresponding to the text to be converted and obtain the corresponding word or pinyin from it, which improves query efficiency.
Description
Technical field
Embodiments of the present invention relate to the field of communication technology, and in particular to a character conversion method and apparatus.
Background
At present, mutual conversion between pinyin and words usually requires building a large lexicon in advance, in which the correspondence between every word and its pinyin is recorded. A word includes at least two Chinese characters.
When a user inputs a group of pinyin, the server has to traverse the whole lexicon from beginning to end to find the word corresponding to that pinyin, so the query may take a long time. Likewise, when a user inputs a word, the server has to traverse the whole lexicon from beginning to end to find the pinyin corresponding to that word, which also takes a long time. Converting between words and pinyin with such a lexicon is therefore very inefficient.
How to improve the query efficiency of mutual conversion between pinyin and words has thus become a technical problem that urgently needs to be solved.
Summary of the invention
The present invention provides a character conversion method and apparatus to improve query efficiency.
According to a first aspect of the embodiments of the present invention, a character conversion method is provided, including:
receiving a text to be converted, the text to be converted being pinyin or a word;
determining, in a ternary search tree, a target node corresponding to the text to be converted, where correspondences between words and pinyin are stored in advance in the nodes of the ternary search tree;
extracting, from the target node, the word or pinyin corresponding to the text to be converted;
outputting the word or pinyin corresponding to the text to be converted.
Optionally, after the step of receiving the text to be converted, the method further includes:
judging whether the text to be converted can be split into word segments;
when the text to be converted can be split into word segments, splitting it with a word segmentation algorithm to obtain a segmentation result, determining in the ternary search tree a first designated node corresponding to a segment in the segmentation result, extracting from the first designated node the word or pinyin corresponding to the segment, and outputting the word or pinyin corresponding to the segment;
when the text to be converted cannot be split into word segments, triggering the step of determining, in the ternary search tree, the target node corresponding to the text to be converted.
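As a rough illustration of this optional branch, the sketch below splits the input and converts it segment by segment, falling back to a single whole-text lookup otherwise. The source does not name a segmentation algorithm, so forward maximum matching is an assumption made here; `LEXICON` is a hypothetical plain dictionary standing in for the search tree, and `segment` and `convert` are illustrative names.

```python
# Hypothetical lexicon standing in for the search-tree lookup.
LEXICON = {"中国": "zhongguo", "民族": "minzu", "社会": "shehui"}
MAX_LEN = max(len(word) for word in LEXICON)


def segment(text):
    """Forward maximum matching; returns None if the text cannot be fully split."""
    segments, i = [], 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            if text[i:i + size] in LEXICON:
                segments.append(text[i:i + size])
                i += size
                break
        else:
            return None  # no lexicon word starts at position i
    return segments


def convert(text):
    """Convert per segment when the text splits into several segments."""
    segments = segment(text)
    if segments and len(segments) > 1:
        return [LEXICON[s] for s in segments]
    # Not splittable: look the whole text up directly (None if absent).
    return [LEXICON.get(text)]
```

Under these assumptions, `convert("中国社会")` takes the segmentation branch, while `convert("民族")` is looked up as a whole.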
Optionally, the step of determining, in the ternary search tree, the target node corresponding to the text to be converted includes:
when the text to be converted is a word, determining, among the root node of a word ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is the same as that of the first Chinese character in the text to be converted;
determining, among the child nodes of the second designated node, a third designated node whose ASCII code value is the same as that of the remaining Chinese characters in the text to be converted;
determining the third designated node as the target node, where a word includes at least two Chinese characters.
Optionally, the step of determining, in the ternary search tree, the target node corresponding to the text to be converted includes:
when the text to be converted is at least two pinyin groups, determining, among the root node of a pinyin ternary search tree and the sibling nodes of the root node, a fourth designated node that is the same as the first pinyin group in the text to be converted;
determining, among the child nodes of the fourth designated node, a fifth designated node that is the same as the remaining pinyin groups in the text to be converted;
determining the fifth designated node as the target node, where each of the at least two pinyin groups corresponds to one Chinese character.
Optionally, before the step of determining, in the ternary search tree, the target node corresponding to the text to be converted, the method further includes:
determining the ASCII code value corresponding to each word in a standard lexicon;
adding, according to the magnitude of the ASCII code values, the first Chinese character of each word in the standard lexicon to the root node of the ternary search tree and the sibling nodes of the root node;
adding the non-first Chinese characters and the pinyin of each word in the standard lexicon to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
Optionally, before the step of determining, in the ternary search tree, the target node corresponding to the text to be converted, the method further includes:
determining the first letter of each pinyin pair in the standard lexicon, where each pinyin group includes at least one pinyin letter, each pinyin pair includes at least two pinyin groups, one pinyin group corresponds to one Chinese character, and one pinyin pair corresponds to one word;
adding, according to the order of the first letters, the first pinyin group of each pinyin pair in the standard lexicon to the root node of the ternary search tree and the sibling nodes of the root node;
adding the non-first pinyin groups of each pinyin pair in the standard lexicon, together with the word corresponding to each pinyin pair, to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
According to a second aspect of the embodiments of the present invention, a character conversion apparatus is provided, including:
a receiving module, configured to receive a text to be converted, the text to be converted being pinyin or a word;
a first determining module, configured to determine, in a ternary search tree, a target node corresponding to the text to be converted, where correspondences between words and pinyin are stored in advance in the nodes of the ternary search tree;
an extraction module, configured to extract, from the target node, the word or pinyin corresponding to the text to be converted;
an output module, configured to output the word or pinyin corresponding to the text to be converted.
Optionally, the apparatus further includes:
a judging module, configured to judge whether the text to be converted can be split into word segments;
a first execution module, configured to, when the text to be converted can be split into word segments, split it with a word segmentation algorithm to obtain a segmentation result, determine in the ternary search tree a first designated node corresponding to a segment in the segmentation result, extract from the first designated node the word or pinyin corresponding to the segment, and output the word or pinyin corresponding to the segment;
a second execution module, configured to trigger the first determining module when the text to be converted cannot be split into word segments.
Optionally, the first determining module includes:
a first determining submodule, configured to, when the text to be converted is a word, determine, among the root node of a word ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is the same as that of the first Chinese character in the text to be converted;
a second determining submodule, configured to determine, among the child nodes of the second designated node, a third designated node whose ASCII code value is the same as that of the remaining Chinese characters in the text to be converted;
a third determining submodule, configured to determine the third designated node as the target node, where a word includes at least two Chinese characters.
Optionally, the first determining module includes:
a fourth determining submodule, configured to, when the text to be converted is at least two pinyin groups, determine, among the root node of a pinyin ternary search tree and the sibling nodes of the root node, a fourth designated node that is the same as the first pinyin group in the text to be converted;
a fifth determining submodule, configured to determine, among the child nodes of the fourth designated node, a fifth designated node that is the same as the remaining pinyin groups in the text to be converted;
a sixth determining submodule, configured to determine the fifth designated node as the target node, where each of the at least two pinyin groups corresponds to one Chinese character.
Optionally, the apparatus further includes:
a second determining module, configured to determine the ASCII code value corresponding to each word in a standard lexicon;
a first adding module, configured to add, according to the magnitude of the ASCII code values, the first Chinese character of each word in the standard lexicon to the root node of the ternary search tree and the sibling nodes of the root node;
a second adding module, configured to add the non-first Chinese characters and the pinyin of each word in the standard lexicon to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
Optionally, the apparatus further includes:
a third determining module, configured to determine the first letter of each pinyin pair in the standard lexicon, where each pinyin group includes at least one pinyin letter, each pinyin pair includes at least two pinyin groups, one pinyin group corresponds to one Chinese character, and one pinyin pair corresponds to one word;
a third adding module, configured to add, according to the order of the first letters, the first pinyin group of each pinyin pair in the standard lexicon to the root node of the ternary search tree and the sibling nodes of the root node;
a fourth adding module, configured to add the non-first pinyin groups of each pinyin pair in the standard lexicon, together with the word corresponding to each pinyin pair, to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
Compared with the prior art, the technical solution provided by this embodiment has the following advantages and features:
In the solution provided by the present invention, correspondences between words and pinyin can be stored in advance in the nodes of the ternary search tree. After the text to be converted is received, the target node corresponding to it can be determined in the ternary search tree. If the text to be converted is pinyin, the word corresponding to the pinyin can be extracted from the target node; if the text to be converted is a word, the pinyin corresponding to the word can be extracted from the target node, so that pinyin and words can be converted into each other. While the target node is being determined, each lookup of a node corresponding to the text to be converted in the ternary search tree cuts the query workload in half. The solution can therefore quickly find the target node corresponding to the text to be converted and obtain from it the corresponding word or pinyin, which improves query efficiency.
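The halving argument can be made concrete with a short estimate. It assumes that the first-level nodes form a balanced search tree; the symbols n (number of first-level keys) and k (number of comparisons) are introduced here for illustration only, and n = 6000 is just a sample value.

```latex
\frac{n}{2^{k}} \le 1
\quad\Longrightarrow\quad
k \ge \log_2 n,
\qquad
\text{e.g. } n = 6000:\quad \lceil \log_2 6000 \rceil = 13
\text{ comparisons, versus up to } 6000 \text{ for a linear scan.}
```

If the tree is not balanced, the number of comparisons can be larger than this estimate.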
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings required by the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a character conversion method provided by an embodiment of the present invention.
Fig. 2 is a schematic diagram of a lexicon ternary search tree provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of a pinyin ternary search tree provided by an embodiment of the present invention.
Fig. 4 is a flowchart of another character conversion method provided by an embodiment of the present invention.
Fig. 5 is a schematic diagram of a character conversion apparatus provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of a character conversion method provided by an embodiment of the present invention. The method can quickly find, in the ternary search tree, the target node corresponding to the text to be converted and obtain from that node the corresponding word or pinyin, thereby improving query efficiency. The method includes the following steps.
Step S11: receive the text to be converted.
The method provided by the embodiment of the present invention can be applied in a terminal on which application software is installed. The terminal may be a device such as a smartphone, tablet computer, notebook computer or desktop computer.
The embodiment of the present invention suits many application scenarios. For example, it can be applied in an e-book, to convert between the pinyin and words in the e-book; as another example, it can be applied in a search engine, to convert between the pinyin and words typed into the search engine's input box.
Of course, the embodiment of the present invention is not limited to the scenarios mentioned above and can also be applied in other scenarios that require mutual conversion between pinyin and words.
The text to be converted may be the pinyin first letters of a word, the full pinyin spelling of a word, or a word itself. A word includes at least two Chinese characters.
Step S12: determine, in the ternary search tree, the target node corresponding to the text to be converted.
Before step S12 is performed, in order to convert words into pinyin, the embodiment of the present invention needs to build a word ternary search tree and add the words in the standard lexicon, together with their pinyin, to the nodes of the word ternary search tree. Likewise, in order to convert pinyin into words, the embodiment of the present invention also needs to build a pinyin ternary search tree and add the pinyin in the standard lexicon, together with the corresponding words, to the nodes of the pinyin ternary search tree.
The way the word ternary search tree is built is described below.
Building the word ternary search tree includes the following steps: first, determine the ASCII code value corresponding to each word in the standard lexicon; then, according to the magnitude of the ASCII code values, add the first Chinese character of each word in the standard lexicon to the root node of the word ternary search tree and the sibling nodes of the root node; finally, add the non-first Chinese characters and the pinyin of each word in the standard lexicon to the child nodes of the root node of the word ternary search tree and to the child nodes of the sibling nodes of the root node.
The root node of the word ternary search tree stores the first Chinese character whose ASCII code value is the median among the first characters of all words in the standard lexicon. First characters with a smaller ASCII code value are placed in the left branch of the root node, and first characters with a larger ASCII code value are placed in the right branch of the root node.
To make the above process of building the word ternary search tree clearer, an example is given below.
Table 1 shows the words stored in the standard lexicon, the pinyin first letters of each word, and the full pinyin spelling of each word.
Table 1
For example, referring to Fig. 2 together with Table 1, assume that the standard lexicon stores the words 中国 (zhongguo), 民族 (minzu), 社会 (shehui), 群体 (qunti) and 风格 (fengge), along with their pinyin first letters and full pinyin spellings. Assume that the first Chinese characters of these words, in descending order of ASCII code value, are 群, 风, 中, 社 and 民. The character 中, whose ASCII code value is the median, is added to the root node of the word ternary search tree. The characters 群 and 风, whose ASCII code values are greater than that of 中, are added to the right branch of the root node; because the ASCII code value of 群 is greater than that of 风, 群 is added to the right branch of the node holding 风. The characters 民 and 社, whose ASCII code values are smaller than that of 中, are added to the left branch of the root node; because the ASCII code value of 社 is greater than that of 民, 社 is added to the right branch of the node holding 民. Finally, the non-first Chinese characters and the pinyin of each word, namely "国, zg, zhongguo", "族, mz, minzu", "格, fg, fengge", "体, qt, qunti" and "会, sh, shehui", are added to the child nodes of the root node of the word ternary search tree and to the child nodes of the sibling nodes of the root node.
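The construction in this example can be sketched as follows. This is a minimal illustration, not the patent's implementation: the five words are written in Chinese characters restored from their pinyin, `RANK` is a hypothetical table that mirrors the example's assumed ordering of code values (real code points order these characters differently), and `Node` and `build_word_tree` are illustrative names.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Assumed ordering taken from the example: 民 < 社 < 中 < 风 < 群.
RANK = {"民": 1, "社": 2, "中": 3, "风": 4, "群": 5}


@dataclass
class Node:
    key: str                          # first character of a word
    left: Optional["Node"] = None     # first characters with smaller values
    right: Optional["Node"] = None    # first characters with larger values
    # remaining characters -> (pinyin first letters, full pinyin)
    children: Dict[str, Tuple[str, str]] = field(default_factory=dict)


def build_word_tree(entries, code=RANK.get):
    """Build the first-character level with the median key at the root."""
    heads = sorted({word[0] for word, _, _ in entries}, key=code)

    def balanced(lo, hi):
        if lo > hi:
            return None
        mid = (lo + hi) // 2
        return Node(heads[mid], balanced(lo, mid - 1), balanced(mid + 1, hi))

    root = balanced(0, len(heads) - 1)
    for word, initials, pinyin in entries:
        node = root
        while node.key != word[0]:    # walk to the first-character node
            node = node.left if code(word[0]) < code(node.key) else node.right
        node.children[word[1:]] = (initials, pinyin)
    return root


ENTRIES = [
    ("中国", "zg", "zhongguo"),
    ("民族", "mz", "minzu"),
    ("社会", "sh", "shehui"),
    ("群体", "qt", "qunti"),
    ("风格", "fg", "fengge"),
]
tree = build_word_tree(ENTRIES)
```

With the assumed ordering, the root holds 中, its right branch holds 风 with 群 to the right of 风, and its left branch holds 民 with 社 to the right of 民, which matches the layout described in the example.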
The way the pinyin ternary search tree is built is described below.
Building the pinyin ternary search tree includes the following steps: first, determine the first letter of each pinyin pair in the standard lexicon, where each pinyin group includes at least one pinyin letter, each pinyin pair includes at least two pinyin groups, one pinyin group corresponds to one Chinese character, and one pinyin pair corresponds to one word; then, according to the order of the first letters, add the first pinyin group of each pinyin pair in the standard lexicon to the root node of the pinyin ternary search tree and the sibling nodes of the root node; finally, add the non-first pinyin groups of each pinyin pair in the standard lexicon, together with the word corresponding to each pinyin pair, to the child nodes of the root node of the pinyin ternary search tree and to the child nodes of the sibling nodes of the root node.
The root node of the pinyin ternary search tree stores the first pinyin group of the pinyin pair whose first letter is the median in the standard lexicon. The first pinyin groups of the other pinyin pairs whose first letters come before the one stored in the root node in alphabetical order are placed in the left branch of the root node, and those whose first letters come after it are placed in the right branch of the root node.
To make the above process of building the pinyin ternary search tree clearer, an example is given below.
For example, referring to Fig. 3 together with Table 1, assume that the standard lexicon stores the words 中国 (zhongguo), 民族 (minzu), 社会 (shehui), 群体 (qunti) and 风格 (fengge), along with their pinyin first letters and full pinyin spellings. The first letters of the pinyin pairs in the standard lexicon, in descending order, are "z", "s", "q", "m" and "f", so the first pinyin group "qun" of the pinyin pair "qunti", whose first letter "q" is the median, is added to the root node of the pinyin ternary search tree. Because the first letter of "zhong" comes after the first letter of "she" in alphabetical order, "zhong" is added to the right branch of the node holding "she". Because the first letter of "min" comes after the first letter of "feng" in alphabetical order, "min" is added to the right branch of the node holding "feng". Finally, the non-first pinyin groups of each pinyin pair and the corresponding words, namely "guo, zg, 中国", "hui, sh, 社会", "ti, qt, 群体", "zu, mz, 民族" and "ge, fg, 风格", are added to the child nodes of the root node of the pinyin ternary search tree and to the child nodes of the sibling nodes of the root node.
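The pinyin tree from this example can be sketched in the same spirit. The code below is an assumption-laden illustration: entries are supplied already split into pinyin groups, the median entry is inserted first so that it becomes the root, and whole groups are compared as strings (which agrees with first-letter order and also breaks ties). `PinyinNode` and `insert` are illustrative names, and the Chinese words are restored from their pinyin.

```python
class PinyinNode:
    def __init__(self, key):
        self.key = key        # first pinyin group, e.g. "qun"
        self.left = None      # groups that sort before key
        self.right = None     # groups that sort after key
        self.children = {}    # remaining groups -> (first letters, word)


def insert(node, groups, initials, word):
    """Insert one lexicon entry and return the (possibly new) subtree root."""
    head, rest = groups[0], tuple(groups[1:])
    if node is None:
        node = PinyinNode(head)
    if head == node.key:
        node.children[rest] = (initials, word)
    elif head < node.key:
        node.left = insert(node.left, groups, initials, word)
    else:
        node.right = insert(node.right, groups, initials, word)
    return node


# The median entry ("qun") goes first so that it ends up in the root.
ENTRIES = [
    (("qun", "ti"), "qt", "群体"),
    (("feng", "ge"), "fg", "风格"),
    (("she", "hui"), "sh", "社会"),
    (("min", "zu"), "mz", "民族"),
    (("zhong", "guo"), "zg", "中国"),
]

root = None
for groups, initials, word in ENTRIES:
    root = insert(root, groups, initials, word)
```

The resulting shape has "qun" in the root, "feng" on the left with "min" to its right, and "she" on the right with "zhong" to its right.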
Because a standard lexicon usually contains 5 to 6 thousand words and their pinyin, and the examples above only illustrate the principle of the embodiment of the present invention, not all of those words and their pinyin are listed.
Once a ternary search tree has been built, the target node corresponding to the text to be converted can be determined in it. The process of determining this target node is briefly described below.
If the text to be converted is a word, the word ternary search tree described above is used. The target node corresponding to the text to be converted is determined in the word ternary search tree as follows. First, when the text to be converted is a word, a second designated node whose ASCII code value is the same as that of the first Chinese character in the text to be converted is determined among the root node of the word ternary search tree and the sibling nodes of the root node. Then, a third designated node whose ASCII code value is the same as that of the remaining Chinese characters in the text to be converted is determined among the child nodes of the second designated node. Finally, the third designated node is determined as the target node. A word includes at least two Chinese characters.
To make the above process of determining the target node in the word ternary search tree clearer, an example is given below.
For example, referring to Fig. 2 together with Table 1, the word ternary search tree has been built in advance, and the first Chinese characters of the words in the standard lexicon, in descending order of ASCII code value, are 群, 风, 中, 社 and 民. Assume that the text to be converted is the word 风格 (fengge). First, the ASCII code value of 风 is compared with that of 中, which is stored in the root node of the word ternary search tree. Because the ASCII code value of 风 is greater than that of 中, the comparison continues with the right sibling node of the root node. Because the ASCII code value of the first character 风 in the text to be converted is the same as that of 风 stored in the right sibling node of the root node, the next check is whether the ASCII code value of the second character 格 in the text to be converted is the same as that of 格 stored in the child node of the right sibling node of the root node. Because the two values are the same, the child node of the right sibling node of the root node is determined as the target node, and the full pinyin spelling "fengge" and the pinyin first letters "fg" corresponding to 风格 can be extracted from the target node.
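The walk through the word tree in this example can be sketched as below. The tree is hand-built to match the layout described for Fig. 2, with the words restored to Chinese characters from their pinyin; `RANK` is a hypothetical stand-in for the example's assumed code values, and `node` and `find_word` are illustrative names rather than anything defined by the source.

```python
def node(key, children, left=None, right=None):
    """Small helper: one first-character node with its child payloads."""
    return {"key": key, "children": children, "left": left, "right": right}


# Assumed ordering from the example: 民 < 社 < 中 < 风 < 群.
RANK = {"民": 1, "社": 2, "中": 3, "风": 4, "群": 5}

TREE = node(
    "中", {"国": ("zg", "zhongguo")},
    left=node("民", {"族": ("mz", "minzu")},
              right=node("社", {"会": ("sh", "shehui")})),
    right=node("风", {"格": ("fg", "fengge")},
               right=node("群", {"体": ("qt", "qunti")})),
)


def find_word(tree, word, code=lambda ch: RANK.get(ch, 0)):
    """Return (first letters, full pinyin, visited keys), or None if absent.

    Characters missing from RANK get rank 0 and simply fall off the left edge.
    """
    visited, current = [], tree
    while current is not None:
        visited.append(current["key"])
        if word[0] == current["key"]:
            # First character matched: the rest must match a child node.
            payload = current["children"].get(word[1:])
            return None if payload is None else (*payload, visited)
        go_left = code(word[0]) < code(current["key"])
        current = current["left"] if go_left else current["right"]
    return None
```

For 风格 the search visits 中 and then 风 before reading the pinyin from the child that holds 格, which is the same path as in the walkthrough.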
If the text to be converted is pinyin, the pinyin ternary search tree described above is used. The target node corresponding to the text to be converted is determined in the pinyin ternary search tree as follows. First, when the text to be converted is at least two pinyin groups, a fourth designated node that is the same as the first pinyin group in the text to be converted is determined among the root node of the pinyin ternary search tree and the sibling nodes of the root node. Then, a fifth designated node that is the same as the remaining pinyin groups in the text to be converted is determined among the child nodes of the fourth designated node. Finally, the fifth designated node is determined as the target node. Each of the at least two pinyin groups corresponds to one Chinese character.
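A pinyin lookup along these lines might look like the sketch below. It assumes the input has already been split into one pinyin group per Chinese character, since the source does not say how that split is done; the tree is hand-built to match the layout described for Fig. 3, the Chinese words are restored from their pinyin, and `node` and `find_pinyin` are illustrative names.

```python
def node(key, children, left=None, right=None):
    """Small helper: one first-group node with its child payloads."""
    return {"key": key, "children": children, "left": left, "right": right}


TREE = node(
    "qun", {("ti",): ("qt", "群体")},
    left=node("feng", {("ge",): ("fg", "风格")},
              right=node("min", {("zu",): ("mz", "民族")})),
    right=node("she", {("hui",): ("sh", "社会")},
               right=node("zhong", {("guo",): ("zg", "中国")})),
)


def find_pinyin(current, groups):
    """groups: pinyin split per character, e.g. ("feng", "ge")."""
    if current is None or len(groups) < 2:
        return None
    head, rest = groups[0], tuple(groups[1:])
    if head == current["key"]:
        # First group matched: the remaining groups must match a child node.
        return current["children"].get(rest)
    # String comparison looks at the first letter first, then breaks ties.
    branch = "left" if head < current["key"] else "right"
    return find_pinyin(current[branch], groups)
```

Calling `find_pinyin(TREE, ("feng", "ge"))` moves from the root "qun" to its left branch "feng" and then reads the word from the child node for "ge".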
To make the above process of determining the target node in the pinyin ternary search tree clearer, an example is given below.
For example, referring to Fig. 3 together with Table 1, the pinyin ternary search tree has been built in advance, and the first letters of the pinyin pairs in the standard lexicon, in descending order, are "z", "s", "q", "m" and "f". Assume that the text to be converted is the pinyin "fengge". First, the first letter "f" of the first pinyin group in the text to be converted is compared, in alphabetical order, with the first letter "q" stored in the root node of the pinyin ternary search tree. Because "f" comes before "q" in alphabetical order, the comparison continues with the left sibling node of the root node. Because the first letter "f" of the first pinyin group is the same as the first letter "f" stored in the left sibling node of the root node, the next check is whether the first pinyin group "feng" in the text to be converted is the same as the pinyin "feng" stored in the left sibling node of the root node. Because they are the same, the next check is whether the second pinyin group "ge" in the text to be converted is the same as the pinyin stored in the child node of the left sibling node of the root node. Because they are the same, the child node of the left sibling node of the root node is determined as the target node, and the word 风格 and the pinyin first letters "fg" corresponding to the text to be converted "fengge" can be extracted from the target node.
Step S13: extract, from the target node, the word or pinyin corresponding to the text to be converted.
After the target node corresponding to the text to be converted has been determined in the ternary search tree, the word or pinyin corresponding to the text to be converted can be extracted from the target node. If the text to be converted is a word, the corresponding pinyin is extracted from the target node; if the text to be converted is pinyin, the corresponding word is extracted from the target node.
If the text to be converted is pinyin and this pinyin may correspond to several words, the following steps are used to extract from the target node the words corresponding to the text to be converted: first, when the target node contains at least two words, determine the usage frequency of each word in the target node; then, sort the words in the target node by usage frequency to obtain a sorting result; finally, extract each word in the sorting result.
In other words, if the target node contains at least two words, the usage frequency of these words is determined, the words are sorted by usage frequency, and the sorted words are extracted. The user then sees the words ordered from highest to lowest usage frequency and can quickly locate a frequently used word.
If the text to be converted is Chinese characters and these characters can correspond to at least two pinyin
groups, the following steps are used to extract the pinyin corresponding to the text to be converted from the
target node. First, when the target node contains at least two pinyin groups, the usage frequency of each
pinyin group in the target node is determined, each of the at least two pinyin groups corresponding to one
Chinese character. Then, the pinyin groups in the target node are sorted by usage frequency to obtain a sorting
result. Finally, each pinyin group in the sorting result is extracted.
That is, if the target node contains at least two pinyin groups, the usage frequency of each pinyin group is
determined, the pinyin groups are sorted by usage frequency, and the sorted pinyin groups are extracted. The
user therefore sees the pinyin groups ordered from the highest usage frequency to the lowest and can quickly
locate a frequently used pinyin.
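The frequency-based ordering described above can be illustrated with a minimal Python sketch. The function
name `rank_candidates`, the candidate words and the frequency values are hypothetical and chosen only for
illustration; the description does not specify how usage frequencies are stored in the target node.

```python
def rank_candidates(frequencies):
    """Return candidates ordered from highest to lowest usage frequency.

    `frequencies` maps each candidate (a word or a pinyin group held in the
    target node) to a usage count. The counts below are made-up sample data.
    """
    return sorted(frequencies, key=frequencies.get, reverse=True)


# Pinyin to words: one pinyin string with several candidate words.
words_for_shishi = {"时事": 40, "事实": 120, "实施": 95}
print(rank_candidates(words_for_shishi))

# Character to pinyin: one character with several readings.
readings_for_hang = {"hang": 30, "xing": 80}
print(rank_candidates(readings_for_hang))
```

The same routine serves both directions, because in either case the target node holds a set of candidates that
only need to be ordered by a numeric frequency.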
Step S14: output the Chinese characters or pinyin corresponding to the text to be converted.
After the Chinese characters or pinyin corresponding to the text to be converted have been extracted from the
target node, they can be output so that the user can see the Chinese characters or pinyin corresponding to the
text to be converted.
In the embodiment shown in Fig. 1, the correspondence between Chinese characters and pinyin can be stored in
advance in the nodes of the ternary search tree. After the text to be converted is received, the target node
corresponding to the text to be converted can be determined in the ternary search tree. If the text to be
converted is pinyin, the Chinese characters corresponding to the pinyin can be extracted from the target node;
if the text to be converted is Chinese characters, the pinyin corresponding to the characters can be extracted
from the target node, so that pinyin and Chinese characters can be converted into each other. While the target
node corresponding to the text to be converted is being determined, each node comparison in the ternary search
tree can reduce the remaining query workload by about half. The solution provided by the present invention can
therefore quickly find the target node corresponding to the text to be converted and obtain the corresponding
Chinese characters or pinyin from this target node, which improves query efficiency.
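The lookup in the pinyin direction can be sketched in Python as follows. This is a simplified illustration
under stated assumptions, not the patented implementation: the names `Node`, `insert` and `find` are
hypothetical, each node is keyed by a whole pinyin group, and plain string comparison stands in for the
"compare first letter, then compare the whole pinyin group" procedure of the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    key: str                          # one pinyin group, e.g. "feng"
    left: Optional["Node"] = None     # sibling whose key sorts before this one
    right: Optional["Node"] = None    # sibling whose key sorts after this one
    child: Optional["Node"] = None    # subtree holding the next pinyin group
    words: List[str] = field(default_factory=list)  # payload of a target node


def insert(node, keys, word, i=0):
    """Insert the pinyin groups `keys` and attach `word` to the last node."""
    key = keys[i]
    if node is None:
        node = Node(key)
    if key < node.key:
        node.left = insert(node.left, keys, word, i)
    elif key > node.key:
        node.right = insert(node.right, keys, word, i)
    elif i + 1 < len(keys):
        node.child = insert(node.child, keys, word, i + 1)
    else:
        node.words.append(word)
    return node


def find(node, keys):
    """Return the target node for `keys`, or None if there is no match."""
    i = 0
    while node is not None:
        if keys[i] < node.key:
            node = node.left
        elif keys[i] > node.key:
            node = node.right
        elif i + 1 < len(keys):
            node, i = node.child, i + 1
        else:
            return node
    return None


root = None
for keys, word in [(["qi", "che"], "汽车"),
                   (["feng", "ge"], "风格"),
                   (["tian", "qi"], "天气")]:
    root = insert(root, keys, word)

target = find(root, ["feng", "ge"])
print(target.words if target else None)
```

With this sample data the root holds "qi", its left sibling holds "feng", and the child of that sibling holds
"ge", which mirrors the "fengge" walk-through in the description.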
Referring to Fig. 4, in other embodiments of the present invention, the method provided by the embodiment of
the present invention may further include the following steps:
Step S15: judge whether the text to be converted can be split into word segments. When the text to be
converted can be split into word segments, step S16 is triggered; when the text to be converted cannot be split
into word segments, step S12 is triggered.
Step S16: split the text to be converted using a word segmentation algorithm to obtain a segmentation result,
determine in the ternary search tree a first designated node corresponding to a word segment in the
segmentation result, extract the Chinese characters or pinyin corresponding to the word segment from the first
designated node, and output the Chinese characters or pinyin corresponding to the word segment.
Many word segmentation algorithms exist. For example, the word segmentation algorithm may be reverse maximum
matching or forward maximum matching by character reduction. Other word segmentation algorithms may of course
also be used, and the present invention does not limit the type of word segmentation algorithm.
In some cases, a word or sentence in the text to be converted may be very long. Determining the target node
for the whole text directly in the ternary search tree would then make the query inefficient. The word
segmentation algorithm is therefore used to split a longer word or sentence into word segments that cannot be
subdivided further, which improves the efficiency of determining, in the ternary search tree, the target node
corresponding to the text to be converted.
After the text to be converted is obtained, it is necessary to judge whether the text to be converted can be
split into word segments. If the text to be converted is a word that can be split, the word segmentation
algorithm is used to split the text to be converted and obtain a segmentation result, the first designated node
corresponding to a word segment in the segmentation result is determined in the ternary search tree, the pinyin
corresponding to the word segment is extracted from the first designated node, and the pinyin corresponding to
the word segment is output. If the text to be converted is a word that cannot be split, step S12 is triggered,
and the target node corresponding to the text to be converted is determined directly in the ternary search
tree.
Similarly, if the text to be converted is pinyin that can be split, the word segmentation algorithm is used to
split the text to be converted and obtain a segmentation result, the first designated node corresponding to a
word segment in the segmentation result is determined in the ternary search tree, the word corresponding to the
word segment is extracted from the first designated node, and finally the word corresponding to the word
segment is output.
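Steps S15 and S16 can be sketched in Python using reverse maximum matching, one of the segmentation algorithms
named above. This is only an illustrative sketch: the names `reverse_max_match` and `convert` are hypothetical,
and a plain `dict` stands in for the ternary search tree so that the example stays short.

```python
def reverse_max_match(text, vocab):
    """Split `text` from right to left, preferring the longest known segment."""
    max_len = max(len(entry) for entry in vocab)
    segments = []
    end = len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            # An unknown single character is emitted as-is so the scan always advances.
            if piece in vocab or size == 1:
                segments.append(piece)
                end -= size
                break
    segments.reverse()
    return segments


def convert(text, table):
    """Convert `text` segment by segment (S16) or as a whole (S12)."""
    segments = reverse_max_match(text, table)
    # S15: the text counts as splittable only if every segment is a known entry.
    if len(segments) > 1 and all(s in table for s in segments):
        return [table[s] for s in segments]
    return [table[text]] if text in table else []


word_to_pinyin = {"风格": "fengge", "天气": "tianqi"}
pinyin_to_word = {"fengge": "风格", "tianqi": "天气"}

print(convert("风格天气", word_to_pinyin))
print(convert("fenggetianqi", pinyin_to_word))
print(convert("风格", word_to_pinyin))
```

Splitting first keeps each individual lookup short, which is the efficiency argument made in the description
for long words and sentences.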
Fig. 5 is a schematic diagram of a character conversion apparatus provided by an embodiment of the present
invention. Referring to Fig. 5, the apparatus includes a receiving module 11, a first determining module 12, an
extraction module 13 and an output module 14, wherein:
The receiving module 11 is configured to receive text to be converted, the text to be converted being pinyin
or Chinese characters.
The first determining module 12 is configured to determine, in a ternary search tree, the target node
corresponding to the text to be converted, the correspondence between Chinese characters and pinyin being
stored in advance in the nodes of the ternary search tree.
The extraction module 13 is configured to extract, from the target node, the Chinese characters or pinyin
corresponding to the text to be converted.
The output module 14 is configured to output the Chinese characters or pinyin corresponding to the text to be
converted.
Optionally, the character conversion apparatus provided by the embodiment of the present invention may further
include the following modules. A judging module is configured to judge whether the text to be converted can be
split into word segments. A first execution module is configured to, when the text to be converted can be split
into word segments, split the text to be converted using a word segmentation algorithm to obtain a segmentation
result, determine in the ternary search tree a first designated node corresponding to a word segment in the
segmentation result, extract the Chinese characters or pinyin corresponding to the word segment from the first
designated node, and output the Chinese characters or pinyin corresponding to the word segment. A second
execution module is configured to trigger the first determining module 12 when the text to be converted cannot
be split into word segments.
Optionally, the first determining module 12 may further include the following submodules. A first determining
submodule is configured to, when the text to be converted is a word, determine, among the root node of the
ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is
identical to that of the first Chinese character in the text to be converted. A second determining submodule is
configured to determine, among the child nodes of the second designated node, a third designated node whose
ASCII code values are identical to those of the remaining Chinese characters in the text to be converted. A
third determining submodule is configured to determine the third designated node as the target node, a word
including at least two Chinese characters.
Optionally, the first determining module 12 may further include the following submodules. A fourth determining
submodule is configured to, when the text to be converted is at least two pinyin groups, determine, among the
root node of the ternary search tree and the sibling nodes of the root node, a fourth designated node identical
to the first pinyin group in the text to be converted. A fifth determining submodule is configured to
determine, among the child nodes of the fourth designated node, a fifth designated node identical to the
remaining pinyin groups in the text to be converted. A sixth determining submodule is configured to determine
the fifth designated node as the target node, each of the at least two pinyin groups corresponding to one
Chinese character.
Optionally, the character conversion apparatus provided by the embodiment of the present invention may further
include the following modules. A second determining module is configured to determine the ASCII code value
corresponding to each word in a standard dictionary. A first adding module is configured to add, according to
the magnitude of the ASCII code values, the first Chinese character of each word in the standard dictionary to
the root node of the ternary search tree and the sibling nodes of the root node. A second adding module is
configured to add the non-first Chinese characters and the pinyin of each word in the standard dictionary to
the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the
root node.
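Building the character-keyed tree from a standard dictionary can be sketched in Python as follows. This is a
sketch under assumptions rather than the implementation of the embodiment: `CharNode`, `add_word`, `build` and
`lookup` are hypothetical names, the three dictionary entries are sample data, and Python's `ord()` (a Unicode
code point) is used as a stand-in for the "ASCII code value" mentioned in the description.

```python
class CharNode:
    def __init__(self, char):
        self.char = char      # one Chinese character
        self.left = None      # sibling with a smaller code value
        self.right = None     # sibling with a larger code value
        self.child = None     # subtree for the next character of the word
        self.pinyin = []      # pinyin stored at the last character of a word


def add_word(node, word, pinyin, i=0):
    """Place the first character among root-level siblings, the rest in child nodes."""
    ch = word[i]
    if node is None:
        node = CharNode(ch)
    if ord(ch) < ord(node.char):
        node.left = add_word(node.left, word, pinyin, i)
    elif ord(ch) > ord(node.char):
        node.right = add_word(node.right, word, pinyin, i)
    elif i + 1 < len(word):
        node.child = add_word(node.child, word, pinyin, i + 1)
    else:
        node.pinyin.append(pinyin)
    return node


def build(standard_dictionary):
    """Build the tree from (word, pinyin) pairs."""
    root = None
    for word, pinyin in standard_dictionary:
        root = add_word(root, word, pinyin)
    return root


def lookup(node, word):
    """Return the pinyin list stored for `word`, or an empty list."""
    i = 0
    while node is not None:
        diff = ord(word[i]) - ord(node.char)
        if diff < 0:
            node = node.left
        elif diff > 0:
            node = node.right
        elif i + 1 < len(word):
            node, i = node.child, i + 1
        else:
            return node.pinyin
    return []


tree = build([("汽车", "qiche"), ("风格", "fengge"), ("天气", "tianqi")])
print(lookup(tree, "风格"))
```

Ordering siblings by code value is what lets a lookup discard one side of the remaining siblings at each
comparison.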
Optionally, the character conversion apparatus provided by the embodiment of the present invention may further
include the following modules. A third determining module is configured to determine the first letter of each
pinyin string in the standard dictionary, wherein each pinyin group includes at least one pinyin letter, each
pinyin string includes at least two pinyin groups, one pinyin group corresponds to one Chinese character, and
one pinyin string corresponds to one word. A third adding module is configured to add, in the order of the first
letters, the first pinyin group of each pinyin string in the standard dictionary to the root node of the ternary
search tree and the sibling nodes of the root node. A fourth adding module is configured to add the non-first
pinyin groups of each pinyin string in the standard dictionary, together with the word corresponding to each
pinyin string, to the child nodes of the root node of the ternary search tree and to the child nodes of the
sibling nodes of the root node.
With regard to the apparatus in the above embodiments, the specific manner in which each module performs its
operations has been described in detail in the embodiments of the method and is not elaborated here.
The above is only a specific embodiment of the present invention, but the protection scope of the present
invention is not limited thereto. Any change or replacement that a person skilled in the art can readily
conceive of within the technical scope disclosed by the present invention shall fall within the protection
scope of the present invention. Therefore, the protection scope of the present invention shall be subject to
the protection scope of the claims.
Claims (12)
1. A character conversion method, characterized in that it comprises:
receiving text to be converted, the text to be converted being pinyin or Chinese characters;
determining, in a ternary search tree, a target node corresponding to the text to be converted, the
correspondence between Chinese characters and pinyin being stored in advance in the nodes of the ternary search
tree;
extracting, from the target node, the Chinese characters or pinyin corresponding to the text to be converted;
outputting the Chinese characters or pinyin corresponding to the text to be converted.
2. The character conversion method according to claim 1, characterized in that, after the step of receiving
the text to be converted, the method further comprises:
judging whether the text to be converted can be split into word segments;
when the text to be converted can be split into word segments, splitting the text to be converted using a word
segmentation algorithm to obtain a segmentation result, determining in the ternary search tree a first
designated node corresponding to a word segment in the segmentation result, extracting the Chinese characters or
pinyin corresponding to the word segment from the first designated node, and outputting the Chinese characters
or pinyin corresponding to the word segment;
when the text to be converted cannot be split into word segments, triggering the step of determining, in the
ternary search tree, the target node corresponding to the text to be converted.
3. The character conversion method according to claim 1, characterized in that the step of determining, in
the ternary search tree, the target node corresponding to the text to be converted comprises:
when the text to be converted is a word, determining, among the root node of a character ternary search tree
and the sibling nodes of the root node, a second designated node whose ASCII code value is identical to that of
the first Chinese character in the text to be converted;
determining, among the child nodes of the second designated node, a third designated node whose ASCII code
values are identical to those of the remaining Chinese characters in the text to be converted;
determining the third designated node as the target node, the word including at least two Chinese characters.
4. The character conversion method according to claim 1, characterized in that the step of determining, in
the ternary search tree, the target node corresponding to the text to be converted comprises:
when the text to be converted is at least two pinyin groups, determining, among the root node of a pinyin
ternary search tree and the sibling nodes of the root node, a fourth designated node identical to the first
pinyin group in the text to be converted;
determining, among the child nodes of the fourth designated node, a fifth designated node identical to the
remaining pinyin groups in the text to be converted;
determining the fifth designated node as the target node, each of the at least two pinyin groups corresponding
to one Chinese character.
5. The character conversion method according to claim 1, characterized in that, before the step of
determining, in the ternary search tree, the target node corresponding to the text to be converted, the method
further comprises:
determining the ASCII code value corresponding to each word in a standard dictionary;
adding, according to the magnitude of the ASCII code values, the first Chinese character of each word in the
standard dictionary to the root node of the ternary search tree and the sibling nodes of the root node;
adding the non-first Chinese characters and the pinyin of each word in the standard dictionary to the child
nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
6. The character conversion method according to claim 1, characterized in that, before the step of
determining, in the ternary search tree, the target node corresponding to the text to be converted, the method
further comprises:
determining the first letter of each pinyin string in a standard dictionary, wherein each pinyin group includes
at least one pinyin letter, each pinyin string includes at least two pinyin groups, one pinyin group corresponds
to one Chinese character, and one pinyin string corresponds to one word;
adding, in the order of the first letters, the first pinyin group of each pinyin string in the standard
dictionary to the root node of the ternary search tree and the sibling nodes of the root node;
adding the non-first pinyin groups of each pinyin string in the standard dictionary, together with the word
corresponding to each pinyin string, to the child nodes of the root node of the ternary search tree and to the
child nodes of the sibling nodes of the root node.
7. A character conversion apparatus, characterized in that it comprises:
a receiving module, configured to receive text to be converted, the text to be converted being pinyin or
Chinese characters;
a first determining module, configured to determine, in a ternary search tree, a target node corresponding to
the text to be converted, the correspondence between Chinese characters and pinyin being stored in advance in
the nodes of the ternary search tree;
an extraction module, configured to extract, from the target node, the Chinese characters or pinyin
corresponding to the text to be converted;
an output module, configured to output the Chinese characters or pinyin corresponding to the text to be
converted.
8. The character conversion apparatus according to claim 7, characterized in that the apparatus further
comprises:
a judging module, configured to judge whether the text to be converted can be split into word segments;
a first execution module, configured to, when the text to be converted can be split into word segments, split
the text to be converted using a word segmentation algorithm to obtain a segmentation result, determine in the
ternary search tree a first designated node corresponding to a word segment in the segmentation result, extract
the Chinese characters or pinyin corresponding to the word segment from the first designated node, and output
the Chinese characters or pinyin corresponding to the word segment;
a second execution module, configured to trigger the first determining module when the text to be converted
cannot be split into word segments.
9. The character conversion apparatus according to claim 7, characterized in that the first determining
module comprises:
a first determining submodule, configured to, when the text to be converted is a word, determine, among the
root node of a character ternary search tree and the sibling nodes of the root node, a second designated node
whose ASCII code value is identical to that of the first Chinese character in the text to be converted;
a second determining submodule, configured to determine, among the child nodes of the second designated node,
a third designated node whose ASCII code values are identical to those of the remaining Chinese characters in
the text to be converted;
a third determining submodule, configured to determine the third designated node as the target node, the word
including at least two Chinese characters.
10. The character conversion apparatus according to claim 7, characterized in that the first determining
module comprises:
a fourth determining submodule, configured to, when the text to be converted is at least two pinyin groups,
determine, among the root node of a pinyin ternary search tree and the sibling nodes of the root node, a fourth
designated node identical to the first pinyin group in the text to be converted;
a fifth determining submodule, configured to determine, among the child nodes of the fourth designated node, a
fifth designated node identical to the remaining pinyin groups in the text to be converted;
a sixth determining submodule, configured to determine the fifth designated node as the target node, each of
the at least two pinyin groups corresponding to one Chinese character.
11. The character conversion apparatus according to claim 7, characterized in that the apparatus further
comprises:
a second determining module, configured to determine the ASCII code value corresponding to each word in a
standard dictionary;
a first adding module, configured to add, according to the magnitude of the ASCII code values, the first
Chinese character of each word in the standard dictionary to the root node of the ternary search tree and the
sibling nodes of the root node;
a second adding module, configured to add the non-first Chinese characters and the pinyin of each word in the
standard dictionary to the child nodes of the root node of the ternary search tree and to the child nodes of the
sibling nodes of the root node.
12. The character conversion apparatus according to claim 7, characterized in that the apparatus further
comprises:
a third determining module, configured to determine the first letter of each pinyin string in a standard
dictionary, wherein each pinyin group includes at least one pinyin letter, each pinyin string includes at least
two pinyin groups, one pinyin group corresponds to one Chinese character, and one pinyin string corresponds to
one word;
a third adding module, configured to add, in the order of the first letters, the first pinyin group of each
pinyin string in the standard dictionary to the root node of the ternary search tree and the sibling nodes of
the root node;
a fourth adding module, configured to add the non-first pinyin groups of each pinyin string in the standard
dictionary, together with the word corresponding to each pinyin string, to the child nodes of the root node of
the ternary search tree and to the child nodes of the sibling nodes of the root node.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610243297.3A CN105955986A (en) | 2016-04-18 | 2016-04-18 | Character converting method and apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610243297.3A CN105955986A (en) | 2016-04-18 | 2016-04-18 | Character converting method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105955986A true CN105955986A (en) | 2016-09-21 |
Family
ID=56917672
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610243297.3A (Pending; published as CN105955986A) | 2016-04-18 | 2016-04-18 | Character converting method and apparatus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105955986A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102521418A (en) * | 2011-12-31 | 2012-06-27 | 青岛海信宽带多媒体技术有限公司 | Pinyin storage structure and pinyin input method |
CN102867049A (en) * | 2012-09-10 | 2013-01-09 | 山东康威通信技术股份有限公司 | Chinese PINYIN quick word segmentation method based on word search tree |
CN102866781A (en) * | 2011-07-06 | 2013-01-09 | 哈尔滨工业大学 | Pinyin-to-character conversion method and pinyin-to-character conversion system |
CN103823814A (en) * | 2012-11-19 | 2014-05-28 | 腾讯科技(深圳)有限公司 | Information processing method and information processing device |
CN104252484A (en) * | 2013-06-28 | 2014-12-31 | 重庆新媒农信科技有限公司 | Pinyin error correction method and system |
CN104268157A (en) * | 2014-09-03 | 2015-01-07 | 乐视网信息技术(北京)股份有限公司 | Device and method for error correction in data search |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106897257A (en) * | 2017-02-23 | 2017-06-27 | 郑州云海信息技术有限公司 | The conversion method and device of a kind of ASCII character and character string based on LINUX platforms |
CN111737986A (en) * | 2020-05-15 | 2020-10-02 | 深圳市世强元件网络有限公司 | Search term recommendation method and system based on multi-way tree |
US11947608B2 (en) | 2020-05-15 | 2024-04-02 | Shenzhen Sekorm Component Network Co., Ltd | Search term recommendation method and system based on multi-branch tree |
CN113641731A (en) * | 2021-08-17 | 2021-11-12 | 成都知道创宇信息技术有限公司 | Fuzzy search optimization method and device, electronic equipment and readable storage medium |
CN113641731B (en) * | 2021-08-17 | 2023-05-02 | 成都知道创宇信息技术有限公司 | Fuzzy search optimization method, device, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20160921 |