CN105955986A - Character converting method and apparatus - Google Patents

Info

Publication number
CN105955986A
Authority
CN
China
Prior art keywords
node
converted
word
phonetic
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610243297.3A
Other languages
Chinese (zh)
Inventor
谢晓静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LeTV Holding Beijing Co Ltd
LeTV Information Technology Beijing Co Ltd
Original Assignee
LeTV Holding Beijing Co Ltd
LeTV Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LeTV Holding Beijing Co Ltd, LeTV Information Technology Beijing Co Ltd
Priority to CN201610243297.3A
Publication of CN105955986A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a character conversion method and apparatus. The method comprises: receiving a text to be converted; determining, in a ternary search tree, a target node corresponding to the text to be converted, where the correspondence between words and pinyin is stored in the nodes of the ternary search tree in advance; extracting, from the target node, the word or pinyin corresponding to the text to be converted; and outputting the word or pinyin corresponding to the text to be converted. While the target node is being determined, each node comparison in the ternary search tree can cut the remaining query workload in half. The solution provided by the invention can therefore quickly locate the target node corresponding to the text to be converted and obtain the corresponding word or pinyin from it, which improves query efficiency.

Description

Character conversion method and device
Technical field
Embodiments of the present invention relate to the field of communication technology, and in particular to a character conversion method and device.
Background
At present, converting between pinyin and words usually requires building a large dictionary in advance that records the correspondence between every word and its pinyin. Here, a word includes at least two Chinese characters.
When a user inputs a pinyin string, the server has to traverse the entire dictionary from beginning to end to find the words corresponding to that pinyin, so the query may take a long time. Likewise, when a user inputs a word, the server has to traverse the entire dictionary from beginning to end to find the pinyin corresponding to that word, which also takes a long time. Converting between words and pinyin with such a dictionary therefore has very low query efficiency.
How to improve the query efficiency of converting between pinyin and words has thus become a technical problem in urgent need of a solution.
Summary of the invention
The present invention provides a character conversion method and device to improve query efficiency.
According to a first aspect of the embodiments of the present invention, a character conversion method is provided, including:
receiving a text to be converted, the text to be converted being pinyin or a word;
determining, in a ternary search tree, a target node corresponding to the text to be converted, where the nodes of the ternary search tree store in advance the correspondence between words and pinyin;
extracting, from the target node, the word or pinyin corresponding to the text to be converted;
outputting the word or pinyin corresponding to the text to be converted.
Optionally, after the step of receiving the text to be converted, the method further includes:
judging whether the text to be converted can be split into word segments;
when the text to be converted can be split into word segments, splitting the text to be converted with a word segmentation algorithm to obtain a segmentation result, determining in the ternary search tree a first designated node corresponding to a segment in the segmentation result, extracting from the first designated node the word or pinyin corresponding to the segment, and outputting the word or pinyin corresponding to the segment;
when the text to be converted cannot be split into word segments, triggering the step of determining, in the ternary search tree, the target node corresponding to the text to be converted.
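The optional branch above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not name a segmentation algorithm, so forward maximum matching is swapped in as an assumption, a plain dict (`LEXICON`) stands in for the ternary search tree, and "can be split" is read as "splits into more than one known segment". All names are hypothetical.

```python
# Hypothetical stand-in for the tree lookup: word -> full pinyin spelling.
LEXICON = {
    "中国": "zhongguo",
    "民族": "minzu",
    "社会": "shehui",
    "群体": "qunti",
    "风格": "fengge",
}


def segment(text, lexicon, max_len=4):
    """Forward maximum matching (an assumed algorithm).

    Returns the list of segments, or None if some part of the text
    is not covered by the lexicon.
    """
    segments, i = [], 0
    while i < len(text):
        # Try the longest candidate first, then shrink it.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                segments.append(candidate)
                i += size
                break
        else:
            return None  # nothing matched at position i
    return segments


def convert(text, lexicon=LEXICON):
    """Convert per segment when the text splits, else look up the whole text."""
    segments = segment(text, lexicon)
    if segments and len(segments) > 1:
        return [lexicon[s] for s in segments]
    pinyin = lexicon.get(text)  # direct lookup of the unsplit text
    return [pinyin] if pinyin is not None else []
```

With this sketch, `convert("中国社会")` converts the two segments separately, while `convert("风格")` falls through to the direct lookup.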
Optionally, the step of determining, in the ternary search tree, the target node corresponding to the text to be converted includes:
when the text to be converted is a word, determining, among the root node of a word ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is identical to that of the first Chinese character in the text to be converted;
determining, among the child nodes of the second designated node, a third designated node whose ASCII code value is identical to that of the remaining Chinese characters in the text to be converted;
determining the third designated node as the target node, where a word includes at least two Chinese characters.
Optionally, the step of determining, in the ternary search tree, the target node corresponding to the text to be converted includes:
when the text to be converted is at least two pinyin syllables, determining, among the root node of a pinyin ternary search tree and the sibling nodes of the root node, a fourth designated node identical to the first pinyin syllable in the text to be converted;
determining, among the child nodes of the fourth designated node, a fifth designated node identical to the remaining pinyin syllables in the text to be converted;
determining the fifth designated node as the target node, where each of the at least two pinyin syllables corresponds to one Chinese character.
Optionally, before the step of determining, in the ternary search tree, the target node corresponding to the text to be converted, the method further includes:
determining the ASCII code value corresponding to each word in a standard dictionary;
adding, according to the magnitude of the ASCII code values, the first Chinese character of each word in the standard dictionary to the root node of the ternary search tree and the sibling nodes of the root node;
adding the non-first Chinese characters and the pinyin of each word in the standard dictionary to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
Optionally, before the step of determining, in the ternary search tree, the target node corresponding to the text to be converted, the method further includes:
determining the first letter of each pinyin string in the standard dictionary, where each pinyin syllable includes at least one pinyin letter, each pinyin string includes at least two pinyin syllables, one pinyin syllable corresponds to one Chinese character, and one pinyin string corresponds to one word;
adding, according to the order of the first letters, the first pinyin syllable of each pinyin string in the standard dictionary to the root node of the ternary search tree and the sibling nodes of the root node;
adding the non-first pinyin syllables of each pinyin string in the standard dictionary, together with the word corresponding to that pinyin string, to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
According to a second aspect of the embodiments of the present invention, a character conversion device is provided, including:
a receiving module, configured to receive a text to be converted, the text to be converted being pinyin or a word;
a first determining module, configured to determine, in a ternary search tree, a target node corresponding to the text to be converted, where the nodes of the ternary search tree store in advance the correspondence between words and pinyin;
an extraction module, configured to extract, from the target node, the word or pinyin corresponding to the text to be converted;
an output module, configured to output the word or pinyin corresponding to the text to be converted.
Optionally, the device further includes:
a judging module, configured to judge whether the text to be converted can be split into word segments;
a first execution module, configured to, when the text to be converted can be split into word segments, split the text to be converted with a word segmentation algorithm to obtain a segmentation result, determine in the ternary search tree a first designated node corresponding to a segment in the segmentation result, extract from the first designated node the word or pinyin corresponding to the segment, and output the word or pinyin corresponding to the segment;
a second execution module, configured to trigger the first determining module when the text to be converted cannot be split into word segments.
Optionally, the first determining module includes:
a first determining submodule, configured to, when the text to be converted is a word, determine, among the root node of a word ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is identical to that of the first Chinese character in the text to be converted;
a second determining submodule, configured to determine, among the child nodes of the second designated node, a third designated node whose ASCII code value is identical to that of the remaining Chinese characters in the text to be converted;
a third determining submodule, configured to determine the third designated node as the target node, where a word includes at least two Chinese characters.
Optionally, the first determining module includes:
a fourth determining submodule, configured to, when the text to be converted is at least two pinyin syllables, determine, among the root node of a pinyin ternary search tree and the sibling nodes of the root node, a fourth designated node identical to the first pinyin syllable in the text to be converted;
a fifth determining submodule, configured to determine, among the child nodes of the fourth designated node, a fifth designated node identical to the remaining pinyin syllables in the text to be converted;
a sixth determining submodule, configured to determine the fifth designated node as the target node, where each of the at least two pinyin syllables corresponds to one Chinese character.
Optionally, the device further includes:
a second determining module, configured to determine the ASCII code value corresponding to each word in a standard dictionary;
a first adding module, configured to add, according to the magnitude of the ASCII code values, the first Chinese character of each word in the standard dictionary to the root node of the ternary search tree and the sibling nodes of the root node;
a second adding module, configured to add the non-first Chinese characters and the pinyin of each word in the standard dictionary to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
Optionally, the device further includes:
a third determining module, configured to determine the first letter of each pinyin string in the standard dictionary, where each pinyin syllable includes at least one pinyin letter, each pinyin string includes at least two pinyin syllables, one pinyin syllable corresponds to one Chinese character, and one pinyin string corresponds to one word;
a third adding module, configured to add, according to the order of the first letters, the first pinyin syllable of each pinyin string in the standard dictionary to the root node of the ternary search tree and the sibling nodes of the root node;
a fourth adding module, configured to add the non-first pinyin syllables of each pinyin string in the standard dictionary, together with the word corresponding to that pinyin string, to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
Compared with the prior art, the technical solution provided by this embodiment has the following advantages and features:
In the solution provided by the present invention, the nodes of the ternary search tree store in advance the correspondence between words and pinyin. After a text to be converted is received, the target node corresponding to the text can be determined in the ternary search tree. If the text to be converted is pinyin, the word corresponding to the pinyin can be extracted from the target node; if the text to be converted is a word, the pinyin corresponding to the word can be extracted from the target node, so that pinyin and words can be converted into each other. While the target node is being determined, each comparison against a node in the ternary search tree cuts the remaining query workload in half. The solution can therefore quickly locate the target node corresponding to the text to be converted and obtain the corresponding word or pinyin from it, which improves query efficiency.
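To put the halving claim in numbers, the sketch below compares the worst-case number of key comparisons for a front-to-back scan with that for a search among balanced first-level nodes. It rests on an assumption the source does not state: that the root and its sibling nodes form a perfectly balanced binary search tree. The function names are illustrative, and 6,000 is the upper dictionary size mentioned later in the description.

```python
def worst_case_linear(n: int) -> int:
    """Comparisons needed when scanning n entries from beginning to end."""
    return n


def worst_case_balanced(n: int) -> int:
    """Depth of a perfectly balanced binary search tree with n keys.

    Each comparison discards half of the remaining keys, so the depth is
    floor(log2(n)) + 1, which is exactly n.bit_length() for n > 0.
    """
    return n.bit_length()


print(worst_case_linear(6000), worst_case_balanced(6000))
```

If keys are inserted in sorted order rather than median-first, the tree degenerates into a chain and the halving no longer holds.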
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below obviously show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a character conversion method provided by an embodiment of the present invention.
Fig. 2 is a schematic diagram of the word ternary search tree provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of the pinyin ternary search tree provided by an embodiment of the present invention.
Fig. 4 is a flowchart of another character conversion method provided by an embodiment of the present invention.
Fig. 5 is a schematic diagram of a character conversion device provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of a character conversion method provided by an embodiment of the present invention. The method can quickly locate, in a ternary search tree, the target node corresponding to a text to be converted and obtain the corresponding word or pinyin from that node, which improves query efficiency. The method includes the following steps.
Step S11: receive a text to be converted.
The method provided by the embodiment of the present invention can be applied to a terminal on which application software is installed. The terminal may be a device such as a smartphone, tablet computer, notebook computer or desktop computer.
The embodiment of the present invention suits many application scenarios. For example, it can be applied in an e-book to convert between the pinyin and words in the e-book; as another example, it can be applied in a search engine to convert between the pinyin and words typed into the search box.
The embodiment of the present invention is of course not limited to the scenarios mentioned above and can be applied in any other scenario that requires conversion between pinyin and words.
The text to be converted may be the pinyin initials of a word, the full pinyin spelling of a word, or a word. Here, a word includes at least two Chinese characters.
Step S12: determine, in a ternary search tree, the target node corresponding to the text to be converted.
Before step S12 is performed, in order to convert words into pinyin, a word ternary search tree has to be built, and the words in the standard dictionary together with their pinyin are added to its nodes. Likewise, in order to convert pinyin into words, a pinyin ternary search tree has to be built, and the pinyin in the standard dictionary together with the corresponding words are added to its nodes.
The way the word ternary search tree is built is described below.
Building the word ternary search tree includes the following steps. First, determine the ASCII code value corresponding to each word in the standard dictionary. Then, according to the magnitude of the ASCII code values, add the first Chinese character of each word in the standard dictionary to the root node of the word ternary search tree and the sibling nodes of the root node. Finally, add the non-first Chinese characters and the pinyin of each word in the standard dictionary to the child nodes of the root node and to the child nodes of the sibling nodes of the root node.
The root node of the word ternary search tree stores the Chinese character whose ASCII code value is the median among the first Chinese characters of the words in the standard dictionary. First characters with a smaller ASCII code value are placed in the left branch of the root node, and first characters with a larger ASCII code value are placed in the right branch.
To illustrate the above process of building the word ternary search tree more clearly, an example is given below.
Refer to Table 1, which shows the words stored in the standard dictionary, the pinyin initials of each word and the full pinyin spelling of each word.
Table 1
For example, referring to Fig. 2 together with Table 1, assume the standard dictionary stores the words 中国 ("China"), 民族 ("nationality"), 社会 ("society"), 群体 ("group") and 风格 ("style"), together with their pinyin initials and full pinyin spellings. Assume that the first Chinese characters of these words, in descending order of ASCII code value, are 群, 风, 中, 社 and 民. The character with the median value, 中, is added to the root node of the word ternary search tree. 群 and 风, whose values are greater than that of 中, are added to the right branch of the root node; because the value of 群 is greater than that of 风, 群 is added to the right branch of the node holding 风. 民 and 社, whose values are less than that of 中, are added to the left branch of the root node; because the value of 社 is greater than that of 民, 社 is added to the right branch of the node holding 民. Finally, the non-first characters and pinyin of each word, namely (国, zg, zhongguo), (族, mz, minzu), (格, fg, fengge), (体, qt, qunti) and (会, sh, shehui), are added to the child nodes of the root node and to the child nodes of the sibling nodes of the root node.
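The construction in this example can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the patent's code: the five words are those implied by the pinyin in the example (中国, 民族, 社会, 群体, 风格); `RANK` is a hypothetical table that reproduces the example's assumed ordering (real character code points would order these characters differently); and the child level is kept in a dict for brevity instead of separate child nodes.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical ranks mirroring the example's assumed code-value order:
# 民 < 社 < 中 < 风 < 群.
RANK = {"民": 0, "社": 1, "中": 2, "风": 3, "群": 4}


@dataclass
class WordNode:
    char: str                             # first character of the word
    left: Optional["WordNode"] = None     # sibling with a smaller value
    right: Optional["WordNode"] = None    # sibling with a larger value
    # remaining characters -> (pinyin initials, full pinyin spelling)
    children: dict = field(default_factory=dict)


def insert(root, word, initials, pinyin):
    """Insert one word; returns the (possibly new) root node."""
    head, rest = word[0], word[1:]
    if root is None:
        root = WordNode(head)
    node = root
    while node.char != head:
        if RANK[head] < RANK[node.char]:
            if node.left is None:
                node.left = WordNode(head)
            node = node.left
        else:
            if node.right is None:
                node.right = WordNode(head)
            node = node.right
    node.children[rest] = (initials, pinyin)
    return root


# The median character 中 goes in first so that it becomes the root.
ENTRIES = [
    ("中国", "zg", "zhongguo"),
    ("风格", "fg", "fengge"),
    ("群体", "qt", "qunti"),
    ("民族", "mz", "minzu"),
    ("社会", "sh", "shehui"),
]

root = None
for entry in ENTRIES:
    root = insert(root, *entry)
```

Inserting the median first is what makes 中 the root; a different insertion order would yield a different, possibly unbalanced, shape.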
The way the pinyin ternary search tree is built is described below.
Building the pinyin ternary search tree includes the following steps. First, determine the first letter of each pinyin string in the standard dictionary, where each pinyin syllable includes at least one pinyin letter, each pinyin string includes at least two pinyin syllables, one pinyin syllable corresponds to one Chinese character, and one pinyin string corresponds to one word. Then, according to the order of the first letters, add the first pinyin syllable of each pinyin string in the standard dictionary to the root node of the pinyin ternary search tree and the sibling nodes of the root node. Finally, add the non-first pinyin syllables of each pinyin string, together with the word corresponding to that pinyin string, to the child nodes of the root node and to the child nodes of the sibling nodes of the root node.
The root node of the pinyin ternary search tree stores the first pinyin syllable of the pinyin string whose first letter is the median in the standard dictionary. First syllables of other pinyin strings whose first letter comes before the root's first letter in alphabetical order are placed in the left branch of the root node, and those whose first letter comes after it are placed in the right branch.
To illustrate the above process of building the pinyin ternary search tree more clearly, an example is given below.
For example, referring to Fig. 3 together with Table 1, assume the standard dictionary stores the words 中国 ("China"), 民族 ("nationality"), 社会 ("society"), 群体 ("group") and 风格 ("style"), together with their pinyin initials and full pinyin spellings. The first letters of the pinyin strings in the standard dictionary, in descending order, are "z", "s", "q", "m" and "f", so "qun", the first syllable of the pinyin string "qunti" whose first letter "q" is the median, is added to the root node of the pinyin ternary search tree. Because the first letter of "zhong" comes after the first letter of "she" in alphabetical order, "zhong" is added to the right branch of the node holding "she". Because the first letter of "min" comes after the first letter of "feng", "min" is added to the right branch of the node holding "feng". Finally, the non-first syllables of each pinyin string and the corresponding word, namely (guo, zg, 中国), (hui, sh, 社会), (ti, qt, 群体), (zu, mz, 民族) and (ge, fg, 风格), are added to the child nodes of the root node of the pinyin ternary search tree and to the child nodes of the sibling nodes of the root node.
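One way to obtain the shape described in this example is to sort the first syllables and recursively place the median at each level. The sketch below does that; it is an assumption-laden illustration rather than the patent's procedure. It compares whole syllables as strings (which orders by first letter first), takes the lower median when a level has an even number of keys, and keeps the child level in a dict. The names `PinyinNode`, `build` and `build_pinyin_tree` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PinyinNode:
    syllable: str                           # first syllable, e.g. "qun"
    left: Optional["PinyinNode"] = None     # alphabetically earlier sibling
    right: Optional["PinyinNode"] = None    # alphabetically later sibling
    # remaining syllables -> list of (word, pinyin initials)
    children: dict = field(default_factory=dict)


def build(groups):
    """Build a balanced tree from a sorted list of (syllable, children)."""
    if not groups:
        return None
    mid = (len(groups) - 1) // 2            # lower median becomes the parent
    syllable, children = groups[mid]
    return PinyinNode(syllable, build(groups[:mid]), build(groups[mid + 1:]), children)


def build_pinyin_tree(entries):
    """entries: iterable of (syllable tuple, word, pinyin initials)."""
    grouped = {}
    for syllables, word, initials in entries:
        first, rest = syllables[0], syllables[1:]
        grouped.setdefault(first, {}).setdefault(rest, []).append((word, initials))
    return build(sorted(grouped.items(), key=lambda item: item[0]))


ENTRIES = [
    (("zhong", "guo"), "中国", "zg"),
    (("min", "zu"), "民族", "mz"),
    (("she", "hui"), "社会", "sh"),
    (("qun", "ti"), "群体", "qt"),
    (("feng", "ge"), "风格", "fg"),
]

tree = build_pinyin_tree(ENTRIES)
```

Each child entry holds a list because several words can share one pinyin string.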
Because a standard dictionary generally contains 5,000 to 6,000 words and their pinyin, the examples above illustrate only the principle of the embodiment of the present invention and do not list all of those words and their pinyin.
Once the ternary search tree has been built, the target node corresponding to the text to be converted can be determined in it. The process of determining this target node is briefly described below.
If the text to be converted is a word, the word ternary search tree described above is used. The steps for determining the target node in the word ternary search tree are as follows. First, when the text to be converted is a word, determine, among the root node of the word ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is identical to that of the first Chinese character in the text to be converted. Then, determine, among the child nodes of the second designated node, a third designated node whose ASCII code value is identical to that of the remaining Chinese characters in the text to be converted. Finally, determine the third designated node as the target node. A word includes at least two Chinese characters.
To illustrate the above process of determining the target node in the word ternary search tree more clearly, an example is given below.
For example, referring to Fig. 2 together with Table 1, the word ternary search tree has been built in advance, and the first Chinese characters of the words in the standard dictionary, in descending order of ASCII code value, are 群, 风, 中, 社 and 民. Assume the text to be converted is the word 风格 ("style"). The ASCII code value of 风 is first compared with that of 中, which is stored in the root node of the word ternary search tree. Because the value of 风 is greater than that of 中, the comparison continues with the right sibling node of the root node. The value of the first character 风 in the text to be converted is identical to the value of 风 stored in the right sibling node, so the comparison continues by checking whether the value of the second character 格 in the text to be converted is identical to the value of 格 stored in the child node of that right sibling node. The two values are identical, so the child node of the right sibling node of the root node is determined as the target node, and the full pinyin spelling "fengge" and the pinyin initials "fg" corresponding to the text 风格 can be extracted from it.
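The lookup walked through above can be sketched as follows. The tree is written out by hand in the shape the example describes, using plain dicts; `RANK` is a hypothetical table reproducing the example's assumed ordering of the first characters, and the child level is again a dict rather than separate child nodes. None of these names come from the source.

```python
# Hypothetical ranks mirroring the example's assumed order: 民 < 社 < 中 < 风 < 群.
RANK = {"民": 0, "社": 1, "中": 2, "风": 3, "群": 4}


def make_node(char, children):
    """One first-level node: a first character plus its child entries."""
    return {"char": char, "left": None, "right": None, "children": children}


# Tree shape from the example: 中 at the root, 风 then 群 to the right,
# 民 then 社 to the left.
TREE = make_node("中", {"国": ("zg", "zhongguo")})
TREE["right"] = make_node("风", {"格": ("fg", "fengge")})
TREE["right"]["right"] = make_node("群", {"体": ("qt", "qunti")})
TREE["left"] = make_node("民", {"族": ("mz", "minzu")})
TREE["left"]["right"] = make_node("社", {"会": ("sh", "shehui")})


def word_to_pinyin(tree, word):
    """Return (initials, full pinyin) for a word, or None if it is absent."""
    head, rest = word[0], word[1:]
    if head not in RANK:
        return None
    node = tree
    # Step 1: find the first character among the root and its siblings.
    while node is not None and node["char"] != head:
        node = node["left"] if RANK[head] < RANK[node["char"]] else node["right"]
    if node is None:
        return None
    # Step 2: match the remaining characters among that node's children.
    return node["children"].get(rest)
```

Looking up 风格 takes two first-level comparisons (中, then 风) before the child entry 格 is matched.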
If the text to be converted is pinyin, the pinyin ternary search tree described above is used. The steps for determining the target node in the pinyin ternary search tree are as follows. First, when the text to be converted is at least two pinyin syllables, determine, among the root node of the pinyin ternary search tree and the sibling nodes of the root node, a fourth designated node identical to the first pinyin syllable in the text to be converted. Then, determine, among the child nodes of the fourth designated node, a fifth designated node identical to the remaining pinyin syllables in the text to be converted. Finally, determine the fifth designated node as the target node. Each of the at least two pinyin syllables corresponds to one Chinese character.
To illustrate the above process of determining the target node in the pinyin ternary search tree more clearly, an example is given below.
For example, referring to Fig. 3 together with Table 1, the pinyin ternary search tree has been built in advance, and the first letters of the pinyin strings in the standard dictionary, in descending order, are "z", "s", "q", "m" and "f". Assume the text to be converted is the pinyin "fengge". The first letter "f" of the first syllable in the text to be converted is first compared, in alphabetical order, with the first letter "q" stored in the root node of the pinyin ternary search tree. Because "f" comes before "q", the comparison continues with the left sibling node of the root node. The first letter "f" is identical to the first letter "f" stored in the left sibling node, so the comparison continues by checking whether the first syllable "feng" in the text to be converted is identical to the syllable "feng" stored in that node. They are identical, so the comparison continues by checking whether the second syllable "ge" in the text to be converted is identical to the pinyin stored in the child node of the left sibling node. They are identical, so the child node of the left sibling node of the root node is determined as the target node, and the word 风格 ("style") and the pinyin initials "fg" corresponding to the text "fengge" can be extracted from it.
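The pinyin lookup above can be sketched in the same spirit. The tree is written out by hand in the shape the example describes; the function also returns the first-level nodes it visited so the path can be inspected. As assumptions not taken from the source, the input is already split into syllables, whole syllables are compared as strings (which orders by first letter first), and each child entry holds a single word.

```python
def make_node(syllable, children):
    """One first-level node: a first syllable plus its child entries."""
    return {"syllable": syllable, "left": None, "right": None, "children": children}


# Tree shape from the example: "qun" at the root, "feng" then "min" on the
# left, "she" then "zhong" on the right.
ROOT = make_node("qun", {("ti",): ("群体", "qt")})
ROOT["left"] = make_node("feng", {("ge",): ("风格", "fg")})
ROOT["left"]["right"] = make_node("min", {("zu",): ("民族", "mz")})
ROOT["right"] = make_node("she", {("hui",): ("社会", "sh")})
ROOT["right"]["right"] = make_node("zhong", {("guo",): ("中国", "zg")})


def pinyin_to_word(root, syllables):
    """Return ((word, initials) or None, list of first-level syllables visited)."""
    first, rest = syllables[0], tuple(syllables[1:])
    visited = []
    current = root
    while current is not None:
        visited.append(current["syllable"])
        if first == current["syllable"]:
            # First syllable matched: look for the rest among the children.
            return current["children"].get(rest), visited
        current = current["left"] if first < current["syllable"] else current["right"]
    return None, visited


result, path = pinyin_to_word(ROOT, ["feng", "ge"])
```

For "fengge" the visited path is the root "qun" followed by its left sibling "feng", matching the walk-through in the example.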
Step S13: extract, from the target node, the word or pinyin corresponding to the text to be converted.
After the target node corresponding to the text to be converted has been determined in the ternary search tree, the word or pinyin corresponding to the text can be extracted from the target node. If the text to be converted is a word, the pinyin corresponding to it is extracted from the target node; if the text to be converted is pinyin, the word corresponding to it is extracted from the target node.
If the text to be converted is pinyin and this pinyin can correspond to multiple words, the following steps are used to extract the corresponding words from the target node. First, when the target node contains at least two words, the usage frequency of each word in the target node is determined. Then, the words in the target node are sorted by usage frequency to obtain a sorted result. Finally, each word in the sorted result is extracted.
That is, if the target node contains at least two words, the usage frequencies of these words are determined, the words are sorted by usage frequency, and the sorted words are extracted. The user then sees the words ordered from the highest usage frequency to the lowest and can quickly locate a frequently used word.
If the text to be converted is a word and this word can correspond to at least two pinyin readings, the following steps are used to extract the corresponding pinyin from the target node. First, when the target node contains at least two pinyin readings, the usage frequency of each pinyin reading in the target node is determined; each of the at least two pinyin readings corresponds to a Chinese character. Then, the pinyin readings in the target node are sorted by usage frequency to obtain a sorted result. Finally, each pinyin reading in the sorted result is extracted.
That is, if the target node contains at least two pinyin readings, the usage frequency of each reading is determined, the readings are sorted by usage frequency, and the sorted readings are extracted. The user then sees the pinyin readings ordered from the highest usage frequency to the lowest and can quickly locate a frequently used reading.
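The frequency-based ordering described in the preceding paragraphs can be sketched as follows. The function name `rank_by_frequency`, the candidate words, and the usage counts are hypothetical illustrations; the source does not specify how usage frequencies are collected or stored.

```python
def rank_by_frequency(candidates, frequency):
    """Sort candidates from the highest usage frequency to the lowest.

    Candidates missing from the frequency table count as 0. Python's sort is
    stable, so candidates with equal frequency keep their original order.
    """
    return sorted(candidates, key=lambda c: frequency.get(c, 0), reverse=True)


# Hypothetical homophones stored in one target node, with made-up counts.
candidates = ["时事", "事实", "实施"]
usage = {"事实": 120, "实施": 80, "时事": 15}

ranked = rank_by_frequency(candidates, usage)
print(ranked)
```

The same helper applies unchanged when the candidates are pinyin readings of one word instead of words sharing one pinyin.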
Step S14: output the word or pinyin corresponding to the text to be converted.
After the word or pinyin corresponding to the text to be converted has been extracted from the target node, it is output so that the user can see the word or pinyin corresponding to the text to be converted.
In the embodiment shown in Fig. 1, the correspondence between words and pinyin is stored in advance in the nodes of the ternary search tree. After the text to be converted is received, the target node corresponding to it can be determined in the ternary search tree. If the text to be converted is pinyin, the word corresponding to the pinyin is extracted from the target node; if the text to be converted is a word, the pinyin corresponding to the word is extracted from the target node, so that pinyin and words can be converted into each other. While the target node is being determined, each node visited in the ternary search tree for the text to be converted can cut the remaining query workload by half, so the solution provided by the present invention can quickly locate the target node corresponding to the text to be converted and obtain the corresponding word or pinyin from that node, thereby improving query efficiency.
Referring to Fig. 4, in other embodiments of the present invention, the method provided by the embodiments of the present invention may further include the following steps:
Step S15: judge whether the text to be converted can be split into word segments. When the text to be converted can be split into word segments, step S16 is triggered; when it cannot, step S12 is triggered.
Step S16: split the text to be converted with a segmentation algorithm to obtain a segmentation result, determine in the ternary search tree a first designated node corresponding to a segment in the segmentation result, extract the word or pinyin corresponding to the segment from the first designated node, and output the word or pinyin corresponding to the segment.
Many segmentation algorithms exist. For example, the segmentation algorithm may be reverse maximum matching or forward maximum matching; other segmentation algorithms may of course be used, and the present invention does not limit the type of segmentation algorithm.
In some cases, the words or sentences in the text to be converted may be long. Determining the target node for the whole text directly in the ternary search tree would then make the query inefficient. A segmentation algorithm is therefore used to split a longer word or sentence into segments that cannot be subdivided further, which improves the efficiency of determining the target node corresponding to the text to be converted in the ternary search tree.
After the text to be converted is obtained, it is judged whether the text can be split into word segments. If the text to be converted is a word that can be split, the segmentation algorithm is used to split it and obtain a segmentation result, a first designated node corresponding to a segment in the segmentation result is determined in the ternary search tree, the pinyin corresponding to the segment is extracted from the first designated node, and the pinyin corresponding to the segment is output. If the text to be converted is a word that cannot be split, step S12 is triggered, and the target node corresponding to the text to be converted is determined directly in the ternary search tree.
Similarly, if the text to be converted is pinyin that can be split, the segmentation algorithm is used to split it and obtain a segmentation result, a first designated node corresponding to a segment in the segmentation result is determined in the ternary search tree, the word corresponding to the segment is extracted from the first designated node, and finally the word corresponding to the segment is output.
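Reverse maximum matching, one of the segmentation algorithms named above, can be sketched as follows. This is an illustrative sketch only: the function name `backward_max_match`, the `max_len` limit, and the sample vocabulary are assumptions, and a single unmatched character is emitted as its own segment. It segments Chinese text; segmenting a pinyin string would additionally need a table of valid syllables, which is not shown.

```python
def backward_max_match(text, vocabulary, max_len=4):
    """Split text by scanning from the end and taking the longest known word."""
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest candidate first, then shorten it one character at a time.
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                end -= size
                break
    tokens.reverse()  # segments were collected from right to left
    return tokens


vocabulary = {"流行", "音乐", "风格"}  # hypothetical dictionary words
tokens = backward_max_match("流行音乐风格", vocabulary)
print(tokens)
```

Each resulting segment can then be looked up in the ternary search tree on its own, which keeps every individual lookup short.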
Fig. 5 is a schematic diagram of a character conversion apparatus provided by an embodiment of the present invention. Referring to Fig. 5, the apparatus includes a receiving module 11, a first determining module 12, an extraction module 13, and an output module 14, wherein:
The receiving module 11 is configured to receive text to be converted, the text to be converted being pinyin or a word.
The first determining module 12 is configured to determine, in a ternary search tree, a target node corresponding to the text to be converted, wherein a correspondence between words and pinyin is stored in advance in the nodes of the ternary search tree.
The extraction module 13 is configured to extract, from the target node, the word or pinyin corresponding to the text to be converted.
The output module 14 is configured to output the word or pinyin corresponding to the text to be converted.
Optionally, the character conversion apparatus provided by the embodiments of the present invention may further include the following modules. A judging module is configured to judge whether the text to be converted can be split into word segments. A first execution module is configured to, when the text to be converted can be split into word segments, split the text with a segmentation algorithm to obtain a segmentation result, determine in the ternary search tree a first designated node corresponding to a segment in the segmentation result, extract the word or pinyin corresponding to the segment from the first designated node, and output the word or pinyin corresponding to the segment. A second execution module is configured to trigger the first determining module 12 when the text to be converted cannot be split into word segments.
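The branching between the judging module and the two execution modules can be sketched as a single function. Everything here is a hypothetical stand-in: `convert`, the dictionary-backed `lookup`, and the fixed-width `segment` helper only illustrate the control flow, and a real system would plug in the ternary search tree lookup and an actual segmentation algorithm.

```python
def convert(text, lookup, segment):
    """Look up each segment when the text can be split, else the whole text."""
    pieces = segment(text)
    if len(pieces) > 1:
        # Text can be split: convert segment by segment.
        return [(piece, lookup(piece)) for piece in pieces]
    # Text cannot be split: fall back to a lookup of the whole text.
    return [(text, lookup(text))]


# Hypothetical stand-ins for the tree lookup and the segmentation algorithm.
table = {"流行": ["liuxing"], "音乐": ["yinyue"]}


def lookup(piece):
    return table.get(piece, [])


def segment(text):
    # Naive two-character split, used only to exercise both branches.
    return [text[i:i + 2] for i in range(0, len(text), 2)]


result = convert("流行音乐", lookup, segment)
print(result)
```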
Optionally, the first determining module 12 may further include the following submodules. A first determining submodule is configured to, when the text to be converted is a word, determine, among the root node of the ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is the same as that of the first Chinese character in the text to be converted. A second determining submodule is configured to determine, among the child nodes of the second designated node, a third designated node whose ASCII code value is the same as that of the remaining Chinese characters in the text to be converted. A third determining submodule is configured to determine the third designated node as the target node. A word includes at least two Chinese characters.
Optionally, the first determining module 12 may further include the following submodules. A fourth determining submodule is configured to, when the text to be converted consists of at least two pinyin syllables, determine, among the root node of the ternary search tree and the sibling nodes of the root node, a fourth designated node that matches the first pinyin syllable in the text to be converted. A fifth determining submodule is configured to determine, among the child nodes of the fourth designated node, a fifth designated node that matches the remaining pinyin syllables in the text to be converted. A sixth determining submodule is configured to determine the fifth designated node as the target node. Each of the at least two pinyin syllables corresponds to one Chinese character.
Optionally, the character conversion apparatus provided by the embodiments of the present invention may further include the following modules. A second determining module is configured to determine the ASCII code value corresponding to each word in the standard dictionary. A first adding module is configured to add the first Chinese character of each word in the standard dictionary, according to the magnitude of its ASCII code value, to the root node of the ternary search tree and the sibling nodes of the root node. A second adding module is configured to add the non-first Chinese characters and the pinyin of each word in the standard dictionary to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
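Building and querying the word tree can be sketched as below. Chinese characters lie outside the 7-bit ASCII range, so this sketch assumes the Unicode code point returned by Python's `ord()` as a stand-in for the code value the text refers to. The `CharNode` class, the `add_word` and `pinyin_of` helpers, and the sample entries are hypothetical.

```python
class CharNode:
    def __init__(self, char):
        self.char = char
        self.code = ord(char)   # code value used to order sibling nodes
        self.left = None        # sibling with a smaller code value
        self.right = None       # sibling with a larger code value
        self.child = None       # level holding the next character
        self.pinyin = []        # pinyin of the word that ends at this node


def add_word(node, word, pinyin, i=0):
    """Add a non-empty word and its pinyin, ordering siblings by code value."""
    code = ord(word[i])
    if node is None:
        node = CharNode(word[i])
    if code < node.code:
        node.left = add_word(node.left, word, pinyin, i)
    elif code > node.code:
        node.right = add_word(node.right, word, pinyin, i)
    elif i + 1 < len(word):
        node.child = add_word(node.child, word, pinyin, i + 1)
    else:
        node.pinyin.append(pinyin)
    return node


def pinyin_of(node, word):
    """Return the pinyin stored for word, or an empty list if it is absent."""
    if not word:
        return []
    i = 0
    while node is not None:
        code = ord(word[i])
        if code < node.code:
            node = node.left
        elif code > node.code:
            node = node.right
        elif i + 1 < len(word):
            node, i = node.child, i + 1
        else:
            return list(node.pinyin)
    return []


# Hypothetical standard-dictionary entries.
tree = None
for word, pinyin in [("风格", "fengge"), ("风景", "fengjing"),
                     ("音乐", "yinyue"), ("银行", "yinhang")]:
    tree = add_word(tree, word, pinyin)

print(pinyin_of(tree, "风格"))
```

Words that share a first character, such as the two sample words beginning with "风", share one first-level node and differ only at the child level.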
Optionally, the character conversion apparatus provided by the embodiments of the present invention may further include the following modules. A third determining module is configured to determine the initial of each pinyin group in the standard dictionary, wherein each pinyin syllable includes at least one pinyin letter, each pinyin group includes at least two pinyin syllables, one pinyin syllable corresponds to one Chinese character, and one pinyin group corresponds to one word. A third adding module is configured to add the first pinyin syllable of each pinyin group in the standard dictionary, in the order of the initials, to the root node of the ternary search tree and the sibling nodes of the root node. A fourth adding module is configured to add the non-first pinyin syllables of each pinyin group in the standard dictionary, together with the word corresponding to each pinyin group, to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
As for the apparatus in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. A character conversion method, characterized by comprising:
receiving text to be converted, the text to be converted being pinyin or a word;
determining, in a ternary search tree, a target node corresponding to the text to be converted, wherein a correspondence between words and pinyin is stored in advance in the nodes of the ternary search tree;
extracting, from the target node, the word or pinyin corresponding to the text to be converted; and
outputting the word or pinyin corresponding to the text to be converted.
2. The character conversion method according to claim 1, characterized in that, after the step of receiving the text to be converted, the method further comprises:
judging whether the text to be converted can be split into word segments;
when the text to be converted can be split into word segments, splitting the text to be converted with a segmentation algorithm to obtain a segmentation result, determining in the ternary search tree a first designated node corresponding to a segment in the segmentation result, extracting the word or pinyin corresponding to the segment from the first designated node, and outputting the word or pinyin corresponding to the segment; and
when the text to be converted cannot be split into word segments, triggering the step of determining, in the ternary search tree, the target node corresponding to the text to be converted.
3. The character conversion method according to claim 1, characterized in that the step of determining, in the ternary search tree, the target node corresponding to the text to be converted comprises:
when the text to be converted is a word, determining, among the root node of a word ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is the same as that of the first Chinese character in the text to be converted;
determining, among the child nodes of the second designated node, a third designated node whose ASCII code value is the same as that of the remaining Chinese characters in the text to be converted; and
determining the third designated node as the target node, wherein the word includes at least two Chinese characters.
4. The character conversion method according to claim 1, characterized in that the step of determining, in the ternary search tree, the target node corresponding to the text to be converted comprises:
when the text to be converted consists of at least two pinyin syllables, determining, among the root node of a pinyin ternary search tree and the sibling nodes of the root node, a fourth designated node that matches the first pinyin syllable in the text to be converted;
determining, among the child nodes of the fourth designated node, a fifth designated node that matches the remaining pinyin syllables in the text to be converted; and
determining the fifth designated node as the target node, wherein each of the at least two pinyin syllables corresponds to one Chinese character.
5. The character conversion method according to claim 1, characterized in that, before the step of determining, in the ternary search tree, the target node corresponding to the text to be converted, the method further comprises:
determining the ASCII code value corresponding to each word in a standard dictionary;
adding the first Chinese character of each word in the standard dictionary, according to the magnitude of its ASCII code value, to the root node of the ternary search tree and the sibling nodes of the root node; and
adding the non-first Chinese characters and the pinyin of each word in the standard dictionary to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
6. The character conversion method according to claim 1, characterized in that, before the step of determining, in the ternary search tree, the target node corresponding to the text to be converted, the method further comprises:
determining the initial of each pinyin group in a standard dictionary, wherein each pinyin syllable includes at least one pinyin letter, each pinyin group includes at least two pinyin syllables, one pinyin syllable corresponds to one Chinese character, and one pinyin group corresponds to one word;
adding the first pinyin syllable of each pinyin group in the standard dictionary, in the order of the initials, to the root node of the ternary search tree and the sibling nodes of the root node; and
adding the non-first pinyin syllables of each pinyin group in the standard dictionary, together with the word corresponding to each pinyin group, to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
7. A character conversion apparatus, characterized by comprising:
a receiving module, configured to receive text to be converted, the text to be converted being pinyin or a word;
a first determining module, configured to determine, in a ternary search tree, a target node corresponding to the text to be converted, wherein a correspondence between words and pinyin is stored in advance in the nodes of the ternary search tree;
an extraction module, configured to extract, from the target node, the word or pinyin corresponding to the text to be converted; and
an output module, configured to output the word or pinyin corresponding to the text to be converted.
8. The character conversion apparatus according to claim 7, characterized in that the apparatus further comprises:
a judging module, configured to judge whether the text to be converted can be split into word segments;
a first execution module, configured to, when the text to be converted can be split into word segments, split the text to be converted with a segmentation algorithm to obtain a segmentation result, determine in the ternary search tree a first designated node corresponding to a segment in the segmentation result, extract the word or pinyin corresponding to the segment from the first designated node, and output the word or pinyin corresponding to the segment; and
a second execution module, configured to trigger the first determining module when the text to be converted cannot be split into word segments.
9. The character conversion apparatus according to claim 7, characterized in that the first determining module comprises:
a first determining submodule, configured to, when the text to be converted is a word, determine, among the root node of a word ternary search tree and the sibling nodes of the root node, a second designated node whose ASCII code value is the same as that of the first Chinese character in the text to be converted;
a second determining submodule, configured to determine, among the child nodes of the second designated node, a third designated node whose ASCII code value is the same as that of the remaining Chinese characters in the text to be converted; and
a third determining submodule, configured to determine the third designated node as the target node, wherein the word includes at least two Chinese characters.
10. The character conversion apparatus according to claim 7, characterized in that the first determining module comprises:
a fourth determining submodule, configured to, when the text to be converted consists of at least two pinyin syllables, determine, among the root node of a pinyin ternary search tree and the sibling nodes of the root node, a fourth designated node that matches the first pinyin syllable in the text to be converted;
a fifth determining submodule, configured to determine, among the child nodes of the fourth designated node, a fifth designated node that matches the remaining pinyin syllables in the text to be converted; and
a sixth determining submodule, configured to determine the fifth designated node as the target node, wherein each of the at least two pinyin syllables corresponds to one Chinese character.
11. The character conversion apparatus according to claim 7, characterized in that the apparatus further comprises:
a second determining module, configured to determine the ASCII code value corresponding to each word in a standard dictionary;
a first adding module, configured to add the first Chinese character of each word in the standard dictionary, according to the magnitude of its ASCII code value, to the root node of the ternary search tree and the sibling nodes of the root node; and
a second adding module, configured to add the non-first Chinese characters and the pinyin of each word in the standard dictionary to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
12. The character conversion apparatus according to claim 7, characterized in that the apparatus further comprises:
a third determining module, configured to determine the initial of each pinyin group in a standard dictionary, wherein each pinyin syllable includes at least one pinyin letter, each pinyin group includes at least two pinyin syllables, one pinyin syllable corresponds to one Chinese character, and one pinyin group corresponds to one word;
a third adding module, configured to add the first pinyin syllable of each pinyin group in the standard dictionary, in the order of the initials, to the root node of the ternary search tree and the sibling nodes of the root node; and
a fourth adding module, configured to add the non-first pinyin syllables of each pinyin group in the standard dictionary, together with the word corresponding to each pinyin group, to the child nodes of the root node of the ternary search tree and to the child nodes of the sibling nodes of the root node.
CN201610243297.3A 2016-04-18 2016-04-18 Character converting method and apparatus Pending CN105955986A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610243297.3A CN105955986A (en) 2016-04-18 2016-04-18 Character converting method and apparatus

Publications (1)

Publication Number Publication Date
CN105955986A true CN105955986A (en) 2016-09-21

Family

ID=56917672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610243297.3A Pending CN105955986A (en) 2016-04-18 2016-04-18 Character converting method and apparatus

Country Status (1)

Country Link
CN (1) CN105955986A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521418A (en) * 2011-12-31 2012-06-27 青岛海信宽带多媒体技术有限公司 Pinyin storage structure and pinyin input method
CN102867049A (en) * 2012-09-10 2013-01-09 山东康威通信技术股份有限公司 Chinese PINYIN quick word segmentation method based on word search tree
CN102866781A (en) * 2011-07-06 2013-01-09 哈尔滨工业大学 Pinyin-to-character conversion method and pinyin-to-character conversion system
CN103823814A (en) * 2012-11-19 2014-05-28 腾讯科技(深圳)有限公司 Information processing method and information processing device
CN104252484A (en) * 2013-06-28 2014-12-31 重庆新媒农信科技有限公司 Pinyin error correction method and system
CN104268157A (en) * 2014-09-03 2015-01-07 乐视网信息技术(北京)股份有限公司 Device and method for error correction in data search

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897257A (en) * 2017-02-23 2017-06-27 郑州云海信息技术有限公司 The conversion method and device of a kind of ASCII character and character string based on LINUX platforms
CN111737986A (en) * 2020-05-15 2020-10-02 深圳市世强元件网络有限公司 Search term recommendation method and system based on multi-way tree
US11947608B2 (en) 2020-05-15 2024-04-02 Shenzhen Sekorm Component Network Co., Ltd Search term recommendation method and system based on multi-branch tree
CN113641731A (en) * 2021-08-17 2021-11-12 成都知道创宇信息技术有限公司 Fuzzy search optimization method and device, electronic equipment and readable storage medium
CN113641731B (en) * 2021-08-17 2023-05-02 成都知道创宇信息技术有限公司 Fuzzy search optimization method, device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160921
