CN106297799A - Voice recognition processing method and device - Google Patents

Publication number
CN106297799A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610647539.5A
Other languages
Chinese (zh)
Inventor
周蕾蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Leshi Zhixin Electronic Technology Tianjin Co Ltd
LeTV Holding Beijing Co Ltd
Original Assignee
Leshi Zhixin Electronic Technology Tianjin Co Ltd
LeTV Holding Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Leshi Zhixin Electronic Technology Tianjin Co Ltd and LeTV Holding Beijing Co Ltd
Priority to CN201610647539.5A
Publication of CN106297799A

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/72: Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724: User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M2250/00: Details of telephonic subscriber devices
    • H04M2250/74: Details of telephonic subscriber devices with voice recognition means

Abstract

An embodiment of the invention discloses a voice recognition processing method and device, relating to the technical field of natural language processing. The method includes: parsing a user keyword from the user's voice information; detecting whether a user identifier matching the user keyword exists in a preset information base; if none exists, obtaining the pinyin string corresponding to the user keyword; and, according to the pinyin string, identifying from the preset information base at least one user identifier whose pronunciation is similar to the pinyin string, as the result of voice recognition processing. The technical solution of the embodiment can optimize the result of voice recognition processing and improve its precision and efficiency, which in turn improves ease of use and user experience for mobile terminal users.

Description

Voice recognition processing method and device
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a voice recognition processing method and device.
Background art
With the development of science and technology, the emergence of mobile terminals has greatly facilitated people's lives. For example, people can use a smart mobile terminal to contact family or friends by telephone or text message, which makes communication much easier.
In recent years, mobile terminals have become more intelligent and therefore more convenient to use. Many existing smart mobile terminals have added a speech recognition function: when manual operation is inconvenient, the user can issue a voice command telling the mobile terminal to perform a certain operation, and the mobile terminal recognizes the user's voice and performs the relevant operation.
In the course of realizing the present invention, the inventor found at least the following problem in the prior art: the mobile terminal recognizes speech with low accuracy. For example, when the user says "call Wang Xiaoming" and the mobile terminal recognizes this as "call Wang Xiaomeng", the mobile terminal may fail to find the phone number of "Wang Xiaoming" and therefore cannot perform the relevant operation. Existing mobile terminals therefore recognize speech inefficiently.
Summary of the invention
In view of the above problem, embodiments of the present invention propose a voice recognition processing method and device.
An embodiment of the present invention provides a voice recognition processing method, the method including:
parsing a user keyword from the user's voice information;
detecting whether a user identifier matching the user keyword exists in a preset information base;
if none exists, obtaining the pinyin string corresponding to the user keyword;
according to the pinyin string, identifying from the preset information base at least one user identifier whose pronunciation is similar to the pinyin string, as the result of voice recognition processing.
Further optionally, in the method described above, parsing the user keyword from the user's voice information specifically includes:
recognizing the user's voice information to obtain text information;
parsing the text information to obtain the user keyword in the text information.
Further optionally, in the method described above, identifying from the preset information base, according to the pinyin string, at least one user identifier whose pronunciation is similar to the pinyin string specifically includes:
calculating the minimum edit distance between the pinyin of each user identifier in the preset information base and the pinyin string;
sorting all user identifiers in the preset information base in ascending order of minimum edit distance to obtain a user identifier list;
selecting at least one user identifier from the user identifier list in order from front to back.
Further optionally, in the method described above, calculating the minimum edit distance between the pinyin of each user identifier in the preset information base and the pinyin string specifically includes:
for each user identifier in the preset information base, calculating the minimum edit distances between the initials, finals and tones in the pinyin of the user identifier and, respectively, the initials, finals and tones in the pinyin string;
obtaining the minimum edit distance between the pinyin of the user identifier and the pinyin string from the minimum edit distance of the initials, the minimum edit distance of the finals and the minimum edit distance of the tones.
Further optionally, in the method described above, obtaining the minimum edit distance between the pinyin of the user identifier and the pinyin string from the minimum edit distances of the initials, the finals and the tones specifically includes:
adding the minimum edit distance of the initials, the minimum edit distance of the finals and the minimum edit distance of the tones to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string;
or adding the product of the minimum edit distance of the initials and a preset initial weight, the product of the minimum edit distance of the finals and a preset final weight, and the product of the minimum edit distance of the tones and a preset tone weight, to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string.
Further optionally, in the method described above, after at least one user identifier whose pronunciation is similar to the pinyin string has been obtained from the preset information base according to the pinyin string as the result of voice recognition processing, the method also includes:
displaying the at least one user identifier to the user.
Further, the method also includes:
receiving a target user identifier that the user selects from the at least one user identifier;
performing the corresponding processing according to the target user identifier.
An embodiment of the present invention also provides a voice recognition processing device, the device including:
a parsing module, for parsing a user keyword from the user's voice information;
a detection module, for detecting whether a user identifier matching the user keyword exists in a preset information base;
an acquisition module, for obtaining the pinyin string corresponding to the user keyword when no such user identifier exists;
an identification module, for identifying from the preset information base, according to the pinyin string, at least one user identifier whose pronunciation is similar to the pinyin string, as the result of voice recognition processing.
Further optionally, in the device described above, the parsing module is specifically used for:
recognizing the user's voice information to obtain text information;
parsing the text information to obtain the user keyword in the text information.
Further optionally, in the device described above, the identification module specifically includes:
a computing unit, for calculating the minimum edit distance between the pinyin of each user identifier in the preset information base and the pinyin string;
a sorting unit, for sorting all user identifiers in the preset information base in ascending order of minimum edit distance to obtain a user identifier list;
a screening unit, for selecting at least one user identifier from the user identifier list in order from front to back.
Further optionally, in the device described above, the computing unit is specifically used for:
for each user identifier in the preset information base, calculating the minimum edit distances between the initials, finals and tones in the pinyin of the user identifier and, respectively, the initials, finals and tones in the pinyin string;
obtaining the minimum edit distance between the pinyin of the user identifier and the pinyin string from the minimum edit distance of the initials, the minimum edit distance of the finals and the minimum edit distance of the tones.
Further optionally, in the device described above, the computing unit is specifically used for:
adding the minimum edit distance of the initials, the minimum edit distance of the finals and the minimum edit distance of the tones to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string;
or adding the product of the minimum edit distance of the initials and a preset initial weight, the product of the minimum edit distance of the finals and a preset final weight, and the product of the minimum edit distance of the tones and a preset tone weight, to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string.
Further optionally, the device described above also includes:
a display module, for displaying the at least one user identifier to the user.
Further, the device also includes:
a receiving module, for receiving a target user identifier that the user selects from the at least one user identifier;
a processing module, for performing the corresponding processing according to the target user identifier.
The voice recognition processing method and device of the embodiments of the present invention parse a user keyword from the user's voice information; detect whether a user identifier matching the user keyword exists in a preset information base; if none exists, obtain the pinyin string corresponding to the user keyword; and, according to the pinyin string, identify from the preset information base at least one user identifier whose pronunciation is similar to the pinyin string, as the result of voice recognition processing. This can optimize the result of voice recognition processing and improve its precision and efficiency, which in turn improves ease of use and user experience for mobile terminal users.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the description, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art on reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a flow chart of the voice recognition processing method of an embodiment of the present invention.
Fig. 2 is a first example diagram for the voice recognition processing method of an embodiment of the present invention.
Fig. 3 is a second example diagram for the voice recognition processing method of an embodiment of the present invention.
Fig. 4 is a third example diagram for the voice recognition processing method of an embodiment of the present invention.
Fig. 5 is a fourth example diagram for the voice recognition processing method of an embodiment of the present invention.
Fig. 6 is a fifth example diagram for the voice recognition processing method of an embodiment of the present invention.
Fig. 7 is a structural diagram of embodiment one of the voice recognition processing device of an embodiment of the present invention.
Fig. 8 is a structural diagram of embodiment two of the voice recognition processing device of an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
Fig. 1 is a flow chart of the voice recognition processing method of an embodiment of the present invention. As shown in Fig. 1, the voice recognition processing method of this embodiment may specifically include the following steps:
100. Parse the user keyword from the user's voice information.
101. Detect whether a user identifier matching the user keyword exists in the preset information base; if none exists, perform step 102; otherwise perform step 103.
102. Obtain the pinyin string corresponding to the user keyword; perform step 104.
103. Take the user identifier matching the user keyword as the result of voice recognition processing, and end.
104. According to the pinyin string, identify from the preset information base at least one user identifier whose pronunciation is similar to the pinyin string, as the result of voice recognition processing.
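The control flow of steps 101 to 104 can be sketched as follows. This is only an illustrative outline: the function name `recognize_user` and the two callables `to_pinyin` and `find_similar` are hypothetical stand-ins for the pinyin conversion and the similarity search, which the patent describes separately, and step 100 (parsing the keyword out of the voice information) is assumed to have already happened.

```python
from typing import Callable, List, Sequence


def recognize_user(keyword: str,
                   contacts: Sequence[str],
                   to_pinyin: Callable[[str], str],
                   find_similar: Callable[[str, Sequence[str]], List[str]]) -> List[str]:
    """Exact match first (steps 101/103), pinyin fallback second (steps 102/104)."""
    # Step 101: look for a user identifier identical to the keyword.
    if keyword in contacts:
        # Step 103: the exact match is the recognition result.
        return [keyword]
    # Step 102: no exact match, so convert the keyword to a pinyin string.
    pinyin = to_pinyin(keyword)
    # Step 104: return the identifiers that sound similar to that pinyin string.
    return find_similar(pinyin, contacts)


# Usage with placeholder callables (assumptions for illustration only).
contacts = ["Wang Xiaoming", "Zhang Xiaoli"]
result = recognize_user(
    "Wang Xiaomeng",
    contacts,
    to_pinyin=lambda kw: "w ang 2 x iao 3 m eng 2",
    find_similar=lambda pinyin, names: [names[0]],
)
```

Passing the two helpers in as parameters keeps the fallback logic separate from any particular pinyin library or distance measure.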
The voice recognition processing method of this embodiment is executed by a voice recognition processing device, which in use may be installed on a mobile terminal.
This embodiment is mainly concerned with identifying the user keyword in the user's voice information, for example "Wang Xiaoming" in the voice information "call Wang Xiaoming", "Zhang Xiaoli" in the voice information "send a message to Zhang Xiaoli", or "U.S. numerous" in the voice information "send a video to U.S. numerous". Specifically, the user keyword may be a user's name, or a user identifier such as a user's nickname. These user keywords identify the target user of the user's operation. Of course, the destination of the actual operation is not the target user as such but the target user's mobile terminal, for example the mobile terminal of Wang Xiaoming, of Zhang Xiaoli or of U.S. numerous. Specifically, the preset information base of this embodiment may be an address book, for example the local address book of the user's mobile terminal or the address book of the user's instant messaging application.
In implementing the technical solution of this embodiment, when it is inconvenient for the user to operate the mobile terminal by hand, the user can issue a command to the mobile terminal by voice. The voice recognition processing device in the mobile terminal then parses the user keyword from the user's voice information and detects whether the preset information base, such as an address book, contains a user identifier matching the user keyword. This process can be regarded as the first pass of speech recognition. For example, the parsed keyword "Wang Xiaoming", "Zhang Xiaoli" or "U.S. numerous" can be compared one by one with all user identifiers in the address book to see whether the address book contains a user identifier identical to the user keyword. In this case step 100 may specifically include the following two steps:
(a1) Recognize the user's voice information to obtain text information;
(a2) Parse the text information to obtain the user keyword in the text information.
Alternatively, the user's voice information need not be recognized as text. Instead, the pinyin of all characters in the voice information is obtained directly, the pinyin of the user keyword is extracted from it, and the preset information base, such as an address book, is checked for a user identifier whose pinyin is identical to that of the user keyword. If one exists, the user identifier matching the user keyword is taken as the result of voice recognition processing. Otherwise, the pinyin string corresponding to the user keyword is obtained and, according to the pinyin string, at least one user identifier whose pronunciation is similar to the pinyin string is identified from the address book as the result of voice recognition processing. This step acts as a second pass of voice recognition processing that supplements the first. Because the first pass is relatively coarse, it may fail to produce an accurate result; the second pass uses the pinyin string of the user keyword to identify from the address book at least one user identifier with a similar pronunciation, which is taken as the result of the second pass.
The voice recognition processing method of this embodiment parses a user keyword from the user's voice information; detects whether a user identifier matching the user keyword exists in the preset information base; if none exists, obtains the pinyin string corresponding to the user keyword; and, according to the pinyin string, identifies from the preset information base at least one user identifier whose pronunciation is similar to the pinyin string, as the result of voice recognition processing. This can optimize the result of voice recognition processing and improve its precision and efficiency.
Further optionally, on the basis of the technical solution of the above embodiment, step 104, "according to the pinyin string, identify from the preset information base at least one user identifier whose pronunciation is similar to the pinyin string", may specifically include the following steps:
(b1) Calculate the minimum edit distance between the pinyin of each user identifier in the preset information base and the pinyin string.
Edit distance is the minimum number of edit operations required to transform one string into another. The permitted edit operations are replacing one character with another, inserting a character and deleting a character. The smaller the edit distance, the more similar the two strings.
For example, turning kitten into sitting requires a minimum edit distance of 3. The edits are as follows:
kitten → sitten (substitute k → s)
sitten → sittin (substitute e → i)
sittin → sitting (insert g)
The concept of edit distance was proposed in 1965 by the Russian scientist Vladimir Levenshtein. The edit distance algorithm uses dynamic programming: the problem has optimal substructure, in that a minimum edit distance is built from the minimum edit distances of its subproblems. Specifically, the following formula is used:
$$
d[i,j] =
\begin{cases}
\max(i, j), & i = 0 \text{ or } j = 0 \\
\min\left(d[i-1,j] + 1,\; d[i,j-1] + 1,\; d[i-1,j-1]\right), & x_i = y_j \\
\min\left(d[i-1,j] + 1,\; d[i,j-1] + 1,\; d[i-1,j-1] + 1\right), & x_i \neq y_j
\end{cases}
$$
The boundary values equal the length of the non-empty prefix, which matches the worked table below, where the cells to the left of and below cell (3,3) hold 1. Here d[i-1,j] + 1 is the edit distance when a letter is inserted into string s2, and d[i,j-1] + 1 is the edit distance when a letter is deleted from string s1; min(d[i-1,j] + 1, d[i,j-1] + 1, d[i-1,j-1]) takes the smallest of the three. When x_i = y_j no cost is incurred, so the cost equals that of the previous step, d[i-1,j-1]; otherwise 1 is added. d[i,j] is then the smallest of the three values above.
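As a minimal sketch of this recurrence, the function below fills the dynamic programming table row by row. It assumes the standard boundary values d[i][0] = i and d[0][j] = j; the name `edit_distance` is chosen here for illustration and does not come from the patent.

```python
def edit_distance(s1: str, s2: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning s1 into s2."""
    m, n = len(s1), len(s2)
    # d[i][j] holds the distance between the first i characters of s1
    # and the first j characters of s2.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # No cost when the characters match, otherwise one substitution.
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
    return d[m][n]


print(edit_distance("kitten", "sitting"))  # 3
```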
Take the edit distance between cafe and coffee as an example. The edits can be carried out as follows: cafe → cofe → coffe → coffee.
Specifically, first create a 6 × 8 table as shown in Fig. 2 (cafe has length 4 and coffee has length 6; 2 is added to each).
Then fill numbers into the table shown in Fig. 2 to obtain the table shown in Fig. 3.
According to the formula for minimum edit distance above, calculation starts from cell (3,3) of the table shown in Fig. 3 and takes the minimum of the following three values:
(1) If the character at the bottom of the column equals the character at the far left of the row, the number at the lower left; otherwise the number at the lower left plus 1. For cell (3,3), the character below, "c", is the same as the character to the left, "c", so the value is 0.
(2) The number to the left plus 1; for cell (3,3) this is 2.
(3) The number below plus 1; for cell (3,3) this is 2.
After this processing, the result of the minimum edit calculation for cell (3,3) of Fig. 3 is as shown in Fig. 4.
Each cell in Fig. 3 is then processed in turn, giving the result shown in Fig. 5.
Finally, the value in the upper right corner gives the edit distance, 3. The path marked in Fig. 5 is the optimal edit path, which requires one substitution and two insertions, namely cafe → cofe → coffe → coffee.
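The table-filling walk-through can be reproduced with a short sketch that builds the whole table and then traces one optimal path backwards to count the operations. The helper names `edit_table` and `count_operations` are hypothetical, and the table here is indexed from the top left, so the final distance sits in the last row and last column rather than in the upper right corner used in the figures.

```python
def edit_table(src: str, dst: str):
    """Build the full dynamic programming table for turning src into dst."""
    rows, cols = len(src) + 1, len(dst) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if src[i - 1] == dst[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d


def count_operations(src: str, dst: str) -> dict:
    """Walk back from the final cell and count the edits on one optimal path."""
    d = edit_table(src, dst)
    i, j = len(src), len(dst)
    ops = {"substitute": 0, "insert": 0, "delete": 0}
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if src[i - 1] == dst[j - 1] else 1
            if d[i][j] == d[i - 1][j - 1] + cost:
                ops["substitute"] += cost  # a match adds nothing
                i, j = i - 1, j - 1
                continue
        if j > 0 and d[i][j] == d[i][j - 1] + 1:
            ops["insert"] += 1  # a character of dst was inserted
            j -= 1
        else:
            ops["delete"] += 1  # a character of src was deleted
            i -= 1
    return ops


table = edit_table("cafe", "coffee")
print(table[-1][-1])                       # 3
print(count_operations("cafe", "coffee"))  # {'substitute': 1, 'insert': 2, 'delete': 0}
```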
In this embodiment, the minimum edit distance between the pinyin of a user identifier and the pinyin string can be understood as the minimum number of edits needed to turn the pinyin of the user identifier into the pinyin string.
(b2) Sort all user identifiers in the preset information base in ascending order of minimum edit distance to obtain a user identifier list;
(b3) Select at least one user identifier from the user identifier list in order from front to back.
Specifically, once the minimum edit distance has been calculated for every user identifier in the preset information base, such as an address book, all user identifiers in the address book can be sorted in ascending order of minimum edit distance to obtain the user identifier list. The smaller the minimum edit distance, the fewer edits are needed to turn the pinyin of that user identifier into the pinyin string of the user keyword, and so the more similar the two are. Selecting at least one user identifier from the front of the list therefore yields the user identifiers whose pinyin is most similar to the pinyin string of the user keyword. The number of user identifiers selected can be preset according to the user's needs: it may be 1, 2, 3, 5 or any other positive integer N. In other words, the top N user identifiers with the greatest similarity to the pinyin string of the user keyword are taken from the user identifier list.
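Steps (b1) to (b3) amount to a sort followed by a slice. The sketch below assumes the address book is available as a mapping from user identifier to a pinyin string; the names `top_n_identifiers` and `contact_pinyin`, and the sample entries, are illustrative only. For brevity it compares the pinyin strings character by character, without the initial, final and tone separation that the patent goes on to describe.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Plain character-level edit distance, kept to one row of the table at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def top_n_identifiers(query_pinyin: str,
                      contact_pinyin: Dict[str, str],
                      n: int = 3) -> List[str]:
    """Sort identifiers by ascending distance to the query and keep the first n."""
    ranked = sorted(contact_pinyin,
                    key=lambda name: edit_distance(contact_pinyin[name], query_pinyin))
    return ranked[:n]


# Hypothetical address book entries with their pinyin.
contact_pinyin = {
    "Zhang Xiaoli": "zhang1 xiao3 li4",
    "Wang Xiaoqiang": "wang2 xiao3 qiang2",
    "Wang Xiaoming": "wang2 xiao3 ming2",
}
print(top_n_identifiers("wang2 xiao3 meng2", contact_pinyin, n=2))
# ['Wang Xiaoming', 'Wang Xiaoqiang']
```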
In this embodiment the edit distance is calculated on pinyin. For Chinese characters, calculating the closeness of two character strings directly is inaccurate. For example, the edit distances from "Wang Xiaoming" to "Wang Xiaomeng" and to "Wang Xiaoqiang" are both 1, since each needs only one substitution. This result is inaccurate: in terms of pronunciation, "Wang Xiaomeng" is clearly closer to "Wang Xiaoming". This embodiment therefore calculates pronunciation similarity at a finer granularity, that is, it calculates the similarity of Chinese character pronunciations through pinyin. A pinyin syllable comprises three parts: initial, final and tone. The pinyin of "ming" (明) is "m ing 2", that of "meng" (萌) is "m eng 2" and that of "qiang" (强) is "q iang 2". If the minimum edit distance algorithm above were applied directly to two pinyin strings, however, it could produce unreasonable alignments (that is, initials, finals and tones would not each be aligned with their own kind). In this embodiment the similarity can therefore be calculated separately for initials, finals and tones. For a syllable that has no initial, a special character can be used to indicate the missing initial. For the syllable "ao 4", for example, the initial part can be replaced with the special character, so that the pinyin is written as that character followed by "ao 4", which makes alignment and calculation easier.
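One possible way to split a syllable into the three parts is sketched below. It assumes syllables are written in ASCII with a trailing tone digit (for example "ming2"), treats "y" and "w" as initials in line with the "w ang 2" notation used in this document, and uses "_" as the placeholder for a missing initial. The placeholder, the `INITIALS` list and the function name are assumptions made for illustration, not values taken from the patent.

```python
# Two-letter initials come first so that "zh" is matched before "z".
INITIALS = ("zh", "ch", "sh",
            "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
            "j", "q", "x", "r", "z", "c", "s", "y", "w")
NO_INITIAL = "_"  # hypothetical placeholder for syllables without an initial


def split_syllable(syllable: str):
    """Split e.g. 'ming2' into (initial, final, tone) = ('m', 'ing', '2')."""
    body, tone = syllable[:-1], syllable[-1]
    for initial in INITIALS:
        if body.startswith(initial):
            return initial, body[len(initial):], tone
    # No initial found: keep the three-part shape with the placeholder.
    return NO_INITIAL, body, tone


print(split_syllable("ming2"))   # ('m', 'ing', '2')
print(split_syllable("qiang2"))  # ('q', 'iang', '2')
print(split_syllable("ao4"))     # ('_', 'ao', '4')
```

Keeping every syllable as a fixed three-part tuple means the initials, finals and tones of two names can each be lined up as separate sequences.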
Such as, step (b1) " calculates in presupposed information storehouse between phonetic and the pinyin character string of each ID Little editing distance ", specifically may include that
(c1) for each ID in presupposed information storehouse, initial consonant, simple or compound vowel of a Chinese syllable and sound in the phonetic of ID are calculated Adjust the smallest edit distance between initial consonant, simple or compound vowel of a Chinese syllable and the tone respectively and in pinyin character string;
(c2) according to smallest edit distance corresponding to initial consonant in the phonetic of ID, smallest edit distance that simple or compound vowel of a Chinese syllable is corresponding And the smallest edit distance that tone is corresponding, obtain the smallest edit distance between the phonetic of ID and pinyin character string.
Wherein step (c2) is according to smallest edit distance corresponding to initial consonant in the phonetic of ID, minimum that simple or compound vowel of a Chinese syllable is corresponding Editing distance and smallest edit distance corresponding to tone, obtain the minimum volume between the phonetic of ID and pinyin character string Volume distance, specifically can include the following two kinds situation:
The first situation, by minimum volume corresponding to smallest edit distance corresponding for initial consonant in the phonetic of ID, simple or compound vowel of a Chinese syllable Volume distance and smallest edit distance corresponding to tone are added, and obtain the minimum between the phonetic of ID and pinyin character string Editing distance;
The second situation, by smallest edit distance corresponding for initial consonant in the phonetic of ID and the initial consonant weight preset Smallest edit distance corresponding to product, the simple or compound vowel of a Chinese syllable smallest edit distance corresponding with the product of the simple or compound vowel of a Chinese syllable weight preset and tone with The product addition of the tone weight preset, obtains the smallest edit distance between the phonetic of ID and pinyin character string.
In the first case, the initials, finals, and tones in the pinyin are assumed by default to carry the same weight, and the three minimum edit distances are added directly to give the minimum edit distance between the pinyin of the user identifier and the pinyin string. When the preset information base, for example an address book, contains many user identifiers with similar pronunciations, this may lead to poor recognition accuracy.
In the second case, initials are considered more error-prone during speech recognition, so errors in the initials should be tolerated and the preset initial weight can be given a smaller value. That is, the preset initial weight, the preset final weight, and the preset tone weight can be set to different values.
For example, to calculate the minimum edit distance between the two pinyin strings "l iu 2 d e 2 h ua 2" and "w ang 2 h e 4 h ui 2", the pinyin can be divided into three parts (initials, finals, and tones); the minimum edit distance of each part is calculated separately, and the three results are then summed.
Specifically, following the minimum edit distance calculation of Figs. 2 to 5 above, the calculation process shown in Fig. 6 is obtained. As shown in Fig. 6, the upper right corner gives the minimum edit distances of the initials, finals, and tones respectively:
Minimum edit distance of the initials: C_dis = 2
Minimum edit distance of the finals: V_dis = 2
Minimum edit distance of the tones: T_dis = 1
Thus, as shown in Fig. 6, the final edit distance between the two pinyin strings is 2 + 2 + 1 = 5.
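The worked example can be reproduced with a short sketch. It assumes each pinyin string has already been split into (initial, final, tone) triples and uses the standard unit-cost Levenshtein distance over sequences; the names `edit_distance` and `component_distances` are illustrative and do not come from the patent.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def component_distances(syllables_a, syllables_b):
    """Return (C_dis, V_dis, T_dis) for two lists of (initial, final, tone)."""
    return tuple(
        edit_distance([s[k] for s in syllables_a], [s[k] for s in syllables_b])
        for k in range(3)
    )


# The two pinyin strings from the example, split by hand (an assumption:
# the patent does not specify the splitting routine).
liu_dehua = [("l", "iu", "2"), ("d", "e", "2"), ("h", "ua", "2")]
wang_hehui = [("w", "ang", "2"), ("h", "e", "4"), ("h", "ui", "2")]

c_dis, v_dis, t_dis = component_distances(liu_dehua, wang_hehui)
print(c_dis, v_dis, t_dis, c_dis + v_dis + t_dis)  # 2 2 1 5
```

Comparing the initials, finals, and tones as three separate sequences keeps each component aligned with its counterpart, so a wrong tone is never counted against an initial.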
Specifically, when calculating the minimum edit distance, the initials, finals, and tones must each be aligned and calculated separately, which increases the accuracy of the result.
In addition, the cost of each of the three operations above (substitution, insertion, and deletion) is 1; these costs may also be adjusted to take different values.
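As a minimal sketch of adjustable operation costs, the Levenshtein recurrence can take the three costs as parameters. The function name and the parameter names `sub_cost`, `ins_cost`, and `del_cost` are assumptions made for illustration; the patent only states that the costs may differ from 1.

```python
def weighted_edit_distance(a, b, sub_cost=1.0, ins_cost=1.0, del_cost=1.0):
    """Minimum edit distance from sequence a to sequence b with per-operation costs."""
    # Row for the empty prefix of a: only insertions are needed.
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * del_cost]  # empty prefix of b: only deletions are needed
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + del_cost,                           # delete x
                curr[j - 1] + ins_cost,                       # insert y
                prev[j - 1] + (0.0 if x == y else sub_cost),  # substitute or keep
            ))
        prev = curr
    return prev[-1]


initials_a = ["l", "d", "h"]
initials_b = ["w", "h", "h"]
print(weighted_edit_distance(initials_a, initials_b))                # 2.0
print(weighted_edit_distance(initials_a, initials_b, sub_cost=0.5))  # 1.0
```

With all costs left at 1 this reduces to the ordinary minimum edit distance used in the example above.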
Further, different weights may be applied to the edit distances of the initials, finals, and tones. Because initials are more error-prone during recognition, errors in the initials should be tolerated; for example, the weights of the initials, finals, and tones may be respectively:
Preset initial weight: C_weight = 0.2
Preset final weight: V_weight = 0.4
Preset tone weight: T_weight = 0.4
The final minimum edit distance is then: C_weight*C_dis + V_weight*V_dis + T_weight*T_dis = 0.2*2 + 0.4*2 + 0.4*1 = 1.6. Likewise, when calculating the final minimum edit distance, the initials, finals, and tones must be aligned and calculated separately and then summed to obtain the overall minimum edit distance.
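The two ways of combining the component distances differ only in the weights, which a small sketch makes explicit. The function name `combined_distance` is hypothetical; the default weights of 1 correspond to the plain sum, and the second call uses the example weights given above.

```python
import math


def combined_distance(c_dis, v_dis, t_dis,
                      c_weight=1.0, v_weight=1.0, t_weight=1.0):
    """Weighted sum of the initial, final, and tone edit distances."""
    return c_weight * c_dis + v_weight * v_dis + t_weight * t_dis


unweighted = combined_distance(2, 2, 1)
weighted = combined_distance(2, 2, 1, c_weight=0.2, v_weight=0.4, t_weight=0.4)

print(unweighted)          # 5.0
print(round(weighted, 2))  # 1.6
```

Because the weights are floating-point values, comparisons against an expected result are better done with `math.isclose` than with `==`.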
After step 104 of the above embodiment, "according to the pinyin string, obtaining from the preset information base at least one user identifier whose pronunciation is similar to the pinyin string, as the result of the voice recognition processing", the method may further include the following steps:
(d1) displaying the at least one user identifier to the user;
(d2) receiving a target user identifier selected by the user from the at least one user identifier;
(d3) performing corresponding processing according to the target user identifier.
For example, the at least one user identifier may be shown on the display screen of a mobile terminal, so that the user, on seeing the at least one user identifier, can select one of them as the target user identifier to be processed. Corresponding processing, such as making a call, sending a short message, or sending video information, is then performed according to the target user identifier.
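Steps (d1) to (d3) can be sketched as three small functions. Everything here is a hypothetical stand-in: the function names, the 1-based menu, the action keys, and the string results are assumptions for illustration, and a real terminal would invoke its own dialer or messaging service instead of returning text.

```python
def display_candidates(candidates):
    """Step (d1): format the candidate identifiers as a numbered menu."""
    return [f"{i}. {uid}" for i, uid in enumerate(candidates, start=1)]


def receive_selection(candidates, choice):
    """Step (d2): map the user's 1-based menu choice to a target identifier."""
    if not 1 <= choice <= len(candidates):
        raise ValueError(f"choice must be between 1 and {len(candidates)}")
    return candidates[choice - 1]


def process_target(target, action="call"):
    """Step (d3): dispatch the requested action for the target identifier."""
    actions = {
        "call": lambda uid: f"calling {uid}",
        "sms": lambda uid: f"sending short message to {uid}",
        "video": lambda uid: f"sending video information to {uid}",
    }
    if action not in actions:
        raise ValueError(f"unsupported action: {action}")
    return actions[action](target)


candidates = ["Liu Dehua", "Liu Dehui"]  # hypothetical address-book entries
for line in display_candidates(candidates):
    print(line)
target = receive_selection(candidates, 1)
print(process_target(target, "call"))  # calling Liu Dehua
```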
By adopting the above technical solution, the voice recognition processing method of the above embodiment can optimize the result of voice recognition processing, improve the precision of that result, and improve the efficiency of voice recognition processing, thereby further enhancing ease of use and the user experience for mobile phone users.
Fig. 7 is a structural diagram of embodiment one of the voice recognition processing device according to the embodiments of the present invention. As shown in Fig. 7, the voice recognition processing device of this embodiment may specifically include: a parsing module 10, a detection module 11, an acquisition module 12, and an identification module 13.
The parsing module 10 is configured to parse the user keyword in the user's voice information; the detection module 11 is configured to detect whether the preset information base contains a user identifier matching the user keyword obtained by the parsing module 10; the acquisition module 12 is configured to obtain the pinyin string corresponding to the user keyword when the detection module 11 detects that no such user identifier exists; the identification module 13 is configured to identify, according to the pinyin string obtained by the acquisition module 12, at least one user identifier from the preset information base whose pronunciation is similar to the pinyin string, as the result of the voice recognition processing.
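The control flow among these modules can be sketched as follows. This is only an illustration under several assumptions: the keyword is taken as already parsed, the information base is modelled as a dictionary from user identifier to pinyin, `to_pinyin` and `identify` are caller-supplied stand-ins for the acquisition and identification modules, and returning the matched identifier directly on an exact match is assumed rather than stated in the text.

```python
def recognize(keyword, info_base, to_pinyin, identify):
    """Return candidate user identifiers for a parsed user keyword."""
    # Detection module: is there an identifier that matches the keyword?
    if keyword in info_base:
        return [keyword]
    # Acquisition module: only reached when no match exists.
    pinyin_string = to_pinyin(keyword)
    # Identification module: look for similar-sounding identifiers.
    return identify(pinyin_string, info_base)


# Toy stand-ins (hypothetical data, not from the patent).
info_base = {"刘德华": "liu2 de2 hua2", "王菲": "wang2 fei1"}
toy_pinyin_table = {"刘得华": "liu2 de2 hua2"}


def to_pinyin(keyword):
    return toy_pinyin_table[keyword]


def identify(pinyin_string, base):
    # Exact pinyin equality stands in for a real similarity measure.
    return [uid for uid, py in base.items() if py == pinyin_string]


print(recognize("王菲", info_base, to_pinyin, identify))    # ['王菲']
print(recognize("刘得华", info_base, to_pinyin, identify))  # ['刘德华']
```

The second call shows the case the device targets: the recognizer produced a homophone that is not in the address book, and the pinyin fallback recovers the intended entry.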
The voice recognition processing device of this embodiment implements voice recognition processing with the above modules by the same mechanism and with the same technical effect as the related method embodiments above; for details, refer to the description of those method embodiments, which is not repeated here.
Fig. 8 is a structural diagram of embodiment two of the voice recognition processing device according to the embodiments of the present invention. As shown in Fig. 8, the voice recognition processing device of this embodiment describes the technical solution of the present invention in further detail on the basis of the technical solution of the embodiment shown in Fig. 7.
In the voice recognition processing device of this embodiment, the parsing module 10 is specifically configured to:
recognize the user's voice information to obtain text information;
parse the text information to obtain the user keyword in the text information.
As shown in Fig. 8, in the voice recognition processing device of this embodiment, the identification module 13 specifically includes: a computing unit 131, a sorting unit 132, and a screening unit 133.
The computing unit 131 is configured to calculate the minimum edit distance between the pinyin of each user identifier in the preset information base and the pinyin string obtained by the acquisition module 12;
The sorting unit 132 is configured to sort all user identifiers in the preset information base in ascending order of minimum edit distance, according to the calculation result of the computing unit 131, to obtain a user identifier list;
The screening unit 133 is configured to select at least one user identifier from the user identifier list produced by the sorting unit 132, in order from front to back.
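The sorting and screening units amount to an ascending sort followed by taking the first few entries. The sketch below assumes the computing unit has already produced a distance for every identifier; the function name, the `top_n` parameter, and the sample distances are hypothetical.

```python
def screen_candidates(distances, top_n=3):
    """Sort identifiers by ascending distance and keep the first top_n."""
    # Sorting unit: smallest minimum edit distance first (ties keep input order).
    ranked = sorted(distances.items(), key=lambda item: item[1])
    # Screening unit: take identifiers from the front of the list.
    return [uid for uid, _ in ranked[:top_n]]


# Hypothetical distances from the computing unit for one pinyin string.
distances = {"Wang Hehui": 1.6, "Liu Dehua": 0.0, "Liu Dehui": 0.4, "Zhang San": 3.2}
print(screen_candidates(distances, top_n=2))  # ['Liu Dehua', 'Liu Dehui']
```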
Further optionally, in the voice recognition processing device of this embodiment, the computing unit 131 is specifically configured to:
for each user identifier in the preset information base, calculate the minimum edit distances between the initials, finals, and tones in the pinyin of the user identifier and, respectively, the initials, finals, and tones in the pinyin string obtained by the acquisition module 12;
obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string from the minimum edit distance for the initials, the minimum edit distance for the finals, and the minimum edit distance for the tones in the pinyin of the user identifier.
Further optionally, in the voice recognition processing device of this embodiment, the computing unit 131 is specifically configured to:
add together the minimum edit distance for the initials, the minimum edit distance for the finals, and the minimum edit distance for the tones in the pinyin of the user identifier to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string;
or add together the product of the minimum edit distance for the initials and a preset initial weight, the product of the minimum edit distance for the finals and a preset final weight, and the product of the minimum edit distance for the tones and a preset tone weight to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string.
Further optionally, the voice recognition processing device of this embodiment also includes: a display module 14, a receiving module 15, and a processing module 16.
The display module 14 is configured to display to the user the at least one user identifier obtained by the screening unit 133; the receiving module 15 is configured to receive the target user identifier selected by the user from the at least one user identifier displayed by the display module 14; the processing module 16 is configured to perform corresponding processing according to the target user identifier received by the receiving module 15.
The voice recognition processing device of this embodiment implements voice recognition processing with the above modules by the same mechanism and with the same technical effect as the related method embodiments above; for details, refer to the description of those method embodiments, which is not repeated here.
The above description shows and describes several preferred embodiments of the present application. As stated above, however, it should be understood that the application is not limited to the forms disclosed herein, which should not be regarded as excluding other embodiments; the application can be used in various other combinations, modifications, and environments, and can be modified within the scope of the inventive concept described herein through the above teachings or the skill or knowledge of the related art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the application shall all fall within the protection scope of the appended claims.

Claims (12)

1. A voice recognition processing method, characterized in that the method comprises:
parsing a user keyword in voice information of a user;
detecting whether a user identifier matching the user keyword exists in a preset information base;
when no such user identifier exists, obtaining a pinyin string corresponding to the user keyword;
identifying, according to the pinyin string, at least one user identifier from the preset information base whose pronunciation is similar to the pinyin string, as a result of the voice recognition processing.
2. The method according to claim 1, characterized in that parsing the user keyword in the voice information of the user specifically comprises:
recognizing the voice information of the user to obtain text information;
parsing the text information to obtain the user keyword in the text information.
3. The method according to claim 1, characterized in that identifying, according to the pinyin string, at least one user identifier from the preset information base whose pronunciation is similar to the pinyin string specifically comprises:
calculating the minimum edit distance between the pinyin of each user identifier in the preset information base and the pinyin string;
sorting all user identifiers in the preset information base in ascending order of the minimum edit distance to obtain a user identifier list;
selecting at least one user identifier from the user identifier list in order from front to back.
4. The method according to claim 3, characterized in that calculating the minimum edit distance between the pinyin of each user identifier in the preset information base and the pinyin string specifically comprises:
for each user identifier in the preset information base, calculating the minimum edit distances between the initials, finals, and tones in the pinyin of the user identifier and, respectively, the initials, finals, and tones in the pinyin string;
obtaining the minimum edit distance between the pinyin of the user identifier and the pinyin string from the minimum edit distance for the initials, the minimum edit distance for the finals, and the minimum edit distance for the tones in the pinyin of the user identifier.
5. The method according to claim 4, characterized in that obtaining the minimum edit distance between the pinyin of the user identifier and the pinyin string from the minimum edit distance for the initials, the minimum edit distance for the finals, and the minimum edit distance for the tones in the pinyin of the user identifier specifically comprises:
adding together the minimum edit distance for the initials, the minimum edit distance for the finals, and the minimum edit distance for the tones in the pinyin of the user identifier to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string;
or adding together the product of the minimum edit distance for the initials and a preset initial weight, the product of the minimum edit distance for the finals and a preset final weight, and the product of the minimum edit distance for the tones and a preset tone weight to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string.
6. The method according to any one of claims 1-5, characterized in that, after obtaining from the preset information base, according to the pinyin string, at least one user identifier whose pronunciation is similar to the pinyin string as the result of the voice recognition processing, the method further comprises:
displaying the at least one user identifier to the user;
further, the method also comprises:
receiving a target user identifier selected by the user from the at least one user identifier;
performing corresponding processing according to the target user identifier.
7. A voice recognition processing device, characterized in that the device comprises:
a parsing module, configured to parse a user keyword in voice information of a user;
a detection module, configured to detect whether a user identifier matching the user keyword exists in a preset information base;
an acquisition module, configured to obtain a pinyin string corresponding to the user keyword when no such user identifier exists;
an identification module, configured to identify, according to the pinyin string, at least one user identifier from the preset information base whose pronunciation is similar to the pinyin string, as a result of the voice recognition processing.
8. The device according to claim 7, characterized in that the parsing module is specifically configured to:
recognize the voice information of the user to obtain text information;
parse the text information to obtain the user keyword in the text information.
9. The device according to claim 7, characterized in that the identification module specifically comprises:
a computing unit, configured to calculate the minimum edit distance between the pinyin of each user identifier in the preset information base and the pinyin string;
a sorting unit, configured to sort all user identifiers in the preset information base in ascending order of the minimum edit distance to obtain a user identifier list;
a screening unit, configured to select at least one user identifier from the user identifier list in order from front to back.
10. The device according to claim 9, characterized in that the computing unit is specifically configured to:
for each user identifier in the preset information base, calculate the minimum edit distances between the initials, finals, and tones in the pinyin of the user identifier and, respectively, the initials, finals, and tones in the pinyin string;
obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string from the minimum edit distance for the initials, the minimum edit distance for the finals, and the minimum edit distance for the tones in the pinyin of the user identifier.
11. The device according to claim 10, characterized in that the computing unit is specifically configured to:
add together the minimum edit distance for the initials, the minimum edit distance for the finals, and the minimum edit distance for the tones in the pinyin of the user identifier to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string;
or add together the product of the minimum edit distance for the initials and a preset initial weight, the product of the minimum edit distance for the finals and a preset final weight, and the product of the minimum edit distance for the tones and a preset tone weight to obtain the minimum edit distance between the pinyin of the user identifier and the pinyin string.
12. The device according to any one of claims 7-11, characterized in that the device further comprises:
a display module, configured to display the at least one user identifier to the user;
further, the device also comprises:
a receiving module, configured to receive a target user identifier selected by the user from the at least one user identifier;
a processing module, configured to perform corresponding processing according to the target user identifier.
CN201610647539.5A 2016-08-09 2016-08-09 Voice recognition processing method and device Pending CN106297799A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610647539.5A CN106297799A (en) 2016-08-09 2016-08-09 Voice recognition processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610647539.5A CN106297799A (en) 2016-08-09 2016-08-09 Voice recognition processing method and device

Publications (1)

Publication Number Publication Date
CN106297799A true CN106297799A (en) 2017-01-04

Family

ID=57666992

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610647539.5A Pending CN106297799A (en) 2016-08-09 2016-08-09 Voice recognition processing method and device

Country Status (1)

Country Link
CN (1) CN106297799A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170104