CN103903617A - Voice recognition method and electronic device - Google Patents

Publication number
CN103903617A
Authority
CN
China
Prior art keywords
recognition result
identification engine
identified
unit
conditioned
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210568770.7A
Other languages
Chinese (zh)
Inventor
戴海生
陆游龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Application filed by Lenovo Beijing Ltd
Priority to CN201210568770.7A
Publication of CN103903617A
Legal status: Pending

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention provides a voice recognition method and an electronic device, applied to a voice recognition system comprising at least a first recognition engine and a second recognition engine. The method comprises: obtaining voice information to be recognized; based on the voice information to be recognized, obtaining at least one voice unit to be recognized, including at least a first voice unit to be recognized; and recognizing the first voice unit to be recognized based on the first recognition engine and the second recognition engine to obtain a first recognition result.

Description

Voice recognition method and electronic device
Technical field
The present application belongs to the field of voice recognition technology, and specifically relates to a voice recognition method and an electronic device.
Background
Voice recognition technology uses an electronic device to recognize a voice instruction issued by a user and then perform the corresponding operation, so that the user no longer needs to control the device manually. It can be applied not only to voice dialing, voice navigation, and dictation data entry, but also to voice-based retrieval.
At present, a voice recognition system for a common large vocabulary, for example a vocabulary containing millions of film titles, music titles, and place names, does not distinguish between these words during recognition and search. Instead, it searches through them one by one in a general recognition engine that contains all of them.
In the course of implementing the technical solutions of the embodiments of the present application, the inventors found at least the following technical problems in the prior art:
Because the general recognition engine holds a large amount of data, and different words sound similar, recognition and search often return results that do not meet the actual need, so the recognition rate is low. For example, when the user issues the voice command "search Journey to the West", the recognized results include many unrelated results such as "Journey to the West play" and "grapefruit note";
In addition, because the prior art uses a general recognition engine to search among millions of words, recognition is time-consuming;
Furthermore, the low recognition rate and the long recognition time result in a poor user experience.
Summary of the invention
Embodiments of the present invention provide a voice recognition method and an electronic device to solve the technical problem of the low recognition rate in the prior art, achieving a substantially higher recognition rate while ensuring that recognition still covers all words.
A voice recognition method, applied to an electronic device having a voice recognition system that comprises at least a first recognition engine and a second recognition engine, the method comprising:
Obtaining voice information to be recognized;
Based on the voice information to be recognized, obtaining at least one voice unit to be recognized, including at least a first voice unit to be recognized;
Recognizing the first voice unit to be recognized based on the first recognition engine and the second recognition engine to obtain a first recognition result.
Further, the second recognition engine is specifically:
A recognition engine of a first kind, which contains second content obtained by filtering the first content of the first recognition engine based on a preset rule; or
A recognition engine of a second kind, which has third content different from the first content of the first recognition engine.
Further, when the second recognition engine is of the first kind, recognizing the first voice unit to be recognized based on the first recognition engine or the second recognition engine to obtain the first recognition result specifically comprises:
Recognizing the first voice unit to be recognized based on the second recognition engine of the first kind to obtain a second recognition result;
Judging whether the second recognition result satisfies a first preset condition;
When the second recognition result satisfies the first preset condition, outputting the second recognition result as the first recognition result.
Further, after judging whether the second recognition result satisfies the first preset condition, the method further comprises:
When the second recognition result does not satisfy the first preset condition, recognizing the first voice unit to be recognized based on the first recognition engine to obtain the first recognition result;
Outputting the first recognition result.
Further, when the second recognition engine is of the first kind, recognizing the first voice unit to be recognized based on the first recognition engine or the second recognition engine to obtain the first recognition result specifically comprises:
Recognizing the first voice unit to be recognized based on the second recognition engine of the first kind to obtain a third recognition result;
Recognizing the first voice unit to be recognized based on the first recognition engine to obtain a fourth recognition result;
Judging whether the third recognition result or the fourth recognition result satisfies a second preset condition;
When the third recognition result or the fourth recognition result satisfies the second preset condition, outputting the third recognition result or the fourth recognition result as the first recognition result.
Further, when the second recognition engine is of the second kind, recognizing the first voice unit to be recognized based on the first recognition engine or the second recognition engine to obtain the first recognition result specifically comprises:
Recognizing the first voice unit to be recognized based on the first recognition engine to obtain a fifth recognition result;
Recognizing the first voice unit to be recognized based on the second recognition engine to obtain a sixth recognition result;
Judging whether the fifth recognition result and the sixth recognition result satisfy a third preset condition;
When the fifth recognition result satisfies the third preset condition and the sixth recognition result does not, outputting the fifth recognition result as the first recognition result.
Further, after judging whether the fifth recognition result and the sixth recognition result satisfy the third preset condition, the method further comprises:
When the fifth recognition result does not satisfy the third preset condition and the sixth recognition result does, outputting the sixth recognition result as the first recognition result.
Further, after judging whether the fifth recognition result and the sixth recognition result satisfy the third preset condition, the method further comprises:
When the fifth recognition result and the sixth recognition result both satisfy the third preset condition, outputting the fifth recognition result or the sixth recognition result as the first recognition result.
An electronic device having a voice recognition system that comprises at least a first recognition engine and a second recognition engine, the electronic device comprising:
A first obtaining unit, configured to obtain voice information to be recognized;
A second obtaining unit, configured to obtain, based on the voice information to be recognized, at least one voice unit to be recognized, including at least a first voice unit to be recognized;
A recognition unit, configured to recognize the first voice unit to be recognized based on the first recognition engine and the second recognition engine to obtain a first recognition result.
Further, the second recognition engine is specifically:
A recognition engine of a first kind, which contains second content obtained by filtering the first content of the first recognition engine based on a preset rule; or
A recognition engine of a second kind, which has third content different from the first content of the first recognition engine.
Further, when the second recognition engine is of the first kind, the recognition unit specifically comprises:
A first recognition subunit, configured to recognize the first voice unit to be recognized based on the second recognition engine of the first kind to obtain a second recognition result;
A first judgment subunit, configured to judge whether the second recognition result satisfies a first preset condition;
A first output subunit, configured to output the second recognition result as the first recognition result when the second recognition result satisfies the first preset condition.
Further, the recognition unit also comprises:
A second recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine to obtain the first recognition result when the second recognition result does not satisfy the first preset condition;
A second output subunit, configured to output the first recognition result.
Further, when the second recognition engine is of the first kind, the recognition unit specifically comprises:
A third recognition subunit, configured to recognize the first voice unit to be recognized based on the second recognition engine of the first kind to obtain a third recognition result;
A fourth recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine to obtain a fourth recognition result;
A second judgment subunit, configured to judge whether the third recognition result or the fourth recognition result satisfies a second preset condition;
A third output subunit, configured to output the third recognition result or the fourth recognition result as the first recognition result when the third recognition result or the fourth recognition result satisfies the second preset condition.
Further, when the second recognition engine is of the second kind, the recognition unit specifically comprises:
A fifth recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine to obtain a fifth recognition result;
A sixth recognition subunit, configured to recognize the first voice unit to be recognized based on the second recognition engine to obtain a sixth recognition result;
A third judgment subunit, configured to judge whether the fifth recognition result and the sixth recognition result satisfy a third preset condition;
A fourth output subunit, configured to output the fifth recognition result as the first recognition result when the fifth recognition result satisfies the third preset condition and the sixth recognition result does not.
Further, the recognition unit also comprises:
A fifth output subunit, configured to output the sixth recognition result as the first recognition result when the fifth recognition result does not satisfy the third preset condition and the sixth recognition result does.
Further, the recognition unit also comprises:
A sixth output subunit, configured to output the fifth recognition result or the sixth recognition result as the first recognition result when the fifth recognition result and the sixth recognition result both satisfy the third preset condition.
One or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
By using at least two recognition engines in the voice recognition system, namely the first recognition engine and the second recognition engine, to recognize the obtained voice units to be recognized, including at least the first voice unit to be recognized, and obtain the first recognition result, the technical problem of the low recognition rate in the prior art is solved, achieving a substantially higher recognition rate while ensuring that recognition still covers all words;
In addition, by making the second recognition engine an engine of the second kind, whose content differs from that of the first recognition engine, the general recognition engine of the prior art can be divided into several recognition engines, and the voice information to be recognized can be searched in multiple engines holding different content. This solves the prior-art problem that searching among millions of words with a general recognition engine is time-consuming, reduces search time, and improves the efficiency of recognition and search.
Furthermore, because the recognition rate is improved and search time is reduced, the user experience is improved.
Brief description of the drawings
Fig. 1 is a flowchart of the voice recognition method in one embodiment of the present invention;
Fig. 2 is a structural diagram of the second recognition engine in one embodiment of the present invention;
Fig. 3 is a structural diagram of the electronic device in one embodiment of the present invention.
Detailed description
Embodiments of the present invention provide a voice recognition method and an electronic device to solve the technical problem of the low recognition rate in the prior art, achieving a substantially higher recognition rate while ensuring that recognition still covers all words.
To solve the above problem, the general idea of the technical solution in the embodiments of the present invention is as follows:
The present invention obtains voice information to be recognized; obtains, based on that voice information, at least one voice unit to be recognized, including at least a first voice unit to be recognized; and recognizes the first voice unit to be recognized based on the first recognition engine and the second recognition engine to obtain a first recognition result, thereby solving the technical problem of the low recognition rate in the prior art.
For a better understanding of the above technical solution, it is described in detail below with reference to the accompanying drawings and specific embodiments.
One embodiment of the present invention provides a voice recognition method applied to an electronic device having a voice recognition system. The voice recognition system comprises at least a first recognition engine and a second recognition engine, and may further comprise additional recognition engines such as a third recognition engine and a fourth recognition engine. The voice recognition system can be used to search for and recognize voice information.
As shown in Fig. 1, the voice recognition method comprises the following steps:
S101: obtain voice information to be recognized.
S102: based on the voice information to be recognized, obtain at least one voice unit to be recognized, including at least a first voice unit to be recognized.
In a specific embodiment, the user issues a voice instruction to the electronic device. The voice instruction contains the voice information to be recognized, which in turn contains at least one voice unit to be recognized. For example, when the user issues the voice instruction "search film Journey to the West", the voice information to be recognized contains three voice units to be recognized: the first voice unit "Journey to the West", the second voice unit "search", and the third voice unit "film". In addition, in the embodiments of the present application, to improve recognition, the electronic device may, after receiving the voice information, convert it into a corresponding voice signal and apply front-end processing to that signal. This eliminates the influence of noise and of differences between speakers, so that the processed signal better reflects the essential characteristics of the speech. In the embodiments of the present application, the most commonly used front-end processing techniques are endpoint detection and speech enhancement. Of course, those of ordinary skill in the art may also use other front-end processing techniques.
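The patent names endpoint detection as a front-end step but does not say how it is done. As a rough illustration only, the following sketch assumes a simple short-time-energy approach; the function name `detect_endpoints`, the frame length, and the threshold are all hypothetical and are not taken from the patent.

```python
def detect_endpoints(samples, frame_len, energy_threshold):
    """Return (start, end) sample indices spanning all frames whose mean
    energy exceeds the threshold, or None if no frame does.

    Assumption: short-time energy is used as the speech/non-speech cue.
    The patent does not prescribe a particular endpoint detector.
    """
    active_starts = []
    # Walk over non-overlapping frames of frame_len samples.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy > energy_threshold:
            active_starts.append(start)
    if not active_starts:
        return None
    # The end index is exclusive: one past the last active frame.
    return active_starts[0], active_starts[-1] + frame_len


# Synthetic signal: silence, a constant-amplitude burst, then silence.
signal = [0.0] * 40 + [0.5] * 80 + [0.0] * 40
endpoints = detect_endpoints(signal, frame_len=20, energy_threshold=0.01)
```

A real front end would typically add smoothing and hangover logic so that short pauses inside a word are not cut; that is omitted here for brevity.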
After at least one voice unit to be recognized, including at least the first voice unit to be recognized, has been obtained in S102, S103 is performed: recognize the first voice unit to be recognized based on the first recognition engine and the second recognition engine to obtain the first recognition result.
As shown in Fig. 2, the second recognition engine may be one of two kinds of recognition engine:
The first kind of recognition engine 201: a recognition engine that contains second content obtained by filtering the first content of the first recognition engine based on a preset rule. The second content is contained in the first content.
In a specific embodiment, engine 201 can be built according to any of several preset rules.
Rule one: popularity with users. For example, the first content of the first recognition engine is filtered, the words that are popular with users, that is, the words users use frequently, are collected into the second content, and engine 201 is built from that second content.
Rule two: release time. For example, the first content of the first recognition engine is filtered, recently released words, such as the newest words of this week or this month, are collected into the second content, and engine 201 is built from that second content.
In addition, the resulting engine 201 can be specially customized, for example based on grammar rules, a specially trained acoustic model, or grammar weights. The preset rules and customization rules are not limited to those mentioned above; those of ordinary skill in the art may use other rules as needed.
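The two filtering rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Entry` record, its `use_count` and `released` fields, the thresholds, and the sample data are all hypothetical, since the patent does not specify how popularity or release time is stored.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Entry:
    text: str        # a vocabulary word, e.g. a film title
    use_count: int   # hypothetical popularity counter
    released: date   # hypothetical release date


def filter_by_popularity(first_content, min_uses):
    """Rule one: keep the words that users use frequently."""
    return [e for e in first_content if e.use_count >= min_uses]


def filter_by_release(first_content, today, window_days):
    """Rule two: keep the words released within the last window_days days."""
    cutoff = today - timedelta(days=window_days)
    return [e for e in first_content if e.released >= cutoff]


# Made-up sample data standing in for the first content.
first_content = [
    Entry("Journey to the West", 120, date(2012, 1, 5)),
    Entry("Old Film", 2, date(1990, 6, 1)),
    Entry("New Release", 1, date(2012, 12, 20)),
]
popular = filter_by_popularity(first_content, min_uses=50)
recent = filter_by_release(first_content, date(2012, 12, 25), window_days=30)
```

Either filtered list is a subset of the first content, which matches the statement that the second content is contained in the first content.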
As can be seen from the above, by using at least two recognition engines, the first recognition engine and the second recognition engine, to recognize the obtained voice units to be recognized, including at least the first voice unit to be recognized, the technical problem of the low recognition rate in the prior art is solved and the recognition rate is substantially increased.
The second kind of recognition engine 202 in the embodiments of the present application: a recognition engine having third content different from the first content of the first recognition engine.
In a specific embodiment, the general recognition engine of the prior art can be divided into multiple recognition engines holding different content. For example, the first recognition engine contains the first content, and the second recognition engine contains third content different from the first content. Besides the first recognition engine and engine 202, the division may also produce other recognition engines, such as a third and a fourth recognition engine, which are not enumerated one by one here.
As can be seen from the above, by building multiple recognition engines holding different content and using them to search for and recognize the voice information to be recognized, the prior-art problem that searching a vocabulary of millions of words with a general recognition engine is time-consuming is solved, search time is reduced, and the efficiency of recognition and search is improved.
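One way to picture the division of a general engine into engines with different content is to partition its vocabulary. The sketch below assumes the split is made by word category (film, music, place), which echoes the vocabulary types mentioned in the background but is not stated in the patent as the splitting criterion; the function name and sample entries are hypothetical.

```python
from collections import defaultdict


def split_vocabulary(entries):
    """Partition (word, category) pairs into one word set per category.

    Assumption: each category becomes the content of one recognition engine.
    """
    engines = defaultdict(set)
    for word, category in entries:
        engines[category].add(word)
    return dict(engines)


# Made-up general vocabulary.
general_vocabulary = [
    ("Journey to the West", "film"),
    ("Beijing", "place"),
    ("Some Song", "music"),
    ("Shanghai", "place"),
]
engines = split_vocabulary(general_vocabulary)
```

Because each word lands in exactly one set here, every engine searches a smaller vocabulary than the general engine would, which is the source of the claimed reduction in search time.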
In the embodiments of the present application, when the second recognition engine is the first kind of recognition engine 201, step S103 has two specific implementations:
Mode one:
Recognize the first voice unit to be recognized based on engine 201 to obtain a second recognition result;
Judge whether the second recognition result satisfies a first preset condition;
When the second recognition result satisfies the first preset condition, output the second recognition result as the first recognition result.
In a specific embodiment, when the second recognition engine is engine 201, the first voice unit to be recognized is recognized as follows. First, the characteristic parameters of the voice unit are extracted. Next, the characteristic parameters are dynamically compared with each speech model corresponding to the second content in engine 201 to obtain a recognition result. It is then judged whether the recognition result satisfies a preset confidence, where the confidence characterizes how reliable the obtained recognition result is. The preset confidence may be predefined by the system or set by the user as needed. When the recognition result satisfies the preset confidence, the second recognition result is output as the final recognition result. The way a recognition engine recognizes a voice unit is not limited to the above; those of ordinary skill in the art may adopt other approaches.
After judging whether the second recognition result satisfies the first preset condition, the method in the embodiments of the present application further comprises:
When the second recognition result does not satisfy the first preset condition, recognize the first voice unit to be recognized based on the first recognition engine to obtain the first recognition result;
Output the first recognition result.
Continuing the confidence example above: when the second recognition result does not satisfy the preset confidence, the first recognition engine is used to recognize the first voice unit to be recognized, and the resulting first recognition result is output as the final recognition result.
As can be seen from the above, when engine 201 does not produce a satisfactory recognition result, the first recognition engine is used to recognize the first voice unit to be recognized, so that recognition still covers all words.
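Mode one amounts to a cascade: try the small filtered engine 201 first and fall back to the full first recognition engine only when the confidence is too low. The sketch below is a toy under stated assumptions: text strings stand in for acoustic features, string similarity from `difflib` stands in for the comparison against speech models, and `make_toy_engine`, `recognize_cascade`, and the threshold value are hypothetical names and numbers.

```python
import difflib
from typing import NamedTuple


class Result(NamedTuple):
    text: str
    confidence: float


def make_toy_engine(vocabulary):
    """Build a toy engine that returns the most similar vocabulary word.

    Assumption: string similarity replaces real acoustic model matching.
    """
    def recognize(unit):
        def score(word):
            return difflib.SequenceMatcher(None, unit, word).ratio()
        best = max(vocabulary, key=score)
        return Result(best, score(best))
    return recognize


def recognize_cascade(unit, engine_201, first_engine, threshold):
    """Mode one: use engine 201, fall back to the first engine if needed."""
    second_result = engine_201(unit)
    if second_result.confidence >= threshold:
        return second_result        # first preset condition satisfied
    return first_engine(unit)       # full engine covers all words


engine_201 = make_toy_engine(["Journey to the West"])
first_engine = make_toy_engine(
    ["Journey to the West", "Dream of the Red Chamber", "Water Margin"]
)
hit = recognize_cascade("Journey to the West", engine_201, first_engine, 0.8)
fallback = recognize_cascade("Water Margin", engine_201, first_engine, 0.8)
```

The first call is answered by engine 201 alone; the second falls through to the first engine because engine 201 has no close match.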
The second implementation of step S103, mode two, is as follows:
Recognize the first voice unit to be recognized based on engine 201 to obtain a third recognition result;
Recognize the first voice unit to be recognized based on the first recognition engine to obtain a fourth recognition result;
Judge whether the third recognition result or the fourth recognition result satisfies a second preset condition;
When the third recognition result or the fourth recognition result satisfies the second preset condition, output the third recognition result or the fourth recognition result as the first recognition result.
Continuing the confidence example above: when the second recognition engine is engine 201, the first voice unit to be recognized is recognized as follows. First, the characteristic parameters of the voice unit are extracted. Next, they are dynamically compared both with each speech model corresponding to the second content in engine 201 and with each speech model corresponding to the first content in the first recognition engine, yielding a third recognition result from engine 201 and a fourth recognition result from the first recognition engine. It is then judged whether the third recognition result satisfies the preset confidence. If it does, the third recognition result is output as the final recognition result; if it does not, the fourth recognition result is output as the final recognition result.
The first recognition engine corresponds to a first recognition set. When the second recognition engine is engine 201, it corresponds to a second recognition set, and the second recognition set is a subset of the first recognition set.
In the step of recognizing the first voice unit to be recognized based on the first recognition engine and the second recognition engine to obtain the first recognition result, the second recognition engine may be given a higher priority than the first recognition engine: the at least one voice unit to be recognized in the voice information is first recognized with the second recognition engine, and only if that recognition does not satisfy the voice matching condition is it recognized with the first recognition engine. Alternatively, the at least one voice unit to be recognized may be recognized by the second recognition engine and the first recognition engine simultaneously. If the second recognition engine's match satisfies the voice matching condition, the recognition result is output (that is, this recognition is complete). If it does not, then because both engines have been matching at the same time, the first recognition engine has already started matching the voice unit before the second recognition engine's failure is known, which improves the efficiency of voice recognition.
In addition, when mode two is used, engine 201 and the first recognition engine recognize the first voice unit to be recognized simultaneously. Even if engine 201 finishes without a satisfactory recognition result, the first recognition engine has by then already progressed to a certain stage, which saves recognition time.
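Mode two can be sketched by running both engines concurrently and preferring the result of engine 201 when it is confident enough. This is an illustration only: the thread pool, the lookup-table engines, the string keys standing in for audio, and the threshold are assumptions, not details from the patent.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import NamedTuple


class Result(NamedTuple):
    text: str
    confidence: float


def make_lookup_engine(table):
    """Toy engine: a fixed table from input key to Result (hypothetical)."""
    def recognize(unit):
        return table.get(unit, Result("", 0.0))
    return recognize


def recognize_parallel(unit, engine_201, first_engine, threshold):
    """Mode two: start both engines at once, prefer engine 201's result."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        third_future = pool.submit(engine_201, unit)
        fourth_future = pool.submit(first_engine, unit)
        third_result = third_future.result()
        if third_result.confidence >= threshold:
            # Leaving the with-block still waits for the other task to end.
            return third_result
        # The first engine has been running in the meantime.
        return fourth_future.result()


engine_201 = make_lookup_engine({"clip-a": Result("Journey to the West", 0.9)})
first_engine = make_lookup_engine({
    "clip-a": Result("Journey to the West", 0.7),
    "clip-b": Result("Water Margin", 0.85),
})
from_201 = recognize_parallel("clip-a", engine_201, first_engine, 0.8)
from_first = recognize_parallel("clip-b", engine_201, first_engine, 0.8)
```

Compared with the cascade of mode one, the fallback result is usually available sooner because the first engine does not wait for engine 201 to fail, at the cost of always running both engines.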
Further, in the time that described the second identification engine is specially described the second identification engine 202, S102 specific implementation process is:
Based on described the first identification engine, described the first voice unit to be identified is identified, obtain the 5th recognition result;
Based on described the second identification engine, described the first voice unit to be identified is identified, obtain the 6th recognition result;
Judge that whether described the 5th recognition result and described the 6th recognition result meet the 3rd pre-conditioned;
Meet described the 3rd pre-conditioned and described the 6th recognition result at described the 5th recognition result and do not meet the described the 3rd when pre-conditioned, export described the 5th recognition result as described the first recognition result.
Do not meet described the 3rd pre-conditioned and described the 6th recognition result at described the 5th recognition result and meet the described the 3rd when pre-conditioned, export described the 6th recognition result as described the first recognition result.
All meet the described the 3rd when pre-conditioned at described the 5th recognition result and described the 6th recognition result, export described the 5th recognition result or described the 6th recognition result as described the first recognition result.
Continuing the confidence example above: when the second recognition engine is specifically the second-type recognition engine 202, it includes third content that differs from the first content in the first recognition engine. First, the characteristic parameters of the voice unit to be recognized are dynamically compared with each speech model of the third content in the second-type recognition engine 202 and with each speech model of the first content in the first recognition engine, yielding a fifth recognition result from the first recognition engine and a sixth recognition result from the second-type recognition engine 202. Then, it is judged whether the fifth recognition result and the sixth recognition result meet the preset confidence. When the fifth recognition result meets the preset confidence and the sixth does not, only the fifth recognition result is output as the final recognition result. When the fifth recognition result does not meet the preset confidence and the sixth does, only the sixth recognition result is output as the final recognition result. When both meet the preset confidence, the fifth and sixth recognition results are output together as the final recognition result.
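The three-case selection described above can be sketched as a small function. The `Result` type, the numeric confidence scale, and the `threshold` parameter are assumptions for illustration; the text does not say what happens when neither result meets the preset confidence, so the sketch returns an empty list in that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    text: str
    confidence: float  # assumed scale: 0.0 (unreliable) to 1.0 (reliable)


def select_results(fifth: Result, sixth: Result, threshold: float) -> List[Result]:
    """Pick the final result(s) from the fifth and sixth recognition results."""
    fifth_ok = fifth.confidence >= threshold
    sixth_ok = sixth.confidence >= threshold
    if fifth_ok and not sixth_ok:
        return [fifth]          # only the fifth result is output
    if sixth_ok and not fifth_ok:
        return [sixth]          # only the sixth result is output
    if fifth_ok and sixth_ok:
        return [fifth, sixth]   # both are output together
    return []                   # neither qualifies: not specified in the text
```

Returning a list keeps the "both meet the preset confidence" case explicit; a caller that needs a single answer could, for example, take the entry with the higher confidence.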
Another embodiment of the present invention provides an electronic device that includes a speech recognition system comprising at least a first recognition engine and a second recognition engine. As shown in Figure 3, the electronic device comprises:
A first obtaining unit 301, configured to obtain voice information to be recognized;
A second obtaining unit 302, configured to obtain, based on the voice information to be recognized, at least one voice unit to be recognized that includes at least a first voice unit to be recognized.
In a specific embodiment, the user sends a voice instruction to the electronic device. The voice instruction contains the voice information to be recognized, which in turn contains at least one voice unit to be recognized. For example, when the user sends the voice instruction "search for the film Journey to the West" to the electronic device, the voice information to be recognized, "search for the film Journey to the West", contains three voice units to be recognized: the first voice unit to be recognized, "Journey to the West"; the second, "search"; and the third, "film". In addition, in the embodiments of the present application, to improve the recognition effect, the electronic device may convert the received voice information into a corresponding voice signal and apply front-end processing to it. This removes the influence of noise and of differences between speakers, so that the processed signal better reflects the essential characteristics of the speech. The most commonly used front-end processing techniques are endpoint detection and speech enhancement. Of course, those of ordinary skill in the art can also use other front-end processing techniques.
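As an illustration of endpoint detection, the following sketch marks the speech region of a signal by short-time energy. It is a generic textbook approach chosen for the example, not a method the application specifies; the frame length and energy threshold are arbitrary assumptions.

```python
from typing import List, Optional, Tuple


def detect_endpoints(samples: List[float], frame_len: int = 4,
                     energy_threshold: float = 0.01) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the speech region, or None.

    A frame counts as speech when its mean squared amplitude exceeds
    energy_threshold; the region spans the first to the last such frame.
    """
    active_starts = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / len(frame)
        if energy > energy_threshold:
            active_starts.append(start)
    if not active_starts:
        return None
    end = min(active_starts[-1] + frame_len, len(samples))
    return active_starts[0], end
```

Trimming leading and trailing silence this way reduces the amount of signal the recognition engines have to compare against their speech models.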
In the embodiments of the present application, the electronic device further comprises:
A recognition unit 303, configured to recognize the first voice unit to be recognized based on the first recognition engine and the second recognition engine, to obtain a first recognition result.
The second obtaining unit 302 is connected to the first obtaining unit 301, and the recognition unit 303 is connected to the second obtaining unit 302.
The second recognition engine can be either of two types of recognition engine:
The first-type recognition engine 201: a recognition engine that includes second content, obtained by screening the first content in the first recognition engine based on preset rules. The second content is contained in the first content.
In a specific embodiment, several preset rules can be used to build the first-type recognition engine 201.
Rule one: user usage popularity. For example, the first content in the first recognition engine is screened, the words with high usage popularity (that is, the words the user uses frequently) are collected into the second content, and the first-type recognition engine 201 is built on the second content.
Rule two: release time. For example, the first content in the first recognition engine is screened, recently released words, such as the newest words of this week or this month, are collected into the second content, and the first-type recognition engine 201 is built on the second content.
In addition, the first-type recognition engine 201 can also be specially customized, for example based on grammar rules, on a specially trained acoustic model, or on grammar weights. The preset rules and customization rules are not limited to those mentioned above; those of ordinary skill in the art can use other rules as actually needed.
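The two screening rules can be sketched as filters over the first content. The `Entry` record, its fields, and the cut-off parameters are hypothetical; the application does not state how usage popularity or release time is stored.

```python
from datetime import date, timedelta
from typing import Dict, NamedTuple, Set


class Entry(NamedTuple):
    use_count: int   # how often the user has used the word (assumed metric)
    released: date   # when the word was released (assumed metadata)


def screen_by_popularity(first_content: Dict[str, Entry], min_uses: int) -> Set[str]:
    """Rule one: keep the words the user uses frequently."""
    return {word for word, entry in first_content.items()
            if entry.use_count >= min_uses}


def screen_by_release(first_content: Dict[str, Entry], today: date,
                      window_days: int) -> Set[str]:
    """Rule two: keep the words released within the last window_days days."""
    cutoff = today - timedelta(days=window_days)
    return {word for word, entry in first_content.items()
            if entry.released >= cutoff}
```

Either filter yields a subset of the first content, which matches the statement that the second content is contained in the first content.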
As can be seen from the above, the obtained at least one voice unit to be recognized, which includes at least the first voice unit to be recognized, is recognized by at least two recognition engines, namely the first recognition engine and the second recognition engine. This solves the technical problem of the low recognition rate in the prior art and substantially increases the recognition rate.
In the embodiments of the present application, the second-type recognition engine 202 is a recognition engine having third content that differs from the first content in the first recognition engine.
In a specific embodiment, a general-purpose recognition engine of the prior art can be split into multiple recognition engines with different content. For example, the first recognition engine includes the first content, and the second recognition engine includes third content that differs from the first content. Besides the first recognition engine and the second-type recognition engine 202, the split can also yield further recognition engines, such as a third or a fourth recognition engine, which are not enumerated one by one in this application.
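Splitting a general-purpose vocabulary into engines with different content can be sketched as a partition by some key. The key function and the category labels below are assumptions for illustration; the application does not say by what criterion the content is divided.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set


def split_content(words: Iterable[str],
                  key: Callable[[str], str]) -> Dict[str, Set[str]]:
    """Partition words into disjoint groups, one group per engine."""
    parts: Dict[str, Set[str]] = defaultdict(set)
    for word in words:
        parts[key(word)].add(word)
    return dict(parts)


# Hypothetical category labels used as the partition key.
categories = {
    "Journey to the West": "film titles",
    "search": "commands",
    "film": "commands",
}
engine_contents = split_content(categories, lambda word: categories[word])
```

Because every word lands in exactly one group, each engine searches a smaller vocabulary, and the groups together still cover the whole original content.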
Further, when the second recognition engine is specifically the first-type recognition engine 201, the recognition unit can be implemented in two modes.
In mode one, the recognition unit 303 specifically comprises:
A first recognition subunit, configured to recognize the first voice unit to be recognized based on the first-type recognition engine 201, to obtain a second recognition result;
A first judging subunit, configured to judge whether the second recognition result meets a first preset condition;
A first output subunit, configured to output the second recognition result as the first recognition result when the second recognition result meets the first preset condition;
A second recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine, to obtain the first recognition result, when the second recognition result does not meet the first preset condition;
A second output subunit, configured to output the first recognition result.
In a specific embodiment, when the second recognition engine is specifically the first-type recognition engine 201, the recognition search for the first voice unit to be recognized proceeds as follows. First, the characteristic parameters of the voice unit to be recognized are extracted. Then, the characteristic parameters are dynamically compared with each speech model of the second content in the first-type recognition engine 201 to obtain a recognition result (the second recognition result). It is then judged whether this recognition result meets a preset confidence. The confidence characterizes how reliable the obtained recognition result is; the preset confidence is predefined by the system and can also be set by the user as needed. When the recognition result meets the preset confidence, the second recognition result is output as the final recognition result. The way a recognition engine recognizes the voice unit to be recognized is not limited to the one described above; those of ordinary skill in the art can adopt other ways.
In addition, when the second recognition result does not meet the preset confidence, the first recognition engine is used to recognize the first voice unit to be recognized, and the resulting first recognition result is output as the final recognition result.
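Mode one amounts to a cascade: try the smaller engine first and consult the full engine only on failure. A minimal sketch follows, assuming a hypothetical `Engine` interface that returns a text and a confidence; the toy engines and the `calls` log exist only to show which engine ran.

```python
from typing import Callable, List, Tuple

# Hypothetical engine interface: voice unit in, (text, confidence) out.
Engine = Callable[[str], Tuple[str, float]]


def recognize_with_fallback(unit: str, first_type_engine: Engine,
                            first_engine: Engine,
                            preset_confidence: float = 0.8) -> str:
    """Return the second recognition result if it is confident enough,
    otherwise the first recognition result from the full engine."""
    second_result, confidence = first_type_engine(unit)
    if confidence >= preset_confidence:
        return second_result
    first_result, _ = first_engine(unit)
    return first_result


calls: List[str] = []  # records which toy engine was invoked


def toy_first_type_engine(unit: str) -> Tuple[str, float]:
    calls.append("201")
    frequent_words = {"search", "film"}  # stands in for the second content
    return (unit, 0.95) if unit in frequent_words else ("", 0.0)


def toy_first_engine(unit: str) -> Tuple[str, float]:
    calls.append("first")
    return unit, 0.85
```

For a frequently used word only the smaller engine runs; the full engine is invoked only when the smaller engine's confidence falls below the preset value.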
In mode two, the recognition unit 303 specifically comprises:
A third recognition subunit, configured to recognize the first voice unit to be recognized based on the first-type recognition engine 201, to obtain a third recognition result;
A fourth recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine, to obtain a fourth recognition result;
A second judging subunit, configured to judge whether the third recognition result or the fourth recognition result meets a second preset condition;
A third output subunit, configured to output the third recognition result or the fourth recognition result as the first recognition result when the third recognition result or the fourth recognition result meets the second preset condition.
Continuing the confidence example above: when the second recognition engine is specifically the first-type recognition engine 201, the recognition search for the first voice unit to be recognized proceeds as follows. First, the characteristic parameters of the voice unit to be recognized are extracted. Then, the characteristic parameters are dynamically compared with each speech model of the second content in the first-type recognition engine 201 and with each speech model of the first content in the first recognition engine, yielding a third recognition result from the first-type recognition engine 201 and a fourth recognition result from the first recognition engine. Then, it is judged whether the third recognition result meets the preset confidence. When the third recognition result meets the preset confidence, the third recognition result is output as the final recognition result. When the third recognition result does not meet the preset confidence, the fourth recognition result is output as the final recognition result.
Further, when the second recognition engine is specifically the second-type recognition engine 202, the recognition unit specifically comprises:
A fifth recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine, to obtain a fifth recognition result;
A sixth recognition subunit, configured to recognize the first voice unit to be recognized based on the second recognition engine, to obtain a sixth recognition result;
A third judging subunit, configured to judge whether the fifth recognition result and the sixth recognition result meet a third preset condition;
A fourth output subunit, configured to output the fifth recognition result as the first recognition result when the fifth recognition result meets the third preset condition and the sixth recognition result does not;
A fifth output subunit, configured to output the sixth recognition result as the first recognition result when the fifth recognition result does not meet the third preset condition and the sixth recognition result does;
A sixth output subunit, configured to output the fifth recognition result or the sixth recognition result as the first recognition result when the fifth recognition result and the sixth recognition result both meet the third preset condition.
Continuing the confidence example above: when the second recognition engine is specifically the second-type recognition engine 202, it includes third content that differs from the first content in the first recognition engine. First, the characteristic parameters of the voice unit to be recognized are dynamically compared with each speech model of the third content in the second-type recognition engine 202 and with each speech model of the first content in the first recognition engine, yielding a fifth recognition result from the first recognition engine and a sixth recognition result from the second-type recognition engine 202. Then, it is judged whether the fifth recognition result and the sixth recognition result meet the preset confidence. When the fifth recognition result meets the preset confidence and the sixth does not, only the fifth recognition result is output as the final recognition result. When the fifth recognition result does not meet the preset confidence and the sixth does, only the sixth recognition result is output as the final recognition result. When both meet the preset confidence, the fifth and sixth recognition results are output together as the final recognition result.
Because the electronic device described in this embodiment is the device used to implement the information processing method of the embodiments of the present application, those skilled in the art can, based on that method, understand the specific implementations of the electronic device and its variations, so the electronic device is not described in further detail here. Any electronic device that those skilled in the art use to implement the information processing method of the embodiments of the present application falls within the scope of protection sought by the present application.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
By using at least two recognition engines in the speech recognition system, namely the first recognition engine and the second recognition engine, to recognize the obtained at least one voice unit to be recognized, which includes at least the first voice unit to be recognized, and obtain the first recognition result, the technical problem of the low recognition rate in the prior art is solved. The recognition rate is substantially increased, and recognition still covers all words;
Further, by making the second recognition engine specifically a second-type recognition engine whose content differs from that of the first recognition engine, a general-purpose recognition engine of the prior art can be split into several recognition engines, and the voice information to be recognized is searched in multiple recognition engines with different content. This solves the prior-art technical problem that a general-purpose recognition engine must search among millions of words and therefore takes a long time; search time is reduced and the efficiency of the recognition search is improved.
Furthermore, because the recognition rate is improved and the search time is reduced, the user experience is improved.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (16)

1. A voice recognition method, characterized in that it is applied in an electronic device having a speech recognition system that comprises at least a first recognition engine and a second recognition engine, the method comprising:
Obtaining voice information to be recognized;
Based on the voice information to be recognized, obtaining at least one voice unit to be recognized that includes at least a first voice unit to be recognized;
Based on the first recognition engine and the second recognition engine, recognizing the first voice unit to be recognized to obtain a first recognition result.
2. The method as claimed in claim 1, characterized in that the second recognition engine is specifically:
A first-type recognition engine that includes second content, obtained by screening the first content in the first recognition engine based on preset rules; or
A second-type recognition engine having third content that differs from the first content in the first recognition engine.
3. The method as claimed in claim 2, characterized in that, when the second recognition engine is specifically the first-type recognition engine, said recognizing the first voice unit to be recognized based on the first recognition engine or the second recognition engine, to obtain the first recognition result, specifically comprises:
Based on the first-type recognition engine, recognizing the first voice unit to be recognized to obtain a second recognition result;
Judging whether the second recognition result meets a first preset condition;
When the second recognition result meets the first preset condition, outputting the second recognition result as the first recognition result.
4. The method as claimed in claim 3, characterized in that, after said judging whether the second recognition result meets the first preset condition, the method further comprises:
When the second recognition result does not meet the first preset condition, recognizing the first voice unit to be recognized based on the first recognition engine, to obtain the first recognition result;
Outputting the first recognition result.
5. The method as claimed in claim 2, characterized in that, when the second recognition engine is specifically the first-type recognition engine, said recognizing the first voice unit to be recognized based on the first recognition engine or the second recognition engine, to obtain the first recognition result, specifically comprises:
Based on the first-type recognition engine, recognizing the first voice unit to be recognized to obtain a third recognition result;
Based on the first recognition engine, recognizing the first voice unit to be recognized to obtain a fourth recognition result;
Judging whether the third recognition result or the fourth recognition result meets a second preset condition;
When the third recognition result or the fourth recognition result meets the second preset condition, outputting the third recognition result or the fourth recognition result as the first recognition result.
6. The method as claimed in claim 2, characterized in that, when the second recognition engine is specifically the second-type recognition engine, said recognizing the first voice unit to be recognized based on the first recognition engine or the second recognition engine, to obtain the first recognition result, specifically comprises:
Based on the first recognition engine, recognizing the first voice unit to be recognized to obtain a fifth recognition result;
Based on the second recognition engine, recognizing the first voice unit to be recognized to obtain a sixth recognition result;
Judging whether the fifth recognition result and the sixth recognition result meet a third preset condition;
When the fifth recognition result meets the third preset condition and the sixth recognition result does not, outputting the fifth recognition result as the first recognition result.
7. The method as claimed in claim 6, characterized in that, after said judging whether the fifth recognition result and the sixth recognition result meet the third preset condition, the method further comprises:
When the fifth recognition result does not meet the third preset condition and the sixth recognition result does, outputting the sixth recognition result as the first recognition result.
8. The method as claimed in claim 6, characterized in that, after said judging whether the fifth recognition result and the sixth recognition result meet the third preset condition, the method further comprises:
When the fifth recognition result and the sixth recognition result both meet the third preset condition, outputting the fifth recognition result or the sixth recognition result as the first recognition result.
9. An electronic device, characterized in that the electronic device includes a speech recognition system comprising at least a first recognition engine and a second recognition engine, the electronic device comprising:
A first obtaining unit, configured to obtain voice information to be recognized;
A second obtaining unit, configured to obtain, based on the voice information to be recognized, at least one voice unit to be recognized that includes at least a first voice unit to be recognized;
A recognition unit, configured to recognize the first voice unit to be recognized based on the first recognition engine and the second recognition engine, to obtain a first recognition result.
10. The electronic device as claimed in claim 9, characterized in that the second recognition engine is specifically:
A first-type recognition engine that includes second content, obtained by screening the first content in the first recognition engine based on preset rules; or
A second-type recognition engine having third content that differs from the first content in the first recognition engine.
11. The electronic device as claimed in claim 10, characterized in that, when the second recognition engine is specifically the first-type recognition engine, the recognition unit specifically comprises:
A first recognition subunit, configured to recognize the first voice unit to be recognized based on the first-type recognition engine, to obtain a second recognition result;
A first judging subunit, configured to judge whether the second recognition result meets a first preset condition;
A first output subunit, configured to output the second recognition result as the first recognition result when the second recognition result meets the first preset condition.
12. The electronic device as claimed in claim 11, characterized in that the recognition unit further comprises:
A second recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine, to obtain the first recognition result, when the second recognition result does not meet the first preset condition;
A second output subunit, configured to output the first recognition result.
13. The electronic device as claimed in claim 10, characterized in that, when the second recognition engine is specifically the first-type recognition engine, the recognition unit specifically comprises:
A third recognition subunit, configured to recognize the first voice unit to be recognized based on the first-type recognition engine, to obtain a third recognition result;
A fourth recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine, to obtain a fourth recognition result;
A second judging subunit, configured to judge whether the third recognition result or the fourth recognition result meets a second preset condition;
A third output subunit, configured to output the third recognition result or the fourth recognition result as the first recognition result when the third recognition result or the fourth recognition result meets the second preset condition.
14. The electronic device as claimed in claim 10, characterized in that, when the second recognition engine is specifically the second-type recognition engine, the recognition unit specifically comprises:
A fifth recognition subunit, configured to recognize the first voice unit to be recognized based on the first recognition engine, to obtain a fifth recognition result;
A sixth recognition subunit, configured to recognize the first voice unit to be recognized based on the second recognition engine, to obtain a sixth recognition result;
A third judging subunit, configured to judge whether the fifth recognition result and the sixth recognition result meet a third preset condition;
A fourth output subunit, configured to output the fifth recognition result as the first recognition result when the fifth recognition result meets the third preset condition and the sixth recognition result does not.
15. The electronic device as claimed in claim 14, characterized in that the recognition unit further comprises:
A fifth output subunit, configured to output the sixth recognition result as the first recognition result when the fifth recognition result does not meet the third preset condition and the sixth recognition result does.
16. The electronic device as claimed in claim 14, characterized in that the recognition unit further comprises:
A sixth output subunit, configured to output the fifth recognition result or the sixth recognition result as the first recognition result when the fifth recognition result and the sixth recognition result both meet the third preset condition.
CN201210568770.7A 2012-12-24 2012-12-24 Voice recognition method and electronic device Pending CN103903617A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210568770.7A CN103903617A (en) 2012-12-24 2012-12-24 Voice recognition method and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210568770.7A CN103903617A (en) 2012-12-24 2012-12-24 Voice recognition method and electronic device

Publications (1)

Publication Number Publication Date
CN103903617A true CN103903617A (en) 2014-07-02

Family

ID=50994899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210568770.7A Pending CN103903617A (en) 2012-12-24 2012-12-24 Voice recognition method and electronic device

Country Status (1)

Country Link
CN (1) CN103903617A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105632487A (en) * 2015-12-31 2016-06-01 北京奇艺世纪科技有限公司 Voice recognition method and device
CN106971712A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 A kind of adaptive rapid voiceprint recognition methods and system
CN106971726A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 A kind of adaptive method for recognizing sound-groove and system based on code book
CN106971735A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 A kind of method and system for regularly updating the Application on Voiceprint Recognition of training sentence in caching
CN106981287A (en) * 2016-01-14 2017-07-25 芋头科技(杭州)有限公司 A kind of method and system for improving Application on Voiceprint Recognition speed
CN109979454A (en) * 2019-03-29 2019-07-05 联想(北京)有限公司 Data processing method and device
CN109979437A (en) * 2019-03-01 2019-07-05 百度在线网络技术(北京)有限公司 Audio recognition method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101071564A (en) * 2006-05-11 2007-11-14 通用汽车公司 Distinguishing out-of-vocabulary speech from in-vocabulary speech
CN101075434A (en) * 2006-05-18 2007-11-21 富士通株式会社 Voice recognition apparatus and recording medium storing voice recognition program
CN101980197A (en) * 2010-10-29 2011-02-23 北京邮电大学 Long time structure vocal print-based multi-layer filtering audio frequency search method and device
CN102236686A (en) * 2010-05-07 2011-11-09 盛乐信息技术(上海)有限公司 Voice sectional song search method
CN102280106A (en) * 2010-06-12 2011-12-14 三星电子株式会社 VWS method and apparatus used for mobile communication terminal
CN102332265A (en) * 2011-06-20 2012-01-25 浙江吉利汽车研究院有限公司 Method for improving voice recognition rate of automobile voice control system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101071564A (en) * 2006-05-11 2007-11-14 通用汽车公司 Distinguishing out-of-vocabulary speech from in-vocabulary speech
CN101075434A (en) * 2006-05-18 2007-11-21 富士通株式会社 Voice recognition apparatus and recording medium storing voice recognition program
CN102236686A (en) * 2010-05-07 2011-11-09 盛乐信息技术(上海)有限公司 Voice sectional song search method
CN102280106A (en) * 2010-06-12 2011-12-14 三星电子株式会社 VWS method and apparatus used for mobile communication terminal
CN101980197A (en) * 2010-10-29 2011-02-23 北京邮电大学 Long time structure vocal print-based multi-layer filtering audio frequency search method and device
CN102332265A (en) * 2011-06-20 2012-01-25 浙江吉利汽车研究院有限公司 Method for improving voice recognition rate of automobile voice control system

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105632487A (en) * 2015-12-31 2016-06-01 北京奇艺世纪科技有限公司 Voice recognition method and device
CN105632487B (en) * 2015-12-31 2020-04-21 北京奇艺世纪科技有限公司 Voice recognition method and device
CN106971712A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 Adaptive rapid voiceprint recognition method and system
CN106971726A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 Adaptive voiceprint recognition method and system based on a codebook
CN106971735A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 Voiceprint recognition method and system for regularly updating training sentences in a cache
CN106981287A (en) * 2016-01-14 2017-07-25 芋头科技(杭州)有限公司 Method and system for improving voiceprint recognition speed
CN109979437A (en) * 2019-03-01 2019-07-05 百度在线网络技术(北京)有限公司 Speech recognition method, device, equipment and storage medium
CN109979437B (en) * 2019-03-01 2022-05-20 阿波罗智联(北京)科技有限公司 Speech recognition method, apparatus, device and storage medium
CN109979454A (en) * 2019-03-29 2019-07-05 联想(北京)有限公司 Data processing method and device

Similar Documents

Publication Publication Date Title
CN103903617A (en) Voice recognition method and electronic device
US11308934B2 (en) Hotword-aware speech synthesis
CN104078044A (en) Mobile terminal and sound recording search method and device of mobile terminal
CN110797027B (en) Multi-recognizer speech recognition
CN106233374A (en) Generate for detecting the keyword model of user-defined keyword
CN103903621A (en) Method for voice recognition and electronic equipment
CN104538034A (en) Voice recognition method and system
CN105336324A (en) Language identification method and device
CN103971681A (en) Voice recognition method and system
CN104142915A (en) Punctuation adding method and system
CN103871401A (en) Method for voice recognition and electronic equipment
CN103456297A (en) Method and device for matching based on voice recognition
CN107369439A (en) Voice wake-up method and device
EP1933301A3 (en) Speech recognition method and system with intelligent speaker identification and adaptation
CN103853703A (en) Information processing method and electronic equipment
CN105469789A (en) Voice information processing method and voice information processing terminal
CN103219007A (en) Voice recognition method and voice recognition device
CN103489444A (en) Speech recognition method and device
CN107863098A (en) Voice recognition control method and device
KR20240115216A (en) Method and apparatus for speech signal processing
CN103236261A (en) Speaker-dependent voice recognizing method
CN103902193A (en) System and method for operating computers to change slides by aid of voice
CN105206263A (en) Speech and meaning recognition method based on dynamic dictionary
US7529668B2 (en) System and method for implementing a refined dictionary for speech recognition
CN103903615B (en) Information processing method and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140702
