CN1412741A - Chinese speech identification method with dialect background - Google Patents

Chinese speech identification method with dialect background

Publication number
CN1412741A
CN1412741A CN02155605A CN02155605A CN1412741A CN 1412741 A CN1412741 A CN 1412741A CN 02155605 A CN02155605 A CN 02155605A CN 02155605 A CN02155605 A CN 02155605A CN 1412741 A CN1412741 A CN 1412741A
Authority
CN
China
Prior art keywords
syllable
speech
mapping
dialect
search tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN02155605A
Other languages
Chinese (zh)
Other versions
CN1177313C (en)
Inventor
郑方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing D Ear Technologies Co ltd
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CNB021556059A
Publication of CN1412741A
Application granted
Publication of CN1177313C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)

Abstract

The present invention belongs to the field of computer artificial intelligence and pattern recognition, and relates to a method for recognizing Mandarin Chinese speech with a dialect background. The method includes: constructing, according to the characteristics of a specific dialect, a syllable mapping table from Standard Mandarin pronunciation to dialect pronunciation; expanding the search tree of an existing Standard Mandarin speech recognizer according to the syllable mapping table; and replacing the search tree in the Standard Mandarin speech recognizer with the expanded search tree. With this method the recognizer can recognize not only Standard Mandarin but also Mandarin spoken with various dialect accents, simply by using the appropriate syllable mapping table.

Description

Method for recognizing Mandarin speech with a dialect background
Technical field
The invention belongs to the field of computer artificial intelligence and pattern recognition, and in particular relates to methods for recognizing human speech by computer.
Background art
Large vocabulary continuous speech recognition (LVCSR, hereinafter "speech recognition") is the process by which a computer determines, from the linguistic information contained in a person's continuous speech signal, which text a given segment of speech corresponds to. A large vocabulary continuous speech recognizer (hereinafter "speech recognizer") is the equipment or software that performs speech recognition. Speech recognition converts a speech signal into text and can be applied in almost every sector, including telecommunications, banking, finance, tourism and transportation, public utilities, entertainment, consumer activity, and enterprise management. Typical applications include call center voice services, Chinese intelligent interactive short message services, voice command control of computers and electronic devices, education, and national security.
A speech recognizer consists of two parts: an acoustic model (AM) and a language model (LM).
The acoustic model converts the speech signal into a lattice of Chinese initials and finals (or syllables), that is, it converts the signal into phonetic symbols (expressed as initials and finals or as pinyin). At present the most effective and most widely used way to implement an acoustic model is the hidden Markov model (HMM) and methods derived from it. The acoustic model involves two processes, training and recognition, as shown in Fig. 1. The training process 1 (acoustic training for short) comprises acoustic feature extraction, acoustic training, and construction of the acoustic model library; it uses acoustic features extracted from the speech of a large number of speakers to build one model for each acoustic recognition primitive (also called a recognition primitive, primitive, or speech recognition primitive). For Chinese speech recognition, the recognition primitive is usually the Chinese syllable, the initial or final, or the phoneme. The recognition process 2 (acoustic recognition for short) comprises acoustic feature extraction and acoustic search; it matches the models in the model library against the acoustic features of a given utterance to find the most likely matching model sequence or lattice, which is the result of acoustic recognition. Because there are many possible model sequences, recognition must try the possible combinations of models as efficiently as it can, which amounts to searching the space of model sequences for the optimal sequence; the recognition process of the acoustic model is therefore also called the search process of the acoustic model (acoustic search for short). Acoustic search is the first stage of the overall speech recognition process. Its output is usually a lattice of speech recognition primitives, which is the input to the next stage, as shown in Fig. 2. In the figure, the gray circles contain the pinyin of the syllables actually spoken (the utterance is "we are Chinese"), and the pinyin in the other circles are the other candidates output by the acoustic search.
The language model describes the probabilistic co-occurrence relationships between adjacent words in the context of a sentence. The most commonly used language model at present is the trigram model, which gives the conditional probability $P(c \mid a, b)$ for any three words a, b, and c. The language model likewise involves two processes, training and search. Training: given a very large body of Chinese text (the training text), the number of times any three words occur together can be obtained by simple counting, from which their co-occurrence probability is estimated. Search: when the intermediate result of the acoustic search, the lattice of speech recognition primitives, is converted into a Chinese sentence, the language model is used to select the best sentence from the many candidates according to the maximum likelihood criterion, which here means maximum probability. During search, the probability of a sentence is computed as

$$P(w_1, w_2, \cdots, w_N) \approx P(w_1) \cdot P(w_2 \mid w_1) \cdot \prod_{n=3}^{N} P(w_n \mid w_{n-2}, w_{n-1})$$

where the probability of the word triple $(w_{n-2}, w_{n-1}, w_n)$, namely $P(w_n \mid w_{n-2}, w_{n-1})$, is learned from the training text by an existing language model training method.
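The counting-based training and the sentence probability formula can be illustrated with a minimal Python sketch. It assumes plain maximum likelihood estimates with no smoothing or back-off, which a practical recognizer would need; the function names `train_ngrams` and `sentence_probability` are hypothetical and do not come from the patent.

```python
from collections import Counter

def train_ngrams(sentences):
    """Count unigrams, bigrams and trigrams in tokenized training sentences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for words in sentences:
        uni.update(words)
        bi.update(zip(words, words[1:]))
        tri.update(zip(words, words[1:], words[2:]))
    return uni, bi, tri

def sentence_probability(words, uni, bi, tri):
    """P(w1..wN) ~ P(w1) * P(w2|w1) * prod_{n>=3} P(wn | wn-2, wn-1).

    Assumption: unsmoothed relative-frequency estimates, so any unseen
    word pair or triple gives probability 0.
    """
    if not words:
        return 1.0
    total = sum(uni.values())
    p = uni[words[0]] / total                      # P(w1)
    if len(words) >= 2:                            # P(w2 | w1)
        first = uni[words[0]]
        p *= bi[(words[0], words[1])] / first if first else 0.0
    for a, b, c in zip(words, words[1:], words[2:]):
        pair = bi[(a, b)]                          # P(c | a, b)
        p *= tri[(a, b, c)] / pair if pair else 0.0
    return p

# Toy training text written as pinyin tokens (illustrative only).
corpus = [["wo", "men", "shi"], ["wo", "men", "qu"]]
uni, bi, tri = train_ngrams(corpus)
p = sentence_probability(["wo", "men", "shi"], uni, bi, tri)
```

In this toy corpus, "wo" accounts for 2 of the 6 tokens, "men" always follows "wo", and "shi" follows "wo men" in one of two cases, so the product is 2/6 · 1 · 1/2.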
During the language model search, a search tree is used to constrain the speed and extent to which the search space expands, so that the search remains efficient. Fig. 3 shows an example of a search tree organized by initials and finals. A search tree contains three kinds of nodes. Root node: drawn as a double circle; it is the starting point of the tree and of the search process. Intermediate node: drawn as a black dot; the directed arc pointing to it from its parent is labeled with an acoustic primitive, which in Fig. 3 is an initial or a final. The parent of a node is the node whose arrow points to it; in a search tree, every node except the root has exactly one parent. Leaf node: drawn as a white dot; the directed arc pointing to it from its parent is labeled with a Chinese word, and the pinyin string for that word's pronunciation is the sequence of initials and finals on the directed arcs traversed from the root to the leaf. Because exactly one directed arc points to each leaf node, the word on that arc is called the word corresponding to the leaf node.
The words corresponding to all leaf nodes of the search tree make up the entire vocabulary of the speech recognizer. The vocabulary of a large vocabulary continuous Chinese speech recognizer generally contains 50,000 to 60,000 Chinese words. The search process of the language model matches the intermediate result of the acoustic search, the lattice of speech recognition primitives (organized by initials and finals or by pinyin), against the search tree (organized in the same way) and applies the language model probability formula to find the maximum likelihood sentence. During search, when a path through the lattice has been matched all the way to the directed arc of some leaf node, the search automatically returns to the root node, unless that path has already reached the last primitive of the lattice.
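A search tree of this kind is essentially a prefix tree over initials and finals. The sketch below is one possible representation, not the recognizer's actual data structure: the class `Node`, the functions `build_search_tree` and `lookup`, and the English glosses used as words are all illustrative assumptions. Leaf arcs are modeled as a list of words attached to the node where a pronunciation ends.

```python
class Node:
    """A search-tree node (hypothetical layout)."""
    def __init__(self):
        self.arcs = {}    # initial or final label -> child Node
        self.words = []   # words on the leaf arcs leaving this node

def build_search_tree(lexicon):
    """Build the tree from (word, [initial, final, initial, final, ...]) pairs."""
    root = Node()
    for word, units in lexicon:
        node = root
        for unit in units:
            # Words with a common pronunciation prefix share the same arcs.
            node = node.arcs.setdefault(unit, Node())
        node.words.append(word)
    return root

def lookup(root, units):
    """Follow the arcs labeled by `units`; return the words found there."""
    node = root
    for unit in units:
        node = node.arcs.get(unit)
        if node is None:
            return []
    return list(node.words)

# Two-word toy vocabulary; the glosses stand in for the Chinese words.
lexicon = [
    ("China",  ["zh", "ong", "g", "uo"]),
    ("center", ["zh", "ong", "x", "in"]),
]
root = build_search_tree(lexicon)
```

Because both words begin with zh + ong, the tree has a single arc out of the root and only branches after the second arc, which is how the tree limits the growth of the search space.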
Large vocabulary continuous Chinese speech recognizers have made great progress: for Standard Mandarin, recognition accuracy can exceed 95%. Dialects, however, remain the main problem facing Chinese speech recognition. Most people who speak Mandarin do so with some dialect background, and under these conditions the performance of most speech recognizers drops sharply, sometimes to the point of being unusable.
Chinese has eight major dialect areas in China:
(1) Northern dialect: centered on the Yellow River basin, and extending to the northeast, the middle of the Yangtze River basin, and the southwestern provinces;
(2) Wu dialect: the Shanghai area, southeastern Jiangsu, and most of Zhejiang;
(3) Xiang (Hunan) dialect: most of Hunan Province;
(4) Gan (Jiangxi) dialect: most of Jiangxi Province and the southeastern corner of Hubei;
(5) Hakka dialect: parts of Guangdong, Guangxi, Fujian, and Jiangxi;
(6) Northern Min dialect: northern Fujian and parts of Taiwan;
(7) Southern Min dialect: southern Fujian, the Chaozhou-Shantou area of Guangdong, most of Taiwan, and parts of Hainan;
(8) Yue (Guangdong) dialect: central and southwestern Guangdong and southeastern Guangxi.
These eight major dialects can be further divided into more than 40 sub-dialects. Each dialect has its own distinctive characteristics, so the Mandarin spoken by a person with a dialect background differs to some degree from Standard Mandarin.
At present many recognizers rely on a database-based approach to eliminate or reduce the effect of dialect background on recognizer performance. That is, when a recognizer built for Standard Mandarin must recognize Mandarin spoken with a particular dialect background, a large speech database for that dialect is collected, and then either the acoustic model is retrained with existing acoustic training methods or it is adapted with existing speaker adaptation methods. This approach has two drawbacks: (1) collecting a speech database with a dialect background is a very large amount of work, and for a language with as many dialects as Chinese it is an enormous undertaking; (2) it cannot exploit what Standard Mandarin and dialect-accented Mandarin have in common. Because it addresses the problem purely through data, it amounts to rebuilding the speech recognizer from scratch, which makes it difficult to share resources and maintain compatibility between recognizers for different dialect backgrounds.
Summary of the invention
The object of the invention is to overcome the shortcomings of existing speech recognition technology in recognizing Mandarin with a dialect background by proposing a new method for recognizing such speech. By using a syllable mapping table and search tree expansion, the method largely eliminates the effect of dialect background on the performance of a Chinese speech recognizer while requiring almost no recorded speech database with that dialect background.
The invention proposes a method for recognizing Mandarin speech with a dialect background, comprising a speech recognizer for Standard Mandarin Chinese, characterized in that the method comprises the following steps:
1) constructing, according to the characteristics of a specific dialect, a syllable mapping table from Standard Mandarin pronunciation to dialect pronunciation;
2) expanding the search tree of the existing Standard Mandarin speech recognizer according to the syllable mapping table;
3) replacing the search tree in the Standard Mandarin speech recognizer with the expanded search tree.
After these three steps, the Standard Mandarin speech recognizer can recognize Mandarin spoken with a dialect background.
The principle of the invention is as follows:
Mandarin with a dialect background has much in common with Standard Mandarin. With some linguistic knowledge the two can be combined well, so that a recognizer for dialect-accented Mandarin and a recognizer for Standard Mandarin share a single framework. Linguistically, the syllable inventory of Mandarin with a dialect background is similar to that of Standard Mandarin, but the dialect background changes how the syllables are actually pronounced: if the speaker's Mandarin is fairly standard, the change is small; otherwise many pronunciation features of the dialect may be retained. These changes fall into the following types, and the mapping between the two pronunciations is illustrated in Fig. 4 (pronunciations before and after mapping are both written with the pinyin, initials, or finals of Standard Mandarin):
(1) Word-independent changes of initials and finals. These can occur in any word and do not depend on the particular word. For example, speakers with a southern accent pronounce the initials zh, ch, and sh as z, c, and s respectively, and confuse the finals eng and en, ing and in, or ang and an.
(2) Word-dependent changes of syllables. These differ from word to word. For example, in the Sichuan dialect the syllable guo is pronounced gui in the word "China" (zhong guo), but is still pronounced guo in the word "past".
In Fig. 4, dashed arrows represent word-independent syllable mappings. Because only an initial or a final changes, only the affected initial or final is shown, in bold, and the arrow points from the Standard Mandarin pronunciation to the dialect pronunciation it maps to. Solid arrows represent word-dependent syllable mappings, pointing from the Standard Mandarin pronunciation of the mapped syllable in the word to its pronunciation in the dialect. Syllables in the word whose pronunciation does not change, or which undergo only a word-independent change, are not marked, and the corresponding Chinese characters are enclosed in "[]".
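To make the word-independent case concrete, the following Python sketch enumerates the pronunciations a single syllable may take under a set of initial and final substitutions. It is only an illustration: the helper names `split_syllable` and `variants` are hypothetical, the initial list is the usual pinyin inventory with y and w treated as initials for simplicity, and the rule sets contain just the southern-accent examples quoted above.

```python
# Pinyin initials, two-letter ones first so that "zh" is not read as "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")

def split_syllable(syllable):
    """Split a toneless pinyin syllable into (initial, final)."""
    for ini in INITIALS:
        if syllable.startswith(ini):
            return ini, syllable[len(ini):]
    return "", syllable   # zero-initial syllable such as "an"

def variants(syllable, initial_map, final_map):
    """List the standard pronunciation followed by its accented variants."""
    ini, fin = split_syllable(syllable)
    initials = [ini] + initial_map.get(ini, [])
    finals = [fin] + final_map.get(fin, [])
    return [i + f for i in initials for f in finals]

# Word-independent rules taken from the examples in the text.
initial_map = {"zh": ["z"], "ch": ["c"], "sh": ["s"]}
final_map = {"eng": ["en"], "en": ["eng"]}

zheng_variants = variants("zheng", initial_map, final_map)
guo_variants = variants("guo", initial_map, final_map)
```

A syllable such as guo, which no word-independent rule touches, keeps its single standard pronunciation; its change to gui in "China" has to be recorded as a word-dependent mapping instead.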
The invention has the following features:
1) It makes full use of knowledge and rules at the language level, so switching to a different dialect background does not require collecting a large speech database for adaptation, which saves a great deal of work;
2) Speech recognizers for Mandarin with different dialect backgrounds share the same acoustic model and language model as the Standard Mandarin speech recognizer;
3) To switch dialect background, only the syllable mapping table needs to be changed; acoustic and language search algorithms that work with the syllable mapping table handle the effect of dialect background on pronunciation well, so the system is easy to operate and maintain;
4) The recognizer can recognize both Standard Mandarin and Mandarin with varying degrees of dialect background, which can considerably improve the performance of Mandarin Chinese speech recognizers.
Description of drawings
Fig. 1 is the overall block diagram of acoustic model training and search in existing speech recognition.
Fig. 2 is an example of the output of acoustic search in existing speech recognition (a pinyin lattice).
Fig. 3 is an example of a search tree organized by initials and finals.
Fig. 4 is an example of pronunciation changes in the Sichuan dialect (syllable mappings written in Standard Mandarin pinyin).
Fig. 5 is the flowchart for constructing the syllable mapping table.
Fig. 6 is the flowchart for expanding the search tree.
Fig. 7 is an example of expanding directed arcs in the search tree according to word-independent syllable mappings.
Embodiment
The method for recognizing Mandarin speech with a dialect background proposed by the invention is described in detail below with reference to the embodiments and the accompanying drawings:
The invention proposes a method for recognizing Mandarin speech with a dialect background, comprising a speech recognizer for Standard Mandarin Chinese, characterized in that the method comprises the following steps:
1) constructing, according to the characteristics of a specific dialect, a syllable mapping table from Standard Mandarin pronunciation to dialect pronunciation;
2) expanding the search tree of the existing Standard Mandarin speech recognizer according to the syllable mapping table;
3) replacing the search tree in the Standard Mandarin speech recognizer with the expanded search tree.
An embodiment of constructing the syllable mapping table in step 1) is shown in Fig. 5 and comprises the following steps:
(1) summarize the syllable mapping rules of the dialect concerned according to linguistic knowledge;
(2) for each word-independent syllable mapping that occurs in an initial, register an initial mapping pair {I*(x)} → {I*(y)}, meaning that in any syllable containing the initial x, the initial can be mapped to y, for example {I*(zh)} → {I*(z)}, {I*(hu)} → {I*(w)}, and so on;
(3) for each word-independent syllable mapping that occurs in a final, register a final mapping pair {*F(x)} → {*F(y)}, meaning that in any syllable containing the final x, the final can be mapped to y, for example {*F(en)} → {*F(eng)}, {*F(eng)} → {*F(en)}, and so on;
(4) for each word-dependent syllable mapping, register a syllable mapping pair {W(x_1, ..., x_n)} → {W(y_1, ..., y_n)}, meaning that in the context of the word W, the syllable string of W is mapped from (x_1, ..., x_n) to (y_1, ..., y_n). A syllable that is not mapped, or that undergoes only a word-independent mapping, has its y_i marked with "*". For example, {China(zhong, guo)} → {China(*, gui)} means that in the word "China" the syllable guo undergoes the word-dependent change to gui, while the syllable zhong, which undergoes only a word-independent mapping, is marked with "*" on the right-hand side of the arrow.
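The registration steps above suggest a simple table with three kinds of entries. The sketch below shows one way such a table might be held in memory; the class `SyllableMappingTable`, its method names, and the helper `resolve` are hypothetical and are not specified by the patent, which describes only the mapping pairs themselves.

```python
from dataclasses import dataclass, field

@dataclass
class SyllableMappingTable:
    """Hypothetical container for the three kinds of mapping pairs."""
    initial_pairs: list = field(default_factory=list)  # (x, y): {I*(x)} -> {I*(y)}
    final_pairs: list = field(default_factory=list)    # (x, y): {*F(x)} -> {*F(y)}
    word_pairs: list = field(default_factory=list)     # (W, (x1..xn), (y1..yn))

    def add_initial(self, x, y):
        self.initial_pairs.append((x, y))

    def add_final(self, x, y):
        self.final_pairs.append((x, y))

    def add_word(self, word, xs, ys):
        # Both syllable strings describe the same word, syllable by syllable.
        if len(xs) != len(ys):
            raise ValueError("syllable strings must have the same length")
        self.word_pairs.append((word, tuple(xs), tuple(ys)))

def resolve(xs, ys):
    """Fill each '*' in ys with the Standard Mandarin syllable from xs."""
    return tuple(x if y == "*" else y for x, y in zip(xs, ys))

# Entries taken from the examples in the text.
table = SyllableMappingTable()
table.add_initial("zh", "z")
table.add_final("en", "eng")
table.add_final("eng", "en")
table.add_word("China", ("zhong", "guo"), ("*", "gui"))

word, xs, ys = table.word_pairs[0]
new_pronunciation = resolve(xs, ys)
```

Keeping the "*" in the stored pair and resolving it only when needed mirrors the notation of step (4): the table records which syllables changed because of the word, and leaves the rest to the word-independent rules.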
An embodiment of expanding the search tree in step 2) is shown in Fig. 6 and comprises the following steps:
(1) for each word-dependent syllable mapping {W(x_1, ..., x_n)} → {W(y_1, ..., y_n)}, add a new entry for the word W to the vocabulary. The Chinese character string of the entry is unchanged, as is the identification code that represents the word (each word has a unique identification code in the existing speech recognizer); any syllable marked with "*" in the syllable string (y_1, ..., y_n) is copied from the corresponding syllable of the original word. This step gives each such word a new pronunciation;
(2) build a new search tree for the vocabulary with the new entries added, using the existing search tree construction method;
(3) for each word-independent syllable mapping {I*(x)} → {I*(y)} or {*F(x)} → {*F(y)}, examine the directed arcs leading to all non-leaf nodes in the search tree. If the initial or final labeling an arc is x, add a parallel directed arc in the same direction alongside it, labeled y. As shown in Fig. 7, the arcs drawn with thick lines are those added according to the syllable mapping shown above the large arrow.
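The three expansion steps can be sketched in Python as follows. This is a minimal illustration under several assumptions that the patent does not state: arcs are stored as a list of (label, child) pairs so that two arcs may carry the same label, a parallel arc shares both the parent and the child of the arc it duplicates, only arcs of the tree built in step (2) are duplicated (so a pair of rules such as en → eng and eng → en does not cascade), and the word string itself stands in for the identification code. All names (`Node`, `build_tree`, `expand_tree`, `recognize`) are hypothetical.

```python
class Node:
    def __init__(self):
        self.arcs = []    # (label, child) pairs; label is an initial or final
        self.words = []   # words on the leaf arcs leaving this node

def child(node, label, create=False):
    """Return the child reached by `label`, optionally creating it."""
    for lab, nxt in node.arcs:
        if lab == label:
            return nxt
    if not create:
        return None
    nxt = Node()
    node.arcs.append((label, nxt))
    return nxt

def build_tree(lexicon):
    """Step (2): build the tree from (word, [initial, final, ...]) entries."""
    root = Node()
    for word, units in lexicon:
        node = root
        for unit in units:
            node = child(node, unit, create=True)
        if word not in node.words:
            node.words.append(word)
    return root

def all_nodes(root):
    """Collect every node of the (not yet expanded) tree."""
    stack, found = [root], []
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(nxt for _, nxt in node.arcs)
    return found

def expand_tree(root, unit_map):
    """Step (3): add a parallel arc labeled y next to every arc labeled x."""
    for node in all_nodes(root):              # node list is fixed before editing
        for label, nxt in list(node.arcs):    # iterate over the original arcs only
            for y in unit_map.get(label, []):
                if (y, nxt) not in node.arcs:
                    node.arcs.append((y, nxt))

def recognize(root, units):
    """Return the words whose standard or expanded pronunciation is `units`."""
    frontier = [root]
    for unit in units:
        frontier = [nxt for n in frontier for lab, nxt in n.arcs if lab == unit]
    return sorted({w for n in frontier for w in n.words})

# Step (1): the word-dependent mapping adds a second pronunciation of "China".
lexicon = [("China", ["zh", "ong", "g", "uo"]),
           ("China", ["zh", "ong", "g", "ui"])]
root = build_tree(lexicon)
expand_tree(root, {"zh": ["z"]})              # word-independent rule zh -> z
```

After expansion, the unmodified matching routine `recognize` accepts zh-ong-g-uo, z-ong-g-uo, and z-ong-g-ui alike and returns the same word, which illustrates why the search algorithms themselves need no change.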
An embodiment of step 3), replacing the search tree in the existing speech recognizer with the expanded search tree, is as follows: once the search tree has been expanded, the acoustic search algorithm and language search algorithm of the existing recognizer need not be modified; the expanded search tree is used directly to guide acoustic search and language search in the existing speech recognizer.

Claims (4)

1. A method for recognizing Mandarin speech with a dialect background, comprising a speech recognizer for Standard Mandarin Chinese, characterized in that the method comprises the following steps:
1) constructing, according to the characteristics of a specific dialect, a syllable mapping table from Standard Mandarin pronunciation to dialect pronunciation;
2) expanding the search tree of the existing Standard Mandarin speech recognizer according to the syllable mapping table;
3) replacing the search tree in the Standard Mandarin speech recognizer with the expanded search tree.
2. The method for recognizing Mandarin speech with a dialect background according to claim 1, characterized in that constructing the syllable mapping table in said step 1) specifically comprises the following steps:
(1) summarizing the syllable mapping rules of the dialect concerned according to linguistic knowledge;
(2) for each word-independent syllable mapping that occurs in an initial, registering an initial mapping pair {I*(x)} → {I*(y)}, which means that in any syllable containing the initial x, the initial can be mapped to y;
(3) for each word-independent syllable mapping that occurs in a final, registering a final mapping pair {*F(x)} → {*F(y)}, which means that in any syllable containing the final x, the final can be mapped to y;
(4) for each word-dependent syllable mapping, registering a syllable mapping pair {W(x_1, ..., x_n)} → {W(y_1, ..., y_n)}, which means that in the context of the word W, the syllable string of W is mapped from (x_1, ..., x_n) to (y_1, ..., y_n), wherein a syllable that is not mapped, or that undergoes only a word-independent mapping, has its y_i marked with "*".
3. The method for recognizing Mandarin speech with a dialect background according to claim 1, characterized in that expanding the search tree in said step 2) specifically comprises the following steps:
(1) for each word-dependent syllable mapping {W(x_1, ..., x_n)} → {W(y_1, ..., y_n)}, adding a new entry for the word W to the vocabulary, wherein the Chinese character string of the entry is unchanged, the identification code representing the word is unchanged, and any syllable marked with "*" in the syllable string (y_1, ..., y_n) is copied from the corresponding syllable of the original word, so that each such word has a new pronunciation;
(2) building a new search tree for the vocabulary with the new entries added, using the existing search tree construction method;
(3) for each word-independent syllable mapping {I*(x)} → {I*(y)} or {*F(x)} → {*F(y)}, examining the directed arcs leading to all non-leaf nodes in the search tree, and, if the initial or final labeling an arc is x, adding a parallel directed arc in the same direction alongside it, labeled y.
4. The method for recognizing Mandarin speech with a dialect background according to claim 1, characterized in that, in replacing the search tree in the existing speech recognizer with the expanded search tree in said step 3), once the search tree has been expanded, the acoustic search algorithm and language search algorithm of the existing recognizer are not modified, and the expanded search tree is used directly to guide acoustic search and language search in the existing speech recognizer.
CNB021556059A 2002-12-13 2002-12-13 Chinese speech identification method with dialect background Expired - Fee Related CN1177313C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB021556059A CN1177313C (en) 2002-12-13 2002-12-13 Chinese speech identification method with dialect background

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB021556059A CN1177313C (en) 2002-12-13 2002-12-13 Chinese speech identification method with dialect background

Publications (2)

Publication Number Publication Date
CN1412741A (en) 2003-04-23
CN1177313C (en) 2004-11-24

Family

ID=4752679

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB021556059A Expired - Fee Related CN1177313C (en) 2002-12-13 2002-12-13 Chinese speech identification method with dialect background

Country Status (1)

Country Link
CN (1) CN1177313C (en)

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100536532C (en) * 2005-05-23 2009-09-02 北京大学 Method and system for automatic subtitling
WO2009140884A1 (en) * 2008-05-23 2009-11-26 深圳市北科瑞声科技有限公司 A vehicle speech interactive system
CN101281745B (en) * 2008-05-23 2011-08-10 深圳市北科瑞声科技有限公司 Interactive system for vehicle-mounted voice
CN101651788B (en) * 2008-12-26 2012-11-21 中国科学院声学研究所 Alignment system of on-line speech text and method thereof
CN103578467A (en) * 2013-10-18 2014-02-12 威盛电子股份有限公司 Acoustic model building method, voice recognition method and electronic device
TWI560697B (en) * 2013-10-18 2016-12-01 Via Tech Inc Method for building acoustic model, speech recognition method and electronic apparatus
CN104765996B (en) * 2014-01-06 2018-04-27 讯飞智元信息科技有限公司 Voiceprint password authentication method and system
CN104765996A (en) * 2014-01-06 2015-07-08 讯飞智元信息科技有限公司 Voiceprint authentication method and system
CN103811000A (en) * 2014-02-24 2014-05-21 中国移动(深圳)有限公司 Voice recognition system and voice recognition method
CN104217719A (en) * 2014-09-03 2014-12-17 深圳如果技术有限公司 Triggering processing method
CN104485107B (en) * 2014-12-08 2018-06-22 畅捷通信息技术股份有限公司 Name voice recognition method, name voice recognition system and name voice recognition equipment
CN104485107A (en) * 2014-12-08 2015-04-01 畅捷通信息技术股份有限公司 Name voice recognition method, name voice recognition system and name voice recognition equipment
CN104751844A (en) * 2015-03-12 2015-07-01 深圳市富途网络科技有限公司 Voice identification method and system used for security information interaction
CN105117034A (en) * 2015-08-31 2015-12-02 任文 Method for inputting Chinese speeches, positioning statements and correcting errors
CN106598982A (en) * 2015-10-15 2017-04-26 比亚迪股份有限公司 Method and device for creating language databases and language translation method and device
CN105574173A (en) * 2015-12-18 2016-05-11 畅捷通信息技术股份有限公司 Commodity searching method and commodity searching device based on voice recognition
EP3188185A1 (en) * 2015-12-30 2017-07-05 Thunder Power New Energy Vehicle Development Company Limited Voice control system with dialect recognition
CN106847277A (en) * 2015-12-30 2017-06-13 昶洧新能源汽车发展有限公司 Speech control system with accent recognition
US9916828B2 (en) 2015-12-30 2018-03-13 Thunder Power New Energy Vehicle Development Company Limited Voice control system with dialect recognition
CN106847276A (en) * 2015-12-30 2017-06-13 昶洧新能源汽车发展有限公司 Speech control system with accent recognition
US10672386B2 (en) 2015-12-30 2020-06-02 Thunder Power New Energy Vehicle Development Company Limited Voice control system with dialect recognition
EP3188184A1 (en) * 2015-12-30 2017-07-05 Thunder Power New Energy Vehicle Development Company Limited Voice control system with dialect recognition
CN106059895A (en) * 2016-04-25 2016-10-26 上海云睦网络科技有限公司 Collaborative task generation method, apparatus and system
CN106971721A (en) * 2017-03-29 2017-07-21 沃航(武汉)科技有限公司 Accent speech recognition system based on embedded mobile device
CN107170454B (en) * 2017-05-31 2022-04-05 Oppo广东移动通信有限公司 Speech recognition method and related product
CN107170454A (en) * 2017-05-31 2017-09-15 广东欧珀移动通信有限公司 Speech recognition method and related product
CN107452379A (en) * 2017-08-17 2017-12-08 广州腾猴科技有限公司 Dialect language identification method and virtual reality teaching method and system
CN107452379B (en) * 2017-08-17 2021-01-05 广州腾猴科技有限公司 Dialect language identification method and virtual reality teaching method and system
CN107945789A (en) * 2017-12-28 2018-04-20 努比亚技术有限公司 Speech recognition method, device and computer-readable recording medium
CN108986564A (en) * 2018-06-21 2018-12-11 广东小天才科技有限公司 Entry control method based on intelligent interaction, and electronic equipment
CN109147762A (en) * 2018-10-19 2019-01-04 广东小天才科技有限公司 Speech recognition method and system
CN109346059A (en) * 2018-12-20 2019-02-15 广东小天才科技有限公司 Dialect voice recognition method and electronic equipment
CN109346059B (en) * 2018-12-20 2022-05-03 广东小天才科技有限公司 Dialect voice recognition method and electronic equipment
CN111599347A (en) * 2020-05-27 2020-08-28 广州科慧健远医疗科技有限公司 Standardized sampling method for extracting pathological voice MFCC (Mel frequency cepstrum coefficient) features for artificial intelligence analysis
CN111599347B (en) * 2020-05-27 2024-04-16 广州科慧健远医疗科技有限公司 Standardized sampling method for extracting pathological voice MFCC (Mel frequency cepstrum coefficient) features for artificial intelligence analysis
CN112382275A (en) * 2020-11-04 2021-02-19 北京百度网讯科技有限公司 Voice recognition method and device, electronic equipment and storage medium
CN112382275B (en) * 2020-11-04 2023-08-15 北京百度网讯科技有限公司 Speech recognition method, device, electronic equipment and storage medium
US12033615B2 (en) 2020-11-04 2024-07-09 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for recognizing speech, electronic device and storage medium
CN114596845A (en) * 2022-04-13 2022-06-07 马上消费金融股份有限公司 Training method of voice recognition model, voice recognition method and device

Also Published As

Publication number Publication date
CN1177313C (en) 2004-11-24

Similar Documents

Publication Publication Date Title
CN1177313C (en) Chinese speech identification method with dialect background
CN110534095B (en) Speech recognition method, apparatus, device and computer readable storage medium
CN110364171B (en) Voice recognition method, voice recognition system and storage medium
CN1169115C (en) Prosodic databases holding fundamental frequency templates for use in speech synthesis
CN109065032B (en) External corpus speech recognition method based on deep convolutional neural network
CN103578467B (en) Acoustic model building method, voice recognition method and electronic device
US7062436B1 (en) Word-specific acoustic models in a speech recognition system
CN108492820A (en) Chinese speech recognition method based on recurrent neural network language model and deep neural network acoustic model
CN109829058A (en) Classification and recognition method for improving accent recognition accuracy based on multi-task learning
Moore et al. Juicer: A weighted finite-state transducer speech decoder
CN1187693C (en) Method, apparatus, and system for bottom-up tone integration to Chinese continuous speech recognition system
CN101493812B (en) Tone-character conversion method
CN109948144B (en) Teacher utterance intelligent processing method based on classroom teaching situation
CN1831937A (en) Method and device for voice identification and language comprehension analysing
CN115394287A (en) Mixed language voice recognition method, device, system and storage medium
CN110942767B (en) Recognition labeling and optimization method and device for ASR language model
CN116010874A (en) Emotion recognition method based on deep learning multi-mode deep scale emotion feature fusion
CN1224954C (en) Speech recognition device comprising language model having unchangeable and changeable syntactic block
CN111553157A (en) Entity replacement-based dialog intention identification method
Almekhlafi et al. A classification benchmark for Arabic alphabet phonemes with diacritics in deep neural networks
CN115249479A (en) BRNN-based power grid dispatching complex speech recognition method, system and terminal
CN1141697C (en) Three-tone model with tune and training method
US20140142925A1 (en) Self-organizing unit recognition for speech and other data series
József et al. Automated grapheme-to-phoneme conversion system for Romanian
CN1099165A (en) Chinese written language-phonetics transfer method and system based on waveform compilation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: BEIJING D-EAR TECHNOLOGIES CO., LTD.

Free format text: FORMER OWNER: ZHENG FANG

Effective date: 20130319

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20130319

Address after: 100084 room 1005, B building, Tsinghua Science and Technology Park, Haidian District, Beijing

Patentee after: BEIJING D-EAR TECHNOLOGIES Co.,Ltd.

Address before: 100084 Haidian District Tsinghua Yuan, Beijing, Tsinghua University, West 14-4-202

Patentee before: Zheng Fang

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20130307

Granted publication date: 20041124

Pledgee: Zhongguancun Beijing technology financing Company limited by guarantee

Pledgor: Zheng Fang

Registration number: 200501226

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
PM01 Change of the registration of the contract for pledge of patent right

Change date: 20130307

Registration number: 200501226

Pledgee after: Zhongguancun Beijing technology financing Company limited by guarantee

Pledgee before: Zhongguancun Beijing science and technology Company limited by guarantee

DD01 Delivery of document by public notice

Addressee: Mi Qingshan

Document name: Notice of termination of patent right

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20041124

Termination date: 20211213