CN108831458A - Offline voice-to-command conversion method and system - Google Patents

Offline voice-to-command conversion method and system

Info

Publication number
CN108831458A
Authority
CN
China
Prior art keywords
voice
speech
speech recognition
recognition template
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810533495.2A
Other languages
Chinese (zh)
Inventor
马鸿飞
刘海模
吴晓东
苏云鹏
刘雄
肖虎
卢敬光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Sheng General Technology Co Ltd
Original Assignee
Guangdong Sheng General Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Sheng General Technology Co Ltd
Priority to CN201810533495.2A
Publication of CN108831458A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/144 Training of HMMs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters, the extracted parameters being the cepstrum
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025 Phonemes, fenemes or fenones being the recognition units
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

An offline voice-to-command conversion method includes the following steps: receiving the speech text of each of a plurality of training utterances, and constructing a corresponding short-phrase dictionary based on the speech texts, the short-phrase dictionary including at least the text information of each speech text; receiving multiple segments of input speech for each training utterance; forming a corresponding speech recognition template based on the multiple input speech segments of each training utterance and the short-phrase dictionary, and storing the speech recognition template locally; and mapping each speech recognition template to a corresponding operation command and outputting the operation command to a control device. The beneficial effect of the invention is that, because the trained speech recognition templates are stored locally, the speaker's command content can be confirmed conveniently offline, without sending the speech content to an external server for recognition, which improves the efficiency of voice control.

Description

Offline voice-to-command conversion method and system
Technical field
The present invention relates to the technical field of voice control, and more particularly to an offline voice-to-command conversion method and a system implementing the method.
Background
Speech is the most common and most natural form of human communication. As a key human-machine interface technology in information technology, speech recognition has significant research value and broad application prospects. With the growing popularity of speech recognition in recent years, the ability to issue control instructions to machines by voice has been built successfully into many consumer products, and the dream of people conversing with machines in natural language has begun to be realized. Although speech recognition has an extremely wide range of applications, and each implementation must be adapted to its specific application scenario, every speech recognition application involves converting the speech itself into speech content.
Compared with traditional device control technology, voice-based device control gives the user a more direct and convenient way to interact (for example, the user does not need to enter commands manually). In the prior art, however, speech is easily affected by other conditions (such as background noise and differences in articulation among multiple speakers), which makes recognition unstable. Moreover, determining the speech content, that is, converting it from natural language into a computer language that a machine can accept, often requires the device to be connected online to an external database for semantic conversion. These practical problems all raise the cost of using voice-based device control.
Summary of the invention
The purpose of the present invention is to overcome the deficiencies of the prior art by providing an offline voice-to-command conversion method and system that implements voice-based device control offline and reduces, as far as possible, the influence of external conditions on the conversion of speech content.
To achieve this purpose, the present invention first provides an offline voice-to-command conversion method, including the following steps: receiving the speech text of each of a plurality of training utterances, and constructing a corresponding short-phrase dictionary based on the speech texts, the short-phrase dictionary including at least the text information of each speech text; receiving multiple segments of input speech for each training utterance; forming a corresponding speech recognition template based on the multiple input speech segments of each training utterance and the short-phrase dictionary, and storing the speech recognition template locally; and mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by a command utterance, outputting the corresponding operation command to a control device.
In one or more embodiments of the method, the short-phrase dictionary further includes at least one of the following pronunciation characteristics of the speech text: phrases, words, single characters, syllables, and phonemes.
In one or more embodiments of the method, forming the speech recognition template further includes the following sub-steps: calculating the mel-frequency cepstral coefficients of each input speech segment of a speech text to form a Gaussian mixture model for each speech text; forming a corresponding hidden Markov model according to the transition matrix of each input speech segment; and forming the speech recognition template based on the Gaussian mixture model and hidden Markov model of each speech text.
In one or more embodiments of the method, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is user-defined.
Further, in the above method embodiment, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is stored locally.
In one or more embodiments of the method, the speech recognition template is trained on collected input speech that is dynamically updated.
Second, the present invention also provides an offline voice-to-command conversion apparatus, including the following modules: a text receiving module, for receiving the speech text of each of a plurality of training utterances and constructing a corresponding short-phrase dictionary based on the speech texts, the short-phrase dictionary including at least the text information of each speech text; a speech receiving module, for receiving multiple segments of input speech for each training utterance; a template generation module, for forming a corresponding speech recognition template based on the multiple input speech segments of each training utterance and the short-phrase dictionary, and storing the speech recognition template locally; and a voice mapping module, for mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by a command utterance, outputting the corresponding operation command to a control device.
In one or more apparatus embodiments, the short-phrase dictionary further includes at least one of the following pronunciation characteristics of the speech text: phrases, words, single characters, syllables, and phonemes.
In one or more apparatus embodiments, the template generation module further includes the following submodules: a first modeling module, for calculating the mel-frequency cepstral coefficients of each input speech segment of a speech text to form a Gaussian mixture model for each speech text; a second modeling module, for forming a corresponding hidden Markov model according to the transition matrix of each input speech segment; and a template creation module, for forming the speech recognition template based on the Gaussian mixture model and hidden Markov model of each speech text.
In one or more apparatus embodiments, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is user-defined.
Further, in the above apparatus embodiment, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is stored locally.
In one or more apparatus embodiments, the speech recognition template is trained on collected input speech that is dynamically updated.
Finally, the invention also discloses a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of any of the methods described above.
The beneficial effect of the invention is that, because the trained speech recognition templates are stored locally, the speaker's command content can be confirmed conveniently offline, without sending the speech content to an external server for recognition, which improves the efficiency of voice control.
Brief description of the drawings
Fig. 1 is a flowchart of one embodiment of the offline voice-to-command conversion method;
Fig. 2 is a schematic configuration diagram of the method shown in Fig. 1;
Fig. 3 is a flowchart of the sub-steps of forming a speech recognition template;
Fig. 4 is a schematic diagram of a user-defined correspondence between speech and operation commands;
Fig. 5 is a schematic configuration diagram of another embodiment of the offline voice-to-command conversion method;
Fig. 6 is a module structure diagram of one embodiment of the offline voice-to-command conversion system.
Detailed description of the embodiments
The concept, specific structure, and technical effects of the present invention are described clearly and completely below with reference to the embodiments and the accompanying drawings, so that the purpose, solutions, and effects of the invention can be fully understood. It should be noted that, provided they do not conflict, the embodiments of this application and the features within them may be combined with one another. The same reference numerals used throughout the drawings denote the same or similar parts.
Fig. 1 is a flowchart of one embodiment of the offline voice-to-command conversion method. The method includes the following steps: receiving the speech text of each of a plurality of training utterances, and constructing a corresponding short-phrase dictionary based on the speech texts, the short-phrase dictionary including at least the text information of each speech text; receiving multiple segments of input speech for each training utterance; forming a corresponding speech recognition template based on the multiple input speech segments of each training utterance and the short-phrase dictionary, and storing the speech recognition template locally; and mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by a command utterance, outputting the corresponding operation command to a control device.
As shown in Fig. 2, in one embodiment each operation command corresponds to one training utterance. The speech text of a training utterance may be a grammatically complete sentence or one or more keywords. The short-phrase dictionary records the content of the speech text at least in text form, as the text information of the speech text. One user reads the speech text aloud several times, or several users each read it aloud, producing the multiple input speech segments for that speech text. For the training utterance corresponding to each operation command, the input speech segments and the short-phrase dictionary are used in training to form the corresponding speech recognition template. Once the templates have been trained, when a command utterance containing the content of a speech text is received from a user, the command utterance is matched against the speech recognition templates to determine which of the templates it corresponds to, and the corresponding operation command is output to the control device. The matching of command utterances against speech recognition templates can be implemented with algorithms conventional in the art, and the present invention places no limitation on this.
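The matching flow described above can be sketched as follows. This is only an illustrative sketch: the patent leaves the matching algorithm open, so the scoring function here (negative mean squared distance to a stored mean vector) is a stand-in, and the names `match_command`, `mean_distance_scorer`, the template identifiers, and the rejection threshold are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Sequence

Features = Sequence[Sequence[float]]   # one feature vector per frame
Scorer = Callable[[Features], float]   # higher score means a better match


def mean_distance_scorer(mean: Sequence[float]) -> Scorer:
    """Hypothetical stand-in for a trained template: scores an utterance by
    its negative mean squared distance to a stored mean feature vector."""
    def score(features: Features) -> float:
        total = 0.0
        for frame in features:
            total += sum((x - m) ** 2 for x, m in zip(frame, mean))
        return -total / len(features)
    return score


def match_command(features: Features,
                  templates: Dict[str, Scorer],
                  commands: Dict[str, str],
                  threshold: float) -> Optional[str]:
    """Return the operation command of the best-matching template,
    or None when no template scores at or above the threshold."""
    best_id, best_score = None, float("-inf")
    for template_id, scorer in templates.items():
        score = scorer(features)
        if score > best_score:
            best_id, best_score = template_id, score
    if best_id is None or best_score < threshold:
        return None  # reject: nothing is sent to the control device
    return commands.get(best_id)


# Two locally stored templates, each mapped to one operation command.
templates = {"open": mean_distance_scorer([1.0, 0.0]),
             "close": mean_distance_scorer([0.0, 1.0])}
commands = {"open": "OPEN_DOOR", "close": "CLOSE_DOOR"}
```

The threshold is a design choice added for the sketch: without it, every utterance, including unrelated speech, would trigger whichever template happens to score highest.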
In one embodiment, the short-phrase dictionary further includes at least one of the following pronunciation characteristics of the speech text: phrases, words, single characters, syllables, and phonemes. These pronunciation characteristics may be derived from the speech text itself or entered manually, in order to improve the accuracy of the generated speech recognition template. They may also be used to preprocess the input speech: when the input speech is clearly inconsistent with the pronunciation characteristics (for example, the pauses in the input speech are misplaced, or external noise is mixed into it), a prompt can be issued asking for the input speech to be provided again.
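One possible shape for such a dictionary, together with a simple consistency check, is sketched below. The patent does not specify a data layout or how inconsistency is detected, so the `PhraseEntry` fields, the syllable-count comparison, and the tolerance value are assumptions for illustration; the detected syllable count is assumed to come from some separate segmentation step that is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PhraseEntry:
    """One speech text with optional pronunciation characteristics."""
    text: str
    words: List[str] = field(default_factory=list)
    syllables: List[str] = field(default_factory=list)
    phonemes: List[str] = field(default_factory=list)


class PhraseDictionary:
    """Short-phrase dictionary keyed by speech text (hypothetical layout)."""

    def __init__(self) -> None:
        self._entries: Dict[str, PhraseEntry] = {}

    def add(self, entry: PhraseEntry) -> None:
        self._entries[entry.text] = entry

    def get(self, text: str) -> PhraseEntry:
        return self._entries[text]


def needs_rerecording(entry: PhraseEntry,
                      detected_syllable_count: int,
                      tolerance: int = 1) -> bool:
    """Flag an input segment whose detected syllable count differs from the
    dictionary entry by more than the tolerance (an assumed heuristic)."""
    expected = len(entry.syllables)
    return abs(detected_syllable_count - expected) > tolerance


dictionary = PhraseDictionary()
dictionary.add(PhraseEntry(text="open the door",
                           words=["open", "the", "door"],
                           syllables=["o", "pen", "the", "door"]))
```

A segment flagged by `needs_rerecording` would trigger the prompt to record the input speech again rather than being passed on to training.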
Referring to the sub-step flowchart of Fig. 3, in one embodiment the speech recognition template may be formed by the following steps: calculating the mel-frequency cepstral coefficients (MFCC) of each input speech segment corresponding to a speech text, and forming a Gaussian mixture model for each speech text from the MFCCs and the pronunciation characteristics (such as syllables and phonemes); forming a corresponding hidden Markov model according to the transition matrix of each input speech segment; and forming the speech recognition template from the combined model of the Gaussian mixture model and hidden Markov model of each speech text. Those skilled in the art can calculate the MFCCs of the input speech by conventional technical means in the art, and the present invention places no limitation on this.
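For readers unfamiliar with the MFCC step, a minimal pure-Python sketch of one conventional computation is given below (framing, Hamming window, power spectrum, triangular mel filterbank, logarithm, DCT-II). The patent does not fix any of these parameters; the sample rate, frame length, hop size, filter count, and coefficient count are assumptions chosen to keep the example small, and pre-emphasis and liftering are omitted.

```python
import cmath
import math
from typing import List


def hz_to_mel(hz: float) -> float:
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame: List[float]) -> List[float]:
    """Naive DFT power spectrum of one windowed frame (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters: int, n_fft: int,
                   sample_rate: int) -> List[List[float]]:
    """Triangular filters spaced evenly on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + i * (high - low) / (num_filters + 1)
            for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mels]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(num_filters)]
    for i in range(1, num_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising edge
            bank[i - 1][k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge
            bank[i - 1][k] = (right - k) / (right - center)
    return bank


def mfcc(signal: List[float], sample_rate: int = 8000, frame_len: int = 64,
         hop: int = 32, num_filters: int = 8,
         num_coeffs: int = 5) -> List[List[float]]:
    """Return one MFCC vector per frame (all parameter values are assumed)."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * t / (frame_len - 1))
              for t in range(frame_len)]
    bank = mel_filterbank(num_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        power = power_spectrum(frame)
        # Small constant avoids log(0) on silent frames.
        log_e = [math.log(sum(w * p for w, p in zip(filt, power)) + 1e-10)
                 for filt in bank]
        # DCT-II of the log filterbank energies gives the cepstral coefficients.
        features.append([
            sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_e))
            for c in range(num_coeffs)])
    return features
```

The naive DFT keeps the sketch dependency-free; a practical implementation would use an FFT and typically longer frames (for example 20 to 30 ms).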
Referring to the schematic diagram of another embodiment shown in Fig. 4, in this embodiment the content of the operation commands and the correspondence between operation commands and speech recognition templates are user-defined. The user can define the operation commands, and the correspondence between operation commands and speech recognition templates, to suit the actual application scenario. For an access control system, for example, the operation commands may be defined as "open the door" and "close the door". Likewise, in this scenario, speech recognition templates can be customized for these two operation commands; for example, the speech recognition template whose speech text is a verification password such as "123456" can be mapped to the operation command "open the door". The access control system then receives the "open the door" command only after the command utterance "123456" has been spoken.
Further, referring to the configuration diagram of the embodiment shown in Fig. 5, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates may be stored in a local database. A spoken command utterance can then be mapped to the corresponding operation command using the command content and/or the correspondence stored in the local database, so that the operation command can be output to the control device without a network connection.
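A locally stored, user-editable mapping of this kind could look like the sketch below. The patent says only "local database", so the use of SQLite, the table name `command_map`, and the identifiers `passphrase_123456` and `OPEN_DOOR` are assumptions for illustration.

```python
import sqlite3
from typing import Optional


def open_mapping_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local store; pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS command_map ("
        "template_id TEXT PRIMARY KEY, command TEXT NOT NULL)")
    return conn


def set_mapping(conn: sqlite3.Connection, template_id: str,
                command: str) -> None:
    """Create or overwrite the user-defined mapping for one template."""
    conn.execute(
        "INSERT OR REPLACE INTO command_map (template_id, command) "
        "VALUES (?, ?)", (template_id, command))
    conn.commit()


def lookup_command(conn: sqlite3.Connection,
                   template_id: str) -> Optional[str]:
    """Return the operation command for a matched template, if any."""
    row = conn.execute(
        "SELECT command FROM command_map WHERE template_id = ?",
        (template_id,)).fetchone()
    return row[0] if row else None


conn = open_mapping_store()
set_mapping(conn, "passphrase_123456", "OPEN_DOOR")
```

Because both the lookup and the store live on the device, resolving a matched template to its command needs no network access, which is the point of this embodiment.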
In one or more embodiments, the speech recognition template is trained on collected input speech that is dynamically updated. In the access control scenario above, the user can raise the security level of the access control system by updating the speech recognition templates regularly, preventing unauthorized persons from entering a restricted place.
Fig. 6 is a module structure diagram of one embodiment of the offline voice-to-command conversion system. The system includes the following modules: a text receiving module, for receiving the speech text of each of a plurality of training utterances and constructing a corresponding short-phrase dictionary based on the speech texts, the short-phrase dictionary including at least the text information of each speech text; a speech receiving module, for receiving multiple segments of input speech for each training utterance; a template generation module, for forming a corresponding speech recognition template based on the multiple input speech segments of each training utterance and the short-phrase dictionary, and storing the speech recognition template locally; and a voice mapping module, for mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by a command utterance, outputting the corresponding operation command to a control device.
As shown in Fig. 2, in one embodiment each operation command corresponds to one training utterance. The speech text of a training utterance may be a grammatically complete sentence or one or more keywords. The text receiving module records the content of the speech text at least in text form to build the short-phrase dictionary, as the text information of the speech text. One user reads the speech text aloud several times, or several users each read it aloud, producing the multiple input speech segments for that speech text, which are received by the speech receiving module. For the training utterance corresponding to each operation command, the template generation module trains on the input speech segments and the short-phrase dictionary to form the corresponding speech recognition template. Once the templates have been trained, when a command utterance containing the content of a speech text is received from a user, the voice mapping module matches the command utterance against the speech recognition templates to determine which of the templates it corresponds to, and outputs the corresponding operation command to the control device. The matching of command utterances against speech recognition templates can be implemented with algorithms conventional in the art, and the present invention places no limitation on this.
In one embodiment, the short-phrase dictionary further includes at least one of the following pronunciation characteristics of the speech text: phrases, words, single characters, syllables, and phonemes. These pronunciation characteristics may be derived from the speech text itself or entered manually, in order to improve the accuracy of the generated speech recognition template. They may also be used to preprocess the input speech: when the input speech is clearly inconsistent with the pronunciation characteristics (for example, the pauses in the input speech are misplaced, or external noise is mixed into it), a prompt can be issued asking for the input speech to be provided again.
In one embodiment, the template generation module may include the following submodules: a first modeling module, for calculating the mel-frequency cepstral coefficients of each input speech segment corresponding to a speech text, and forming a Gaussian mixture model for each speech text from the mel-frequency cepstral coefficients and the pronunciation characteristics (such as syllables and phonemes); a second modeling module, for forming a corresponding hidden Markov model according to the transition matrix of each input speech segment; and a template creation module, for forming the speech recognition template from the combined model of the Gaussian mixture model and hidden Markov model of each speech text. Those skilled in the art can calculate the mel-frequency cepstral coefficients of the input speech by conventional technical means in the art, and the present invention places no limitation on this.
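How a combined GMM and HMM template might score an utterance can be sketched with the standard forward algorithm in the log domain, where each hidden state emits feature vectors according to a diagonal-covariance Gaussian mixture. The patent does not describe its scoring or training procedure at this level, so this is a generic textbook formulation under assumed names (`hmm_log_likelihood`, `log_gmm`) and hand-set parameters; parameter training (for example Baum-Welch) is not shown.

```python
import math
from typing import List, Sequence, Tuple

# (weights, means, variances) for one state's diagonal-covariance mixture
Gmm = Tuple[Sequence[float], Sequence[Sequence[float]], Sequence[Sequence[float]]]


def safe_log(p: float) -> float:
    return math.log(p) if p > 0.0 else float("-inf")


def logsumexp(values: List[float]) -> float:
    peak = max(values)
    if peak == float("-inf"):
        return peak
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_gaussian(x: Sequence[float], mean: Sequence[float],
                 var: Sequence[float]) -> float:
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def log_gmm(x: Sequence[float], gmm: Gmm) -> float:
    """Log density of a Gaussian mixture at feature vector x."""
    weights, means, variances = gmm
    return logsumexp([safe_log(w) + log_gaussian(x, m, v)
                      for w, m, v in zip(weights, means, variances)])


def hmm_log_likelihood(frames: Sequence[Sequence[float]],
                       initial: Sequence[float],
                       transition: Sequence[Sequence[float]],
                       state_gmms: Sequence[Gmm]) -> float:
    """Forward algorithm: log P(frames | model), summed over state paths."""
    n = len(initial)
    alpha = [safe_log(initial[s]) + log_gmm(frames[0], state_gmms[s])
             for s in range(n)]
    for frame in frames[1:]:
        alpha = [logsumexp([alpha[p] + safe_log(transition[p][s])
                            for p in range(n)])
                 + log_gmm(frame, state_gmms[s])
                 for s in range(n)]
    return logsumexp(alpha)


# Toy left-to-right model: state 0 emits values near 0, state 1 near 1.
initial = [1.0, 0.0]
transition = [[0.5, 0.5], [0.0, 1.0]]
state_gmms = [([1.0], [[0.0]], [[0.1]]), ([1.0], [[1.0]], [[0.1]])]
```

In a template-matching setting, one such model would be kept per speech text, and the utterance would be assigned to the model with the highest log-likelihood; working in the log domain avoids numerical underflow on longer utterances.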
Referring to the schematic diagram of another embodiment shown in Fig. 4, in this embodiment the content of the operation commands and the correspondence between operation commands and speech recognition templates are user-defined. The user can define the operation commands, and the correspondence between operation commands and speech recognition templates, to suit the actual application scenario. For an access control system, for example, the operation commands may be defined as "open the door" and "close the door". Likewise, in this scenario, speech recognition templates can be customized for these two operation commands; for example, the speech recognition template whose speech text is a verification password such as "123456" can be mapped to the operation command "open the door". The access control system then receives the "open the door" command only after the command utterance "123456" has been spoken.
Further, referring to the configuration diagram of the embodiment shown in Fig. 5, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates may be stored in a local database. A spoken command utterance can then be mapped to the corresponding operation command using the command content and/or the correspondence stored in the local database, so that the operation command can be output to the control device without a network connection.
In one or more embodiments, the speech recognition template is trained on collected input speech that is dynamically updated. In the access control scenario above, the user can raise the security level of the access control system by updating the speech recognition templates regularly, preventing unauthorized persons from entering a restricted place.
Although the description of the present invention is quite detailed and several embodiments are described in particular, it is not intended to be limited to any of these details or embodiments or to any particular embodiment. Rather, it should be construed, by reference to the appended claims, so as to give those claims the broadest possible interpretation in view of the prior art, thereby effectively covering the intended scope of the invention. Furthermore, the invention has been described above in terms of embodiments foreseeable by the inventors for the purpose of providing a useful description, and insubstantial modifications of the invention not presently foreseen may nonetheless represent equivalents of it.

Claims (8)

1. An offline voice-to-command conversion method, characterized by comprising the following steps:
receiving the speech text of each of a plurality of training utterances, and constructing a corresponding short-phrase dictionary based on the speech texts, the short-phrase dictionary including at least the text information of each speech text;
receiving multiple segments of input speech for each training utterance;
forming a corresponding speech recognition template based on the multiple input speech segments of each training utterance and the short-phrase dictionary, and storing the speech recognition template locally;
mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by a command utterance, outputting the corresponding operation command to a control device.
2. The method according to claim 1, characterized in that the short-phrase dictionary further includes at least one of the following pronunciation characteristics of the speech text: phrases, words, single characters, syllables, and phonemes.
3. The method according to claim 1, characterized in that forming the speech recognition template further comprises the following sub-steps:
calculating the mel-frequency cepstral coefficients of each input speech segment of a speech text to form a Gaussian mixture model for each speech text;
forming a corresponding hidden Markov model according to the transition matrix of each input speech segment;
forming the speech recognition template based on the Gaussian mixture model and hidden Markov model of each speech text.
4. The method according to claim 1, characterized in that the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is user-defined.
5. The method according to claim 4, characterized in that the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is stored locally.
6. The method according to claim 1, characterized in that the speech recognition template is trained on collected input speech that is dynamically updated.
7. An offline voice-to-command conversion apparatus, characterized by comprising the following modules:
a text receiving module, for receiving the speech text of each of a plurality of training utterances and constructing a corresponding short-phrase dictionary based on the speech texts, the short-phrase dictionary including at least the text information of each speech text;
a speech receiving module, for receiving multiple segments of input speech for each training utterance;
a template generation module, for forming a corresponding speech recognition template based on the multiple input speech segments of each training utterance and the short-phrase dictionary, and storing the speech recognition template locally;
a voice mapping module, for mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by a command utterance, outputting the corresponding operation command to a control device.
8. A computer-readable storage medium storing computer instructions, characterized in that the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 6.
CN201810533495.2A 2018-05-29 2018-05-29 Offline voice-to-command conversion method and system Pending CN108831458A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810533495.2A CN108831458A (en) 2018-05-29 2018-05-29 Offline voice-to-command conversion method and system

Publications (1)

Publication Number Publication Date
CN108831458A true CN108831458A (en) 2018-11-16

Family

ID=64146626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810533495.2A Pending CN108831458A (en) 2018-05-29 2018-05-29 A kind of offline voice is to order transform method and system

Country Status (1)

Country Link
CN (1) CN108831458A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6088669A (en) * 1997-01-28 2000-07-11 International Business Machines, Corporation Speech recognition with attempted speaker recognition for speaker model prefetching or alternative speech modeling
CN1268732A (en) * 2000-03-31 2000-10-04 清华大学 Speech recognition special-purpose chip based speaker-dependent speech recognition and speech playback method
CN2814830Y (en) * 2005-06-25 2006-09-06 陈修志 Sound control TV set and remote controller
CN102005070A (en) * 2010-11-17 2011-04-06 广东中大讯通信息有限公司 Voice identification gate control system
CN102227767A (en) * 2008-11-12 2011-10-26 Scti控股公司 System and method for automatic speach to text conversion
CN102568478A (en) * 2012-02-07 2012-07-11 合一网络技术(北京)有限公司 Video play control method and system based on voice recognition
CN105957518A (en) * 2016-06-16 2016-09-21 内蒙古大学 Mongolian large vocabulary continuous speech recognition method
CN107705787A (en) * 2017-09-25 2018-02-16 北京捷通华声科技股份有限公司 A kind of audio recognition method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
《大科学家讲科学 人机共创的智慧》: "《大科学家讲科学 人机共创的智慧》", 30 August 2017 *
中国航天科工集团第三研究院三0一所: "《世界国防科技年度发展报告》", 30 April 2018 *
赖真: "《解忧IT》", 30 October 2016 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109495360A (en) * 2018-12-18 2019-03-19 深圳国美云智科技有限公司 A kind of smart home Internet of Things platform, offline sound control method and system
CN112669848A (en) * 2020-12-14 2021-04-16 深圳市优必选科技股份有限公司 Offline voice recognition method and device, electronic equipment and storage medium
CN112669848B (en) * 2020-12-14 2023-12-01 深圳市优必选科技股份有限公司 Offline voice recognition method and device, electronic equipment and storage medium
WO2022134025A1 (en) * 2020-12-25 2022-06-30 京东方科技集团股份有限公司 Offline speech recognition method and apparatus, electronic device and readable storage medium

Similar Documents

Publication Publication Date Title
WO2022083083A1 (en) Sound conversion system and training method for same
US8498857B2 (en) System and method for rapid prototyping of existing speech recognition solutions in different languages
US20220139395A1 (en) Configurable output data formats
US10163436B1 (en) Training a speech processing system using spoken utterances
US10650810B2 (en) Determining phonetic relationships
EP3062239A1 (en) Natural expression processing method, processing and response method, device, and system
JP2017058674A (en) Apparatus and method for speech recognition, apparatus and method for training transformation parameter, computer program and electronic apparatus
CN202736475U (en) Chat robot
KR20170041105A (en) Apparatus and method for calculating acoustic score in speech recognition, apparatus and method for learning acoustic model
CN106463113A (en) Predicting pronunciation in speech recognition
CN109523989A (en) Phoneme synthesizing method, speech synthetic device, storage medium and electronic equipment
CN108766441A (en) A kind of sound control method and device based on offline Application on Voiceprint Recognition and speech recognition
JP2001100781A (en) Method and device for voice processing and recording medium
CN108831458A (en) A kind of offline voice is to order transform method and system
CN106057192A (en) Real-time voice conversion method and apparatus
CN112102811B (en) Optimization method and device for synthesized voice and electronic equipment
WO2010030742A1 (en) Method for creating a speech model
JPH10504404A (en) Method and apparatus for speech recognition
CN104679733B (en) A kind of voice dialogue interpretation method, apparatus and system
CN106710587A (en) Speech recognition data pre-processing method
US8401855B2 (en) System and method for generating data for complex statistical modeling for use in dialog systems
Venkatagiri Speech recognition technology applications in communication disorders
CN113257221B (en) Voice model training method based on front-end design and voice synthesis method
CN110085212A (en) A kind of audio recognition method for CNC program controller
CN113035237B (en) Voice evaluation method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2018-11-16)