CN108831458A - Offline voice-to-command conversion method and system - Google Patents
- Publication number
- CN108831458A (application CN201810533495.2A)
- Authority
- CN
- China
- Prior art keywords
- voice
- speech
- speech recognition
- recognition template
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
An offline voice-to-command conversion method comprises the following steps: receiving the speech text of each of a plurality of training voices, and constructing a corresponding voice phrase dictionary based on the speech texts, the voice phrase dictionary including at least the text information of the corresponding speech text; receiving multiple segments of input voice for each training voice; forming a corresponding speech recognition template based on the multiple segments of input voice of each training voice and the voice phrase dictionary, and storing the speech recognition template locally; and mapping each speech recognition template to a corresponding operation command and outputting the operation command to a control device. The beneficial effect of the invention is that, because the trained speech recognition templates are stored locally, the content of a speaker's command can be confirmed offline, without sending the voice content to an external server for speech recognition, which improves the efficiency of voice control.
Description
Technical field
The present invention relates to the technical field of voice control, and more particularly to an offline voice-to-command conversion method and a system implementing the method.
Background
Voice is the most common and most natural form of human communication. As a key technology for human-machine interfaces in information technology, speech recognition has important research significance and broad application value. With the growing popularity of speech recognition technology in recent years, the ability to issue control instructions to machines by voice has been successfully built into many consumer products, and the dream of people conversing with machines in natural language has been preliminarily realized. Although speech recognition has an extremely wide range of applications, and each implementation must be adapted to its concrete application scenario, every speech recognition application involves converting the voice itself into voice content.
Compared with traditional device control technology, voice-based device control can offer users a more direct and convenient mode of interaction (for example, the user need not enter commands manually). In the prior art, however, the voice itself is easily destabilized by other conditions (such as background noise and differences in pronunciation between speakers), and determining the voice content, that is, converting it from natural language into a computer language that a machine can accept, usually requires the device to be connected online to an external database for semantic conversion. These practical problems raise the cost of using voice-based device control.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing an offline voice-to-command conversion method and system that realize voice-based device control offline and reduce, as far as possible, the influence of external conditions on the conversion of voice content.
To achieve the above object, the present invention first provides an offline voice-to-command conversion method comprising the following steps: receiving the speech text of each of a plurality of training voices, and constructing a corresponding voice phrase dictionary based on the speech texts, the voice phrase dictionary including at least the text information of the corresponding speech text; receiving multiple segments of input voice for each training voice; forming a corresponding speech recognition template based on the multiple segments of input voice of each training voice and the voice phrase dictionary, and storing the speech recognition template locally; and mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by an instruction voice, outputting the corresponding operation command to a control device.
In one or more embodiments of the method, the voice phrase dictionary further includes at least one of the following pronunciation features of the speech text: phrases, words, single characters, syllables and phonemes.
In one or more embodiments of the method, forming the speech recognition template further includes the following sub-steps: calculating the Mel-frequency cepstral coefficients of each segment of input voice of a speech text to form a Gaussian mixture model for each speech text; forming a corresponding hidden Markov model according to the transition matrix of each segment of input voice; and forming the speech recognition template based on the Gaussian mixture model and the hidden Markov model of each speech text.
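The patent does not give formulas for these sub-steps. As a rough orientation only, the following shows one commonly used textbook formulation; the symbols (filterbank energies E_k, mixture weights w_m, transition probabilities a_ij, and so on) are assumptions introduced here and do not come from the patent.

```latex
% Mel scale (a widely used approximation), f in Hz:
m(f) = 2595 \,\log_{10}\!\left(1 + \frac{f}{700}\right)

% Cepstral coefficients from K log mel-filterbank energies E_1, ..., E_K:
c_n = \sum_{k=1}^{K} \log E_k \,\cos\!\left[\frac{\pi n}{K}\left(k - \tfrac{1}{2}\right)\right]

% Gaussian mixture model over a feature vector x:
p(\mathbf{x}) = \sum_{m=1}^{M} w_m \,\mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_m, \boldsymbol{\Sigma}_m\right),
\qquad \sum_{m=1}^{M} w_m = 1

% Hidden Markov model: transition matrix A = [a_{ij}] and sequence likelihood,
% with b_j(x) the emission density of state j (here, a mixture as above):
a_{ij} = P(q_{t+1} = j \mid q_t = i), \qquad
P(\mathbf{x}_{1:T} \mid \lambda) = \sum_{q_{1:T}} \pi_{q_1} b_{q_1}(\mathbf{x}_1) \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(\mathbf{x}_t)
```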
In one or more embodiments of the method, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is user-defined.
Further, in the above method embodiment, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is stored locally.
In one or more embodiments of the method, the speech recognition template is trained on collected input voice that is dynamically updated.
Secondly, the present invention also provides an offline voice-to-command conversion device comprising the following modules: a text receiving module, for receiving the speech text of each of a plurality of training voices and constructing a corresponding voice phrase dictionary based on the speech texts, the voice phrase dictionary including at least the text information of the corresponding speech text; a voice receiving module, for receiving multiple segments of input voice for each training voice; a template generation module, for forming a corresponding speech recognition template based on the multiple segments of input voice of each training voice and the voice phrase dictionary, and storing the speech recognition template locally; and a voice mapping module, for mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by an instruction voice, outputting the corresponding operation command to a control device.
In one or more embodiments of the device, the voice phrase dictionary further includes at least one of the following pronunciation features of the speech text: phrases, words, single characters, syllables and phonemes.
In one or more embodiments of the device, the template generation module further includes the following submodules: a first modeling module, for calculating the Mel-frequency cepstral coefficients of each segment of input voice of a speech text to form a Gaussian mixture model for each speech text; a second modeling module, for forming a corresponding hidden Markov model according to the transition matrix of each segment of input voice; and a template creation module, for forming the speech recognition template based on the Gaussian mixture model and the hidden Markov model of each speech text.
In one or more embodiments of the device, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is user-defined.
Further, in the above device embodiment, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is stored locally.
In one or more embodiments of the device, the speech recognition template is trained on collected input voice that is dynamically updated.
Finally, the present invention also discloses a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of any of the methods described above.
The beneficial effect of the invention is that, because the trained speech recognition templates are stored locally, the content of a speaker's command can conveniently be confirmed offline, without sending the voice content to an external server for speech recognition, which improves the efficiency of voice control.
Brief description of the drawings
Fig. 1 is a flow chart of one embodiment of the offline voice-to-command conversion method;
Fig. 2 is a schematic configuration diagram of the method shown in Fig. 1;
Fig. 3 is a flow chart of the sub-steps for forming a speech recognition template;
Fig. 4 is a schematic diagram of a user-defined correspondence between voice and operation commands;
Fig. 5 is a schematic configuration diagram of another embodiment of the offline voice-to-command conversion method;
Fig. 6 is a module structure diagram of one embodiment of the offline voice-to-command conversion system.
Detailed description
The concept, specific structure and technical effects of the present invention are described clearly and completely below with reference to the embodiments and the drawings, so that its purpose, solutions and effects can be fully understood. It should be noted that, provided they do not conflict, the embodiments of the present application and the features in those embodiments may be combined with one another. The same reference numerals used throughout the drawings denote the same or similar parts.
Fig. 1 is a flow chart of one embodiment of the offline voice-to-command conversion method. The method comprises the following steps: receiving the speech text of each of a plurality of training voices, and constructing a corresponding voice phrase dictionary based on the speech texts, the voice phrase dictionary including at least the text information of the corresponding speech text; receiving multiple segments of input voice for each training voice; forming a corresponding speech recognition template based on the multiple segments of input voice of each training voice and the voice phrase dictionary, and storing the speech recognition template locally; and mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by an instruction voice, outputting the corresponding operation command to a control device.
As shown in Fig. 2, in one embodiment each operation command corresponds to one training voice. The speech text of a training voice may be a grammatically complete sentence, or one or more keywords. The voice phrase dictionary records the content of the speech text at least in text form, as the text information of the speech text. One user reads the speech text aloud several times, or several users each read it aloud, to produce the multiple segments of input voice for that speech text. For the training voice corresponding to each operation command, the multiple segments of input voice and the voice phrase dictionary are used in training to form the corresponding speech recognition template. Once the speech recognition templates have been trained, when an instruction voice containing the content of a speech text is received from a user, the instruction voice is matched against the speech recognition templates to determine which of the templates it corresponds to, and the corresponding operation command is output to the control device. The matching of the instruction voice against the speech recognition templates can be implemented with algorithms that are conventional in the art, and the present invention places no limitation on this.
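The matching-and-output step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `RecognitionTemplate`, `match_template`, `voice_to_command` and `distance_scorer`, the rejection threshold, and the toy distance-based score are all assumptions made for this example. In practice the score would come from whatever trained model backs each template.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence

Features = Sequence[float]


@dataclass
class RecognitionTemplate:
    """A trained template: its speech text plus a scoring function (hypothetical)."""
    text: str
    score: Callable[[Features], float]  # higher means a better match


def match_template(features: Features,
                   templates: List[RecognitionTemplate],
                   min_score: float) -> Optional[RecognitionTemplate]:
    """Return the best-scoring template, or None if no score exceeds min_score."""
    best, best_score = None, min_score
    for template in templates:
        s = template.score(features)
        if s > best_score:
            best, best_score = template, s
    return best


def voice_to_command(features: Features,
                     templates: List[RecognitionTemplate],
                     command_map: Dict[str, str],
                     min_score: float = -5.0) -> Optional[str]:
    """Match an instruction voice and look up the mapped operation command."""
    template = match_template(features, templates, min_score)
    return None if template is None else command_map.get(template.text)


def distance_scorer(mean: Features) -> Callable[[Features], float]:
    """Toy stand-in for a trained model: negative squared distance to a mean vector."""
    return lambda f: -sum((x - m) ** 2 for x, m in zip(f, mean))


templates = [
    RecognitionTemplate("open the door", distance_scorer([1.0, 0.0])),
    RecognitionTemplate("close the door", distance_scorer([0.0, 1.0])),
]
command_map = {"open the door": "OPEN_DOOR", "close the door": "CLOSE_DOOR"}
```

Rejecting utterances whose best score falls below a threshold is a design choice of this sketch; it keeps unrelated speech from triggering a command.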
In one embodiment, the voice phrase dictionary further includes at least one of the following pronunciation features of the speech text: phrases, words, single characters, syllables and phonemes. The pronunciation features may be derived from the speech text itself or entered manually, to improve the accuracy of the generated speech recognition template. The pronunciation features can also be used to preprocess the input voice: when the input voice is clearly inconsistent with the pronunciation features (for example, because of incorrect phrase breaks in the input voice or external noise mixed into it), a prompt can be issued requesting that the input voice be received again.
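One possible shape for a dictionary entry and a pre-check is sketched below. The patent does not say how inconsistency is detected; the duration-per-syllable heuristic, its bounds, and the names `PhraseEntry`, `build_phrase_dictionary` and `is_consistent` are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class PhraseEntry:
    """One entry of the voice phrase dictionary (hypothetical layout)."""
    text: str                                           # text information (required)
    syllables: List[str] = field(default_factory=list)  # optional pronunciation feature
    phonemes: List[str] = field(default_factory=list)   # optional pronunciation feature


def build_phrase_dictionary(
        texts: Iterable[str],
        syllables: Optional[Dict[str, List[str]]] = None) -> Dict[str, PhraseEntry]:
    """Build the dictionary from speech texts, attaching syllables where supplied."""
    syllables = syllables or {}
    return {t: PhraseEntry(text=t, syllables=list(syllables.get(t, []))) for t in texts}


def is_consistent(entry: PhraseEntry, duration_s: float,
                  min_syllable_s: float = 0.1, max_syllable_s: float = 0.6) -> bool:
    """Assumed heuristic: the recording length must be plausible for the syllable count.

    Entries without syllable information cannot be checked and are accepted.
    """
    if not entry.syllables:
        return True
    n = len(entry.syllables)
    return n * min_syllable_s <= duration_s <= n * max_syllable_s


dictionary = build_phrase_dictionary(["kai men"], {"kai men": ["kai", "men"]})
entry = dictionary["kai men"]
if not is_consistent(entry, duration_s=3.0):
    print("Input voice does not match the expected pronunciation; please record again.")
```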
Referring to the flow chart in Fig. 3 of the sub-steps for forming a speech recognition template, in one embodiment the speech recognition template can be formed by the following steps: calculating the Mel-frequency cepstral coefficients (MFCCs) of each segment of input voice corresponding to a speech text, and forming a Gaussian mixture model for each speech text from the MFCCs and the pronunciation features (such as syllables and phonemes); forming a corresponding hidden Markov model according to the transition matrix of each segment of input voice; and forming the speech recognition template based on a hybrid of the Gaussian mixture model and the hidden Markov model of each speech text. Those skilled in the art can calculate the MFCCs of the input voice by means that are conventional in the art, and the present invention places no limitation on this.
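To make the hybrid of the two models concrete, the sketch below scores a sequence of feature frames against a hidden Markov model whose states emit through diagonal-covariance Gaussian mixtures, using the standard forward algorithm in the log domain. It covers scoring only, not training, and assumes MFCC frames have already been computed; the function names and the toy two-state model are illustrative assumptions, not taken from the patent.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# One mixture component: (weight, mean vector, per-dimension variance vector)
Component = Tuple[float, Vector, Vector]


def _safe_log(p: float) -> float:
    return math.log(p) if p > 0.0 else -math.inf


def log_sum_exp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_gaussian(x: Vector, mean: Vector, var: Vector) -> float:
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
               for xi, mu, v in zip(x, mean, var))


def log_gmm(x: Vector, components: List[Component]) -> float:
    """Log density of a Gaussian mixture."""
    return log_sum_exp([_safe_log(w) + log_gaussian(x, mu, var)
                        for w, mu, var in components])


def log_likelihood(frames: List[Vector], start: List[float],
                   trans: List[List[float]],
                   emissions: List[List[Component]]) -> float:
    """Forward algorithm: log P(frames | model) summed over all state paths."""
    if not frames:
        raise ValueError("at least one frame is required")
    n = len(start)
    alpha = [_safe_log(start[j]) + log_gmm(frames[0], emissions[j]) for j in range(n)]
    for x in frames[1:]:
        alpha = [log_sum_exp([alpha[i] + _safe_log(trans[i][j]) for i in range(n)])
                 + log_gmm(x, emissions[j]) for j in range(n)]
    return log_sum_exp(alpha)


# Toy left-to-right model over 1-D features: state 0 near 0.0, state 1 near 5.0.
emissions = [[(1.0, [0.0], [1.0])], [(1.0, [5.0], [1.0])]]
start = [1.0, 0.0]
trans = [[0.5, 0.5], [0.0, 1.0]]
in_order = [[0.1], [0.0], [4.9], [5.2]]
out_of_order = [[5.0], [5.1], [0.0], [0.2]]
```

With one such model per speech text, the instruction voice would be assigned to the template with the highest log-likelihood; the in-order sequence above scores higher than the out-of-order one because the transition matrix only allows moving from state 0 to state 1.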
Referring to the schematic diagram of another embodiment shown in Fig. 4, in this embodiment the content of the operation commands and the correspondence between operation commands and speech recognition templates are user-defined. The user can define the operation commands, and the correspondence between operation commands and speech recognition templates, according to the actual application scenario. For example, in an access control scenario the operation commands may be defined as the two commands "open the door" and "close the door". Likewise, in this scenario speech recognition templates can be customized for these two operation commands; for example, the speech recognition template whose speech text is a verification password such as "123456" is mapped to the operation command "open the door". Only after the instruction voice "123456" is issued will the access control system receive the "open the door" operation command.
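The access control example can be sketched as a small user-editable registry. The class name `CommandRegistry`, its methods and the command identifiers are hypothetical; the patent only requires that the commands and their correspondence to templates be user-defined.

```python
from typing import Dict, Optional, Set


class CommandRegistry:
    """User-defined operation commands and their bindings to template speech texts."""

    def __init__(self) -> None:
        self._commands: Set[str] = set()
        self._bindings: Dict[str, str] = {}

    def define_command(self, name: str) -> None:
        """Register an operation command for the current application scenario."""
        self._commands.add(name)

    def bind(self, template_text: str, command: str) -> None:
        """Map the template trained on template_text to a defined command."""
        if command not in self._commands:
            raise ValueError(f"undefined operation command: {command!r}")
        self._bindings[template_text] = command

    def resolve(self, template_text: str) -> Optional[str]:
        """Return the command bound to a matched template, or None."""
        return self._bindings.get(template_text)


# Access control scenario: only the spoken password maps to "open_door".
registry = CommandRegistry()
registry.define_command("open_door")
registry.define_command("close_door")
registry.bind("123456", "open_door")
```

Refusing to bind a template to an undefined command is a choice made in this sketch so that a typo in the configuration fails early.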
Further, referring to the schematic configuration diagram of the embodiment shown in Fig. 5, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates can be stored in a local database. An issued instruction voice can be mapped to the corresponding operation command based on the operation command content and/or the correspondence stored in the local database, so that the operation command can be output to the control device without a network connection.
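A minimal sketch of such a local store, assuming SQLite as the database (the patent does not name one); the table name `command_map`, its columns and the helper functions are illustrative assumptions. Python's standard `sqlite3` module needs no server or network connection.

```python
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local mapping store; pass a file path to persist on disk."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS command_map ("
        "template_text TEXT PRIMARY KEY, command TEXT NOT NULL)"
    )
    return conn


def save_mapping(conn: sqlite3.Connection, template_text: str, command: str) -> None:
    """Insert or overwrite the command bound to a template's speech text."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO command_map (template_text, command) VALUES (?, ?)",
            (template_text, command),
        )


def lookup_command(conn: sqlite3.Connection, template_text: str) -> Optional[str]:
    """Return the stored command for a matched template, or None if unmapped."""
    row = conn.execute(
        "SELECT command FROM command_map WHERE template_text = ?",
        (template_text,),
    ).fetchone()
    return row[0] if row else None
```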
In one or more embodiments, the speech recognition template is trained on collected input voice that is dynamically updated. In the access control scenario above, the user can raise the security level of the access control system by updating the speech recognition templates regularly, preventing unauthorized persons from entering a restricted area.
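The patent does not specify how a template is updated as new input voice is collected. One simple option, shown here purely as an assumption, is to keep running per-dimension statistics of the feature vectors and fold in each new sample with Welford's online algorithm, instead of retraining from scratch. The class name `RunningGaussian` is hypothetical.

```python
from typing import List, Sequence


class RunningGaussian:
    """Per-dimension mean and variance, updated one feature vector at a time."""

    def __init__(self, dim: int) -> None:
        self.count = 0
        self.mean: List[float] = [0.0] * dim
        self._m2: List[float] = [0.0] * dim  # running sum of squared deviations

    def update(self, x: Sequence[float]) -> None:
        """Fold one newly collected feature vector into the statistics (Welford)."""
        self.count += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.count
            self._m2[i] += delta * (xi - self.mean[i])

    def variance(self) -> List[float]:
        """Sample variance per dimension; zeros until two samples have been seen."""
        if self.count < 2:
            return [0.0] * len(self.mean)
        return [m2 / (self.count - 1) for m2 in self._m2]


stats = RunningGaussian(dim=1)
for value in (1.0, 2.0, 3.0):
    stats.update([value])
```

Statistics maintained this way could seed the means and variances of a mixture component; a full system would more likely re-estimate the whole model periodically.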
Fig. 6 is a module structure diagram of one embodiment of the offline voice-to-command conversion system. The system comprises the following modules: a text receiving module, for receiving the speech text of each of a plurality of training voices and constructing a corresponding voice phrase dictionary based on the speech texts, the voice phrase dictionary including at least the text information of the corresponding speech text; a voice receiving module, for receiving multiple segments of input voice for each training voice; a template generation module, for forming a corresponding speech recognition template based on the multiple segments of input voice of each training voice and the voice phrase dictionary, and storing the speech recognition template locally; and a voice mapping module, for mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by an instruction voice, outputting the corresponding operation command to a control device.
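How the four modules might be wired together is sketched below. The class and method names are hypothetical, and the "template" is reduced to the mean feature vector of the recordings as a stand-in for a trained model, so this shows only the data flow between modules, not the recognition technique itself.

```python
from typing import Callable, Dict, List, Optional, Sequence

Features = Sequence[float]


class TextReceivingModule:
    """Receives speech texts and builds the voice phrase dictionary."""
    def __init__(self) -> None:
        self.phrase_dictionary: Dict[str, dict] = {}

    def receive(self, text: str) -> None:
        self.phrase_dictionary[text] = {"text": text}


class VoiceReceivingModule:
    """Collects the multiple segments of input voice for each speech text."""
    def __init__(self) -> None:
        self.recordings: Dict[str, List[Features]] = {}

    def receive(self, text: str, features: Features) -> None:
        self.recordings.setdefault(text, []).append(features)


class TemplateGenerationModule:
    """Forms one template per speech text and keeps it in local memory."""
    def __init__(self) -> None:
        self.templates: Dict[str, List[float]] = {}

    def generate(self, dictionary: Dict[str, dict],
                 recordings: Dict[str, List[Features]]) -> None:
        for text in dictionary:
            takes = recordings.get(text, [])
            if takes:  # toy template: mean of the recorded feature vectors
                dim = len(takes[0])
                self.templates[text] = [sum(t[i] for t in takes) / len(takes)
                                        for i in range(dim)]


class VoiceMappingModule:
    """Matches an instruction voice to a template and outputs the mapped command."""
    def __init__(self, templates: Dict[str, List[float]],
                 command_map: Dict[str, str], send: Callable[[str], None]) -> None:
        self.templates, self.command_map, self.send = templates, command_map, send

    def handle(self, features: Features) -> Optional[str]:
        if not self.templates:
            return None
        best = min(self.templates, key=lambda text: sum(
            (x - m) ** 2 for x, m in zip(features, self.templates[text])))
        command = self.command_map.get(best)
        if command is not None:
            self.send(command)  # output to the control device
        return command
```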
As shown in Fig. 2, in one embodiment each operation command corresponds to one training voice. The speech text of a training voice may be a grammatically complete sentence, or one or more keywords. The text receiving module records the content of the speech text at least in text form to build the voice phrase dictionary, as the text information of the speech text. One user reads the speech text aloud several times, or several users each read it aloud, to produce the multiple segments of input voice for that speech text, which are received by the voice receiving module. For the training voice corresponding to each operation command, the template generation module trains on the multiple segments of input voice and the voice phrase dictionary to form the corresponding speech recognition template. Once the speech recognition templates have been trained, when an instruction voice containing the content of a speech text is received from a user, the voice mapping module matches the instruction voice against the speech recognition templates to determine which of the templates it corresponds to, and outputs the corresponding operation command to the control device. The matching of the instruction voice against the speech recognition templates can be implemented with algorithms that are conventional in the art, and the present invention places no limitation on this.
In one embodiment, the voice phrase dictionary further includes at least one of the following pronunciation features of the speech text: phrases, words, single characters, syllables and phonemes. The pronunciation features may be derived from the speech text itself or entered manually, to improve the accuracy of the generated speech recognition template. The pronunciation features can also be used to preprocess the input voice: when the input voice is clearly inconsistent with the pronunciation features (for example, because of incorrect phrase breaks in the input voice or external noise mixed into it), a prompt can be issued requesting that the input voice be received again.
In one embodiment, the template generation module may include the following submodules: a first modeling module, for calculating the Mel-frequency cepstral coefficients (MFCCs) of each segment of input voice corresponding to a speech text, and forming a Gaussian mixture model for each speech text from the MFCCs and the pronunciation features (such as syllables and phonemes); a second modeling module, for forming a corresponding hidden Markov model according to the transition matrix of each segment of input voice; and a template creation module, for forming the speech recognition template based on a hybrid of the Gaussian mixture model and the hidden Markov model of each speech text. Those skilled in the art can calculate the MFCCs of the input voice by means that are conventional in the art, and the present invention places no limitation on this.
Referring to the schematic diagram of another embodiment shown in Fig. 4, in this embodiment the content of the operation commands and the correspondence between operation commands and speech recognition templates are user-defined. The user can define the operation commands, and the correspondence between operation commands and speech recognition templates, according to the actual application scenario. For example, in an access control scenario the operation commands may be defined as the two commands "open the door" and "close the door". Likewise, in this scenario speech recognition templates can be customized for these two operation commands; for example, the speech recognition template whose speech text is a verification password such as "123456" is mapped to the operation command "open the door". Only after the instruction voice "123456" is issued will the access control system receive the "open the door" operation command.
Further, referring to the schematic configuration diagram of the embodiment shown in Fig. 5, the content of the operation commands and/or the correspondence between operation commands and speech recognition templates can be stored in a local database. An issued instruction voice can be mapped to the corresponding operation command based on the operation command content and/or the correspondence stored in the local database, so that the operation command can be output to the control device without a network connection.
In one or more embodiments, the speech recognition template is trained on collected input voice that is dynamically updated. In the access control scenario above, the user can raise the security level of the access control system by updating the speech recognition templates regularly, preventing unauthorized persons from entering a restricted area.
Although the present invention has been described in considerable detail and with particular reference to several embodiments, it is not intended to be limited to any of these details or embodiments, or to any specific embodiment. Rather, it should be construed, with reference to the appended claims and in view of the prior art, so as to give those claims the broadest possible interpretation and thereby effectively cover the intended scope of the invention. Furthermore, the invention has been described above in terms of embodiments foreseeable by the inventor, for the purpose of providing a useful description; insubstantial modifications of the invention not presently foreseen may nonetheless represent equivalents of it.
Claims (8)
1. An offline voice-to-command conversion method, characterized by comprising the following steps:
receiving the speech text of each of a plurality of training voices, and constructing a corresponding voice phrase dictionary based on the speech texts, the voice phrase dictionary including at least the text information of the corresponding speech text;
receiving multiple segments of input voice for each training voice;
forming a corresponding speech recognition template based on the multiple segments of input voice of each training voice and the voice phrase dictionary, and storing the speech recognition template locally;
mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by an instruction voice, outputting the corresponding operation command to a control device.
2. The method according to claim 1, characterized in that the voice phrase dictionary further includes at least one of the following pronunciation features of the speech text: phrases, words, single characters, syllables and phonemes.
3. The method according to claim 1, characterized in that forming the speech recognition template further includes the following sub-steps:
calculating the Mel-frequency cepstral coefficients of each segment of input voice of a speech text to form a Gaussian mixture model for each speech text;
forming a corresponding hidden Markov model according to the transition matrix of each segment of input voice;
forming the speech recognition template based on the Gaussian mixture model and the hidden Markov model of each speech text.
4. The method according to claim 1, characterized in that the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is user-defined.
5. The method according to claim 4, characterized in that the content of the operation commands and/or the correspondence between operation commands and speech recognition templates is stored locally.
6. The method according to claim 1, characterized in that the speech recognition template is trained on collected input voice that is dynamically updated.
7. An offline voice-to-command conversion device, characterized by comprising the following modules:
a text receiving module, for receiving the speech text of each of a plurality of training voices and constructing a corresponding voice phrase dictionary based on the speech texts, the voice phrase dictionary including at least the text information of the corresponding speech text;
a voice receiving module, for receiving multiple segments of input voice for each training voice;
a template generation module, for forming a corresponding speech recognition template based on the multiple segments of input voice of each training voice and the voice phrase dictionary, and storing the speech recognition template locally;
a voice mapping module, for mapping each speech recognition template to a corresponding operation command and, according to the speech recognition template matched by an instruction voice, outputting the corresponding operation command to a control device.
8. A computer-readable storage medium storing computer instructions, characterized in that the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810533495.2A CN108831458A (en) | 2018-05-29 | 2018-05-29 | Offline voice-to-command conversion method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810533495.2A CN108831458A (en) | 2018-05-29 | 2018-05-29 | Offline voice-to-command conversion method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108831458A true CN108831458A (en) | 2018-11-16 |
Family
ID=64146626
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810533495.2A Pending CN108831458A (en) | 2018-05-29 | 2018-05-29 | Offline voice-to-command conversion method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108831458A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109495360A (en) * | 2018-12-18 | 2019-03-19 | 深圳国美云智科技有限公司 | A kind of smart home Internet of Things platform, offline sound control method and system |
CN112669848A (en) * | 2020-12-14 | 2021-04-16 | 深圳市优必选科技股份有限公司 | Offline voice recognition method and device, electronic equipment and storage medium |
WO2022134025A1 (en) * | 2020-12-25 | 2022-06-30 | 京东方科技集团股份有限公司 | Offline speech recognition method and apparatus, electronic device and readable storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6088669A (en) * | 1997-01-28 | 2000-07-11 | International Business Machines, Corporation | Speech recognition with attempted speaker recognition for speaker model prefetching or alternative speech modeling |
CN1268732A (en) * | 2000-03-31 | 2000-10-04 | 清华大学 | Speech recognition special-purpose chip based speaker-dependent speech recognition and speech playback method |
CN2814830Y (en) * | 2005-06-25 | 2006-09-06 | 陈修志 | Sound control TV set and remote controller |
CN102005070A (en) * | 2010-11-17 | 2011-04-06 | 广东中大讯通信息有限公司 | Voice identification gate control system |
CN102227767A (en) * | 2008-11-12 | 2011-10-26 | Scti控股公司 | System and method for automatic speach to text conversion |
CN102568478A (en) * | 2012-02-07 | 2012-07-11 | 合一网络技术(北京)有限公司 | Video play control method and system based on voice recognition |
CN105957518A (en) * | 2016-06-16 | 2016-09-21 | 内蒙古大学 | Mongolian large vocabulary continuous speech recognition method |
CN107705787A (en) * | 2017-09-25 | 2018-02-16 | 北京捷通华声科技股份有限公司 | A kind of audio recognition method and device |
- 2018-05-29 CN CN201810533495.2A patent/CN108831458A/en active Pending
Non-Patent Citations (3)
Title |
---|
《大科学家讲科学 人机共创的智慧》: "《大科学家讲科学 人机共创的智慧》", 30 August 2017 * |
中国航天科工集团第三研究院三0一所: "《世界国防科技年度发展报告》", 30 April 2018 * |
赖真: "《解忧IT》", 30 October 2016 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109495360A (en) * | 2018-12-18 | 2019-03-19 | 深圳国美云智科技有限公司 | A kind of smart home Internet of Things platform, offline sound control method and system |
CN112669848A (en) * | 2020-12-14 | 2021-04-16 | 深圳市优必选科技股份有限公司 | Offline voice recognition method and device, electronic equipment and storage medium |
CN112669848B (en) * | 2020-12-14 | 2023-12-01 | 深圳市优必选科技股份有限公司 | Offline voice recognition method and device, electronic equipment and storage medium |
WO2022134025A1 (en) * | 2020-12-25 | 2022-06-30 | 京东方科技集团股份有限公司 | Offline speech recognition method and apparatus, electronic device and readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022083083A1 (en) | Sound conversion system and training method for same | |
US8498857B2 (en) | System and method for rapid prototyping of existing speech recognition solutions in different languages | |
US20220139395A1 (en) | Configurable output data formats | |
US10163436B1 (en) | Training a speech processing system using spoken utterances | |
US10650810B2 (en) | Determining phonetic relationships | |
EP3062239A1 (en) | Natural expression processing method, processing and response method, device, and system | |
JP2017058674A (en) | Apparatus and method for speech recognition, apparatus and method for training transformation parameter, computer program and electronic apparatus | |
CN202736475U (en) | Chat robot | |
KR20170041105A (en) | Apparatus and method for calculating acoustic score in speech recognition, apparatus and method for learning acoustic model | |
CN106463113A (en) | Predicting pronunciation in speech recognition | |
CN109523989A (en) | Phoneme synthesizing method, speech synthetic device, storage medium and electronic equipment | |
CN108766441A (en) | A kind of sound control method and device based on offline Application on Voiceprint Recognition and speech recognition | |
JP2001100781A (en) | Method and device for voice processing and recording medium | |
CN108831458A (en) | A kind of offline voice is to order transform method and system | |
CN106057192A (en) | Real-time voice conversion method and apparatus | |
CN112102811B (en) | Optimization method and device for synthesized voice and electronic equipment | |
WO2010030742A1 (en) | Method for creating a speech model | |
JPH10504404A (en) | Method and apparatus for speech recognition | |
CN104679733B (en) | A kind of voice dialogue interpretation method, apparatus and system | |
CN106710587A (en) | Speech recognition data pre-processing method | |
US8401855B2 (en) | System and method for generating data for complex statistical modeling for use in dialog systems | |
Venkatagiri | Speech recognition technology applications in communication disorders | |
CN113257221B (en) | Voice model training method based on front-end design and voice synthesis method | |
CN110085212A (en) | A kind of audio recognition method for CNC program controller | |
CN113035237B (en) | Voice evaluation method and device and computer equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181116 |