CN106558306A - Method for voice recognition, device and equipment - Google Patents
- Publication number
- CN106558306A CN106558306A CN201610052812.XA CN201610052812A CN106558306A CN 106558306 A CN106558306 A CN 106558306A CN 201610052812 A CN201610052812 A CN 201610052812A CN 106558306 A CN106558306 A CN 106558306A
- Authority
- CN
- China
- Prior art keywords
- voice
- sound
- characteristics information
- unit
- equipment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS; G10—MUSICAL INSTRUMENTS; ACOUSTICS; G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING; G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0635—Training updating or merging of old and new templates; Mean values; Weighting
Abstract
A method for voice recognition includes: receiving voice information; extracting the voice feature information from the voice information; matching the voice feature information against the sound templates in a sound library; and, after a successful match, re-training the sound templates in the sound library using the voice feature information. Because the sound templates in the library are re-trained after every successful recognition, the templates become progressively richer, which greatly improves the recognition success rate. Some embodiments also disclose a device for speech recognition and equipment with a speech recognition function.
Description
Technical field
The invention belongs to the field of pattern recognition technology, and in particular relates to a method and device for voice recognition, and to equipment with a speech recognition function.
Background
Currently, smart devices such as tablet computers, smartphones and smart-home products are becoming increasingly popular and are gradually becoming standard equipment for households and individuals. Smart devices based on voice interaction are practical and have been widely applied to household appliances, in-car systems, mobile phones and the like. Many of these devices have a voice wake-up function, used to unlock the screen or as an auxiliary means of launching applications. Voice wake-up works as follows: while the device is in standby, a detector runs continuously in the background under very low power consumption, listening for a predefined wake-up word; when it detects the user saying this word, the device is woken up and put into its normal working state. However, the success rate of current speech recognition technology is still unsatisfactory and needs further improvement.
Summary of the invention
In view of this, an object of the present invention is to propose a method for voice recognition that improves the recognition success rate. To provide a basic understanding of some aspects of the disclosed embodiments, a brief summary is given below. This summary is not an extensive overview, nor is it intended to identify key or critical elements or to delimit the scope of protection of these embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the detailed description that follows.
In some optional embodiments, the method for voice recognition includes: receiving voice information; extracting the voice feature information from the voice information; matching the voice feature information against the sound templates in a sound library; and, after a successful match, re-training the sound templates in the sound library using the voice feature information. Because the sound templates are re-trained after every successful recognition, they become progressively richer, which greatly improves the recognition success rate.
Another object of the present invention is to propose a device for speech recognition.
In some optional embodiments, the device for speech recognition includes: a voice collecting unit that receives voice information; a feature extraction unit that extracts the voice feature information from the voice information; a voice recognition unit that matches the voice feature information against the sound templates in a sound library; and a retraining unit that, after the voice recognition unit reports a successful match, re-trains the sound templates in the sound library using the voice feature information.
A further object of the present invention is to propose equipment with a speech recognition function.
In some optional embodiments, the equipment with a speech recognition function includes a speech input device and the above device for speech recognition.
To achieve the foregoing and related ends, one or more embodiments include the features described in detail below and particularly pointed out in the claims. The following description and the accompanying drawings describe certain illustrative aspects in detail, and these indicate only some of the various ways in which the principles of the embodiments may be employed. Other benefits and novel features will become apparent from the following detailed description considered in conjunction with the drawings, and the disclosed embodiments are intended to include all such aspects and their equivalents.
Description of the drawings
Fig. 1 shows an embodiment of the method for voice recognition;
Fig. 2 shows an embodiment of the device for speech recognition;
Fig. 3 shows another embodiment of the device for speech recognition.
Detailed description of the embodiments
The following description and drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process and other changes; the examples merely represent possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of other embodiments. The scope of the embodiments of the present invention encompasses the entire scope of the claims and all available equivalents of the claims. Herein, these embodiments of the invention may be referred to, individually or collectively, by the term "invention" merely for convenience; if more than one invention is in fact disclosed, this is not intended to automatically limit the scope of the application to any single invention or inventive concept.
Fig. 1 shows an embodiment of the method for voice recognition:
Step 11: receive voice information;
Step 12: extract the voice feature information from the voice information;
Step 13: match the voice feature information against the sound templates in the sound library;
Step 14: after a successful match, re-train the sound templates in the sound library using the voice feature information.
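The four steps above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the feature vectors and the matching threshold are assumed inputs, and re-training is shown as a simple weighted average of the old template and the newly matched features (merging old and new templates by weighting is one of the update strategies this kind of system may use).

```python
import numpy as np

def recognize_and_retrain(features, templates, threshold, alpha=0.1):
    """Match a feature vector against templates; on success, update the winner.

    features  : 1-D feature vector extracted from the incoming voice signal
    templates : dict mapping word -> 1-D template vector (same length)
    threshold : maximum Euclidean distance accepted as a match
    alpha     : weight given to the new observation when re-training
    """
    # Step 13: find the closest template (smallest distortion measure).
    distances = {w: np.linalg.norm(features - t) for w, t in templates.items()}
    best = min(distances, key=distances.get)
    if distances[best] > threshold:
        return None  # no match: leave the sound library unchanged
    # Step 14: re-train the matched template toward the new observation,
    # so the library gradually adapts to the user's actual voice.
    templates[best] = (1 - alpha) * templates[best] + alpha * features
    return best
```

Each successful call nudges the winning template toward the latest utterance, which is how the library "becomes progressively richer" over repeated use.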
For speech recognition technology, the basic method of recognizing meaningful, substantive voice information is currently as follows. Voice feature information is analyzed in advance and stored by the machine as required; the voice feature information in this speech parameter library is called a "template" (template-based approach), and the process of producing it is called "training". The unknown voice to be recognized (the test voice) is converted into an electrical signal and, after preprocessing, acoustic modeling and feature extraction, yields voice feature information. This is compared one by one with the sound templates in the sound library, and a matching method is used to find the template closest to the voice features, giving the recognition result; this process is called "recognition". Of course, the comparison needs a criterion, namely a "distortion measure" between speech parameter vectors: the content represented by the template with the smallest distortion is the recognition result.
The speech recognition process is generally divided into two stages: a training stage and a recognition stage. The task of the former is to establish the speech model of the basic recognition unit and the language model; the latter compares the speech feature parameters of the target voice with the sound templates to obtain the recognition result.
Acoustic model
The acoustic model is the underlying model of the recognition system and the most critical part of a speech recognition system. Its goal is to provide an effective way to compute the distance between the feature vector sequence of the voice and each sound template. The design of the acoustic model is closely related to the pronunciation characteristics of the language. The size of the modeling unit (word model, syllable model, semi-syllable model or phoneme model) has a large influence on the amount of training data required, the recognition rate and the flexibility of the system. For recognition systems with medium or larger vocabularies, a small recognition unit means a small amount of computation, small model storage and a small training-data requirement, but it brings difficulties in locating and segmenting the corresponding speech segments and more complex recognition model rules. A large recognition unit, in contrast, easily captures coarticulation within the model, which helps improve the recognition rate but requires correspondingly more training data.
Language model
The language model (Language Model, LM) generally refers to the linguistic rules that constrain words and search paths during matching. It is knowledge that effectively combines syntax and semantics during speech recognition, improving the recognition rate and reducing the search space. Because it is difficult to determine word boundaries accurately, and because the acoustic model has limited ability to describe pronunciation variation, recognition produces many word sequences with similar probability scores. Practical speech recognition systems therefore usually use a language model to select the most likely word sequence from the many candidate results, compensating for the weakness of the acoustic model.
Language models can be divided into rule-based language models and statistical language models. A rule-based language model summarizes grammatical or even semantic rules and then uses them to exclude acoustic recognition results that violate those rules. A statistical language model describes the dependencies between words through statistical probabilities, indirectly encoding grammatical or semantic rules. Rule-based language models work well in task-specific systems and can substantially improve the recognition rate, but since everyday spoken dialogue cannot be described by rigid rules, large-vocabulary speech recognition systems mainly use statistical language models.
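As an illustration of the statistical approach, a bigram model estimated from counts can score candidate word sequences so that the more plausible one wins. The add-one smoothing and the tiny training corpus below are assumptions of this sketch, not details from the text:

```python
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized training sentences."""
    uni, bi = Counter(), Counter()
    for words in sentences:
        padded = ["<s>"] + words          # sentence-start marker
        uni.update(padded)
        bi.update(zip(padded, padded[1:]))
    return uni, bi

def score(words, uni, bi, vocab_size):
    """Product of add-one-smoothed bigram probabilities P(w_i | w_{i-1})."""
    padded = ["<s>"] + words
    p = 1.0
    for a, b in zip(padded, padded[1:]):
        p *= (bi[(a, b)] + 1) / (uni[a] + vocab_size)
    return p
```

In a recognizer, this score would be combined with the acoustic score to pick the best word sequence among the similarly scored acoustic candidates.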
Feature extraction
Feature extraction aims to extract from the speech waveform the important information that reflects the characteristics of the speech, and to remove relatively irrelevant information. It is both a process of significant information compression and a process of signal deconvolution. Because of the time-varying nature of the speech signal, feature extraction must be carried out on short segments of the signal, i.e. as short-time analysis. The two most widely used feature extraction techniques at present are linear prediction cepstral coefficients (LPCC), based on a vocal-tract model, and Mel-frequency cepstral coefficients (MFCC), based on the auditory mechanism. The basic idea of the former is that adjacent samples of the speech signal are strongly correlated, so each sample can be approximated by a weighted linear combination of several preceding samples. The latter fully takes into account the auditory properties of the human ear, characterizing with an objective measure people's subjective perception of loudness. By comparison, MFCC has certain advantages: 1. the information of speech is concentrated mainly in the low-frequency part, while the high-frequency part is easily disturbed by ambient noise; MFCC emphasizes the low-frequency information of speech, thereby highlighting information useful for recognition and shielding noise interference; 2. MFCC makes no model assumptions and can be used in all situations; its recognition performance and noise robustness (i.e. insensitivity to noise characteristics or parameters) are better than those of LPCC.
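The linear-prediction idea behind LPCC (each sample approximated as a weighted combination of the preceding samples) can be sketched by solving the autocorrelation normal equations over one analysis frame. This is a minimal illustration only; a real LPCC front-end would add pre-emphasis, windowing and a cepstral transform:

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Return LPC coefficients a[1..order] minimizing the prediction error
    e[n] = x[n] - sum_k a[k] * x[n-k] over one short-time analysis frame."""
    n = len(frame)
    # Autocorrelation of the frame at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], solved directly here; the
    # Levinson-Durbin recursion would be the efficient classical choice.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])
```

On a frame generated by a first-order recursion x[n] = 0.9 x[n-1], an order-1 fit recovers a coefficient close to 0.9, which is exactly the "weighted sum of previous samples" property the text describes.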
As a rule, the test voice is preprocessed before feature extraction, partially removing the influence of noise and of speaker differences so that the processed signal better reflects the essential characteristics of the speech. The most common preprocessing steps are endpoint detection and speech enhancement. Endpoint detection means distinguishing the speech and non-speech periods in the signal and accurately determining the starting point of the speech. After endpoint detection, subsequent processing can be applied to the speech signal only, which plays an important role in improving model accuracy and the recognition correct rate. The main task of speech enhancement is to eliminate the influence of ambient noise on the speech. The common method at present is Wiener filtering, which performs better than other filters when the noise is strong.
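A classic way to realize endpoint detection is short-time energy thresholding. The sketch below is an assumption of this edit, not the patent's method: it treats the first few frames as noise-only and marks speech as starting at the first frame whose energy exceeds a multiple of that noise floor:

```python
import numpy as np

def detect_start(signal, frame_len=160, noise_frames=5, factor=4.0):
    """Return the sample index where speech is deemed to start, or None.

    The first `noise_frames` frames are assumed to contain background
    noise only; the energy threshold is `factor` times their mean energy.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    energy = np.array([np.sum(f.astype(float) ** 2) for f in frames])
    threshold = factor * energy[:noise_frames].mean()
    for i, e in enumerate(energy):
        if e > threshold:
            return i * frame_len
    return None
```

Everything before the returned index can then be discarded, so the matcher only ever sees the speech portion of the signal.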
Pattern matching
Pattern matching, also called similarity measurement, means finding, according to a certain criterion, the best match between the unknown voice and a sound template in the sound library. Specifically, pattern matching compares, by a similarity measure, the feature vectors of the voice to be recognized with the sound templates in the library, and outputs the class of the most similar sound template as the intermediate candidate recognition result.
The speech recognition process is essentially one of collecting voice information, comparing and matching it with the sound templates in the sound library, and outputting the closest result. To accomplish correct recognition, however, the concrete operation must be supported by a suitable algorithm.
One optional speech recognition algorithm is the pattern-matching method of dynamic time warping (DTW, Dynamic Time Warping). In this method, several recordings of the same wake-up word are made in advance, and training yields several sound templates of the wake-up word together with the sound library. During recognition, the collected voice is dynamically matched against each sound template, and the matching distance is compared with a preset threshold; when the distance is below the threshold, the match succeeds.
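DTW aligns two feature sequences of different lengths by dynamic programming. The following sketch computes the classic cumulative alignment distance, a generic textbook DTW offered as an illustration rather than the patented variant:

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Cumulative DTW distance between two sequences of feature vectors."""
    a = np.asarray(seq_a, dtype=float)
    b = np.asarray(seq_b, dtype=float)
    if a.ndim == 1:                      # allow sequences of scalars
        a, b = a[:, None], b[:, None]
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def is_wake_word(features, templates, threshold):
    """Match succeeds when the closest template is within the threshold."""
    return min(dtw_distance(features, t) for t in templates) < threshold
```

Because the warping path may stay on one axis, a template spoken slightly slower or faster than the test utterance still aligns with low cost, which is exactly why DTW suits small-template wake-word matching.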
Another optional speech recognition algorithm is the method based on the log-likelihood ratio (LLR, log likelihood ratio), which is a model-based method. First, a hidden Markov model (HMM, Hidden Markov Model) of the wake-up word is trained from recordings of many people saying the same wake-up word, and several background templates are trained as well. During matching, the Viterbi algorithm is used to force-align the voice with the model states, yielding a log-likelihood; at the same time, the voice is scored with the background model, yielding a maximum reference likelihood value. The ratio of the log-likelihood to the maximum reference likelihood value is compared with a preset threshold; when the ratio exceeds the threshold, the match succeeds.
A further optional speech recognition algorithm is based on the log-likelihood alone. It is similar to the LLR method above, except that no background model is needed: the wake-up-word model is force-aligned with the voice directly to obtain the log-likelihood score of the optimal path, and when the score exceeds a preset threshold, the match succeeds.
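Viterbi forced alignment against a left-to-right model can be sketched as follows. The spherical-Gaussian state emissions and the stay-or-advance transition scheme are assumptions of this illustration, not details given in the text:

```python
import numpy as np

def forced_alignment_score(features, state_means, state_var=1.0):
    """Best-path log-likelihood of aligning `features` (T x d) to a
    left-to-right sequence of Gaussian states (N x d), Viterbi-style.
    Allowed moves per frame: stay in the current state or advance by one."""
    feats = np.atleast_2d(np.asarray(features, dtype=float))
    means = np.atleast_2d(np.asarray(state_means, dtype=float))
    T, N = len(feats), len(means)
    # Log-density of each frame under each state (spherical Gaussian;
    # constant terms dropped, since only score comparisons matter).
    ll = -0.5 * ((feats[:, None, :] - means[None, :, :]) ** 2).sum(-1) / state_var
    score = np.full(N, -np.inf)
    score[0] = ll[0, 0]                  # alignment must start in state 0
    for t in range(1, T):
        stay = score
        advance = np.concatenate(([-np.inf], score[:-1]))
        score = np.maximum(stay, advance) + ll[t]
    return score[-1]                     # ...and end in the last state
```

The threshold test of the log-likelihood method then reduces to `forced_alignment_score(...) > threshold`; the LLR variant would additionally subtract a background-model score before comparing.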
When the method for above-described embodiment is used for terminal, after user says wake-up word, mobile terminal match cognization goes out to wake up
After word, mobile terminal will be waken up, i.e., mode of operation will be switched to from battery saving mode.
Fig. 2 shows an embodiment of the device for speech recognition. The device includes a voice collecting unit S21 that receives voice information, a feature extraction unit S22 that extracts the voice feature information from the voice information, a voice recognition unit S23 that matches the voice feature information against the sound templates in the sound library, and a retraining unit S24. The retraining unit S24 re-trains the sound templates in the sound library using the voice feature information after the voice recognition unit S23 reports a successful match.
Fig. 3 shows another embodiment of the device for speech recognition. The device includes a voice collecting unit S21 that receives voice information, a preprocessing unit S31 that preprocesses the voice information, a feature extraction unit S22 that extracts the voice feature information from the voice information, a voice recognition unit S23 that matches the voice feature information against the sound templates in the sound library, and a retraining unit S24.
In some optional embodiments, the voice recognition unit S23 contains one of the following computing units: a dynamic time warping algorithm unit, a log-likelihood ratio algorithm unit, or a log-likelihood algorithm unit.
Equipment with a speech recognition function is also proposed herein. In one embodiment, the equipment includes a speech input device and the device for speech recognition disclosed in the previous embodiments. In another embodiment, the equipment further includes a mode switching unit that switches the equipment from a power-saving mode to a working mode after the voice recognition unit reports a successful match.
The equipment includes, but is not limited to, electronic equipment and electrical appliances. The electronic equipment includes, but is not limited to, mobile phones, tablet computers and in-car computing devices. The electrical appliances include, but are not limited to, televisions, loudspeakers, electric lights, water heaters and refrigerators.
Those skilled in the art will further appreciate that the various illustrative blocks, modules, circuits and algorithm steps described in connection with the embodiments herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly demonstrate this interchangeability of hardware and software, the various illustrative components, blocks, modules, circuits and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as a departure from the scope of the present disclosure.
Claims (10)
1. A method for voice recognition, characterized by including:
receiving voice information;
extracting the voice feature information in the voice information;
matching the voice feature information against the sound templates in a sound library;
after a successful match, re-training the sound templates in the sound library using the voice feature information.
2. The method of claim 1, characterized in that the method is used in a mobile terminal and further includes: after a successful match, switching the mobile terminal from a first mode to a second mode.
3. The method of claim 1 or 2, characterized in that the voice feature information is matched against the sound templates in the sound library using dynamic time warping, a log-likelihood ratio method, or a log-likelihood method.
4. The method of claim 1 or 2, characterized by further including preprocessing the voice information before extracting the voice feature information.
5. A device for speech recognition, characterized by including:
a voice collecting unit that receives voice information;
a feature extraction unit that extracts the voice feature information in the voice information;
a voice recognition unit that matches the voice feature information against the sound templates in a sound library; and
a retraining unit that, after the voice recognition unit reports a successful match, re-trains the sound templates in the sound library using the voice feature information.
6. The device of claim 5, characterized in that the voice recognition unit contains one of the following computing units: a dynamic time warping algorithm unit, a log-likelihood ratio algorithm unit, or a log-likelihood algorithm unit.
7. The device of claim 5 or 6, characterized by further including a preprocessing unit that preprocesses the voice information.
8. Equipment with a speech recognition function, including a speech input device, characterized by further including the device for speech recognition of claim 5, 6 or 7.
9. The equipment of claim 8, characterized by further including a mode switching unit that switches the equipment from a first mode to a second mode after the voice recognition unit reports a successful match.
10. The equipment of claim 8 or 9, characterized in that the equipment is electronic equipment or an electrical appliance.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2015106317753 | 2015-09-28 | ||
CN201510631775 | 2015-09-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106558306A true CN106558306A (en) | 2017-04-05 |
Family
ID=58418180
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610052812.XA Pending CN106558306A (en) | 2015-09-28 | 2016-01-25 | Method for voice recognition, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106558306A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108600898A (en) * | 2018-03-28 | 2018-09-28 | 深圳市冠旭电子股份有限公司 | A kind of method, wireless sound box and the terminal device of configuration wireless sound box |
CN108831441A (en) * | 2018-05-08 | 2018-11-16 | 上海依图网络科技有限公司 | A kind of training method and device of speech recognition modeling |
CN109785825A (en) * | 2018-12-29 | 2019-05-21 | 广东长虹日电科技有限公司 | A kind of algorithm and storage medium, the electric appliance using it of speech recognition |
CN110471410A (en) * | 2019-07-17 | 2019-11-19 | 武汉理工大学 | Intelligent vehicle voice assisting navigation and safety prompting system and method based on ROS |
CN110782886A (en) * | 2018-07-30 | 2020-02-11 | 阿里巴巴集团控股有限公司 | System, method, television, device and medium for speech processing |
CN111292753A (en) * | 2020-02-28 | 2020-06-16 | 广州国音智能科技有限公司 | Offline voice recognition method, device and equipment |
CN111599363A (en) * | 2019-02-01 | 2020-08-28 | 浙江大学 | Voice recognition method and device |
CN112951274A (en) * | 2021-02-07 | 2021-06-11 | 脸萌有限公司 | Voice similarity determination method and device, and program product |
CN113709545A (en) * | 2021-04-13 | 2021-11-26 | 腾讯科技(深圳)有限公司 | Video processing method and device, computer equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1741131A (en) * | 2004-08-27 | 2006-03-01 | 中国科学院自动化研究所 | A kind of unspecified person alone word audio recognition method and device |
CN102074231A (en) * | 2010-12-30 | 2011-05-25 | 万音达有限公司 | Voice recognition method and system |
CN102693723A (en) * | 2012-04-01 | 2012-09-26 | 北京安慧音通科技有限责任公司 | Method and device for recognizing speaker-independent isolated word based on subspace |
CN102723078A (en) * | 2012-07-03 | 2012-10-10 | 武汉科技大学 | Emotion speech recognition method based on natural language comprehension |
CN102760434A (en) * | 2012-07-09 | 2012-10-31 | 华为终端有限公司 | Method for updating voiceprint feature model and terminal |
CN103903612A (en) * | 2014-03-26 | 2014-07-02 | 浙江工业大学 | Method for performing real-time digital speech recognition |
- 2016-01-25: application CN201610052812.XA filed in China; CN106558306A pending
Non-Patent Citations (1)
Title |
---|
李梅 (Li Mei): "物联网科技导论" [Introduction to Internet of Things Science and Technology], Beijing: Beijing University of Posts and Telecommunications Press, 31 August 2015 *
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108600898A (en) * | 2018-03-28 | 2018-09-28 | 深圳市冠旭电子股份有限公司 | A kind of method, wireless sound box and the terminal device of configuration wireless sound box |
CN108600898B (en) * | 2018-03-28 | 2020-03-31 | 深圳市冠旭电子股份有限公司 | Method for configuring wireless sound box, wireless sound box and terminal equipment |
CN108831441B (en) * | 2018-05-08 | 2019-08-13 | 上海依图网络科技有限公司 | A kind of training method and device of speech recognition modeling |
CN108831441A (en) * | 2018-05-08 | 2018-11-16 | 上海依图网络科技有限公司 | A kind of training method and device of speech recognition modeling |
CN110782886A (en) * | 2018-07-30 | 2020-02-11 | 阿里巴巴集团控股有限公司 | System, method, television, device and medium for speech processing |
CN109785825A (en) * | 2018-12-29 | 2019-05-21 | 广东长虹日电科技有限公司 | A kind of algorithm and storage medium, the electric appliance using it of speech recognition |
CN109785825B (en) * | 2018-12-29 | 2021-07-30 | 长虹美菱日电科技有限公司 | Speech recognition algorithm, storage medium and electric appliance applying speech recognition algorithm |
CN111599363A (en) * | 2019-02-01 | 2020-08-28 | 浙江大学 | Voice recognition method and device |
CN111599363B (en) * | 2019-02-01 | 2023-03-31 | 浙江大学 | Voice recognition method and device |
CN110471410A (en) * | 2019-07-17 | 2019-11-19 | 武汉理工大学 | Intelligent vehicle voice assisting navigation and safety prompting system and method based on ROS |
CN111292753A (en) * | 2020-02-28 | 2020-06-16 | 广州国音智能科技有限公司 | Offline voice recognition method, device and equipment |
CN112951274A (en) * | 2021-02-07 | 2021-06-11 | 脸萌有限公司 | Voice similarity determination method and device, and program product |
CN113709545A (en) * | 2021-04-13 | 2021-11-26 | 腾讯科技(深圳)有限公司 | Video processing method and device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106558306A (en) | Method for voice recognition, device and equipment | |
US10699699B2 (en) | Constructing speech decoding network for numeric speech recognition | |
KR102339594B1 (en) | Object recognition method, computer device, and computer-readable storage medium | |
CN102982811B (en) | Voice endpoint detection method based on real-time decoding | |
US20170140750A1 (en) | Method and device for speech recognition | |
US8140330B2 (en) | System and method for detecting repeated patterns in dialog systems | |
Mantena et al. | Query-by-example spoken term detection using frequency domain linear prediction and non-segmental dynamic time warping | |
CN105206271A (en) | Intelligent equipment voice wake-up method and system for realizing method | |
CN104575504A (en) | Method for personalized television voice wake-up by voiceprint and voice identification | |
Raj et al. | Phoneme-dependent NMF for speech enhancement in monaural mixtures | |
JP6284462B2 (en) | Speech recognition method and speech recognition apparatus | |
CN106782521A (en) | A kind of speech recognition system | |
CN107093422B (en) | Voice recognition method and voice recognition system | |
CN108091340B (en) | Voiceprint recognition method, voiceprint recognition system, and computer-readable storage medium | |
CN102945673A (en) | Continuous speech recognition method with speech command range changed dynamically | |
CN105210147B (en) | Method, apparatus and computer-readable recording medium for improving at least one semantic unit set | |
US20220076683A1 (en) | Data mining apparatus, method and system for speech recognition using the same | |
CN103943111A (en) | Method and device for identity recognition | |
Verma et al. | Indian language identification using k-means clustering and support vector machine (SVM) | |
CN116343797A (en) | Voice awakening method and corresponding device | |
Ravinder | Comparison of hmm and dtw for isolated word recognition system of punjabi language | |
Zheng et al. | Acoustic texttiling for story segmentation of spoken documents | |
Desplanques et al. | Adaptive speaker diarization of broadcast news based on factor analysis | |
Li et al. | Automatic segmentation of Chinese Mandarin speech into syllable-like | |
Prukkanon et al. | F0 contour approximation model for a one-stream tonal word recognition system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right |
Effective date of registration: 20180829 Address after: 100000 Beijing Chaoyang District Jinsong seven district 717 Building 5 door 102. Applicant after: Cui Zheng Address before: 511402 three, Dongxing Road, Dalong street, Guangzhou, Guangdong, three. Applicant before: Guangdong Xinxintong Information System Services Co., Ltd. |
|
TA01 | Transfer of patent application right | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170405 |
|
RJ01 | Rejection of invention patent application after publication |