CN106558306A - Method for voice recognition, device and equipment

Method for voice recognition, device and equipment

Info

Publication number
CN106558306A
CN106558306A (application CN201610052812.XA)
Authority
CN
China
Prior art keywords
voice
sound
characteristics information
unit
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610052812.XA
Other languages
Chinese (zh)
Inventor
王斌
杨帅
曾明
Original Assignee
Guangdong Xinxintong Information System Services Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Xinxintong Information System Services Co Ltd
Publication of CN106558306A
Legal status: Pending

Links

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/02 - Feature extraction for speech recognition; Selection of recognition unit
    • G10L15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 - Training
    • G10L2015/0635 - Training updating or merging of old and new templates; Mean values; Weighting

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A voice recognition method is disclosed, including: receiving voice information; extracting voice feature information from the voice information; matching the voice feature information against the sound templates in a sound bank; and, after a successful match, retraining the sound templates in the sound bank using the voice feature information. Because the sound templates in the sound bank are retrained after each successful recognition, the templates become increasingly rich, which greatly improves the success rate of speech recognition. A device for speech recognition and equipment with a speech recognition function are also disclosed in certain embodiments.

Description

Method for voice recognition, device and equipment
Technical field
The invention belongs to the technical field of pattern recognition, and more particularly relates to a method and a device for voice recognition, and to equipment with a speech recognition function.
Background art
Currently, smart devices such as tablet computers, smartphones and smart home products are becoming increasingly popular and are gradually becoming standard equipment for families and individuals. Smart devices based on voice interaction are practical and have been widely applied in household appliances, in-vehicle systems, mobile phones and the like. Many of these devices provide a voice wake-up function, used to unlock the screen or as an auxiliary means of launching applications. Voice wake-up works as follows: while the device is in a standby state, a detection routine runs continuously in the background under very low power consumption and listens for a predefined wake-up word; when the device detects that the user has said this word, the device is woken up and switched to the normal operating state. However, the success rate of current speech recognition technology is still unsatisfactory and needs further improvement.
Summary of the invention
In view of this, it is an object of the invention to propose a voice recognition method that improves the success rate of speech recognition. To provide a basic understanding of some aspects of the disclosed embodiments, a brief summary is given below. This summary is not an extensive overview, nor is it intended to identify key or critical elements or delineate the scope of protection of these embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the detailed description that follows.
In some optional embodiments, the voice recognition method includes: receiving voice information; extracting voice feature information from the voice information; matching the voice feature information against the sound templates in a sound bank; and, after a successful match, retraining the sound templates in the sound bank using the voice feature information. Because the sound templates in the sound bank are retrained after each successful recognition, the templates become increasingly rich, which greatly improves the success rate of speech recognition.
Another object of the present invention is to propose a device for speech recognition.
In some optional embodiments, the device for speech recognition includes: a voice acquisition unit that receives voice information; a feature extraction unit that extracts voice feature information from the voice information; a voice recognition unit that matches the voice feature information against the sound templates in a sound bank; and a retraining unit that, after the voice recognition unit has achieved a successful match, retrains the sound templates in the sound bank using the voice feature information.
Another object of the present invention is to propose equipment with a speech recognition function.
In some optional embodiments, the equipment with a speech recognition function includes a speech input device and further includes the above device for speech recognition.
To accomplish the foregoing and related ends, the one or more embodiments comprise the features fully described below and particularly pointed out in the claims. The following description and the accompanying drawings set forth certain illustrative aspects in detail, and these are indicative of only some of the various ways in which the principles of the embodiments may be employed. Other benefits and novel features will become apparent from the following detailed description when considered in conjunction with the drawings, and the disclosed embodiments are intended to include all such aspects and their equivalents.
Description of the drawings
Fig. 1 shows an embodiment of the voice recognition method;
Fig. 2 shows an embodiment of the device for speech recognition;
Fig. 3 shows another embodiment of the device for speech recognition.
Specific embodiment
The following description and drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process and other changes. The examples merely represent possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Portions and features of some embodiments may be included in or substituted for those of other embodiments. The scope of the embodiments of the invention encompasses the full range of the claims, as well as all available equivalents of the claims. Herein, these embodiments of the invention may be referred to, individually or collectively, by the term "invention" merely for convenience, without intending to automatically limit the scope of this application to any single invention or inventive concept if more than one is in fact disclosed.
Fig. 1 shows an embodiment of the voice recognition method.
Step 11: receive voice information;
Step 12: extract voice feature information from the voice information;
Step 13: match the voice feature information against the sound templates in the sound bank;
Step 14: after a successful match, retrain the sound templates in the sound bank using the voice feature information.
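The flow of Fig. 1 can be pictured with the minimal Python sketch below. Everything in it is an illustrative assumption rather than the patent's own implementation: the per-frame log-energy features stand in for the MFCC or LPCC features discussed later, the Euclidean distance stands in for DTW or HMM scoring as the distortion measure, the weighted merge of old and new template values is only one possible retraining rule, and the sound bank is assumed to map template names to one-dimensional NumPy float arrays.

import numpy as np

def extract_features(audio, frame=256):
    # Toy feature extractor: per-frame log energy, a stand-in for the
    # MFCC/LPCC features described later in this specification.
    n = (len(audio) // frame) * frame
    frames = np.reshape(audio[:n], (-1, frame))
    return np.log(np.sum(frames ** 2, axis=1) + 1e-8)

def recognize_and_retrain(audio, sound_bank, threshold=5.0, alpha=0.1):
    # Steps 11-14 of Fig. 1: extract features, match against every template,
    # and on a successful match fold the new features back into the winning
    # template with a running weighted average (one possible retraining rule).
    feats = extract_features(audio)                          # step 12
    best, best_dist = None, float("inf")
    for name, tpl in sound_bank.items():                     # step 13
        m = min(len(feats), len(tpl))
        dist = np.linalg.norm(feats[:m] - tpl[:m]) / m       # crude distortion measure
        if dist < best_dist:
            best, best_dist = name, dist
    if best is not None and best_dist < threshold:           # the match is successful
        tpl = sound_bank[best]
        m = min(len(feats), len(tpl))
        tpl[:m] = (1 - alpha) * tpl[:m] + alpha * feats[:m]  # step 14: retrain template
        return best
    return None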
For current speech recognition technology, the basic approach to recognizing meaningful, substantive voice information is as follows. Voice feature information is analyzed in advance and stored by the machine as required; the voice feature information in this speech parameter bank is called a "template" (template-based approach), and this process is called "training". The unknown voice to be recognized is converted into an electrical signal and passed through pre-processing, pronunciation modelling and feature extraction to obtain voice feature information, which is compared one by one with the sound templates in the sound bank; a matching method is used to find the template closest to the voice features, and the recognition result is obtained. This process is called "recognition". Of course, a criterion is needed when making the comparison, namely the "distortion measures" between the speech parameter vectors: the content represented by the template with the smallest distortion is the recognition result.
The speech recognition process is generally divided into two stages: a training stage and a recognition stage. The task of the former is to build the speech model and language model of the basic recognition units; the latter compares the speech feature parameters of the target voice with the sound templates and obtains the recognition result.
Acoustic model
The acoustic model is the underlying model of the recognition system and the most critical part of a speech recognition system. The goal of the acoustic model is to provide an effective method of computing the distance between the feature vector sequence of the voice and each sound template. The design of the acoustic model is closely related to the pronunciation characteristics of the language. The size of the model's recognition unit (word pronunciation model, character pronunciation model, semi-syllable model or phoneme model) has a considerable influence on the amount of voice training data required, the speech recognition rate and the flexibility of the system. For speech recognition systems with a medium or larger vocabulary, a small recognition unit means a small amount of computation and small requirements for model storage and training data, but it brings difficulties in locating and segmenting the corresponding voice segments and requires more complex recognition model rules. In general, a large recognition unit easily incorporates co-articulation into the model, which helps to improve the recognition rate of the system, but requires a corresponding increase in training data.
Language model
A language model (LM) generally refers to the linguistic rules used for word-level and path constraints during the matching search. It is a means of effectively combining syntactic and semantic knowledge during speech recognition in order to improve the recognition rate and reduce the search space. Because it is difficult to determine word boundaries accurately, and because the acoustic model has limited ability to describe sound variation, many word sequences with similar probability scores are produced during recognition. Therefore, practical speech recognition systems usually use a language model to select the most likely word sequence from the many candidate results, compensating for the shortcomings of the acoustic model.
Language models can be divided into rule-based language models and statistical language models. A rule-based language model sums up grammatical or even semantic rules and then uses these rules to exclude acoustic recognition results that do not conform to them. A statistical language model encodes grammar or semantics indirectly by describing the statistical probability of dependencies between words. Rule-based language models have been applied successfully in specific task systems, where they can considerably improve the recognition rate. Since everyday spoken dialogue cannot be described with hard and fast rules, large-vocabulary speech recognition systems mainly use statistical language models.
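As a concrete illustration of a statistical language model, the sketch below trains a toy add-one-smoothed bigram model and uses it to rescore candidate word sequences, which is the role the language model plays after acoustic matching; the function names and the smoothing scheme are illustrative assumptions, not anything prescribed by the patent.

import math
from collections import Counter

def train_bigram_lm(sentences, smoothing=1.0):
    # Toy add-one-smoothed bigram model: estimates P(w2 | w1) from word counts.
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        padded = ["<s>"] + words + ["</s>"]
        unigrams.update(padded[:-1])
        bigrams.update(zip(padded, padded[1:]))
    vocab = len(unigrams) + 1
    def prob(w1, w2):
        return (bigrams[(w1, w2)] + smoothing) / (unigrams[w1] + smoothing * vocab)
    return prob

def rescore(prob, candidates):
    # Picks the candidate word sequence with the highest language-model
    # log-probability among the acoustic model's candidate outputs.
    def logp(words):
        padded = ["<s>"] + words + ["</s>"]
        return sum(math.log(prob(w1, w2)) for w1, w2 in zip(padded, padded[1:]))
    return max(candidates, key=logp)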
Feature extraction
Feature extraction aims to extract from the speech waveform the important information that reflects the features of the voice and to remove relatively irrelevant information. It is both a process of substantial information compression and a process of signal unwrapping. Because of the time-varying nature of the speech signal, speech feature extraction must be performed on short segments of the signal, that is, by short-time analysis. The most commonly used speech feature extraction techniques at present are linear predictive cepstral coefficients (LPCC), based on a vocal tract model, and Mel-frequency cepstral coefficients (MFCC), based on the auditory mechanism. The basic idea of the former is that adjacent samples of the speech signal are strongly correlated, so the value of each sample can be approximated by a weighted linear combination of the values of several preceding samples. The latter fully takes into account the auditory characteristics of the human ear and uses an objective measure to characterize the subjective perception of loudness. By comparison, MFCC has certain advantages: 1. the information of speech is concentrated mainly in the low-frequency part, while the high-frequency part is easily disturbed by ambient noise; MFCC emphasizes the low-frequency information of speech, thereby highlighting the information useful for recognition and shielding noise interference; 2. MFCC makes no prior assumptions and can be used in all circumstances, and its recognition performance and noise robustness (i.e. insensitivity to noise characteristics or parameters) are better than those of LPCC.
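A short MFCC extraction sketch is shown below. It assumes the third-party librosa package is available; the 16 kHz sampling rate, 25 ms frame length, 10 ms hop and 13 coefficients are common illustrative choices rather than values fixed by the patent.

import librosa

def extract_mfcc(path, n_mfcc=13):
    # Load the recording and compute one row of MFCC coefficients per frame.
    signal, sr = librosa.load(path, sr=16000)               # 16 kHz is typical for speech
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc,
                                n_fft=400, hop_length=160)  # 25 ms frames, 10 ms hop
    return mfcc.T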
As a rule, the voice to be recognized is pre-processed before speech feature extraction, partially removing the influence of noise and of different speakers so that the processed signal better reflects the essential characteristics of the voice. The most common pre-processing steps are endpoint detection and speech enhancement. Endpoint detection refers to distinguishing the speech and non-speech periods in the voice signal and accurately determining the starting point of the speech signal. After endpoint detection, subsequent processing can be applied to the speech signal only, which plays an important role in improving model accuracy and recognition correctness. The main task of speech enhancement is to eliminate the influence of ambient noise on the speech. The method generally used at present is Wiener filtering, whose effect under heavy noise is better than that of other filters.
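The endpoint detection step can be sketched with a very simple short-time-energy rule, shown below for a mono NumPy signal; practical systems usually combine energy with zero-crossing rate or model-based voice activity detection, and the Wiener-filter enhancement mentioned above is omitted here.

import numpy as np

def detect_endpoints(signal, frame=400, hop=160, ratio=4.0):
    # Toy energy-based endpoint detection: frames whose short-time energy
    # exceeds a multiple of the estimated noise floor are treated as speech,
    # and the sample range from the first to the last such frame is returned
    # (None when no speech is found).
    n_frames = 1 + max(0, (len(signal) - frame) // hop)
    energy = np.array([np.sum(signal[i * hop:i * hop + frame] ** 2)
                       for i in range(n_frames)])
    noise_floor = np.percentile(energy, 10) + 1e-10
    voiced = np.flatnonzero(energy > ratio * noise_floor)
    if voiced.size == 0:
        return None
    return voiced[0] * hop, voiced[-1] * hop + frame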
Pattern matching
Pattern matching, also called similarity measurement, refers to finding the best match between the unknown voice and a sound template in the sound bank according to a certain criterion. Specifically, pattern matching compares the feature vectors of the voice to be recognized with the sound templates in the sound bank by a similarity measure, and outputs the class of the sound template with the highest similarity as the intermediate candidate recognition result.
The process of speech recognition is essentially to collect voice information, compare and match it with the sound templates in the sound bank, and select the closest result for output. To complete recognition correctly, however, a suitable algorithm must support the specific operation process.
One optional speech recognition algorithm is the pattern-matching-based dynamic time warping (DTW) method. In this method, several recordings of the same wake-up word are made in advance and used to train several sound templates of the wake-up word and the sound bank. During recognition, the collected voice is dynamically matched against each sound template, and the matching distance is compared with a preset threshold; when the distance is less than the threshold, the match succeeds.
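A straightforward NumPy implementation of this DTW matching rule is sketched below; the length normalisation of the accumulated distance and the threshold are illustrative choices.

import numpy as np

def dtw_distance(a, b):
    # Classic dynamic time warping between two feature sequences a (n x d)
    # and b (m x d), using Euclidean local distances.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)   # length-normalised matching distance

def dtw_wake(features, templates, threshold):
    # The match succeeds when the smallest template distance is below the threshold.
    return min(dtw_distance(features, t) for t in templates) < threshold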
Another optional speech recognition algorithm is the method based on the log-likelihood ratio (LLR), which is a model-based method. In this method, a hidden Markov model (HMM) of the wake-up word is first trained from recordings of a large number of people saying the same wake-up word, and several background templates are also trained. During matching, the voice is force-aligned with the model states using the Viterbi algorithm to obtain a log-likelihood; at the same time, the voice is scored with the background model to obtain a maximum reference likelihood value. The ratio of the log-likelihood to the maximum reference likelihood value is compared with a preset threshold, and when the ratio is greater than the threshold, the match succeeds.
A further optional speech recognition algorithm is the method based on the log-likelihood. It is similar to the LLR method above, except that it no longer needs a background model: the wake-up word model is directly force-aligned with the voice to obtain the log-likelihood score of the optimal path, and when the score is greater than a preset threshold, the match succeeds.
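Both model-based decisions can be sketched as follows. The code assumes the third-party hmmlearn package, that the inputs are MFCC matrices with one row per frame, and that decode() (plain Viterbi) stands in for the forced alignment described above; the model sizes and thresholds are illustrative, and the patent itself does not name any particular library.

import numpy as np
from hmmlearn import hmm

def train_word_hmm(mfcc_sequences, n_states=6):
    # Train an HMM for the wake-up word (or for background speech) from
    # several recordings concatenated with their lengths.
    X = np.vstack(mfcc_sequences)
    lengths = [len(m) for m in mfcc_sequences]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
    model.fit(X, lengths)
    return model

def wake_decision_llr(mfcc, wake_model, background_model, llr_threshold):
    # Log-likelihood ratio test: Viterbi-align the utterance to the wake-word
    # HMM, score it against the background model, and compare the ratio
    # (a difference in the log domain) with the threshold.
    viterbi_ll, _ = wake_model.decode(mfcc)
    background_ll = background_model.score(mfcc)
    return (viterbi_ll - background_ll) > llr_threshold

def wake_decision_ll(mfcc, wake_model, ll_threshold):
    # Variant without a background model: accept when the per-frame Viterbi
    # log-likelihood of the best path exceeds a preset threshold.
    viterbi_ll, _ = wake_model.decode(mfcc)
    return viterbi_ll / len(mfcc) > ll_threshold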
When the method for above-described embodiment is used for terminal, after user says wake-up word, mobile terminal match cognization goes out to wake up After word, mobile terminal will be waken up, i.e., mode of operation will be switched to from battery saving mode.
Fig. 2 shows an embodiment of the device for speech recognition. The device includes a voice acquisition unit S21 that receives voice information, a feature extraction unit S22 that extracts voice feature information from the voice information, a voice recognition unit S23 that matches the voice feature information against the sound templates in the sound bank, and a retraining unit S24. The retraining unit S24 is used to retrain the sound templates in the sound bank with the voice feature information after the voice recognition unit S23 has achieved a successful match.
Fig. 3 shows another embodiment of the device for speech recognition. The device includes a voice acquisition unit S21 that receives voice information, a pre-processing unit S31 that pre-processes the voice information, a feature extraction unit S22 that extracts voice feature information from the voice information, a voice recognition unit S23 that matches the voice feature information against the sound templates in the sound bank, and a retraining unit S24.
In some optional embodiments, the voice recognition unit S23 has one of the following computing units: a dynamic time warping algorithm unit, a log-likelihood ratio algorithm unit, or a log-likelihood algorithm unit.
Equipment with a speech recognition function is also proposed herein. In one embodiment, the equipment includes a speech input device and further includes the device for speech recognition disclosed in the preceding embodiments. In another embodiment, the equipment further includes a mode switching unit that switches the equipment from a power-saving mode to a working mode after the voice recognition unit has achieved a successful match.
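Purely as an illustration of how the claimed units might be composed, the sketch below models each unit of Figs. 2 and 3 as a plain callable and adds the mode switch of this embodiment; all class, method and parameter names are hypothetical.

from enum import Enum

class Mode(Enum):
    POWER_SAVE = 0
    WORKING = 1

class SpeechRecognitionEquipment:
    # Illustrative composition of the claimed units: acquisition (handled by the
    # caller of on_voice), optional pre-processing, feature extraction,
    # recognition, retraining and mode switching. Each unit is assumed to be a
    # callable with the single signature used here.
    def __init__(self, preprocessor, extractor, recognizer, retrainer, sound_bank):
        self.preprocessor, self.extractor = preprocessor, extractor
        self.recognizer, self.retrainer = recognizer, retrainer
        self.sound_bank = sound_bank
        self.mode = Mode.POWER_SAVE

    def on_voice(self, audio):
        audio = self.preprocessor(audio)                  # pre-processing unit (Fig. 3)
        feats = self.extractor(audio)                     # feature extraction unit
        name = self.recognizer(feats, self.sound_bank)    # voice recognition unit
        if name is not None:                              # the match is successful
            self.retrainer(self.sound_bank, name, feats)  # retraining unit
            self.mode = Mode.WORKING                      # mode switching unit
        return name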
The equipment includes, but is not limited to, electronic devices and electrical appliances. The electronic devices include, but are not limited to, mobile phones, tablet computers and vehicle-mounted computing devices. The electrical appliances include, but are not limited to, televisions, speakers, electric lights, water heaters and refrigerators.
Those skilled in the art will further appreciate that the various illustrative blocks, modules, circuits and algorithm steps described in connection with the embodiments herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as a departure from the scope of the present disclosure.

Claims (10)

1. A method for voice recognition, characterized by comprising:
receiving voice information;
extracting voice feature information from the voice information;
matching the voice feature information against the sound templates in a sound bank;
after a successful match, retraining the sound templates in the sound bank using the voice feature information.
2. The method of claim 1, characterized in that the method is used for a mobile terminal and further comprises: after a successful match, switching the mobile terminal from a first mode to a second mode.
3. The method of claim 1 or 2, characterized in that the voice feature information is matched against the sound templates in the sound bank using a dynamic time warping method, a log-likelihood ratio method or a log-likelihood method.
4. The method of claim 1 or 2, characterized by further comprising pre-processing the voice information before the voice feature information is extracted.
5. A device for speech recognition, characterized by comprising:
a voice acquisition unit that receives voice information;
a feature extraction unit that extracts voice feature information from the voice information;
a voice recognition unit that matches the voice feature information against the sound templates in a sound bank; and
a retraining unit, used to retrain the sound templates in the sound bank with the voice feature information after the voice recognition unit has achieved a successful match.
6. The device of claim 5, characterized in that the voice recognition unit has one of the following computing units: a dynamic time warping algorithm unit, a log-likelihood ratio algorithm unit or a log-likelihood algorithm unit.
7. The device of claim 5 or 6, characterized by further comprising a pre-processing unit that pre-processes the voice information.
8. Equipment with a speech recognition function, comprising a speech input device, characterized by further comprising the device for speech recognition of claim 5, 6 or 7.
9. The equipment of claim 8, characterized by further comprising a mode switching unit that switches the equipment from a first mode to a second mode after the voice recognition unit has achieved a successful match.
10. The equipment of claim 8 or 9, characterized in that the equipment is an electronic device or an electrical appliance.
CN201610052812.XA 2015-09-28 2016-01-25 Method for voice recognition, device and equipment Pending CN106558306A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2015106317753 2015-09-28
CN201510631775 2015-09-28

Publications (1)

Publication Number Publication Date
CN106558306A true CN106558306A (en) 2017-04-05

Family

ID=58418180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610052812.XA Pending CN106558306A (en) 2015-09-28 2016-01-25 Method for voice recognition, device and equipment

Country Status (1)

Country Link
CN (1) CN106558306A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1741131A (en) * 2004-08-27 2006-03-01 中国科学院自动化研究所 A kind of unspecified person alone word audio recognition method and device
CN102074231A (en) * 2010-12-30 2011-05-25 万音达有限公司 Voice recognition method and system
CN102693723A (en) * 2012-04-01 2012-09-26 北京安慧音通科技有限责任公司 Method and device for recognizing speaker-independent isolated word based on subspace
CN102723078A (en) * 2012-07-03 2012-10-10 武汉科技大学 Emotion speech recognition method based on natural language comprehension
CN102760434A (en) * 2012-07-09 2012-10-31 华为终端有限公司 Method for updating voiceprint feature model and terminal
CN103903612A (en) * 2014-03-26 2014-07-02 浙江工业大学 Method for performing real-time digital speech recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李梅 (Li Mei): 《物联网科技导论》 (Introduction to Internet of Things Technology), 31 August 2015, Beijing: Beijing University of Posts and Telecommunications Press *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108600898A (en) * 2018-03-28 2018-09-28 深圳市冠旭电子股份有限公司 A kind of method, wireless sound box and the terminal device of configuration wireless sound box
CN108600898B (en) * 2018-03-28 2020-03-31 深圳市冠旭电子股份有限公司 Method for configuring wireless sound box, wireless sound box and terminal equipment
CN108831441B (en) * 2018-05-08 2019-08-13 上海依图网络科技有限公司 A kind of training method and device of speech recognition modeling
CN108831441A (en) * 2018-05-08 2018-11-16 上海依图网络科技有限公司 A kind of training method and device of speech recognition modeling
CN110782886A (en) * 2018-07-30 2020-02-11 阿里巴巴集团控股有限公司 System, method, television, device and medium for speech processing
CN109785825A (en) * 2018-12-29 2019-05-21 广东长虹日电科技有限公司 A kind of algorithm and storage medium, the electric appliance using it of speech recognition
CN109785825B (en) * 2018-12-29 2021-07-30 长虹美菱日电科技有限公司 Speech recognition algorithm, storage medium and electric appliance applying speech recognition algorithm
CN111599363A (en) * 2019-02-01 2020-08-28 浙江大学 Voice recognition method and device
CN111599363B (en) * 2019-02-01 2023-03-31 浙江大学 Voice recognition method and device
CN110471410A (en) * 2019-07-17 2019-11-19 武汉理工大学 Intelligent vehicle voice assisting navigation and safety prompting system and method based on ROS
CN111292753A (en) * 2020-02-28 2020-06-16 广州国音智能科技有限公司 Offline voice recognition method, device and equipment
CN112951274A (en) * 2021-02-07 2021-06-11 脸萌有限公司 Voice similarity determination method and device, and program product
CN113709545A (en) * 2021-04-13 2021-11-26 腾讯科技(深圳)有限公司 Video processing method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN106558306A (en) Method for voice recognition, device and equipment
US10699699B2 (en) Constructing speech decoding network for numeric speech recognition
KR102339594B1 (en) Object recognition method, computer device, and computer-readable storage medium
CN102982811B (en) Voice endpoint detection method based on real-time decoding
US20170140750A1 (en) Method and device for speech recognition
US8140330B2 (en) System and method for detecting repeated patterns in dialog systems
Mantena et al. Query-by-example spoken term detection using frequency domain linear prediction and non-segmental dynamic time warping
CN105206271A (en) Intelligent equipment voice wake-up method and system for realizing method
CN104575504A (en) Method for personalized television voice wake-up by voiceprint and voice identification
Raj et al. Phoneme-dependent NMF for speech enhancement in monaural mixtures
JP6284462B2 (en) Speech recognition method and speech recognition apparatus
CN106782521A (en) A kind of speech recognition system
CN107093422B (en) Voice recognition method and voice recognition system
CN108091340B (en) Voiceprint recognition method, voiceprint recognition system, and computer-readable storage medium
CN102945673A (en) Continuous speech recognition method with speech command range changed dynamically
CN105210147B (en) Method, apparatus and computer-readable recording medium for improving at least one semantic unit set
US20220076683A1 (en) Data mining apparatus, method and system for speech recognition using the same
CN103943111A (en) Method and device for identity recognition
Verma et al. Indian language identification using k-means clustering and support vector machine (SVM)
CN116343797A (en) Voice awakening method and corresponding device
Ravinder Comparison of hmm and dtw for isolated word recognition system of punjabi language
Zheng et al. Acoustic texttiling for story segmentation of spoken documents
Desplanques et al. Adaptive speaker diarization of broadcast news based on factor analysis
Li et al. Automatic segmentation of Chinese Mandarin speech into syllable-like
Prukkanon et al. F0 contour approximation model for a one-stream tonal word recognition system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180829
Address after: 100000 Beijing Chaoyang District Jinsong seven district 717 Building 5 door 102.
Applicant after: Cui Zheng
Address before: 511402 three, Dongxing Road, Dalong street, Guangzhou, Guangdong, three.
Applicant before: Guangdong Xinxintong Information System Services Co., Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20170405