CN113643692B - PLC voice recognition method based on machine learning - Google Patents


Info

Publication number
CN113643692B
CN113643692B (application CN202110319744.XA)
Authority
CN
China
Prior art keywords
voice
plc
model
voice signal
machine learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110319744.XA
Other languages
Chinese (zh)
Other versions
CN113643692A (en)
Inventor
侯龙潇
李建普
赵聪
李晓鹏
杨成林
雷珊珊
范宦潼
白保坤
赵贤
谢沙沙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Machinery Design & Research Institute Co ltd
Original Assignee
Henan Machinery Design & Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan Machinery Design & Research Institute Co ltd filed Critical Henan Machinery Design & Research Institute Co ltd
Priority to CN202110319744.XA priority Critical patent/CN113643692B/en
Publication of CN113643692A publication Critical patent/CN113643692A/en
Application granted granted Critical
Publication of CN113643692B publication Critical patent/CN113643692B/en
Legal status: Active (granted)


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/144 Training of HMMs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention relates to a PLC voice recognition method based on machine learning, comprising the steps of: a, collecting voice signal samples; b, performing endpoint detection and feature extraction on the voice signals; c, training an HMM-GMM model; d, establishing a mapping relation between voice instructions and PLC register data; e, collecting a voice instruction; f, performing endpoint detection and feature extraction on the voice instruction; g, matching the features of the voice instruction against the model; h, modifying the register data according to the matching result and the mapping relation with the PLC register data. The method can output signals and modify parameters, and can accurately recognize voice commands issued by an operator. Industrial control means such as buttons and keys are thus replaced with voice commands that are friendlier to operators, so that operators no longer face complex operation interfaces and can operate equipment remotely, adding a new mode and new ideas to industrial control.

Description

PLC voice recognition method based on machine learning
Technical Field
The invention relates to the technical field of machine learning, in particular to a PLC voice recognition method based on machine learning.
Background
In traditional industrial control, an operator performs signal input or parameter modification on a PLC using devices such as buttons, touch screens, mice and keyboards; the PLC then performs logic processing and outputs instructions to control the equipment. Such interfaces can be complex and require the operator to be present at the controls.
Given this premise, it is important to provide a natural and convenient mode of man-machine interaction.
Disclosure of Invention
In view of this situation, and to overcome the defects of the prior art, the invention provides a PLC voice recognition method based on machine learning. Voice data for the instructions required by the equipment are collected and processed to build a training model. In use, a collected instruction voice is processed and matched against the model, and the matching result is written into a PLC internal register, realizing signal output and parameter modification. Voice instructions issued by operators can thus be accurately recognized, and the equipment performs the corresponding operation according to the instruction.
The invention relates to a PLC voice recognition method based on machine learning, which comprises the following specific implementation steps,
a, collecting a voice signal sample;
b, voice signal end point detection and feature extraction;
c, training an HMM-GMM model;
d, establishing a mapping relation between the voice instruction and the PLC register data;
e, collecting voice instructions;
f, carrying out endpoint detection and feature extraction on the voice instruction;
g, matching the characteristics of the voice instruction with the model;
h, modifying the register data by the mapping relation between the matching result and the PLC register data.
The beneficial effects of the invention are as follows. Based on machine learning, a voice signal sample is first collected and subjected to endpoint detection and feature extraction, and an HMM-GMM model is trained. A mapping relation between voice instructions and PLC register data is then established. Finally, voice instructions are collected, endpoint detection and feature extraction are performed on them, their features are matched against the model, and the matching result is written into a PLC internal register, realizing signal output and parameter modification. Voice instructions issued by operators can be accurately recognized; industrial control means such as buttons and keys are replaced with voice instructions that are friendlier to operators, so that operators no longer face complex operation interfaces, equipment can be operated remotely, and a new mode and new ideas are added to industrial control.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of the general steps of the present invention.
Fig. 2 is a flow chart of step a of the present invention.
Fig. 3 is a waveform diagram of the voice signal end point detection in the step b of the present invention.
Fig. 4 is a corresponding diagram of D1 in step D of the present invention.
Fig. 5 is a flowchart of step h of the present invention.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The embodiments described are obviously only some, and not all, embodiments of the present invention.
In the first embodiment, a machine learning-based PLC voice recognition method is realized by the following steps,
a, collecting a voice signal sample;
b, voice signal end point detection and feature extraction;
c, training an HMM-GMM model;
d, establishing a mapping relation between the voice instruction and the PLC register data;
e, collecting voice instructions;
f, performing endpoint detection and feature extraction on the voice instruction (the voice instruction is collected, endpoint-detected and feature-extracted in the same manner as the voice signal samples);
g, matching the characteristics of the voice instruction with the model;
h, modifying the register data according to the matching result and the mapping relation with the PLC register data: combining the mapping relation with the voice instruction prediction result, the target PLC is connected using Snap7 to complete the modification of the register data at the corresponding address.
In a second embodiment, based on the first embodiment, the step of collecting the voice signal sample in the step a is as follows,
a1, setting the collection times of each voice signal sample;
a2, setting a preservation path of the voice signal sample;
a3, setting the format as pyaudio.paInt16, the number of channels as 1, the sampling rate as 16000, and the recording duration of a single voice signal as 2.5s;
a4, collecting voice by using a pyaudio module;
a5, storing the collected voice signal sample by using a wave module;
a6, denoising the voice signal sample by using spectral subtraction;
and A7, circularly executing until the set acquisition times are reached.
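Step A6 names spectral subtraction but the patent gives no formula. The following is a minimal numpy sketch under common assumptions: a noise-only segment (e.g. the first frames of the recording) is available to estimate the noise spectrum, and frames are resynthesized by overlap-add with a small spectral floor. The pyaudio recording itself (steps A3-A5) is omitted here:

```python
import numpy as np

def spectral_subtract(signal, noise_profile, frame_len=512, hop=256, beta=0.02):
    """Magnitude spectral subtraction; an illustrative sketch, not the
    patent's exact implementation."""
    window = np.hanning(frame_len)
    # Average noise magnitude spectrum over the noise-only frames.
    noise_frames = [noise_profile[i:i + frame_len] * window
                    for i in range(0, len(noise_profile) - frame_len, hop)]
    noise_mag = np.mean([np.abs(np.fft.rfft(f)) for f in noise_frames], axis=0)

    out = np.zeros(len(signal))
    norm = np.zeros(len(signal))
    for i in range(0, len(signal) - frame_len, hop):
        frame = signal[i:i + frame_len] * window
        spec = np.fft.rfft(frame)
        mag = np.abs(spec) - noise_mag           # subtract the noise estimate
        mag = np.maximum(mag, beta * noise_mag)  # spectral floor limits musical noise
        clean = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=frame_len)
        out[i:i + frame_len] += clean * window   # overlap-add resynthesis
        norm[i:i + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)
```

On a tone buried in white noise, the denoised signal is measurably closer to the clean tone than the noisy input.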
In the third embodiment, on the basis of the first embodiment, the endpoint detection of the voice signal in step b accurately determines the start point and the end point of the speech within a signal segment containing speech, distinguishing the speech section from the non-speech section. The double-threshold method uses three thresholds: the first two are thresholds on the short-time speech energy and the last is a threshold on the speech zero-crossing rate. The energy of voiced sound is higher than that of unvoiced sound, while the zero-crossing rate of unvoiced sound is higher than that of voiced sound; the energy is therefore used first to locate the voiced part, and the zero-crossing rate is then used to extract the unvoiced part, completing the endpoint detection. The specific steps are as follows,
Bj1, take a higher short-time energy threshold MH and use it to first separate out the voiced part of the speech, yielding the interval A1 to A2;
Bj2, take a lower energy threshold ML and search outwards from A1 and A2, adding the lower-energy speech portions to the speech section and thereby expanding it to the interval B1 to B2;
Bj3, distinguish consonants from silence using the short-time zero-crossing rate with threshold Zs: search outwards from the speech section obtained with the short-time energy, and treat any part whose short-time zero-crossing rate is greater than 3 times Zs as the unvoiced part of the speech, adding it to the speech section. The resulting speech section lies between C1 and C2.
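The three Bj steps can be sketched as follows. The patent does not say how MH, ML and Zs are chosen, so the ratios below (relative to the peak frame energy, and the median zero-crossing rate of the recording) are illustrative assumptions:

```python
import numpy as np

def double_threshold_endpoints(x, frame_len=256, hop=128,
                               mh_ratio=0.5, ml_ratio=0.1, zcr_mult=3.0):
    """Sketch of the two-level energy + zero-crossing-rate endpoint detector.
    Returns (start, end) sample indices of the detected speech section."""
    frames = [x[i:i + frame_len] for i in range(0, len(x) - frame_len, hop)]
    energy = np.array([np.sum(f ** 2) for f in frames])
    zcr = np.array([np.mean(np.abs(np.diff(np.sign(f))) > 0) for f in frames])

    MH = mh_ratio * energy.max()   # higher energy threshold
    ML = ml_ratio * energy.max()   # lower energy threshold
    Zs = np.median(zcr)            # baseline zero-crossing rate

    # Bj1: frames above MH are taken as certainly voiced (A1..A2).
    above = np.where(energy > MH)[0]
    a1, a2 = above[0], above[-1]
    # Bj2: expand outwards while energy stays above ML (B1..B2).
    b1, b2 = a1, a2
    while b1 > 0 and energy[b1 - 1] > ML:
        b1 -= 1
    while b2 < len(energy) - 1 and energy[b2 + 1] > ML:
        b2 += 1
    # Bj3: expand further while ZCR exceeds 3*Zs (unvoiced consonants, C1..C2).
    c1, c2 = b1, b2
    while c1 > 0 and zcr[c1 - 1] > zcr_mult * Zs:
        c1 -= 1
    while c2 < len(energy) - 1 and zcr[c2 + 1] > zcr_mult * Zs:
        c2 += 1
    return c1 * hop, c2 * hop + frame_len
```

On a synthetic recording (silence, a tone burst, silence) the detector recovers the burst boundaries to within a frame or two.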
In a fourth embodiment, based on the first embodiment, the step of extracting the speech signal features in the step b is as follows,
bt1, pre-emphasis, framing and windowing are carried out on the voice;
bt2, obtaining a corresponding frequency spectrum through FFT for each short-time analysis window;
Bt3, the above frequency spectrum is passed through a Mel filter bank to obtain the Mel spectrum (the human auditory system is a special nonlinear system whose sensitivity to signals of different frequencies varies; it excels at extracting speech features, recovering not only semantic information but also the personal characteristics of the speaker, both of which are required by existing speech recognition systems);
Bt4, cepstrum analysis is performed on the Mel spectrum: the logarithm is taken and an inverse transform is applied, the inverse transform in practice generally being realized by the DCT (discrete cosine transform). The 2nd to 13th coefficients after the DCT are taken as the MFCC, the Mel-frequency cepstral coefficients, which are the features of the frame of speech.
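A compact numpy sketch of the Bt1-Bt4 pipeline. The frame length, hop, FFT size and 26-filter Mel bank are common defaults, not values taken from the patent:

```python
import numpy as np

def mfcc(signal, fs=16000, frame_len=400, hop=160, n_mels=26, n_ceps=12):
    """Pre-emphasis, framing + Hamming window, FFT, Mel filter bank,
    log, DCT; keeps DCT coefficients 2..13 as in step Bt4."""
    # Bt1: pre-emphasis, framing, Hamming window
    emph = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    frames = np.stack([emph[i:i + frame_len]
                       for i in range(0, len(emph) - frame_len, hop)])
    frames = frames * np.hamming(frame_len)
    # Bt2: power spectrum of each short-time analysis window
    nfft = 512
    power = np.abs(np.fft.rfft(frames, nfft)) ** 2 / nfft
    # Bt3: triangular Mel filter bank, equally spaced on the Mel scale
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = imel(np.linspace(mel(0), mel(fs / 2.0), n_mels + 2))
    bins = np.floor((nfft + 1) * pts / fs).astype(int)
    fbank = np.zeros((n_mels, nfft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    logmel = np.log(power @ fbank.T + 1e-10)
    # Bt4: DCT-II over the log-Mel energies; keep coefficients 2..13
    n = np.arange(n_mels)
    basis = np.cos(np.pi * np.outer(n, 2 * n + 1) / (2.0 * n_mels))
    ceps = logmel @ basis.T
    return ceps[:, 1:1 + n_ceps]
```

For a 1 s recording at 16 kHz this yields one 12-dimensional MFCC vector per 10 ms frame.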
In a fifth embodiment, based on the first embodiment, the step of training the HMM-GMM model in the step c is as follows,
the method comprises the steps of C1, respectively using HMM-GMM (Hidden Markov Model: a Markov process with Hidden nodes unobservable and visible nodes, gaussianMixture Model: gaussian mixture model can be regarded as a model formed by combining K single Gaussian models, K submodels are Hidden variables Hidden variable of the mixture model, generally, any probability distribution can be used for one mixture model, and the Gaussian mixture model is used here because of good mathematical properties and good calculation performance of Gaussian distribution, and speech recognition is divided into three steps, namely, firstly, recognizing frames into states and being completed by the GMM; the third step, combine the phoneme into words, finish by HMM, can understand that the whole HMM-GMM network is actually used for HMM network service, for the problem that the speech recognition needs to solve, namely correctly recognize MFCC characteristic into corresponding HMM state, this process involves two probability to calculate, one is to recognize the characteristic of the current frame as the probability of this state, namely Likelihood in general HMM, mean vector and covariance matrix in GMM, namely GMM network is used for obtaining the probability of the current state, the second is to convert the probability of the last state into the current state, namely state transition probability, this process is that in HMM said Decoding, a sequence is converted into another sequence, there is exponential class conversion mode theoretically, so each frame only takes the highest probability of that state, such a route selection method is called Viterbi algorithm) is modeled, use 3 state modeling, wherein the emission probability of HMM uses Gaussian distribution modeling;
C2, initialize the alignment by distributing the frames of the voice signal evenly across the states;
C3, update the model parameters: count the number of times each state transition occurs and divide by the total number of transitions to obtain the transition probabilities, and compute the mean vector and covariance matrix of the MFCC features assigned to each state, i.e. the emission probabilities;
C4, using the Viterbi algorithm, re-align the voice signal at the state level according to the transition and emission probabilities obtained in the previous step;
C5, repeat steps C3 and C4 until convergence;
and C6, saving the model after training.
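A minimal sketch of the C1-C6 loop. For brevity the GMM emission is reduced to a single diagonal-covariance Gaussian per state (consistent with step C1's Gaussian emission modeling), and convergence is replaced by a fixed iteration count; both simplifications are assumptions, not the patent's exact procedure:

```python
import numpy as np

def log_gauss(f, means, vars_):
    """Log-density of every frame under every state's diagonal Gaussian -> (T, S)."""
    return -0.5 * (((f[:, None, :] - means) ** 2 / vars_)
                   + np.log(2 * np.pi * vars_)).sum(-1)

def viterbi_align(f, means, vars_, logA):
    """Step C4: best state path for one utterance (start forced to state 0)."""
    B = log_gauss(f, means, vars_)
    T, S = B.shape
    delta = np.full((T, S), -np.inf)
    psi = np.zeros((T, S), dtype=int)
    delta[0, 0] = B[0, 0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA      # scores[from, to]
        psi[t] = scores.argmax(0)
        delta[t] = scores.max(0) + B[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path

def train_hmm(features, n_states=3, n_iter=5):
    """Steps C1-C6: features is a list of (T_i, D) MFCC arrays for one word."""
    # C2: initial alignment -- spread each utterance's frames evenly over the states
    aligns = [np.minimum(np.arange(len(f)) * n_states // len(f), n_states - 1)
              for f in features]
    D = features[0].shape[1]
    means = np.zeros((n_states, D))
    vars_ = np.ones((n_states, D))
    logA = np.zeros((n_states, n_states))
    for _ in range(n_iter):
        # C3: re-estimate emission and transition parameters from the alignment
        allf = np.concatenate(features)
        alla = np.concatenate(aligns)
        for s in range(n_states):
            sel = allf[alla == s]
            if len(sel):
                means[s] = sel.mean(0)
                vars_[s] = sel.var(0) + 1e-3        # variance floor
        trans = np.full((n_states, n_states), 1e-3)  # transition-count floor
        for a in aligns:
            for t in range(len(a) - 1):
                trans[a[t], a[t + 1]] += 1
        logA = np.log(trans / trans.sum(1, keepdims=True))
        # C4/C5: Viterbi re-alignment, repeated for a fixed number of iterations
        aligns = [viterbi_align(f, means, vars_, logA) for f in features]
    return means, vars_, logA  # C6: the caller saves these as the word's model
```

On synthetic utterances whose frames pass through three well-separated clusters in order, the three state means converge to the cluster centers.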
In a sixth embodiment, based on the first embodiment, the step of establishing the mapping relationship between the voice command and the PLC register data in the step d is as follows,
D1, the data storage of the PLC is associated with storage areas in the form of tags, divided into inputs (I), outputs (O), bit memory (M) and data blocks (DB); when a program accesses the corresponding (I/O) tag, it operates on the corresponding address through the process image (Process Image Out) of the CPU. The specific correspondence is shown in figure 4;
D2, a link between the PC and the PLC registers is established using Snap7, an open-source library for Ethernet communication with Siemens S7-series PLCs, supporting the S7-200, S7-200 Smart, S7-300, S7-400, S7-1200 and S7-1500. The communication steps are: 1, instantiate snap7 and set the link port number; 2, call the snap7 API connect, whose parameters require the IP address, rack number and slot number of the target PLC; 3, perform the operation; 4, after the operation is completed, call the API disconnect to break the link;
D3, map the voice commands to the data in the PLC data registers. The principle of the PLC executing a command is to modify the data at the corresponding register address. The snap7 APIs client_write_area and client_read_area implement writing and reading of PLC register data; their parameters require the operation type, register address, start bit and data, and these operations complete the input and output of the I/O points.
For the V and M areas, the API calls client_db_write and client_db_read are required to read and write the V and M variables; their parameters require the register address, start bit and number of bytes of the data (1 for byte data, 2 for words and integers, 4 for double integers and floating points), and these operations complete the writing and reading of variable data.
The voice signal replaces a physical button or a key on the touch screen: data realizing the desired function are written to the designated register address, completing the mapping between the voice signal and the PLC register data. For example, for the voice signal "No. 1 motor start", if motor No. 1 is started by setting the output point Q0.1, the program associates the voice signal with a statement such as client_write_area(0x82, 0, struct.pack('B', 2)).
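Steps D2 and D3 can be sketched as below. The command strings, IP address, rack and slot are hypothetical; the `snap7.client.Client` methods follow the python-snap7 binding, whose exact signatures vary between versions; and only the command-to-register mapping can be exercised without a real PLC:

```python
import struct

# Hypothetical command table (step D3): each recognized command maps to
# (area code, start byte, packed data). 0x82 is the S7 output (Q) area
# code from the patent's example; commands and addresses are illustrative.
COMMAND_MAP = {
    "No. 1 motor start": (0x82, 0, struct.pack("B", 2)),  # set output byte -> Q0.1
    "No. 1 motor stop":  (0x82, 0, struct.pack("B", 0)),  # clear output byte
}

def lookup(command):
    """Translate a recognized voice command into a register write (step H)."""
    return COMMAND_MAP[command]

def execute(command, ip="192.168.0.1", rack=0, slot=1):
    """Push the write to the PLC over Snap7 (step D2). Requires the
    python-snap7 package and a reachable S7 PLC; method names follow
    python-snap7 and may differ between library versions."""
    import snap7
    area, start, data = lookup(command)
    client = snap7.client.Client()
    client.connect(ip, rack, slot)   # IP address, rack number, slot number
    # Depending on the python-snap7 version, the area is passed as a raw
    # code (0x82 = outputs) or as an Area enum member.
    client.write_area(area, 0, start, bytearray(data))
    client.disconnect()
```

The mapping table is the part the recognizer consults at step H; `execute` is only reached once a command has been recognized.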
In the seventh embodiment, based on the first embodiment, step g matches the features of the voice instruction against the models: the model group built from each phoneme of the HMM-GMM voice signal samples is imported, the features of the voice instruction are matched against each model of the group, and the voice sample with the highest matching rate is obtained. The specific steps are as follows,
g1, importing a model group after training;
g2, creating a prediction score list;
g3, matching the input voice with each model of the model group;
g4, calculating a matching score and storing the matching score into a prediction score list;
g5, screening out the highest-scoring model;
and G6, outputting a voice signal mark corresponding to the model.
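The G1-G6 loop reduces to scoring the instruction against each model and taking the argmax. Here `models` and its scoring callables are illustrative stand-ins for the trained HMM-GMM model group:

```python
import numpy as np

def recognize(features, models):
    """Steps G1-G6: score the instruction's features against every model in
    the trained model group and return the label with the highest score.
    `models` maps a command label to a scoring callable (in the patent this
    would be the HMM-GMM log-likelihood)."""
    scores = {label: fn(features) for label, fn in models.items()}  # G2-G4
    return max(scores, key=scores.get)                              # G5-G6
```

With two toy Gaussian log-likelihood models, features drawn near one model's mean are labeled accordingly.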
When the invention is used in practice, based on machine learning and the trained voice instruction model, the PC-side program connects to the PLC, and industrial control means such as buttons and keys are replaced with voice instructions that are friendlier to operators, so that operators no longer face complex operation interfaces and equipment can be operated remotely, adding a new mode and new ideas to industrial control. The specific implementation steps are as follows,
a, collecting a voice signal sample;
b, voice signal end point detection and feature extraction;
c, training an HMM-GMM model;
d, establishing a mapping relation between the voice instruction and the PLC register data;
e, collecting voice instructions;
f, performing endpoint detection and feature extraction on the voice instruction (the voice instruction is collected, endpoint-detected and feature-extracted in the same manner as the voice signal samples);
g, matching the characteristics of the voice instruction with the model;
h, modifying the register data according to the matching result and the mapping relation with the PLC register data: combining the mapping relation with the voice instruction prediction result, the target PLC is connected using Snap7 to complete the modification of the register data at the corresponding address.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims (6)

1. The PLC voice recognition method based on machine learning is characterized by comprising the following specific implementation steps,
a, collecting a voice signal sample;
b, voice signal end point detection and feature extraction;
c, training an HMM-GMM model;
d, establishing a mapping relation between the voice instruction and the PLC register data; the step of establishing the mapping relation comprises: D1, the data storage of the PLC is associated with storage areas in the form of tags, divided into inputs (I), outputs (O), bit memory (M) and data blocks (DB); D2, a link between the PC and the PLC registers is established using snap7; D3, the mapping between the voice command and the PLC data register data is established on the basis of the link;
e, collecting voice instructions;
f, carrying out endpoint detection and feature extraction on the voice instruction;
g, matching the characteristics of the voice instruction with the model;
h, modifying the register data by the mapping relation between the matching result and the PLC register data.
2. The machine learning based PLC speech recognition method of claim 1, wherein the step of collecting the speech signal samples in step a is as follows,
a1, setting the collection times of each voice signal sample;
a2, setting a preservation path of the voice signal sample;
a3, setting the format as pyaudio.paInt16, the number of channels as 1, the sampling rate as 16000, and the recording duration of a single voice signal as 2.5s;
a4, collecting voice by using a pyaudio module;
a5, storing the collected voice signal sample by using a wave module;
a6, denoising the voice signal sample by using spectral subtraction;
and A7, circularly executing until the set acquisition times are reached.
3. The machine learning based PLC speech recognition method of claim 1, wherein the step of speech signal endpoint detection in step b is as follows,
Bj1, take a higher short-time energy threshold MH and use it to first separate out the voiced part of the speech, yielding the interval A1 to A2;
Bj2, take a lower energy threshold ML and search outwards from A1 and A2, adding the lower-energy speech portions to the speech section and thereby expanding it to the interval B1 to B2;
Bj3, distinguish consonants from silence using the short-time zero-crossing rate with threshold Zs: search outwards from the speech section obtained with the short-time energy, and treat any part whose short-time zero-crossing rate is greater than 3 times Zs as the unvoiced part of the speech, adding it to the speech section. The resulting speech section lies between C1 and C2.
4. The machine learning based PLC speech recognition method of claim 1, wherein the step of extracting speech signal features in step b is as follows,
bt1, pre-emphasis, framing and windowing are carried out on the voice;
bt2, obtaining a corresponding frequency spectrum through FFT for each short-time analysis window;
bt3, the above frequency spectrum is passed through a Mel filter bank to obtain Mel frequency spectrum;
Bt4, cepstrum analysis is performed on the Mel spectrum: the logarithm is taken and an inverse transform is applied, the inverse transform in practice generally being realized by the DCT (discrete cosine transform); the 2nd to 13th coefficients after the DCT are taken as the MFCC, the Mel-frequency cepstral coefficients, which are the features of the frame of speech.
5. The machine learning based PLC speech recognition method of claim 1, wherein the step c of training the HMM-GMM model is as follows,
C1, model each phoneme of the voice signal with an HMM-GMM, using 3-state modeling, wherein the emission probability of the HMM is modeled with a Gaussian distribution function;
C2, initialize the alignment by distributing the frames of the voice signal evenly across the states;
C3, update the model parameters: count the number of times each state transition occurs and divide by the total number of transitions to obtain the transition probabilities, and compute the mean vector and covariance matrix of the MFCC features assigned to each state, i.e. the emission probabilities;
C4, using the Viterbi algorithm, re-align the voice signal at the state level according to the transition and emission probabilities obtained in the previous step;
C5, repeat steps C3 and C4 until convergence;
and C6, saving the model after training.
6. The method for recognizing PLC speech based on machine learning according to claim 1, wherein the step of matching the feature of the speech instruction with the model in the step g is as follows,
g1, importing a model group after training;
g2, creating a prediction score list;
g3, matching the input voice with each model of the model group;
g4, calculating a matching score and storing the matching score into a prediction score list;
g5, screening out the highest-scoring model;
and G6, outputting a voice signal mark corresponding to the model.
CN202110319744.XA 2021-03-25 2021-03-25 PLC voice recognition method based on machine learning Active CN113643692B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110319744.XA CN113643692B (en) 2021-03-25 2021-03-25 PLC voice recognition method based on machine learning


Publications (2)

Publication Number Publication Date
CN113643692A CN113643692A (en) 2021-11-12
CN113643692B (en) 2024-03-26

Family

ID=78415711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110319744.XA Active CN113643692B (en) 2021-03-25 2021-03-25 PLC voice recognition method based on machine learning

Country Status (1)

Country Link
CN (1) CN113643692B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102324232A (en) * 2011-09-12 2012-01-18 辽宁工业大学 Method for recognizing sound-groove and system based on gauss hybrid models
CN104078039A (en) * 2013-03-27 2014-10-01 广东工业大学 Voice recognition system of domestic service robot on basis of hidden Markov model
CN106395516A (en) * 2016-10-13 2017-02-15 东华大学 Passenger elevator intelligent control system based on speech recognition
CN106601230A (en) * 2016-12-19 2017-04-26 苏州金峰物联网技术有限公司 Logistics sorting place name speech recognition method, system and logistics sorting system based on continuous Gaussian mixture HMM
CN107331384A (en) * 2017-06-12 2017-11-07 平安科技(深圳)有限公司 Audio recognition method, device, computer equipment and storage medium
CN109243428A (en) * 2018-10-15 2019-01-18 百度在线网络技术(北京)有限公司 A kind of method that establishing speech recognition modeling, audio recognition method and system
CN109448726A (en) * 2019-01-14 2019-03-08 李庆湧 A kind of method of adjustment and system of voice control accuracy rate
CN209433234U (en) * 2019-03-15 2019-09-24 陕西中烟工业有限责任公司 Tobacco-shred processing parameter monitoring device with voice alarm function based on Raspberry Pi

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107633842B (en) * 2017-06-12 2018-08-31 平安科技(深圳)有限公司 Audio recognition method, device, computer equipment and storage medium


Also Published As

Publication number Publication date
CN113643692A (en) 2021-11-12

Similar Documents

Publication Publication Date Title
WO2018227780A1 (en) Speech recognition method and device, computer device and storage medium
WO2018227781A1 (en) Voice recognition method, apparatus, computer device, and storage medium
WO2021051544A1 (en) Voice recognition method and device
US20210125603A1 (en) Acoustic model training method, speech recognition method, apparatus, device and medium
CN110648691B (en) Emotion recognition method, device and system based on energy value of voice
CN109147774B (en) Improved time-delay neural network acoustic model
JPS62231996A (en) Allowance evaluation of word corresponding to voice input
JPH0422276B2 (en)
CN109377981B (en) Phoneme alignment method and device
Vyas A Gaussian mixture model based speech recognition system using Matlab
JPS58100195A (en) Continuous voice recognition equipment
CN102945673A (en) Continuous speech recognition method with speech command range changed dynamically
CN114360557B (en) Voice tone conversion method, model training method, device, equipment and medium
CN110428853A (en) Voice activity detection method, Voice activity detection device and electronic equipment
Hsieh et al. Improving perceptual quality by phone-fortified perceptual loss for speech enhancement
CN112071308A (en) Awakening word training method based on speech synthesis data enhancement
CN110910891A (en) Speaker segmentation labeling method and device based on long-time memory neural network
CN111554279A (en) Multi-mode man-machine interaction system based on Kinect
JPH09319392A (en) Voice recognition device
CN113643692B (en) PLC voice recognition method based on machine learning
CN115331658B (en) Voice recognition method
CN107785012B (en) Control method of sound control driller display
Islam et al. Improvement of text dependent speaker identification system using neuro-genetic hybrid algorithm in office environmental conditions
Dua et al. Noise robust automatic speech recognition: review and analysis
Gupta Speech recognition for Hindi

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant