CN110738991A - Speech recognition equipment based on flexible wearable sensor - Google Patents

Speech recognition equipment based on flexible wearable sensor

Info

Publication number
CN110738991A
CN110738991A
Authority
CN
China
Prior art keywords
voice
wearable sensor
flexible wearable
signal
speech recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910962682.7A
Other languages
Chinese (zh)
Inventor
吴俊
段升顺
查欣婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN201910962682.7A priority Critical patent/CN110738991A/en
Publication of CN110738991A publication Critical patent/CN110738991A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
    • G10L 15/08 - Speech classification or search
    • G10L 15/142 - Hidden Markov Models [HMMs]
    • G10L 15/16 - Speech classification or search using artificial neural networks
    • G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L 15/24 - Speech recognition using non-acoustical features
    • G10L 2015/081 - Search algorithms, e.g. Baum-Welch or Viterbi

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a voice recognition device based on a flexible wearable sensor, comprising a voice acquisition unit, a voice signal receiving and processing unit, and a voice recognition network unit. The voice acquisition unit includes the flexible wearable sensor, which converts the mechanical vibration of the laryngeal prominence during speech into an electrical signal and outputs it; the frequency and amplitude of the electrical signal are positively correlated with the frequency and amplitude of the laryngeal vibration.

Description

Speech recognition equipment based on flexible wearable sensor
Technical Field
The invention relates to voice recognition technology, flexible electronics, and neural networks, and in particular to a voice recognition device based on a flexible wearable sensor.
Background
Since Bell Laboratories developed a system capable of recognizing the ten English digits in the 1950s, speech recognition technology has undergone great development, and the successful introduction of hidden Markov models (HMMs) and artificial neural networks (ANNs) has made the performance of speech recognition systems far superior to that of earlier systems.
However, in conventional speech recognition technology, the acquisition of speech signals depends on microphones: the sound must travel from the speaker to the microphone through an air channel, and during this propagation the speech is easily corrupted by noise, seriously degrading the effective information received by the microphone. Because such a speech recognition system is sensitive to its environment, a system trained on collected speech is suited only to the environment in which that speech was collected, which also hinders its transition from a laboratory demonstration to a commercial product.
Disclosure of Invention
The invention aims to provide a voice recognition device based on a flexible wearable sensor, so as to overcome the drawback that the acquisition of the speech signal source is easily affected by the environment, and to increase the robustness of the speech recognition system and its applicability in complex environments.
A speech recognition device based on a flexible wearable sensor includes:
a voice acquisition unit, comprising a flexible wearable sensor and an analog-to-digital conversion unit, wherein the flexible wearable sensor is attached to the neck, acquires the vibration signal of the laryngeal prominence during speech and converts it into an analog electrical signal, and the analog-to-digital conversion unit receives the analog electrical signal and encodes it into a digital signal;
a voice signal receiving and processing unit, connected to the voice acquisition unit, which preprocesses the audio data of the digital signal and then extracts the feature vector of the voice signal;
and a voice recognition network unit, connected to the voice signal receiving and processing unit, which decodes the feature vectors extracted by the voice signal receiving and processing unit, constructs a search space using a dictionary, an acoustic model and a language model, and searches for an optimal path in the search space through a search algorithm to obtain the voice recognition result.
The audio data preprocessing specifically includes the following contents:
step 1, the voice signal receiving and processing unit acquires the digital signal, filters the voice signal, and then removes the silence at the beginning and end using an endpoint detection technique;
step 2, the resulting audio signal is split into a series of frames using a moving window function;
and step 3, each frame is processed with algorithms such as perceptual linear prediction (PLP) or Mel cepstral coefficients and converted into a feature vector containing the sound information.
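The framing in step 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the 16 kHz sample rate, the Hamming window, and the 440 Hz test tone are assumptions, with the 25 ms frame length and 10 ms shift taken from the embodiment in the detailed description.

```python
import numpy as np

def frame_signal(signal, sample_rate, frame_ms=25, hop_ms=10):
    """Split a 1-D signal into overlapping Hamming-windowed frames."""
    frame_len = int(sample_rate * frame_ms / 1000)   # 400 samples at 16 kHz
    hop = int(sample_rate * hop_ms / 1000)           # 160 samples at 16 kHz
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    return np.stack([signal[i*hop : i*hop + frame_len] * window
                     for i in range(n_frames)])

# one second of a 440 Hz tone as a stand-in for the sensor's digital signal
sig = np.sin(2 * np.pi * 440 * np.linspace(0, 1, 16000, endpoint=False))
frames = frame_signal(sig, 16000)
print(frames.shape)  # (98, 400): 98 frames of 400 samples each
```

Each row of `frames` would then be fed to the PLP or Mel-cepstral feature extraction of step 3.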
The specific steps of the speech recognition are as follows:
step 1, the feature vector of each frame obtained by the speech signal receiving and processing unit is input into an acoustic model based on a deep neural network and a hidden Markov model; the acoustic model scores each feature vector on its acoustic features and outputs the phoneme (pinyin) information corresponding to the frame;
step 2, a Chinese character network space is constructed using the language model, and a phoneme (pinyin) network space is then constructed through the dictionary;
and step 3, an optimal path is searched in the phoneme network space through a dynamic-programming pruning algorithm, so that the accumulated probability along the path is maximized; the output along this path is the recognition result corresponding to the voice signal.
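The exact dynamic-programming search over such a state network is the Viterbi algorithm. A minimal sketch follows; the two-state transition and emission probabilities are toy values standing in for the DNN-HMM scores, not values from the patent.

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """Most probable state path given per-frame log emission scores.
    log_emit has shape (T frames, N states)."""
    T, N = log_emit.shape
    delta = log_init + log_emit[0]            # best score ending in each state
    back = np.zeros((T, N), dtype=int)        # backpointers for path recovery
    for t in range(1, T):
        scores = delta[:, None] + log_trans   # (from-state, to-state) scores
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):             # trace backpointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]

log_init = np.log([0.9, 0.1])
log_trans = np.log([[0.8, 0.2], [0.2, 0.8]])
log_emit = np.log([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]])
print(viterbi(log_init, log_trans, log_emit))  # [0, 0, 1]
```

The accumulated probability of the returned path is maximal, matching step 3's criterion.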
The dictionary is the mapping between Chinese characters and phonemes; the phoneme set used for Chinese characters consists of all the initials and finals.
The language model adopts an N-Gram model, which obtains the probabilities of associations between individual characters or words by training on a large amount of text.
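The N-Gram idea can be sketched as a maximum-likelihood bigram (2-gram) model; the tiny English corpus below is made up for illustration in place of the large text collection, and a production model would add smoothing for unseen pairs.

```python
from collections import Counter

def bigram_probs(corpus_sentences):
    """Maximum-likelihood bigram model: P(w2 | w1) = count(w1 w2) / count(w1)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus_sentences:
        toks = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(toks[:-1])                 # contexts (everything but </s>)
        bigrams.update(zip(toks[:-1], toks[1:]))   # adjacent word pairs
    return {pair: c / unigrams[pair[0]] for pair, c in bigrams.items()}

corpus = ["i am a robot", "i am here", "a robot is here"]
probs = bigram_probs(corpus)
print(probs[("i", "am")])   # 1.0: "am" always follows "i" in this corpus
print(probs[("am", "a")])   # 0.5: "a" follows "am" in one of two cases
```

During decoding, these conditional probabilities score how plausible each candidate character or word sequence is.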
Further, the voice acquisition unit comprises a Bluetooth module, and the voice acquisition unit and the voice signal receiving and processing unit communicate by Bluetooth wireless transmission. The voice acquisition unit also comprises a filtering unit; the analog electrical signal is processed by the filtering unit and then encoded into a digital signal. Further, the analog-to-digital conversion unit and the filtering unit are integrated in the Bluetooth module.
Further, the voice acquisition unit includes a power module.
Compared with the prior art, the invention has the following notable advantage: the speech signal is acquired by using the flexible wearable sensor to detect the vibration of the laryngeal prominence during speech. Compared with a conventional microphone that acquires the speech signal through air as the medium, this greatly improves the signal-to-noise ratio of the speech signal in noisy environments, overcomes the drawback that speech signal acquisition is easily affected by the environment, and increases the robustness of the speech recognition system and its applicability in complex environments.
Drawings
FIG. 1 is a schematic diagram of the apparatus of the present invention;
FIG. 2 is a schematic diagram of a wearable sensor according to an embodiment of the present invention;
FIG. 3 compares the voice signal obtained by the attached sensor of the invention with the voice signal obtained by a conventional microphone using air as the medium;
FIG. 4 is a flow chart illustrating a speech recognition process;
fig. 5 is a schematic diagram of an acoustic model.
Detailed Description
The technical solution of the present invention will be described in detail with reference to the accompanying drawings and the detailed description.
As shown in FIG. 1, a flexible wearable sensor-based voice recognition device comprises a voice acquisition unit, a voice signal receiving and processing unit and a neural network unit.
In this embodiment, the invention employs a triboelectric wearable sensor; because it is self-powered and converts mechanical vibration directly into an electrical signal, the overall voice acquisition module is more energy-efficient and simpler to design.
(1) Voice acquisition unit
The voice acquisition unit comprises a power module, a Bluetooth module and a triboelectric wearable sensor, wherein each submodule is electrically connected.
The specific structure of the triboelectric wearable sensor is shown in fig. 2: it consists of an upper packaging layer, an upper electrode layer, an upper friction electrode layer, a lower friction electrode layer, a lower electrode layer and a lower packaging layer. The upper and lower packaging layers are PVA films whose Young's modulus matches that of the human body; the upper and lower electrode layers are sputtered copper layers; the upper friction electrode layer is a nylon film and the lower friction electrode layer is PDMS, both with surfaces microstructured with the aid of abrasive paper.
The working principle of the device is based on the triboelectrification effect and the electrostatic coupling effect, and the working principle is specifically as follows:
the device is in a double-electrode working mode, when a person speaks, the throat node can vibrate, and therefore contact-separation reciprocating motion of an upper double electrode layer and a lower double electrode layer of the triboelectric wearable sensor attached to the throat node is caused. Specifically, during contact-separation, the surfaces of the upper friction electrode layer nylon and the lower friction electrode layer PDMS respectively have positive and negative charges after contact-separation, and an electric pulse is generated under the connection of a double-electrode lead, so that a positive voltage peak value is generated; similarly, after separation-contact, the two charges of the two electrodes are electrically neutral to each other, which results from the former phenomenon of inverse charge movement, thereby generating a negative voltage pulse. Thereby generating a negative voltage peak value, and the amplitude and the frequency of the laryngeal knot vibration can be converted into the peak value and the number of the peak values of the output voltage and form positive correlation with the peak value and the number of the peak values. So far, the vibration information of the laryngeal node has been converted into voltage information during speaking.
As shown in fig. 3, the conventional microphone-based acquisition method mainly obtains information from the speaker's voice transmitted to the microphone by vibration of the air medium. In this process, the spoken voice is coupled with other sound signals in the air; in the worst case, when the environmental noise is large, the spoken voice is completely drowned out by the noise and the information becomes unusable. The triboelectric wearable sensor of the present method acquires the corresponding sound signal from the vibration of the laryngeal prominence rather than from a sound signal transmitted through the air, and is therefore hardly affected by noise in the environment.
(2) Voice signal receiving and processing unit
The voice signal receiving and processing unit acquires the voice digital signal through the Bluetooth module, filters it, and performs audio data preprocessing such as framing to obtain the feature vectors of the original voice signal, which are transmitted to the downstream voice recognition network unit. Specifically, the unit acquires the digital signal through the Bluetooth module, filters the voice signal, and removes the silence at the beginning and end using an endpoint detection technique to reduce interference with subsequent steps. The audio signal is then split into a series of frames by a moving window function with a frame length of 25 ms and a frame shift of 10 ms. Finally, each frame is processed with Mel-frequency cepstral coefficients (MFCC) and converted into a multi-dimensional vector containing the sound information, i.e., the feature vector.
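The endpoint detection step can be sketched with a simple short-time-energy gate. The 400-sample frames (25 ms at 16 kHz), the 10% energy threshold, and the synthetic tone are illustrative assumptions, not the patent's exact method:

```python
import numpy as np

def trim_silence(signal, frame_len=400, energy_ratio=0.1):
    """Crude endpoint detection: drop leading and trailing frames whose
    short-time energy is below a fraction of the loudest frame's energy."""
    n = len(signal) // frame_len
    energies = np.array([np.sum(signal[i*frame_len:(i+1)*frame_len]**2)
                         for i in range(n)])
    active = np.where(energies > energy_ratio * energies.max())[0]
    start, end = active[0], active[-1] + 1
    return signal[start*frame_len:end*frame_len]

# 0.25 s silence + 0.25 s of a 440 Hz tone + 0.25 s silence, at 16 kHz
tone = np.sin(2 * np.pi * 440 * np.arange(4000) / 16000)
sig = np.concatenate([np.zeros(4000), tone, np.zeros(4000)])
trimmed = trim_silence(sig)
print(len(trimmed))  # 4000: only the voiced middle section remains
```

The trimmed signal is what the framing and MFCC stages then operate on.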
(3) Speech recognition network element
The speech recognition network unit comprises a dictionary (shown in Table 1), an acoustic model (shown in fig. 5), a language model, a decoding space and a search algorithm. The dictionary contains the mapping between Chinese characters or words and phonemes (all initials and finals), as shown in Table 1. The acoustic model consists of a deep neural network and a hidden Markov chain, as shown in fig. 5: the deep neural network performs further feature extraction on the feature vectors, and the corresponding phoneme probabilities are calculated through the hidden Markov chain. The decoding space is constructed as a phoneme network space through the hidden Markov model (HMM), and the search algorithm is a dynamic-programming pruning algorithm.
TABLE 1
Chinese characters	Phonemes
一 (one)	yi1
一事 (a series of events)	yi2 shi4
打开 (open)	da3 kai1
一中全会 (plenary session)	yi1 zhong1 quan2 hui4
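Such a dictionary is a lookup table from character strings to phoneme sequences. A toy sketch with entries mirroring Table 1; the greedy longest-match segmentation is an illustrative choice, not necessarily the patent's:

```python
# toy lexicon in the spirit of Table 1 (pinyin with tone digits as phonemes)
lexicon = {
    "一": ["yi1"],
    "打开": ["da3", "kai1"],
    "一中全会": ["yi1", "zhong1", "quan2", "hui4"],
}

def to_phonemes(text, lexicon):
    """Map a character string to its phoneme (pinyin) sequence
    by greedy longest-match lookup in the dictionary."""
    phones, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):      # try the longest entry first
            if text[i:j] in lexicon:
                phones += lexicon[text[i:j]]
                i = j
                break
        else:
            raise KeyError(text[i])            # character not in the lexicon
    return phones

print(to_phonemes("打开", lexicon))  # ['da3', 'kai1']
```

The decoder uses this mapping in reverse as well, turning the phoneme network's best path back into characters.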
The working process of the invention is as follows:
1. A person speaks, causing the laryngeal prominence to vibrate; the flexible wearable sensor attached to the neck then produces a corresponding electrical output signal whose amplitude and frequency variations match the vibration of the laryngeal prominence. The Bluetooth module filters the analog signal output by the wearable sensor and encodes it into a digital signal through the analog-to-digital converter and filter, and then, powered by the power module, transmits it to the signal receiving and processing unit.
2. The signal receiving and processing unit obtains the voice digital signal through the Bluetooth module, filters the voice signal, and removes the silence at the beginning and end using an endpoint detection technique to reduce interference with subsequent steps. A moving window function with a frame length of 25-50 ms and a frame shift of up to 10 ms splits the audio signal into a series of frames. Finally, each frame is processed with Mel-frequency cepstral coefficients (MFCC) and converted into a multi-dimensional vector containing the sound information, i.e., the feature vector.
3. The feature vector of each frame obtained by the speech signal receiving and processing unit is input into an acoustic model based on a deep neural network and hidden Markov model (DNN-HMM, as shown in fig. 5) and output as the phoneme (pinyin) information corresponding to the frame.
4. A Chinese character network space is constructed using the language model, and a phoneme (pinyin) network space is then constructed through the dictionary using the hidden Markov model (HMM). An optimal path is searched in the phoneme network space through a dynamic-programming pruning algorithm, so that the accumulated probability along the path is maximized; the output along this path is the recognition result corresponding to the voice signal.
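The pruning in step 4 can be illustrated with a beam search, which keeps only the best few partial paths at each step instead of all of them. The word options and probabilities below are invented for illustration:

```python
import math
import heapq

def beam_search(steps, beam_width=2):
    """Keep only the `beam_width` best partial paths at each step;
    this pruning is the core idea of the dynamic-programming search."""
    beams = [(0.0, [])]                      # (log-probability, path so far)
    for options in steps:                    # options: {word: probability}
        candidates = [(lp + math.log(p), path + [w])
                      for lp, path in beams for w, p in options.items()]
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beams[0][1]                       # highest-scoring complete path

steps = [{"i": 0.7, "e": 0.3},
         {"am": 0.6, "an": 0.4},
         {"robot": 0.8, "rowboat": 0.2}]
print(beam_search(steps))  # ['i', 'am', 'robot']
```

With a narrow beam, low-probability branches are discarded early, which is what keeps the search over the full phoneme network tractable.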
The following example illustrates a simple speech recognition procedure:
(1) Voice signal: the acquired voice signal is converted into a PCM file; the spoken content is "I am a robot" (wo shi ji qi ren).
(2) Feature extraction: the feature vector [0.11 0.82 1.2]^T is extracted.
(3) Acoustic model: [0.11 0.82 1.2]^T corresponds to "wo shi ji qi ren".
(4) Dictionary: nest: wo; I: wo; is: shi; machine: ji; device: qi; person: ren; level: ji.
(5) Language model: I: 0.0786; is: 0.0546; I am: 0.0898; machine: 0.0967; I am a robot: 0.6785.
(6) Output: I am a robot.
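Step (6) simply selects the candidate with the highest accumulated language-model score from step (5). As a one-line sketch, with the scores copied from the example above:

```python
# candidate scores as listed in step (5) of the worked example
scores = {"I": 0.0786, "is": 0.0546, "I am": 0.0898,
          "machine": 0.0967, "I am a robot": 0.6785}
best = max(scores, key=scores.get)  # candidate with the highest score
print(best)  # I am a robot
```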

Claims (10)

  1. A flexible wearable sensor based speech recognition device, comprising:
    a voice acquisition unit, comprising a flexible wearable sensor and an analog-to-digital conversion unit, wherein the flexible wearable sensor is attached to the neck, acquires the vibration signal of the laryngeal prominence during speech and converts it into an analog electrical signal, and the analog-to-digital conversion unit receives the analog electrical signal and encodes it into a digital signal;
    a voice signal receiving and processing unit, connected to the voice acquisition unit, which preprocesses the audio data of the digital signal and then extracts the feature vector of the voice signal;
    and a voice recognition network unit, connected to the voice signal receiving and processing unit, which decodes the feature vectors extracted by the voice signal receiving and processing unit, constructs a search space using a dictionary, an acoustic model and a language model, and searches for an optimal path in the search space through a search algorithm to obtain the voice recognition result.
  2. The flexible wearable sensor-based speech recognition device of claim 1, wherein the audio data preprocessing specifically comprises:
    step 1, the voice signal receiving and processing unit acquires the digital signal, filters the voice signal, and then removes the silence at the beginning and end using an endpoint detection technique;
    step 2, the processed audio signal is split into a series of frames using a moving window function;
    and step 3, each frame is processed with algorithms such as perceptual linear prediction (PLP) or Mel cepstral coefficients and converted into a feature vector containing the sound information.
  3. The flexible wearable sensor-based voice recognition device of claim 1, wherein the specific steps of the voice recognition are as follows:
    step 1, the feature vector of each frame obtained by the speech signal receiving and processing unit is input into an acoustic model based on a deep neural network and a hidden Markov model; the acoustic model scores each feature vector on its acoustic features and outputs the phoneme information corresponding to the frame;
    step 2, a Chinese character network space is constructed using the language model, and a phoneme network space is then constructed through the dictionary;
    and step 3, an optimal path is searched in the phoneme network space through a dynamic-programming pruning algorithm, so that the accumulated probability along the path is maximized; the output along this path is the recognition result corresponding to the voice signal.
  4. The flexible wearable sensor-based speech recognition device of claim 3, wherein: the dictionary is a mapping relation between Chinese characters and phonemes.
  5. The flexible wearable sensor-based speech recognition device of claim 4, wherein: the phoneme set for Chinese characters consists of all initials and finals.
  6. The flexible wearable sensor-based speech recognition device of claim 3, wherein the language model employs an N-Gram model that derives the probabilities of associations between individual characters or words by training on text information.
  7. The voice recognition device based on the flexible wearable sensor according to claim 1, wherein the voice acquisition unit comprises a Bluetooth module, and the voice acquisition unit and the voice signal receiving and processing unit adopt Bluetooth wireless transmission; the analog-to-digital conversion unit is integrated in the Bluetooth module.
  8. The flexible wearable sensor-based speech recognition device of claim 7, wherein the Bluetooth module comprises a filtering unit.
  9. The speech recognition device based on the flexible wearable sensor according to claim 1, wherein the speech acquisition unit comprises a filtering unit, and the analog electrical signal is encoded into a digital signal after being processed by the filtering unit.
  10. The flexible wearable sensor-based speech recognition device of claim 1, wherein the speech acquisition unit comprises a power module.
CN201910962682.7A 2019-10-11 2019-10-11 Speech recognition equipment based on flexible wearable sensor Pending CN110738991A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910962682.7A CN110738991A (en) 2019-10-11 2019-10-11 Speech recognition equipment based on flexible wearable sensor


Publications (1)

Publication Number Publication Date
CN110738991A true CN110738991A (en) 2020-01-31

Family

ID=69269957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910962682.7A Pending CN110738991A (en) 2019-10-11 2019-10-11 Speech recognition equipment based on flexible wearable sensor

Country Status (1)

Country Link
CN (1) CN110738991A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111616705A (en) * 2020-05-07 2020-09-04 清华大学 Flexible sensor for multi-modal muscle movement signal perception

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103961073A (en) * 2013-01-29 2014-08-06 中国科学院苏州纳米技术与纳米仿生研究所 Piezoresistive electronic skin and preparation method thereof
CN204089762U (en) * 2014-08-10 2015-01-07 纳米新能源(唐山)有限责任公司 Film sound-controlled switching device and apply its system
CN104575500A (en) * 2013-10-24 2015-04-29 中国科学院苏州纳米技术与纳米仿生研究所 Application of electronic skin in voice recognition, voice recognition system and voice recognition method
CN104836472A (en) * 2014-02-07 2015-08-12 北京纳米能源与系统研究所 Generator utilizing acoustic energy and sound transducer
CN105333943A (en) * 2014-08-14 2016-02-17 北京纳米能源与系统研究所 Sound sensor and sound detection method by using sound sensor
CN105326495A (en) * 2015-10-19 2016-02-17 杨军 Method for manufacturing and using wearable flexible skin electrode
CN106409289A (en) * 2016-09-23 2017-02-15 合肥华凌股份有限公司 Environment self-adaptive method of speech recognition, speech recognition device and household appliance
US20170069306A1 (en) * 2015-09-04 2017-03-09 Foundation of the Idiap Research Institute (IDIAP) Signal processing method and apparatus based on structured sparsity of phonological features
CN107633842A (en) * 2017-06-12 2018-01-26 平安科技(深圳)有限公司 Audio recognition method, device, computer equipment and storage medium
CN108168734A (en) * 2018-02-08 2018-06-15 南方科技大学 Flexible electronic skin based on cilium temperature sensing and preparation method thereof
CN108172218A (en) * 2016-12-05 2018-06-15 中国移动通信有限公司研究院 A kind of pronunciation modeling method and device
CN109036381A (en) * 2018-08-08 2018-12-18 平安科技(深圳)有限公司 Method of speech processing and device, computer installation and readable storage medium storing program for executing
CN109285544A (en) * 2018-10-25 2019-01-29 江海洋 Speech monitoring system
CN109410914A (en) * 2018-08-28 2019-03-01 江西师范大学 A kind of Jiangxi dialect phonetic and dialect point recognition methods
CN109584896A (en) * 2018-11-01 2019-04-05 苏州奇梦者网络科技有限公司 A kind of speech chip and electronic equipment




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200131