CN110738991A - Speech recognition equipment based on flexible wearable sensor - Google Patents
Info
- Publication number
- CN110738991A CN110738991A CN201910962682.7A CN201910962682A CN110738991A CN 110738991 A CN110738991 A CN 110738991A CN 201910962682 A CN201910962682 A CN 201910962682A CN 110738991 A CN110738991 A CN 110738991A
- Authority
- CN
- China
- Prior art keywords
- voice
- wearable sensor
- flexible wearable
- signal
- speech recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/16—Speech classification or search using artificial neural networks
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G10L15/24—Speech recognition using non-acoustical features
- G10L2015/081—Search algorithms, e.g. Baum-Welch or Viterbi
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a speech recognition device based on a flexible wearable sensor, comprising a voice acquisition unit, a voice signal receiving and processing unit, and a voice recognition network unit. The voice acquisition unit includes the flexible wearable sensor, which converts the mechanical vibration of the Adam's apple (laryngeal prominence) during speech into an electrical signal and outputs it; the frequency and amplitude of the electrical signal are positively correlated with the frequency and amplitude of the laryngeal vibration.
Description
Technical Field
The invention relates to speech recognition technology, flexible electronics, and neural networks, and in particular to a speech recognition device based on a flexible wearable sensor.
Background
Since Bell Laboratories developed the first system capable of recognizing the ten English digits in the 1950s, speech recognition technology has undergone a great deal of development, and the successful introduction of HMM models and artificial neural networks (ANNs) has made the performance of speech recognition systems far superior to that of earlier systems.
However, in conventional speech recognition technology the acquisition of speech signals depends on microphones: the signal must travel from the speaker to the microphone through the air, and during this transmission the speech is easily corrupted by noise in the propagation medium, seriously degrading the effective information received at the microphone. Because a speech recognition system is sensitive to its environment, a system trained on speech collected in one environment is suited only to that environment, which also hinders its transition from laboratory demonstration to commercial product.
Disclosure of Invention
The invention aims to provide a speech recognition device based on a flexible wearable sensor that overcomes the drawback that the acquisition of the speech signal source is easily affected by the environment, thereby increasing the robustness of the speech recognition system and its applicability in complex environments.
A speech recognition device based on a flexible wearable sensor includes:
a voice acquisition unit, comprising a flexible wearable sensor and an analog-to-digital conversion unit; the flexible wearable sensor is attached to the neck, acquires the vibration signal of the Adam's apple while the wearer speaks, and converts it into an analog electrical signal, which the analog-to-digital conversion unit receives and encodes into a digital signal;
a voice signal receiving and processing unit, connected to the voice acquisition unit, which extracts the feature vectors of the speech signal after the digital signal undergoes audio data preprocessing;
and a voice recognition network unit, connected to the voice signal receiving and processing unit, which decodes the extracted feature vectors, constructs a search space from the dictionary, the acoustic model, and the language model, and searches that space for the optimal path with a search algorithm to obtain the speech recognition result.
The audio data preprocessing specifically includes the following steps:
step 1, the voice signal receiving and processing unit acquires the digital signal, filters the speech signal, and then cuts off the silence at the head and tail using endpoint detection;
step 2, the audio signal obtained in the previous step is split into a series of frames using a moving window function;
and step 3, each frame is processed with algorithms such as PLP or Mel-frequency cepstral coefficients (MFCC) and converted into a feature vector containing the sound information.
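As a rough illustration, the framing stage (step 2) can be sketched in Python. The 25 ms frame length and 10 ms shift follow the embodiment described later; the Hamming window is one common choice of moving window function, which the patent does not specify:

```python
import math

def frame_signal(samples, sample_rate, frame_ms=25, hop_ms=10):
    """Split a digitized sensor signal into overlapping, Hamming-windowed frames.

    Illustrative sketch of the framing step; frame/hop lengths follow the
    embodiment (25 ms frames, 10 ms shift). The window choice is an assumption.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    # Hamming window, one common "moving window function"
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames

# 1 s of a dummy 100 Hz tone sampled at 8 kHz stands in for the sensor signal
sr = 8000
signal = [math.sin(2 * math.pi * 100 * t / sr) for t in range(sr)]
frames = frame_signal(signal, sr)
print(len(frames), len(frames[0]))  # number of frames, samples per frame
```

Each windowed frame would then be converted into a feature vector by PLP or MFCC analysis.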
The specific steps of the speech recognition are as follows:
step 1, the feature vector of each frame obtained by the voice signal receiving and processing unit is input into an acoustic model based on a deep neural network and hidden Markov model; the acoustic model calculates the acoustic score of each feature vector from the sound characteristics and outputs the phoneme (pinyin) information corresponding to the frame;
step 2, a Chinese character network space is constructed using the language model, and a phoneme (pinyin) network space is then constructed through the dictionary;
and step 3, the optimal path is searched in the phoneme network space with a dynamic programming pruning algorithm so that the accumulated probability along the path is maximal; the output is the recognition result corresponding to the speech signal.
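The optimal-path search of step 3 can be sketched as a plain Viterbi-style dynamic program over a toy state network. The patent's decoder additionally prunes unlikely paths, which is omitted here for clarity; all scores are illustrative log-probabilities:

```python
def viterbi(obs_scores, trans, init):
    """Maximum-score path through a small state network (log-domain scores).

    obs_scores[t][s] is the acoustic log-score of state s at frame t,
    trans[p][s] the transition log-score from p to s, init[s] the start score.
    """
    n = len(init)
    delta = [init[s] + obs_scores[0][s] for s in range(n)]  # best score ending in s
    backptr = []
    for t in range(1, len(obs_scores)):
        ptr, new_delta = [], []
        for s in range(n):
            p_best = max(range(n), key=lambda p: delta[p] + trans[p][s])
            ptr.append(p_best)
            new_delta.append(delta[p_best] + trans[p_best][s] + obs_scores[t][s])
        backptr.append(ptr)
        delta = new_delta
    # trace the best path back through the stored pointers
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# toy 2-state network: state 0 fits frame 0 best, state 1 fits frames 1-2
path = viterbi([[0, -5], [-5, 0], [-5, 0]],
               [[-1, -1], [-1, -1]],
               [0, -10])
print(path)  # best state sequence over the three frames
```

A production decoder would run this over the phoneme network built from the dictionary and language model, discarding low-scoring hypotheses as it goes.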
The dictionary is the mapping between Chinese characters and phonemes, and the phoneme set for Chinese characters consists of all initials and finals.
The language model adopts an N-Gram model; the probabilities of individual characters or words following one another are obtained by training on a large amount of text.
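A minimal sketch of such an N-Gram model, with N = 2 and a two-sentence toy corpus standing in for the large text collection the patent calls for (a real model would also be smoothed):

```python
from collections import Counter

def bigram_probs(corpus):
    """Estimate bigram probabilities P(w2 | w1) from tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence          # sentence-start marker
        unigrams.update(tokens[:-1])         # count each token as a history
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return {bg: cnt / unigrams[bg[0]] for bg, cnt in bigrams.items()}

corpus = [["wo", "shi", "ji", "qi", "ren"],
          ["wo", "shi", "xue", "sheng"]]
probs = bigram_probs(corpus)
print(probs[("wo", "shi")])  # "shi" always follows "wo" in this toy corpus
```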
Further, the voice acquisition unit comprises a Bluetooth module, and the voice acquisition unit and the voice signal receiving and processing unit communicate via Bluetooth wireless transmission. The voice acquisition unit also comprises a filtering unit; the analog electrical signal is processed by the filtering unit and then encoded into a digital signal. The analog-to-digital conversion unit and the filtering unit are integrated in the Bluetooth module.
Further, the voice acquisition unit includes a power module.
Compared with the prior art, the invention has the following notable advantage: by acquiring the speech signal from the vibration of the Adam's apple during speech with a flexible wearable sensor, rather than with a conventional microphone that uses air as the medium, the signal-to-noise ratio of the speech signal in noisy environments is greatly improved. This overcomes the drawback that the speech signal source is easily affected by the environment and increases the robustness of the speech recognition system and its applicability in complex environments.
Drawings
FIG. 1 is a schematic diagram of the apparatus of the present invention;
FIG. 2 is a schematic diagram of a wearable sensor according to an embodiment of the present invention;
FIG. 3 is a schematic comparison of the voice signal obtained by the attached sensor of the present invention with that obtained by a conventional microphone using air as the medium;
FIG. 4 is a flow chart illustrating a speech recognition process;
fig. 5 is a schematic diagram of an acoustic model.
Detailed Description
The technical solution of the present invention will be described in detail with reference to the accompanying drawings and the detailed description.
As shown in FIG. 1, a flexible wearable sensor-based speech recognition device comprises a voice acquisition unit, a voice signal receiving and processing unit, and a voice recognition network unit.
In this embodiment, the invention employs a triboelectric wearable sensor; its self-powered, direct conversion of mechanical vibration to an electrical signal makes the overall voice acquisition module more energy efficient and its design simpler.
(1) Voice acquisition unit
The voice acquisition unit comprises a power module, a Bluetooth module and a triboelectric wearable sensor, wherein each submodule is electrically connected.
The specific structure of the triboelectric wearable sensor is shown in FIG. 2: from top to bottom, an upper packaging layer, an upper electrode layer, an upper friction electrode layer, a lower friction electrode layer, a lower electrode layer, and a lower packaging layer. The upper and lower packaging layers are PVA films whose Young's modulus matches that of the human body; the upper and lower electrode layers are sputtered copper layers; the upper friction electrode layer is a nylon film and the lower friction electrode layer is PDMS, both with sandpaper-assisted surface microstructuring.
The working principle of the device is based on the triboelectrification effect and the electrostatic coupling effect, and the working principle is specifically as follows:
the device is in a double-electrode working mode, when a person speaks, the throat node can vibrate, and therefore contact-separation reciprocating motion of an upper double electrode layer and a lower double electrode layer of the triboelectric wearable sensor attached to the throat node is caused. Specifically, during contact-separation, the surfaces of the upper friction electrode layer nylon and the lower friction electrode layer PDMS respectively have positive and negative charges after contact-separation, and an electric pulse is generated under the connection of a double-electrode lead, so that a positive voltage peak value is generated; similarly, after separation-contact, the two charges of the two electrodes are electrically neutral to each other, which results from the former phenomenon of inverse charge movement, thereby generating a negative voltage pulse. Thereby generating a negative voltage peak value, and the amplitude and the frequency of the laryngeal knot vibration can be converted into the peak value and the number of the peak values of the output voltage and form positive correlation with the peak value and the number of the peak values. So far, the vibration information of the laryngeal node has been converted into voltage information during speaking.
As shown in FIG. 3, the conventional microphone-based method acquires information mainly from the speaker's voice transmitted to the microphone through vibration of the air medium. In this process the spoken voice is coupled with other sound signals in the air, and when the environmental noise is strong the spoken information can be completely drowned out by the noise, making it unusable. The triboelectric wearable sensor proposed here acquires the corresponding sound signal from the vibration of the Adam's apple and does not rely on sound transmitted through the air, so it is hardly affected by environmental noise.
(2) Voice signal receiving and processing unit
The voice signal receiving and processing unit acquires the digital voice signal through the Bluetooth module, filters it, and performs audio preprocessing such as framing to obtain the feature vectors of the original speech signal, which it passes to the downstream speech recognition network unit. Specifically, the unit acquires the digital signal through the Bluetooth module, filters the speech signal, removes the silence at the head and tail using endpoint detection to reduce interference with subsequent steps, splits the resulting audio into a series of frames with a moving window function (frame length 25 ms, frame shift 10 ms), and finally processes each frame with Mel-frequency cepstral coefficients (MFCC) to convert it into a multi-dimensional vector containing the sound information, i.e., the feature vector.
(3) Speech recognition network element
The speech recognition network unit comprises a dictionary (Table 1), an acoustic model (FIG. 5), a language model, a decoding space, and a search algorithm. The dictionary contains the mapping between Chinese characters or words and phonemes (all initials and finals), as shown in Table 1. The acoustic model consists of a deep neural network and a hidden Markov chain, as shown in FIG. 5; the deep neural network performs further feature extraction on the feature vectors, and the corresponding phoneme probabilities are computed through the hidden Markov chain. The decoding space is constructed as a phoneme network via the hidden Markov model (HMM), and the search algorithm is a dynamic programming pruning algorithm.
TABLE 1
Chinese character/word (gloss) | Phonemes |
---|---|
one | yi1 |
ceremony | yi2shi4 |
open | da3kai1 |
plenary session | yi1zhong1quan2hui4 |
… | … |
The working process of the invention is as follows:
1. The person speaks, causing the Adam's apple to vibrate; the flexible wearable sensor attached to the neck then produces a corresponding electrical output signal whose amplitude and frequency variations match the vibration of the Adam's apple. Powered by the power module, the Bluetooth module filters and encodes the analog signal output by the wearable sensor into a digital signal via the analog-to-digital converter and filter, and then transmits it to the signal receiving and processing unit.
2. The signal receiving and processing unit acquires the digital voice signal through the Bluetooth module, filters the speech signal, removes the silence at the head and tail using endpoint detection to reduce interference with subsequent steps, splits the resulting audio into a series of frames with a moving window function (frame length 25-50 ms, frame shift 0-10 ms), and finally processes each frame with Mel-frequency cepstral coefficients (MFCC) to convert it into a multi-dimensional vector containing the sound information, the feature vector.
3. The feature vector of each frame obtained by the voice signal receiving and processing unit is input into an acoustic model based on a deep neural network and hidden Markov model (DNN-HMM, as shown in FIG. 5) and output as the phoneme (pinyin) information corresponding to the frame.
4. A Chinese character network space is constructed using the language model, a phoneme (pinyin) network space is then constructed through the dictionary via the hidden Markov model (HMM), and the optimal path is searched in the phoneme network space with a dynamic programming pruning algorithm so that the accumulated probability along the path is maximal; the output is the recognition result corresponding to the speech signal.
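The acoustic scoring in step 3 can be caricatured as a single linear layer plus softmax over a three-phoneme set. The weights and phoneme labels below are invented for illustration and are not the DNN-HMM of FIG. 5; the input reuses the feature vector from the worked example that follows:

```python
import math

def phoneme_posteriors(feature, weights, phonemes):
    """Score one frame's feature vector against each phoneme.

    Toy stand-in for the acoustic model: one linear layer followed by a
    softmax over the phoneme set. Weights and labels are illustrative.
    """
    logits = [sum(w * x for w, x in zip(row, feature)) for row in weights]
    z = max(logits)                           # subtract max for stability
    exp = [math.exp(l - z) for l in logits]
    total = sum(exp)
    return {p: e / total for p, e in zip(phonemes, exp)}

posts = phoneme_posteriors([0.11, 0.82, 1.2],
                           [[1.0, 0.2, 0.1],
                            [0.1, 1.0, 0.3],
                            [0.0, 0.2, 1.0]],
                           ["wo", "shi", "ji"])
print(max(posts, key=posts.get))  # phoneme with the highest posterior
```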
The following example illustrates a simple speech recognition procedure:
(1) Speech signal: the acquired speech signal is converted into a PCM file; the spoken content is "I am a robot" (wo shi ji qi ren).
(2) Feature extraction: the feature vector [0.11 0.82 1.2]^T is extracted.
(3) Acoustic model: [0.11 0.82 1.2]^T corresponds to "wo shi ji qi ren".
(4) Dictionary: nest: wo; I: wo; is/am: shi; machine: ji; device: qi; person: ren; stage: ji.
(5) Language model: I: 0.0786; is/am: 0.0546; I am: 0.0898; machine: 0.0967; robot: 0.6785.
(6) Output: "I am a robot."
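Steps (4)-(6) of the example amount to a dictionary-constrained segmentation search scored by the language model. A hedged sketch: the lexicon is glossed in English, and the fallback probability for unlisted words (1e-6) is invented for illustration:

```python
import math

def best_transcription(syllables, lexicon, lm):
    """Pick the highest-probability word segmentation of a pinyin sequence.

    best[j] holds the best (log-score, word list) covering syllables[:j];
    lexicon maps syllable tuples to candidate words, lm gives word scores.
    """
    n = len(syllables)
    best = [(float("-inf"), [])] * (n + 1)
    best[0] = (0.0, [])
    for i in range(n):
        if best[i][0] == float("-inf"):
            continue  # no segmentation reaches position i
        for j in range(i + 1, n + 1):
            for word in lexicon.get(tuple(syllables[i:j]), []):
                score = best[i][0] + math.log(lm.get(word, 1e-6))
                if score > best[j][0]:
                    best[j] = (score, best[i][1] + [word])
    return " ".join(best[n][1])

# illustrative lexicon/scores, glossed in English, not from the patent
lexicon = {("wo",): ["I"], ("shi",): ["am"], ("ren",): ["person"],
           ("ji", "qi"): ["machine"], ("ji", "qi", "ren"): ["robot"]}
lm = {"I": 0.0786, "am": 0.0546, "machine": 0.0967,
      "robot": 0.6785, "person": 0.05}
result = best_transcription(["wo", "shi", "ji", "qi", "ren"], lexicon, lm)
print(result)  # → "I am robot" with this toy lexicon
```

The high language-model score of "robot" (0.6785) makes the three-syllable word beat the "machine person" segmentation, matching the example's final output.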
Claims (10)
- 1. A flexible wearable sensor based speech recognition device, comprising: a voice acquisition unit comprising a flexible wearable sensor and an analog-to-digital conversion unit, wherein the flexible wearable sensor is attached to the neck, acquires the vibration signal of the Adam's apple during speech, and converts it into an analog electrical signal, and the analog-to-digital conversion unit receives the analog electrical signal and encodes it into a digital signal; a voice signal receiving and processing unit connected to the voice acquisition unit and used to extract the feature vectors of the speech signal after the digital signal undergoes audio data preprocessing; and a voice recognition network unit connected to the voice signal receiving and processing unit, which decodes the feature vectors extracted by the voice signal receiving and processing unit, constructs a search space using the dictionary, the acoustic model, and the language model, and searches the search space for the optimal path with a search algorithm to obtain the speech recognition result.
- 2. The flexible wearable sensor-based speech recognition device of claim 1, wherein the audio data preprocessing specifically comprises: step 1, the voice signal receiving and processing unit acquires the digital signal, filters the speech signal, and then cuts off the silence at the head and tail using endpoint detection; step 2, the processed audio signal is split into a series of frames using a moving window function; and step 3, each frame is processed with algorithms such as PLP or Mel-frequency cepstral coefficients and converted into a feature vector containing the sound information.
- 3. The flexible wearable sensor-based speech recognition device of claim 1, wherein the specific steps of the speech recognition are as follows: step 1, the feature vector of each frame obtained by the voice signal receiving and processing unit is input into an acoustic model based on a deep neural network and hidden Markov model, and the acoustic model calculates the acoustic score of each feature vector from the sound characteristics and outputs the phoneme information corresponding to the frame; step 2, a Chinese character network space is constructed using the language model, and a phoneme network space is then constructed through the dictionary; and step 3, the optimal path is searched in the phoneme network space with a dynamic programming pruning algorithm so that the accumulated probability along the path is maximal, the output being the recognition result corresponding to the speech signal.
- 4. The flexible wearable sensor-based speech recognition device of claim 3, wherein: the dictionary is a mapping relation between Chinese characters and phonemes.
- 5. The flexible wearable sensor-based speech recognition device of claim 4, wherein: the phoneme set for Chinese characters consists of all initials and finals.
- 6. The flexible wearable sensor-based speech recognition device of claim 3, wherein the language model employs an N-Gram model, whose probabilities of individual characters or words following one another are obtained by training on a large amount of text.
- 7. The voice recognition device based on the flexible wearable sensor, according to claim 1, wherein the voice acquisition unit comprises a bluetooth module, and the voice acquisition unit and the voice signal receiving and processing unit adopt a bluetooth wireless transmission mode; the analog-to-digital conversion unit is integrated in the Bluetooth module.
- 8. The flexible wearable sensor-based speech recognition device of claim 7, wherein the Bluetooth module comprises a filtering unit.
- 9. The speech recognition device based on the flexible wearable sensor, according to claim 1, wherein the speech acquisition unit comprises a filtering unit, and the analog electrical signal is encoded into a digital signal after being processed by the filtering unit.
- 10. The flexible wearable sensor-based speech recognition device of claim 1, wherein the speech acquisition unit comprises a power module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910962682.7A CN110738991A (en) | 2019-10-11 | 2019-10-11 | Speech recognition equipment based on flexible wearable sensor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110738991A true CN110738991A (en) | 2020-01-31 |
Family
ID=69269957
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910962682.7A Pending CN110738991A (en) | 2019-10-11 | 2019-10-11 | Speech recognition equipment based on flexible wearable sensor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110738991A (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103961073A (en) * | 2013-01-29 | 2014-08-06 | 中国科学院苏州纳米技术与纳米仿生研究所 | Piezoresistive electronic skin and preparation method thereof |
CN204089762U (en) * | 2014-08-10 | 2015-01-07 | 纳米新能源(唐山)有限责任公司 | Film sound-controlled switching device and apply its system |
CN104575500A (en) * | 2013-10-24 | 2015-04-29 | 中国科学院苏州纳米技术与纳米仿生研究所 | Application of electronic skin in voice recognition, voice recognition system and voice recognition method |
CN104836472A (en) * | 2014-02-07 | 2015-08-12 | 北京纳米能源与系统研究所 | Generator utilizing acoustic energy and sound transducer |
CN105333943A (en) * | 2014-08-14 | 2016-02-17 | 北京纳米能源与系统研究所 | Sound sensor and sound detection method by using sound sensor |
CN105326495A (en) * | 2015-10-19 | 2016-02-17 | 杨军 | Method for manufacturing and using wearable flexible skin electrode |
CN106409289A (en) * | 2016-09-23 | 2017-02-15 | 合肥华凌股份有限公司 | Environment self-adaptive method of speech recognition, speech recognition device and household appliance |
US20170069306A1 (en) * | 2015-09-04 | 2017-03-09 | Foundation of the Idiap Research Institute (IDIAP) | Signal processing method and apparatus based on structured sparsity of phonological features |
CN107633842A (en) * | 2017-06-12 | 2018-01-26 | 平安科技(深圳)有限公司 | Audio recognition method, device, computer equipment and storage medium |
CN108168734A (en) * | 2018-02-08 | 2018-06-15 | 南方科技大学 | Flexible electronic skin based on cilium temperature sensing and preparation method thereof |
CN108172218A (en) * | 2016-12-05 | 2018-06-15 | 中国移动通信有限公司研究院 | A kind of pronunciation modeling method and device |
CN109036381A (en) * | 2018-08-08 | 2018-12-18 | 平安科技(深圳)有限公司 | Method of speech processing and device, computer installation and readable storage medium storing program for executing |
CN109285544A (en) * | 2018-10-25 | 2019-01-29 | 江海洋 | Speech monitoring system |
CN109410914A (en) * | 2018-08-28 | 2019-03-01 | 江西师范大学 | A kind of Jiangxi dialect phonetic and dialect point recognition methods |
CN109584896A (en) * | 2018-11-01 | 2019-04-05 | 苏州奇梦者网络科技有限公司 | A kind of speech chip and electronic equipment |
-
2019
- 2019-10-11 CN CN201910962682.7A patent/CN110738991A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111616705A (en) * | 2020-05-07 | 2020-09-04 | 清华大学 | Flexible sensor for multi-modal muscle movement signal perception |
CN111616705B (en) * | 2020-05-07 | 2021-08-17 | 清华大学 | Flexible sensor for multi-modal muscle movement signal perception |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110049270B (en) | Multi-person conference voice transcription method, device, system, equipment and storage medium | |
KR100316077B1 (en) | Distributed speech recognition system | |
Kingsbury et al. | Recognizing reverberant speech with RASTA-PLP | |
Mitra et al. | Time-frequency convolutional networks for robust speech recognition | |
Jou et al. | Adaptation for soft whisper recognition using a throat microphone. | |
CN103065629A (en) | Speech recognition system of humanoid robot | |
US11763801B2 (en) | Method and system for outputting target audio, readable storage medium, and electronic device | |
CN111667834B (en) | Hearing-aid equipment and hearing-aid method | |
Jain et al. | Speech Recognition Systems–A comprehensive study of concepts and mechanism | |
Dubagunta et al. | Improving children speech recognition through feature learning from raw speech signal | |
Londhe et al. | Machine learning paradigms for speech recognition of an Indian dialect | |
CN118197309A (en) | Intelligent multimedia terminal based on AI speech recognition | |
Kuamr et al. | Implementation and performance evaluation of continuous Hindi speech recognition | |
CN110738991A (en) | Speech recognition equipment based on flexible wearable sensor | |
WO2002103675A1 (en) | Client-server based distributed speech recognition system architecture | |
Deng et al. | Signal processing advances for the MUTE sEMG-based silent speech recognition system | |
Kurcan | Isolated word recognition from in-ear microphone data using hidden markov models (HMM) | |
Ren et al. | Speaker-independent automatic detection of pitch accent | |
Rafieee et al. | A novel model characteristics for noise-robust automatic speech recognition based on HMM | |
Dalva | Automatic speech recognition system for Turkish spoken language | |
Mendiratta et al. | ASR system for isolated words using ANN with back propagation and fuzzy based DWT | |
Ananthakrishna et al. | Effect of time-domain windowing on isolated speech recognition system performance | |
EP3718107A1 (en) | Speech signal processing and evaluation | |
JP2002372988A (en) | Recognition dictionary preparing device and rejection dictionary and rejection dictionary generating method | |
Aafaq et al. | Convolutional Neural Networks for Deep Spoken Keyword Spotting |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200131 |