CN110580919B - Voice feature extraction method and reconfigurable voice feature extraction device under multi-noise scene - Google Patents
- Publication number
- CN110580919B (grant), application CN201910764547.1A
- Authority
- CN
- China
- Prior art keywords
- signal
- noise
- voice
- feature extraction
- low
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/87—Detection of discrete points within a voice signal
Abstract
The invention discloses a voice feature extraction method and a reconfigurable voice feature extraction device for multi-noise scenes, belonging to the technical field of voice recognition. The device combines the low power consumption of low-pass-filter feature extraction with the high recognition accuracy of Mel-filter feature extraction: according to a bottom-noise threshold analysis judgment and the output of the low-pass filter and its neural network, it dynamically selects a voice feature extraction mode and switches the feature extraction channel through a reconfigurable feature extraction function configuration module. When the environment contains no voice, or contains voice at a high signal-to-noise ratio, the low-pass filter extracts voice features and the neural network recognizes them; only when the signal-to-noise ratio is low and voice input is present is the Mel filter used for feature extraction, which reduces the overall power consumption of voice feature extraction.
Description
Technical Field
The invention discloses a voice feature extraction method and a reconfigurable voice feature extraction device for multi-noise scenes, relates to artificial-intelligence neural network technology, and belongs to the technical field of voice recognition.
Background
Currently, with the development of voice recognition, digital equipment, and multimedia technology, voice endpoint detection has matured considerably. Voice endpoint detection locates voice segments in a continuous signal and can be combined with automatic voice recognition and voiceprint recognition systems; because it provides accurate and effective voice endpoints, it is an important front-end module.
To detect voice endpoints at all times, the voice endpoint detection module must remain on, so the power consumption of the whole module must be considered. To reduce power consumption while maintaining recognition accuracy, a reconfigurable voice endpoint detection method for multi-noise environments is provided.
Disclosure of Invention
The invention aims to provide, in view of the defects of the background art, a voice feature extraction method and a reconfigurable voice feature extraction device for multi-noise scenes, which dynamically select a voice feature extraction mode to reduce power consumption while maintaining precision, and solve the technical problem that a conventional voice endpoint detection module is either low-power but low-precision, or high-precision but high-power.
The invention adopts the following technical scheme for realizing the aim of the invention:
The method for extracting voice features in a multi-noise scene detects the environmental signal-to-noise ratio, performs low-pass-filtering-based feature extraction and recognition on the input signal, and performs Mel-filtering-based feature extraction and recognition on the input signal only when the environmental signal-to-noise ratio is low and the input signal is recognized to contain a voice signal.
Further, in the voice feature extraction method under the multi-noise scene, the input signal is obtained by amplifying and performing analog-to-digital conversion on the signal acquired by the microphone array.
Further, in the method for extracting voice features in a multi-noise scene, the environmental signal-to-noise ratio is detected as follows: detect the analog value of the environmental bottom noise and quantize it into an n-bit environmental signal-to-noise ratio.
Further, in the method for extracting voice features in a multi-noise scene, the voice signal in the input signal is identified as follows: shift bit by bit and store the low-pass-filtering-based feature extraction and recognition results under the same clock signal, and judge that the input signal contains a voice signal when at least one bit of the stored recognition results under the same clock signal has an output value.
Further, in the method for extracting voice features in a multi-noise scene, the environmental signal-to-noise ratio is judged as follows: compare the environmental signal-to-noise ratio with a preset value bit by bit; judge the environmental signal-to-noise ratio low when it is below the preset value and high when it is above the preset value.
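A minimal sketch of this judgment (function name is illustrative; the 2-bit code values follow the embodiment described later, which is one possible choice of n):

```python
def snr_is_low(snr_code: int, preset: int) -> bool:
    """Compare the quantized n-bit environmental SNR code against a preset
    value: True means the SNR is judged 'low' (noisy environment),
    False means it is judged 'high'."""
    return snr_code < preset

# With a 2-bit encoding (9 dB -> 0b00, 10 dB -> 0b01, 11 dB -> 0b10,
# 12 dB -> 0b11) and a 10 dB preset expressed as 0b01, only a 9 dB
# environment is judged "low":
assert snr_is_low(0b00, 0b01) is True
assert snr_is_low(0b10, 0b01) is False
```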
The reconfigurable speech feature extraction device under a multi-noise scene comprises:
a noise detection module for detecting the environmental signal-to-noise ratio,
a low-pass-filtering-based feature extraction and recognition module, which first low-pass filters the input signal, then performs feature extraction, and outputs a recognition result,
a function configuration module, which outputs a start signal to the Mel-filtering-based feature extraction and recognition module when the environmental signal-to-noise ratio is low and the input signal is recognized to contain a voice signal, and,
a Mel-filtering-based feature extraction and recognition module, which, after receiving the start signal, first Mel filters the input signal, then performs feature extraction, and outputs a recognition result.
Furthermore, in the device for extracting reconfigurable speech features in a multi-noise scene, the noise detection module comprises:
a bottom noise detecting unit for detecting the environment bottom noise analog value, quantizing the environment bottom noise analog value into an environment signal-to-noise ratio of n bits, and,
and the noise threshold judging unit compares the environmental signal-to-noise ratio with a preset value according to bits, judges that the environmental signal-to-noise ratio is low when the environmental signal-to-noise ratio is lower than the preset value, and judges that the environmental signal-to-noise ratio is high when the environmental signal-to-noise ratio is higher than the preset value.
Further, in the device for extracting reconfigurable voice features in a multi-noise scene, the function configuration module includes:
an existence judgment unit, which shifts bit by bit and stores the low-pass-filtering-based feature extraction and recognition results under the same clock signal, and judges that the input signal contains a voice signal when at least one bit of the recognition results under the same clock signal has an output value, and,
and the NAND gate unit is used for carrying out NAND operation on the output value of the noise detection module and the output value of the existence judgment unit.
A voice endpoint detection system under a multi-noise scene comprises:
a voice collecting device for amplifying and analog-to-digital converting the signals collected by the microphone array to obtain input signals,
any one of the above reconfigurable speech feature extraction devices extracts speech signal features from an input signal obtained by the speech acquisition device and then recognizes the input signal.
Furthermore, in the voice endpoint detection system under the multi-noise scene, the voice acquisition device comprises a low-noise amplifier, a programmable gain amplifier and an analog-to-digital converter which are connected in sequence, the input end of the low-noise amplifier is connected with the signals acquired by the microphone array, and the analog-to-digital converter outputs input signals.
The voice sampling input module amplifies and samples the input voice into a digital signal. Low-pass-filtering-based feature extraction and recognition is realized by a low-pass filter and a forward neural network: the low-pass filter extracts voice features from the input voice data, and the forward neural network recognizes the extracted data. Mel-filtering-based feature extraction and recognition is realized by a Mel filter and a forward neural network in the same way: the Mel filter extracts the voice features, and the extracted data are input to the forward neural network for recognition. The bottom noise detection module detects the environmental bottom noise and the environmental signal-to-noise ratio. The noise threshold judgment module judges, from the signal-to-noise ratio output by the bottom noise detection module, whether the environmental noise is below a preset value. The existence judgment module checks whether a '1' appears in the output of the low-pass filter and its neural network, and thereby judges whether voice is present. When the noise-magnitude requirement is met and voice is judged present, the function configuration module opens the Mel-filtering-based feature extraction and recognition module; speech recognition is then realized by the Mel filter and its forward neural network, and that module outputs the final recognition result. The voice feature extraction channel is thus reconfigured, switching the extraction mode when voice is input in a low-signal-to-noise-ratio environment.
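The channel-selection rule described above can be sketched as follows (names are illustrative; the actual device realizes this selection with the gates of the function configuration module):

```python
def select_feature_channel(snr_is_low: bool, voice_detected: bool) -> str:
    """Dynamic channel selection: the low-power low-pass channel is always
    on; the higher-power Mel channel is opened only when the environment
    is noisy (low SNR) AND the low-pass channel has detected voice."""
    if snr_is_low and voice_detected:
        return "mel"        # high-accuracy, higher-power path
    return "lowpass"        # low-power default path

assert select_feature_channel(True, True) == "mel"
assert select_feature_channel(True, False) == "lowpass"
assert select_feature_channel(False, True) == "lowpass"
```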
By adopting the technical scheme, the invention has the following beneficial effects:
(1) Addressing the influence of the feature extraction mode on recognition accuracy in different noise environments, the method switches between the low-pass-filtering-based and Mel-filtering-based feature extraction and recognition modes. In the cases where the Mel channel is not needed (no voice input, or voice input at a high signal-to-noise ratio), the low-power low-pass-filtering-based mode meets the detection accuracy requirement at lower power consumption; in a low-signal-to-noise-ratio environment, the Mel-filtering-based mode overcomes the poor accuracy of low-pass filtering.
(2) The constructed reconfigurable voice feature extraction device reconfigures the voice feature extraction channel based on bottom-noise threshold analysis: through the function configuration module, it dynamically selects between the low-pass-filtering-based and Mel-filtering-based feature extraction and recognition modules according to the environmental signal-to-noise ratio, and reasonably controls the power consumption of the whole device while maintaining voice endpoint detection accuracy.
Drawings
Fig. 1 is a schematic overall architecture diagram of the reconfigurable speech feature extraction device disclosed in the present invention.
FIG. 2 is a flow chart of extracting speech features in a multi-noise scene according to the present invention.
Fig. 3 is a circuit for implementing the noise threshold determination module according to the present invention.
FIG. 4 is a circuit diagram of an implementation of the presence determination module of the present invention.
Fig. 5 is a circuit for implementing the functional configuration module according to the present invention.
Fig. 6 shows the detailed steps of the function control of the reconfigurable speech feature extraction device according to the present invention.
Detailed Description
The present invention is further illustrated by the following examples, which are purely exemplary and do not limit the scope of protection; equivalent modifications falling within the limits of the appended claims will occur to persons skilled in the art after reading the present invention.
The reconfigurable voice feature extraction device for multi-noise scenes dynamically selects the voice feature extraction and recognition mode according to the relationship between the environmental signal-to-noise ratio and a threshold, and according to the presence of voice, under the coordinated control of its internal modules. As shown in fig. 1, the entire apparatus includes: a voice sampling input module, a reconfigurable feature extraction function configuration module based on bottom-noise threshold analysis judgment, a low-pass-filtering-based feature extraction and recognition module, and a Mel-filtering-based feature extraction and recognition module.
The voice sampling input module collects the input sound as an analog quantity with the microphone array; the analog quantity is amplified first by the low-noise amplifier and then by the programmable gain amplifier, and is sent by the driving module to the analog-to-digital converter for data sampling, which outputs a digital signal with a fixed number of bits.
The sampled sound data can pass through a Mel filter and a forward neural network behind the Mel filter, a low-pass filter and a forward neural network behind the low-pass filter, and a reconfigurable feature extraction function configuration module based on bottom noise threshold analysis and judgment.
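For the Mel-filter path, the filterbank itself can be sketched in a few lines. This is a numpy sketch under assumed parameters: the patent does not specify the number of filters, FFT length, or sample rate, so the values below are illustrative only.

```python
import numpy as np

def mel_filterbank(n_filters=8, n_fft=256, sr=16000):
    """Triangular Mel filterbank matrix (n_filters x (n_fft//2 + 1)).
    Filter count, FFT length and sample rate are assumptions."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    # Center frequencies spaced evenly on the Mel scale
    pts = inv_mel(np.linspace(mel(0.0), mel(sr / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):                 # rising slope
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):                 # falling slope
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mel_features(frame, n_fft=256):
    """Log Mel-filtered energies for one analysis frame."""
    spec = np.abs(np.fft.rfft(frame, n_fft)) ** 2
    return np.log(mel_filterbank(n_fft=n_fft) @ spec + 1e-10)
```

In the device these energies would be fed to the forward neural network behind the Mel filter; the low-pass path replaces the filterbank with a single low-pass filter.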
The reconfigurable feature extraction function configuration module based on bottom-noise threshold analysis judgment comprises a bottom noise detection module, a noise threshold judgment module, an existence judgment module, and the reconfigurable feature extraction function configuration unit. The bottom noise detection module detects the signal-to-noise ratio in the environment: the output of the analog-to-digital converter is compared with the energy of a preset noise-free sample to obtain the signal-to-noise ratio of the ambient sound. The output range of the signal-to-noise ratio may be a small range around the preset value. In practice, the preset value can be set to 10 dB and the output range to 9 dB to 12 dB: values below 9 dB are treated as a 9 dB output, and values above 12 dB as a 12 dB output. The result is expressed as a 2-bit digital signal: 9 dB as "00", 10 dB as "01", 11 dB as "10", and 12 dB as "11".
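The clamp-and-quantize step just described can be sketched as (the function name is illustrative):

```python
def quantize_snr_db(snr_db: float) -> int:
    """Clamp the measured SNR to the 9-12 dB range described above and
    map each integer dB step to one 2-bit code:
    9 dB -> 0b00, 10 dB -> 0b01, 11 dB -> 0b10, 12 dB -> 0b11."""
    snr_db = min(max(snr_db, 9.0), 12.0)   # below 9 dB reads as 9, above 12 as 12
    return int(round(snr_db)) - 9

assert quantize_snr_db(7.5) == 0b00    # clamped up to 9 dB
assert quantize_snr_db(10.0) == 0b01
assert quantize_snr_db(20.0) == 0b11   # clamped down to 12 dB
```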
The noise threshold judgment module is a 2-bit numerical comparator; the 10 dB preset value set above is expressed as "01". The 2-bit binary number output by the bottom noise detection module is compared with the 2-bit binary preset value: if it is smaller than the preset value, the output signal is 1, otherwise 0. As shown in fig. 3, A is the output signal of the noise detection module, B is the 10 dB preset value "01", and F(A&lt;B) is the required output signal. The process of selecting the speech feature extraction method according to the bottom noise detection value is shown in fig. 2.
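A bit-level sketch of this 2-bit comparator (equivalent in behaviour to the circuit of fig. 3; signal names are illustrative):

```python
def less_than_2bit(a1: int, a0: int, b1: int, b0: int) -> bool:
    """2-bit magnitude comparator output F(A<B): A < B when the high bit
    of A is strictly smaller, or the high bits are equal and the low bit
    of A is strictly smaller. Inputs are single bits (0 or 1)."""
    hi_lt = (1 - a1) & b1          # A1 < B1
    hi_eq = 1 - (a1 ^ b1)          # A1 == B1
    lo_lt = (1 - a0) & b0          # A0 < B0
    return bool(hi_lt | (hi_eq & lo_lt))

# Against the 10 dB preset B = "01":
assert less_than_2bit(0, 0, 0, 1) is True    # A = "00" (9 dB): low SNR
assert less_than_2bit(0, 1, 0, 1) is False   # A = "01" (10 dB): not below preset
```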
The existence judgment module can be regarded as a combination of a shift register and an OR gate. The output of the low-pass filter and its forward neural network is stored in the shift register as input; if any bit in the shift register is "1", the output is "1", otherwise "0". As shown in fig. 4, the output of the low-pass-filter speech feature extraction and its neural network enters at the input port, the outputs of the four registers are the inputs of the OR gate, and the output of the module is the output of the OR gate.
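A behavioural sketch of this shift-register-plus-OR structure (four stages, as in fig. 4; names are illustrative):

```python
from collections import deque

def make_presence_detector(depth: int = 4):
    """The last `depth` one-bit recognition results from the low-pass
    channel are shifted in on each clock; the module output is the OR
    of every stored bit."""
    taps = deque([0] * depth, maxlen=depth)
    def clock(result_bit: int) -> int:
        taps.appendleft(result_bit)   # shift in the newest result
        return int(any(taps))         # OR over all register outputs
    return clock

detect = make_presence_detector()
assert detect(0) == 0
assert detect(1) == 1   # a single '1' anywhere in the register -> voice present
assert detect(0) == 1   # the '1' is still inside the 4-deep register
```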
As shown in fig. 5, the reconfigurable feature extraction function configuration can be regarded as a NAND gate. When the noise threshold judgment signal is "1" and the output of the existence judgment module is also "1", the output of the reconfigurable feature extraction function configuration is "1" in order to improve accuracy: data can then be input to the Mel filter, the forward neural network behind it is turned on, and the displayed result is the output of that forward neural network. When the existence judgment module outputs "0" for the low-pass-filter path, the Mel filter and the forward neural network behind it are not started; likewise, when the environmental signal-to-noise ratio is not smaller than the preset value, the Mel filter module is not used for analysis and judgment.
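Note that the enable behaviour described (output "1" only when both inputs are "1") is logically an AND of the two judgments, even though the text names the unit a NAND gate; a NAND followed by an inverter realizes the same enable. A sketch of the described behaviour (names illustrative):

```python
def mel_enable(noise_low: int, voice_present: int) -> int:
    """Switching signal of fig. 5: the Mel channel is enabled only when
    BOTH the noise-threshold judgment and the existence judgment are '1'.
    (The description labels the unit a NAND gate; the behaviour described
    corresponds to the AND of the two bits.)"""
    return noise_low & voice_present

assert mel_enable(1, 1) == 1   # low SNR and voice: open the Mel channel
assert mel_enable(1, 0) == 0   # low SNR but no voice: stay on low-pass
assert mel_enable(0, 1) == 0   # high SNR: stay on low-pass
```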
Fig. 6 shows a reconfigurable speech feature extraction module and a function control method for a multi-noise scene in this embodiment, and the specific implementation steps are as follows:
step 101: the input sound is directly collected by the microphone array, amplified by the low-noise amplifier, amplified by the programmable gain amplifier and sent to the analog-to-digital converter by the driving module for data sampling;
step 102: the data after analog-to-digital conversion are simultaneously input to the reconfigurable feature extraction function configuration module based on bottom-noise threshold analysis judgment, to the Mel filter and its following forward neural network, and to the low-pass filter and its following forward neural network; the configuration module actually controls whether the Mel filter and its following forward neural network module are started;
step 103-A: the bottom noise detection module performs bottom-noise detection on the input voice signal to obtain the noise magnitude; the signal-to-noise ratio of the environmental sound is calculated and expressed as a 2-bit digital signal;
step 103-B: the noise threshold judgment module compares the measured signal-to-noise ratio of the environmental sound with a preset value, and outputs a judgment result after comparison, wherein the judgment result is '0' or '1';
step 104: the low-pass filter and its following forward neural network perform low-pass-filtered voice feature extraction on the voice signal preprocessed by the voice acquisition module, send the extracted voice features to the following neural network for forward analysis, and output the analysis result;
step 103-C: the existence judgment module inputs these results into a shift register in sequence, performs an OR operation on each bit of the shift register's outputs, and outputs a one-bit digital signal "0" or "1" for use by the reconfigurable feature extraction function configuration module in step 103-D;
step 103-D: the reconfigurable feature extraction function configuration module performs a NAND operation on the output of the noise threshold judgment and the output of the existence judgment module from step 103-C, and outputs a switching signal that determines whether to perform step 105;
step 105: when the output of the reconfigurable feature extraction function configuration module in step 103-D opens the Mel filter and its following forward neural network, they perform Mel-filtered voice feature extraction on the voice signal preprocessed by the voice acquisition module, send the extracted voice features to the following neural network for forward analysis, and output the analyzed result;
step 106: the result output preferentially selects the output of the Mel filter and its following forward neural network; if no such output is available, the output of the low-pass filter and its following forward neural network is output.
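Steps 101 to 106 can be tied together in a behavioural sketch. The two filter-plus-neural-network channels are replaced by placeholder callables, and the default energy-threshold recognizers are pure assumptions for illustration:

```python
def run_detection(frames, snr_code, preset=0b01,
                  lowpass_recognize=None, mel_recognize=None):
    """End-to-end sketch of the control flow: the low-pass channel runs on
    every frame; its results feed a 4-deep shift register (step 103-C);
    the Mel channel is consulted only when the SNR is below the preset
    AND the register shows voice (steps 103-D, 105, 106)."""
    if lowpass_recognize is None:   # toy one-bit recognizer (assumption)
        lowpass_recognize = lambda f: int(sum(x * x for x in f) / len(f) > 0.5)
    if mel_recognize is None:
        mel_recognize = lowpass_recognize
    shift_reg = [0, 0, 0, 0]                      # step 103-C state (fig. 4)
    noise_low = int(snr_code < preset)            # step 103-B
    outputs = []
    for frame in frames:
        lp = lowpass_recognize(frame)             # step 104: always-on channel
        shift_reg = [lp] + shift_reg[:-1]         # step 103-C: shift in result
        voice = int(any(shift_reg))               # OR over register bits
        if noise_low and voice:                   # step 103-D: switching signal
            outputs.append(mel_recognize(frame))  # steps 105-106: Mel preferred
        else:
            outputs.append(lp)                    # step 106: low-pass fallback
    return outputs

# Low SNR (code "00" < preset "01"): once voice is seen, the Mel channel decides.
frames = [[0.0] * 8, [1.0] * 8]
assert run_detection(frames, 0b00, mel_recognize=lambda f: 7) == [0, 7]
# High SNR (code "11"): the low-power low-pass channel decides throughout.
assert run_detection(frames, 0b11, mel_recognize=lambda f: 7) == [0, 1]
```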
Claims (9)
1. The method for extracting the voice features in the multi-noise scene is characterized in that an environmental signal-to-noise ratio is detected, feature extraction and recognition based on low-pass filtering are carried out on an input signal, and feature extraction and recognition based on Mel filtering are carried out on the input signal only when the environmental signal-to-noise ratio is low and the input signal is recognized to contain the voice signal; the method for identifying the voice signal contained in the input signal comprises the following steps: and carrying out bit-by-bit shift operation on the feature extraction and recognition results based on low-pass filtering under the same clock signal and storing the result, and judging that the input signal contains a voice signal when at least one bit of the recognition results under the same clock signal has an output value.
2. The method for extracting speech features in multiple noise scenes according to claim 1, wherein the input signal is obtained by performing amplification processing and analog-to-digital conversion on signals collected by a microphone array.
3. The method for extracting speech features in a multi-noise scene according to claim 1, wherein the method for detecting the signal-to-noise ratio of the environment comprises: and detecting the environment bottom noise analog value, and quantizing the environment bottom noise analog value into an environment signal-to-noise ratio of n bits.
4. The method for extracting speech features in a multi-noise scene according to claim 3, wherein the method for judging the environmental signal-to-noise ratio is as follows: and comparing the environmental signal-to-noise ratio with a preset value according to bits, judging that the environmental signal-to-noise ratio is low when the environmental signal-to-noise ratio is lower than the preset value, and judging that the environmental signal-to-noise ratio is high when the environmental signal-to-noise ratio is higher than the preset value.
5. A reconfigurable speech feature extraction device under a multi-noise scene, characterized by comprising:
a noise detection module for detecting the signal-to-noise ratio of the environment,
the feature extraction and identification module based on low-pass filtering firstly performs low-pass filtering and then performs feature extraction on the input signal, outputs a feature extraction and identification result based on low-pass filtering,
a function configuration module for bit-wise shifting and storing the feature extraction and recognition result based on low-pass filtering under the same clock signal, determining that the input signal contains a voice signal when at least one bit of the recognition result under the same clock signal has an output value, outputting a start signal of the feature extraction and recognition module based on Mel filtering when the environmental signal-to-noise ratio is low and the input signal contains the voice signal, and,
and the characteristic extraction and identification module based on the Mel filtering firstly carries out the Mel filtering and then carries out the characteristic extraction on the input signal after receiving the starting signal, and outputs the characteristic extraction and identification result based on the Mel filtering.
6. The device for extracting the reconfigurable speech feature under the multi-noise scene according to claim 5, wherein the noise detection module comprises:
a bottom noise detecting unit for detecting the environment bottom noise analog value, quantizing the environment bottom noise analog value into an environment signal-to-noise ratio of n bits, and,
and the noise threshold judging unit compares the environmental signal-to-noise ratio with a preset value according to bits, judges that the environmental signal-to-noise ratio is low when the environmental signal-to-noise ratio is lower than the preset value, and judges that the environmental signal-to-noise ratio is high when the environmental signal-to-noise ratio is higher than the preset value.
7. The device for extracting the reconfigurable speech feature under the multi-noise scene according to claim 5, wherein the functional configuration module comprises:
a presence judging unit which bit-wise shifts and stores the feature extraction and recognition result based on the low-pass filtering in the same clock signal, judges that the input signal contains a voice signal when at least one bit of the recognition result has an output value in the same clock signal, and,
and the NAND gate unit is used for carrying out NAND operation on the output value of the noise detection module and the output value of the existence judgment unit.
8. A voice endpoint detection system under a multi-noise scene, characterized by comprising:
a voice collecting device for amplifying and analog-to-digital converting the signals collected by the microphone array to obtain input signals,
a reconfigurable speech feature extraction device according to claim 5, 6 or 7, which extracts speech signal features from the input signal obtained by the voice collecting device and then recognizes the input signal.
9. The system according to claim 8, wherein the voice capturing device comprises a low noise amplifier, a programmable gain amplifier, and an analog-to-digital converter, which are connected in sequence, an input terminal of the low noise amplifier is connected to the signal captured by the microphone array, and the analog-to-digital converter outputs the input signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910764547.1A CN110580919B (en) | 2019-08-19 | 2019-08-19 | Voice feature extraction method and reconfigurable voice feature extraction device under multi-noise scene |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910764547.1A CN110580919B (en) | 2019-08-19 | 2019-08-19 | Voice feature extraction method and reconfigurable voice feature extraction device under multi-noise scene |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110580919A CN110580919A (en) | 2019-12-17 |
CN110580919B true CN110580919B (en) | 2021-09-28 |
Family
ID=68811160
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910764547.1A Active CN110580919B (en) | 2019-08-19 | 2019-08-19 | Voice feature extraction method and reconfigurable voice feature extraction device under multi-noise scene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110580919B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112002307B (en) * | 2020-08-31 | 2023-11-21 | 广州市百果园信息技术有限公司 | Voice recognition method and device |
CN112786021A (en) * | 2021-01-26 | 2021-05-11 | 东南大学 | Lightweight neural network voice keyword recognition method based on hierarchical quantization |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6070140A (en) * | 1995-06-05 | 2000-05-30 | Tran; Bao Q. | Speech recognizer |
CN101809862A (en) * | 2007-08-03 | 2010-08-18 | 沃福森微电子股份有限公司 | Amplifier circuit |
CN102221991A (en) * | 2011-05-24 | 2011-10-19 | 华润半导体(深圳)有限公司 | 4-bit RISC (Reduced Instruction-Set Computer) microcontroller |
CN102483925A (en) * | 2009-07-07 | 2012-05-30 | 意法爱立信有限公司 | Digital audio signal processing system |
CN104038864A (en) * | 2013-03-08 | 2014-09-10 | 亚德诺半导体股份有限公司 | Microphone Circuit Assembly And System With Speech Recognition |
CN106601229A (en) * | 2016-11-15 | 2017-04-26 | 华南理工大学 | Voice awakening method based on soc chip |
CN106814788A (en) * | 2015-12-01 | 2017-06-09 | 马维尔国际贸易有限公司 | For the apparatus and method of active circuit |
CN207909193U (en) * | 2017-09-15 | 2018-09-25 | 苏州大学 | A kind of image filtering circuit of removal salt-pepper noise |
CN109410977A (en) * | 2018-12-19 | 2019-03-01 | 东南大学 | A kind of voice segments detection method of the MFCC similarity based on EMD-Wavelet |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8185389B2 (en) * | 2008-12-16 | 2012-05-22 | Microsoft Corporation | Noise suppressor for robust speech recognition |
US8942975B2 (en) * | 2010-11-10 | 2015-01-27 | Broadcom Corporation | Noise suppression in a Mel-filtered spectral domain |
CN102194452B (en) * | 2011-04-14 | 2013-10-23 | Xi'an Fenghuo Electronic Technology Co., Ltd. | Voice activity detection method in complex background noise |
EP2788979A4 (en) * | 2011-12-06 | 2015-07-22 | Intel Corp | Low power voice detection |
US9390727B2 (en) * | 2014-01-13 | 2016-07-12 | Facebook, Inc. | Detecting distorted audio signals based on audio fingerprinting |
DE112016000287T5 (en) * | 2015-01-07 | 2017-10-05 | Knowles Electronics, Llc | Use of digital microphones for low power keyword detection and noise reduction |
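The citations above center on noise-robust speech features such as MFCCs and mel-filtered spectra. As an illustrative aside only (not the patent's own method; all parameter values, the single-frame simplification, and the function name are assumptions), a minimal single-frame MFCC-style extraction can be sketched with plain NumPy:

```python
import numpy as np

def mfcc_sketch(signal, sample_rate=16000, n_fft=512, n_mels=26, n_ceps=13):
    """Toy single-frame MFCC pipeline: pre-emphasis -> Hamming window ->
    power spectrum -> mel filterbank -> log -> DCT. Illustrative only."""
    # Pre-emphasis boosts high frequencies relative to low ones
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    frame = emphasized[:n_fft] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frame, n_fft)) ** 2 / n_fft

    # Triangular filters equally spaced on the mel scale
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        rise = bins[i] - bins[i - 1]
        fall = bins[i + 1] - bins[i]
        fbank[i - 1, bins[i - 1]:bins[i]] = np.linspace(0, 1, rise, endpoint=False)
        fbank[i - 1, bins[i]:bins[i + 1]] = np.linspace(1, 0, fall, endpoint=False)

    # Log compression, then type-II DCT to decorrelate the filterbank energies
    log_energy = np.log(fbank @ power + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return dct @ log_energy  # first n_ceps cepstral coefficients
```

This sketch omits framing/overlap, liftering, and delta features, which a full front end would include.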
2019-08-19: application CN201910764547.1A filed (CN); patent CN110580919B, legal status Active
Non-Patent Citations (4)
Title |
---|
An energy-efficient voice activity detector using deep neural networks and approximate computing; Bo Liu, et al.; Microelectronics Journal; 2019-05-31; Vol. 87; full text *
Low power speech detector on a FPAA; Sahil Shah, et al.; 2017 IEEE International Symposium on Circuits and Systems (ISCAS); IEEE; 2017-09-28; full text *
Design of a high-speed, high-precision comparator for ADC circuits; Wu Guanglin, et al.; Journal of Applied Sciences; CNKI; 2005-02-25 (No. 6); full text *
Design of a multifunctional noise alarm; Mao Wenyi, et al.; Chinese Journal of Electron Devices; CNKI; 2017-08-15 (No. 4); full text *
Also Published As
Publication number | Publication date |
---|---|
CN110580919A (en) | 2019-12-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018137704A1 (en) | Microphone array-based pick-up method and system | |
CN110580919B (en) | Voice feature extraction method and reconfigurable voice feature extraction device under multi-noise scene | |
US4633499A (en) | Speech recognition system | |
CN101071566B (en) | Small array microphone system, noise reducing device and reducing method | |
CA2469442A1 (en) | Automatic magnetic detection in hearing aids | |
KR100968970B1 (en) | Antenna diversity receiver | |
CN115951124A (en) | Time-frequency domain combined continuous and burst signal detection method and system | |
JP3471550B2 (en) | A / D converter | |
CN101377449B (en) | Automatic test device of erbium-doped optical fiber amplifier | |
CN114374411A (en) | Low-frequency power line carrier topology identification method | |
CN110108929B (en) | Anti-interference lightning current collecting device | |
JP3675047B2 (en) | Data processing device | |
CN112769506B (en) | Quick radio detection method based on frequency spectrum | |
US5640430A (en) | Method for detecting valid data in a data stream | |
CN109257247B (en) | Communication module's quality detection system | |
CN102088292A (en) | Multi-path gain adaptive matched signal acquisition method and device thereof | |
CN109639280A (en) | Have both the optical sampling circuit and acquisition method of sampling width and precision | |
CN211669266U (en) | Multichannel waveform acquisition device | |
CN101382593B (en) | Method for detecting weak low frequency signal form unknown strong frequency conversion signal | |
CN114783448A (en) | Audio signal processing device and method and storage medium | |
CN109150216A (en) | A kind of dual band receiver and its auto gain control method | |
CN113163282B (en) | Noise reduction pickup system and method based on USB | |
US7472025B2 (en) | Energy detection apparatus and method thereof | |
CN117687102A (en) | Multichannel parallel nuclear magnetic data acquisition system and method | |
US20040176062A1 (en) | Method for detecting a tone signal through digital signal processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||