CN111785285A - Voiceprint recognition method for home multi-feature parameter fusion - Google Patents
- Publication number
- CN111785285A (application CN202010439120.7A)
- Authority
- CN
- China
- Prior art keywords
- characteristic parameters
- feature parameter
- voiceprint recognition
- recognition method
- filter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/10—Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Complex Calculations (AREA)
Abstract
The invention discloses a voiceprint recognition method for home multi-feature parameter fusion, which comprises the following steps: calculating and extracting the MFCC, GFCC, and LPCC characteristic parameters of the speech signal; training three Gaussian mixture models, one on each of the MFCC, GFCC, and LPCC characteristic parameters; weighting and fusing the outputs of the three Gaussian mixture models, performing a soft decision against a set threshold, obtaining the optimal weight coefficients by stochastic gradient descent, and outputting the final recognition result. By fusing the MFCC, GFCC, and LPCC characteristic parameters, the invention overcomes the inability of a single characteristic parameter to fully express a speaker's characteristics and greatly improves voiceprint recognition accuracy.
Description
Technical Field
The invention belongs to the field of voiceprint recognition, and particularly relates to a voiceprint recognition method for home multi-feature parameter fusion.
Background
Voiceprint recognition, also known as speaker recognition, comprises speaker identification and speaker verification. Its applications are wide-ranging, spanning finance, military security, medicine, home security, and other fields. In most voiceprint recognition systems, beyond the preprocessing operations, the choice of feature parameters and the model-matching method are critical to recognition accuracy.
A traditional single feature parameter cannot fully express a speaker's voice characteristics, overfitting may occur, and MFCC features are relatively easy to imitate. Beyond single features, many researchers simply concatenate the GFCC and MFCC into a new feature parameter vector, which can cause the curse of dimensionality and increase the system's computational load. Current home voiceprint recognition algorithms therefore cannot adequately express the speaker's characteristics, and their recognition accuracy needs to be improved.
Disclosure of Invention
The purpose of the invention is as follows: to overcome the defects in the prior art, a home-oriented voiceprint recognition method based on the fusion of multiple feature parameters is provided, which effectively solves the problem that a single feature parameter cannot completely express a speaker's voice characteristics and improves the accuracy of voiceprint recognition.
The technical scheme is as follows: to achieve this purpose, the invention provides a home multi-feature parameter fusion voiceprint recognition method, which comprises the following steps:
s1: respectively calculating and extracting MFCC characteristic parameters, GFCC characteristic parameters and LPCC characteristic parameters of the voice signals;
s2: training three Gaussian mixture models by respectively using MFCC characteristic parameters, GFCC characteristic parameters and LPCC characteristic parameters;
s3: weighting and fusing the results of the three Gaussian mixture models, performing a soft decision against a set threshold, obtaining the optimal weight coefficients by stochastic gradient descent, and outputting the final recognition result.
Further, the speech signal is subjected to a preprocessing operation before feature parameter extraction in step S1.
Further, the preprocessing operation in step S1 includes sample quantization, pre-emphasis, frame windowing, and endpoint detection.
Further, the extraction process of the MFCC characteristic parameters in step S1 is as follows:
A1) preprocessing an input voice signal to generate a time domain signal, and processing each frame of voice signal through fast Fourier transform or discrete Fourier transform to obtain a voice linear frequency spectrum;
A2) inputting the linear frequency spectrum into a Mel filter bank for filtering to generate a Mel frequency spectrum, and taking the logarithmic energy of the Mel frequency spectrum to generate a corresponding logarithmic frequency spectrum;
A3) converting the logarithmic spectrum into the MFCC characteristic parameters by using a discrete cosine transform.
Further, the extraction process of the GFCC characteristic parameters in step S1 is as follows:
B1) preprocessing a voice signal to generate a time domain signal, and obtaining a discrete power spectrum through fast Fourier transform or discrete Fourier transform processing;
B2) squaring the discrete power spectrum to generate a speech energy spectrum, and performing filtering processing by using a Gammatone filter bank;
B3) performing exponential compression on the output of each Gammatone filter to obtain a group of exponential energy spectra;
B4) the exponential energy spectrum is converted into GFCC characteristic parameters using a discrete cosine transform.
Further, the extraction process of the LPCC characteristic parameters in step S1 is as follows:
C1) setting a system function of the vocal tract model;
C2) setting the impulse response of a system function, and calculating the complex cepstrum of the impulse response;
C3) and calculating to obtain the LPCC characteristic parameters according to the relation between the complex cepstrum and the cepstrum coefficient.
Further, the recognition result in step S3 is determined as follows: when the weighted fusion result is greater than or equal to the threshold, the speaker is identified as the target speaker; otherwise, as a non-target speaker.
Beneficial effects: compared with the prior art, the invention fuses the MFCC, GFCC, and LPCC characteristic parameters, overcomes the inability of a single characteristic parameter to fully express a speaker's characteristics, and greatly improves the accuracy of voiceprint recognition.
Drawings
FIG. 1 is a block diagram showing the general structure of the method of the present invention;
FIG. 2 is a flowchart of MFCC feature parameter extraction;
fig. 3 is a flow chart of GFCC characteristic parameter extraction.
Detailed Description
The present invention is further illustrated by the following figures and specific examples, which are to be understood as illustrative only and not as limiting the scope of the invention, which is to be given the full breadth of the appended claims and any and all equivalent modifications thereof which may occur to those skilled in the art upon reading the present specification.
As shown in fig. 1, the present invention provides a home multi-feature parameter fusion voiceprint recognition method, which includes the following steps:
1) the input speaker's speech is preprocessed, where preprocessing comprises sampling and quantization, pre-emphasis, framing and windowing, endpoint detection, and the like. Preprocessing aims to suppress interference from the vocal organs and the speech acquisition equipment and to improve the system's recognition rate.
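As an illustrative sketch (not part of the patent's disclosure), the pre-emphasis, framing, and windowing stages of this preprocessing chain might look as follows in Python; the frame length of 256 samples, hop of 128, and pre-emphasis coefficient of 0.97 are assumed values, and endpoint detection is omitted for brevity:

```python
import numpy as np

def preprocess(signal, frame_len=256, hop=128, alpha=0.97):
    """Pre-emphasis, framing, and Hamming windowing (endpoint detection omitted)."""
    # pre-emphasis: y(n) = x(n) - alpha * x(n-1)
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # split into overlapping frames
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    # apply a Hamming window to each frame
    return emphasized[idx] * np.hamming(frame_len)

fs = 8000
t = np.arange(fs) / fs                      # one second of audio
frames = preprocess(np.sin(2 * np.pi * 440 * t))
print(frames.shape)  # (61, 256)
```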
2) Respectively calculating and extracting MFCC characteristic parameters, GFCC characteristic parameters and LPCC characteristic parameters of the voice signals;
3) training three Gaussian mixture models, namely GMM model A, GMM model B, and GMM model C, using the MFCC, GFCC, and LPCC characteristic parameters respectively;
4) weighting and fusing the results of GMM model A, GMM model B, and GMM model C, performing a soft decision against a set threshold, obtaining the optimal weight coefficients by stochastic gradient descent, and outputting the final recognition result.
As shown in fig. 2, the extraction process of the MFCC characteristic parameters in this embodiment is as follows:
A1) the input speech signal s(n) is preprocessed to generate a time-domain signal x(n) (the length N of the signal sequence is 256); then each frame of the speech signal is processed by a fast Fourier transform or discrete Fourier transform to obtain the linear speech spectrum X(k), which can be represented as:
X(k) = Σ_{n=0}^{N-1} x(n) e^(-j2πnk/N), 0 ≤ k ≤ N-1 (1)
A2) inputting the linear spectrum X (k) into a Mel filter bank for filtering to generate a Mel spectrum, and then taking the logarithmic energy of the Mel spectrum to generate a corresponding logarithmic spectrum S (m).
Here, the Mel filter bank is a set of triangular band-pass filters H_m(k), 0 ≤ m < M, where M denotes the number of filters, usually 20-28. The transfer function of the m-th band-pass filter can be expressed as:
H_m(k) = 0 for k < f(m-1); (k - f(m-1)) / (f(m) - f(m-1)) for f(m-1) ≤ k ≤ f(m); (f(m+1) - k) / (f(m+1) - f(m)) for f(m) ≤ k ≤ f(m+1); 0 for k > f(m+1) (2)
in the formula (2), f (m) is the center frequency.
Taking the logarithm of the Mel energy spectrum improves the performance of the voiceprint recognition system. The transfer from the linear speech spectrum X(k) to the logarithmic spectrum S(m) is:
S(m) = ln( Σ_{k=0}^{N-1} |X(k)|² H_m(k) ), 0 ≤ m < M (3)
A3) converting the logarithmic spectrum S(m) into the MFCC characteristic parameters by a Discrete Cosine Transform (DCT); the n-th dimensional feature component C(n) of the MFCC parameters is:
C(n) = Σ_{m=0}^{M-1} S(m) cos( πn(m + 0.5) / M ), n = 1, 2, ..., L (4)
the MFCC characteristic parameters obtained through the steps only reflect the static characteristics of the voice signals, and the dynamic characteristic parameters can be obtained by solving the first-order difference and the second-order difference of the static characteristics.
In this embodiment, the Gammatone filter used in the extraction of the GFCC (Gammatone frequency cepstral coefficient) characteristic parameters is designed as follows:
the Gammatone filter bank is used for simulating the auditory characteristics of a cochlear basilar membrane, and the time domain expression of the Gammatone filter bank is as follows:
g(f,t)=tn-1e-2πbtcos(2πfi+φi) U (t), i is more than or equal to 1 and less than or equal to N (5), wherein N is the number of filters;
n- - -the filter order number, typically 4;
i-filter ordinal number;
fi-the center frequency of the filter;
u (t) -unit step function;
bi-an attenuation factor of the filter;
φithe phase of the filter with sequence i is typically taken to be 0.
The bandwidth of each filter is related to the auditory critical band of the human ear, which, according to psychoacoustic theory, can be expressed by the equivalent rectangular bandwidth:
ERB(f) = 24.7 × (4.37f / 1000 + 1) (6)
attenuation factor b of filteriThe decay rate of the impulse response is dependent on the bandwidth by a decay factor biAnd (6) determining. The expression is as follows:
bi=1.019EBR(f) (7)
The time-domain impulse response of the Gammatone filter is a continuous (analog) function; to facilitate computational processing, it must be discretized, which is done by applying the Laplace transform to equation (5).
The output of the Gammatone filter is obtained by convolving the input speech signal s(n) with g_i(n).
The GFCC extraction process is similar to that of the MFCC: the traditional Mel filter bank is simply replaced by a Gammatone filter bank, which effectively exploits the cochlear basilar-membrane characteristics of the Gammatone bank and handles the nonlinearity of speech signals well.
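The Gammatone impulse response of equation (5), with b_i = 1.019·ERB(f_i) from equations (6)-(7), can be sketched as follows. Note this sketch simply samples the continuous-time expression rather than performing the Laplace-transform discretization described above, and the 25 ms support duration is an assumed value:

```python
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth, ERB(f) = 24.7*(4.37*f/1000 + 1)."""
    return 24.7 * (4.37 * f / 1000 + 1)

def gammatone_ir(fc, fs, n=4, duration=0.025):
    """Sampled Gammatone impulse response, order n = 4, phase taken as 0."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)                      # attenuation factor b_i
    return t ** (n - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)

fs = 16000
g = gammatone_ir(1000, fs)                   # one channel centered at 1 kHz
speech = np.random.default_rng(2).standard_normal(fs)
out = np.convolve(speech, g)[:fs]            # filter output via convolution with g_i(n)
print(g.shape, out.shape)
```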
Based on the above Gammatone filter, as shown in fig. 3, the process of extracting the GFCC (Gammatone frequency cepstrum coefficient) characteristic parameters is as follows:
B1) firstly, an input speech signal s(n) is preprocessed to generate a time-domain signal x(n), and a discrete power spectrum X(k) is obtained through fast Fourier transform or discrete Fourier transform processing.
B2) the discrete power spectrum X(k) is squared to generate a speech energy spectrum, which is then filtered using a Gammatone filter bank.
B3) To further improve the performance of the voiceprint recognition system, the output of each filter is exponentially compressed to obtain a set of exponential energy spectra s_1, s_2, …, s_M, where e(f) denotes the exponential compression value and M the number of filter channels.
B4) Finally, the exponential energy spectrum is converted into the GFCC characteristic parameters by using a Discrete Cosine Transform (DCT):
C(n) = Σ_{m=1}^{M} s_m cos( πn(m − 0.5) / M ), n = 1, 2, ..., L (11)
in the formula, L represents the dimension of the characteristic parameter.
In this embodiment, the process of extracting the characteristic parameters of LPCC (linear predictive cepstrum coefficient) is as follows:
Assume that the system function of the vocal tract model is as follows:
H(z) = 1 / (1 − Σ_{i=1}^{p} a_i z^(-i)) (12)
in equation (12), p is the order of the predictor.
Combining equation (12) with equation (13) and differentiating with respect to z^(-1), then simplifying, gives:
equal sign two sides z of formula (14)-1The coefficients of the powers are added up to obtain a complex cepstrum as follows:
According to the relation between the complex cepstrum and the cepstral coefficients, the linear prediction cepstral coefficients can be calculated recursively:
c(1) = a_1
c(n) = a_n + Σ_{k=1}^{n-1} (k/n) c(k) a_{n−k}, 1 < n ≤ p (17)
where c(n) are the linear prediction cepstral coefficients (LPCC) and a_n are the linear prediction coefficients.
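Steps C1-C3 can be sketched as follows, using the autocorrelation method to obtain the linear prediction coefficients and then the cepstral recursion of equation (17). The predictor order p = 12 is an assumed value, and the simple Toeplitz solve stands in for the usual Levinson-Durbin recursion:

```python
import numpy as np

def lpcc(frame, p=12):
    """LPC via the autocorrelation method, then the recursion
    c(n) = a(n) + sum_{k=1}^{n-1} (k/n) c(k) a(n-k), for 1 <= n <= p."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + p]
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])  # Toeplitz
    a = np.linalg.solve(R, r[1:p + 1])   # linear prediction coefficients a_1..a_p
    c = np.zeros(p)
    for n in range(1, p + 1):
        acc = a[n - 1]
        for k in range(1, n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c

rng = np.random.default_rng(0)
frame = np.hamming(256) * (np.sin(2 * np.pi * 600 * np.arange(256) / 8000)
                           + 0.1 * rng.standard_normal(256))
lcoeffs = lpcc(frame)
print(lcoeffs.shape)  # (12,)
```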
In step 4 of this embodiment, the mixture order of GMM model A, GMM model B, and GMM model C is 1024 in each case. The output results of the three models are a, b, and c respectively; the three results are weighted and fused with weight coefficients ω_i satisfying ω_1 + ω_2 + ω_3 = 1, giving the final result D = ω_1·a + ω_2·b + ω_3·c. A threshold γ is set: when D ≥ γ, the speaker is identified as the target speaker; otherwise, as a non-target speaker.
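The weighted fusion and soft decision of step 4 can be sketched as follows. The squared-error loss on a sigmoid of D − γ, used here for the stochastic-gradient weight search, is an illustrative choice, as the patent does not specify the loss function; the synthetic scores stand in for real GMM outputs:

```python
import numpy as np

def fuse_and_decide(scores, w, gamma):
    """D = w1*a + w2*b + w3*c; identify the target speaker iff D >= gamma."""
    D = scores @ w
    return D, D >= gamma

def fit_weights(S, y, gamma=0.5, lr=0.05, epochs=2000, seed=0):
    """Stochastic-gradient search for fusion weights constrained to sum to 1.
    Illustrative loss: squared error between sigmoid(D - gamma) and the 0/1 label."""
    rng = np.random.default_rng(seed)
    w = np.full(3, 1 / 3)
    for _ in range(epochs):
        i = rng.integers(len(y))                       # one random trial per step
        p = 1 / (1 + np.exp(-(S[i] @ w - gamma)))
        w -= lr * 2 * (p - y[i]) * p * (1 - p) * S[i]  # chain rule on that trial
        w = np.clip(w, 0, None)
        w /= w.sum()                                   # renormalize: sum(w) = 1
    return w

# synthetic per-model scores: 50 target trials, then 50 impostor trials
rng = np.random.default_rng(1)
S = np.vstack([rng.normal(0.8, 0.2, (50, 3)), rng.normal(0.2, 0.2, (50, 3))])
y = np.r_[np.ones(50), np.zeros(50)]
w = fit_weights(S, y)
D, accept = fuse_and_decide(S, w, 0.5)
acc = (accept == y).mean()
```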
Claims (8)
1. A voiceprint recognition method for home multi-feature parameter fusion, characterized by comprising the following steps:
s1: respectively calculating and extracting MFCC characteristic parameters, GFCC characteristic parameters and LPCC characteristic parameters of the voice signals;
s2: training three Gaussian mixture models by respectively using MFCC characteristic parameters, GFCC characteristic parameters and LPCC characteristic parameters;
s3: weighting and fusing the results of the three Gaussian mixture models, performing a soft decision against a set threshold, obtaining the optimal weight coefficients by stochastic gradient descent, and outputting the final recognition result.
2. The home multi-feature parameter fusion-oriented voiceprint recognition method according to claim 1, wherein: the speech signal undergoes a preprocessing operation before feature parameter extraction in step S1.
3. The home multi-feature parameter fusion-oriented voiceprint recognition method according to claim 2, wherein: the preprocessing operation in step S1 includes sampling and quantization, pre-emphasis, framing and windowing, and endpoint detection.
4. The home multi-feature parameter fusion-oriented voiceprint recognition method according to claim 1, wherein: the extraction process of the MFCC characteristic parameters in step S1 is as follows:
A1) preprocessing an input voice signal to generate a time domain signal, and processing each frame of voice signal through fast Fourier transform or discrete Fourier transform to obtain a voice linear frequency spectrum;
A2) inputting the linear frequency spectrum into a Mel filter bank for filtering to generate a Mel frequency spectrum, and taking the logarithmic energy of the Mel frequency spectrum to generate a corresponding logarithmic frequency spectrum;
A3) converting the logarithmic spectrum into the MFCC characteristic parameters by using a discrete cosine transform.
5. The home multi-feature parameter fusion-oriented voiceprint recognition method according to claim 1, wherein: the extraction process of the GFCC characteristic parameters in step S1 is as follows:
B1) preprocessing a voice signal to generate a time domain signal, and obtaining a discrete power spectrum through fast Fourier transform or discrete Fourier transform processing;
B2) squaring the discrete power spectrum to generate a speech energy spectrum, and performing filtering processing by using a Gammatone filter bank;
B3) performing exponential compression on the output of each Gammatone filter to obtain a group of exponential energy spectra;
B4) the exponential energy spectrum is converted into GFCC characteristic parameters using a discrete cosine transform.
6. The home multi-feature parameter fusion-oriented voiceprint recognition method according to claim 1, wherein: the extraction process of the LPCC characteristic parameters in step S1 is as follows:
C1) setting a system function of the vocal tract model;
C2) setting the impulse response of a system function, and calculating the complex cepstrum of the impulse response;
C3) and calculating to obtain the LPCC characteristic parameters according to the relation between the complex cepstrum and the cepstrum coefficient.
7. The home multi-feature parameter fusion-oriented voiceprint recognition method according to claim 1, wherein: the recognition result in step S3 is determined as follows: when the weighted fusion result is greater than or equal to the threshold, the speaker is identified as the target speaker; otherwise, as a non-target speaker.
8. The home multi-feature parameter fusion-oriented voiceprint recognition method according to claim 1, wherein: the Gammatone filter bank in step B2 is used for simulating the auditory characteristics of the cochlear basilar membrane, and its time-domain expression is:
g_i(t) = t^(n-1) e^(-2πb_i t) cos(2πf_i t + φ_i) U(t), 1 ≤ i ≤ N
where N is the number of filters, n is the filter order, i is the filter index, f_i is the center frequency of the i-th filter, U(t) is the unit step function, b_i is the attenuation factor of the i-th filter, and φ_i is the phase of the filter with index i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010439120.7A CN111785285A (en) | 2020-05-22 | 2020-05-22 | Voiceprint recognition method for home multi-feature parameter fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111785285A true CN111785285A (en) | 2020-10-16 |
Family
ID=72754331
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010439120.7A Pending CN111785285A (en) | 2020-05-22 | 2020-05-22 | Voiceprint recognition method for home multi-feature parameter fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111785285A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101436405A (en) * | 2008-12-25 | 2009-05-20 | 北京中星微电子有限公司 | Method and system for recognizing speaking people |
CN104835498A (en) * | 2015-05-25 | 2015-08-12 | 重庆大学 | Voiceprint identification method based on multi-type combination characteristic parameters |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112712820A (en) * | 2020-12-25 | 2021-04-27 | 广州欢城文化传媒有限公司 | Tone classification method, device, equipment and medium |
CN112542174A (en) * | 2020-12-25 | 2021-03-23 | 南京邮电大学 | VAD-based multi-dimensional characteristic parameter voiceprint identification method |
CN112885355A (en) * | 2021-01-25 | 2021-06-01 | 上海头趣科技有限公司 | Speech recognition method based on multiple features |
CN113257266A (en) * | 2021-05-21 | 2021-08-13 | 特斯联科技集团有限公司 | Complex environment access control method and device based on voiceprint multi-feature fusion |
CN113393847A (en) * | 2021-05-27 | 2021-09-14 | 杭州电子科技大学 | Voiceprint recognition method based on fusion of Fbank features and MFCC features |
CN113393847B (en) * | 2021-05-27 | 2022-11-15 | 杭州电子科技大学 | Voiceprint recognition method based on fusion of Fbank features and MFCC features |
CN113177536A (en) * | 2021-06-28 | 2021-07-27 | 四川九通智路科技有限公司 | Vehicle collision detection method and device based on deep residual shrinkage network |
CN113177536B (en) * | 2021-06-28 | 2021-09-10 | 四川九通智路科技有限公司 | Vehicle collision detection method and device based on deep residual shrinkage network |
CN113612738A (en) * | 2021-07-20 | 2021-11-05 | 深圳市展韵科技有限公司 | Voiceprint real-time authentication encryption method, voiceprint authentication equipment and controlled equipment |
CN113823290A (en) * | 2021-08-31 | 2021-12-21 | 杭州电子科技大学 | Multi-feature fusion voiceprint recognition method |
CN113823293A (en) * | 2021-09-28 | 2021-12-21 | 武汉理工大学 | Speaker recognition method and system based on voice enhancement |
CN113823293B (en) * | 2021-09-28 | 2024-04-26 | 武汉理工大学 | Speaker recognition method and system based on voice enhancement |
CN116386647A (en) * | 2023-05-26 | 2023-07-04 | 北京瑞莱智慧科技有限公司 | Audio verification method, related device, storage medium and program product |
CN116386647B (en) * | 2023-05-26 | 2023-08-22 | 北京瑞莱智慧科技有限公司 | Audio verification method, related device, storage medium and program product |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20201016 |