CN111785285A - Voiceprint recognition method for home multi-feature parameter fusion - Google Patents

Voiceprint recognition method for home multi-feature parameter fusion

Info

Publication number
CN111785285A
CN111785285A
Authority
CN
China
Prior art keywords
characteristic parameters
feature parameter
voiceprint recognition
recognition method
filter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010439120.7A
Other languages
Chinese (zh)
Inventor
Zhang Hui (张晖)
Zhang Jinxin (张金鑫)
Zhao Haitao (赵海涛)
Sun Yanfei (孙雁飞)
Ni Yiyang (倪艺洋)
Zhu Hongbo (朱洪波)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202010439120.7A priority Critical patent/CN111785285A/en
Publication of CN111785285A publication Critical patent/CN111785285A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/10 Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, characterised by the type of extracted parameters, the extracted parameters being the cepstrum

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a voiceprint recognition method for home multi-feature parameter fusion, which comprises the following steps: calculating and extracting the MFCC, GFCC, and LPCC feature parameters of the voice signal; training three Gaussian mixture models with the MFCC, GFCC, and LPCC feature parameters respectively; and weighting and fusing the results of the three Gaussian mixture models, performing a soft decision against a set threshold, obtaining the optimal weight coefficients by stochastic gradient descent, and outputting the final recognition result. By fusing the MFCC, GFCC, and LPCC feature parameters, the invention overcomes the defect that a single feature parameter cannot express a speaker's characteristics well, and greatly improves voiceprint recognition accuracy.

Description

Voiceprint recognition method for home multi-feature parameter fusion
Technical Field
The invention belongs to the field of voiceprint recognition, and particularly relates to a voiceprint recognition method for home multi-feature parameter fusion.
Background
Voiceprint recognition, also known as speaker recognition, comprises speaker identification and speaker verification. Its applications are very broad, spanning the financial, military-security, medical, and home-security fields, among others. In most voiceprint recognition systems, beyond the preprocessing stage, the choice of feature parameters and the model-matching strategy are critical to recognition accuracy.
A single traditional feature parameter cannot express the speaker's voice characteristics well, may lead to overfitting, and the MFCC feature in particular is easy to imitate. Beyond single features, many researchers simply concatenate the GFCC and MFCC into a new feature parameter vector, which can cause the curse of dimensionality and increase the computational load of the system. Current home voiceprint recognition algorithms therefore cannot express the speaker's characteristics well enough, and their recognition accuracy needs to be improved.
Disclosure of Invention
Purpose of the invention: to overcome the defects of the prior art, a voiceprint recognition method for home multi-feature parameter fusion is provided, which effectively solves the problem that a single feature parameter cannot completely express a speaker's voice characteristics and improves the accuracy of voiceprint recognition.
The technical scheme is as follows: in order to achieve the purpose, the invention provides a home multi-feature parameter fusion-oriented voiceprint recognition method, which comprises the following steps:
S1: calculating and extracting the MFCC, GFCC, and LPCC feature parameters of the voice signal respectively;
S2: training three Gaussian mixture models with the MFCC, GFCC, and LPCC feature parameters respectively;
S3: weighting and fusing the results of the three Gaussian mixture models, performing a soft decision against a set threshold, obtaining the optimal weight coefficients by stochastic gradient descent, and outputting the final recognition result.
Further, the speech signal is subjected to a preprocessing operation before feature parameter extraction in step S1.
Further, the preprocessing operation in step S1 includes sample quantization, pre-emphasis, frame windowing, and endpoint detection.
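As an illustrative sketch (not part of the patent), the pre-emphasis and frame-windowing steps of the preprocessing might look like the following in Python; the 0.97 pre-emphasis coefficient, 256-sample frames with 50% overlap, and the Hamming window are common choices assumed here, not values taken from the patent:

```python
import numpy as np

def preemphasis(signal, alpha=0.97):
    """Pre-emphasis y(n) = x(n) - alpha * x(n-1); boosts high frequencies."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_and_window(signal, frame_len=256, hop=128):
    """Split the signal into overlapping frames and apply a Hamming window."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    return frames * np.hamming(frame_len)
```

Sampling quantization and endpoint detection are omitted here; the windowed frames feed directly into the feature extraction of step S1.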
Further, the extraction process of the MFCC characteristic parameters in step S1 is as follows:
A1) preprocessing an input voice signal to generate a time domain signal, and processing each frame of voice signal through fast Fourier transform or discrete Fourier transform to obtain a voice linear frequency spectrum;
A2) inputting the linear frequency spectrum into a Mel filter bank for filtering to generate a Mel frequency spectrum, and taking the logarithmic energy of the Mel frequency spectrum to generate a corresponding logarithmic frequency spectrum;
A3) the logarithmic spectrum is converted into the MFCC feature parameters by using a discrete cosine transform.
Further, the extraction process of the GFCC characteristic parameters in step S1 is as follows:
B1) preprocessing the voice signal to generate a time-domain signal, and obtaining a discrete spectrum through fast Fourier transform or discrete Fourier transform;
B2) squaring the modulus of the discrete spectrum to generate a voice energy spectrum, and filtering it with a Gammatone filter bank;
B3) performing exponential compression on the output of each Gammatone filter to obtain a group of exponential energy spectra;
B4) converting the exponential energy spectra into the GFCC feature parameters using a discrete cosine transform.
Further, the extraction process of the LPCC characteristic parameters in step S1 is as follows:
C1) setting a system function of the vocal tract model;
C2) setting the impulse response of a system function, and calculating the complex cepstrum of the impulse response;
C3) calculating the LPCC feature parameters according to the relation between the complex cepstrum and the cepstral coefficients.
Further, the recognition result in step S3 is determined as follows: when the weighted fusion result is greater than or equal to the threshold, the target speaker is recognized; otherwise, a non-target speaker is recognized.
Advantageous effects: compared with the prior art, the invention fuses the MFCC, GFCC, and LPCC feature parameters, overcomes the defect that a single feature parameter cannot express the speaker's characteristics well, and greatly improves the accuracy of voiceprint recognition.
Drawings
FIG. 1 is a block diagram showing the general structure of the method of the present invention;
FIG. 2 is a flowchart of MFCC feature parameter extraction;
FIG. 3 is a flowchart of GFCC feature parameter extraction.
Detailed Description
The present invention is further illustrated by the following figures and specific examples, which are to be understood as illustrative only and not as limiting the scope of the invention, which is to be given the full breadth of the appended claims and any and all equivalent modifications thereof which may occur to those skilled in the art upon reading the present specification.
As shown in fig. 1, the present invention provides a voiceprint recognition method for home multi-feature parameter fusion, which includes the following steps:
1) the input speaker's voice is preprocessed, including sampling quantization, pre-emphasis, framing and windowing, endpoint detection, and the like. The purpose of preprocessing is to mitigate interference from the vocal organs and the voice acquisition equipment and to improve the recognition rate of the system.
2) Respectively calculating and extracting MFCC characteristic parameters, GFCC characteristic parameters and LPCC characteristic parameters of the voice signals;
3) training three Gaussian mixture models, namely a GMM model A, GMM model B and a GMM model C, respectively by using MFCC characteristic parameters, GFCC characteristic parameters and LPCC characteristic parameters;
4) weighting and fusing the results of GMM model A, GMM model B, and GMM model C, performing a soft decision against a set threshold, obtaining the optimal weight coefficients by stochastic gradient descent, and outputting the final recognition result.
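The GMM training in step 3) can be sketched with a minimal numpy EM loop for a single diagonal-covariance mixture; this is an illustrative sketch, not the patented implementation, and the component count, iteration count, and diagonal covariance are assumptions made here for brevity (the embodiment uses a mixture order of 1024):

```python
import numpy as np

def train_diag_gmm(X, n_components=4, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM to feature frames X (n_frames, n_dims)
    via expectation-maximization. Returns (weights, means, variances)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.full(n_components, 1.0 / n_components)
    mu = X[rng.choice(n, n_components, replace=False)]       # init at data points
    var = np.full((n_components, d), X.var(axis=0) + 1e-6)
    for _ in range(n_iter):
        # E-step: log responsibility of each component for each frame
        log_p = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                         + np.log(2 * np.pi * var)).sum(axis=2) + np.log(w))
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        r = np.exp(log_p - log_norm)                          # (n, K)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(axis=0) + 1e-12
        w = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ (X ** 2)) / nk[:, None] - mu ** 2 + 1e-6
    return w, mu, var

def gmm_loglik(X, w, mu, var):
    """Average per-frame log-likelihood, usable as the model score a, b, or c."""
    log_p = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                     + np.log(2 * np.pi * var)).sum(axis=2) + np.log(w))
    return np.logaddexp.reduce(log_p, axis=1).mean()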
As shown in fig. 2, the extraction process of the MFCC characteristic parameters in this embodiment is as follows:
A1) the input speech signal s(n) is preprocessed to generate a time-domain signal x(n) (the length N of each frame sequence is 256); each frame of the speech signal is then processed by fast Fourier transform or discrete Fourier transform to obtain the linear speech spectrum X(k), which can be represented as:

X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πnk/N}, 0 ≤ k ≤ N-1 (1)
A2) the linear spectrum X(k) is input into the Mel filter bank for filtering to generate a Mel spectrum, and the logarithmic energy of the Mel spectrum is then taken to generate the corresponding logarithmic spectrum S(m).

Here, the Mel filter bank is a set of triangular band-pass filters H_m(k), 1 ≤ m ≤ M, where M denotes the number of filters and is usually 20-28. The transfer function of each band-pass filter can be expressed as:

H_m(k) = 0, k < f(m-1)
H_m(k) = (k - f(m-1)) / (f(m) - f(m-1)), f(m-1) ≤ k ≤ f(m)
H_m(k) = (f(m+1) - k) / (f(m+1) - f(m)), f(m) < k ≤ f(m+1)
H_m(k) = 0, k > f(m+1) (2)

In equation (2), f(m) is the center frequency of the m-th filter.

Taking the logarithm of the Mel energy spectrum improves the performance of the voiceprint recognition system. The transformation from the linear speech spectrum X(k) to the logarithmic spectrum S(m) is:

S(m) = ln( Σ_{k=0}^{N-1} |X(k)|² H_m(k) ), 1 ≤ m ≤ M (3)
A3) the logarithmic spectrum S(m) is converted into the MFCC feature parameters using the discrete cosine transform (DCT); the n-th cepstral component C(n) of the MFCC feature parameters is:

C(n) = √(2/M) Σ_{m=1}^{M} S(m) cos( πn(m - 0.5)/M ), n = 1, 2, …, L (4)
the MFCC characteristic parameters obtained through the steps only reflect the static characteristics of the voice signals, and the dynamic characteristic parameters can be obtained by solving the first-order difference and the second-order difference of the static characteristics.
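Steps A1)-A3) can be condensed into a short Python sketch; this is illustrative only, and the 26-filter bank, 8 kHz sampling rate, and 13 cepstral dimensions are assumptions, not values from the patent (the DCT of equation (4) is written out directly):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters=26, n_fft=256, sr=8000):
    """Triangular band-pass filters H_m(k) with centres evenly spaced on the mel scale."""
    mel_pts = np.linspace(0.0, hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        lo, ctr, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, ctr):
            fbank[m - 1, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fbank[m - 1, k] = (hi - k) / max(hi - ctr, 1)
    return fbank

def dct_ii(x):
    """Type-II DCT, the transform used in step A3."""
    M = len(x)
    m = np.arange(M)
    return np.array([np.sum(x * np.cos(np.pi * n * (m + 0.5) / M)) for n in range(M)])

def mfcc(frame, n_ceps=13, sr=8000):
    """One windowed frame -> MFCC: |FFT|^2 -> mel filter bank -> log -> DCT."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    logspec = np.log(mel_filterbank(n_fft=len(frame), sr=sr) @ power + 1e-12)
    return dct_ii(logspec)[:n_ceps]
```

First-order and second-order differences of the resulting vectors would give the dynamic features mentioned above.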
In this embodiment, the Gammatone filter applied in extracting the GFCC (Gammatone frequency cepstral coefficient) feature parameters is designed as follows:

The Gammatone filter bank is used to simulate the auditory characteristics of the cochlear basilar membrane; its time-domain expression is:

g_i(t) = t^{n-1} e^{-2πb_i t} cos(2πf_i t + φ_i) U(t), 1 ≤ i ≤ N (5)

where N is the number of filters;
n is the filter order, typically 4;
i is the filter index;
f_i is the center frequency of the i-th filter;
U(t) is the unit step function;
b_i is the attenuation factor of the i-th filter;
φ_i is the phase of the filter with index i, typically taken as 0.
The bandwidth of each filter is related to the auditory critical band of the human ear, which, according to psychoacoustic theory, can be expressed by the equivalent rectangular bandwidth (ERB):

ERB(f) = 24.7 (4.37 f / 1000 + 1) (6)

The attenuation factor b_i determines the decay rate of the impulse response and is set from the bandwidth at the center frequency:

b_i = 1.019 ERB(f_i) (7)

The time-domain impulse response of the Gammatone filter is a continuous (analog) function; to facilitate computation, it must be discretized. Taking the Laplace transform of equation (5) (with φ_i = 0) gives:

G_i(s) = ((n-1)!/2) [ (s + 2πb_i - j2πf_i)^{-n} + (s + 2πb_i + j2πf_i)^{-n} ] (8)
The output of the Gammatone filter bank is obtained by convolving the input speech signal s(n) with the discretized impulse responses g_i(n).

The extraction process of the GFCC feature parameters is similar to that of the MFCC feature parameters: the traditional Mel filter bank is simply replaced by the Gammatone filter bank, which effectively exploits its cochlear-basilar-membrane characteristics and handles the nonlinear processing of speech signals well.
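The impulse response of equations (5)-(7) can be sampled directly; the following Python sketch is illustrative, with the 8 kHz sampling rate, 64 ms support, and peak normalization being assumptions rather than parameters from the patent:

```python
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth of equation (6): 24.7 * (4.37 f / 1000 + 1)."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_ir(fc, sr=8000, order=4, duration=0.064, phase=0.0):
    """Sampled impulse response g(t) = t^(n-1) exp(-2 pi b t) cos(2 pi fc t + phi) U(t),
    with b = 1.019 * ERB(fc) as in equation (7)."""
    t = np.arange(int(duration * sr)) / sr
    b = 1.019 * erb(fc)
    g = (t ** (order - 1) * np.exp(-2.0 * np.pi * b * t)
         * np.cos(2.0 * np.pi * fc * t + phase))
    return g / np.max(np.abs(g))  # normalize peak amplitude to 1
```

Convolving a preprocessed frame with each `gammatone_ir(f_i)` yields the filter-bank outputs described above.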
Based on the above Gammatone filter, as shown in fig. 3, the process of extracting the GFCC (Gammatone frequency cepstrum coefficient) characteristic parameters is as follows:
B1) first, the input speech signal s(n) is preprocessed to generate a time-domain signal x(n), and the discrete spectrum X(k) is obtained through fast Fourier transform or discrete Fourier transform:

X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πnk/N}, 0 ≤ k ≤ N-1 (9)
B2) the modulus of the discrete spectrum X(k) is squared to generate the speech energy spectrum, which is then filtered with the Gammatone filter bank.

B3) to further improve the performance of the voiceprint recognition system, the output of each filter is exponentially compressed to obtain a set of exponential energy spectra s_1, s_2, …, s_M:

s_i = [ Σ_{k=0}^{N-1} |X(k)|² G_i(k) ]^{E(f)}, 1 ≤ i ≤ M (10)

where G_i(k) is the frequency response of the i-th Gammatone filter, E(f) is the exponential compression value, and M is the number of filter channels.
B4) finally, the exponential energy spectra are converted into the GFCC feature parameters using the discrete cosine transform (DCT):

G(n) = √(2/M) Σ_{i=1}^{M} s_i cos( πn(i - 0.5)/M ), n = 1, 2, …, L (11)

where L denotes the dimension of the feature parameters.
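A rough Python sketch of steps B1)-B4) follows. Note the assumptions: the gammatone bank is approximated by a frequency-domain magnitude response rather than time-domain convolution, the centre frequencies are log-spaced as a stand-in for ERB-rate spacing, and cube-root compression stands in for the exponential compression value E(f); none of these choices come from the patent itself:

```python
import numpy as np

def erb(f):
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_response(fc, freqs, order=4):
    """Approximate magnitude response of one gammatone filter around its centre fc."""
    b = 1.019 * erb(fc)
    return (1.0 + ((freqs - fc) / b) ** 2) ** (-order / 2.0)

def dct_ii(x):
    """Type-II DCT used in step B4."""
    M = len(x)
    m = np.arange(M)
    return np.array([np.sum(x * np.cos(np.pi * n * (m + 0.5) / M)) for n in range(M)])

def gfcc(frame, n_filters=32, n_ceps=13, sr=8000, compress=1.0 / 3.0):
    """Frame -> GFCC: power spectrum -> gammatone bank -> root compression -> DCT."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    centres = np.geomspace(50.0, sr / 2.0, n_filters)  # log spacing, ERB-like stand-in
    energies = np.array([np.sum(power * gammatone_response(fc, freqs))
                         for fc in centres])
    return dct_ii(energies ** compress)[:n_ceps]
```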
In this embodiment, the process of extracting the characteristic parameters of LPCC (linear predictive cepstrum coefficient) is as follows:
assume that the system function of the vocal tract model is as follows:
H(z) = 1 / (1 - Σ_{i=1}^{p} a_i z^{-i}) (12)

In equation (12), p is the order of the predictor.

Let h(n) be the impulse response of H(z) and ĥ(n) the complex cepstrum of h(n); then:

Ĥ(z) = ln H(z) = Σ_{n=1}^{∞} ĥ(n) z^{-n} (13)
Combining equations (12) and (13), differentiating both sides with respect to z^{-1}, and simplifying gives:

(1 - Σ_{i=1}^{p} a_i z^{-i}) Σ_{n=1}^{∞} n ĥ(n) z^{-(n-1)} = Σ_{i=1}^{p} i a_i z^{-(i-1)} (14)

Equating the coefficients of like powers of z^{-1} on both sides of equation (14) yields the complex cepstrum:

ĥ(1) = a_1
ĥ(n) = a_n + Σ_{k=1}^{n-1} (k/n) ĥ(k) a_{n-k}, 1 < n ≤ p (15)
ĥ(n) = Σ_{k=n-p}^{n-1} (k/n) ĥ(k) a_{n-k}, n > p
According to the relation between the complex cepstrum and the cepstral coefficients, which for the minimum-phase vocal tract model is:

c(n) = ĥ(n), n > 0 (16)

the linear prediction cepstral coefficients can be calculated recursively:

c(1) = a_1
c(n) = a_n + Σ_{k=1}^{n-1} (k/n) c(k) a_{n-k}, 1 < n ≤ p (17)
c(n) = Σ_{k=n-p}^{n-1} (k/n) c(k) a_{n-k}, n > p

where c(n) is the linear prediction cepstral coefficient (LPCC) and a_n are the linear prediction coefficients.
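The recursion of equation (17) translates directly into code; in this sketch the LPC coefficients a_1..a_p are assumed to be given (e.g. from a Levinson-Durbin analysis, which is outside the scope of this snippet):

```python
import numpy as np

def lpcc_from_lpc(a, n_ceps):
    """LPCC c(n) from LPC coefficients a(1..p) via the standard recursion:
       c(n) = a(n) + sum_{k=1}^{n-1} (k/n) c(k) a(n-k)  for n <= p,
       c(n) = sum_{k=n-p}^{n-1} (k/n) c(k) a(n-k)       for n  > p."""
    p = len(a)
    c = np.zeros(n_ceps + 1)     # c[0] unused; 1-based indexing mirrors the formula
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

For a first-order model H(z) = 1/(1 - a z^{-1}) the known cepstrum is c(n) = a^n / n, which the recursion reproduces and which makes a convenient sanity check.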
In step 4) of this embodiment, the mixture order of GMM model A, GMM model B, and GMM model C is 1024 in each case. The output results of the three models are a, b, and c respectively; the three results are fused with weight coefficients ω_i satisfying:

Σ_{i=1}^{3} ω_i = 1 (18)

The final fused score is D = ω_1 a + ω_2 b + ω_3 c. A threshold γ is set; when D ≥ γ, the target speaker is recognized, otherwise a non-target speaker is recognized.
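A toy sketch of the weighted fusion and threshold decision, with the weight coefficients learned by stochastic gradient descent on labelled score triples; the squared-error objective, learning rate, and the clip-and-renormalize step that keeps Σω_i = 1 are assumptions made here, since the patent does not specify them:

```python
import numpy as np

def fuse(scores, w):
    """Weighted fusion D = w1*a + w2*b + w3*c of the three GMM scores."""
    return np.dot(scores, w)

def decide(scores, w, gamma):
    """Soft decision: target speaker iff the fused score reaches threshold gamma."""
    return fuse(scores, w) >= gamma

def fit_weights(S, y, lr=0.05, epochs=200, seed=0):
    """Stochastic gradient descent on squared error.
    S: (n_samples, 3) score triples, y: 0/1 target labels."""
    rng = np.random.default_rng(seed)
    w = np.ones(3) / 3.0
    for _ in range(epochs):
        for i in rng.permutation(len(S)):
            err = np.dot(S[i], w) - y[i]
            w -= lr * err * S[i]          # gradient of 0.5 * err^2
        w = np.clip(w, 0.0, None)
        w /= w.sum()                      # keep the constraint sum(w_i) = 1
    return w
```

In practice the GMM log-likelihoods a, b, c would first be normalized to a comparable range before fusion.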

Claims (8)

1. A voiceprint recognition method for home multi-feature parameter fusion, characterized by comprising the following steps:
S1: calculating and extracting the MFCC, GFCC, and LPCC feature parameters of the voice signal respectively;
S2: training three Gaussian mixture models with the MFCC, GFCC, and LPCC feature parameters respectively;
S3: weighting and fusing the results of the three Gaussian mixture models, performing a soft decision against a set threshold, obtaining the optimal weight coefficients by stochastic gradient descent, and outputting the final recognition result.
2. The voiceprint recognition method for home multi-feature parameter fusion according to claim 1, wherein the speech signal undergoes a preprocessing operation before the feature parameter extraction in step S1.
3. The voiceprint recognition method for home multi-feature parameter fusion according to claim 2, wherein the preprocessing operation in step S1 includes sampling quantization, pre-emphasis, frame windowing, and endpoint detection.
4. The voiceprint recognition method for home multi-feature parameter fusion according to claim 1, wherein the extraction process of the MFCC feature parameters in step S1 is as follows:
A1) preprocessing an input voice signal to generate a time domain signal, and processing each frame of voice signal through fast Fourier transform or discrete Fourier transform to obtain a voice linear frequency spectrum;
A2) inputting the linear frequency spectrum into a Mel filter bank for filtering to generate a Mel frequency spectrum, and taking the logarithmic energy of the Mel frequency spectrum to generate a corresponding logarithmic frequency spectrum;
A3) the logarithmic spectrum is converted into the MFCC feature parameters by using a discrete cosine transform.
5. The voiceprint recognition method for home multi-feature parameter fusion according to claim 1, wherein the extraction process of the GFCC feature parameters in step S1 is as follows:
B1) preprocessing the voice signal to generate a time-domain signal, and obtaining a discrete spectrum through fast Fourier transform or discrete Fourier transform;
B2) squaring the modulus of the discrete spectrum to generate a voice energy spectrum, and filtering it with a Gammatone filter bank;
B3) performing exponential compression on the output of each Gammatone filter to obtain a group of exponential energy spectra;
B4) converting the exponential energy spectra into the GFCC feature parameters using a discrete cosine transform.
6. The voiceprint recognition method for home multi-feature parameter fusion according to claim 1, wherein the extraction process of the LPCC feature parameters in step S1 is as follows:
C1) setting a system function of the vocal tract model;
C2) setting the impulse response of a system function, and calculating the complex cepstrum of the impulse response;
C3) calculating the LPCC feature parameters according to the relation between the complex cepstrum and the cepstral coefficients.
7. The voiceprint recognition method for home multi-feature parameter fusion according to claim 1, wherein the recognition result in step S3 is determined as follows: when the weighted fusion result is greater than or equal to the threshold, the target speaker is recognized; otherwise, a non-target speaker is recognized.
8. The voiceprint recognition method for home multi-feature parameter fusion according to claim 1, wherein the Gammatone filter bank in step B2) is used to simulate the auditory characteristics of the cochlear basilar membrane, its time-domain expression being:

g_i(t) = t^{n-1} e^{-2πb_i t} cos(2πf_i t + φ_i) U(t), 1 ≤ i ≤ N

where N is the number of filters, n is the filter order, i is the filter index, f_i is the center frequency of the i-th filter, U(t) is the unit step function, b_i is the attenuation factor of the i-th filter, and φ_i is the phase of the filter with index i.
CN202010439120.7A 2020-05-22 2020-05-22 Voiceprint recognition method for home multi-feature parameter fusion Pending CN111785285A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010439120.7A CN111785285A (en) 2020-05-22 2020-05-22 Voiceprint recognition method for home multi-feature parameter fusion


Publications (1)

Publication Number Publication Date
CN111785285A (en) 2020-10-16

Family

ID=72754331

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010439120.7A Pending CN111785285A (en) 2020-05-22 2020-05-22 Voiceprint recognition method for home multi-feature parameter fusion

Country Status (1)

Country Link
CN (1) CN111785285A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112542174A (en) * 2020-12-25 2021-03-23 南京邮电大学 VAD-based multi-dimensional characteristic parameter voiceprint identification method
CN112712820A (en) * 2020-12-25 2021-04-27 广州欢城文化传媒有限公司 Tone classification method, device, equipment and medium
CN112885355A (en) * 2021-01-25 2021-06-01 上海头趣科技有限公司 Speech recognition method based on multiple features
CN113177536A (en) * 2021-06-28 2021-07-27 四川九通智路科技有限公司 Vehicle collision detection method and device based on deep residual shrinkage network
CN113257266A (en) * 2021-05-21 2021-08-13 特斯联科技集团有限公司 Complex environment access control method and device based on voiceprint multi-feature fusion
CN113393847A (en) * 2021-05-27 2021-09-14 杭州电子科技大学 Voiceprint recognition method based on fusion of Fbank features and MFCC features
CN113612738A (en) * 2021-07-20 2021-11-05 深圳市展韵科技有限公司 Voiceprint real-time authentication encryption method, voiceprint authentication equipment and controlled equipment
CN113823293A (en) * 2021-09-28 2021-12-21 武汉理工大学 Speaker recognition method and system based on voice enhancement
CN113823290A (en) * 2021-08-31 2021-12-21 杭州电子科技大学 Multi-feature fusion voiceprint recognition method
CN116386647A (en) * 2023-05-26 2023-07-04 北京瑞莱智慧科技有限公司 Audio verification method, related device, storage medium and program product

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101436405A (en) * 2008-12-25 2009-05-20 北京中星微电子有限公司 Method and system for recognizing speaking people
CN104835498A (en) * 2015-05-25 2015-08-12 重庆大学 Voiceprint identification method based on multi-type combination characteristic parameters



Similar Documents

Publication Publication Date Title
CN111785285A (en) Voiceprint recognition method for home multi-feature parameter fusion
CN108447495B (en) Deep learning voice enhancement method based on comprehensive feature set
CN102881289B (en) Hearing perception characteristic-based objective voice quality evaluation method
CN109256138B (en) Identity verification method, terminal device and computer readable storage medium
CN109767756B (en) Sound characteristic extraction algorithm based on dynamic segmentation inverse discrete cosine transform cepstrum coefficient
CN111653289B (en) Playback voice detection method
CN102968990B (en) Speaker identifying method and system
CN108564965B (en) Anti-noise voice recognition system
CN112735477B (en) Voice emotion analysis method and device
CN111986679A (en) Speaker confirmation method, system and storage medium for responding to complex acoustic environment
CN108564956A (en) A kind of method for recognizing sound-groove and device, server, storage medium
Chauhan et al. Speech to text converter using Gaussian Mixture Model (GMM)
Gamit et al. Isolated words recognition using mfcc lpc and neural network
CN110570871A (en) TristouNet-based voiceprint recognition method, device and equipment
CN111524524B (en) Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium
Jing et al. Speaker recognition based on principal component analysis of LPCC and MFCC
CN115910097A (en) Audible signal identification method and system for latent fault of high-voltage circuit breaker
CN111508504A (en) Speaker recognition method based on auditory center perception mechanism
CN112863517B (en) Speech recognition method based on perceptual spectrum convergence rate
CN111785262B (en) Speaker age and gender classification method based on residual error network and fusion characteristics
CN114038469B (en) Speaker identification method based on multi-class spectrogram characteristic attention fusion network
Zouhir et al. Speech Signals Parameterization Based on Auditory Filter Modeling
Tzudir et al. Low-resource dialect identification in Ao using noise robust mean Hilbert envelope coefficients
Islam Modified mel-frequency cepstral coefficients (MMFCC) in robust text-dependent speaker identification
Singh et al. A comparative study of recognition of speech using improved MFCC algorithms and Rasta filters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201016)