CN113409796A - Voice identity verification method based on long-term formant measurement - Google Patents

Voice identity verification method based on long-term formant measurement

Info

Publication number
CN113409796A
CN113409796A (application CN202110510987.1A)
Authority
CN
China
Prior art keywords
voice
long
frequency
distance
resonance peak
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110510987.1A
Other languages
Chinese (zh)
Other versions
CN113409796B (en)
Inventor
汤申亮
张华军
邓小涛
王征华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dashengji Technology Co ltd
Original Assignee
Wuhan Dashengji Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dashengji Technology Co ltd filed Critical Wuhan Dashengji Technology Co ltd
Priority to CN202110510987.1A priority Critical patent/CN113409796B/en
Publication of CN113409796A publication Critical patent/CN113409796A/en
Application granted granted Critical
Publication of CN113409796B publication Critical patent/CN113409796B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/06 Decision making techniques; Pattern matching strategies
    • G10L 17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

The invention provides a voice identity verification method based on long-term formant measurement. Given a known voice file from a single speaker, the distance between the long-term formant data of any two speech segments in the known file is calculated to obtain an upper bound distance and a lower bound distance. When a questioned speech sample (the speech to be verified) is collected, its long-term formant distance to the known voice file is calculated: if the distance is below the lower bound, the questioned speech and the known voice file are judged to share identity; if it exceeds the upper bound, they are judged not to share identity; if it lies between the bounds, identity is verified by a hypothesis test. By acquiring the long-term formants of a voice file and verifying voice identity from the long-term formant distance in combination with hypothesis testing, the invention improves verification accuracy.

Description

Voice identity verification method based on long-term formant measurement
Technical Field
The invention belongs to the technical field of voice detection, and particularly relates to a voice identity verification method based on long-term formant measurement.
Background
Formants are important features in voiceprint identification: they not only provide a reference for distinguishing consonants and vowels, but also carry personality characteristics of the speaker. Formant frequency is affected by vocal tract length: a longer vocal tract produces lower vowel formants, and the relative proportions of the parts of the vocal tract also affect formant frequency.
There are many ways to measure formant frequency. Among them, measuring the centre frequency values of the formants of different vowels is the most classical. However, the formant frequencies of different vowels, and the different formants of a given vowel, are not sufficiently correlated, and this reduces identification accuracy. Another approach is dynamic analysis: individuals leave traces of their specific articulatory movement patterns when they speak, and these traces reflect the speaker's personality characteristics. However, formant dynamics are affected by both segmental and prosodic context, and this method still requires further study of the differences between speaking contexts.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a voice identity verification method based on long-term formant measurement that improves verification accuracy.
The technical scheme adopted by the invention to solve this problem is a voice identity verification method based on long-term formant measurement, comprising the following steps:
given a known voice file from a single speaker, calculate the distance between the long-term formant data of any two speech segments in the known voice file to obtain an upper bound distance $\overline{D}$ and a lower bound distance $\underline{D}$;
when a questioned speech sample is collected, calculate the long-term formant distance $D$ between the questioned speech and the known voice file, and decide as follows:
when $D<\underline{D}$, the questioned speech is judged to share identity with the known voice file, i.e. the same speaker;
when $D>\overline{D}$, the questioned speech is judged not to share identity with the known voice file, i.e. different speakers;
when $\underline{D}\le D\le\overline{D}$, a hypothesis test is used to verify identity.
According to the above method, the upper bound distance $\overline{D}$ and the lower bound distance $\underline{D}$ are calculated as follows:

Let the four-band long-term formant measurement data of two speech segments in the known voice file be $X_1$ and $Y_1$, where

$$X_1=\begin{bmatrix} x_{F11} & \cdots & x_{F1m}\\ x_{F21} & \cdots & x_{F2m}\\ x_{F31} & \cdots & x_{F3m}\\ x_{F41} & \cdots & x_{F4m}\end{bmatrix},\qquad Y_1=\begin{bmatrix} y_{F11} & \cdots & y_{F1n}\\ y_{F21} & \cdots & y_{F2n}\\ y_{F31} & \cdots & y_{F3n}\\ y_{F41} & \cdots & y_{F4n}\end{bmatrix}$$

where $x_{Fk1},\dots,x_{Fkm}$ ($k=1,\dots,4$) are the first to $m$-th formant measurements of the first speech segment in the $k$-th frequency band, and $y_{Fk1},\dots,y_{Fkn}$ are the first to $n$-th formant measurements of the second speech segment in the $k$-th frequency band; the first to fourth frequency bands are ordered monotonically in frequency (sequentially increasing or sequentially decreasing).

The columns of each measurement matrix form formant vectors $x_i=[x_{F1i}\;x_{F2i}\;x_{F3i}\;x_{F4i}]$ and $y_i=[y_{F1i}\;y_{F2i}\;y_{F3i}\;y_{F4i}]$. The centre positions of the $m$ vectors of the first segment and the $n$ vectors of the second segment are calculated separately: let $x_c=[x_{F1c}\;x_{F2c}\;x_{F3c}\;x_{F4c}]$ be the centre of the $X_1$ matrix and $y_c=[y_{F1c}\;y_{F2c}\;y_{F3c}\;y_{F4c}]$ the centre of the $Y_1$ matrix. Following the clustering principle, the sum of distances from $x_c$ to the $x_i$ is minimised, so $x_c$ and $y_c$ are obtained by solving

$$x_c=\arg\min_{x}\sum_{i=1}^{m}\lVert x-x_i\rVert,\qquad y_c=\arg\min_{y}\sum_{i=1}^{n}\lVert y-y_i\rVert$$

On the basis of $x_c$ and $y_c$, the long-term formant distance $D^*$ of the two segments is the Euclidean distance between the centres:

$$D^*=\lVert x_c-y_c\rVert=\sqrt{\sum_{k=1}^{4}\left(x_{Fkc}-y_{Fkc}\right)^2}$$

The distance between every pair of different segments in the known voice file is calculated in this way, and the maximum and minimum of these distances are taken as the upper bound distance $\overline{D}$ and the lower bound distance $\underline{D}$.
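The centre-finding and distance steps above can be sketched in Python (an illustrative sketch, not part of the patent; numpy is assumed and the function names are mine). The patent's centre minimises the sum of Euclidean distances to the column vectors, i.e. the geometric median, approximated here by Weiszfeld iteration; with squared distances the centre would reduce to the column-wise mean.

```python
import numpy as np

def segment_center(seg, iters=200, tol=1e-9):
    """Centre of a 4xN long-term formant matrix: the point minimising the
    sum of Euclidean distances to the N column vectors (geometric median,
    Weiszfeld iteration), starting from the column mean."""
    pts = seg.T                                # N rows of [F1 F2 F3 F4]
    c = pts.mean(axis=0)
    for _ in range(iters):
        d = np.linalg.norm(pts - c, axis=1)
        d = np.where(d < tol, tol, d)          # guard against division by zero
        c_new = (pts / d[:, None]).sum(axis=0) / (1.0 / d).sum()
        if np.linalg.norm(c_new - c) < tol:
            break
        c = c_new
    return c

def ltf_distance(seg_x, seg_y):
    """Long-term formant distance D*: Euclidean distance between centres."""
    return float(np.linalg.norm(segment_center(seg_x) - segment_center(seg_y)))
```

Applying `ltf_distance` to every pair of segments of the known file gives the set of $D^*$ values whose maximum and minimum serve as the upper and lower bounds.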
According to the method, the long-term formant distance $D$ between the questioned speech and the known voice file is calculated by the same method as the distance $D^*$ between two speech segments of the known voice file.
According to the method, the hypothesis test is a t test, carried out as follows:

Let the four-band long-term formant measurement data of the questioned speech be $Z_1$, where

$$Z_1=\begin{bmatrix} z_{F11} & \cdots & z_{F1j}\\ z_{F21} & \cdots & z_{F2j}\\ z_{F31} & \cdots & z_{F3j}\\ z_{F41} & \cdots & z_{F4j}\end{bmatrix}$$

where $z_{Fk1},\dots,z_{Fkj}$ ($k=1,\dots,4$) are the first to $j$-th formant measurements of the questioned speech in the $k$-th frequency band.

Assume that $x_{F21},x_{F22},\dots,x_{F2m}$ follow the normal distribution $N(u,\sigma^2)$ and that $z_{F21},z_{F22},\dots,z_{F2j}$ follow $N(v,\sigma^2)$. By statistical theory, the second-band formant data satisfy

$$\frac{\bar{x}_{F2}-\bar{z}_{F2}-(u-v)}{S_w\sqrt{\frac{1}{m}+\frac{1}{j}}}\sim t(m+j-2),\qquad S_w^2=\frac{(m-1)S_x^2+(j-1)S_z^2}{m+j-2}$$

where $\bar{x}_{F2}$ and $S_x$ are the mean and standard deviation of $x_{F21},\dots,x_{F2m}$, and $\bar{z}_{F2}$ and $S_z$ are the mean and standard deviation of $z_{F21},\dots,z_{F2j}$.

Given a significance level $\alpha$, when

$$\left|\frac{\bar{x}_{F2}-\bar{z}_{F2}}{S_w\sqrt{\frac{1}{m}+\frac{1}{j}}}\right|<t_{\alpha/2}(m+j-2)$$

the questioned speech is judged to share identity with the known voice file; otherwise it is judged not to share identity.
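A minimal sketch of this t test in Python (illustrative only; numpy and scipy are assumed, and the function name is mine). It pools the two sample variances exactly as in the $S_w$ formula above and compares the statistic against the two-sided critical value.

```python
import numpy as np
from scipy import stats

def same_speaker_ttest(x_f2, z_f2, alpha=0.05):
    """Two-sample pooled-variance t test on second-band formant data.
    Returns True when H0 (equal means, i.e. same speaker) is accepted
    at significance level alpha."""
    x_f2, z_f2 = np.asarray(x_f2, float), np.asarray(z_f2, float)
    m, j = len(x_f2), len(z_f2)
    # pooled standard deviation S_w
    sw = np.sqrt(((m - 1) * x_f2.var(ddof=1) + (j - 1) * z_f2.var(ddof=1))
                 / (m + j - 2))
    t_stat = (x_f2.mean() - z_f2.mean()) / (sw * np.sqrt(1.0 / m + 1.0 / j))
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, m + j - 2)  # two-sided critical value
    return bool(abs(t_stat) < t_crit)
```

The patent applies the same test band by band; this sketch shows the second band (LTF2) only.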
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method for voice identity verification based on long-term formant measurements when executing the computer program.
A non-transitory computer readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method for voice identity verification based on long-term formant measurements.
The beneficial effects of the invention are as follows: by acquiring the long-term formants of a voice file and verifying voice identity from the long-term formant distance in combination with a hypothesis test, verification accuracy can be improved.
Drawings
FIG. 1 shows the frequencies of the formants LTF2 and LTF3 at vowels in different contexts of speech.
FIG. 2 is a formant spectrum.
FIG. 3 is a plot of formant F1-F3 frequency versus time.
FIG. 4 is a frequency distribution plot of formants F1-F3.
FIG. 5 is a graph of the long-term formant LTF2 and LTF3 distribution for different speakers.
FIG. 6 is a graph of the long term formant LTF2 and LTF3 distribution for the same speaker.
Fig. 7 is a t-test confidence interval distribution map.
FIG. 8 is a flowchart of a method according to an embodiment of the present invention.
Detailed Description
The invention is further illustrated by the following specific examples and figures.
FIG. 1 depicts the frequency variation of LTF2 and LTF3 for multiple test subjects in both natural-speech and reading contexts; it shows that the variation of the mean LTF2 and LTF3 frequencies of a speaker across the two contexts is very small. LTF4 is strongly affected by the telephone communication bandwidth, so the invention selects LTF2 and LTF3 as the basis for voiceprint verification.
As shown in FIG. 2, the positions of the vowel formants F1-F4 are determined for the voice file to be examined by combining linear predictive analysis with manual correction; along the spectrum from low frequency to high frequency the peaks are F1-F4 in order. Because F4 is unstable, only formants F1-F3 are used as the basis for identification. The time-varying curves of formants F1-F3 are shown in FIG. 3, and from the frequency and occurrence probability of each formant the long-term F1-F3 frequency distribution curves of FIG. 4 can be drawn. From these long-term distribution characteristics, different speakers have different LTF2 and LTF3 distributions. FIG. 5 depicts the vowel LTF2 and LTF3 distributions of two test subjects: the two solid lines are their respective LTF2 distributions and the two dashed lines their respective LTF3 distributions. The figure shows that the LTF2 and LTF3 of the two subjects differ not only in mean frequency but also, more markedly, in the interval covered by the distribution curve and in curve shape. FIG. 6 shows the vowel LTF2 (solid lines) and LTF3 (dashed lines) distributions measured in different contexts for the same speaker: for the same speaker, LTF2 and LTF3 in different contexts show very small variation in mean frequency, and the intervals and shapes of the distribution curves are very similar. A probabilistic hypothesis test on measured LTF2 and LTF3 data can therefore determine whether a questioned speech sample comes from the target speaker.
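The "linear predictive analysis" step for locating the formants can be sketched as follows (an illustrative, simplified stand-in for the patent's LPC-plus-manual-correction procedure; Python with numpy is assumed, the manual-correction step is omitted, and the function name is mine):

```python
import numpy as np

def lpc_formants(signal, sr, order=10):
    """Estimate formant frequencies of one frame by linear prediction
    (autocorrelation method): solve the normal equations for the
    predictor, then read formants off the angles of the complex roots."""
    x = np.asarray(signal, float) * np.hamming(len(signal))
    r = np.correlate(x, x, mode='full')[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - k)] for k in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])          # prediction coefficients
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[np.imag(roots) > 0]               # one of each conjugate pair
    freqs = np.angle(roots) * sr / (2 * np.pi)
    return sorted(f for f in freqs if f > 50)       # discard near-DC roots
```

Running this frame by frame over voiced speech and histogramming the resulting F1-F3 values yields long-term formant distributions of the kind shown in FIG. 4.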
Based on the above principle and research, the present invention provides a voice identity verification method based on long-term formant measurement, as shown in fig. 8, the method includes:
s1, knowing a voice file from the same speaker, calculating the distance between the long-time resonance peak data of any two sections of voice in the known voice file, and obtaining the upper limit distance
Figure RE-GDA0003193908910000051
And a lower limit distance
Figure RE-GDA0003193908910000052
The upper limit distance
Figure RE-GDA0003193908910000053
And a lower limit distance
Figure RE-GDA0003193908910000054
The calculation method of (2) is as follows:
let the 4 long-term formant measurement data of 2-segment speech in the known speech file be X1 and Y1, wherein,
Figure RE-GDA0003193908910000055
Figure RE-GDA0003193908910000056
in the formula, xF11……xF1mFor the first to the mth resonance peak data, x, under the first frequency of the first section voiceF21……xF2mFor the first to the mth resonance peak data x under the second frequency of the first section voiceF31……xF3mIs the first to the mth resonance peak data x under the third frequency of the first section voiceF41……xF4mFor the first speech segment from the first to the m < th > frequencyIndividual resonance peak data; y isF11……yF1nFor the first to nth resonance peak data, y, at the first frequency of the second speech segmentF21……yF2nFor the first to nth resonance peak data, y, at the second frequency of the second speech segmentF31……yF3nIs the first to nth resonance peak data y under the third frequency of the second section voiceF41……yF4nThe data of the first to nth resonance peak under the fourth frequency of the second section of voice; the first to fourth frequencies are frequencies which are sequentially increased or sequentially decreased;
the column data of each long-time formant measurement data matrix form a formant vector xi= [xF1ixF2ixF3ixF4i]、yi=[yF1i yF2i yF3i yF4i]Respectively calculating the central position of the m vectors of the first section of voice and the n vectors of the second section of voice, and enabling x to bec=[xF1c xF2c xF3c xF4c]Is the center of the X1 matrix, let yc=[yF1c yF2c yF3c yF4c]For the center of Y1 matrix, x is obtained according to the clustering principlecTo xiIs minimized, x is obtained by solving the following minimum problemcAnd yc
Figure RE-GDA0003193908910000061
Figure RE-GDA0003193908910000062
At xcAnd ycOn the basis, the Euclidean distance between centers is calculated to calculate the long-term formant distance D of the two sections of voice*
Figure RE-GDA0003193908910000063
From said knownRespectively calculating the distance between every two voices of different segments in the voice file according to the method, and taking the maximum value and the minimum value as the upper limit distance
Figure RE-GDA0003193908910000064
And a lower limit distance
Figure RE-GDA0003193908910000065
S2, when a questioned speech sample is collected, calculate the long-term formant distance $D$ between the questioned speech and the known voice file, using the same method as for the distance $D^*$ between two segments of the known voice file.

Then decide as follows: when $D<\underline{D}$, the questioned speech is judged to share identity with the known voice file, i.e. the same speaker; when $D>\overline{D}$, the questioned speech is judged not to share identity with the known voice file, i.e. different speakers; when $\underline{D}\le D\le\overline{D}$, a hypothesis test is used to verify identity.
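The three-way decision of steps S1/S2 can be sketched as a small helper (illustrative only; the names are mine). The hypothesis test is passed in as a callable so that it runs only in the indeterminate band:

```python
def verify_identity(d, d_lower, d_upper, ttest_accepts):
    """Patent's three-way decision: distance below the lower bound means
    same speaker; above the upper bound means different speakers; in
    between, fall back on the hypothesis test."""
    if d < d_lower:
        return True          # same identity
    if d > d_upper:
        return False         # different speakers
    return ttest_accepts()   # indeterminate band: run the t test
```

Here `ttest_accepts` wraps whatever acceptance test is used, i.e. the t test described below.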
The hypothesis test is a t test, carried out as follows:

Let the four-band long-term formant measurement data of the questioned speech be $Z_1$, where

$$Z_1=\begin{bmatrix} z_{F11} & \cdots & z_{F1j}\\ z_{F21} & \cdots & z_{F2j}\\ z_{F31} & \cdots & z_{F3j}\\ z_{F41} & \cdots & z_{F4j}\end{bmatrix}$$

where $z_{Fk1},\dots,z_{Fkj}$ ($k=1,\dots,4$) are the first to $j$-th formant measurements of the questioned speech in the $k$-th frequency band.

Assume that $x_{F21},x_{F22},\dots,x_{F2m}$ follow the normal distribution $N(u,\sigma^2)$ and that $z_{F21},z_{F22},\dots,z_{F2j}$ follow $N(v,\sigma^2)$. By statistical theory, the second-band formant data satisfy

$$\frac{\bar{x}_{F2}-\bar{z}_{F2}-(u-v)}{S_w\sqrt{\frac{1}{m}+\frac{1}{j}}}\sim t(m+j-2),\qquad S_w^2=\frac{(m-1)S_x^2+(j-1)S_z^2}{m+j-2}$$

where $\bar{x}_{F2}$ and $S_x$ are the mean and standard deviation of $x_{F21},\dots,x_{F2m}$, and $\bar{z}_{F2}$ and $S_z$ are the mean and standard deviation of $z_{F21},\dots,z_{F2j}$.
There are two hypotheses, $H_0:u=v$ and $H_1:u\neq v$. If $H_0$ holds, then

$$T=\frac{\bar{x}_{F2}-\bar{z}_{F2}}{S_w\sqrt{\frac{1}{m}+\frac{1}{j}}}\sim t(m+j-2)$$

To test $H_0$ against $H_1$, a significance level $\alpha$ is given; when $|T|<t_{\alpha/2}(m+j-2)$, the questioned speech is judged to share identity with the known voice file, i.e. $H_0$ is accepted; otherwise it is judged not to share identity, i.e. $H_0$ is rejected.
As shown in FIG. 7, for the two samples to be regarded as coming from the same speaker at the 95% confidence level, their measured long-term formants must satisfy

$$|\bar{x}_{F2}-\bar{z}_{F2}|<c,\qquad c=t_{\alpha/2}(m+j-2)\,S_w\sqrt{\frac{1}{m}+\frac{1}{j}}$$

where $t_{\alpha/2}(m+j-2)$ is the critical value of the t distribution with $m+j-2$ degrees of freedom at significance level $\alpha=0.05$. As FIG. 7 shows, the larger $1-\alpha$ is, the higher the confidence that $H_0$ holds. Since the t distribution is symmetric about the vertical axis, let $2\beta=1-\alpha$; then

$$\beta=P\bigl(0\le T\le t_{\alpha/2}(m+j-2)\bigr)$$
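The critical gap $c$ can be computed directly (an illustrative sketch; numpy and scipy are assumed and the function name is mine). One convention assumption: the patent's $t_{0.05}(m+j-2)$ is read here as the two-sided critical value $t_{\alpha/2}$ with $\alpha=0.05$, consistent with the acceptance condition $|T|<t_{\alpha/2}(m+j-2)$ above.

```python
import numpy as np
from scipy import stats

def ltf2_threshold(x_f2, z_f2, alpha=0.05):
    """Critical gap c between LTF2 sample means at confidence 1 - alpha:
    |mean(x) - mean(z)| < c is equivalent to the two-sided t test
    accepting H0 (same speaker)."""
    x_f2, z_f2 = np.asarray(x_f2, float), np.asarray(z_f2, float)
    m, j = len(x_f2), len(z_f2)
    # pooled standard deviation S_w
    sw = np.sqrt(((m - 1) * x_f2.var(ddof=1) + (j - 1) * z_f2.var(ddof=1))
                 / (m + j - 2))
    return float(stats.t.ppf(1.0 - alpha / 2.0, m + j - 2)
                 * sw * np.sqrt(1.0 / m + 1.0 / j))
```

Two samples are then accepted as sharing identity at the 95% level exactly when the gap between their LTF2 means is below $c$.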
When testing the identity of the two samples, in order to determine a reasonable value range for $\beta$, upper and lower limits $\overline{\beta}$ and $\underline{\beta}$ can be determined by comparison within the known samples. When $\beta\ge\overline{\beta}$, the questioned samples are accepted as sharing identity; when $\beta\le\underline{\beta}$, identity is rejected; when $\underline{\beta}<\beta<\overline{\beta}$, a comprehensive judgment is made in conjunction with the distance $D$.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of the voice identity verification method based on the long-term formant measurement when executing the computer program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method for voice identity verification based on long-term formant measurements.
The above embodiments are only used for illustrating the design idea and features of the present invention, and the purpose of the present invention is to enable those skilled in the art to understand the content of the present invention and implement the present invention accordingly, and the protection scope of the present invention is not limited to the above embodiments. Therefore, all equivalent changes and modifications made in accordance with the principles and concepts disclosed herein are intended to be included within the scope of the present invention.

Claims (6)

1. A voice identity verification method based on long-term formant measurement, characterized by comprising the following steps:
given a known voice file from a single speaker, calculating the distance between the long-term formant data of any two speech segments in the known voice file to obtain an upper bound distance $\overline{D}$ and a lower bound distance $\underline{D}$;
when a questioned speech sample is collected, calculating the long-term formant distance $D$ between the questioned speech and the known voice file, and deciding as follows:
when $D<\underline{D}$, judging that the questioned speech shares identity with the known voice file, i.e. the same speaker;
when $D>\overline{D}$, judging that the questioned speech does not share identity with the known voice file, i.e. different speakers;
when $\underline{D}\le D\le\overline{D}$, using a hypothesis test to verify identity.
2. The method of claim 1, wherein the upper bound distance $\overline{D}$ and the lower bound distance $\underline{D}$ are calculated as follows:

let the four-band long-term formant measurement data of two speech segments in the known voice file be $X_1$ and $Y_1$, where

$$X_1=\begin{bmatrix} x_{F11} & \cdots & x_{F1m}\\ x_{F21} & \cdots & x_{F2m}\\ x_{F31} & \cdots & x_{F3m}\\ x_{F41} & \cdots & x_{F4m}\end{bmatrix},\qquad Y_1=\begin{bmatrix} y_{F11} & \cdots & y_{F1n}\\ y_{F21} & \cdots & y_{F2n}\\ y_{F31} & \cdots & y_{F3n}\\ y_{F41} & \cdots & y_{F4n}\end{bmatrix}$$

where $x_{Fk1},\dots,x_{Fkm}$ ($k=1,\dots,4$) are the first to $m$-th formant measurements of the first speech segment in the $k$-th frequency band, and $y_{Fk1},\dots,y_{Fkn}$ are the first to $n$-th formant measurements of the second speech segment in the $k$-th frequency band, the first to fourth frequency bands being ordered monotonically in frequency (sequentially increasing or sequentially decreasing);

the columns of each measurement matrix form formant vectors $x_i=[x_{F1i}\;x_{F2i}\;x_{F3i}\;x_{F4i}]$ and $y_i=[y_{F1i}\;y_{F2i}\;y_{F3i}\;y_{F4i}]$; the centre positions of the $m$ vectors of the first segment and the $n$ vectors of the second segment are calculated separately: letting $x_c=[x_{F1c}\;x_{F2c}\;x_{F3c}\;x_{F4c}]$ be the centre of the $X_1$ matrix and $y_c=[y_{F1c}\;y_{F2c}\;y_{F3c}\;y_{F4c}]$ the centre of the $Y_1$ matrix, and following the clustering principle that the sum of distances from $x_c$ to the $x_i$ is minimised, $x_c$ and $y_c$ are obtained by solving

$$x_c=\arg\min_{x}\sum_{i=1}^{m}\lVert x-x_i\rVert,\qquad y_c=\arg\min_{y}\sum_{i=1}^{n}\lVert y-y_i\rVert$$

on the basis of $x_c$ and $y_c$, the long-term formant distance $D^*$ of the two segments is the Euclidean distance between the centres:

$$D^*=\lVert x_c-y_c\rVert=\sqrt{\sum_{k=1}^{4}\left(x_{Fkc}-y_{Fkc}\right)^2}$$

the distance between every pair of different segments in the known voice file is calculated in this way, and the maximum and minimum values are taken as the upper bound distance $\overline{D}$ and the lower bound distance $\underline{D}$.
3. The method of claim 2, wherein: the long-term formant distance $D$ of the questioned speech is calculated by the same method as the long-term formant distance $D^*$ of two speech segments in the known voice file.
4. The method of claim 3, wherein the hypothesis test is a t test, carried out as follows:

let the four-band long-term formant measurement data of the questioned speech be $Z_1$, where

$$Z_1=\begin{bmatrix} z_{F11} & \cdots & z_{F1j}\\ z_{F21} & \cdots & z_{F2j}\\ z_{F31} & \cdots & z_{F3j}\\ z_{F41} & \cdots & z_{F4j}\end{bmatrix}$$

where $z_{Fk1},\dots,z_{Fkj}$ ($k=1,\dots,4$) are the first to $j$-th formant measurements of the questioned speech in the $k$-th frequency band;

assume that $x_{F21},x_{F22},\dots,x_{F2m}$ follow the normal distribution $N(u,\sigma^2)$ and that $z_{F21},z_{F22},\dots,z_{F2j}$ follow $N(v,\sigma^2)$; by statistical theory, the second-band formant data satisfy

$$\frac{\bar{x}_{F2}-\bar{z}_{F2}-(u-v)}{S_w\sqrt{\frac{1}{m}+\frac{1}{j}}}\sim t(m+j-2),\qquad S_w^2=\frac{(m-1)S_x^2+(j-1)S_z^2}{m+j-2}$$

where $\bar{x}_{F2}$ and $S_x$ are the mean and standard deviation of $x_{F21},\dots,x_{F2m}$, and $\bar{z}_{F2}$ and $S_z$ are the mean and standard deviation of $z_{F21},\dots,z_{F2j}$;

given a significance level $\alpha$, when

$$\left|\frac{\bar{x}_{F2}-\bar{z}_{F2}}{S_w\sqrt{\frac{1}{m}+\frac{1}{j}}}\right|<t_{\alpha/2}(m+j-2)$$

the questioned speech is judged to share identity with the known voice file, and otherwise judged not to share identity.
5. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein: the processor, when executing the computer program, performs the steps of the method for verifying speech identity based on long-term formant measurements according to any one of claims 1 to 4.
6. A non-transitory computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program when being executed by a processor realizes the steps of a method for verifying speech identity based on long-term formant measurements according to any one of claims 1 to 4.
CN202110510987.1A 2021-05-11 2021-05-11 Voice identity verification method based on long-term formant measurement Active CN113409796B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110510987.1A CN113409796B (en) 2021-05-11 2021-05-11 Voice identity verification method based on long-term formant measurement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110510987.1A CN113409796B (en) 2021-05-11 2021-05-11 Voice identity verification method based on long-term formant measurement

Publications (2)

Publication Number Publication Date
CN113409796A 2021-09-17
CN113409796B 2022-09-27

Family

ID=77678249

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110510987.1A Active CN113409796B (en) 2021-05-11 2021-05-11 Voice identity verification method based on long-term formant measurement

Country Status (1)

Country Link
CN (1) CN113409796B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016209888A1 (en) * 2015-06-22 2016-12-29 Rita Singh Processing speech signals in voice-based profiling
CN111105815A (en) * 2020-01-20 2020-05-05 深圳震有科技股份有限公司 Auxiliary detection method and device based on voice activity detection and storage medium
CN111108552A (en) * 2019-12-24 2020-05-05 广州国音智能科技有限公司 Voiceprint identity identification method and related device
CN111108551A (en) * 2019-12-24 2020-05-05 广州国音智能科技有限公司 Voiceprint identification method and related device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016209888A1 (en) * 2015-06-22 2016-12-29 Rita Singh Processing speech signals in voice-based profiling
CN111108552A (en) * 2019-12-24 2020-05-05 广州国音智能科技有限公司 Voiceprint identity identification method and related device
CN111108551A (en) * 2019-12-24 2020-05-05 广州国音智能科技有限公司 Voiceprint identification method and related device
CN111105815A (en) * 2020-01-20 2020-05-05 深圳震有科技股份有限公司 Auxiliary detection method and device based on voice activity detection and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ERICA GOLD et al.: "Examining correlations between phonetic parameters: Implications for forensic speaker comparison", The Journal of the Acoustical Society of America *
ERICA GOLD et al.: "Examining long-term formant distributions as a discriminant in forensic speaker comparisons under a likelihood ratio framework", The Journal of the Acoustical Society of America *
MICHAEL JESSEN et al.: "Long-term formant distribution as a forensic-phonetic feature", The Journal of the Acoustical Society of America *
CAO Honglin: "Application of long-term formant distribution features in voiceprint identification", Chinese Journal of Forensic Sciences *
JIA Liwen: "Changes in the long-term formant distribution of speech at increased volume and their influence on voiceprint identification", Journal of Shanxi Datong University (Natural Science Edition) *

Also Published As

Publication number Publication date
CN113409796B (en) 2022-09-27

Similar Documents

Publication Publication Date Title
Yu et al. Spoofing detection in automatic speaker verification systems using DNN classifiers and dynamic acoustic features
US9536547B2 (en) Speaker change detection device and speaker change detection method
Becker et al. Forensic speaker verification using formant features and Gaussian mixture models.
US20210125603A1 (en) Acoustic model training method, speech recognition method, apparatus, device and medium
CN101136199B (en) Voice data processing method and equipment
Mandasari et al. Quality measure functions for calibration of speaker recognition systems in various duration conditions
US8271283B2 (en) Method and apparatus for recognizing speech by measuring confidence levels of respective frames
US20090313016A1 (en) System and Method for Detecting Repeated Patterns in Dialog Systems
Jin et al. Cute: A concatenative method for voice conversion using exemplar-based unit selection
Ferragne et al. Vowel systems and accent similarity in the British Isles: Exploiting multidimensional acoustic distances in phonetics
CN113409796B (en) Voice identity verification method based on long-term formant measurement
US20230178099A1 (en) Using optimal articulatory event-types for computer analysis of speech
WO2002029785A1 (en) Method, apparatus, and system for speaker verification based on orthogonal gaussian mixture model (gmm)
Vair et al. Loquendo-Politecnico di torino's 2006 NIST speaker recognition evaluation system.
Laskowski et al. Modeling instantaneous intonation for speaker identification using the fundamental frequency variation spectrum
Kinoshita et al. Sub-band cepstral distance as an alternative to formants: Quantitative evidence from a forensic comparison experiment
Arcienega et al. Pitch-dependent GMMs for text-independent speaker recognition systems.
CN113705671A (en) Speaker identification method and system based on text related information perception
Dusan On the relevance of some spectral and temporal patterns for vowel classification
Al-Manie et al. Automatic speech segmentation using the Arabic phonetic database
Nath et al. Composite feature selection method based on spoken word and speaker recognition
Andreev et al. Attacking the problem of continuous speech segmentation into basic units
Xie et al. Kurtosis normalization after short-time gaussianization for robust speaker verification
Tomar Discriminant feature space transformations for automatic speech recognition
Nath et al. Feature Selection Method for Speaker Recognition using Neural Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant