CN102054480B - Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT) - Google Patents
- Publication number
- CN102054480B
- Authority
- CN
- China
- Prior art keywords
- fundamental frequency
- frame
- signal
- harmonic
- overlapping
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention relates to a method for separating monaural overlapping speech based on the fractional Fourier transform (FrFT), which belongs to the technical field of audio signal processing. The method comprises the following steps: first, the overlapping speech signal is preprocessed to remove its silent segments and to locate the voiced frames; then, FrFT-based pitch detection is performed on the voiced-frame signals to separate the fundamental frequencies of the overlapping speech; finally, each fundamental-frequency track is combined with a sinusoidal model of the speech signal to synthesize speech, yielding each separated speech signal. The advantages of the method are that the fundamental frequencies of several overlapping speakers can be separated and extracted effectively, so that effective separation of the overlapping speech can be achieved; and because the pitch frequencies are extracted with the FrFT rather than the traditional fast Fourier transform (FFT), the smearing of the harmonic spectrum is reduced and more accurate fundamental frequencies of the original signals are obtained. The method is especially suitable for separating monaural overlapping speech containing the voices of two speakers.
Description
Technical field
The present invention relates to a method for separating monaural overlapping speech using the fractional Fourier transform, and belongs to the technical field of audio signal processing.
Background art
An important problem in speech and audio signal processing is how to isolate the speech a listener is interested in from an overlapping (mixed) speech signal. Overlapping-speech separation has important theoretical significance and practical value in areas such as voice communication, acoustic target detection, and speech enhancement. However, because the source signals that make up the overlapping speech overlap completely in both the time and frequency domains, common speech-enhancement methods have difficulty separating the speech of interest (the target speech) from the interfering speech.
The fractional Fourier transform (FrFT) has excellent properties for analyzing certain non-stationary signals and has in recent years become a tool of wide interest in the signal-processing community. For speech, which is a non-stationary signal, current applications of the FrFT and related transforms concentrate on the following aspects: speech analysis, where it provides higher time-frequency resolution than traditional Fourier methods; pitch estimation, where it provides more accurate estimates than classical methods; speech enhancement; speech recognition; and speaker identification.
Research on overlapping-speech separation falls mainly into two categories: auditory scene analysis (ASA) and blind source separation (BSS). ASA research follows two approaches: one starts from human auditory physiology and psychology and studies the rules people use in the speech-recognition process, i.e., auditory scene analysis proper; the other builds models from the findings of auditory-perception research, analyzes the models mathematically, and implements them on a computer, which is the subject of computational auditory scene analysis (CASA). Blind source separation refers to estimating each source-signal component using only some prior knowledge of the observed signals and sources (such as probability densities), when the source signals and the transmission-channel characteristics are unknown. The independent component analysis method for blind source separation was first proposed by P. Comon; it is a technique that grew out of neural networks and statistics and is a very active frontier field.
Existing overlapping-speech separation methods mainly have the following shortcomings:
(1) Research on auditory scene analysis and computational auditory scene analysis is still at an early stage. In particular, the models built in CASA research can only be used to verify theories of ASA that are not yet clear, namely the mechanisms by which the human brain processes audio signals.
Research on blind source separation is very active, but the problem is not yet well solved; it involves the stability and phase-ambiguity problems of multichannel convolutive-mixing and blind-deconvolution systems, especially blind deconvolution when the number of sources is unknown and the observations are noisy.
(2) Separating and extracting the fundamental frequencies of the overlapping speech is the key to overlapping-speech separation in auditory scene analysis, but existing pitch separation and extraction methods for overlapping speech consider only voiced-voiced overlap, not unvoiced-voiced overlap. In the unvoiced frames of a speech signal the excitation is aperiodic, so estimating a fundamental frequency for unvoiced frames has no practical meaning. Moreover, the fundamental frequencies estimated from unvoiced frames are typically highly random and lack continuity, while the fundamental frequencies extracted from overlapping speech are assigned to speakers according to their continuity; fundamental frequencies estimated from unvoiced frames therefore disturb the assignment and in turn degrade the smoothing of the fundamental-frequency tracks.
Summary of the invention
The objective of the present invention is to overcome the defects of the prior art and to solve the problem of how to isolate the target speech from a monaural overlapping speech signal, by proposing a new method for separating monaural overlapping speech based on the fractional Fourier transform.
The technical scheme adopted by the present invention is as follows:
A method for separating monaural overlapping speech based on the fractional Fourier transform comprises the following steps:
Step 1: preprocess the overlapping speech signal, remove its silent segments, and locate the voiced frames.
First, perform endpoint detection on the overlapping speech signal to remove its silent segments; the remaining overlapping segments serve as the object of further processing.
Then divide the remaining overlapping segments into frames, perform voiced/unvoiced decisions, and mark the voiced frames.
Step 2: based on the fractional Fourier transform, perform pitch detection on the voiced-frame signals produced by step 1, and separate the pitch tracks of the overlapping speech, i.e., the fundamental frequency of each source signal. The process is as follows:
First, compute the order of the FrFT from the inter-frame continuity of each frame. Then apply the FrFT to the voiced-frame signal, obtain the harmonic product spectrum, and extract one speaker's fundamental frequency, i.e., the fundamental frequency of one source signal, with a dynamic-programming method.
After one speaker's fundamental frequency is found, subtract from the harmonic product spectrum the spectral components corresponding to the harmonics of that speaker's fundamental frequency, then apply dynamic programming once more to obtain the other speaker's fundamental frequency, i.e., the fundamental frequency of another source signal.
Repeat the above process to obtain the fundamental frequency of each source signal.
Step 3: since a speech signal can be represented as a superposition of a group of sinusoids, synthesize speech from each fundamental-frequency track obtained in step 2, combined with a sinusoidal model of the speech signal, thereby obtaining each separated speech signal.
The positive effects and advantages of the present invention are:
1. The method can effectively separate and extract the fundamental frequencies of several overlapping speakers, and thereby achieve effective separation of the overlapping speech.
2. Extracting the fundamental frequency with the FrFT instead of the traditional FFT (fast Fourier transform) reduces the smearing of the harmonic spectrum.
3. Because every frame of the signal has its own intrinsic frequency-modulation (chirp) rate, the FrFT order can be chosen to match that rate, so that the fundamental frequency of the original signal is obtained more accurately.
The present invention is particularly suitable for separating monaural overlapping speech that contains the voices of two speakers.
Description of drawings
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 is a flow chart of the FrFT-based pitch detection of the overlapping speech signal in the method of the present invention.
Embodiment
A preferred embodiment of the present invention is described further below with reference to the accompanying drawings.
A method for separating monaural overlapping speech based on the fractional Fourier transform, whose flow is shown in Fig. 1, comprises the following steps:
Step 1: preprocess the overlapping speech signal, remove its silent segments, and locate the voiced frames.
First, perform endpoint detection on the overlapping speech signal to remove its silent segments; the remaining overlapping segments serve as the object of further processing. Endpoint detection may combine short-time energy with the zero-crossing rate.
Then divide the remaining overlapping segments into frames, with a frame length of 20 ms and a frame shift of 10 ms, perform voiced/unvoiced decisions, and mark the voiced frames. The voiced/unvoiced decision for an overlapping speech signal differs slightly from that for a single voice: two overlapping voices can be in one of three states: both voiced; one voiced and one unvoiced; or both unvoiced. The decision is made in two steps: first judge whether both overlapping signals are unvoiced; if so, the decision ends; if not, judge whether the overlap is voiced-unvoiced or voiced-voiced. For voiced-unvoiced overlap, only the voiced frames receive subsequent processing; the unvoiced frames are not processed. Frames in which both signals are unvoiced are likewise not processed.
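The preprocessing of step 1 (endpoint detection by short-time energy and zero-crossing rate, followed by 20 ms frames with a 10 ms shift) can be sketched as follows. The function names and the energy threshold are illustrative assumptions; the patent specifies only the feature pair and the frame geometry:

```python
import numpy as np

def frame_signal(x, fs, frame_ms=20, hop_ms=10):
    """Split a signal into overlapping frames: 20 ms frame length, 10 ms shift."""
    flen = int(fs * frame_ms // 1000)
    hop = int(fs * hop_ms // 1000)
    count = 1 + (len(x) - flen) // hop
    return np.stack([x[i * hop : i * hop + flen] for i in range(count)])

def short_time_energy(frames):
    """Per-frame energy, used together with the zero-crossing rate for endpoint detection."""
    return np.sum(frames.astype(float) ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent-sample sign changes per frame."""
    return np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)

# Illustrative endpoint rule: keep frames whose energy exceeds a threshold
# (a real detector would also consult the zero-crossing rate for weak fricatives).
fs = 8000
t = np.arange(fs) / fs                                   # 1 s of tone
x = np.concatenate([np.zeros(fs // 2), np.sin(2 * np.pi * 200 * t)])
frames = frame_signal(x, fs)
energy = short_time_energy(frames)
active = energy > 0.1 * energy.max()                     # silent frames dropped
```

With this setup the leading silent frames are rejected and the tone segment is retained, mirroring the "remaining overlapping segments serve as the object of further processing" step.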
Step 2: perform FrFT-based pitch detection on the voiced frames produced by step 1, and separate the pitch tracks of the overlapping speech, i.e., the fundamental frequency of each source signal. The flow is shown in Fig. 2.
First, compute the order of the FrFT from the inter-frame continuity of each frame. Since the purpose is to solve for the fundamental frequency of the speech signal, and the search for the fundamental frequency exploits its continuity across frames, the order α_i of the FrFT is closely related to the fundamental frequencies of the preceding and following frames and is expressed by the following formula:
where p_{i-1}, p_i, and p_{i+1} are the estimated fundamental frequencies of the previous frame, the current frame, and the next frame, respectively; p_{i-1}, p_i, and p_{i+1} can be obtained by the short-time Fourier transform.
Then apply the FrFT to the voiced-frame signal obtained from step 1, obtain the harmonic product spectrum, and extract one of the pitch tracks, i.e., one speaker's fundamental frequency, with a dynamic-programming method. The detailed process is as follows:
(1) Apply an N-point (for example, 1024-point) fractional Fourier transform to the voiced-frame signal x(n) using the following formula to obtain its magnitude spectrum X(α, k):

X(α, k) = FrFT_N{x(n)}    (1.2)

Then transform the magnitude spectrum X(α, k) to the log domain to obtain the log-magnitude spectrum SLog(α, k):

SLog(α, k) = log10(|X(α, k)|^2)    (1.3)

Sum the log-magnitude spectra SLog(α, k) over all harmonics in the frame to obtain the harmonic product spectrum ρ(α, f) (formula 1.4), where H is the number of harmonics within the sampling bandwidth, h is the harmonic index, f is the fundamental frequency of each frame, and α is the order of each frame.
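Formula 1.4 itself is not reproduced above, but the operation it describes, summing the log-magnitude spectrum at the harmonic positions h·f inside the sampling bandwidth, can be sketched as follows. The sketch uses an ordinary FFT in place of the FrFT (i.e., it fixes the order α), so it illustrates only the harmonic-summation step; the function names are assumptions:

```python
import numpy as np

def harmonic_log_sum(x, fs, f_candidates, num_harmonics=4, nfft=4096):
    """For each candidate fundamental f, sum the log-magnitude spectrum
    at the harmonic frequencies h*f (h = 1..H), per the description of Eq. 1.4."""
    log_spec = np.log10(np.abs(np.fft.rfft(x, nfft)) ** 2 + 1e-12)
    rho = np.zeros(len(f_candidates))
    for j, f in enumerate(f_candidates):
        for h in range(1, num_harmonics + 1):
            k = int(round(h * f * nfft / fs))   # nearest FFT bin of harmonic h*f
            if k < len(log_spec):
                rho[j] += log_spec[k]
    return rho

# A harmonic-rich test tone at 200 Hz: the summed log spectrum peaks at the true f0,
# because only the true fundamental lines up with every harmonic at once.
fs = 8000
t = np.arange(fs // 2) / fs
x = sum(np.sin(2 * np.pi * 200 * h * t) / h for h in range(1, 5))
f_grid = np.arange(50.0, 401.0, 10.0)
rho = harmonic_log_sum(x, fs, f_grid)
best_f0 = f_grid[np.argmax(rho)]
```

A half-frequency candidate (100 Hz) or a double-frequency candidate (400 Hz) collects only some of the large harmonic bins, which is why the sum favors the true fundamental.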
(2) Considering the overlap of two voices, extract from the harmonic product spectrum ρ(α, f) the M candidate peaks that may contain fundamental components. To limit the amount of computation, M is set greater than or equal to 3; for M ≥ 3 the result obtained is essentially unchanged.
The dynamic-programming method requires an objective function: the value of the objective function is computed for every path, and the path with the maximum value is the desired pitch track. To prevent half-frequency or double-frequency errors from occurring during pitch-period estimation, the objective function c(α, f_g) is set to:

c(α, f_g) = k(f_g) * (ρ(α, f_g) - ρ(α, f_g/2))    (1.5)

where f_g is the estimated fundamental frequency of each frame and k(f_g) is a function that decreases with f_g. The decreasing weight k(f_g) avoids double-frequency errors, and subtracting ρ(α, f_g/2) avoids half-frequency errors. Writing (α_i, f_gi) as μ_i, the score function of a path S_i(μ_i) is set by formulas 1.6 and 1.7, where i denotes the frame index and the remaining parameter is the one obtained when selecting the appropriate order and computing the fundamental frequency of frame i-1. Because the fundamental-frequency range of normal speech is 50 Hz to 400 Hz, the search is restricted to this range. Among the candidate peaks of every frame, the value of f that maximizes the score function S_i(μ_i) can be found, and it is taken as the fundamental frequency of one of the speakers in that frame. Likewise, after all frames have been searched, the values can be linked into a pitch track, giving one speaker's pitch track (that speaker's fundamental frequency).
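The objective function of formula 1.5 can be sketched per frame as follows. The decreasing weight k(f_g) = 1/f_g is an illustrative choice (the patent only requires that k decrease with f_g), and the full cross-frame dynamic programming of formulas 1.6 and 1.7 is replaced here by a per-frame maximization over the M candidate peaks:

```python
import numpy as np

def candidate_score(rho, f_grid, f_g, k=lambda f: 1.0 / f):
    """c(f_g) = k(f_g) * (rho(f_g) - rho(f_g/2)); the decreasing weight k penalizes
    double-frequency errors, and subtracting rho(f_g/2) penalizes half-frequency errors."""
    at = lambda f: rho[np.argmin(np.abs(f_grid - f))]
    half = at(f_g / 2.0) if f_g / 2.0 >= f_grid[0] else 0.0
    return k(f_g) * (at(f_g) - half)

def pick_f0(rho, f_grid, num_candidates=3):
    """Score the top-M candidate peaks (M >= 3 per the text) and keep the best one."""
    order = np.argsort(rho)[::-1][:num_candidates]
    cands = f_grid[order]
    scores = [candidate_score(rho, f_grid, f) for f in cands]
    return cands[int(np.argmax(scores))]

# Example: a double-frequency trap in which the lobe at 400 Hz is slightly larger
# than the true peak at 200 Hz; the objective still selects 200 Hz, because
# c(400) is dragged down by the large rho value at its half frequency, 200 Hz.
f_grid = np.arange(50.0, 401.0, 10.0)
rho = np.zeros(len(f_grid))
rho[f_grid == 100.0] = 1.0
rho[f_grid == 200.0] = 10.0
rho[f_grid == 400.0] = 11.0
best = pick_f0(rho, f_grid)
```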
After one speaker's fundamental frequency is found, the spectral components corresponding to the harmonics of that speaker's fundamental frequency are subtracted from the harmonic product spectrum ρ(α, f), and the dynamic-programming method is applied once more to obtain the other speaker's pitch track (that speaker's fundamental frequency), thereby separating the fundamental frequencies of the overlapping speech.
The spectral components corresponding to the harmonics are determined as follows:
To subtract the harmonic spectral components from the harmonic product spectrum, the number of harmonics H_i must first be known, since it determines how many spectral components to subtract. The number of harmonics H_i of frame i is obtained from formula 1.8, where f_i is the fundamental frequency of frame i and f_s is the sampling rate. The relation between the harmonic frequencies f' and the fundamental frequency f is then:

f' = h * f,  h = 2, 3, 4, ..., H    (1.9)

where H is the number of harmonics. Once the harmonic frequencies f' have been obtained, the corresponding spectral components are known.
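Formula 1.8 is not reproduced above; a definition consistent with the description (the number of harmonics of f_i that fit within the sampling bandwidth) is H_i = floor((f_s/2)/f_i), which is an assumption of this sketch. Formula 1.9 then enumerates the harmonic frequencies whose spectral components are subtracted:

```python
def harmonic_count(f0, fs):
    """Assumed Eq. 1.8: number of harmonics of f0 within the bandwidth fs/2."""
    return int((fs / 2) // f0)

def harmonic_frequencies(f0, fs):
    """Eq. 1.9: f' = h * f0 for h = 2, 3, ..., H, i.e. the spectral components
    removed from the harmonic product spectrum once f0 is known."""
    H = harmonic_count(f0, fs)
    return [h * f0 for h in range(2, H + 1)]

# Example: at fs = 8000 Hz, a 1000 Hz fundamental has H = 4 harmonics in band,
# so the components at 2000, 3000, and 4000 Hz are removed.
```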
Step 3: since a speech signal can be represented as a superposition of a group of sinusoids, synthesize speech from each fundamental-frequency track f_i obtained in step 2, combined with a sinusoidal model of the speech signal, thereby obtaining each separated speech signal.
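Step 3 can be sketched as follows: the frame-wise fundamental-frequency track is expanded to sample rate and a bank of harmonic sinusoids follows it through a running phase. The 1/h amplitude roll-off is an illustrative assumption; a full sinusoidal model would also estimate per-harmonic amplitudes and phases from the analysis stage:

```python
import numpy as np

def synthesize_from_f0(f0_track, fs, hop_ms=10, num_harmonics=4):
    """Synthesize a voiced signal from a per-frame f0 track with a sinusoidal model.
    The phase is the cumulative sum of the instantaneous frequency, so the track
    may vary from frame to frame without phase discontinuities."""
    hop = int(fs * hop_ms // 1000)
    f0 = np.repeat(np.asarray(f0_track, dtype=float), hop)  # f0 track at sample rate
    phase = 2.0 * np.pi * np.cumsum(f0) / fs
    y = np.zeros(len(f0))
    for h in range(1, num_harmonics + 1):
        y += np.sin(h * phase) / h      # assumed 1/h harmonic amplitude roll-off
    return y

# A constant 200 Hz track over 50 frames gives 0.5 s of a 200 Hz harmonic tone.
fs = 8000
y = synthesize_from_f0([200.0] * 50, fs)
```

Running the synthesis per separated track and adding the unprocessed unvoiced residue back would yield the individual output signals described in the text.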
Claims (4)
1. A method for separating monaural overlapping speech based on the fractional Fourier transform, characterized by comprising the following steps:
Step 1: preprocess the overlapping speech signal, remove its silent segments, and locate the voiced frames;
Step 2: based on the fractional Fourier transform, perform pitch detection on the voiced-frame signals produced by step 1, and separate the pitch tracks of the overlapping speech, i.e., the fundamental frequency of each source signal, the process being as follows:
First, compute the order of the FrFT from the inter-frame continuity of each frame; then apply the FrFT to the voiced-frame signal, obtain the harmonic product spectrum, and extract one speaker's fundamental frequency, i.e., the fundamental frequency of one source signal, with a dynamic-programming method;
After one speaker's fundamental frequency is found, subtract from the harmonic product spectrum the spectral components corresponding to the harmonics of that speaker's fundamental frequency, then apply dynamic programming once more to obtain the other speaker's fundamental frequency, i.e., the fundamental frequency of another source signal;
Repeat the above process to obtain the fundamental frequency of each source signal;
Step 3: synthesize speech from each fundamental-frequency track obtained in step 2, combined with a sinusoidal model of the speech signal, thereby obtaining each separated speech signal.
2. The method for separating monaural overlapping speech based on the fractional Fourier transform according to claim 1, characterized in that, in step 1, after the silent segments are removed, the remaining overlapping segments are divided into frames and processed as follows:
The frame length is 20 ms and the frame shift is 10 ms; voiced/unvoiced decisions are then made and the voiced frames are marked; the voiced/unvoiced decision for overlapping speech is made in two steps: first judge whether both overlapping signals are unvoiced; if so, the decision ends; if not, judge whether the overlap is voiced-unvoiced or voiced-voiced; for voiced-unvoiced overlap, only the voiced frames receive subsequent processing and the unvoiced frames are not processed; frames in which both signals are unvoiced are likewise not processed.
3. The method for separating monaural overlapping speech based on the fractional Fourier transform according to claim 1 or claim 2, characterized in that, in step 2, when the order of the FrFT is computed, the order α_i of the FrFT is expressed in terms of the fundamental frequencies of the preceding and following frames by the following formula:
where p_{i-1}, p_i, and p_{i+1} are the estimated fundamental frequencies of the previous frame, the current frame, and the next frame, respectively.
4. The method for separating monaural overlapping speech based on the fractional Fourier transform according to claim 1 or claim 2, characterized in that, after the order of the FrFT has been computed, the FrFT is applied to the voiced-frame signal obtained from step 1, the harmonic product spectrum is obtained, and one of the pitch tracks, i.e., a fundamental frequency, is extracted with a dynamic-programming method, the detailed process being as follows:
(1) Apply an N-point fractional Fourier transform to the voiced-frame signal x(n) using the following formula to obtain its magnitude spectrum X(α, k):

X(α, k) = FrFT_N{x(n)}    (1.2)

Transform the magnitude spectrum X(α, k) to the log domain to obtain the log-magnitude spectrum SLog(α, k):

SLog(α, k) = log10(|X(α, k)|^2)    (1.3)

Sum the log-magnitude spectra SLog(α, k) over all harmonics in the frame to obtain the harmonic product spectrum ρ(α, f) (formula 1.4), where H is the number of harmonics within the sampling bandwidth, h is the harmonic index, f is the fundamental frequency of each frame, and α is the order of each frame;
(2) Extract from the harmonic product spectrum ρ(α, f) the M candidate peaks that may contain fundamental components, M being greater than or equal to 3;
The dynamic-programming method requires an objective function: the value of the objective function is computed for every path, and the path with the maximum value is the desired fundamental-frequency track; the objective function c(α, f_g) is set to:

c(α, f_g) = k(f_g) * (ρ(α, f_g) - ρ(α, f_g/2))    (1.5)

where f_g is the estimated fundamental frequency of each frame and k(f_g) is a function that decreases with f_g; writing (α_i, f_gi) as μ_i, the score function of a path S_i(μ_i) is set by formulas 1.6 and 1.7, where i denotes the frame index and the remaining parameter is the one obtained when selecting the appropriate order and computing the fundamental frequency of frame i-1; because the fundamental-frequency range of normal speech is 50 Hz to 400 Hz, the search is restricted to this range; among the candidate peaks of every frame the value f_gi that maximizes the score function S_i(μ_i) can be found, and it is taken as the fundamental frequency of one of the speakers in that frame; likewise, after all frames have been searched, the values can be linked into a pitch track, giving one speaker's fundamental frequency;
After one speaker's fundamental frequency is found, the spectral components corresponding to the harmonics of that speaker's fundamental frequency are subtracted from the harmonic product spectrum ρ(α, f), and the dynamic-programming method is applied once more to obtain the other speaker's fundamental frequency, thereby separating the pitch tracks of the overlapping speech;
The spectral components corresponding to the harmonics are determined as follows:
To subtract the harmonic spectral components from the harmonic product spectrum, the number of harmonics H_i must first be known, since it determines how many spectral components to subtract; the number of harmonics H_i of frame i is obtained from formula 1.8, where f_i is the fundamental frequency of frame i and f_s is the sampling rate; the relation between the harmonic frequencies f' and the fundamental frequency f is then:

f' = h * f,  h = 2, 3, 4, ..., H    (1.9)

where H is the number of harmonics; once the harmonic frequencies f' have been obtained, the corresponding spectral components are known.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2009102359018A CN102054480B (en) | 2009-10-29 | 2009-10-29 | Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT) |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102054480A CN102054480A (en) | 2011-05-11 |
CN102054480B true CN102054480B (en) | 2012-05-30 |
Legal Events

- C06 / PB01: Publication
- C10 / SE01: Entry into force of request for substantive examination
- C14 / GR01: Patent grant
- C17 / CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 2012-05-30; termination date: 2012-10-29)