CN102054480B - Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT) - Google Patents

Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT) Download PDF

Info

Publication number
CN102054480B
CN102054480B CN2009102359018A CN200910235901A
Authority
CN
China
Prior art keywords
fundamental frequency
frame
signal
harmonic wave
aliasing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009102359018A
Other languages
Chinese (zh)
Other versions
CN102054480A (en)
Inventor
茹婷婷
谢湘
匡镜明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN2009102359018A priority Critical patent/CN102054480B/en
Publication of CN102054480A publication Critical patent/CN102054480A/en
Application granted
Publication of CN102054480B publication Critical patent/CN102054480B/en

Abstract

The invention relates to a method for separating monaural overlapped speech based on the fractional Fourier transform (FrFT), belonging to the technical field of audio signal processing. The method comprises the following steps: first, the overlapped speech signal is preprocessed to remove its silent segments and locate the voiced frames; next, FrFT-based pitch detection is performed on the voiced-frame signals to separate the fundamental frequencies of the overlapped speech; finally, the fundamental frequencies are combined with a sinusoidal model of the speech signal to synthesize speech, yielding each separated speech signal. The advantages of the method are that the fundamental frequencies of several overlapped voices can be separated and extracted effectively, so that the overlapped speech is effectively separated; and because the pitch frequencies are extracted with the FrFT rather than the traditional fast Fourier transform (FFT), the smearing of the harmonic spectrum is reduced and more accurate fundamental frequencies of the original signals are obtained. The method is especially suitable for separating monaural overlapped speech containing the voices of two speakers.

Description

A monaural overlapped-speech separation method based on the fractional Fourier transform
Technical field
The present invention relates to a method that uses the fractional Fourier transform to separate monaural overlapped speech, and belongs to the technical field of audio signal processing.
Background technology
An important problem in the field of speech and audio signal processing is how to isolate the speech a listener is interested in from an overlapped (mixed) speech signal. Overlapped-speech separation has significant theoretical and practical value in speech communication, acoustic target detection, speech enhancement, and related areas. However, because the source signals that make up the overlapped speech overlap completely in both the time domain and the frequency domain, common speech-enhancement methods have difficulty separating the speech of interest (called the target speech) from the interfering speech.
The fractional Fourier transform (Fractional Fourier Transform, FrFT) has excellent properties for analyzing certain non-stationary signals and has in recent years become a tool attracting wide attention in the signal-processing community. For speech, which is a non-stationary signal, applications of the FrFT and similar transforms currently concentrate on the following aspects: speech analysis, where it offers higher time-frequency resolution than traditional Fourier methods; pitch estimation, where it yields more accurate estimates than classical methods; speech enhancement; speech recognition; and speaker identification.
Research on overlapped-speech separation falls mainly into two categories: auditory scene analysis (Auditory Scene Analysis, ASA) and blind source separation (Blind Source Separation, BSS). Auditory scene analysis is studied in two ways. One starts from human auditory physiology and psychology and investigates the rules people follow when recognizing speech, i.e., auditory scene analysis proper. The other builds models from the findings on human auditory perception, analyzes the models mathematically, and implements them on a computer; this is the subject of computational auditory scene analysis (Computational Auditory Scene Analysis, CASA). Blind source separation refers to estimating the source-signal components solely from the observed signal and some prior knowledge of the sources (such as probability densities), when the source signals and the transmission-channel characteristics are unknown. The independent component analysis method for blind source separation was first proposed by P. Comon; it is a technique built on neural-network and statistical foundations, and remains a very active research frontier.
The existing overlapped-speech separation methods mainly have the following deficiencies:
(1) Research on auditory scene analysis and computational auditory scene analysis is still at an early stage. In computational auditory scene analysis in particular, the models built so far can only be used to verify theories of auditory scene analysis that are not yet well established, namely the mechanisms by which the human brain processes audio signals.
Research on blind source separation is very active, but the problem is still far from solved; it involves the stability and phase-ambiguity problems of multichannel convolutive mixing and blind deconvolution systems, in particular the blind deconvolution problem when the number of sources is unknown and the case of noisy mixtures.
(2) Separating and extracting the fundamental frequencies of the overlapped voices is the key to overlapped-speech separation in auditory scene analysis, but the existing pitch separation-and-extraction methods consider only voiced-voiced overlap and not unvoiced-voiced overlap. The reason is that in the unvoiced frames of a speech signal the excitation is aperiodic, so estimating a fundamental frequency for unvoiced frames is meaningless. Moreover, the fundamental frequencies estimated for unvoiced frames are typically highly random and lack continuity, yet the fundamental frequencies extracted from overlapped speech are assigned to speakers on the basis of pitch continuity. Consequently, fundamental frequencies estimated for unvoiced frames disturb the pitch-assignment decision and in turn degrade the smoothing of the pitch contours.
Summary of the invention
The objective of the present invention is to solve the problem of isolating target speech from a monaural overlapped speech signal and, overcoming the defects of the prior art, to propose a new monaural overlapped-speech separation method based on the fractional Fourier transform.
The technical scheme adopted by the present invention is as follows:
A monaural overlapped-speech separation method based on the fractional Fourier transform comprises the following steps:
Step 1: preprocess the overlapped speech signal, remove its silent segments, and locate the voiced frames.
First, endpoint detection is applied to the overlapped speech signal to remove its silent segments; the remaining overlapped segments are taken as the object of processing.
Then, the remaining overlapped segments are divided into frames, a voiced/unvoiced decision is made, and the voiced frames are marked.
Step 2: based on the fractional Fourier transform, perform pitch detection on the voiced-frame signals obtained in step 1 and separate the pitch contours of the overlapped speech, that is, the fundamental frequency of each source signal. The process is as follows:
First, the order of the FrFT is computed from the frame-to-frame continuity of the signal. Then the voiced-frame signal is transformed with the FrFT, the harmonic-sum spectrum is computed, and the fundamental frequency of one speaker, i.e., of one source signal, is extracted with a dynamic-programming method.
After one speaker's fundamental frequency has been found, the spectral components of that speaker's fundamental and its harmonics are subtracted from the harmonic-sum spectrum, and dynamic programming is applied once more to obtain the other speaker's fundamental frequency, i.e., the fundamental frequency of the other source signal.
Repeating this process yields the fundamental frequency of every source signal.
Step 3: since a speech signal can be represented as a superposition of a set of sinusoids, each pitch contour obtained in step 2 is combined with a sinusoidal model of the speech signal to synthesize speech, yielding each separated speech signal.
The positive effects and advantages of the present invention are:
1. The method can effectively separate and extract the fundamental frequencies of several overlapped voices, and thereby achieve effective separation of the overlapped speech.
2. The fundamental frequency is extracted with the FrFT instead of the traditional fast Fourier transform (FFT), which reduces the smearing of the harmonic spectrum.
3. Because every frame of the signal has its own intrinsic chirp (frequency-modulation) rate, the FrFT allows a suitable order to be chosen to match that rate, so the fundamental frequency of the original signal is obtained more accurately.
The present invention is especially suitable for separating monaural overlapped speech that contains the voices of two speakers.
Description of drawings
Fig. 1 is the flow chart of the implementation of the method of the invention.
Fig. 2 is the flow chart of FrFT-based pitch detection of the overlapped speech signal in the method of the invention.
Embodiment
A preferred embodiment of the present invention is described below with reference to the accompanying drawings.
A monaural overlapped-speech separation method based on the fractional Fourier transform, whose implementation flow is shown in Fig. 1, comprises the following steps:
Step 1: preprocess the overlapped speech signal, remove its silent segments, and locate the voiced frames.
First, endpoint detection is applied to the overlapped speech signal to remove its silent segments; the remaining overlapped segments are taken as the object of processing. Endpoint detection can combine short-time energy with the zero-crossing rate.
Then the remaining overlapped segments are divided into frames with a frame length of 20 ms and a frame shift of 10 ms, a voiced/unvoiced decision is made, and the voiced frames are marked. The voiced/unvoiced decision for an overlapped speech signal differs slightly from that for a single voice: when two voices overlap there are three possible cases — both voiced, one unvoiced and one voiced, or both unvoiced. The decision is made in two steps: first judge whether both overlapped signals are unvoiced; if so, the decision ends; if not, judge whether the mixture is unvoiced-voiced or voiced-voiced. In the unvoiced-voiced case only the voiced frames receive subsequent processing and the unvoiced frames are left untouched; frames in which both signals are unvoiced are likewise not processed.
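As a concrete illustration of this preprocessing step, the 20 ms / 10 ms framing and a short-time-energy/zero-crossing-rate frame classification can be sketched as follows (a minimal sketch: the thresholds and the three-way silence/voiced/unvoiced rule are illustrative assumptions, not the patent's exact decision logic for overlapped frames):

```python
import numpy as np

def frame_signal(x, fs, frame_ms=20, hop_ms=10):
    """Split a signal into overlapping frames (20 ms frames, 10 ms shift)."""
    flen = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    n = 1 + max(0, (len(x) - flen) // hop)
    return np.stack([x[i * hop : i * hop + flen] for i in range(n)])

def classify_frames(frames, energy_thr, zcr_thr):
    """Label each frame: low short-time energy -> silence; otherwise a low
    zero-crossing rate suggests voiced, a high one unvoiced.  Both
    thresholds are illustrative assumptions."""
    energy = np.sum(frames.astype(float) ** 2, axis=1)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.where(energy < energy_thr, "silence",
                    np.where(zcr < zcr_thr, "voiced", "unvoiced"))
```

Only the frames labelled voiced would be passed on to the FrFT-based pitch detection of step 2.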
Step 2: perform pitch detection on the voiced frames obtained in step 1 using the fractional Fourier transform, and separate the pitch contours of the overlapped speech, i.e., the fundamental frequency of each source signal. The implementation flow is shown in Fig. 2.
First, the order of the FrFT is computed from the frame-to-frame continuity of the signal. Since the goal is to solve for the fundamental frequency of the speech signal, and the search exploits the continuity of pitch across frames, the FrFT order α_i is closely related to the fundamental frequencies of the neighboring frames and is expressed as:

α_i = 1 − |(p_i − p_{i−1}) / (p_i + p_{i+1})|    (1.1)

where p_{i−1}, p_i and p_{i+1} are the estimated fundamental frequencies of the previous, current and next frame, respectively; they can be obtained with the short-time Fourier transform.
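Equation 1.1 translates directly into code; a one-line sketch, assuming the three coarse STFT-based pitch estimates are already available:

```python
def frft_order(p_prev, p_cur, p_next):
    """FrFT order alpha_i of Eq. 1.1, computed from the coarse pitch
    estimates p_{i-1}, p_i, p_{i+1} of the previous, current and next
    frame (obtained beforehand with an ordinary short-time Fourier
    analysis).  A smooth pitch track gives an order close to 1, i.e.
    close to the ordinary Fourier transform."""
    return 1.0 - abs((p_cur - p_prev) / (p_cur + p_next))
```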
Then the voiced-frame signal obtained in step 1 is transformed with the FrFT, the harmonic-sum spectrum is computed, and one of the pitch contours, i.e., one speaker's fundamental frequency, is extracted with a dynamic-programming method. The detailed process is as follows:
(1) Apply an N-point (for example, 1024-point) fractional Fourier transform to the voiced-frame signal x(n) to obtain its magnitude spectrum X(α, k):

X(α, k) = FrFT_N{x(n)}    (1.2)

Transform the magnitude spectrum X(α, k) to the logarithmic domain to obtain the log-magnitude spectrum SLog(α, k):

SLog(α, k) = log₁₀(|X(α, k)|²)    (1.3)

Sum the log-magnitude spectrum SLog(α, k) over all harmonics in the frame to obtain the harmonic-sum spectrum ρ(α, f):

ρ(α, f) = (1/H) Σ_{h=1}^{H} SLog(α, hf)    (1.4)

In Eq. 1.4, H is the number of harmonics within the sampling bandwidth, h is the harmonic index, f is the candidate fundamental frequency of the frame, and α is the order of the frame.
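The harmonic summation of Eqs. 1.3–1.4 can be sketched as below. The sketch substitutes an ordinary FFT magnitude spectrum for the order-α FrFT spectrum (implementing a discrete FrFT is outside its scope; for α = 1 the two coincide) — the harmonic-summation machinery is identical in either case. The grid of candidate fundamentals is an assumption:

```python
import numpy as np

def harmonic_log_sum_spectrum(x, fs, f_grid, H=5, n_fft=1024):
    """rho(f) = (1/H) * sum_{h=1..H} log10(|X(h*f)|^2) over the candidate
    fundamentals in f_grid (Eqs. 1.3-1.4).  |X| is a magnitude spectrum;
    the patent evaluates it on the order-alpha FrFT spectrum, here an
    ordinary FFT is used as a stand-in."""
    X = np.abs(np.fft.rfft(x, n_fft))
    log_spec = np.log10(X ** 2 + 1e-12)      # small floor avoids log(0)
    bin_hz = fs / n_fft
    rho = np.zeros(len(f_grid))
    for j, f in enumerate(f_grid):
        # nearest spectral bin of each harmonic h*f below Nyquist
        bins = np.round(np.arange(1, H + 1) * f / bin_hz).astype(int)
        bins = bins[bins < len(log_spec)]
        rho[j] = log_spec[bins].mean()
    return rho
```

A true fundamental shows up as a clear maximum of rho over the candidate grid, since all of its harmonics land on high-energy bins simultaneously.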
(2) Considering that two voices overlap, extract from the harmonic-sum spectrum ρ(α, f) the M candidate peaks that may contain a fundamental component. With computational cost in mind, M is taken greater than or equal to 3; for M ≥ 3 the result is essentially unchanged.
The dynamic-programming method requires an objective function; the value of the objective is computed for every path, and the path with the maximum value is the desired pitch contour. To prevent halving and doubling errors in the estimation of the pitch period, the objective function c(α, f_g) is set to:

c(α, f_g) = k(f_g) · (ρ(α, f_g) − ρ(α, f_g/2))    (1.5)

In Eq. 1.5, f_g is the estimated fundamental frequency of the frame and k(f_g) is a function that decreases with f_g. The decreasing weight k(f_g) avoids doubling errors, and the term ρ(α, f_g/2) avoids halving errors. Writing (α_i, f_gi) as μ_i, the score function S_i(μ_i) of a path is set to:

S_i(μ_i) = S_{i−1}(μ*_{i−1}) + c(μ_i)    (1.6)

μ*_{i−1} = argmax_{μ_{i−1}} [S_{i−1}(μ_{i−1}) + c(μ_i)]    (1.7)

In Eqs. 1.6 and 1.7, i denotes the frame index, and μ*_{i−1} is the parameter chosen when selecting the appropriate order and obtaining the fundamental frequency of frame i−1. Because the fundamental frequency of normal speech lies between 50 Hz and 400 Hz, the search is restricted to this range; among the candidate peaks of every frame the f value that maximizes the score function S_i(μ_i) can be found and is taken as the fundamental frequency of one of the speakers in that frame. After all frames have been searched, the per-frame values are linked into a pitch contour, giving one speaker's pitch contour (that speaker's fundamental frequency).
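The path search of Eqs. 1.5–1.7 can be sketched as follows. The specific decreasing weight k(f) = 1/f and the reduction of the path recursion to a per-frame best-candidate choice (valid here because the score of Eq. 1.6 is additive over frames) are simplifying assumptions for illustration:

```python
import numpy as np

def objective(rho, f_grid, f):
    """c(f) = k(f) * (rho(f) - rho(f/2)) of Eq. 1.5.  The decreasing
    weight k(f) = 1/f (an assumed choice) penalises octave-doubling
    errors; the subtracted rho(f/2) term penalises octave-halving."""
    k = 1.0 / f
    r = np.interp(f, f_grid, rho)
    r_half = np.interp(f / 2.0, f_grid, rho)
    return k * (r - r_half)

def track_pitch(cands, scores):
    """Accumulate the best-scoring path over frames (Eqs. 1.6-1.7).
    cands[i] lists the candidate fundamentals of frame i (in the
    50-400 Hz search range), scores[i] their objective values."""
    path, total = [], 0.0
    for c, s in zip(cands, scores):
        j = int(np.argmax(s))       # best candidate of this frame
        path.append(c[j])
        total += s[j]
    return path, total
```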
After one speaker's fundamental frequency has been found, the spectral components of that speaker's fundamental and its harmonics are subtracted from the harmonic-sum spectrum ρ(α, f), and the dynamic-programming method is applied once more to obtain the other speaker's pitch contour (that speaker's fundamental frequency), thereby separating the fundamental frequencies of the overlapped speech.
The spectral components corresponding to the harmonics are obtained as follows:
To subtract the harmonic spectral components from the harmonic-sum spectrum, the number of harmonics H_i must first be known, since it determines how many spectral components are to be subtracted. The number of harmonics of the i-th frame is obtained from Eq. 1.8:

H_i = (f_s/2) / f_i    (1.8)

In Eq. 1.8, f_i is the fundamental frequency of the i-th frame and f_s is the sampling rate. The harmonic frequencies f′ are then related to the fundamental frequency f by:

f′ = h · f,  h = 2, 3, 4, …, H    (1.9)

In Eq. 1.9, H is the number of harmonics. Once the harmonic frequencies f′ are obtained, the corresponding spectral components are known.
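Equations 1.8–1.9 amount to a few lines of code (rounding the harmonic count down to an integer is an assumption implicit in the formula):

```python
def harmonic_count(f0, fs):
    """H_i of Eq. 1.8: number of harmonics of f0 below the Nyquist
    frequency fs/2, rounded down to an integer."""
    return int((fs / 2) // f0)

def harmonic_frequencies(f0, fs):
    """Harmonic frequencies f' = h*f0 for h = 2..H (Eq. 1.9); these are
    the spectral components removed from the harmonic-sum spectrum once
    the first speaker's fundamental has been found."""
    H = harmonic_count(f0, fs)
    return [h * f0 for h in range(2, H + 1)]
```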
Step 3: since a speech signal can be represented as a superposition of a set of sinusoids, each fundamental-frequency contour f_i obtained in step 2 is combined with a sinusoidal model of the speech signal to synthesize speech, yielding each separated speech signal.
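Step 3 can be sketched with a minimal sinusoidal-model synthesizer. Constant unit amplitudes and a fixed number of harmonics are illustrative assumptions; a full implementation would also estimate per-harmonic amplitudes and phases from the analysis stage:

```python
import numpy as np

def synthesize_voiced(f0_track, fs, hop_ms=10, n_harm=5):
    """Resynthesize one separated voice from its frame-wise fundamental
    frequencies f0_track as a sum of harmonic sinusoids, keeping the
    phase of each harmonic continuous across frame boundaries."""
    hop = int(fs * hop_ms / 1000)
    phase = np.zeros(n_harm)          # running phase of each harmonic
    n = np.arange(hop)
    out = []
    for f0 in f0_track:
        frame = np.zeros(hop)
        for h in range(1, n_harm + 1):
            w = 2.0 * np.pi * h * f0 / fs          # radians per sample
            frame += np.sin(phase[h - 1] + w * n)
            # advance the running phase so the next frame joins smoothly
            phase[h - 1] = (phase[h - 1] + w * hop) % (2.0 * np.pi)
        out.append(frame)
    return np.concatenate(out)
```

With a constant pitch track the output is exactly a stationary harmonic tone; with a varying track the per-frame phase update keeps the waveform free of discontinuities at frame boundaries.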

Claims (4)

1. A monaural overlapped-speech separation method based on the fractional Fourier transform, characterized by comprising the following steps:
Step 1: preprocessing the overlapped speech signal, removing its silent segments, and locating the voiced frames;
Step 2: based on the fractional Fourier transform, performing pitch detection on the voiced-frame signals obtained in step 1 and separating the pitch contours of the overlapped speech, that is, the fundamental frequency of each source signal, the process being as follows:
first, computing the order of the FrFT from the frame-to-frame continuity of the signal; then transforming the voiced-frame signal with the FrFT, computing the harmonic-sum spectrum, and extracting one speaker's fundamental frequency, i.e., the fundamental frequency of one source signal, with a dynamic-programming method;
after one speaker's fundamental frequency has been found, subtracting the spectral components of that speaker's fundamental and its harmonics from the harmonic-sum spectrum, and applying dynamic programming once more to obtain the other speaker's fundamental frequency, i.e., the fundamental frequency of the other source signal;
repeating this process to obtain the fundamental frequency of each source signal;
Step 3: combining each pitch contour obtained in step 2 with a sinusoidal model of the speech signal to synthesize speech, yielding each separated speech signal.
2. The monaural overlapped-speech separation method based on the fractional Fourier transform as claimed in claim 1, characterized in that in step 1, after the silent segments have been removed, the remaining overlapped segments are divided into frames as follows:
the frame length is 20 ms and the frame shift is 10 ms; a voiced/unvoiced decision is then made and the voiced frames are marked; the voiced/unvoiced decision for the overlapped speech is made in two steps: first judging whether both overlapped signals are unvoiced and, if so, ending the decision; if not, judging whether the mixture is unvoiced-voiced or voiced-voiced; in the unvoiced-voiced case only the voiced frames receive subsequent processing and the unvoiced frames are not processed; frames in which both signals are unvoiced are likewise not processed.
3. The monaural overlapped-speech separation method based on the fractional Fourier transform as claimed in claim 1 or 2, characterized in that in step 2, when the order of the FrFT is computed, the order α_i of the FrFT is expressed in terms of the fundamental frequencies of the neighboring frames as:

α_i = 1 − |(p_i − p_{i−1}) / (p_i + p_{i+1})|    (1.1)

where p_{i−1}, p_i and p_{i+1} are the estimated fundamental frequencies of the previous, current and next frame, respectively.
4. The monaural overlapped-speech separation method based on the fractional Fourier transform as claimed in claim 1 or 2, characterized in that, after the order of the FrFT has been computed, the voiced-frame signal obtained in step 1 is transformed with the FrFT, the harmonic-sum spectrum is computed, and a pitch contour, i.e., a fundamental frequency, is extracted with a dynamic-programming method; the detailed process is as follows:
(1) applying an N-point fractional Fourier transform to the voiced-frame signal x(n) to obtain its magnitude spectrum X(α, k):

X(α, k) = FrFT_N{x(n)}    (1.2)

transforming the magnitude spectrum X(α, k) to the logarithmic domain to obtain the log-magnitude spectrum SLog(α, k):

SLog(α, k) = log₁₀(|X(α, k)|²)    (1.3)

summing the log-magnitude spectrum SLog(α, k) over all harmonics in the frame to obtain the harmonic-sum spectrum ρ(α, f):

ρ(α, f) = (1/H) Σ_{h=1}^{H} SLog(α, hf)    (1.4)

in Eq. 1.4, H being the number of harmonics within the sampling bandwidth, h the harmonic index, f the candidate fundamental frequency of the frame, and α the order of the frame;
(2) extracting from the harmonic-sum spectrum ρ(α, f) the M candidate peaks that may contain a fundamental component, with M greater than or equal to 3;

the dynamic-programming method requires an objective function; the value of the objective is computed for every path, and the path with the maximum value is the desired fundamental frequency; the objective function c(α, f_g) is set to:

c(α, f_g) = k(f_g) · (ρ(α, f_g) − ρ(α, f_g/2))    (1.5)

in Eq. 1.5, f_g being the estimated fundamental frequency of the frame and k(f_g) a function that decreases with f_g; writing (α_i, f_gi) as μ_i, the score function S_i(μ_i) of a path is set to:

S_i(μ_i) = S_{i−1}(μ*_{i−1}) + c(μ_i)    (1.6)

μ*_{i−1} = argmax_{μ_{i−1}} [S_{i−1}(μ_{i−1}) + c(μ_i)]    (1.7)

in Eqs. 1.6 and 1.7, i denotes the frame index, and μ*_{i−1} is the parameter chosen when selecting the appropriate order and obtaining the fundamental frequency of frame i−1; because the fundamental frequency of normal speech lies between 50 Hz and 400 Hz, the search is restricted to this range; among the candidate peaks of every frame the value f_gi that maximizes the score function S_i(μ_i) can be found and is taken as the fundamental frequency of one of the speakers in that frame; after all frames have been searched, the per-frame values are linked into a pitch contour, giving one speaker's fundamental frequency;
after one speaker's fundamental frequency has been found, the spectral components of that speaker's fundamental and its harmonics are subtracted from the harmonic-sum spectrum ρ(α, f), and the dynamic-programming method is applied once more to obtain the other speaker's fundamental frequency, thereby separating the pitch contours of the overlapped speech;
the spectral components corresponding to the harmonics are obtained as follows:
to subtract the harmonic spectral components from the harmonic-sum spectrum, the number of harmonics H_i must first be known, since it determines how many spectral components are to be subtracted; the number of harmonics of the i-th frame is obtained from Eq. 1.8:

H_i = (f_s/2) / f_i    (1.8)

in Eq. 1.8, f_i being the fundamental frequency of the i-th frame and f_s the sampling rate; the harmonic frequencies f′ are then related to the fundamental frequency f by:

f′ = h · f,  h = 2, 3, 4, …, H    (1.9)

in Eq. 1.9, H being the number of harmonics; once the harmonic frequencies f′ have been obtained, the corresponding spectral components are known.
CN2009102359018A 2009-10-29 2009-10-29 Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT) Expired - Fee Related CN102054480B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009102359018A CN102054480B (en) 2009-10-29 2009-10-29 Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009102359018A CN102054480B (en) 2009-10-29 2009-10-29 Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT)

Publications (2)

Publication Number Publication Date
CN102054480A CN102054480A (en) 2011-05-11
CN102054480B true CN102054480B (en) 2012-05-30

Family

ID=43958735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102359018A Expired - Fee Related CN102054480B (en) 2009-10-29 2009-10-29 Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT)

Country Status (1)

Country Link
CN (1) CN102054480B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103854644B (en) * 2012-12-05 2016-09-28 中国传媒大学 The automatic dubbing method of monophonic multitone music signal and device
CN103117061B (en) * 2013-02-05 2016-01-20 广东欧珀移动通信有限公司 A kind of voice-based animals recognition method and device
CN104078051B (en) * 2013-03-29 2018-09-25 南京中兴软件有限责任公司 A kind of voice extracting method, system and voice audio frequency playing method and device
EP2980801A1 (en) 2014-07-28 2016-02-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method for estimating noise in an audio signal, noise estimator, audio encoder, audio decoder, and system for transmitting audio signals
CN106571150B (en) * 2015-10-12 2021-04-16 阿里巴巴集团控股有限公司 Method and system for recognizing human voice in music
CN106611604B (en) * 2015-10-23 2020-04-14 中国科学院声学研究所 Automatic voice superposition detection method based on deep neural network
CN105590633A (en) * 2015-11-16 2016-05-18 福建省百利亨信息科技有限公司 Method and device for generation of labeled melody for song scoring
CN106847267B (en) * 2015-12-04 2020-04-14 中国科学院声学研究所 Method for detecting overlapped voice in continuous voice stream
CN105551501B (en) * 2016-01-22 2019-03-15 大连民族大学 Harmonic signal fundamental frequency estimation algorithm and device
CN107657962B (en) * 2017-08-14 2020-06-12 广东工业大学 Method and system for identifying and separating throat sound and gas sound of voice signal
CN109065025A (en) * 2018-07-30 2018-12-21 珠海格力电器股份有限公司 A kind of computer storage medium and a kind of processing method and processing device of audio
CN109346109B (en) * 2018-12-05 2020-02-07 百度在线网络技术(北京)有限公司 Fundamental frequency extraction method and device
CN111125423A (en) * 2019-11-29 2020-05-08 维沃移动通信有限公司 Denoising method and mobile terminal
CN111613243B (en) * 2020-04-26 2023-04-18 云知声智能科技股份有限公司 Voice detection method and device
CN113362840B (en) * 2021-06-02 2022-03-29 浙江大学 General voice information recovery device and method based on undersampled data of built-in sensor
WO2023092368A1 (en) * 2021-11-25 2023-06-01 广州酷狗计算机科技有限公司 Audio separation method and apparatus, and device, storage medium and program product

Also Published As

Publication number Publication date
CN102054480A (en) 2011-05-11


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120530

Termination date: 20121029