CN104599682A - Method for extracting pitch period of telephone wire quality voice - Google Patents

Method for extracting pitch period of telephone wire quality voice

Info

Publication number
CN104599682A
Authority
CN
China
Prior art keywords
time domain
sound
autocorrelation function
pitch period
telephone wire
Legal status
Pending
Application number
CN201510017199.3A
Other languages
Chinese (zh)
Inventor
常亮
唐昆
崔慧娟
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Application filed by Tsinghua University
Priority to CN201510017199.3A
Publication of CN104599682A

Abstract

The invention discloses a method for extracting the pitch period of telephone-line-quality speech. The method comprises the following steps: applying nonlinear processing to the original speech, and calculating a first time-domain autocorrelation function of the original speech and a second time-domain autocorrelation function of the nonlinearly processed speech; merging the first and second time-domain autocorrelation functions to obtain a third time-domain autocorrelation function; calculating a long-term pitch period for each frame of the original speech and using it to correct the third time-domain autocorrelation function; applying LPC inverse filtering to the original speech to obtain a residual signal, applying an FFT (fast Fourier transform) to the residual signal, and calculating a frequency-domain autocorrelation function from the transform result; calculating a time-domain weight and a frequency-domain weight for each pitch period candidate from the third time-domain autocorrelation function and the frequency-domain autocorrelation function, and obtaining a final weight from them; and performing path planning according to the final weights to determine the final pitch period. The method extracts the pitch period of telephone-line-quality speech with high accuracy.

Description

Method for extracting the pitch period of telephone-line-quality speech
Technical field
The present invention relates to the technical field of digital voice communication, and in particular to a method for extracting the pitch period of telephone-line-quality speech.
Background
The pitch period is a very important parameter in speech compression coding and is also used by many other speech technologies. Correct extraction of the pitch period is a prerequisite for proper digital voice communication.
Current pitch period extraction techniques achieve high accuracy on speech with a complete spectrum, that is, speech covering 60-4000 Hz. Telephone-line-quality speech is not limited to speech from telephone systems; it also includes any other speech whose spectrum has been truncated by a 300-3400 Hz bandpass filter, such as speech from analog walkie-talkies. Because the fundamental frequency of human speech lies in the range 60-400 Hz, most of the fundamental of telephone-line-quality speech has been filtered out. As a result, the lag corresponding to the true pitch period may not be the maximum of the autocorrelation function and may not even appear among the candidates. Current extraction techniques rely heavily on the autocorrelation function, so their accuracy on such speech is low and serious errors occur: for example, male voices become thin and female voices become deep. This affects not only how the speech sounds but also speaker identification and the intelligibility of what is said.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art described above.
To this end, an object of the present invention is to propose a method for extracting the pitch period of telephone-line-quality speech. The method combines the time domain and the frequency domain and achieves high accuracy on telephone-line-quality speech.
To achieve this object, embodiments of the present invention propose a method for extracting the pitch period of telephone-line-quality speech, comprising the following steps: applying nonlinear processing to the input original speech, and calculating a first time-domain autocorrelation function of the original speech and a second time-domain autocorrelation function of the nonlinearly processed speech; merging the first time-domain autocorrelation function and the second time-domain autocorrelation function to obtain a third time-domain autocorrelation function; calculating a long-term pitch period for each frame of the original speech, and correcting the third time-domain autocorrelation function according to the long-term pitch period; applying LPC inverse filtering to the original speech to obtain a residual signal, applying an FFT to the residual signal, and calculating a frequency-domain autocorrelation function from the transform result; calculating a time-domain weight and a frequency-domain weight for each pitch period candidate from the third time-domain autocorrelation function and the frequency-domain autocorrelation function, and obtaining the final weight of the candidate from the time-domain weight and the frequency-domain weight; and performing path planning according to the pitch period candidates and their final weights to determine the final pitch period.
In the method according to the embodiments of the present invention, the time domain and the frequency domain are combined. In the time domain, a new parameter, the long-term pitch period, is introduced, and the autocorrelation function is corrected on the basis of the short-term stationarity of speech, removing lag values that cannot be the pitch period. In the frequency domain, a frequency-domain autocorrelation function is calculated, and the frequency-domain autocorrelation value corresponding to each pitch period candidate also forms part of that candidate's weight, which increases the weight of the true pitch period. The method can therefore improve the accuracy of pitch period extraction for telephone-line-quality speech.
In addition, the method according to the above embodiments of the present invention may also have the following additional technical features:
In some examples, the third time-domain autocorrelation function is calculated by the following formula:
$$R_{comb}(\tau) = \begin{cases} R_{abs}(\tau), & R_{abs}(\tau) > R_{orig}(\tau) \\ R_{orig}(\tau), & R_{orig}(\tau) > R_{abs}(\tau) \end{cases}$$
where $R_{comb}(\tau)$ is the third time-domain autocorrelation function, $R_{orig}(\tau)$ is the first time-domain autocorrelation function of the original speech, and $R_{abs}(\tau)$ is the second time-domain autocorrelation function of the nonlinearly processed speech.
In some examples, calculating the long-term pitch period of each frame of the original speech specifically comprises:
[Formula not reproduced in the source text.]
where $l$ is the frame index, $p_{avg}(l)$ is the long-term pitch period of the current frame, $p(l-1)$ is the long-term pitch period of the previous frame, $P_{mid}$ lies in the region where the male and female pitch period ranges overlap, $V_{l-1}$ equal to 0 or 1 indicates that the previous frame is unvoiced or voiced respectively, $G_{l-1}$ is the energy of the previous frame, and $G_0$ is the energy threshold.
In some examples, if the previous frame is voiced and its energy is greater than the threshold $G_0$, the long-term pitch period of the current frame is updated with the long-term pitch period of the previous frame; otherwise it is updated with $P_{mid}$.
In some examples, the third time-domain autocorrelation function is corrected by the following formula:
[Formula not reproduced in the source text.]
where $p_{th1}$ and $p_{th2}$ are two thresholds.
In some examples, $p_{th1} = 45$ and $p_{th2} = 26$.
In some examples, if the long-term pitch period is greater than $p_{th1}$, the autocorrelation value of every $\tau$ lying between $p_{min}$ and $p_{th2}$ is set to 0.
In some examples, applying an FFT to the residual signal and calculating the frequency-domain autocorrelation function from the transform result specifically comprises:
$$R_{sf}(f) = \frac{1}{2}\left(\frac{\sum_{m=6}^{46} S_{res}(m)\,S_{res}(m+f)}{\sum_{m=6}^{46} S_{res}(m)\,S_{res}(m)} + \frac{\sum_{m=24}^{64} S_{res}(m)\,S_{res}(m+f)}{\sum_{m=24}^{64} S_{res}(m)\,S_{res}(m)}\right)$$
where $R_{sf}(f)$ is the frequency-domain autocorrelation function and $S_{res}(m)$ is the FFT of the residual signal.
In some examples, the final weight of a pitch period candidate is calculated by the following formula:
$$R_{sx}(\tau, f) = \alpha R_{comb}(\tau) + (1-\alpha) R_{sf}(f)$$
where $R_{sx}(\tau, f)$ is the final weight of the pitch period candidate $\tau$, $\alpha R_{comb}(\tau)$ is the time-domain weight, $(1-\alpha) R_{sf}(f)$ is the frequency-domain weight, $\tau$ and $f$ correspond to each other, $R_{comb}(\tau)$ is the time-domain autocorrelation value, $R_{sf}(f)$ is the frequency-domain autocorrelation value, and $\alpha$ is the weighting factor.
In some examples, $\alpha$ is 0.5.
Additional aspects and advantages of the present invention will be given in part in the following description; in part they will become apparent from the description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of the method for extracting the pitch period of telephone-line-quality speech according to one embodiment of the present invention;
Fig. 2 is a schematic flowchart of the method for extracting the pitch period of telephone-line-quality speech according to another embodiment of the present invention;
Fig. 3 (a), (b) and (c) show, respectively, the spectrogram of the original speech, the spectrogram of speech synthesized using the conventional method, and the spectrogram of speech synthesized using the method of the embodiments of the present invention.
Detailed description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and cannot be construed as limiting it.
The method for extracting the pitch period of telephone-line-quality speech according to the embodiments of the present invention is described below with reference to the drawings.
Fig. 1 is a flowchart of the method according to one embodiment of the present invention. Fig. 2 is a schematic flowchart of the method according to another embodiment of the present invention. With reference to Fig. 1 and Fig. 2, the method comprises the following steps:
Step S101: apply nonlinear processing (for example, taking the absolute value) to the input original speech, and calculate a first time-domain autocorrelation function of the original speech and a second time-domain autocorrelation function of the nonlinearly processed speech.
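As an illustration of step S101, the following is a minimal sketch rather than the patented implementation. It assumes a normalized autocorrelation over a single frame and uses the absolute value as the nonlinearity. The function names, the mean removal after rectification, and the test signal (harmonics 3 to 5 of a 100 Hz fundamental sampled at 8 kHz, with the fundamental itself absent) are assumptions made for the example.

```python
import math

def normalized_autocorr(frame, max_lag):
    """Return R(tau) for tau = 0..max_lag, normalized so that R(0) == 1."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return [0.0] * (max_lag + 1)
    n = len(frame)
    return [
        sum(frame[i] * frame[i + tau] for i in range(n - tau)) / energy
        for tau in range(max_lag + 1)
    ]

def time_domain_autocorrs(frame, max_lag):
    """First ACF on the original frame, second ACF on its absolute value."""
    r_orig = normalized_autocorr(frame, max_lag)
    rectified = [abs(x) for x in frame]
    # Assumption: remove the DC offset that rectification introduces.
    mean = sum(rectified) / len(rectified)
    r_abs = normalized_autocorr([x - mean for x in rectified], max_lag)
    return r_orig, r_abs

# Hypothetical test frame: the pitch period is 8000 / 100 = 80 samples.
fs = 8000
frame = [
    sum(math.sin(2 * math.pi * 100 * k * n / fs) for k in (3, 4, 5))
    for n in range(320)
]
r_orig, r_abs = time_domain_autocorrs(frame, 100)
```

Both returned lists are indexed by lag, so `r_orig[80]` is the autocorrelation of the original frame at the true pitch lag of this test signal.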
Step S102: merge the first time-domain autocorrelation function and the second time-domain autocorrelation function to obtain a third time-domain autocorrelation function. Specifically, in one embodiment of the present invention, the third time-domain autocorrelation function is calculated, for example, by the following formula:
$$R_{comb}(\tau) = \begin{cases} R_{abs}(\tau), & R_{abs}(\tau) > R_{orig}(\tau) \\ R_{orig}(\tau), & R_{orig}(\tau) > R_{abs}(\tau) \end{cases}$$
where $R_{comb}(\tau)$ is the third time-domain autocorrelation function, $R_{orig}(\tau)$ is the first time-domain autocorrelation function of the original speech, and $R_{abs}(\tau)$ is the second time-domain autocorrelation function of the nonlinearly processed speech.
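The merge in step S102 keeps, at each lag, the larger of the two autocorrelation values. The sketch below expresses this as a pointwise maximum; the formula does not state the equal case, where both branches would give the same value anyway. The function name and sample values are illustrative assumptions.

```python
def combine_autocorrs(r_orig, r_abs):
    """Pointwise merge: keep whichever autocorrelation value is larger at each lag."""
    if len(r_orig) != len(r_abs):
        raise ValueError("both autocorrelation functions must cover the same lags")
    return [max(a, b) for a, b in zip(r_orig, r_abs)]

# Hypothetical values for lags 0..3.
r_orig = [1.0, 0.2, -0.1, 0.6]
r_abs = [1.0, 0.5, 0.3, 0.4]
r_comb = combine_autocorrs(r_orig, r_abs)
print(r_comb)  # [1.0, 0.5, 0.3, 0.6]
```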
Step S103: calculate the long-term pitch period (LTAP) of each frame of the original speech, and correct the third time-domain autocorrelation function according to the long-term pitch period.
Specifically, in one embodiment of the present invention, the long-term pitch period of each frame of the original speech is calculated, for example, by the following formula:
[Formula not reproduced in the source text.]
where $l$ is the frame index, $p_{avg}(l)$ is the long-term pitch period of the current frame, $p(l-1)$ is the long-term pitch period of the previous frame, $P_{mid}$ lies in the region where the male and female pitch period ranges overlap, $V_{l-1}$ equal to 0 or 1 indicates that the previous frame is unvoiced or voiced respectively, $G_{l-1}$ is the energy of the previous frame, and $G_0$ is the energy threshold. Further, in some examples, if the previous frame is voiced and its energy exceeds the threshold $G_0$, the long-term pitch period of the current frame is updated with the pitch period of the previous frame; otherwise it is updated with $P_{mid}$.
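The update formula for the long-term pitch period (LTAP) is not available in this text, so the following is only a sketch of the stated rule: use the previous frame's pitch period when that frame is voiced and energetic enough, and fall back to P_mid otherwise. The exponential smoothing, the factor `beta`, the function name, and all numeric values are assumptions and may differ from the formula in the patent.

```python
def update_long_term_pitch(p_avg_prev, p_prev, voiced_prev, energy_prev,
                           energy_threshold, p_mid, beta=0.9):
    """Return the long-term pitch period for the current frame.

    beta is a hypothetical smoothing factor; the source does not give one.
    """
    reliable = voiced_prev == 1 and energy_prev > energy_threshold
    target = p_prev if reliable else p_mid
    return beta * p_avg_prev + (1.0 - beta) * target

# Hypothetical values, in samples: previous LTAP 60, previous pitch 70, P_mid 50.
after_voiced = update_long_term_pitch(60.0, 70.0, 1, 2.0, 1.0, 50.0)
after_unvoiced = update_long_term_pitch(60.0, 70.0, 0, 2.0, 1.0, 50.0)
```

With these inputs the voiced case moves the average toward 70 and the unvoiced case moves it toward 50.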
In one embodiment of the present invention, the third time-domain autocorrelation function is corrected, for example, by the following formula:
[Formula not reproduced in the source text.]
where $p_{th1}$ and $p_{th2}$ are two thresholds. In a specific example, based on extensive experiments, $p_{th1} = 45$ and $p_{th2} = 26$ can be used. Further, if the long-term pitch period is greater than $p_{th1}$, the autocorrelation values of the $\tau$ values lying between $p_{min}$ and $p_{th2}$ are set to 0, which removes them as possible pitch period candidates. The reason is that the distance between these $\tau$ values and the long-term pitch period exceeds the range of normal variation; if they were not removed, they would interfere as erroneous candidates. Removing them gives the correct pitch period a higher probability of appearing among the candidates and a larger weight, which improves the accuracy of pitch period extraction.
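The correction rule described above can be sketched as follows. The correction formula itself is not available in this text, so the sketch follows the prose only: when the long-term pitch period (LTAP) exceeds p_th1, lags from p_min to p_th2 are zeroed. The value of `p_min`, the inclusive bounds, and the function name are assumptions.

```python
def correct_autocorr(r_comb, p_avg, p_min=16, p_th1=45, p_th2=26):
    """Zero out short lags that lie implausibly far below the long-term pitch period.

    p_min = 16 is a hypothetical minimum lag; the source does not state its value.
    """
    corrected = list(r_comb)
    if p_avg > p_th1:
        last = min(p_th2, len(corrected) - 1)
        for tau in range(p_min, last + 1):
            corrected[tau] = 0.0
    return corrected

# Hypothetical flat autocorrelation over lags 0..60.
r_comb = [0.5] * 61
long_pitch = correct_autocorr(r_comb, p_avg=60)   # LTAP above p_th1: lags 16..26 removed
short_pitch = correct_autocorr(r_comb, p_avg=40)  # LTAP below p_th1: unchanged
```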
Step S104: apply LPC inverse filtering to the original speech to obtain a residual signal, apply an FFT (fast Fourier transform) to the residual signal, and calculate a frequency-domain autocorrelation function from the transform result.
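A minimal sketch of the residual computation in step S104, under several assumptions: the LPC coefficients come from the autocorrelation method with the Levinson-Durbin recursion, the LPC order and transform size are chosen arbitrarily, and a naive DFT stands in for the FFT so that the example has no dependencies. The source specifies none of these details, and all function names are hypothetical.

```python
import cmath
import math

def lpc_coefficients(frame, order):
    """Levinson-Durbin on the frame autocorrelation; returns [1, a1, ..., a_order]."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a

def lpc_residual(frame, a):
    """Inverse filter A(z): e[n] = sum_k a[k] * x[n - k], with x[n] = 0 for n < 0."""
    return [
        sum(a[k] * frame[n - k] for k in range(len(a)) if n - k >= 0)
        for n in range(len(frame))
    ]

def magnitude_spectrum(signal, n_fft):
    """Magnitudes of DFT bins 0..n_fft // 2 (naive DFT, adequate for a sketch)."""
    padded = list(signal[:n_fft]) + [0.0] * max(0, n_fft - len(signal))
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n, x in enumerate(padded)))
        for k in range(n_fft // 2 + 1)
    ]

# Hypothetical input: impulse response of a one-pole filter with pole 0.9.
frame = [0.9 ** n for n in range(200)]
a = lpc_coefficients(frame, order=1)
residual = lpc_residual(frame, a)
s_res = magnitude_spectrum(residual, 256)
```

For this input the order-1 predictor recovers the pole, so the residual is close to a single impulse and its magnitude spectrum is nearly flat, which is the whitening effect that inverse filtering is meant to produce.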
In one embodiment of the present invention, the frequency-domain autocorrelation function is calculated, for example, by the following formula:
$$R_{sf}(f) = \frac{1}{2}\left(\frac{\sum_{m=6}^{46} S_{res}(m)\,S_{res}(m+f)}{\sum_{m=6}^{46} S_{res}(m)\,S_{res}(m)} + \frac{\sum_{m=24}^{64} S_{res}(m)\,S_{res}(m+f)}{\sum_{m=24}^{64} S_{res}(m)\,S_{res}(m)}\right)$$
where $R_{sf}(f)$ is the frequency-domain autocorrelation function and $S_{res}(m)$ is the FFT of the residual signal.
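The frequency-domain autocorrelation formula can be sketched directly. The band limits 6..46 and 24..64 come from the formula; treating the spectrum as a list of real magnitudes indexed by FFT bin, the function name, and the synthetic spectrum are assumptions.

```python
def freq_autocorr(s_res, f):
    """R_sf(f): mean of two normalized spectral autocorrelations (bins 6..46 and 24..64).

    s_res must contain at least 65 + f entries.
    """
    def band(lo, hi):
        num = sum(s_res[m] * s_res[m + f] for m in range(lo, hi + 1))
        den = sum(s_res[m] * s_res[m] for m in range(lo, hi + 1))
        return num / den if den > 0.0 else 0.0
    return 0.5 * (band(6, 46) + band(24, 64))

# Hypothetical residual spectrum with a harmonic peak every 8 bins.
s_res = [1.0 if m % 8 == 0 else 0.0 for m in range(129)]
print(freq_autocorr(s_res, 8))  # 1.0
print(freq_autocorr(s_res, 4))  # 0.0
```

A shift equal to the harmonic spacing lines the peaks up with each other, which is why the value at the correct spacing is large even when the fundamental itself has been filtered out.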
Step S105: calculate the time-domain weight and the frequency-domain weight of each pitch period candidate from the third time-domain autocorrelation function and the frequency-domain autocorrelation function, and obtain the final weight of the candidate from the time-domain weight and the frequency-domain weight. In other words, the frequency-domain autocorrelation value also forms part of the weight of a pitch period candidate, and the final weight is determined, for example, by the following formula:
$$R_{sx}(\tau, f) = \alpha R_{comb}(\tau) + (1-\alpha) R_{sf}(f)$$
where $R_{sx}(\tau, f)$ is the final weight of the pitch period candidate $\tau$, $\alpha R_{comb}(\tau)$ is the time-domain weight, $(1-\alpha) R_{sf}(f)$ is the frequency-domain weight, $\tau$ and $f$ correspond to each other, $R_{comb}(\tau)$ is the time-domain autocorrelation value, $R_{sf}(f)$ is the frequency-domain autocorrelation value, and $\alpha$ is the weighting factor. More specifically, in some examples, $\alpha$ is 0.5.
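A small sketch of the weighting in step S105. The text states only that the lag and the frequency shift correspond to each other; the mapping used here, a harmonic spacing of n_fft / tau bins for a lag of tau samples, is an assumption, as are the function names and sample values.

```python
def lag_to_bin(tau, n_fft):
    """Assumed mapping from a pitch lag in samples to a harmonic spacing in FFT bins."""
    return round(n_fft / tau)

def final_weight(r_comb_tau, r_sf_f, alpha=0.5):
    """R_sx = alpha * R_comb(tau) + (1 - alpha) * R_sf(f)."""
    return alpha * r_comb_tau + (1.0 - alpha) * r_sf_f

# Hypothetical candidate: lag 32 with a 256-point transform gives a spacing of 8 bins.
f = lag_to_bin(32, 256)
weight = final_weight(0.8, 0.4)  # time-domain value 0.8, frequency-domain value 0.4
```

With the default `alpha` of 0.5 both domains contribute equally, so a candidate that is strong in only one domain is ranked below one that is moderately strong in both.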
Step S106: perform path planning according to the pitch period candidates and their final weights to determine the final pitch period. More specifically, a cost function is calculated from the final weights of the pitch period candidates, and the minimum-cost path is found by dynamic programming to determine the final pitch period.
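The path planning in step S106 can be illustrated with a Viterbi-style dynamic program. The source does not give the cost function, so this sketch assumes a local cost of 1 minus the final weight and a transition cost proportional to the change in lag between consecutive frames. The function name, `jump_penalty`, and the sample data are hypothetical.

```python
def track_pitch(candidates, weights, jump_penalty=0.02):
    """Pick one lag per frame along the minimum-cost path.

    candidates[t] lists the candidate lags of frame t and weights[t] their final
    weights. Local cost is 1 - weight; the transition cost
    jump_penalty * |tau_now - tau_prev| is an assumption.
    """
    cost = [1.0 - w for w in weights[0]]
    back = []
    for t in range(1, len(candidates)):
        new_cost, pointers = [], []
        for tau, w in zip(candidates[t], weights[t]):
            options = [
                cost[i] + jump_penalty * abs(tau - prev_tau)
                for i, prev_tau in enumerate(candidates[t - 1])
            ]
            best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best] + (1.0 - w))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path back from the last frame to the first.
    idx = min(range(len(cost)), key=cost.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]

# Hypothetical data: in the middle frame the half-period candidate 40 has the larger weight.
candidates = [[40, 80], [40, 80], [40, 80]]
weights = [[0.5, 0.9], [0.7, 0.6], [0.4, 0.9]]
print(track_pitch(candidates, weights))  # [80, 80, 80]
```

Choosing the largest weight frame by frame would give 80, 40, 80; the transition cost keeps the track on 80, which is the kind of isolated halving or doubling error that path planning is intended to suppress.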
As a specific example, the method of the embodiments of the present invention and the conventional method were each applied in a speech compression system, and Fig. 3 compares the spectrograms of the synthesized speech. Fig. 3(a) shows the spectrogram of the original telephone-line-quality speech, Fig. 3(b) shows the spectrogram of speech synthesized using the conventional pitch period extraction method, and Fig. 3(c) shows the spectrogram of speech synthesized using the method of the embodiments. Comparing Fig. 3(b) with Fig. 3(c) shows clearly that the method of the embodiments estimates the fundamental frequency accurately, so the spectrogram of the synthesized speech is closer to that of the original telephone-line-quality speech, whereas the conventional method produces obvious frequency-doubling errors.
More specifically, in some examples, tests show that for pitch period extraction from telephone-line-quality speech, the gross error rate of the method of the embodiments is 46.8% lower than that of the conventional method. For normal speech, the gross error rate of the method is 31.2% lower than that of the conventional method.
In summary, in the method according to the embodiments of the present invention, the time domain and the frequency domain are combined. In the time domain, a new parameter, the long-term pitch period, is introduced, and the autocorrelation function is corrected on the basis of the short-term stationarity of speech, removing lag values that cannot be the pitch period. In the frequency domain, a frequency-domain autocorrelation function is calculated, and the frequency-domain autocorrelation value corresponding to each pitch period candidate also forms part of that candidate's weight, which increases the weight of the true pitch period. The method can therefore improve the accuracy of pitch period extraction for telephone-line-quality speech.
In the description of the present invention, it should be understood that orientation or positional terms such as "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial" and "circumferential" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the invention and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may thus explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise.
In the present invention, unless otherwise expressly specified and limited, terms such as "mounted", "connected", "coupled" and "fixed" should be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; and it may be an internal connection between two elements or an interaction between two elements, unless otherwise expressly limited. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the circumstances.
In the present invention, unless otherwise expressly specified and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediary. Moreover, a first feature being "on", "above" or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the first feature is at a higher level than the second feature. A first feature being "under", "below" or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the first feature is at a lower level than the second feature.
In this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and their features.
Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and cannot be construed as limiting the invention; those of ordinary skill in the art may change, modify, replace and vary the above embodiments within the scope of the invention.

Claims (10)

1. A method for extracting the pitch period of telephone-line-quality speech, characterized in that it comprises the following steps:
applying nonlinear processing to the input original speech, and calculating a first time-domain autocorrelation function of the original speech and a second time-domain autocorrelation function of the nonlinearly processed speech;
merging the first time-domain autocorrelation function and the second time-domain autocorrelation function to obtain a third time-domain autocorrelation function;
calculating a long-term pitch period for each frame of the original speech, and correcting the third time-domain autocorrelation function according to the long-term pitch period;
applying LPC inverse filtering to the original speech to obtain a residual signal, applying an FFT to the residual signal, and calculating a frequency-domain autocorrelation function from the transform result;
calculating a time-domain weight and a frequency-domain weight for each pitch period candidate from the third time-domain autocorrelation function and the frequency-domain autocorrelation function, and obtaining the final weight of the pitch period candidate from the time-domain weight and the frequency-domain weight;
performing path planning according to the pitch period candidates and the final weights of the pitch period candidates to determine the final pitch period.
2. The method for extracting the pitch period of telephone-line-quality speech according to claim 1, characterized in that the third time-domain autocorrelation function is calculated by the following formula:
$$R_{comb}(\tau) = \begin{cases} R_{abs}(\tau), & R_{abs}(\tau) > R_{orig}(\tau) \\ R_{orig}(\tau), & R_{orig}(\tau) > R_{abs}(\tau) \end{cases}$$
where $R_{comb}(\tau)$ is the third time-domain autocorrelation function, $R_{orig}(\tau)$ is the first time-domain autocorrelation function of the original speech, and $R_{abs}(\tau)$ is the second time-domain autocorrelation function of the nonlinearly processed speech.
3. The method for extracting the pitch period of telephone-line-quality speech according to claim 1, characterized in that calculating the long-term pitch period of each frame of the original speech specifically comprises:
[Formula not reproduced in the source text.]
where $l$ is the frame index, $p_{avg}(l)$ is the long-term pitch period of the current frame, $p(l-1)$ is the long-term pitch period of the previous frame, $P_{mid}$ lies in the region where the male and female pitch period ranges overlap, $V_{l-1}$ equal to 0 or 1 indicates that the previous frame is unvoiced or voiced respectively, $G_{l-1}$ is the energy of the previous frame, and $G_0$ is the energy threshold.
4. The method for extracting the pitch period of telephone-line-quality speech according to claim 3, characterized in that, if the previous frame is voiced and its energy is greater than the threshold $G_0$, the long-term pitch period of the current frame is updated with the long-term pitch period of the previous frame; otherwise it is updated with $P_{mid}$.
5. The method for extracting the pitch period of telephone-line-quality speech according to claim 4, characterized in that the third time-domain autocorrelation function is corrected by the following formula:
[Formula not reproduced in the source text.]
where $p_{th1}$ and $p_{th2}$ are two thresholds.
6. The method for extracting the pitch period of telephone-line-quality speech according to claim 5, characterized in that $p_{th1} = 45$ and $p_{th2} = 26$.
7. The method for extracting the pitch period of telephone-line-quality speech according to claim 5, characterized in that, if the long-term pitch period is greater than $p_{th1}$, the autocorrelation value of every $\tau$ lying between $p_{min}$ and $p_{th2}$ is set to 0.
8. The method for extracting the pitch period of telephone-line-quality speech according to claim 1, characterized in that applying an FFT to the residual signal and calculating the frequency-domain autocorrelation function from the transform result specifically comprises:
$$R_{sf}(f) = \frac{1}{2}\left(\frac{\sum_{m=6}^{46} S_{res}(m)\,S_{res}(m+f)}{\sum_{m=6}^{46} S_{res}(m)\,S_{res}(m)} + \frac{\sum_{m=24}^{64} S_{res}(m)\,S_{res}(m+f)}{\sum_{m=24}^{64} S_{res}(m)\,S_{res}(m)}\right)$$
where $R_{sf}(f)$ is the frequency-domain autocorrelation function and $S_{res}(m)$ is the FFT of the residual signal.
9. The method for extracting the pitch period of telephone-line-quality speech according to claim 1, characterized in that the final weight of the pitch period candidate is calculated by the following formula:
$$R_{sx}(\tau, f) = \alpha R_{comb}(\tau) + (1-\alpha) R_{sf}(f)$$
where $R_{sx}(\tau, f)$ is the final weight of the pitch period candidate $\tau$, $\alpha R_{comb}(\tau)$ is the time-domain weight, $(1-\alpha) R_{sf}(f)$ is the frequency-domain weight, $\tau$ and $f$ correspond to each other, $R_{comb}(\tau)$ is the time-domain autocorrelation value, $R_{sf}(f)$ is the frequency-domain autocorrelation value, and $\alpha$ is the weighting factor.
10. The method for extracting the pitch period of telephone-line-quality speech according to claim 9, characterized in that $\alpha$ is 0.5.

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201510017199.3A | CN104599682A (en) | 2015-01-13 | 2015-01-13 | Method for extracting pitch period of telephone wire quality voice

Publications (1)

Publication Number Publication Date
CN104599682A true CN104599682A (en) 2015-05-06

Family

ID=53125414




Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150506
