HUE034664T2 - Method and apparatus for detecting correctness of a pitch period - Google Patents

Method and apparatus for detecting correctness of a pitch period

Info

Publication number
HUE034664T2
Authority
HU
Hungary
Prior art keywords
pitch
parameter
frequency
correctness
spectral
Prior art date
Application number
HUE12876916A
Other languages
English (en)
Inventor
Fengyan Qi
Lei Miao
Original Assignee
Huawei Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Tech Co Ltd
Publication of HUE034664T2 publication Critical patent/HUE034664T2/hu

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G10L19/125 Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90 Pitch determination of speech signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Electrophonic Musical Instruments (AREA)
  • Auxiliary Devices For Music (AREA)

Description

(12) EUROPEAN PATENT SPECIFICATION
(45) Date of publication and mention of the grant of the patent: 05.04.2017 Bulletin 2017/14
(51) Int Cl.: G10L 25/90 (2013.01)
(21) Application number: 12876916.3
(22) Date of filing: 26.12.2012
(86) International application number: PCT/CN2012/087512
(87) International publication number: WO 2013/170610 (21.11.2013 Gazette 2013/47)
(54) METHOD AND APPARATUS FOR DETECTING CORRECTNESS OF PITCH PERIOD
VERFAHREN UND VORRICHTUNG ZUR ERKENNUNG DER KORREKTHEIT DES NEIGUNGSZEITRAUMS
PROCEDE ET APPAREIL DE DETECTION DE LA JUSTESSE DE LA PERIODE DE TONIE
(84) Designated Contracting States: AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
(30) Priority: 18.05.2012 CN 201210155298
(43) Date of publication of application: 04.03.2015 Bulletin 2015/10
(60) Divisional application: 17150741.1
(73) Proprietor: Huawei Technologies Co., Ltd., Longgang District, Shenzhen, Guangdong 518129 (CN)
(72) Inventors:
• QI, Fengyan, Shenzhen, Guangdong 518129 (CN)
• MIAO, Lei, Shenzhen, Guangdong 518129 (CN)
(74) Representative: Kreuz, Georg Maria, Huawei Technologies Duesseldorf GmbH, Riesstrasse 8, 80992 München (DE)
(56) References cited: CN-A- 1 473 322, CN-A- 101 149 924, CN-A- 101 354 889, CN-A- 101 814 291, CN-A- 102 231 274, US-A- 4 791 671, US-A- 5 832 437, US-A- 6 108 621, US-A1- 2004 158 462, US-A1- 2010 070 270, US-B1- 6 496 797
TECHNICAL FIELD
[0001] The present invention relates to the field of audio technologies, and more specifically, to a method and an apparatus for detecting correctness of a pitch period.
BACKGROUND
[0002] In processing speech and audio signals, pitch detection is one of the key technologies in various practical speech and audio applications. For example, pitch detection is a key technology in applications such as speech encoding, speech recognition, and karaoke. Pitch detection technologies are widely applied to various electronic devices, such as a mobile phone, a wireless apparatus, a personal digital assistant (PDA), a handheld or portable computer, a GPS receiver/navigator, a camera, an audio/video player, a video camera, a video recorder, and a surveillance device. Therefore, the accuracy and efficiency of pitch detection directly affect the performance of these speech and audio applications.
[0003] Current pitch detection is basically performed in a time domain, and the pitch detection algorithm is generally a time domain autocorrelation method. However, in actual applications, pitch detection performed in the time domain often leads to a frequency multiplication phenomenon. This phenomenon is hard to resolve satisfactorily in the time domain, because large autocorrelation coefficients are obtained both for the real pitch period and for a multiplied frequency of the real pitch period. In addition, in a case with background noise, an initial pitch period obtained by open-loop detection in the time domain may also be inaccurate. Here, a real pitch period is the actual pitch period in speech, that is, the correct pitch period. A pitch period refers to the minimum repeatable time interval in speech.
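The ambiguity described above can be illustrated with a small sketch. It is not part of the patent: the test signal, its period of 20 samples, and the helper names are assumptions chosen for illustration. For a strictly periodic signal, the normalized autocorrelation is close to 1 both at the true period and at twice that period, so the correlation value alone cannot separate the two candidates.

```python
import math


def normalized_autocorr(x, lag):
    """Normalized autocorrelation of x at a positive lag (illustrative helper)."""
    num = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
    e_head = sum(v * v for v in x[lag:])
    e_tail = sum(v * v for v in x[:len(x) - lag])
    if e_head <= 0.0 or e_tail <= 0.0:
        return 0.0
    return num / math.sqrt(e_head * e_tail)


# Hypothetical test signal: fundamental plus one harmonic, period 20 samples.
PERIOD = 20
signal = [
    math.sin(2 * math.pi * n / PERIOD) + 0.5 * math.sin(4 * math.pi * n / PERIOD)
    for n in range(400)
]

r_true = normalized_autocorr(signal, PERIOD)        # lag equal to the real period
r_double = normalized_autocorr(signal, 2 * PERIOD)  # lag equal to twice the period
r_off = normalized_autocorr(signal, PERIOD + 7)     # a lag unrelated to the period
```

Here `r_true` and `r_double` are both essentially 1, while `r_off` is clearly lower, which is why a time domain search can lock onto a multiple of the real pitch period.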
[0004] Detecting an initial pitch period in a time domain is used as an example. Most speech encoding standards of the ITU-T (International Telecommunication Union Telecommunication Standardization Sector) require pitch detection to be performed, but almost all of the pitch detection is performed in a single domain (a time domain or a frequency domain). For example, an open-loop pitch detection method performed only in a perceptual weighted domain is applied in the speech encoding standard G.729.
[0005] In this open-loop pitch detection method, after an initial pitch period is obtained by open-loop detection in the time domain, the correctness of the initial pitch period is not checked; instead, closed-loop fine detection is directly performed on the initial pitch period. The closed-loop fine detection is performed in a period interval that includes the initial pitch period obtained by the open-loop detection, so that if the initial pitch period obtained by the open-loop detection is incorrect, the pitch period obtained by the final closed-loop fine detection is also incorrect. In other words, because it is extremely hard to ensure that the initial pitch period obtained by the open-loop detection in the time domain is absolutely correct, if an incorrect initial pitch period is applied to the subsequent processing, final audio quality may deteriorate.
[0006] In addition, in the prior art, it is also proposed to change the pitch period detection performed in the time domain to pitch period fine detection performed in the frequency domain, but the pitch period fine detection performed in the frequency domain is extremely complex. In the fine detection, further pitch detection may be performed on an input signal in the time domain or the frequency domain according to the initial pitch period, including short-pitch detection, fractional pitch detection, or multiplied frequency pitch detection.
[0007] Document US6,108,621A discloses a speech analysis method and a speech encoding method and apparatus in which, even if the harmonics of the speech spectrum are offset from integer multiples of the fundamental wave, the amplitudes of the harmonics can be evaluated correctly for producing a playback output of high clarity. To this end, the frequency spectrum of the input speech is split on the frequency axis into plural bands, in each of which pitch search and evaluation of amplitudes of the harmonics are carried out simultaneously using an optimum pitch derived from the spectral shape. Using the structure of the harmonics as the spectral shape, and based on the rough pitch previously detected by an open-loop rough pitch search, a high-precision pitch search is carried out, comprising a first pitch search for the frequency spectrum in its entirety and a second pitch search of higher precision than the first pitch search. The second pitch search is performed independently for each of the high range side and the low range side of the frequency spectrum.
[0008] Document US2004/0158462A1 discloses an improved method of performing channel selection in multi-channel pitch detection systems. For each channel, several features are computed using the input signal and the value of the pitch candidate from the channel. The resulting feature vector is used to evaluate a multi-variate likelihood function which defines the likelihood that the pitch candidate represents the correct pitch. The final pitch estimate is then taken to be the pitch candidate with the highest likelihood of being correct, or the mean (or median) of the pitch candidates with likelihoods above a given threshold. The functional form of the likelihood function can be defined using several different parametric representations, and the parameters of the likelihood function can be advantageously derived in an automated manner using signals having pitch labels that are considered to be correct.
[0009] Document US6,496,797B1 discloses an apparatus and method for speech compression, which includes dividing the speech spectrum into a plurality of frames, assigning frame classifications to the plurality of frames, and determining the speech modeling parameters based on the assigned frame classification. The voiced part of the speech spectrum and the unvoiced part of the speech spectrum are synthesized separately using Analysis by Synthesis allowing a correct correspondence between voiced and unvoiced parts of the reconstructed signal. Particularly, a frequency response of a special simulated signal based on the previous and current frames is used as an approximating function. The simulated signal is synthesized at the encoder side in the way it will be generated at the decoder side. Also, a better of two encoding methods is selected to encode the spectral magnitudes.
SUMMARY
[0010] The present invention provides a method and an apparatus for detecting correctness of a pitch period, so as to solve a problem in the prior art that when correctness of an initial pitch period is detected in a time domain or a frequency domain, accuracy is low and complexity is relatively high.
[0011] According to one aspect, a method for detecting correctness of a pitch period is provided, including: determining, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; determining, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and determining correctness of the initial pitch period according to the pitch period correctness decision parameter; the pitch period correctness decision parameter comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter, the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; where spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
[0012] According to another aspect, an apparatus for detecting correctness of a pitch period is provided, including: a pitch frequency bin determining unit, configured to determine, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; a parameter generating unit, configured to determine, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and a correctness determining unit, configured to determine correctness of the initial pitch period according to the pitch period correctness decision parameter, the apparatus is characterized in that: the pitch period correctness decision parameter generated by the parameter generating unit comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter, the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; where spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
[0013] The method and apparatus for detecting correctness of a pitch period according to the embodiments of the present invention can improve, based on a relatively less complex algorithm, accuracy of detecting correctness of a pitch period.
BRIEF DESCRIPTION OF DRAWINGS
[0014] To describe the technical solutions in the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the present invention. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
FIG. 1 is a flowchart of a method for detecting correctness of a pitch period according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an apparatus for detecting correctness of a pitch period according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an apparatus for detecting correctness of a pitch period according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for detecting correctness of a pitch period according to an embodiment of the present invention; and
FIG. 5 is a schematic structural diagram of an apparatus for detecting correctness of a pitch period according to an embodiment of the present invention.
DESCRIPTION OF EMBODIMENTS
[0015] The following clearly and completely describes the technical solutions in embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are a part rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
[0016] According to the embodiments of the present invention, correctness of an initial pitch period obtained by open-loop detection in a time domain is detected in a frequency domain, so as to avoid applying an incorrect initial pitch period to the following processing.
[0017] An objective of the embodiments of the present invention is to perform further correctness detection on an initial pitch period, which is obtained by open-loop detection in the time domain, so as to greatly improve accuracy and stability of pitch detection by extracting effective parameters in the frequency domain and making a decision by combining these parameters.
[0018] A method for detecting correctness of a pitch period according to an embodiment of the present invention, as shown in FIG. 1, includes the following steps.
[0019] 11. Determine, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, where the initial pitch period is obtained by performing open-loop detection on the input signal.
[0020] Generally, the pitch frequency bin of the input signal is inversely proportional to the initial pitch period of the input signal, and is directly proportional to the quantity of points of the FFT (fast Fourier transform) performed on the input signal.
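As a minimal sketch of this relationship, the specific embodiment described later uses F_op = N/T_op, where N is the quantity of FFT points and T_op is the initial pitch period in samples. The function name and the rounding to the nearest integer bin are assumptions made here for illustration; the text does not state how a fractional result is handled.

```python
def pitch_frequency_bin(n_fft, t_op):
    """Return the FFT bin nearest to the pitch frequency, F_op = N / T_op.

    n_fft -- quantity of FFT points (N)
    t_op  -- initial pitch period in samples (T_op)
    Rounding half up to an integer bin is an assumption of this sketch.
    """
    if n_fft <= 0 or t_op <= 0:
        raise ValueError("n_fft and t_op must be positive")
    return int(n_fft / t_op + 0.5)
```

For example, with N = 256 a period of 64 samples maps to bin 4; halving the period or doubling N both move the pitch to bin 8.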
[0021] 12. Determine, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal.
[0022] The pitch period correctness decision parameter includes a spectral difference parameter Diff_sm, an average spectral amplitude parameter Spec_sm, and a difference-to-amplitude ratio parameter Diff_ratio. The spectral difference parameter Diff_sm is a sum Diff_sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin or a weighted and smoothed value of the sum Diff_sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin. The average spectral amplitude parameter Spec_sm is an average Spec_avg of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin or a weighted and smoothed value of the average Spec_avg of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin. The difference-to-amplitude ratio parameter Diff_ratio is a ratio of the sum Diff_sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average Spec_avg of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin.
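The unsmoothed forms of these three parameters can be sketched as follows. This is an illustrative reading, not the patent's implementation: the function name, the choice of bins `start` to `2*f_op - 1` (taken from the loop bounds of the specific embodiment), and the zero guard on the ratio are assumptions, and the weighted smoothing across frames is left out.

```python
def decision_parameters(s, f_op, start=1):
    """Compute (diff_sum, spec_avg, diff_ratio) around the pitch bin f_op.

    s     -- amplitude spectrum, indexed by frequency bin
    f_op  -- pitch frequency bin
    start -- first bin included; bins start .. 2*f_op - 1 are used
    """
    stop = 2 * f_op
    if f_op < 1 or stop > len(s) or start >= stop:
        raise ValueError("pitch bin and range must fit inside the spectrum")
    bins = range(start, stop)
    # Sum of spectral amplitudes of the bins around the pitch bin.
    spec_sum = sum(s[i] for i in bins)
    # Sum of differences between the pitch-bin amplitude and each bin.
    diff_sum = sum(s[f_op] - s[i] for i in bins)
    spec_avg = spec_sum / len(bins)
    # Guard against division by zero (an assumption of this sketch).
    diff_ratio = diff_sum / spec_avg if spec_avg != 0 else 0.0
    return diff_sum, spec_avg, diff_ratio
```

A spectrum with a clear peak at the pitch bin yields a large `diff_sum` and `diff_ratio`, while a flat spectrum yields values near zero, which is the contrast the decision in step 13 relies on.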
[0023] 13. Determine correctness of the initial pitch period according to the pitch period correctness decision parameter.
[0024] For example, when the pitch period correctness decision parameter meets a correctness determining condition, it is determined that the initial pitch period is correct; and when the pitch period correctness decision parameter meets an incorrectness determining condition, it is determined that the initial pitch period is incorrect.
[0025] Specifically, the incorrectness determining condition meets at least one of the following: the spectral difference parameter Diff_sm is less than a first difference parameter threshold, the average spectral amplitude parameter Spec_sm is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter Diff_ratio is less than a first ratio factor parameter threshold. The correctness determining condition meets at least one of the following: the spectral difference parameter Diff_sm is greater than a second difference parameter threshold, the average spectral amplitude parameter Spec_sm is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter Diff_ratio is greater than a second ratio factor parameter threshold.
[0026] For example, if the incorrectness determining condition is that the spectral difference parameter Diff_sm is less than the first difference parameter threshold and the correctness determining condition is that the spectral difference parameter Diff_sm is greater than the second difference parameter threshold, the second difference parameter threshold is greater than the first difference parameter threshold. Alternatively, if the incorrectness determining condition is that the average spectral amplitude parameter Spec_sm is less than the first spectral amplitude parameter threshold and the correctness determining condition is that the average spectral amplitude parameter Spec_sm is greater than the second spectral amplitude parameter threshold, the second spectral amplitude parameter threshold is greater than the first spectral amplitude parameter threshold. Alternatively, if the incorrectness determining condition is that the difference-to-amplitude ratio parameter Diff_ratio is less than the first ratio factor parameter threshold and the correctness determining condition is that the difference-to-amplitude ratio parameter Diff_ratio is greater than the second ratio factor parameter threshold, the second ratio factor parameter threshold is greater than the first ratio factor parameter threshold.
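The two-threshold decision described in paragraphs [0025] and [0026] can be sketched as below. The `Thresholds` type, the function name, the "undecided" outcome for values between the two thresholds, and the choice to test the incorrectness condition first are assumptions of this sketch; the text does not give threshold values or say which condition takes priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    low: float   # "first" threshold: a value below it indicates incorrect
    high: float  # "second" threshold: a value above it indicates correct


def classify(diff_sm, spec_sm, diff_ratio, thr_diff, thr_spec, thr_ratio):
    """Return "incorrect", "correct", or "undecided" for an initial pitch period."""
    pairs = ((diff_sm, thr_diff), (spec_sm, thr_spec), (diff_ratio, thr_ratio))
    for _, thr in pairs:
        # The second threshold must be greater than the first one.
        if thr.high <= thr.low:
            raise ValueError("second threshold must exceed first threshold")
    # Incorrectness condition: at least one parameter below its first threshold.
    if any(value < thr.low for value, thr in pairs):
        return "incorrect"
    # Correctness condition: at least one parameter above its second threshold.
    if any(value > thr.high for value, thr in pairs):
        return "correct"
    return "undecided"
```

Keeping the second threshold above the first leaves a band in which neither condition fires, so small fluctuations of a parameter near a single boundary do not flip the decision.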
[0027] Generally, if the initial pitch period detected in the time domain is correct, there must be a peak at the frequency bin corresponding to the initial pitch period, and the energy there is high; if the initial pitch period detected in the time domain is incorrect, fine detection may be further performed in the frequency domain so as to determine a correct pitch period.
[0028] In other words, when the initial pitch period is found to be incorrect while its correctness is being determined according to the pitch period correctness decision parameter, fine detection is performed on the initial pitch period.
[0029] Alternatively, when the initial pitch period is found to be incorrect while its correctness is being determined according to the pitch period correctness decision parameter, energy of the initial pitch period is detected in a low-frequency range, and short-pitch detection (a manner of fine detection) is performed when the energy meets a low-frequency energy determining condition.
[0030] Therefore, it can be learned that the method for detecting correctness of a pitch period according to this embodiment of the present invention can improve, based on a relatively less complex algorithm, accuracy of detecting correctness of a pitch period.
[0031] The following describes in detail a specific embodiment, which includes the following steps.
1. Perform an N-point FFT on an input signal s(n) to convert the input signal from the time domain to the frequency domain and obtain the corresponding amplitude spectrum S(k) in the frequency domain, where N = 256, 512, or the like.
Specifically, the amplitude spectrum S(k) may be obtained in the following steps:
Step A1. Preprocess the input signal s(n) to obtain a preprocessed input signal s_pre(n), where the preprocessing may be processing such as high-pass filtering, re-sampling, or pre-weighting. Only the pre-weighting processing is described herein by using an example. The preprocessed input signal s_pre(n) is obtained after the input signal s(n) passes a first-order high-pass filter, where the high-pass filter has a filter factor
$$H_{pre\_emph}(z) = 1 - 0.68 z^{-1}.$$
Step A2. Perform an FFT on the preprocessed input signal s_pre(n). In an embodiment, the FFT is performed on the preprocessed input signal s_pre(n) twice: once on the preprocessed input signal of the current frame, and once on a preprocessed input signal that includes the second half of the current frame and the first half of a future frame. Before the FFT is performed, the preprocessed input signal needs to be windowed, where the window function is
$$w_{FFT}(n) = \sqrt{0.5 - 0.5\cos\left(\frac{2\pi n}{L_{FFT}}\right)} = \sin\left(\frac{\pi n}{L_{FFT}}\right), \quad n = 0, \ldots, L_{FFT} - 1,$$
and L_FFT is the length of the FFT. The windowed signals obtained after a first analyzing window and a second analyzing window are applied to the preprocessed input signal are
$$s_{wnd}^{[0]}(n) = w_{FFT}(n)\, s_{pre}(n), \quad n = 0, \ldots, L_{FFT} - 1,$$
$$s_{wnd}^{[1]}(n) = w_{FFT}(n)\, s_{pre}(n + L_{FFT}/2), \quad n = 0, \ldots, L_{FFT} - 1,$$
where the first analyzing window corresponds to the current frame, and the second analyzing window corresponds to the second half of the current frame and the first half of the future frame.
The FFT is performed on the windowed signals to obtain the spectral coefficients:
$$X^{[0]}(k) = \sum_{n=0}^{N-1} s_{wnd}^{[0]}(n)\, e^{-j 2\pi k n / N}, \quad k = 0, \ldots, K-1, \quad N = L_{FFT},$$
$$X^{[1]}(k) = \sum_{n=0}^{N-1} s_{wnd}^{[1]}(n)\, e^{-j 2\pi k n / N}, \quad k = 0, \ldots, K-1, \quad N = L_{FFT},$$
where K < L_FFT/2.
The first half of the future frame comes from a next-frame (look-ahead) signal that is encoded in the time domain, and the input signal may be adjusted according to the quantity of next-frame signals. The purpose of performing the FFT twice is to obtain more precise frequency domain information. In another embodiment, the FFT may also be performed on the preprocessed input signal s_pre(n) only once.
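The windowing and transform of step A2 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names are hypothetical, the transform is written as a direct DFT with the usual negative-exponent convention (an assumption, since the text only says that an FFT is performed), and it is practical only for small L_FFT.

```python
import cmath
import math


def sqrt_hann_window(l_fft):
    """Square-root Hann window: sqrt(0.5 - 0.5*cos(2*pi*n/L)) = sin(pi*n/L)."""
    return [math.sin(math.pi * n / l_fft) for n in range(l_fft)]


def windowed_spectra(s_pre, l_fft, k_bins):
    """Return (x0, x1), the first k_bins DFT coefficients of both windows.

    x0 covers the current frame; x1 covers the second half of the current
    frame plus the first half of the look-ahead frame.
    """
    half = l_fft // 2
    if len(s_pre) < l_fft + half:
        raise ValueError("need the current frame plus half a frame of look-ahead")
    w = sqrt_hann_window(l_fft)

    def dft(frame):
        # Direct DFT of one windowed frame, O(L * K); an FFT would be used in practice.
        return [
            sum(frame[n] * cmath.exp(-2j * math.pi * k * n / l_fft)
                for n in range(l_fft))
            for k in range(k_bins)
        ]

    frame0 = [w[n] * s_pre[n] for n in range(l_fft)]
    frame1 = [w[n] * s_pre[n + half] for n in range(l_fft)]
    return dft(frame0), dft(frame1)
```

For a stationary tone both windows produce nearly the same spectrum; for a signal that changes within the frame, the second window captures the more recent half, which is the extra frequency domain information the text refers to.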
Step A3. Calculate, based on the spectral coefficient, an energy spectrum:
$$E(0) = \eta\left(X_R^2(0) + X_R^2(L_{FFT}/2)\right),$$
$$E(k) = \eta\left(X_R^2(k) + X_I^2(k)\right), \quad k = 1, \ldots, K-1,$$
where X_R(k) and X_I(k) denote the real part and the imaginary part of the kth frequency bin respectively, and η is a constant which may be, for example, 4/(L_FFT · L_FFT).
Step A4. Perform weighting processing on the energy spectrum:
$$E(k) = a E^{[0]}(k) + (1 - a) E^{[1]}(k), \quad k = 0, \ldots, K-1, \quad a < 1.$$
Herein, E^{[0]}(k) is the energy spectrum, calculated according to the formula in step A3, of the spectral coefficient X^{[0]}(k), and E^{[1]}(k) is the energy spectrum, calculated according to the formula in step A3, of the spectral coefficient X^{[1]}(k).
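Steps A3 and A4 can be sketched as below, taking step A3 to mean E(0) = η(X_R²(0) + X_R²(L_FFT/2)) and E(k) = η(X_R²(k) + X_I²(k)) for the remaining bins, with η = 4/(L_FFT · L_FFT). That reading, the function names, and the weight value used in the example are assumptions of this sketch.

```python
def energy_spectrum(x, l_fft, k_bins):
    """Energy per bin from complex spectral coefficients x.

    x must be indexable up to l_fft // 2, because bin 0 also uses the
    real part of the coefficient at index l_fft // 2.
    """
    if len(x) <= l_fft // 2:
        raise ValueError("x must include the coefficient at index l_fft // 2")
    eta = 4.0 / (l_fft * l_fft)
    energy = [eta * (x[0].real ** 2 + x[l_fft // 2].real ** 2)]
    energy.extend(
        eta * (x[k].real ** 2 + x[k].imag ** 2) for k in range(1, k_bins)
    )
    return energy


def weighted_energy(e0, e1, a):
    """Combine the two per-window energy spectra: a*E0 + (1 - a)*E1."""
    if len(e0) != len(e1):
        raise ValueError("energy spectra must have the same length")
    return [a * u + (1.0 - a) * v for u, v in zip(e0, e1)]
```

With a = 0.5 (a hypothetical value) the two analysis windows contribute equally; a larger a gives more weight to the window covering the current frame.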
Step A5. Calculate an amplitude spectrum of a logarithm domain.
k = 0,...,K-1, where Θ is a constant which may be, for example, 2; and ε is a relatively small positive number to prevent a logarithm value from overflowing. Alternatively, log10 may be replaced by log_e in a project implementation.
2. Perform open-loop detection on the input signal in the time domain to obtain an initial pitch period T_op, steps of which are as follows:
Step B1. Convert the input signal s(n) to a perceptual weighted signal:
$$s_w(n) = s(n) + \sum_{i=1}^{p} a_i \gamma_1^i s(n-i) - \sum_{i=1}^{p} a_i \gamma_2^i s_w(n-i), \quad n = 0, \ldots, N-1,$$
where a_i is an LP (linear prediction) coefficient, γ1 and γ2 are perceptual weighting factors, p is the order of the perceptual filter, and N is the frame length.
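The recursion of step B1 can be sketched as a direct-form filter: s_w(n) equals s(n) plus the LP-weighted past inputs scaled by powers of γ1, minus the LP-weighted past outputs scaled by powers of γ2. The function name is hypothetical, and treating samples before the start of the frame as zero is an assumption of this sketch; a real codec would carry filter memory across frames.

```python
def perceptual_weight(s, lp, gamma1, gamma2):
    """Apply s_w(n) = s(n) + sum a_i*g1^i*s(n-i) - sum a_i*g2^i*s_w(n-i).

    s      -- input frame
    lp     -- LP coefficients a_1 .. a_p
    gamma1 -- weighting factor on the feed-forward part
    gamma2 -- weighting factor on the feedback part
    """
    p = len(lp)
    num = [lp[i] * gamma1 ** (i + 1) for i in range(p)]
    den = [lp[i] * gamma2 ** (i + 1) for i in range(p)]
    sw = []
    for n in range(len(s)):
        acc = s[n]
        for i in range(1, p + 1):
            if n - i >= 0:  # samples before the frame are taken as zero
                acc += num[i - 1] * s[n - i] - den[i - 1] * sw[n - i]
        sw.append(acc)
    return sw
```

When γ1 equals γ2 the feed-forward and feedback parts cancel and the signal passes through unchanged, which is a quick sanity check on the sign conventions.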
Step B2. Search for a greatest value in each of three candidate detection ranges (for example, in a lower sampling domain, the three candidate detection ranges may be [62 115]; [32 61]; and [17 31]) by using a correlation function, and use the greatest values as candidate pitches:
where k is a value in a candidate detection range of a pitch period, for example, k may be a value in the three candidate detection ranges.
Step B3. Separately calculate normalized correlation coefficients of the three candidate pitches:
i = 1,...,3
Step B4. Select an open-loop initial pitch period Top by comparing the normalized correlation coefficients of the ranges: Firstly, a period of a first candidate pitch is used as an initial pitch period. Then, if a normalized correlation coefficient of a second candidate pitch is greater than or equal to a product of a normalized correlation coefficient of the initial pitch period and a fixed ratio factor, a period of the second candidate pitch is used as the initial pitch period; otherwise, the initial pitch period does not change. Finally, if a normalized correlation coefficient of a third candidate pitch is greater than or equal to a product of the normalized correlation coefficient of the initial pitch period and the fixed ratio factor, a period of the third candidate pitch is used as the initial pitch period; otherwise, the initial pitch period does not change. Refer to the following program expression:
end
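The selection rule described in Step B4 can be sketched in C as follows. This is an illustration written from the prose above, not the original program expression; the function name `select_open_loop_pitch` and the array layout are assumptions, and the value of the fixed ratio factor is left to the caller because the text does not state it.

```c
#include <stddef.h>

/* Step B4 sketch: start from the first candidate and switch to a later
 * candidate whenever its normalized correlation coefficient is greater
 * than or equal to ratio * (coefficient of the current initial pitch).
 * t[]    : periods of the three candidate pitches
 * r[]    : their normalized correlation coefficients
 * ratio  : fixed ratio factor (value not given in the text) */
int select_open_loop_pitch(const int t[3], const double r[3], double ratio)
{
    int t_op = t[0];
    double r_op = r[0];
    for (size_t i = 1; i < 3; i++) {
        if (r[i] >= r_op * ratio) {
            t_op = t[i];   /* candidate i becomes the initial pitch period */
            r_op = r[i];   /* later comparisons use its coefficient */
        }
    }
    return t_op;
}
```

Note that the comparison for the third candidate uses the coefficient of whichever candidate is the initial pitch period at that point, which matches the wording of Step B4.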
It can be understood that no limitation is imposed on a sequence of the foregoing steps of obtaining the amplitude spectrum S(k) and the initial pitch period T_op. The steps may be performed at the same time, or either step may be performed first.
3. Obtain a pitch frequency bin F_op according to a quantity N of points of the FFT and the initial pitch period T_op:
F_op = N/T_op
4. Calculate a sum Spec_sum of spectral amplitudes and a sum Diff_sum of spectral amplitude differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin F_op, where the quantity of frequency bins on the two sides of the pitch frequency bin F_op may be preset.
Herein, the sum Spec_sum of the spectral amplitudes is a sum of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin, and the sum Diff_sum of spectral amplitude differences is a sum of spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin, where spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin F_op and a spectral amplitude of the pitch frequency bin. The sum Spec_sum of spectral amplitudes and the sum Diff_sum of spectral amplitude differences may be expressed in the following program expression:
Spec_sum[0] = 0;
Diff_sum[0] = 0;
for (i = 1; i < 2*F_op; i++) {
    Spec_sum[i] = Spec_sum[i-1] + S[i];
    Diff_sum[i] = Diff_sum[i-1] + (S[F_op] - S[i]);
}
where i is a sequence number of a frequency bin. In a project implementation, an initial value of i may be set to 2, so as to avoid low-frequency interference of a lowest coefficient.
5. Determine an average spectral amplitude parameter Spec_sm, a spectral difference parameter Diff_sm, and a difference-to-amplitude ratio parameter Diff_ratio.
The average spectral amplitude parameter Spec_sm may be an average spectral amplitude Spec_avg of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin F_op, that is, the sum Spec_sum of spectral amplitudes divided by the quantity of all frequency bins of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin F_op:
Spec_avg = Spec_sum/(2*F_op - 1).
Further, the average spectral amplitude parameter Spec_sm may also be a weighted and smoothed value of the average spectral amplitude Spec_avg of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin F_op:
Spec_sm = 0.2*Spec_sm_pre + 0.8*Spec_avg, where Spec_sm_pre is a parameter being a weighted and smoothed value of an average spectral amplitude of a previous frame. In this case, 0.2 and 0.8 are weighting and smoothing coefficients. Different weighting and smoothing coefficients may be selected according to different features of input signals.
The spectral difference parameter Diff_sm may be a sum Diff_sum of spectral amplitude differences or a weighted and smoothed value of the sum Diff_sum of spectral amplitude differences:
Diff_sm = 0.4*Diff_sm_pre + 0.6*Diff_sum, where Diff_sm_pre is a parameter being a weighted and smoothed value of a spectral difference of a previous frame. Here, 0.4 and 0.6 are weighting and smoothing coefficients. Different weighting and smoothing coefficients may be selected according to different features of input signals.
As can be learned from the above, generally, a weighted and smoothed value Spec_sm of an average spectral amplitude parameter of a current frame is determined based on a weighted and smoothed value Spec_sm_pre of an average spectral amplitude parameter of a previous frame, and a weighted and smoothed value Diff_sm of a spectral difference parameter of the current frame is determined based on a weighted and smoothed value Diff_sm_pre of a spectral difference parameter of the previous frame.
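Steps 3 to 5 up to this point can be gathered into one routine, sketched below in C. The type and function names (`SmoothState`, `SpectralParams`, `spectral_params`) are hypothetical, the running sums are kept as scalars instead of the arrays used in the program expression, integer division is assumed for F_op = N/T_op, and the smoothing coefficients are the example values 0.2/0.8 and 0.4/0.6 given above.

```c
#include <stddef.h>

typedef struct {
    double spec_sm_pre;  /* Spec_sm of the previous frame */
    double diff_sm_pre;  /* Diff_sm of the previous frame */
} SmoothState;

typedef struct {
    double spec_sum, diff_sum, spec_avg, spec_sm, diff_sm;
} SpectralParams;

/* S must hold at least 2*F_op amplitude values, where F_op = n_fft / t_op.
 * The state is updated so that the next frame can use the smoothed values. */
SpectralParams spectral_params(const double *S, int n_fft, int t_op,
                               SmoothState *st)
{
    SpectralParams p = {0};
    int f_op = (t_op > 0) ? n_fft / t_op : 0;  /* integer division: assumption */
    if (f_op < 1)
        return p;                              /* nothing to measure */

    for (int i = 1; i < 2 * f_op; i++) {
        p.spec_sum += S[i];                    /* Spec_sum */
        p.diff_sum += S[f_op] - S[i];          /* Diff_sum */
    }
    p.spec_avg = p.spec_sum / (2 * f_op - 1);  /* Spec_avg */

    p.spec_sm = 0.2 * st->spec_sm_pre + 0.8 * p.spec_avg;
    p.diff_sm = 0.4 * st->diff_sm_pre + 0.6 * p.diff_sum;

    st->spec_sm_pre = p.spec_sm;               /* stored for the next frame */
    st->diff_sm_pre = p.diff_sm;
    return p;
}
```

Keeping the previous-frame values in a small state structure is one way to implement the temporary storage of Spec_sm_pre and Diff_sm_pre between frames.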
The difference-to-amplitude ratio parameter Diff_ratio is a ratio of the sum Diff_sum of spectral amplitude differences to the average spectral amplitude Spec_avg.
Diff_ratio = Diff_sum/Spec_avg.
6. According to the average spectral amplitude parameter Spec_sm, the spectral difference parameter Diff_sm, and the difference-to-amplitude ratio parameter Diff_ratio, determine whether the initial pitch period T_op is correct, and determine whether to change a correctness flag T_flag.
For example, when the spectral difference parameter Diff_sm is less than a first difference parameter threshold Diff_thr1, the average spectral amplitude parameter Spec_sm is less than a first spectral amplitude parameter threshold Spec_thr1, and the difference-to-amplitude ratio parameter Diff_ratio is less than a first ratio factor parameter threshold ratio_thr1, it is determined that the correctness flag T_flag is 1, and it is determined that the initial pitch period is incorrect according to the correctness flag. For another example, when the spectral difference parameter Diff_sm is greater than a second difference parameter threshold Diff_thr2, the average spectral amplitude parameter Spec_sm is greater than a second spectral amplitude parameter threshold Spec_thr2, and the difference-to-amplitude ratio parameter Diff_ratio is greater than a second ratio factor parameter threshold ratio_thr2, it is determined that the correctness flag T_flag is 0, and it is determined that the initial pitch period is correct according to the correctness flag. If not all correctness determining conditions are met and not all incorrectness determining conditions are met, an original flag T_flag remains unchanged.
It should be understood that, the first difference parameter threshold Diff_thr1, the first spectral amplitude parameter threshold Spec_thr1, the first ratio factor parameter threshold ratio_thr1, the second difference parameter threshold Diff_thr2, the second spectral amplitude parameter threshold Spec_thr2, and the second ratio factor parameter threshold ratio_thr2 may be selected according to a requirement.
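The flag update of step 6 may be sketched in C as follows. The sketch follows the example above in which all three comparisons must hold together; the structure `PitchThresholds` and the function `update_t_flag` are hypothetical names, and no threshold values are implied because the text leaves them to be selected according to a requirement.

```c
typedef struct {
    double diff_thr1, spec_thr1, ratio_thr1;  /* incorrectness thresholds */
    double diff_thr2, spec_thr2, ratio_thr2;  /* correctness thresholds */
} PitchThresholds;

/* Returns the new value of T_flag:
 *   1 when the incorrectness determining condition is met,
 *   0 when the correctness determining condition is met,
 *   the incoming t_flag otherwise (flag remains unchanged). */
int update_t_flag(int t_flag, double diff_sm, double spec_sm,
                  double diff_ratio, const PitchThresholds *thr)
{
    if (diff_sm < thr->diff_thr1 &&
        spec_sm < thr->spec_thr1 &&
        diff_ratio < thr->ratio_thr1)
        return 1;  /* initial pitch period judged incorrect */

    if (diff_sm > thr->diff_thr2 &&
        spec_sm > thr->spec_thr2 &&
        diff_ratio > thr->ratio_thr2)
        return 0;  /* initial pitch period judged correct */

    return t_flag;
}
```

Here `diff_ratio` is expected to be Diff_sum/Spec_avg computed by the caller, which keeps the division (and any guard against a zero average) outside the decision logic.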
For an incorrect initial pitch period detected according to the foregoing method, fine detection may be performed on the foregoing detection result, so as to avoid a detection error of the foregoing method.
In addition, energy in a low-frequency range may be further detected, so as to further detect the correctness of the initial pitch period. Short-pitch detection may be further performed on a detected incorrect pitch period.
7.1. Whether energy of the initial pitch period is very small in a low-frequency range may be further detected for the initial pitch period. When detected energy meets a low-frequency energy determining condition, the short-pitch detection is performed. Specifically, the low-frequency energy determining condition specifies two relative low-frequency energy values: one representing that the low-frequency energy is relatively very small, and one representing that the low-frequency energy is relatively large. Therefore, when the detected energy indicates that the low-frequency energy is relatively very small, the correctness flag T_flag is set to 1; and when the detected energy indicates that the low-frequency energy is relatively large, the correctness flag T_flag is set to 0. If the detected energy does not meet the low-frequency energy determining condition, the original flag T_flag remains unchanged. When the correctness flag T_flag is set to 1, the short-pitch detection is performed. In addition to specifying the relative low-frequency energy values, the low-frequency energy determining condition may also specify another combination of conditions to increase robustness of the low-frequency energy determining condition.
For example, two frequency bins f_low1 and f_low2 are first set, energy values energy1 and energy2 of the initial pitch period in the ranges between 0 and f_low1 and between f_low1 and f_low2 are calculated separately, and then an energy difference between energy1 and energy2 is calculated: energy_diff = energy2 - energy1. Further, the energy difference may be weighted, and a weighting factor may be a voicing degree factor voice_factor, that is, energy_diff_w = energy_diff * voice_factor. Generally, a weighted energy difference may be further smoothed, and a result of the smoothing is compared with a preset threshold to determine whether the energy of the initial pitch period in the low-frequency range is missing.
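The example above can be sketched in C as follows. The text does not specify how the band energy is measured or how the smoothing is done, so the sketch assumes a sum of squared spectral amplitudes per band and a first-order smoother with a hypothetical coefficient `alpha`; the function names are likewise assumptions.

```c
#include <stddef.h>

/* Energy of bins lo..hi-1, measured as a sum of squared amplitudes
 * (the energy measure is an assumption). */
static double band_energy(const double *S, size_t lo, size_t hi)
{
    double e = 0.0;
    for (size_t k = lo; k < hi; k++)
        e += S[k] * S[k];
    return e;
}

/* Returns the smoothed, weighted energy difference for the current frame.
 * *smooth_pre carries the smoothed value of the previous frame and is
 * updated in place. alpha is a hypothetical smoothing coefficient. */
double low_freq_energy_diff(const double *S, size_t f_low1, size_t f_low2,
                            double voice_factor, double alpha,
                            double *smooth_pre)
{
    double energy1 = band_energy(S, 0, f_low1);       /* 0 .. f_low1 */
    double energy2 = band_energy(S, f_low1, f_low2);  /* f_low1 .. f_low2 */
    double energy_diff = energy2 - energy1;
    double energy_diff_w = energy_diff * voice_factor;
    double smoothed = alpha * (*smooth_pre) + (1.0 - alpha) * energy_diff_w;
    *smooth_pre = smoothed;
    return smoothed;
}
```

The returned value would then be compared with a preset threshold by the caller to decide whether the low-frequency energy is missing.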
Alternatively, the foregoing algorithm is simplified, so that low-frequency energy of the initial pitch period in a range is directly obtained, then the low-frequency energy is weighted and smoothed, and a result of the smoothing is compared with a preset threshold.
7.2. Perform the short-pitch detection, and determine, according to the correctness flag T_flag or according to the correctness flag T_flag in combination with another condition, whether to replace the initial pitch period T_op with a result of the short-pitch detection. Alternatively, before the short-pitch detection is performed, whether it is necessary to perform the short-pitch detection may be first determined according to the correctness flag T_flag or according to the correctness flag T_flag in combination with another condition.
The short-pitch detection may be performed in the frequency domain, or may be performed in the time domain. For example, in the time domain, a detection range of the pitch period is generally from 34 to 231; to perform the short-pitch detection is to search for a pitch period in a range less than 34, and a method used may be a time domain autocorrelation function method: R(T) = MAX{R(t), t < 34}. If R(T) is greater than a preset threshold or greater than an autocorrelation value corresponding to the initial pitch period, and T_flag is 1 (another condition may also be added here), T may be considered as a detected short-pitch period.
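A time-domain short-pitch search of this kind may be sketched in C as follows. The lower search bound `t_min` (assumed to be at least 1), the plain autocorrelation over the current frame, and the function names are assumptions for illustration; the text only fixes the upper bound t < 34 and the acceptance rule.

```c
#include <stddef.h>

/* Plain autocorrelation over the current frame (assumed form). */
static double autocorr(const double *x, size_t n_len, size_t lag)
{
    double r = 0.0;
    for (size_t n = lag; n < n_len; n++)
        r += x[n] * x[n - lag];
    return r;
}

/* Searches lags t_min..33 for the maximum R(T). T is accepted as the short
 * pitch period when T_flag is 1 and R(T) exceeds either the preset
 * threshold or the autocorrelation at the initial pitch period t_op.
 * Otherwise t_op is returned unchanged. */
size_t short_pitch_detect(const double *x, size_t n_len, size_t t_op,
                          int t_flag, size_t t_min, double threshold)
{
    if (t_flag != 1 || t_min < 1 || t_min >= 34)
        return t_op;

    size_t best = t_min;
    double best_r = autocorr(x, n_len, t_min);
    for (size_t t = t_min + 1; t < 34; t++) {
        double r = autocorr(x, n_len, t);
        if (r > best_r) {
            best_r = r;
            best = t;
        }
    }
    if (best_r > threshold || best_r > autocorr(x, n_len, t_op))
        return best;
    return t_op;
}
```

For a pulse train with a true period of 25 samples and an initial pitch period of 50 (a doubled period), the routine returns 25 when T_flag is 1 and leaves the period at 50 when T_flag is 0.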
In addition to the short-pitch detection, multiplied-frequency detection may also be performed. If the correctness flag T_flag is 1, it is indicated that the initial pitch period T_op is incorrect, and therefore the multiplied-frequency pitch detection may be performed at a multiplied-frequency location of the initial pitch period T_op, where a multiplied-frequency pitch period may be an integral multiple of the initial pitch period T_op, or may be a fractional multiple of the initial pitch period T_op.
For step 7.1 and step 7.2, only step 7.2 may be performed to simplify the process of the fine detection.
8. All of the steps 1 to 7.2 are performed for a current frame. After the current frame is processed, a next frame needs to be processed. Therefore, for the next frame, the average spectral amplitude parameter Spec_sm and the spectral difference parameter Diff_sm of the current frame are used as the parameter Spec_sm_pre, being a weighted and smoothed value of an average spectral amplitude of a previous frame, and the parameter Diff_sm_pre, being a weighted and smoothed value of a spectral difference of the previous frame, and are temporarily stored to implement parameter smoothing of the next frame.
[0032] Therefore, it can be learned that in this embodiment of the present invention, after an initial pitch period is obtained during open-loop detection, correctness of the initial pitch period is detected in a frequency domain, and if it is detected that the initial pitch period is incorrect, the initial pitch period is corrected by using fine detection, so as to ensure the correctness of the initial pitch period. In the method for detecting correctness of an initial pitch period, a spectral difference parameter, an average spectral amplitude (or spectral energy) parameter and a difference-to-amplitude ratio parameter of a predetermined quantity of frequency bins on two sides of a pitch frequency bin need to be extracted. Because complexity of extracting these parameters is low, this embodiment of the present invention can ensure that a pitch period with relatively high correctness is output based on a less complex algorithm. In conclusion, the method for detecting correctness of a pitch period according to this embodiment of the present invention can improve, based on a relatively less complex algorithm, accuracy of detecting correctness of a pitch period.
[0033] The following describes apparatuses for detecting correctness of a pitch period according to embodiments of the present invention in detail with reference to FIG. 2 to FIG. 4.
[0034] In FIG. 2, an apparatus 20 for detecting correctness of a pitch period includes a pitch frequency bin determining unit 21, a parameter generating unit 22, and a correctness determining unit 23.
[0035] The pitch frequency bin determining unit 21 is configured to determine, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, where the initial pitch period is obtained by performing open-loop detection on the input signal. Specifically, the pitch frequency bin determining unit 21 determines the pitch frequency bin in the following manner: the pitch frequency bin of the input signal is inversely proportional to the initial pitch period, and is directly proportional to a quantity of points of an FFT performed on the input signal.
[0036] The parameter generating unit 22 is configured to determine, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal. The pitch period correctness decision parameter generated by the parameter generating unit 22 includes a spectral difference parameter Diff_sm, an average spectral amplitude parameter Spec_sm, and a difference-to-amplitude ratio parameter Diff_ratio. The spectral difference parameter Diff_sm is a sum Diff_sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin or a weighted and smoothed value of the sum Diff_sum of the spectral differences of the predetermined quantity of frequency bins on two sides of the pitch frequency bin. The average spectral amplitude parameter Spec_sm is an average Spec_avg of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin or a weighted and smoothed value of the average Spec_avg of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin. The difference-to-amplitude ratio parameter Diff_ratio is a ratio of the sum Diff_sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average Spec_avg of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin.
[0037] The correctness determining unit 23 is configured to determine correctness of the initial pitch period according to the pitch period correctness decision parameter.
[0038] Specifically, when the correctness determining unit 23 determines that the pitch period correctness decision parameter meets a correctness determining condition, the correctness determining unit 23 determines that the initial pitch period is correct; or, when the correctness determining unit 23 determines that the pitch period correctness decision parameter meets an incorrectness determining condition, the correctness determining unit 23 determines that the initial pitch period is incorrect.
[0039] Herein, the incorrectness determining condition meets at least one of the following: the spectral difference parameter Diff_sm is less than a first difference parameter threshold, the average spectral amplitude parameter Spec_sm is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter Diff_ratio is less than a first ratio factor parameter threshold.
[0040] The correctness determining condition meets at least one of the following: the spectral difference parameter Diff_sm is greater than a second difference parameter threshold, the average spectral amplitude parameter Spec_sm is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter Diff_ratio is greater than a second ratio factor parameter threshold.
[0041] Optionally, as shown in FIG. 3, compared with the apparatus 20, an apparatus 30 for detecting correctness of a pitch period further includes a fine detecting unit 24, configured to perform fine detection on the input signal when the initial pitch period is detected to be incorrect during the detection of the correctness of the initial pitch period according to the pitch period correctness decision parameter.
[0042] Optionally, as shown in FIG. 4, compared with the apparatus 30, an apparatus 40 for detecting correctness of a pitch period may further include an energy detecting unit 25, configured to detect energy of the initial pitch period in a low-frequency range when the initial pitch period is detected to be incorrect during the detection of the correctness of the initial pitch period according to the pitch period correctness decision parameter. Then, the fine detecting unit 24 performs short-pitch detection on the input signal when the energy detecting unit 25 detects that the energy meets a low-frequency energy determining condition.
[0043] Therefore, it can be learned that the apparatus for detecting correctness of a pitch period according to this embodiment of the present invention can improve, based on a relatively less complex algorithm, accuracy of detecting correctness of a pitch period.
[0044] Referring to FIG. 5, in another embodiment, an apparatus for detecting correctness of a pitch period includes: a receiver, configured to receive an input signal; and a processor, configured to determine a pitch frequency bin of the input signal according to an initial pitch period of the input signal in a time domain, where the initial pitch period is obtained by performing open-loop detection on the input signal; determine, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and determine correctness of the initial pitch period according to the pitch period correctness decision parameter.
[0045] It should be understood that, the processor may implement each step in the foregoing method embodiments.
[0046] A person of ordinary skill in the art may be aware that, in combination with the examples described in the embodiments disclosed in this specification, units and algorithm steps may be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether the functions are performed by hardware or software depends on particular applications and design constraint conditions of the technical solutions. A person skilled in the art may use different methods to implement the described functions for each particular application, but it should not be considered that the implementation goes beyond the scope of the present invention.
[0047] It may be clearly understood by a person skilled in the art that, for the purpose of convenient and brief description, for a detailed working process of the foregoing system, apparatus, and unit, reference may be made to a corresponding process in the foregoing method embodiments, and details are not described herein again.
[0048] In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described apparatus embodiment is merely exemplary. For example, the unit division is merely logical function division and may be other division in actual implementation. For example, a plurality of units or components may be combined or integrated into another system. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
[0049] The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. A part or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
[0050] In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
[0051] When the functions are implemented in a form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or a part of the technical solutions may be implemented in a form of a software product. The software product is stored in a storage medium, and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or a part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes: any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
[0052] The foregoing descriptions are merely specific implementation manners of the present invention, but are not intended to limit the protection scope of the present invention.
Claims
1. A method for determining correctness of a pitch period, comprising: determining (11), according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; determining (12), based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and determining (13) correctness of the initial pitch period according to the pitch period correctness decision parameter; the method is characterized in that: the pitch period correctness decision parameter comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter, the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; where spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
2. The method according to claim 1, wherein the determining correctness of the initial pitch period according to the pitch period correctness decision parameter comprises: when the pitch period correctness decision parameter meets a correctness determining condition, determining that the initial pitch period is correct; and when the pitch period correctness decision parameter meets an incorrectness determining condition, determining that the initial pitch period is incorrect.
3. The method according to claim 2, wherein: the correctness determining condition meets at least one of the following: the spectral difference parameter is greater than a second difference parameter threshold, the average spectral amplitude parameter is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is greater than a second ratio factor parameter threshold; and the incorrectness determining condition meets at least one of the following: the spectral difference parameter is less than a first difference parameter threshold, the average spectral amplitude parameter is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is less than a first ratio factor parameter threshold.
4. The method according to any one of claims 1 to 3, wherein: the pitch frequency bin of the input signal is inversely proportional to the initial pitch period, and is directly proportional to a quantity of points of a fast Fourier transform performed on the input signal.
5. An apparatus for determining correctness of a pitch period, comprising: a pitch frequency bin determining unit (21), configured to determine, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; a parameter generating unit (22), configured to determine, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and a correctness determining unit (23), configured to determine correctness of the initial pitch period according to the pitch period correctness decision parameter, the apparatus is characterized in that: the pitch period correctness decision parameter generated by the parameter generating unit comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter, the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; where spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
6. The apparatus according to claim 5, wherein the correctness determining unit (23) is specifically configured to: when it is determined that the pitch period correctness decision parameter meets a correctness determining condition, determine that the initial pitch period is correct; and when it is determined that the pitch period correctness decision parameter meets an incorrectness determining condition, determine that the initial pitch period is incorrect.
7. The apparatus according to claim 6, wherein: the correctness determining condition meets at least one of the following: the spectral difference parameter is greater than a second difference parameter threshold, the average spectral amplitude parameter is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is greater than a second ratio factor parameter threshold; and the incorrectness determining condition meets at least one of the following: the spectral difference parameter is less than a first difference parameter threshold, the average spectral amplitude parameter is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is less than a first ratio factor parameter threshold.
8. The apparatus according to any one of claims 5 to 7, wherein: the pitch frequency bin of the input signal is inversely proportional to the initial pitch period, and is directly proportional to a quantity of points of a fast Fourier transform performed on the input signal.
Patent claims (translated from the German)
1. A method for determining the correctness of a pitch period, comprising: determining (11), according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; determining (12), based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter of the input signal that is associated with the pitch frequency bin; and determining (13) the correctness of the initial pitch period according to the pitch period correctness decision parameter; wherein the method is characterized in that: the pitch period correctness decision parameter comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter, wherein the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; wherein spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
2. The method according to claim 1, wherein the determining of the correctness of the initial pitch period according to the pitch period correctness decision parameter comprises: when the pitch period correctness decision parameter meets a correctness determining condition, determining that the initial pitch period is correct; and when the pitch period correctness decision parameter meets an incorrectness determining condition, determining that the initial pitch period is incorrect.
3. The method according to claim 2, wherein the correctness determining condition meets at least one of the following: the spectral difference parameter is greater than a second difference parameter threshold, the average spectral amplitude parameter is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is greater than a second ratio factor parameter threshold; and the incorrectness determining condition meets at least one of the following: the spectral difference parameter is less than a first difference parameter threshold, the average spectral amplitude parameter is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is less than a first ratio factor parameter threshold.
4. The method according to any one of claims 1 to 2, wherein the pitch frequency bin of the input signal is inversely proportional to the initial pitch period and directly proportional to a quantity of points of a fast Fourier transform performed on the input signal.
5. An apparatus for determining the correctness of a pitch period, comprising: a pitch frequency bin determining unit (21), configured to determine, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; a parameter generating unit (22), configured to determine, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter of the input signal that is associated with the pitch frequency bin; and a correctness determining unit (23), configured to determine the correctness of the initial pitch period according to the pitch period correctness decision parameter, wherein the apparatus is characterized in that the pitch period correctness decision parameter generated by the parameter generating unit comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter, wherein the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; wherein spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
6. The apparatus according to claim 5, wherein the correctness determining unit (23) is specifically configured to: when it is determined that the pitch period correctness decision parameter meets a correctness determining condition, determine that the initial pitch period is correct; and when it is determined that the pitch period correctness decision parameter meets an incorrectness determining condition, determine that the initial pitch period is incorrect.
7. The apparatus according to claim 6, wherein the correctness determining condition meets at least one of the following: the spectral difference parameter is greater than a second difference parameter threshold, the average spectral amplitude parameter is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is greater than a second ratio factor parameter threshold; and the incorrectness determining condition meets at least one of the following: the spectral difference parameter is less than a first difference parameter threshold, the average spectral amplitude parameter is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is less than a first ratio factor parameter threshold.
8. 
Vorrichtung nach einem der Ansprüche 5 bis 7, wobei das Tonhöhenfrequenz-Bin des Eingangssignals umgekehrt proportional zu der anfänglichen Tonhöhenperiode und direkt proportional zu einer Anzahl von Punkten einer an dem Eingangssignal vorgenommenen schnellen Fourier-Transformation ist.
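The three decision parameters named in claims 1 and 5 can be illustrated with a minimal Python sketch. Several details are assumptions, not taken from the claims: the function and argument names are hypothetical, the "predetermined quantity" is read as a count of bins on each side, the pitch bin itself is excluded from the average, each difference is taken as pitch-bin amplitude minus neighbor amplitude, and the one-pole smoother with weight `alpha` is only one possible reading of "weighted and smoothed".

```python
def correctness_parameters(amplitude, pitch_bin, bins_per_side):
    """Return (diff_sum, avg_amp, ratio) for the bins around pitch_bin.

    amplitude     -- amplitude spectrum as a sequence of non-negative numbers
    pitch_bin     -- index of the pitch frequency bin
    bins_per_side -- assumed meaning of the "predetermined quantity" (per side)
    """
    if bins_per_side < 1:
        raise ValueError("bins_per_side must be at least 1")
    lo = pitch_bin - bins_per_side
    hi = pitch_bin + bins_per_side
    if lo < 0 or hi >= len(amplitude):
        raise ValueError("neighborhood does not fit inside the spectrum")

    # Bins on both sides of the pitch bin, excluding the pitch bin itself.
    neighbors = list(amplitude[lo:pitch_bin]) + list(amplitude[pitch_bin + 1:hi + 1])
    peak = amplitude[pitch_bin]

    # Assumed sign convention: pitch-bin amplitude minus each neighbor amplitude.
    diff_sum = sum(peak - a for a in neighbors)
    avg_amp = sum(neighbors) / len(neighbors)
    # Guard against a silent neighborhood; returning 0.0 here is an assumption.
    ratio = diff_sum / avg_amp if avg_amp > 0 else 0.0
    return diff_sum, avg_amp, ratio


def smooth(previous, current, alpha=0.8):
    """Hypothetical weighted smoothing across frames (alpha is illustrative)."""
    return alpha * previous + (1.0 - alpha) * current
```

With a spectrum of `[1, 1, 5, 1, 1]`, pitch bin 2 and two bins per side, the sketch yields a difference sum of 16, an average neighbor amplitude of 1.0 and a ratio of 16.0, which is the kind of pronounced peak the claims associate with a correct pitch period.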
Claims
1. A method for determining correctness of a pitch period, comprising: determining (11), according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; determining (12), based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and determining (13) correctness of the initial pitch period according to the pitch period correctness decision parameter; the method being characterized in that: the pitch period correctness decision parameter comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter, wherein the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin, or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin, or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; wherein the spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
2. The method according to claim 1, wherein determining the correctness of the initial pitch period according to the pitch period correctness decision parameter comprises: when the pitch period correctness decision parameter meets a correctness determining condition, determining that the initial pitch period is correct; and when the pitch period correctness decision parameter meets an incorrectness determining condition, determining that the initial pitch period is incorrect.
3. The method according to claim 2, wherein: the correctness determining condition meets at least one of the following: the spectral difference parameter is greater than a second difference parameter threshold, the average spectral amplitude parameter is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is greater than a second ratio factor parameter threshold; and the incorrectness determining condition meets at least one of the following: the spectral difference parameter is less than a first difference parameter threshold, the average spectral amplitude parameter is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is less than a first ratio factor parameter threshold.
4. The method according to any one of claims 1 to 3, wherein the pitch frequency bin of the input signal is inversely proportional to the initial pitch period and directly proportional to a quantity of points of a fast Fourier transform performed on the input signal.
5. An apparatus for determining correctness of a pitch period, comprising: a pitch frequency bin determining unit (21), configured to determine, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; a parameter generating unit (22), configured to determine, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and a correctness determining unit (23), configured to determine correctness of the initial pitch period according to the pitch period correctness decision parameter, the apparatus being characterized in that: the pitch period correctness decision parameter generated by the parameter generating unit comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter, wherein the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin, or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin, or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; wherein the spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
6. The apparatus according to claim 5, wherein the correctness determining unit (23) is specifically configured to: when it is determined that the pitch period correctness decision parameter meets a correctness determining condition, determine that the initial pitch period is correct; and when it is determined that the pitch period correctness decision parameter meets an incorrectness determining condition, determine that the initial pitch period is incorrect.
7. The apparatus according to claim 6, wherein: the correctness determining condition meets at least one of the following: the spectral difference parameter is greater than a second difference parameter threshold, the average spectral amplitude parameter is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is greater than a second ratio factor parameter threshold; and the incorrectness determining condition meets at least one of the following: the spectral difference parameter is less than a first difference parameter threshold, the average spectral amplitude parameter is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is less than a first ratio factor parameter threshold.
8. The apparatus according to any one of claims 5 to 7, wherein the pitch frequency bin of the input signal is inversely proportional to the initial pitch period and directly proportional to a quantity of points of a fast Fourier transform performed on the input signal.
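The two-threshold decision of claims 2, 3, 6 and 7 can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the names `Thresholds` and `judge_pitch_period` are hypothetical, the threshold values are placeholders, the order of evaluation (correctness condition first) is not specified in the claims, and the "undetermined" outcome is an assumed label for the case where neither condition is met.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # "First" thresholds bound the incorrectness condition (parameter below them).
    first_diff: float
    first_amp: float
    first_ratio: float
    # "Second" thresholds bound the correctness condition (parameter above them).
    second_diff: float
    second_amp: float
    second_ratio: float


def judge_pitch_period(diff_sum, avg_amp, ratio, t):
    """Classify the initial pitch period as correct, incorrect or undetermined."""
    # Correctness condition: at least one parameter exceeds its second threshold.
    if diff_sum > t.second_diff or avg_amp > t.second_amp or ratio > t.second_ratio:
        return "correct"
    # Incorrectness condition: at least one parameter is below its first threshold.
    if diff_sum < t.first_diff or avg_amp < t.first_amp or ratio < t.first_ratio:
        return "incorrect"
    # Neither condition is met; what happens next is outside this sketch.
    return "undetermined"


# Placeholder values chosen only to exercise the three branches.
example = Thresholds(first_diff=1.0, first_amp=0.5, first_ratio=1.0,
                     second_diff=10.0, second_amp=5.0, second_ratio=8.0)
```

Keeping the first and second thresholds separate leaves a band in which neither condition fires, so a borderline frame is not forced into either class.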
REFERENCES CITED IN THE DESCRIPTION
This list of references cited by the applicant is for the reader’s convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.
Patent documents cited in the description
• US 610862 A [0007]
• US 6496797 B1 [0009]
• US 20040158462 A1 [0008]

Claims (3)

  Method and apparatus for detecting correctness of a pitch period
  1. A method for determining correctness of a pitch period, the method comprising: determining (11), according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; determining (12), based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and determining (13) correctness of the initial pitch period according to the pitch period correctness decision parameter; the method being characterized in that: the pitch period correctness decision parameter comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter; the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin, or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin, or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; wherein the spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
  2. The method according to claim 1, wherein determining the correctness of the initial pitch period according to the pitch period correctness decision parameter comprises: when the pitch period correctness decision parameter meets a correctness determining condition, determining that the initial pitch period is correct; and when the pitch period correctness decision parameter meets an incorrectness determining condition, determining that the initial pitch period is incorrect.
  3. The method according to claim 2, wherein: the correctness determining condition meets at least one of the following: the spectral difference parameter is greater than a second difference parameter threshold, the average spectral amplitude parameter is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is greater than a second ratio factor parameter threshold; and the incorrectness determining condition meets at least one of the following: the spectral difference parameter is less than a first difference parameter threshold, the average spectral amplitude parameter is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is less than a first ratio factor parameter threshold.
  4. The method according to any one of claims 1 to 3, wherein the pitch frequency bin of the input signal is inversely proportional to the initial pitch period and directly proportional to a quantity of points of a fast Fourier transform performed on the input signal.
  5. An apparatus for determining correctness of a pitch period, comprising: a pitch frequency bin determining unit (21), configured to determine, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, wherein the initial pitch period is obtained by performing open-loop detection on the input signal; a parameter generating unit (22), configured to determine, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and a correctness determining unit (23), configured to determine correctness of the initial pitch period according to the pitch period correctness decision parameter; the apparatus being characterized in that: the pitch period correctness decision parameter generated by the parameter generating unit comprises a spectral difference parameter, an average spectral amplitude parameter, and a difference-to-amplitude ratio parameter; the spectral difference parameter is a sum of spectral differences of a predetermined quantity of frequency bins on two sides of the pitch frequency bin, or a weighted and smoothed value of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; the average spectral amplitude parameter is an average of spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin, or a weighted and smoothed value of the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; and the difference-to-amplitude ratio parameter is a ratio of the sum of the spectral differences of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin to the average of the spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin; wherein the spectral differences refer to differences between spectral amplitudes of the predetermined quantity of frequency bins on the two sides of the pitch frequency bin and a spectral amplitude of the pitch frequency bin.
  6. The apparatus according to claim 5, wherein the correctness determining unit is configured to: when the result of the determination is that the pitch period correctness decision parameter meets a correctness determining condition, determine that the initial pitch period is correct; and when the result of the determination is that the pitch period correctness decision parameter meets an incorrectness determining condition, determine that the initial pitch period is incorrect.
  7. The apparatus according to claim 6, wherein: the correctness determining condition meets at least one of the following: the spectral difference parameter is greater than a second difference parameter threshold, the average spectral amplitude parameter is greater than a second spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is greater than a second ratio factor parameter threshold; and the incorrectness determining condition meets at least one of the following: the spectral difference parameter is less than a first difference parameter threshold, the average spectral amplitude parameter is less than a first spectral amplitude parameter threshold, and the difference-to-amplitude ratio parameter is less than a first ratio factor parameter threshold.
  8. The apparatus according to any one of claims 5 to 7, wherein the pitch frequency bin of the input signal is inversely proportional to the initial pitch period and directly proportional to a quantity of points of a fast Fourier transform performed on the input signal.
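Claims 4 and 8 state only that the pitch frequency bin is inversely proportional to the initial pitch period and directly proportional to the FFT size. A minimal sketch of one mapping that satisfies both properties is given below; the proportionality constant of 1 (pitch period measured in samples at the same sampling rate as the FFT input), the rounding to the nearest bin, and the function name are assumptions.

```python
def pitch_frequency_bin(pitch_period_samples, fft_size):
    """Map an open-loop pitch period (in samples) to an FFT bin index.

    A period of T samples corresponds to fs / T Hz; an N-point FFT has a
    bin spacing of fs / N Hz, so the fundamental falls near bin N / T.
    """
    if pitch_period_samples <= 0:
        raise ValueError("pitch period must be positive")
    if fft_size <= 0:
        raise ValueError("FFT size must be positive")
    # Rounding to the nearest integer bin is an assumption of this sketch.
    return int(round(fft_size / pitch_period_samples))
```

For example, a period of 64 samples with a 256-point FFT maps to bin 4, and halving the period to 32 samples doubles the bin index to 8.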
HUE12876916A 2012-05-18 2012-12-26 Eljárás és berendezés pitch periódus helyességének detektálására HUE034664T2 (hu)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210155298.4A CN103426441B (zh) 2012-05-18 2012-05-18 检测基音周期的正确性的方法和装置

Publications (1)

Publication Number Publication Date
HUE034664T2 (hu) 2018-02-28

Family

ID=49583070

Family Applications (1)

Application Number Title Priority Date Filing Date
HUE12876916A HUE034664T2 (hu) 2012-05-18 2012-12-26 Eljárás és berendezés pitch periódus helyességének detektálására

Country Status (10)

Country Link
US (5) US9633666B2 (hu)
EP (2) EP3246920B1 (hu)
JP (2) JP6023311B2 (hu)
KR (2) KR101649243B1 (hu)
CN (1) CN103426441B (hu)
DK (1) DK2843659T3 (hu)
ES (2) ES2847150T3 (hu)
HU (1) HUE034664T2 (hu)
PL (1) PL2843659T3 (hu)
WO (1) WO2013170610A1 (hu)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103426441B (zh) 2012-05-18 2016-03-02 华为技术有限公司 检测基音周期的正确性的方法和装置
CN106373594B (zh) * 2016-08-31 2019-11-26 华为技术有限公司 一种音调检测方法及装置
US10192461B2 (en) 2017-06-12 2019-01-29 Harmony Helper, LLC Transcribing voiced musical notes for creating, practicing and sharing of musical harmonies
US11282407B2 (en) 2017-06-12 2022-03-22 Harmony Helper, LLC Teaching vocal harmonies
CN110600060B (zh) * 2019-09-27 2021-10-22 云知声智能科技股份有限公司 一种硬件音频主动探测hvad系统
CN111223491B (zh) * 2020-01-22 2022-11-15 深圳市倍轻松科技股份有限公司 一种提取音乐信号主旋律的方法、装置及终端设备
US11335361B2 (en) * 2020-04-24 2022-05-17 Universal Electronics Inc. Method and apparatus for providing noise suppression to an intelligent personal assistant

Family Cites Families (72)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL8400552A (nl) * 1984-02-22 1985-09-16 Philips Nv Systeem voor het analyseren van menselijke spraak.
US4885790A (en) * 1985-03-18 1989-12-05 Massachusetts Institute Of Technology Processing of acoustic waveforms
CA1245363A (en) * 1985-03-20 1988-11-22 Tetsu Taguchi Pattern matching vocoder
US4776014A (en) * 1986-09-02 1988-10-04 General Electric Company Method for pitch-aligned high-frequency regeneration in RELP vocoders
US5054072A (en) * 1987-04-02 1991-10-01 Massachusetts Institute Of Technology Coding of acoustic waveforms
US4809334A (en) 1987-07-09 1989-02-28 Communications Satellite Corporation Method for detection and correction of errors in speech pitch period estimates
US5127053A (en) 1990-12-24 1992-06-30 General Electric Company Low-complexity method for improving the performance of autocorrelation-based pitch detectors
US7171016B1 (en) * 1993-11-18 2007-01-30 Digimarc Corporation Method for monitoring internet dissemination of image, video and/or audio files
US6463406B1 (en) 1994-03-25 2002-10-08 Texas Instruments Incorporated Fractional pitch method
CA2154911C (en) * 1994-08-02 2001-01-02 Kazunori Ozawa Speech coding device
JP3528258B2 (ja) * 1994-08-23 2004-05-17 ソニー株式会社 符号化音声信号の復号化方法及び装置
US6136548A (en) * 1994-11-22 2000-10-24 Rutgers, The State University Of New Jersey Methods for identifying useful T-PA mutant derivatives for treatment of vascular hemorrhaging
US5774837A (en) * 1995-09-13 1998-06-30 Voxware, Inc. Speech coding system and method using voicing probability determination
US5729694A (en) * 1996-02-06 1998-03-17 The Regents Of The University Of California Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
US5864795A (en) 1996-02-20 1999-01-26 Advanced Micro Devices, Inc. System and method for error correction in a correlation-based pitch estimator
US5774836A (en) 1996-04-01 1998-06-30 Advanced Micro Devices, Inc. System and method for performing pitch estimation and error checking on low estimated pitch values in a correlation based pitch estimator
US6226604B1 (en) 1996-08-02 2001-05-01 Matsushita Electric Industrial Co., Ltd. Voice encoder, voice decoder, recording medium on which program for realizing voice encoding/decoding is recorded and mobile communication apparatus
US6014622A (en) * 1996-09-26 2000-01-11 Rockwell Semiconductor Systems, Inc. Low bit rate speech coder using adaptive open-loop subframe pitch lag estimation and vector quantization
JPH10105195A (ja) * 1996-09-27 1998-04-24 Sony Corp ピッチ検出方法、音声信号符号化方法および装置
JP4121578B2 (ja) 1996-10-18 2008-07-23 ソニー株式会社 音声分析方法、音声符号化方法および装置
US6456965B1 (en) 1997-05-20 2002-09-24 Texas Instruments Incorporated Multi-stage pitch and mixed voicing estimation for harmonic speech coders
US6438517B1 (en) 1998-05-19 2002-08-20 Texas Instruments Incorporated Multi-stage pitch and mixed voicing estimation for harmonic speech coders
US6188980B1 (en) * 1998-08-24 2001-02-13 Conexant Systems, Inc. Synchronized encoder-decoder frame concealment using speech coding parameters including line spectral frequencies and filter coefficients
DE69939086D1 (de) * 1998-09-17 2008-08-28 British Telecomm Audiosignalverarbeitung
US6233549B1 (en) * 1998-11-23 2001-05-15 Qualcomm, Inc. Low frequency spectral enhancement system and method
US6496797B1 (en) * 1999-04-01 2002-12-17 Lg Electronics Inc. Apparatus and method of speech coding and decoding using multiple frames
AU3651200A (en) 1999-08-17 2001-03-13 Glenayre Electronics, Inc Pitch and voicing estimation for low bit rate speech coders
US6151571A (en) * 1999-08-31 2000-11-21 Andersen Consulting System, method and article of manufacture for detecting emotion in voice signals through analysis of a plurality of voice signal parameters
US6418405B1 (en) 1999-09-30 2002-07-09 Motorola, Inc. Method and apparatus for dynamic segmentation of a low bit rate digital voice message
US6704711B2 (en) * 2000-01-28 2004-03-09 Telefonaktiebolaget Lm Ericsson (Publ) System and method for modifying speech signals
WO2001078061A1 (en) 2000-04-06 2001-10-18 Telefonaktiebolaget Lm Ericsson (Publ) Pitch estimation in a speech signal
JP2002149200A (ja) * 2000-08-31 2002-05-24 Matsushita Electric Ind Co Ltd 音声処理装置及び音声処理方法
WO2002029782A1 (en) * 2000-10-02 2002-04-11 The Regents Of The University Of California Perceptual harmonic cepstral coefficients as the front-end for speech recognition
SE522553C2 (sv) 2001-04-23 2004-02-17 Ericsson Telefon Ab L M Bandbreddsutsträckning av akustiska signaler
GB2375028B (en) * 2001-04-24 2003-05-28 Motorola Inc Processing speech signals
US6917912B2 (en) * 2001-04-24 2005-07-12 Microsoft Corporation Method and apparatus for tracking pitch in audio analysis
WO2002101717A2 (en) * 2001-06-11 2002-12-19 Ivl Technologies Ltd. Pitch candidate selection method for multi-channel pitch detectors
US6871176B2 (en) * 2001-07-26 2005-03-22 Freescale Semiconductor, Inc. Phase excited linear prediction encoder
KR100393899B1 (ko) 2001-07-27 2003-08-09 어뮤즈텍(주) 2-단계 피치 판단 방법 및 장치
JP3888097B2 (ja) 2001-08-02 2007-02-28 松下電器産業株式会社 ピッチ周期探索範囲設定装置、ピッチ周期探索装置、復号化適応音源ベクトル生成装置、音声符号化装置、音声復号化装置、音声信号送信装置、音声信号受信装置、移動局装置、及び基地局装置
CN1324556C (zh) * 2001-08-31 2007-07-04 株式会社建伍 生成基音周期波形信号的装置和方法及处理语音信号的装置和方法
US7657427B2 (en) * 2002-10-11 2010-02-02 Nokia Corporation Methods and devices for source controlled variable bit-rate wideband speech coding
US7233894B2 (en) 2003-02-24 2007-06-19 International Business Machines Corporation Low-frequency band noise detection
SG120121A1 (en) * 2003-09-26 2006-03-28 St Microelectronics Asia Pitch detection of speech signals
CA2566368A1 (en) 2004-05-17 2005-11-24 Nokia Corporation Audio encoding with different coding frame lengths
KR100724736B1 (ko) 2006-01-26 2007-06-04 삼성전자주식회사 스펙트럴 자기상관치를 이용한 피치 검출 방법 및 피치검출 장치
KR100770839B1 (ko) 2006-04-04 2007-10-26 삼성전자주식회사 음성 신호의 하모닉 정보 및 스펙트럼 포락선 정보,유성음화 비율 추정 방법 및 장치
CN100541609C (zh) * 2006-09-18 2009-09-16 华为技术有限公司 一种实现开环基音搜索的方法和装置
CN100524462C (zh) * 2007-09-15 2009-08-05 华为技术有限公司 对高带信号进行帧错误隐藏的方法及装置
US9142221B2 (en) * 2008-04-07 2015-09-22 Cambridge Silicon Radio Limited Noise reduction
CN101556795B (zh) * 2008-04-09 2012-07-18 展讯通信(上海)有限公司 计算语音基音频率的方法及设备
US9197181B2 (en) * 2008-05-12 2015-11-24 Broadcom Corporation Loudness enhancement system and method
US8645129B2 (en) * 2008-05-12 2014-02-04 Broadcom Corporation Integrated speech intelligibility enhancement system and acoustic echo canceller
US20090319261A1 (en) * 2008-06-20 2009-12-24 Qualcomm Incorporated Coding of transitional speech frames for low-bit-rate applications
US20090319263A1 (en) * 2008-06-20 2009-12-24 Qualcomm Incorporated Coding of transitional speech frames for low-bit-rate applications
US8577673B2 (en) * 2008-09-15 2013-11-05 Huawei Technologies Co., Ltd. CELP post-processing for music signals
CN101354889B (zh) * 2008-09-18 2012-01-11 北京中星微电子有限公司 一种语音变调方法及装置
CN101599272B (zh) 2008-12-30 2011-06-08 华为技术有限公司 基音搜索方法及装置
EP2211335A1 (en) * 2009-01-21 2010-07-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and computer program for obtaining a parameter describing a variation of a signal characteristic of a signal
WO2010091554A1 (zh) * 2009-02-13 2010-08-19 华为技术有限公司 一种基音周期检测方法和装置
CN101814291B (zh) * 2009-02-20 2013-02-13 北京中星微电子有限公司 在时域提高语音信号信噪比的方法和装置
US8718804B2 (en) * 2009-05-05 2014-05-06 Huawei Technologies Co., Ltd. System and method for correcting for lost data in a digital audio signal
US8620672B2 (en) 2009-06-09 2013-12-31 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for phase-based processing of multichannel signal
JP5433696B2 (ja) * 2009-07-31 2014-03-05 株式会社東芝 音声処理装置
US20140019125A1 (en) * 2011-03-31 2014-01-16 Nokia Corporation Low band bandwidth extended
CN102231274B (zh) * 2011-05-09 2013-04-17 华为技术有限公司 基音周期估计值修正方法、基音估计方法和相关装置
CN102842305B (zh) * 2011-06-22 2014-06-25 华为技术有限公司 一种基音检测的方法和装置
ES2757700T3 (es) * 2011-12-21 2020-04-29 Huawei Tech Co Ltd Detección y codificación de altura tonal muy débil
CN103426441B (zh) * 2012-05-18 2016-03-02 华为技术有限公司 检测基音周期的正确性的方法和装置
CN105976830B (zh) * 2013-01-11 2019-09-20 华为技术有限公司 音频信号编码和解码方法、音频信号编码和解码装置
CN104217727B (zh) * 2013-05-31 2017-07-21 华为技术有限公司 信号解码方法及设备
CN104517610B (zh) * 2013-09-26 2018-03-06 华为技术有限公司 频带扩展的方法及装置

Also Published As

Publication number Publication date
US10249315B2 (en) 2019-04-02
JP2017027076A (ja) 2017-02-02
ES2627857T3 (es) 2017-07-31
US9633666B2 (en) 2017-04-25
KR20160099729A (ko) 2016-08-22
WO2013170610A1 (zh) 2013-11-21
US20230402048A1 (en) 2023-12-14
US20210335377A1 (en) 2021-10-28
US20150073781A1 (en) 2015-03-12
JP2015516597A (ja) 2015-06-11
DK2843659T3 (en) 2017-07-03
JP6023311B2 (ja) 2016-11-09
KR101762723B1 (ko) 2017-07-28
KR101649243B1 (ko) 2016-08-18
US10984813B2 (en) 2021-04-20
EP2843659A1 (en) 2015-03-04
US11741980B2 (en) 2023-08-29
EP2843659A4 (en) 2015-07-15
CN103426441B (zh) 2016-03-02
PL2843659T3 (pl) 2017-10-31
CN103426441A (zh) 2013-12-04
US20190180766A1 (en) 2019-06-13
JP6272433B2 (ja) 2018-01-31
US20170194016A1 (en) 2017-07-06
EP3246920A1 (en) 2017-11-22
EP2843659B1 (en) 2017-04-05
KR20150014492A (ko) 2015-02-06
ES2847150T3 (es) 2021-08-02
EP3246920B1 (en) 2020-10-28
