EP1557825B1

EP1557825B1 - Bandwidth expanding device and method

Info

Publication number: EP1557825B1
Application number: EP03756637A
Authority: EP
Inventors: Kazunori Ozawa (c/o NEC Corporation)
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2002-10-31
Filing date: 2003-10-16
Publication date: 2010-12-22
Anticipated expiration: 2023-10-16
Also published as: EP1557825A4; KR20050062643A; WO2004040553A1; CN1708785B; KR100715013B1; EP1557825A1; DE60335486D1; AU2003301711A1; CN1708785A; CA2504175A1; JP2004151423A; US20050256709A1; JP4433668B2; US7684979B2

Abstract

A bandwidth expanding device comprising: a spectrum parameter calculating circuit (100) for calculating a spectrum parameter of a narrow-bandwidth input signal x(n); a coefficient calculating circuit (130) for receiving the spectrum parameter and converting it into the coefficient of a signal the bandwidth of which is expanded; a gain circuit (140) for receiving a gain from a gain control circuit (210), multiplying the output signal from a noise generating circuit (120) by the gain, and outputting the product to a combined filter circuit (170); the combined filter circuit (170) for receiving the coefficient from the coefficient calculating circuit (130) to constitute a filter and passing the signal from the gain circuit (140) through the filter to output a high-frequency signal y(n) for bandwidth expansion; a sampling frequency converting circuit (180) for receiving the narrow-bandwidth input signal x(n) and outputting a signal s(n) up-sampled to a predetermined sampling frequency; and an adder (190) for adding the high-frequency signal y(n) to the signal s(n) and outputting an expanded bandwidth signal.

Description

TECHNICAL FIELD

This invention relates to a method and an apparatus for band extension, in which a narrow-band signal is received as an input signal and a band-extended signal, having a wider frequency range than the input signal, is output to improve the perceived sound quality.

BACKGROUND ART

There has been known a system in which the frequency range of a speech signal, encoded at a low bit rate and reproduced, is extended on the receiving side without the transmitting side having to send the auxiliary information for band extension (for example, see Non-Patent Publication 1).
Non-Patent Publication 1:

P. Jax, P.Vary, "Wideband extension of telephone speech using hidden Markov model", Proc. IEEE Speech Coding Workshop, pp.133-135, 2000.

With this state-of-the-art system, filter coefficients after band extension using HMM (Hidden Markov Model) are retrieved on the receiving side.
On the other hand, the processing for directly extending the band of the narrow-band input signal is unprecedented.
In the state-of-the-art method shown in Non-Patent Publication 1, which requires HMM modeling of the filter coefficients or of the broadband spectral envelope of speech, the following problem arises. That is, the HMM model parameters need to be determined off-line at the outset from a voluminous speech database, which entails prolonged computing time and increased cost. In addition, retrieval by an HMM model is needed for the receiving side to carry out band extension processing in real time, for which a large volume of calculations is required.
A further state of the art according to Article 54(3) EPC has been disclosed in EP-A 1 420 389. In addition, US 5455888 and WO 01/35395 describe bandwidth extension techniques based on the LPC principle.
Accordingly, it is an object of the present invention to overcome the aforementioned problem and to provide a method and an apparatus for directly extending the frequency range of a narrow-band input signal. It is another object of the present invention to provide a method and an apparatus for extending the frequency range whereby the band-extended speech of optimum sound quality may be obtained with computational complexity less than that of the state-of-the-art system.

DISCLOSURE OF THE INVENTION

According to the present invention this object is solved by a band extending apparatus according to claim 1 and a band extending method according to claim 4.
Further advantageous features of the band extending apparatus and the band extending method are indicated in the dependent claims.
The present invention has such meritorious effect that a band extended signal (e.g. 7 kHz band signal) may be generated by generating a high frequency signal with processing for a narrow-band input signal (e.g. 4 kHz band signal) and by summing the resulting high frequency signal to a signal corresponding to the narrow-band input signal having its sampling frequency changed.
The present invention has such meritorious effect that a band extended signal with optimum sound quality may be generated in case periodicity is required for a high frequency part of the signal, such as a vowel, by generating an adaptive codebook signal, using a delay calculated from the narrow-band input signal, and by multiplying the so generated adaptive codebook signal with a gain and by summing the resulting signal to a noise signal.
The present invention also has such meritorious effect that a band extended signal for higher sound quality may be generated by employing a pitch pre-filter for a sound source signal, using the delay, or by weighting the coefficients from the coefficient calculating circuit for use for the post-filter.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig.1 is a diagram showing a configuration of a first embodiment not belonging to the present invention.
Fig.2 is a diagram showing a configuration of a second embodiment not belonging to the present invention.
Fig.3 is a diagram showing a configuration of a first embodiment of the present invention.
Fig.4 is a diagram showing a configuration of a second embodiment of the present invention.
Fig.5 is a diagram showing a configuration of a third embodiment of the present invention.
Fig.6 is a diagram showing a modification of the second embodiment not belonging to the present invention.

PREFERRED EMBODIMENTS OF THE INVENTION

For more detailed explanation of the present invention, preferred embodiments of the present invention will be explained with reference to the drawings. It is presupposed in the following that a narrow-band input signal of a 4 kHz range is extended in band to a 5 kHz band or to a 7 kHz band.
Fig.1 shows the configuration of a first embodiment not belonging to the present invention. Referring to Fig.1, a band extension apparatus includes a spectral parameter calculating circuit 100, a noise generating circuit 120, a coefficient calculating circuit 130, a gain circuit 140, a synthesis filter circuit 170, a sampling frequency converting circuit 180, an adder 190, a voiced/unvoiced discriminating circuit 200 and a gain adjustment circuit 210.
In the band extending apparatus, supplied with a narrow-band input signal x(n), the spectral parameter calculating circuit 100 divides the input signal into plural frames, each of e.g. 10 ms, and calculates spectral parameters of a predetermined order P for each frame. The spectral parameters represent the outline shape (envelope) of the spectrum of a speech signal on a frame-by-frame basis. For the calculation, LPC analysis, as known per se, for example, is used. The spectral parameter calculating circuit 100 also converts the linear prediction coefficients α_i (i = 1, ..., P), calculated by the LPC analysis, into LSP parameters suitable for quantization or interpolation, and outputs the so formed LSP parameters. For converting the linear prediction coefficients into LSP, reference is made e.g. to the following treatise (for example see Non-Patent Publication 2):
Non-Patent Publication 2:

Sugamura and Itakura: "Speech Information Compression by Voice Analysis Synthesis System", Extended Abstract to Society of Electronic Communication, J64-A, pp.599-606, 1981
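The LPC analysis step mentioned above can be sketched as follows. This is a minimal illustration, assuming the common autocorrelation method with the Levinson-Durbin recursion and the predictor convention x_hat(n) = Σ α_i x(n - i); the text only says that LPC analysis "as known per se" is used, so the function names and this particular method are assumptions, and windowing and the LSP conversion are omitted.

```python
def autocorrelation(frame, order):
    """Return the autocorrelation values r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i - k] for i in range(k, n))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients alpha_1..alpha_P (hypothetical helper).

    Assumed convention: x_hat(n) = sum_i alpha_i * x(n - i).
    Returns (alpha, residual_energy).
    """
    alpha = [0.0] * (order + 1)          # alpha[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break                        # silent or degenerate frame
        acc = r[i] - sum(alpha[j] * r[i - j] for j in range(1, i))
        k = acc / err                    # reflection coefficient
        new = alpha[:]
        new[i] = k
        for j in range(1, i):
            new[j] = alpha[j] - k * alpha[i - j]
        alpha = new
        err *= (1.0 - k * k)
    return alpha[1:], err
```

For a frame that decays as 0.9^n, a second-order analysis should return coefficients close to (0.9, 0), since one predictor tap already explains the signal.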

The coefficient calculating circuit 130 is supplied with the spectral parameters and converts the parameters into coefficients of the band extended signal. For this conversion, well-known techniques, such as a technique for simply shifting the LSP frequency to a higher frequency, a technique for non-linear conversion or a technique for linear conversion, may be used. Here, the frequency band in which the LSPs are present is shifted to a higher frequency range, using all or part of the LSP parameters, for conversion to order-P linear prediction coefficients, which order-P linear prediction coefficients are then output to the synthesis filter circuit 170.
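The "simple shift" variant of this conversion might look like the sketch below. It assumes an even order P, LSP frequencies given in Hz, a constant upward shift, and the textbook LSP-to-polynomial reconstruction A(z) = (P(z) + Q(z)) / 2; the function names, the shift amount and the output sampling frequency are hypothetical and are not taken from the text.

```python
import math


def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def lsp_to_lpc(omega):
    """Convert P sorted LSP frequencies (radians, P even) to A(z).

    Returns [1, c_1, ..., c_P] with A(z) = 1 + sum_k c_k z^-k. Under the
    predictor convention x_hat(n) = sum_i alpha_i x(n - i), alpha_i = -c_i.
    """
    p_poly, q_poly = [1.0, 1.0], [1.0, -1.0]     # (1 + z^-1) and (1 - z^-1)
    for idx, w in enumerate(omega):
        section = [1.0, -2.0 * math.cos(w), 1.0]
        if idx % 2 == 0:
            p_poly = poly_mul(p_poly, section)   # 1st, 3rd, ... LSP
        else:
            q_poly = poly_mul(q_poly, section)   # 2nd, 4th, ... LSP
    order = len(omega)
    return [0.5 * (p_poly[k] + q_poly[k]) for k in range(order + 1)]


def shift_lsp(lsp_hz, shift_hz, fs_out):
    """Shift LSPs upward by shift_hz and return radians at rate fs_out."""
    shifted = sorted(f + shift_hz for f in lsp_hz)
    if shifted[-1] >= fs_out / 2.0:
        raise ValueError("shifted LSP exceeds the output Nyquist frequency")
    return [2.0 * math.pi * f / fs_out for f in shifted]
```

A caller would chain the two steps, for example `lsp_to_lpc(shift_lsp(lsp_hz, 4000.0, 16000.0))`, and pass the result to the synthesis filter. Non-linear or linear mappings, which the text also allows, would replace `shift_lsp` only.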
The noise generating circuit 120 generates a band-limited noise signal, having an average amplitude value normalized to a predetermined level, for a time duration equal to the frame duration, and outputs the so generated noise signal to the gain circuit 140. As the noise signal, the white noise is here used. However, other noise signal may also be used.
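As an illustration of the normalization described here, the following sketch draws one frame of Gaussian white noise and scales it so that its average absolute amplitude equals a chosen level. The Gaussian distribution, the `level` and `seed` parameters and the function name are assumptions; the band limiting mentioned in the text is not shown.

```python
import random


def make_noise_frame(frame_len, level=1.0, seed=None):
    """Return frame_len noise samples whose mean absolute value is `level`."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    mean_abs = sum(abs(s) for s in noise) / frame_len
    scale = level / mean_abs if mean_abs > 0.0 else 0.0
    return [s * scale for s in noise]
```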
The voiced/unvoiced discriminating circuit 200 is supplied with the narrow-band input signal x(n) to determine whether the frame-based signal is voiced or unvoiced. For this determination, a normalized autocorrelation function D(T) up to a predetermined delay time m is derived for the narrow-band input signal x(n) in accordance with the equation (1): $D(T) = \frac{\sum_{n=0}^{N-1} x(n)\, x(n-T)}{\sum_{n=0}^{N-1} x^{2}(n-T)}$

and a maximum value of D(T) is found. If the maximum value of D(T) is larger than a predetermined threshold value, the input signal is determined to be voiced. If otherwise, the input signal is determined to be unvoiced.
The voiced/unvoiced discriminating circuit 200 outputs the voiced/unvoiced discrimination information to the gain adjustment circuit 210. In the above equation (1), N denotes the number of samples for calculating the normalized autocorrelation.
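The decision rule built on equation (1) can be sketched as below. The buffer layout (past samples stored before the frame so that x(n - T) is available), the lag search range and the threshold of 0.5 are assumptions chosen for illustration; the text only speaks of a predetermined delay range and a predetermined threshold.

```python
def normalized_autocorr(x, start, n_samples, lag):
    """D(T) of equation (1) for the frame beginning at index `start`."""
    num = sum(x[start + n] * x[start + n - lag] for n in range(n_samples))
    den = sum(x[start + n - lag] ** 2 for n in range(n_samples))
    return num / den if den > 0.0 else 0.0


def classify_frame(x, start, n_samples, min_lag, max_lag, threshold=0.5):
    """Return (is_voiced, best_lag) for one frame.

    `x` must hold at least max_lag samples before `start`.
    """
    if start < max_lag:
        raise ValueError("not enough past samples for the largest lag")
    best_lag, best_d = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        d = normalized_autocorr(x, start, n_samples, lag)
        if d > best_d:
            best_lag, best_d = lag, d
    return best_d > threshold, best_lag
```

For a voiced frame, the returned lag plays the role of the delay T that maximizes D(T).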
The gain adjustment circuit 210 is supplied with the voiced/unvoiced discrimination information from the voiced/unvoiced discriminating circuit 200 and adjusts the gain to be imparted to the noise signal depending on whether the input signal is voiced or unvoiced, to output the so adjusted gain to the gain circuit 140.
The gain circuit 140 is supplied with the gain from the gain adjustment circuit 210 and multiplies the output signal of the noise generating circuit 120 by the gain to output the resulting signal to the synthesis filter circuit 170.
The synthesis filter circuit 170 is supplied with the output signal of the gain circuit 140 and with coefficients of a predetermined number of orders, from the coefficient calculating circuit 130, to form a filter, and outputs a high frequency range signal y(n) needed for band extension.
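A minimal sketch of such a synthesis filter is given below, assuming an all-pole filter 1/A(z) with A(z) = 1 - Σ α_i z^-i, so that y(n) = e(n) + Σ α_i y(n - i). The sign convention, the function name and the explicit state handling between frames are assumptions for illustration.

```python
def synthesis_filter(excitation, alpha, state=None):
    """Pass one excitation frame through the all-pole filter 1/A(z).

    alpha[i-1] holds alpha_i; `state` holds [y(-1), y(-2), ...] from the
    previous frame. Returns (output_frame, new_state).
    """
    order = len(alpha)
    hist = list(state) if state is not None else [0.0] * order
    out = []
    for e in excitation:
        y = e + sum(alpha[i] * hist[i] for i in range(order))
        out.append(y)
        hist = ([y] + hist)[:order]      # newest sample first
    return out, hist
```

Carrying `new_state` into the next call keeps the filter memory continuous across frame boundaries.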
The sampling frequency converting circuit 180 up-samples the narrow-band input signal x(n) to a predetermined sampling frequency to output the resulting up-sampled signal.
The adder 190 sums an output signal y(n) of the synthesis filter circuit 170 and an output signal s(n) of the sampling frequency converting circuit 180 to each other to form and output an ultimately band extended signal.
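The last two stages can be sketched as follows, assuming an up-sampling factor of 2 (for example 8 kHz to 16 kHz), zero insertion followed by a Hamming-windowed sinc low-pass filter, and a high-band signal y(n) that is already at the output rate. The factor, the filter design and all names are illustrative assumptions, not details given in the text.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc taps; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for k in range(num_taps):
        t = k - mid
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def upsample_by_2(x, num_taps=31):
    """Insert zeros, then low-pass filter; the filter delay is compensated."""
    stuffed = []
    for s in x:
        stuffed.extend([s, 0.0])
    taps = [2.0 * h for h in lowpass_taps(num_taps, 0.25)]  # gain 2 restores level
    delay = (num_taps - 1) // 2
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            idx = n + delay - k
            if 0 <= idx < len(stuffed):
                acc += h * stuffed[idx]
        out.append(acc)
    return out


def add_bands(s, y):
    """Adder stage: sum the up-sampled signal s(n) and the high band y(n)."""
    return [a + b for a, b in zip(s, y)]
```

In a real-time system the delay of the interpolation filter would have to be matched against the delay of the high-band path before the two signals are added.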
The above completes the explanation of the first embodiment.
Fig.2 shows the configuration of a second embodiment not belonging to the present invention. Referring to Fig.2, the band extending apparatus includes a spectral parameter calculating circuit 100, an adaptive codebook circuit 110, a noise generating circuit 120, a coefficient calculating circuit 130, a gain circuit 340, a synthesis filter circuit 170, a sampling frequency converting circuit 180, adders 160, 190, a voiced/unvoiced discriminating circuit 200, and a gain adjustment circuit 310. In Fig.2, the same reference numerals are used to depict the same parts or components as those shown in Fig.1. In the following, only the points of difference from Fig.1 are explained, whilst the same parts or components as those of Fig.1 are sometimes not explained. The present second embodiment includes the adaptive codebook circuit 110 and the adder 160, in addition to the components of Fig.1.
The voiced/unvoiced discriminating circuit 200 is supplied with the narrow-band input signal x(n) to verify whether a frame-based signal is voiced or unvoiced. For verifying whether the frame-based signal is voiced or unvoiced, a normalized autocorrelation function D(T) up to the predetermined delay time m is derived for the narrow-band input signal x(n) in accordance with the equation (1), and a maximum value of D(T) is found. If the maximum value of D(T) is larger than a predetermined threshold value, the input signal is determined to be voiced. If otherwise, the input signal is determined to be unvoiced.
For the voiced frame, the voiced/unvoiced discriminating circuit 200 sends the value of T, maximizing the normalized autocorrelation function D(T), as a pitch period T to the adaptive codebook circuit 110.
The adaptive codebook circuit 110 is supplied from the voiced/unvoiced discriminating circuit 200 with the delay T of the adaptive codebook and, based on the past sound source signal v(n), generates an adaptive code vector p(n), in accordance with the following equation (2): $p (n) = v (n - T)$

and outputs the so generated vector to the gain circuit 340.
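Equation (2) can be sketched as below. The buffer convention (the last element of `past_excitation` is v(-1)) and the periodic reuse of freshly generated samples when the delay T is shorter than the frame are assumptions borrowed from common CELP practice; the text itself only states p(n) = v(n - T).

```python
def adaptive_code_vector(past_excitation, delay, frame_len):
    """Return p(n) = v(n - T) for n = 0 .. frame_len - 1."""
    if delay < 1 or delay > len(past_excitation):
        raise ValueError("delay must lie within the stored excitation history")
    buf = list(past_excitation)
    start = len(buf)                     # index that corresponds to n = 0
    for n in range(frame_len):
        buf.append(buf[start + n - delay])
    return buf[start:]
```

After each frame, the caller would append the final sound source signal to the history so that the next frame can reference it.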
The gain circuit 340 is supplied from the gain adjustment circuit 310 with a gain which is then multiplied with an output signal of at least one of the adaptive codebook circuit 110 and the noise generating circuit 120. The resulting signal is output to the adder 160.
The adder 160 sums the two signals, output from the gain circuit 340, and outputs the resulting sum signal to the synthesis filter circuit 170 and to the adaptive codebook circuit 110.
The synthesis filter circuit 170 is supplied with an output signal (sound source signal) of the adder 160 and with a filter coefficient of a predetermined number of orders from the coefficient calculating circuit 130 to form a synthesis filter, and outputs a signal y(n) of a high frequency range needed for band extension.
The gain adjustment circuit 310 is supplied with the voiced/unvoiced discrimination information from the voiced/unvoiced discriminating circuit 200, and adjusts the gain of the adaptive codebook signal and the gain of the noise signal, depending on whether the input signal is voiced or unvoiced, to send the adjusted gains to the gain circuit 340.
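The interplay of gain adjustment, gain multiplication and summation can be sketched as follows. The gain pairs (0.8, 0.2) for voiced frames and (0.0, 1.0) for unvoiced frames are hypothetical placeholders; the text does not give numerical gains, only that they depend on the voiced/unvoiced decision.

```python
def adjust_gains(is_voiced, voiced_gains=(0.8, 0.2), unvoiced_gains=(0.0, 1.0)):
    """Return (adaptive_gain, noise_gain) for the frame (placeholder values)."""
    return voiced_gains if is_voiced else unvoiced_gains


def build_excitation(adaptive_vec, noise_vec, is_voiced):
    """Scale both signals by their gains and sum them sample by sample."""
    g_adaptive, g_noise = adjust_gains(is_voiced)
    return [g_adaptive * p + g_noise * c
            for p, c in zip(adaptive_vec, noise_vec)]
```

With these placeholder values an unvoiced frame is driven by noise alone, while a voiced frame is dominated by the periodic adaptive codebook signal.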
The adder 190 sums the output signal y(n) of the synthesis filter circuit 170 to the output signal s(n) of the sampling frequency converting circuit 180 to form and output an ultimately band extended signal.
With the second embodiment not belonging to the present invention, an adaptive codebook signal is generated, using a delay calculated from the narrow-band input signal, based on the past sound source signal of the high frequency portion, and is then multiplied by a proper gain. The resulting signal is then summed to e.g. a noise signal, whereby a band extended signal with superior sound quality may be generated for e.g. a vowel in case periodicity is needed for a high frequency portion. The above completes the explanation of the second embodiment. As a modification of the second embodiment not belonging to the present invention, a pitch generating circuit 115 may be provided in place of the adaptive codebook circuit 110, as shown in Fig.6. The pitch generating circuit 115 calculates a pitch period from an input signal and generates a periodic signal based on the pitch period to output the so generated pitch signal to the gain circuit 340. Except for the pitch generating circuit 115, the modification has the same configuration as the above-described second embodiment not belonging to the present invention.
Fig.3 shows the configuration of a first embodiment of the present invention. Referring to Fig.3, the band extending apparatus of the first embodiment includes a spectral parameter calculating circuit 100, an adaptive codebook circuit 110, a noise generating circuit 120, a coefficient calculating circuit 130, a gain circuit 300, a synthesis filter circuit 170, a sampling frequency converting circuit 180, an adder 190, a voiced/unvoiced discriminating circuit 200, a gain adjustment circuit 310, and a pitch pre-filter 400. In Fig.3, the same reference numerals are used to depict the parts or components which are the same as those shown in Figs.1 and 2. In the following, only the points of difference from the second embodiment not belonging to the present invention are explained, whilst the same parts or components as those of Fig.2 are sometimes not explained.
The gain circuit 300 is supplied with the gain from the gain adjustment circuit 310 and multiplies the output signals of the adaptive codebook circuit 110 and the noise generating circuit 120 with the gain. The resulting two signals are summed together and the resulting sum signal is output to the pitch pre-filter 400.
The pitch pre-filter 400 is supplied with the delay T from the voiced/unvoiced discriminating circuit 200, and performs pre-filtering on the sound source signal v(n) in accordance with the following equation (3): $v'(n) = v(n) + \beta\, p(n - T)$

to output the resulting signal to the synthesis filter circuit 170.
An output of the pitch pre-filter 400 is also supplied to the adaptive codebook circuit 110.
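Equation (3) can be sketched literally as below. The sketch assumes that samples of p(n) from before the current frame are available in a buffer, and uses β = 0.5 as a hypothetical weight; the text does not state a value for β, and the function name and buffer layout are illustrative.

```python
def pitch_prefilter(v, p_with_history, delay, beta=0.5):
    """Apply v'(n) = v(n) + beta * p(n - T) to one frame.

    `p_with_history` ends with the current frame of p(n) and must hold at
    least `delay` earlier samples in front of it.
    """
    frame_len = len(v)
    offset = len(p_with_history) - frame_len   # index of p(0)
    if offset < delay:
        raise ValueError("not enough history of p(n) for this delay")
    return [v[n] + beta * p_with_history[offset + n - delay]
            for n in range(frame_len)]
```

Adding a copy delayed by one pitch period reinforces the harmonics of the excitation, which is the usual motivation for a pitch pre-filter.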
The synthesis filter circuit 170 is supplied with an output signal of the pitch pre-filter 400 and with coefficients of a predetermined number of orders from the coefficient calculating circuit 130 to form a filter, and outputs a signal y(n) of a high frequency range needed for band extension.
By employing the pitch pre-filter 400 for pre-filtering the sound source signal, using the delay, a band extended signal of superior sound quality may be produced. The above completes the explanation of the first embodiment of the present invention. In the present embodiment, as in the modification of the second embodiment not belonging to the invention, a pitch generating circuit may, of course, be used in place of the adaptive codebook circuit 110.
Fig.4 shows the configuration of a second embodiment of the present invention. Referring to Fig.4, the band extending apparatus of the second embodiment includes a spectral parameter calculating circuit 100, an adaptive codebook circuit 110, a noise generating circuit 120, a coefficient calculating circuit 130, a gain circuit 340, an adder 160, a synthesis filter circuit 170, a sampling frequency converting circuit 180, an adder 190, a voiced/unvoiced discriminating circuit 200, a gain adjustment circuit 310, and a low-pass filter circuit 500. In Fig.4, the same reference numerals are used to depict the parts or components which are the same as those shown in Fig.2. In the second embodiment, the low-pass filter 500 is added to the configuration of the above-described second embodiment not belonging to the present invention shown in Fig.2. In the following, only the points of difference from this second embodiment are explained, whilst the same parts or components as those of Fig.2 are explained only as necessary.
The low-pass filter 500 filters the output signal of the adaptive codebook circuit 110 in accordance with the equation (4): $p'(n) = p(n) * h(n)$

to permit a signal with a frequency not higher than a predetermined cut-off frequency to pass therethrough to the gain circuit 340. The cut-off frequency of the low-pass filter 500 may be preset to, for example, 6 kHz. Meanwhile, in the equation (4), h(n) denotes the impulse response of the low-pass filter, and the symbol "*" denotes the operation of convolution.
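The convolution with h(n) can be sketched as follows. The 6 kHz cut-off comes from the text; the 16 kHz sampling frequency, the 31-tap Hamming-windowed sinc design and the normalization to unity gain at DC are assumptions made only for this illustration.

```python
import math


def design_lowpass(num_taps, cutoff_hz, fs_hz):
    """Windowed-sinc impulse response h(n), scaled so the taps sum to 1."""
    fc = cutoff_hz / fs_hz
    mid = (num_taps - 1) / 2.0
    h = []
    for k in range(num_taps):
        t = k - mid
        ideal = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * k / (num_taps - 1))
        h.append(ideal * window)
    total = sum(h)
    return [tap / total for tap in h]


def apply_fir(p, h):
    """Causal convolution p'(n) = sum_k h(k) p(n - k), zero initial state."""
    out = []
    for n in range(len(p)):
        acc = 0.0
        for k, tap in enumerate(h):
            if n - k >= 0:
                acc += tap * p[n - k]
        out.append(acc)
    return out
```

A call such as `apply_fir(p, design_lowpass(31, 6000.0, 16000.0))` leaves slowly varying components untouched while strongly attenuating components near the 8 kHz Nyquist frequency.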
The foregoing completes the explanation of the second embodiment of the present invention. Meanwhile, a pitch generating circuit may be used in place of the adaptive codebook circuit 110, by way of a modification of the present second embodiment, as in the modification of the second embodiment described above.
Fig.5 shows the configuration of a third embodiment of the present invention. Referring to Fig.5, the band extending apparatus of the third embodiment includes a spectral parameter calculating circuit 100, an adaptive codebook circuit 110, a noise generating circuit 120, a coefficient calculating circuit 130, a gain circuit 300, a synthesis filter circuit 170, a sampling frequency converting circuit 180, an adder 190, a voiced/unvoiced discriminating circuit 200, a gain adjustment circuit 310, a pitch pre-filter 400, and a post-filter 600. In Fig.5, the same reference numerals are used to depict the same parts or components as those shown in Fig.3. The third embodiment of the present invention includes the post-filter 600 in addition to the configuration of the above-described first embodiment. In the following, only the points of difference from the first embodiment are explained, whilst the same parts or components as those of Fig.3 are explained only as necessary.
The post-filter 600 is supplied from the coefficient calculating circuit 130 with coefficients (filter coefficients), which then are weighted. The post-filter then performs post-filtering in accordance with the equation (5): $y'(n) = y(n) - \sum_{i=1}^{P} a_{i} \gamma_{1}^{i}\, y(n - i) + \sum_{i=1}^{P} a_{i} \gamma_{2}^{i}\, y'(n - i)$

in order to deliver an output to the adder 190.
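Equation (5) can be sketched as below, reading the weights as γ1 and γ2 raised to the power of the coefficient index i, which is the usual bandwidth-expansion form of a post-filter. That reading, the values γ1 = 0.5 and γ2 = 0.8, the zero initial state and the function name are assumptions, since the text gives no numerical weights.

```python
def post_filter(y, a, gamma1=0.5, gamma2=0.8):
    """Apply y'(n) = y(n) - sum a_i g1^i y(n-i) + sum a_i g2^i y'(n-i)."""
    order = len(a)
    num = [a[i] * gamma1 ** (i + 1) for i in range(order)]   # weights on y
    den = [a[i] * gamma2 ** (i + 1) for i in range(order)]   # weights on y'
    out = []
    for n in range(len(y)):
        acc = y[n]
        for i in range(1, order + 1):
            if n - i >= 0:
                acc -= num[i - 1] * y[n - i]
                acc += den[i - 1] * out[n - i]
        out.append(acc)
    return out
```

When the two weights are equal, the two sums cancel and the filter passes the signal unchanged, which is a convenient sanity check for an implementation.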
By employing the post-filter 600, it is possible to generate a band extended signal of superior quality. The above completes the explanation of the third embodiment. It is noted that a pitch generating circuit may also be used in place of the adaptive codebook circuit 110, by way of a modification of the third embodiment, as in the modification of the second embodiment not belonging to the present invention described above.
The configurations of the above-described embodiments may also be combined together, such as by employing the post-filter, explained in the third embodiment, for the above-described first embodiment. In the present invention, plural sorts of the preset frequency band signal (narrow-band signal) may be input, in place of only one sort of the signals. Although the present invention has been explained with reference to the above specific embodiments, it is to be noted that the present invention may encompass various modifications or corrections that may occur to those skilled in the art within the scope of the invention as defined in the claims.

Claims

A band extending apparatus receiving at least an input signal of a preset frequency band to output a band extended signal corresponding to said input signal extended in a frequency band thereof,
said apparatus comprising:
(A) a spectral parameter calculating unit (100), adapted to receive at least an input signal of a preset frequency band to calculate spectral parameters representing spectral characteristics;

(B) a coefficient calculating unit (130) adapted to shift the frequency of said spectral parameters to then calculate filter coefficients;

(C) a voiced/unvoiced discriminating circuit (200), adapted to supply for a voiced frame, a preset delay derived from a voiced/unvoiced decision, as a pitch period to an adaptive codebook circuit (110);

(D) said adaptive codebook circuit (110), being adapted to receive the delay from said voiced/unvoiced discriminating circuit, as a delay of said adaptive codebook, to generate an adaptive codebook signal based on the past sound source signal outputted from a synthesis filter circuit (170) and to output the adaptive codebook signal generated;

(E) a noise generating circuit (120) adapted to generate a noise signal;

(F) a gain adjustment circuit (310), adapted to receive voiced/unvoiced discrimination information output from said voiced/unvoiced discrimination circuit (200), to output a resulting gain adjustment signal for adjusting the gain of the output signal of said adaptive codebook circuit (110) and the gain of the output signal of noise generating unit (120), depending on whether said voiced/unvoiced discrimination information indicates voiced or unvoiced;

(G) a gain circuit (300) adapted to receive said gain adjustment signal from said gain adjustment circuit (310) to multiply the output signal of said adaptive codebook circuit (110) and the output signal of said noise generating circuit (120) with said gain adjustment signal to output two signals that are summed together to form a resulting sum signal;

(H) a pitch pre-filter (400) adapted to filter said resulting sum signal from said gain circuit (300), using said pitch period supplied from the voiced/unvoiced discriminating circuit (200) and to supply the output signal to the adaptive codebook circuit (110) and to the synthesis filter circuit (170);

(I) said synthesis filter circuit (170) being adapted to pass the output signal of said pitch pre-filter (400) through a synthesis filter, formed using said filter coefficients, to reproduce a signal for band extension; and

(J) an adder (190) adapted to add a signal corresponding to said input signal converted in a sampling frequency thereof to an output signal of said synthesis filter circuit (170) to generate the band extended signal.
The band extending apparatus as defined in claim 1, further comprising:
a low-pass filter (500), receiving an output signal of said adaptive codebook unit (110) as an input.
The band extending apparatus as defined in any one of claims 1 or 2, wherein
a post-filter (600) is formed using weighting coefficients as weighted version of filter coefficients output from said coefficient calculating unit (130), and wherein an output signal of said synthesis filter unit (170) is passed through said post-filter (600) to reproduce the signal for band extension.
A band extending method for receiving at least an input signal of a preset frequency band to output a band extended signal corresponding to said input signal extended in a frequency band thereof,
said method comprising:
(A) a spectral parameter-calculating step for receiving at least an input signal of a preset frequency band to calculate spectral parameters representing spectral characteristics;

(B) a coefficient-calculating step for shifting the frequency of said spectral parameters to then calculate filter coefficients;

(C) a voiced/unvoiced-discriminating step for supplying, for a voiced frame, a preset delay derived from a voiced/unvoiced discrimination, as a pitch period to an adaptive codebook generating step;

(D) said adaptive generating step receiving the delay from said voiced/ unvoiced discriminating step, as a delay of said adaptive codebook, generating an adaptive codebook signal based on the past sound source signal outputted from a synthesis filtering step and outputting the adaptive codebook signal generated;

(E) a step of generating a noise signal;

(F) a gain-adjusting step for receiving voiced/unvoiced discrimination information resulting from said voiced/unvoiced discrimination, to output a resulting gain adjustment signal for adjusting the gain of said adaptive codebook signal and the gain of the generated noise signal, depending on whether said voiced/unvoiced discrimination information indicates voiced or unvoiced;

(G) a gain-multiplying step for receiving said gain adjustment signal from said gain adjusting step and multiplying the output signal of said adaptive codebook and the generated noise signal with said gain adjustment signal to output two signals that are summed together to form a resulting sum signal;

(H) a pitch pre-filtering step for filtering said resulting sum signal from said gain multiplying step, using said pitch period, supplying the output signal of said pitch pre-filtering step to the adaptive codebook generating step and to the synthesis filtering step;

(I) said synthesis filtering step passing an output signal of said pitch pre-filtering step through a synthesis filter, formed using said filter coefficients, to reproduce a signal for band extension; and

(J) a step for adding a signal corresponding to said input signal converted in a sampling frequency thereof to an output signal of said synthesis filtering step to generate the band extended signal.
The band extending method as defined in claim 4, further comprising:
processing said adaptive codebook signal with a low-pass filter (500) to allow frequency components not higher than a preset cut-off frequency to pass there through.
The band extending method as defined in any one of claims 4 or 5, comprising:
passing an output signal of said synthesis filtering step through a post-filter (600) formed using weighting coefficients corresponding to weighted version of said filter coefficients to reproduce the signal for band extension.