CN105280190A - Bandwidth extension encoding and decoding method and device

Info

Publication number: CN105280190A
Application number: CN201510591602.3A
Authority: CN
Inventors: 闫建新; 王磊
Original assignee: Shenzhen Rising Source Technology Co ltd
Current assignee: Guangdong Guangsheng Research And Development Institute Co ltd
Priority date: 2015-09-16
Filing date: 2015-09-16
Publication date: 2016-01-27
Anticipated expiration: 2035-09-16
Also published as: CN105280190B

Abstract

The invention relates to a bandwidth extension encoding and decoding method and device that combine two key technologies: adaptive multi-resolution filtering with adaptive time-frequency grid construction, and high-frequency detail generation based on complex linear predictive coding. Together they can significantly improve both the coding efficiency and the sound quality of the high-frequency part of a digital audio signal, while the low-frequency part can still be coded with traditional perceptual audio coding (such as DRA), yielding a coding technique with higher subjective sound quality at low and medium bit rates. In addition, the invention is an enhancement tool added on top of existing high-quality perceptual coding algorithms such as DRA, so backward compatibility with those algorithms is preserved. A digital audio codec based on the invention can be used in fields such as satellite HDTV sound processing and high-quality audio broadcasting.

Description

Bandwidth extension encoding and decoding method and device

Technical field

The present invention relates to digital audio coding and decoding technology, and more particularly to a bandwidth extension encoding and decoding method and device.

Background art

The typical stereo operating bit rate of traditional perceptual audio coding (DRA, AAC, MP3, etc.) is 96 to 128 kbps; below 64 kbps stereo, the coding quality shows obvious subjective distortion. The typical coding bit rate for FM broadcasting applications is 48 to 64 kbps stereo, at which the subjective sound quality of traditional perceptual audio coding cannot meet FM broadcasting requirements.

For this reason, bandwidth extension (BWE) coding techniques for digital audio signals have been proposed. Many BWE coding techniques exist, and their performance varies widely. Two coding algorithms have been published as bandwidth extension techniques in international standards:

The first is Spectral Band Replication (SBR) coding, described in ISO/IEC 14496-3 (MPEG-4). Fig. 1 shows a block diagram of SBR encoding. SBR operates in the frequency domain. Its encoding principle is as follows: each frame passes through a 64-band quadrature mirror filter (QMF) bank, producing 64 uniform subbands of 32 samples each; a suitable time-frequency grid is chosen according to the transient characteristics of the current signal; and the energy of each grid cell is computed and Huffman coded. The algorithm also includes tonality detection and transmits parameters for additional individual sinusoids. Fig. 2 shows a block diagram of SBR decoding. Its principle is as follows: the decoded PCM output by the core decoder (AAC) passes through a 32-band QMF bank, producing 32 uniform subbands of 32 samples each; the high band is generated according to the control parameters output by the SBR demultiplexer and then adjusted according to the control parameters and envelope data; finally, the 32 low-frequency QMF subbands and the adjusted high-frequency QMF subbands are fed together into a 64-band QMF synthesis bank, which outputs the full-band PCM audio signal.

The main drawbacks of MPEG SBR coding are as follows. (1) The time-frequency segmentation is relatively fixed. At a 48 kHz sampling rate with a 64-band QMF, the best frequency resolution is 375 Hz (24 kHz/64); with 2048 samples per frame, the best time resolution is about 1.3 ms (64/48000 s). Because audio signals are extremely complex, this sometimes fails to meet the accuracy needed for signal analysis. (2) SBR generates high-frequency detail either by copying directly from the low-frequency part or by simple filtering of low-frequency subbands. At (very) low bit rates this greatly reduces the bit rate of the high-frequency part, but the low and high frequencies of a channel are closely similar only with small probability, so SBR's recovery of high-frequency detail is rather coarse. Although other techniques are applied to reduce the resulting distortion, the reconstruction of the high-frequency part as a whole still struggles to reach high quality. SBR's handling of high-frequency detail is therefore clearly deficient when relatively high quality is required of digital audio coding.
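The two resolution figures quoted above follow directly from the sampling rate and the number of QMF bands. The short sketch below reproduces the arithmetic; the function name `qmf_resolution` is illustrative and does not come from any standard.

```python
def qmf_resolution(sample_rate_hz, num_bands):
    """Return (frequency resolution in Hz, time resolution in ms) of a uniform QMF bank."""
    nyquist_hz = sample_rate_hz / 2.0
    freq_res_hz = nyquist_hz / num_bands             # width of one subband
    time_res_ms = 1000.0 * num_bands / sample_rate_hz  # one subband sample spans num_bands input samples
    return freq_res_hz, time_res_ms


freq_res, time_res = qmf_resolution(48_000, 64)
print(f"{freq_res:.0f} Hz, {time_res:.2f} ms")  # 375 Hz, 1.33 ms
```

Halving the number of bands halves the frequency resolution and doubles the time resolution, which is the trade-off a fixed 64-band bank cannot adjust.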

The second is a simple bandwidth extension technique included in the 3GPP AMR-WB+ coding method. It operates in the time domain. Its main encoding principle is as follows: the input signal is split into a low-frequency and a high-frequency time-domain signal of equal bandwidth; the low-frequency (LF) part is passed through LPC analysis filtering to obtain the residual of the low-frequency signal, which is then passed through high-frequency LPC synthesis filtering to simulate the high-frequency detail signal; this is compared with the actual high-frequency signal $S_{HF}(n)$ to obtain a gain vector (one gain value per subframe) for the high-frequency envelope (energy); the gain vector is then further corrected for gain consistency at the junction between the low and high bands, and encoded. The corrected gain vector and the high-frequency LPC coefficients are thus transmitted to the decoder. High-frequency decoding in AMR-WB+ is essentially the inverse of encoding.

The bandwidth extension technique of 3GPP AMR-WB+ has the following problems: (1) high-frequency coding is performed in the time domain, so higher frequency resolution cannot be obtained, since the method in effect has only a single high-frequency region; (2) the start frequency of high-frequency coding is fixed at Fs/4, which is 12 kHz for a 48 kHz sampling frequency; (3) harmonic signals in the high band cannot be recovered accurately; (4) the envelope of the high-frequency signal is not recovered accurately enough.

There are also bandwidth extension techniques whose time-frequency transform is a conventional FFT: the high band is divided into several regions in the frequency domain and the spectral energy of each region is encoded, so each frame offers only a single time resolution with multiple frequency regions. Such FFT-based high-frequency reconstruction has high frequency resolution but too low time resolution, and when the input audio signal changes rapidly, the reconstructed high-frequency signal cannot track the original well.

Intensity stereo coding in digital audio coding can also be regarded as a special bandwidth extension technique. It exploits the insensitivity of human hearing to high-frequency detail: the high-frequency parts of the channels of a stereo or 5.1 surround signal are downmixed into a single channel, which after normalization serves as the high-frequency detail signal for all channels, while the high-frequency envelope of each channel (the energy in the high-frequency critical bands) must still be encoded and transmitted.

Summary of the invention

The technical problem to be solved by the present invention is, in view of the above defects of the prior art, to provide a bandwidth extension encoding and decoding method and device that improve the coding efficiency and the sound quality of the high-frequency part of a digital audio signal.

The technical solution adopted by the present invention to solve this problem is a bandwidth extension encoding method comprising the following steps:

S1, performing adaptive multi-resolution filtering and adaptive time-frequency grid construction on the input mono audio signal to obtain the optimal time-frequency grid information, specifically comprising:

S11, selecting a frequency resolution based on transient analysis of the input mono audio signal and applying adaptive multi-resolution filtering to it, obtaining the optimal time-frequency filtered signal;

S12, performing transient detection and localization on each subband signal output by the filtering and, based on the transient analysis of each subband signal together with the configured high-band coding bit rate and the critical-band characteristics of the human ear, constructing an adaptive grid in the frequency and time directions, obtaining the optimal time-frequency grid at the current bit rate;

S2, performing high-frequency detail encoding in units of the optimal time-frequency grid, specifically comprising:

S21, applying complex linear prediction analysis filtering to each subband signal output by the filtering in step S11 to obtain the residual signal of each subband and the prediction coefficients, determining in turn the correspondence between every high-frequency subband residual signal and the low-frequency subband residual signals, and outputting the subband residual copy parameters;

S22, quantizing and encoding the prediction coefficients;

S3, in units of the optimal time-frequency grid, entropy coding the high-frequency envelope of each subband signal output by the filtering in step S11;

S4, multiplexing the coding parameters and outputting the bandwidth extension bitstream, the coding parameters comprising the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients and the high-frequency subband envelope parameters.

According to one embodiment of the present invention, step S11 further comprises:

for transient signals, selecting coarse frequency resolution and high time resolution for filtering;

for steady-state signals, selecting fine frequency resolution and low time resolution for filtering;

for other audio signals, adaptively selecting intermediate frequency resolution and intermediate time resolution for filtering.

According to one embodiment of the present invention, the adaptive grid construction in the frequency direction in step S12 further comprises selecting different grid configurations according to the frequency characteristics of the high-band part of the input mono audio signal, specifically:

for general audio signals, the frequency grid progressively lowers its frequency resolution as frequency rises within the high band, so that the grid matches the critical bands of the human ear;

when the high-band part contains tonal signals, the frequency resolution of the grid is suitably increased relative to the general case, while still taking the critical bands into account;

The adaptive grid construction in the time direction in step S12 further comprises: dividing the time direction into multiple time intervals, each representing one grid cell, according to the positions at which one or more transients occur in the input mono audio signal and the transient characteristics of each subband signal.

According to one embodiment of the present invention, determining in turn the correspondence between every high-frequency subband residual signal and the low-frequency subband residual signals in step S21, and outputting the subband residual copy parameters, further comprises either of the following:

analysing the residual signal of each high-frequency subband, selecting the best-matching low-frequency subband from the low-frequency subband residual signals, and encoding and outputting the subband numbers of all low-frequency subbands so obtained;

for each group of consecutive high-frequency subband residual signals, selecting the best-matching group of consecutive low-frequency subbands from the low-frequency subband residual signals, and encoding and outputting the start and end subband numbers of the groups of low-frequency subbands so obtained.

According to one embodiment of the present invention, step S21 further comprises:

S211, applying overlapped windowing with a Hamming window to the high-frequency subband signals;

S212, applying linear prediction filtering to the overlap-windowed high-frequency subband signals to obtain the high-frequency subband residual signals;

S213, solving for the prediction coefficients with the Levinson-Durbin algorithm under the criterion of minimizing the mean-square error of the residual signal.
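As an illustration of S212 and S213, the following is a minimal sketch of the standard Levinson-Durbin recursion for complex-valued data, using the autocorrelation method. The function names, the sign convention e[n] = x[n] + Σ a[i]·x[n-1-i] and the zero-history boundary handling are assumptions made for this sketch; the patent text does not fix these details.

```python
import cmath


def autocorrelation(x, max_lag):
    """r[k] = sum_n x[n] * conj(x[n-k]) for k = 0..max_lag."""
    return [sum(x[n] * x[n - k].conjugate() for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err); a[i] multiplies x[n-1-i] in
    e[n] = x[n] + sum_i a[i] * x[n-1-i], chosen to minimise the mean-square of e."""
    a = []
    err = r[0].real                      # zero-lag autocorrelation is real
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = -acc / err                   # reflection coefficient (requires err > 0)
        a = [a[i] + k * a[m - 2 - i].conjugate() for i in range(m - 1)] + [k]
        err *= 1.0 - abs(k) ** 2         # prediction error energy shrinks each order
    return a, err


def prediction_residual(x, a):
    """Analysis filter: e[n] = x[n] + sum_i a[i] * x[n-1-i], with zero history."""
    return [x[n] + sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
            for n in range(len(x))]


# Usage: a complex exponential is almost perfectly predicted with order 1.
x = [cmath.exp(0.3j * n) for n in range(96)]
a, err = levinson_durbin(autocorrelation(x, 1), 1)
e = prediction_residual(x, a)
```

Working on complex QMF samples means the coefficients are complex as well, so a single first-order coefficient can already track a rotating phasor, which a real-valued predictor would need two coefficients to do.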

To solve its technical problem, the present invention also provides a bandwidth extension decoding method comprising the following steps:

S1, demultiplexing the input bandwidth extension bitstream to obtain the coding parameters, the coding parameters comprising the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients and the high-frequency subband envelope parameters;

S2, entropy decoding the high-frequency subband envelope parameters based on the time-frequency grid to obtain the high-frequency subband envelope signal;

S3, applying complex quadrature filter bank analysis filtering to the decoded low-frequency signal to obtain the low-frequency subband signals;

S4, performing high-frequency detail decoding based on the time-frequency grid, using the low-frequency subband signals, the subband residual copy parameters and the prediction coefficients, specifically comprising:

S41, applying complex linear prediction analysis filtering to the low-frequency subband signals to obtain the low-frequency subband residual signals;

S42, inverse quantizing and decoding the prediction coefficients;

S43, copying low-frequency subband residual signals to the high-frequency subband residual signals according to the subband residual copy parameters, then applying linear prediction synthesis filtering to the high-frequency subbands according to the prediction coefficients to obtain the high-frequency subband detail signals;

S5, in units of the time-frequency grid, adjusting the high-frequency subband detail signals with the high-frequency subband envelope signal to obtain the high-frequency subband signals;

S6, performing, according to the multi-resolution filter selection parameter, multi-resolution synthesis matching that of the encoder on the high-frequency subband signals and the low-frequency subband signals, and outputting the full-band mono audio signal.
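Steps S43 and S5 can be sketched for a single high-frequency subband as follows. This is only an illustration under stated assumptions: the envelope is taken to be one target energy per grid cell, the copy parameter is a single low-frequency subband index, and the function names (`lpc_synthesis`, `scale_to_energy`, `decode_hf_subband`) are hypothetical rather than taken from the patent.

```python
def lpc_synthesis(residual, a):
    """Synthesis filter, inverse of e[n] = x[n] + sum_i a[i] * x[n-1-i]:
    x[n] = e[n] - sum_i a[i] * x[n-1-i], with zero history."""
    out = []
    for n, e in enumerate(residual):
        past = sum(a[i] * out[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
        out.append(e - past)
    return out


def scale_to_energy(cell, target_energy):
    """Scale the samples of one grid cell so that their energy equals target_energy."""
    energy = sum(abs(v) ** 2 for v in cell)
    gain = (target_energy / energy) ** 0.5 if energy > 0 else 0.0
    return [gain * v for v in cell]


def decode_hf_subband(lf_residuals, source_index, a_hf, time_grid, envelope):
    """S43: copy the selected low-frequency residual and run LPC synthesis.
    S5: impose the decoded envelope cell by cell along the time grid."""
    detail = lpc_synthesis(lf_residuals[source_index], a_hf)
    out = []
    for (start, end), target in zip(time_grid, envelope):
        out.extend(scale_to_energy(detail[start:end], target))
    return out


# Usage: two grid cells of two samples each, with target energies 8 and 2.
hf = decode_hf_subband([[1, 1, 1, 1]], 0, [], [(0, 2), (2, 4)], [8.0, 2.0])
```

The synthesis filter restores the spectral fine structure described by the transmitted prediction coefficients, while the per-cell scaling restores the energy contour, so the two pieces of side information play separate roles.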

To solve its technical problem, the present invention also provides a bandwidth extension encoding device, comprising:

an adaptive multi-resolution filtering and time-frequency grid construction module, for performing adaptive multi-resolution filtering and adaptive time-frequency grid construction on the input mono audio signal to obtain the optimal time-frequency grid information, specifically comprising:

an adaptive multi-resolution filtering submodule, for selecting a frequency resolution based on transient analysis of the input mono audio signal and applying adaptive multi-resolution filtering to it, obtaining the optimal time-frequency filtered signal;

a time-frequency grid construction submodule, for performing transient detection and localization on each subband signal output by the filtering and, based on the transient analysis of each subband signal together with the configured high-band coding bit rate and the critical-band characteristics of the human ear, constructing an adaptive grid in the frequency and time directions, obtaining the optimal time-frequency grid at the current bit rate;

a high-frequency detail encoding module, for performing high-frequency detail encoding in units of the optimal time-frequency grid, specifically comprising:

a complex linear prediction analysis submodule, for applying complex linear prediction analysis filtering to each subband signal output by the filtering in step S11 to obtain the residual signal of each subband and the prediction coefficients, determining in turn the correspondence between every high-frequency subband residual signal and the low-frequency subband residual signals, and outputting the subband residual copy parameters;

a quantization encoding submodule, for quantizing and encoding the prediction coefficients;

a high-frequency envelope encoding module, for entropy coding, in units of the optimal time-frequency grid, the high-frequency envelope of each subband signal output by the filtering in step S11;

a parameter multiplexing module, for multiplexing the coding parameters and outputting the bandwidth extension bitstream, the coding parameters comprising the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients and the high-frequency subband envelope parameters.

According to one embodiment of the present invention, the adaptive grid construction in the frequency direction performed by the time-frequency grid construction submodule further comprises selecting different grid configurations according to the frequency characteristics of the high-band part of the input mono audio signal.

The adaptive grid construction in the time direction performed by the time-frequency grid construction submodule further comprises: dividing the time direction into multiple time intervals, each representing one grid cell, according to the positions at which one or more transients occur in the input mono audio signal and the transient characteristics of each subband signal.

To solve its technical problem, the present invention also provides a bandwidth extension decoding device, characterized in that it comprises:

a parameter demultiplexing module, for demultiplexing the input bandwidth extension bitstream to obtain the coding parameters, the coding parameters comprising the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients and the high-frequency subband envelope parameters;

a high-frequency envelope decoding module, for entropy decoding the high-frequency subband envelope parameters based on the time-frequency grid to obtain the high-frequency subband envelope signal;

a complex quadrature filter analysis module, for applying complex quadrature filter bank analysis filtering to the decoded low-frequency signal to obtain the low-frequency subband signals;

a high-frequency detail decoding module, for performing high-frequency detail decoding based on the time-frequency grid, using the low-frequency subband signals, the subband residual copy parameters and the prediction coefficients, specifically comprising:

a complex linear prediction analysis submodule, for applying complex linear prediction analysis filtering to the low-frequency subband signals to obtain the low-frequency subband residual signals;

an inverse quantization submodule, for inverse quantizing and decoding the prediction coefficients;

a high-frequency synthesis submodule, for copying low-frequency subband residual signals to the high-frequency subband residual signals according to the subband residual copy parameters, then applying linear prediction synthesis filtering to the high-frequency subbands according to the prediction coefficients to obtain the high-frequency subband detail signals;

a high-frequency adjustment module, for adjusting, in units of the time-frequency grid, the high-frequency subband detail signals with the high-frequency subband envelope signal to obtain the high-frequency subband signals;

an adaptive multi-resolution synthesis filtering module, for performing, according to the multi-resolution filter selection parameter, multi-resolution synthesis matching that of the encoder on the high-frequency subband signals and the low-frequency subband signals, and outputting the full-band mono audio signal.

The bandwidth extension encoding and decoding method and device of the present invention combine two key technologies: AFAG (Adaptive multi-resolution Filtering & Adaptive time-frequency Griding, that is, adaptive multi-resolution filtering and adaptive time-frequency grid construction) and high-frequency detail generation based on CLPC (Complex Linear Predictive Coding). They can significantly improve the coding efficiency and the sound quality of the high-frequency part of a digital audio signal, while the low-frequency part can still be coded with traditional perceptual audio coding (such as DRA), yielding a coding technique with higher subjective sound quality at both low and medium bit rates. In addition, the invention is an enhancement tool added on top of existing high-quality perceptual coding algorithms such as DRA, so backward compatibility with traditional perceptual coding algorithms such as DRA is preserved. A digital audio codec based on the invention can be used in fields such as satellite HDTV sound processing and high-quality audio broadcasting.

Brief description of the drawings

The invention is further described below with reference to the drawings and embodiments. In the drawings:

Fig. 1 is a block diagram of the existing SBR encoding method;

Fig. 2 is a block diagram of the existing SBR decoding method;

Fig. 3 is a flowchart of the bandwidth extension encoding method of one embodiment of the present invention;

Fig. 4 is a flowchart of the bandwidth extension decoding method of one embodiment of the present invention;

Fig. 5 is a logic block diagram of the bandwidth extension encoding device of one embodiment of the present invention;

Fig. 6 is a logic block diagram of the high-frequency detail encoding module in Fig. 5;

Fig. 7 is a logic block diagram of the bandwidth extension decoding device of one embodiment of the present invention;

Fig. 8 is a logic block diagram of the high-frequency detail decoding module in Fig. 7;

Fig. 9 is a schematic diagram of the bandwidth extension encoding method of one embodiment of the present invention applied to DRA encoding;

Fig. 10 is a schematic diagram of the bandwidth extension decoding method of one embodiment of the present invention applied to DRA decoding;

Fig. 11 is a schematic diagram of optimal time-frequency grid construction under a bit-rate constraint.

Detailed description

To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.

Fig. 3 shows a flowchart of a bandwidth extension encoding method 100 according to an embodiment of the present invention. As shown in Fig. 3, the method 100 comprises the following steps:

In step S110, adaptive multi-resolution filtering and adaptive time-frequency grid construction are performed on the input mono audio signal to obtain the optimal time-frequency grid information. Specifically, step S110 further comprises the following steps:

Step S111: select a frequency resolution based on transient analysis of the input mono audio signal and apply adaptive multi-resolution filtering to it, obtaining the optimal time-frequency filtered signal.

In this step, the transient behaviour of the input mono audio signal is first analysed in real time. A frequency resolution is then selected according to the steady-state or transient characteristics found, so that the best multi-rate filter bank (QMF) is chosen to filter the audio signal and output the optimal time-frequency filtered signal. In general, the strategy for selecting the adaptive multi-resolution filter from the transient behaviour of the input mono audio signal is as follows:

For transient signals, coarse frequency resolution and high time resolution can be selected for filtering;

For steady-state signals, fine frequency resolution and low time resolution can be selected for filtering;

For other audio signals, intermediate frequency resolution and intermediate time resolution can be selected adaptively for filtering.

In addition, the limited bit rate available for high-frequency bandwidth extension coding must be considered. If the total bit rate of the audio coding is low, so that the bit rate for the high-band signal is also low (that is, few bits are available for coding the high band), the frequency resolution used to filter the high-band signal should be suitably reduced. In other words, the frequency resolution chosen from the transient characteristics of the input signal alone may be lowered further.
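The selection strategy above can be sketched as a small decision function. Everything numeric here is an assumption for illustration: the candidate band counts, the 0-to-1 `transient_score` measure and the two thresholds are hypothetical and are not specified in the text.

```python
def select_qmf_bands(transient_score, hf_bits_are_scarce=False,
                     candidates=(16, 32, 64, 128)):
    """Pick a QMF band count; fewer bands means coarser frequency and finer time resolution.

    transient_score is a hypothetical measure in [0, 1]
    (1.0 = strongly transient, 0.0 = stationary); thresholds are illustrative only.
    """
    if transient_score >= 0.7:
        index = 0                        # transient: coarse frequency, fine time
    elif transient_score <= 0.2:
        index = len(candidates) - 1      # steady state: fine frequency, coarse time
    else:
        index = len(candidates) // 2     # other signals: intermediate resolution
    if hf_bits_are_scarce and index > 0:
        index -= 1                       # low bit rate: lower the frequency resolution one step
    return candidates[index]


# Usage: a stationary frame at a low high-band bit rate drops from 128 to 64 bands.
bands = select_qmf_bands(0.1, hf_bits_are_scarce=True)
```

Keeping the bit-rate adjustment as a separate final step mirrors the text: the signal-driven choice is made first and is then relaxed only if the bit budget requires it.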

Step S112: perform transient detection and localization on each subband signal output by the filtering and, based on the transient analysis of each subband signal together with the configured high-band coding bit rate and the critical-band characteristics of the human ear, construct an adaptive grid in the frequency and time directions, obtaining the optimal time-frequency grid at the current bit rate.

The construction of the time-frequency grid depends on where the transients lie within a frame, and even on the transient analysis of each filtered subband signal. It must also take into account the bit rate allotted to the high band and the critical-band characteristics of the human ear. The basic strategy of adaptive time-frequency grid construction therefore has two parts: grid construction in the time direction, that is, combining samples within the same frequency subband; and grid construction in the frequency direction, that is, combining different frequency subbands.

The adaptive grid construction strategy in the frequency direction selects different grid configurations mainly according to the frequency characteristics of the high-band part of the input mono audio signal. For general audio signals, the frequency grid progressively lowers its frequency resolution as frequency rises within the high band, so that the grid matches the critical bands of the human ear. When the high-band part contains tonal signals, the frequency resolution of the grid should be suitably increased relative to the general case, while still taking the critical bands into account. The adaptive grid construction in the time direction depends mainly on the positions at which one or more transients occur in the input mono audio signal and on the transient characteristics of each subband signal: the time direction is divided into multiple time intervals, each representing one grid cell.
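A minimal sketch of both grid directions is given below. It assumes Traunmüller's approximation of the Bark scale as a stand-in for the critical bands (extrapolated above the range where the Bark scale is normally tabulated), a fixed grid-cell width expressed as a fraction of a Bark, and transient positions already expressed as subband sample indices. None of these choices is prescribed by the text, and the function names are hypothetical.

```python
def bark(f_hz):
    """Traunmueller's approximation of the Bark (critical-band) scale."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def frequency_grid(first_band, last_band, band_width_hz, step_bark):
    """Group QMF subbands first_band..last_band (inclusive) into cells about step_bark wide.

    Because the Bark scale flattens with frequency, higher cells span more subbands.
    """
    z0 = bark(first_band * band_width_hz)
    cells, start, current = [], first_band, 0
    for k in range(first_band + 1, last_band + 1):
        index = int((bark(k * band_width_hz) - z0) / step_bark)
        if index != current:             # crossed into the next Bark step: close the cell
            cells.append((start, k - 1))
            start, current = k, index
    cells.append((start, last_band))
    return cells


def time_grid(num_slots, transient_slots):
    """Split [0, num_slots) into intervals whose borders sit at the transient positions."""
    inner = (t for t in transient_slots if 0 < t < num_slots)
    borders = sorted({0, num_slots, *inner})
    return list(zip(borders[:-1], borders[1:]))


# Usage: high band = subbands 32..63 of a 64-band QMF at 48 kHz (375 Hz per subband).
general = frequency_grid(32, 63, 375.0, 1 / 2)   # coarser cells for general signals
finer = frequency_grid(32, 63, 375.0, 1 / 3)     # finer cells, e.g. for tonal content
intervals = time_grid(32, [10, 20])              # transients at subband samples 10 and 20
```

A smaller `step_bark` yields more and narrower frequency cells, which is one way to express the "suitably increased" frequency resolution for the second case.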

Moreover, the time-frequency grid computed above from the characteristics of the current high-band signal is also constrained by the bit rate of the high-frequency bandwidth extension coding. The grid obtained in the frequency and time directions must therefore be corrected according to that bit rate to obtain the optimal time-frequency grid at the current bit rate, as shown in Fig. 11. The basic correction methods include:

(1) Reducing the frequency resolution of the grid: the width of each grid cell in the frequency direction is increased, for example from 1/3 of a critical band to 1/2 of a critical band, or the lower-frequency portion of the QMF high-frequency subbands uses cells 1/3 of a critical band wide while the remainder uses cells 1/2 of a critical band wide.

(2) Optimizing the time-direction grid structure across the QMF high-frequency subbands: if the grid is constructed from the transient characteristics of each subband signal individually, different QMF subbands may have different numbers of grid cells with different start and end samples, so more information must be transmitted. The grid intervals of the QMF subbands can therefore be adjusted jointly so that grid boundary (interval) descriptions are shared or reduced. For example, all BWE high-frequency subbands may share the same grid configuration, which minimizes the side information; or the BWE high-frequency subbands have n time-frequency grid cells (for example, n < 4), with higher BWE subbands having fewer cells: each has half as many cells as the previous subband, and each of its cells aligns with two cells of the previous subband.

(3) Reducing the time resolution of the grid: in the time domain of the QMF subband signals, the width of the grid cells is increased (each cell contains more subband samples). For example, if 16 uniform intervals were originally constructed in the time direction, they can be merged pairwise into 8 uniform intervals, or partially merged into 12 intervals (the cells in the region where the subband signal has strong transients are left unchanged, while those before and after are merged as appropriate).

Next, in step S120 of method 100, high-frequency detail encoding is performed in units of the optimal time-frequency grid. Specifically, step S120 further comprises the following steps:

Step S121: apply complex linear prediction analysis filtering to each subband signal output by the filtering in step S111 to obtain the residual signal of each subband and the prediction coefficients, then determine in turn the correspondence between every high-frequency subband residual signal and the low-frequency subband residual signals, and output the subband residual copy parameters. Specifically, the relationship between the high-frequency and low-frequency subband residual signals can be determined by either of the following two methods:

First method: analyse the residual signal of each high-frequency subband that requires parametric coding, select the most suitable low-frequency subband from the low-frequency subband residual signals, and take the number of that low-frequency subband as a parameter. All subband numbers obtained in this way are encoded and output as the subband residual copy parameters.

Second method: for each group of consecutive high-frequency subband residual signals, select the best-matching group of consecutive low-frequency subbands from the low-frequency subband residual signals, and take the start and end subband numbers of that group as parameters. All high-frequency subband signals are processed in this way, giving multiple pairs of start and end subband numbers, which are encoded and output as the subband residual copy parameters.
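The two methods can be sketched as simple searches. The text does not say how "most suitable" is measured; the sketch below assumes the magnitude of the normalised complex inner product between residual sequences as the matching criterion, and all function names are hypothetical.

```python
def similarity(a, b):
    """Magnitude of the normalised complex inner product of two residual sequences."""
    num = abs(sum(x * y.conjugate() for x, y in zip(a, b)))
    den = (sum(abs(x) ** 2 for x in a) * sum(abs(y) ** 2 for y in b)) ** 0.5
    return num / den if den > 0 else 0.0


def best_source_per_subband(hf_residuals, lf_residuals):
    """First method: one low-frequency subband number per high-frequency subband."""
    return [max(range(len(lf_residuals)),
                key=lambda j: similarity(h, lf_residuals[j]))
            for h in hf_residuals]


def best_source_range(hf_group, lf_residuals):
    """Second method: (start, end) numbers of the contiguous low-frequency run that
    best matches a group of consecutive high-frequency subbands.
    Assumes the group is no longer than the list of low-frequency subbands."""
    n = len(hf_group)
    scores = {s: sum(similarity(h, lf_residuals[s + i]) for i, h in enumerate(hf_group))
              for s in range(len(lf_residuals) - n + 1)}
    start = max(scores, key=scores.get)
    return start, start + n - 1


# Usage with three short low-frequency residuals.
lf = [[1 + 0j, 1j, -1 + 0j, -1j],
      [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j],
      [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]]
hf = [[2 + 0j, -2 + 0j, 2 + 0j, -2 + 0j],   # scaled copy of lf[2]
      [1j, -1 + 0j, -1j, 1 + 0j]]           # phase-rotated copy of lf[0]
per_subband = best_source_per_subband(hf, lf)
group_range = best_source_range([lf[1], lf[2]], lf)
```

The first method costs one subband number per high-frequency subband, while the second transmits only two numbers per group, trading matching freedom for less side information.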

Step S122: quantize and encode the prediction coefficients obtained in step S121 and output them.

Next, in step S130 of method 100, in units of the optimal time-frequency grid, the high-frequency envelope of each subband signal output by the filtering in step S111 is entropy coded, and the high-frequency subband envelope parameters are output.

Then, in step S140, all BWE coding parameters are multiplexed and the BWE bitstream is output. The coding parameters comprise the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients and the high-frequency subband envelope parameters.

The bandwidth extension encoding method 100 of this embodiment uses the adaptive grid construction technique of step S110 (referred to as the AFAG algorithm) to obtain the optimal time-frequency grid through adaptive multi-resolution filtering and adaptive time-frequency grid construction. This benefits the subsequent high-frequency detail coding and can significantly improve the coding efficiency of the high-frequency part of the digital audio signal. In step S120, method 100 performs CLPC analysis on the high-frequency subbands and transmits the prediction coefficients, which ensures the accuracy of the high-frequency envelope and thereby improves the sound quality of the high-frequency part of the audio signal.

When high-frequency detail coding is performed in the bandwidth extension encoding method according to a specific embodiment of the present invention, CLPC analysis is carried out and the prediction coefficients are solved by the following procedure:

First step: apply overlapping windowing to the high-frequency subband signal using a Hamming window. For a 32-subband QMF, the window length may be 96 QMF samples, comprising 32 overlapping QMF samples from the previous frame and 64 QMF samples from the current frame. Overlapping windowing of the QMF samples x_{hf}[n][k] of high-frequency subband k yields w_{hf}[n][k] as follows:

w_{hf}[n][k] = x_{hf}[n][k] \cdot win[n], \quad n = 0, 1, \ldots, 95

where win[n] is the Hamming window.

Second step: apply linear prediction filtering to the overlap-windowed high-frequency subband signal to obtain the high-frequency subband residual signal:

e_{hf}[n][k] = w_{hf}[n][k] - \sum_{i=1}^{p} a[i] \cdot w_{hf}[n-i][k]

where p is the prediction order, typically 3 or 4; a[i] are the prediction coefficients; and e_{hf}[n][k] is a high-frequency residual sample.

Third step: solve for the prediction coefficients a[i] with the Levinson-Durbin algorithm, under the criterion of minimizing the mean square error of the residual signal e_{hf}[n][k].
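The three steps above can be sketched in Python for a single subband of complex QMF samples. This is an illustrative sketch under stated assumptions, not the patented implementation: the symmetric Hamming formula, the autocorrelation estimate, and the names `hamming`, `autocorrelation`, `levinson_durbin`, `lpc_residual` and `analyze_subband` are not given in the description and are chosen here for illustration.

```python
import math


def hamming(length):
    # Assumption: standard symmetric Hamming window; the text only names the type.
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def autocorrelation(x, max_lag):
    # r[m] = sum_n x[n] * conj(x[n - m]), for m = 0 .. max_lag
    return [sum(x[n] * x[n - m].conjugate() for n in range(m, len(x)))
            for m in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Complex Levinson-Durbin recursion.

    Returns (a, err), where a[i-1] is the prediction coefficient a[i] of
    e[n] = w[n] - sum_i a[i] * w[n-i], and err is the final prediction error.
    """
    c = [1.0 + 0.0j]          # error-filter coefficients, c[0] = 1, c[i] = -a[i]
    err = r[0].real
    for j in range(order):
        if err <= 0.0:        # degenerate input: leave remaining coefficients at zero
            c.append(0.0j)
            continue
        gamma = r[j + 1] + sum(c[i] * r[j - i + 1] for i in range(1, j + 1))
        k = -gamma / err      # reflection coefficient
        c = ([c[0]]
             + [c[i] + k * c[j - i + 1].conjugate() for i in range(1, j + 1)]
             + [k])
        err *= 1.0 - abs(k) ** 2
    return [-ci for ci in c[1:]], err


def lpc_residual(w, a):
    # e[n] = w[n] - sum_{i=1..p} a[i] * w[n-i], with w[n-i] = 0 for n-i < 0
    return [w[n] - sum(a[i - 1] * w[n - i] for i in range(1, min(len(a), n) + 1))
            for n in range(len(w))]


def analyze_subband(previous_tail, current_frame, order=3):
    """Window 32 previous + 64 current samples, then solve and apply the predictor."""
    x = list(previous_tail) + list(current_frame)
    win = hamming(len(x))
    w = [xi * wi for xi, wi in zip(x, win)]
    a, _ = levinson_durbin(autocorrelation(w, order), order)
    return a, lpc_residual(w, a)
```

For example, `analyze_subband(prev[-32:], cur, order=3)` with 64 current samples would return three complex prediction coefficients and a 96-sample residual for that subband.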

Based on the bandwidth extension encoding method introduced above, the present invention also proposes a bandwidth extension decoding method. Fig. 4 shows a flowchart of bandwidth extension decoding method 200 according to an embodiment of the invention. As shown in Fig. 4, bandwidth extension decoding method 200 comprises the following steps:

In step S210, the input BWE bitstream is demultiplexed to obtain the coding parameters. The coding parameters comprise the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients, and the high-frequency subband envelope parameters.

In the following step S220, the high-frequency subband envelope parameters are entropy-decoded based on the time-frequency grid to obtain the high-frequency subband envelope signal.

In the following step S230, complex quadrature mirror filter bank (CQMF) analysis filtering is performed on the decoded low-frequency signal to obtain the low-frequency subband signals. That is, the low-frequency signal is obtained from a conventional perceptual audio decoder (for example, a DRA decoder), and CQMF analysis filtering is applied to it to obtain low-frequency subband signals similar to those at the encoder side. These low-frequency subband signals serve both as input to the final subband synthesis and as the basis for generating the high-frequency subband detail signal.

Then, in step S240, high-frequency detail decoding is performed, based on the time-frequency grid, according to the low-frequency subband signals, the subband residual copy parameters, and the prediction coefficients. Specifically, step S240 further comprises the following steps:

Step S241: perform complex linear prediction (CLPC) analysis filtering on the low-frequency subband signals to obtain low-frequency subband residual signals similar to those at the encoder side.

Step S242: inverse-quantize and decode the prediction coefficients. During bandwidth extension decoding, demultiplexing the BWE bitstream yields the high-frequency detail coding parameters, such as the quantized and encoded prediction coefficients and the subband residual copy parameters. In step S242, method 200 decodes and inverse-quantizes the quantized prediction coefficients to obtain the prediction coefficients used for high-frequency CLPC synthesis.

Step S243: according to the subband residual copy parameters, copy the low-frequency subband residual signals to the high-frequency subband residual signals, and then perform linear prediction synthesis filtering of the high-frequency subbands according to the prediction coefficients to obtain the high-frequency subband detail signal.
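Step S243 can be sketched as follows, assuming the copy parameters are one low-frequency subband number per high-frequency subband (the first matching method) and that the synthesis filter is the all-pole inverse of the analysis filter e[n] = w[n] - sum a[i]·w[n-i]. The names `lpc_synthesis` and `generate_high_detail` are hypothetical.

```python
def lpc_synthesis(excitation, a):
    """All-pole synthesis: y[n] = e[n] + sum_{i=1..p} a[i] * y[n-i]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for i in range(1, min(len(a), n) + 1):
            acc += a[i - 1] * out[n - i]
        out.append(acc)
    return out


def generate_high_detail(low_residuals, copy_params, coeffs):
    """For each high-frequency subband h, excite its synthesis filter with the
    low-frequency residual selected by copy_params[h].

    copy_params[h]: index of the source low-frequency subband (assumed layout).
    coeffs[h]:      decoded prediction coefficients of high-frequency subband h.
    """
    return [lpc_synthesis(low_residuals[src], a)
            for src, a in zip(copy_params, coeffs)]
```

Because the excitation is taken from the decoded low band, only the copy parameters and prediction coefficients need to be transmitted for the high-frequency detail.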

Subsequently, in step S250, method 200 uses the high-frequency subband envelope signal obtained in step S220 to adjust, in units of the time-frequency grid, the high-frequency subband detail signal obtained in step S243, thereby obtaining the high-frequency subband signals.
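The description does not spell out how the envelope is applied, so the following Python sketch is only one plausible reading: each grid cell carries a target mean energy per sample, and the detail signal inside the cell is scaled by a single gain so that its energy matches the target. The cell layout `(k0, k1, n0, n1)` and the function name `adjust_envelope` are assumptions introduced for illustration.

```python
import math


def adjust_envelope(detail, cells, target_energy):
    """Scale the detail signal cell by cell to match a target envelope.

    detail[k][n]:   sample n of high-frequency subband k.
    cells:          list of (k0, k1, n0, n1), half-open subband and time ranges
                    describing one time-frequency grid cell each (assumed layout).
    target_energy:  decoded mean energy per sample for each cell (assumed meaning
                    of the envelope signal).
    """
    out = [list(row) for row in detail]
    for (k0, k1, n0, n1), target in zip(cells, target_energy):
        count = (k1 - k0) * (n1 - n0)
        energy = sum(abs(detail[k][n]) ** 2
                     for k in range(k0, k1) for n in range(n0, n1)) / count
        gain = math.sqrt(target / energy) if energy > 0.0 else 0.0
        for k in range(k0, k1):
            for n in range(n0, n1):
                out[k][n] = detail[k][n] * gain
    return out
```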

In the following step S260, according to the multi-resolution filter selection parameter, multi-resolution synthesis corresponding to that of the encoder side is performed on the high-frequency subband signals obtained in step S250 and the low-frequency subband signals obtained in step S230, and the full-band mono audio signal is output.

The bandwidth extension decoding method 200 of this embodiment uses the best-matching low-frequency residual signal from the low-frequency subband signals, in place of the high-frequency subband residual signal, to excite the linear prediction synthesis filter of the high-frequency subband. This yields good high-frequency detail and can therefore improve the sound quality of the high-frequency part of the audio signal.

Based on the bandwidth extension encoding method introduced above, the present invention also proposes a bandwidth extension encoding device. Fig. 5 shows a logic block diagram of bandwidth extension encoding device 300 according to an embodiment of the invention. As shown in Fig. 5, bandwidth extension encoding device 300 comprises an adaptive multi-resolution filtering and time-frequency grid construction (AFAG) module 310, a high-frequency detail coding module 320, a high-frequency envelope coding module 330, and a parameter multiplexing module 340.

The AFAG module 310 performs adaptive multi-resolution filtering and adaptive time-frequency grid construction on the input mono audio signal to obtain the optimal time-frequency grid information. Specifically, the AFAG module 310 further comprises an adaptive multi-resolution filtering submodule 311 and a time-frequency grid construction submodule 312. The adaptive multi-resolution filtering submodule 311 selects the frequency resolution based on a transient analysis of the input mono audio signal and performs adaptive multi-resolution filtering on that signal to obtain the optimal time-frequency filtered signal. The time-frequency grid construction submodule 312 performs transient detection and localization on each subband signal output by the filtering and, according to the transient analysis of each subband signal and taking into account the configured high-band coding bit rate and the critical-band characteristics of the human ear, performs adaptive grid construction in the frequency direction and the time direction to obtain the optimal time-frequency grid at the current bit rate.

The high-frequency detail coding module 320 performs high-frequency detail coding in units of the optimal time-frequency grid. Specifically, as shown in Fig. 6, the high-frequency detail coding module 320 further comprises a complex linear prediction analysis submodule 321 and a quantization coding submodule 322. The complex linear prediction analysis submodule 321 performs complex linear prediction analysis filtering on each subband signal output by the adaptive multi-resolution filtering submodule 311 to obtain the residual signal of each subband and solve for the prediction coefficients; it then determines, in turn, the correspondence between every high-frequency subband residual signal and the low-frequency subband residual signals, and outputs the subband residual copy parameters to the parameter multiplexing module 340. The quantization coding submodule 322 quantizes and encodes the prediction coefficients solved by the complex linear prediction analysis submodule 321 and outputs them to the parameter multiplexing module 340.

The high-frequency envelope coding module 330 performs high-frequency envelope entropy coding, in units of the optimal time-frequency grid, on each subband signal output by the adaptive multi-resolution filtering submodule 311. The parameter multiplexing module 340 multiplexes the coding parameters and outputs the BWE bitstream. The coding parameters may comprise the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients, and the high-frequency subband envelope parameters. For the specific implementation of each module of device 300, see the foregoing description of bandwidth extension encoding method 100.

The bandwidth extension encoding device 300 of this embodiment obtains the optimal time-frequency grid through adaptive multi-resolution filtering and adaptive time-frequency grid construction, which benefits the subsequent high-frequency detail coding and can significantly improve the coding efficiency of the high-frequency part of the digital audio signal. The device 300 performs CLPC analysis on the high-frequency subbands and transmits the prediction coefficients, which ensures the accuracy of the high-frequency envelope and thereby improves the sound quality of the high-frequency part of the audio signal.

Based on the bandwidth extension decoding method introduced above, the present invention also proposes a bandwidth extension decoding device. Fig. 7 shows a logic block diagram of bandwidth extension decoding device 400 according to an embodiment of the invention. As shown in Fig. 7, bandwidth extension decoding device 400 comprises a parameter demultiplexing module 410, a high-frequency envelope decoding module 420, a complex quadrature mirror filter (CQMF) analysis module 430, a high-frequency detail decoding module 440, a high-frequency adjustment module 450, and an adaptive multi-resolution synthesis filtering module 460.

The parameter demultiplexing module 410 demultiplexes the input BWE bitstream to obtain the coding parameters, which comprise the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients, and the high-frequency subband envelope parameters. The high-frequency envelope decoding module 420 entropy-decodes the high-frequency subband envelope parameters based on the time-frequency grid to obtain the high-frequency subband envelope signal. The CQMF analysis module 430 performs complex quadrature mirror filter bank analysis filtering on the low-frequency signal, obtained for example from a conventional perceptual audio decoder, to obtain the low-frequency subband signals. The high-frequency detail decoding module 440 performs high-frequency detail decoding, based on the time-frequency grid, according to the low-frequency subband signals, the subband residual copy parameters, and the prediction coefficients, to obtain the high-frequency subband detail signal. The high-frequency adjustment module 450 uses the high-frequency subband envelope signal obtained by the high-frequency envelope decoding module 420 to adjust, in units of the time-frequency grid, the high-frequency subband detail signal obtained by the high-frequency detail decoding module 440, generating the high-frequency subband signals. The adaptive multi-resolution synthesis filtering module 460 performs, according to the multi-resolution filter selection parameter, multi-resolution synthesis corresponding to that of the encoder side on the high-frequency subband signals generated by the high-frequency adjustment module 450 and the low-frequency subband signals obtained by the CQMF analysis module 430, and outputs the full-band mono audio signal.

In a specific embodiment, as shown in Fig. 8, the high-frequency detail decoding module 440 further comprises a complex linear prediction analysis submodule 441, an inverse quantization submodule 442, and a high-frequency synthesis submodule 443. The complex linear prediction analysis submodule 441 performs complex linear prediction (CLPC) analysis filtering on the low-frequency subband signals obtained by the CQMF analysis module 430 to obtain the low-frequency subband residual signals. The inverse quantization submodule 442 inverse-quantizes and decodes the quantized prediction coefficients obtained by demultiplexing, yielding the prediction coefficients used for high-frequency CLPC synthesis. The high-frequency synthesis submodule 443 performs high-frequency CLPC synthesis: according to the subband residual copy parameters, it copies the low-frequency subband residual signals to the high-frequency subband residual signals and then performs linear prediction synthesis filtering of the high-frequency subbands according to the prediction coefficients to obtain the high-frequency subband detail signal. For the specific implementation of each module of device 400, see the foregoing description of bandwidth extension decoding method 200.

The bandwidth extension decoding device 400 of this embodiment uses the best-matching low-frequency residual signal from the low-frequency subband signals, in place of the high-frequency subband residual signal, to excite the linear prediction synthesis filter of the high-frequency subband. This yields good high-frequency detail and can therefore improve the sound quality of the high-frequency part of the audio signal.

Fig. 9 is a schematic diagram of the bandwidth extension encoding method of an embodiment of the invention applied to DRA coding. As shown in Fig. 9, the basic process of the DRA+ encoding application example, which combines this BWE technique with DRA, is as follows: on one path, the input full-band audio signal passes through low-pass filtering and downsampling to obtain the low-frequency part of the audio signal, which is then encoded by DRA; at the same time, the high-frequency part of the full-band audio signal is encoded by the bandwidth extension encoding method of the present invention; finally, the results are packed into a DRA+ bitstream according to the DRA+ frame format.

In the DRA+ encoding application example shown in Fig. 9, the specific steps of the bandwidth extension encoding method are as follows:

First step: analyze the input PCM audio signal and select a suitable QMF filter bank according to its stationary or transient characteristics. In DRA+, for complexity reasons, only the 32-band QMF and the 128-band QMF are available, so the filtering outputs 32 or 128 subband signals.

Second step: analyze the PCM audio signal further, detect the transient points, and then construct the grid in the time direction. Considering factors such as complexity and the side-information overhead of the time-frequency grid, there are at most 8 grids in the time direction.

Third step: according to the bit rate and the time-grid construction, construct the grid in the frequency direction (that is, merge multiple subbands into one grid in the frequency direction). This completes the final time-frequency grid construction.

Fourth step: perform CLPC processing on the QMF subbands (the prediction filter order is 3), and encode the CLPC filter parameters of the high-frequency subbands.

Fifth step: filter the QMF subbands with the CLPC analysis filter of the fourth step to obtain the subband residual signals. To simplify processing and reduce side information, work in units of several consecutive subbands (a subband block): analyze the correlation between the residual of each high-frequency subband block and the residuals of the low-frequency subband blocks, select the most correlated low-frequency subband block, and use the starting subband number and the block width of that low-frequency subband block as side information. The correspondence between all high-frequency subband residuals and low-frequency subband residuals is then determined in turn.

Sixth step: perform high-frequency envelope coding.

Seventh step: multiplex all information that needs to be transmitted to the decoder side to form the BWE bitstream.
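The block-based matching of the fifth step can be sketched in Python as follows. The text only says that the "most correlated" low-frequency subband block is chosen; summing per-subband normalized correlation magnitudes over the block, as done here, is an assumed scoring rule, and the names `normalized_correlation` and `select_source_block` are hypothetical.

```python
def normalized_correlation(a, b):
    # Magnitude of the normalized cross-correlation (assumed similarity measure).
    num = abs(sum(x * y.conjugate() for x, y in zip(a, b)))
    den = (sum(abs(x) ** 2 for x in a) * sum(abs(y) ** 2 for y in b)) ** 0.5
    return num / den if den > 0.0 else 0.0


def select_source_block(low_residuals, high_block):
    """Find the run of consecutive low-frequency subbands whose residuals best
    match a block of consecutive high-frequency subband residuals.

    Returns (start_subband, block_width), the two side-information values
    named in the fifth step.
    """
    width = len(high_block)

    def score(start):
        return sum(normalized_correlation(high_block[j], low_residuals[start + j])
                   for j in range(width))

    best_start = max(range(len(low_residuals) - width + 1), key=score)
    return best_start, width
```

Sending one (start, width) pair per block rather than one subband number per subband is what reduces the side information in this variant.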

Fig. 10 is a schematic diagram of the bandwidth extension decoding method of an embodiment of the invention applied to DRA decoding. As shown in Fig. 10, the basic process of the DRA+ decoding application example, which combines this BWE technique with DRA, is as follows: the DRA+ bitstream is unpacked; the low-frequency part is decoded by DRA to obtain the low-frequency PCM signal; this low-frequency PCM signal and the unpacked high-frequency BWE parameters are then decoded by the bandwidth extension decoding method of the present invention, which outputs full-band PCM audio data.

In the DRA+ decoding application example shown in Fig. 10, the specific steps of the bandwidth extension decoding method are as follows:

First step: demultiplex the bitstream to obtain the coding parameters, such as the multi-resolution filter selection parameter, the time-frequency grid parameters, the subband residual copy parameters, the prediction coefficients, and the high-frequency subband envelope parameters.

Second step: on the low-frequency part of the audio signal obtained by DRA decoding, perform QMF analysis at half the frequency resolution of the encoder side (that is, 16 bands or 64 bands) to obtain 16 or 64 QMF low-frequency subband filtered signals.

Third step: according to the correspondence information between high-frequency and low-frequency subband residuals, copy the low-frequency subband residuals to the high-frequency subband residuals, thereby recovering the high-frequency subband residual signals.

Fourth step: excite the CLPC filter with the high-frequency subband residual signals to obtain the synthesized high-frequency subband detail signal.

Fifth step: in units of the time-frequency grid, adjust the high-frequency subband detail signal with the high-frequency subband envelope signal obtained by decoding the high-frequency subband envelope parameters, and output the high-frequency subband signals.

Sixth step: pass the high-frequency subband signals and the low-frequency subband signals through the 32-band or 128-band QMF synthesis filter corresponding to the encoder side, and output the full-band mono PCM audio signal.

In accordance with the international test standard ITU-R BS.1534, repeated tests were carried out on three codec technologies: the existing DRA, DRA+SBR, and DRA+BWE according to the present invention. These comprised internal laboratory tests and formal external tests. The test results show:

At 48 kbps stereo, DRA+BWE according to the present invention is comparable to DRA+SBR, and both are significantly better than DRA;

At 128 kbps surround sound, DRA+BWE according to the present invention is comparable to DRA+SBR, and both are significantly better than DRA;

At 192 kbps surround sound, DRA+BWE according to the present invention is slightly better than DRA+SBR, and both are better than DRA.

The bandwidth extension encoding and decoding methods and devices of the present invention combine two key techniques, AFAG and CLPC high-frequency detail generation, and can significantly improve the coding efficiency of the high-frequency part of a digital audio signal and the sound quality of the high-frequency part of the signal. For the specific implementation of the AFAG and CLPC high-frequency detail generation techniques, reference may also be made to two patent applications filed on the same day by the applicant of the present application, entitled "Adaptive grid construction method and apparatus for bandwidth extension coding" and "Method and apparatus for high-frequency generation in bandwidth extension encoding and decoding".

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A bandwidth extension encoding method, characterized in that it comprises the following steps:

S22: quantizing and encoding the prediction coefficients;

2. The method according to claim 1, characterized in that step S11 further comprises:

3. The method according to claim 1, characterized in that the adaptive grid construction in the frequency direction in step S12 further comprises: selecting different grid constructions according to the frequency characteristics of the high-band part of the input mono audio signal, specifically:

4. The method according to claim 1, characterized in that determining, in turn, the correspondence between all high-frequency subband residual signals and the low-frequency subband residual signals and outputting the subband residual copy parameters in step S21 further comprises:

5. The method according to claim 1, characterized in that determining, in turn, the correspondence between all high-frequency subband residual signals and the low-frequency subband residual signals and outputting the subband residual copy parameters in step S21 further comprises:

6. The method according to claim 1, characterized in that step S21 further comprises:

7. A bandwidth extension decoding method, characterized in that it comprises the following steps:

S42: inverse-quantizing and decoding the prediction coefficients;

8. A bandwidth extension encoding device, characterized in that it comprises:

9. The device according to claim 8, characterized in that the adaptive grid construction in the frequency direction performed by the time-frequency grid construction submodule further comprises: selecting different grid constructions according to the frequency characteristics of the high-band part of the input mono audio signal, specifically:

10. A bandwidth extension decoding device, characterized in that it comprises:

an inverse quantization submodule for inverse-quantizing and decoding the prediction coefficients;