CN103366751B - Audio encoding and decoding apparatus and method therefor - Google Patents

Audio encoding and decoding apparatus and method therefor

Info

Publication number
CN103366751B
CN103366751B CN201210085257.2A CN201210085257A
Authority
CN
China
Prior art keywords
frequency spectrum
domain
mdft
mdct
low frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210085257.2A
Other languages
Chinese (zh)
Other versions
CN103366751A (en)
Inventor
潘兴德
李靓
吴超刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Original Assignee
BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd filed Critical BEIJING TIANLAI CHUANYIN DIGITAL TECHNOLOGY Co Ltd
Priority to CN201210085257.2A priority Critical patent/CN103366751B/en
Publication of CN103366751A publication Critical patent/CN103366751A/en
Application granted granted Critical
Publication of CN103366751B publication Critical patent/CN103366751B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The present invention relates to an audio encoding and decoding apparatus and method therefor. The invention maps a digital audio signal from the time domain to the MDCT domain, converts the low- and high-frequency spectra from the MDCT domain to the MDFT domain, performs waveform coding on the low-frequency spectrum in the MDCT domain, and performs parametric coding on the low- and high-frequency spectra in the MDFT domain. A specific band of the low-frequency spectrum is mapped to a specific band of the high-frequency spectrum; at the encoder, MDFT-domain boundary pre-processing is applied to the high-frequency spectra before and after the spectrum mapping, while at the decoder, MDFT-domain boundary pre-processing is applied to the high-frequency spectrum after the spectrum mapping and MDFT-domain boundary post-processing is applied to the parametrically decoded high-frequency spectrum. This mitigates the problems introduced by the band division in high-frequency parametric coding and by the low-to-high spectrum mapping, improves spectral continuity and the naturalness of the band signals, eliminates harmonic interference noise and the aliasing noise caused by side-lobe leakage, and further improves the quality of high-frequency parametric coding at lower bit rates.

Description

Audio encoding and decoding apparatus and method therefor
Technical field
The present invention relates to an audio encoding and decoding apparatus and method, and in particular to a monaural audio encoding and decoding apparatus and method and a stereo audio encoding and decoding apparatus and method.
Background Art
Patent ZL200610087481.X discloses an audio encoder and method, comprising:
a time-varying prediction analysis module for performing time-varying prediction analysis on a digital audio signal to obtain a time-domain excitation signal;
a time-frequency mapping module for mapping the time-domain excitation signal to a transform domain to obtain an excitation signal in the transform domain;
a coding module for quantizing and coding the low-frequency and mid-frequency spectra of the excitation signal in the transform domain to obtain low-frequency and mid-frequency waveform coded data, and for computing, from the low-frequency, mid-frequency, and high-frequency spectra of the excitation signal in the transform domain, high-frequency parameters used to recover the high-frequency spectrum from the low- and mid-frequency spectra, and quantizing and coding those parameters to obtain high-frequency parameter coded data; and
a bitstream multiplexing module for multiplexing the low-frequency waveform coded data, the mid-frequency waveform coded data, and the high-frequency parameter coded data to output an encoded audio bitstream.
This audio encoder and method introduce a new coding framework that fully combines the strengths of waveform coding and parametric coding, so that both speech and music can be coded with high quality under tight bit-rate and computational-complexity constraints.
Given the method proposed in ZL200610087481.X, the open problem facing this technical direction is how to further improve the coding quality of music signals at lower bit rates while reducing or maintaining computational complexity.
Because the technique of recovering the high-frequency spectrum from the low- and mid-frequency spectra changes the physical band relationships and energy levels of the original signal, it introduces a series of problems that degrade the quality of high-frequency parametric coding. For example: the band division in high-frequency parametric coding breaks the correlation between the spectral lines of the original signal; in particular, when the frequency resolution of the mapped-domain signal is very high, the transition bands between adjacent bands become very narrow, destroying spectral continuity and the naturalness of the band signals. The low-to-high spectrum mapping may also superimpose two harmonic signals at the splice point, producing harmonic interference noise. Moreover, at the splice points between bands after the low-to-high spectrum mapping, imperfect prototype-filter performance causes side-lobe leakage, which introduces aliasing noise.
To address these problems of high-frequency parametric coding, the audio encoding and decoding apparatus and method disclosed by the present invention provide an effective solution and further improve the coding quality of music signals at lower bit rates.
Summary of the invention
Other features and benefits of exemplary embodiments of the present invention will become apparent from the detailed description set forth below, the accompanying drawings, and the claims.
According to a first aspect of the invention, a monaural audio encoding apparatus is provided, comprising: a Modified Discrete Cosine Transform (MDCT) module for mapping a digital audio signal from the time domain to the MDCT domain to obtain an audio signal in the MDCT domain, and for dividing the MDCT-domain audio signal into a low-frequency spectrum and a high-frequency spectrum; a low-frequency waveform coding module for quantizing and coding the low-frequency spectrum of the MDCT-domain audio signal to obtain low-frequency waveform coded data; an MDCT-to-Modified Discrete Fourier Transform (MDFT) conversion module for converting the low- and high-frequency spectra of the MDCT-domain audio signal into the low- and high-frequency spectra of the audio signal in the MDFT domain; a low-to-high spectrum mapping module for mapping a specific band of the low-frequency spectrum of the MDFT-domain audio signal to a specific band of the high-frequency spectrum, yielding the mapped high-frequency spectrum; an MDFT-domain boundary pre-processing module for performing boundary pre-processing on the mapped high-frequency spectrum in the MDFT domain and on the high-frequency spectrum before the mapping, where the high-frequency spectrum before the mapping is the MDFT-domain high-frequency spectrum produced by the MDCT-to-MDFT conversion module; a high-frequency parameter coding module for computing, from the boundary pre-processed high-frequency spectra before and after the mapping, high-frequency parameters used by the decoder to recover the high-frequency spectrum from the low-frequency spectrum, and for quantizing and coding these parameters to obtain high-frequency parameter coded data; and a bitstream multiplexing module for multiplexing the low-frequency waveform coded data and the high-frequency parameter coded data to output an encoded audio bitstream.
According to a second aspect of the invention, a monaural audio encoding method is provided, comprising: mapping a digital audio signal from the time domain to the Modified Discrete Cosine Transform (MDCT) domain to obtain an audio signal in the MDCT domain, and dividing the MDCT-domain audio signal into a low-frequency spectrum and a high-frequency spectrum; quantizing and coding the low-frequency spectrum of the MDCT-domain audio signal to obtain low-frequency waveform coded data; converting the low- and high-frequency spectra of the MDCT-domain audio signal into the low- and high-frequency spectra of the audio signal in the Modified Discrete Fourier Transform (MDFT) domain; mapping a specific band of the low-frequency spectrum of the MDFT-domain audio signal to a specific band of the high-frequency spectrum to obtain the mapped high-frequency spectrum; performing boundary pre-processing on the mapped high-frequency spectrum in the MDFT domain and on the high-frequency spectrum before the mapping, where the high-frequency spectrum before the mapping is the MDFT-domain high-frequency spectrum produced by the MDCT-to-MDFT conversion; computing, from the boundary pre-processed high-frequency spectra before and after the mapping, high-frequency parameters used by the decoder to recover the high-frequency spectrum from the low-frequency spectrum, and quantizing and coding these parameters to obtain high-frequency parameter coded data; and multiplexing the low-frequency waveform coded data and the high-frequency parameter coded data to output an encoded audio bitstream.
According to a third aspect of the invention, a monaural audio decoding apparatus is provided, comprising: a bitstream demultiplexing module for demultiplexing an encoded audio bitstream to obtain low-frequency waveform coded data and high-frequency parameter coded data; a low-frequency waveform decoding module for decoding the low-frequency waveform coded data to obtain decoded low-frequency spectrum data of the audio signal in the Modified Discrete Cosine Transform (MDCT) domain; an MDCT-to-Modified Discrete Fourier Transform (MDFT) conversion module for converting the decoded low-frequency spectrum data of the audio signal from the MDCT domain to the MDFT domain; a low-to-high spectrum mapping module for mapping part of the decoded low-frequency spectrum data in the MDFT domain to the high-frequency portion, yielding the mapped high-frequency spectrum; an MDFT-domain boundary pre-processing module for performing boundary pre-processing on the mapped high-frequency spectrum; a high-frequency parameter decoding module for performing parametric decoding, according to the high-frequency parameter coded data, on the boundary pre-processed mapped high-frequency spectrum to obtain decoded high-frequency spectrum data; an MDFT-domain boundary post-processing module for performing boundary post-processing on the decoded high-frequency spectrum data; and an inverse Modified Discrete Fourier Transform (IMDFT) module for combining the decoded low-frequency spectrum data with the boundary post-processed decoded high-frequency spectrum data and applying the IMDFT to obtain decoded audio data in the time domain.
According to a fourth aspect of the invention, a monaural audio decoding method is provided, comprising: demultiplexing an encoded audio bitstream to obtain low-frequency waveform coded data and high-frequency parameter coded data; decoding the low-frequency waveform coded data to obtain decoded low-frequency spectrum data of the audio signal in the Modified Discrete Cosine Transform (MDCT) domain; converting the decoded low-frequency spectrum data from the MDCT domain to the Modified Discrete Fourier Transform (MDFT) domain to obtain decoded low-frequency spectrum data in the MDFT domain; mapping part of the decoded low-frequency spectrum data in the MDFT domain to the high-frequency portion to obtain the mapped high-frequency spectrum; performing boundary pre-processing on the mapped high-frequency spectrum; performing parametric decoding, according to the high-frequency parameter coded data, on the boundary pre-processed mapped high-frequency spectrum to obtain decoded high-frequency spectrum data in the MDFT domain; performing boundary post-processing on the decoded high-frequency spectrum data; and combining the decoded low-frequency spectrum data with the boundary post-processed decoded high-frequency spectrum data and applying the inverse Modified Discrete Fourier Transform (IMDFT) to obtain the decoded digital audio signal in the time domain.
According to a fifth aspect of the invention, a stereo encoding apparatus is provided, comprising: a Modified Discrete Cosine Transform (MDCT) module for mapping the digital audio signals of the left and right channels from the time domain to the MDCT domain to obtain the MDCT-domain digital audio signals of the two channels, and for dividing the MDCT-domain audio signals of the left and right channels into low-frequency and high-frequency spectra; a low-frequency stereo coding module for stereo-coding the low-frequency spectra in the MDCT domain of the left and right channels to obtain low-frequency stereo coded data; an MDCT-to-Modified Discrete Fourier Transform (MDFT) conversion module for converting the low- and high-frequency spectra in the MDCT domain of the left and right channels into low- and high-frequency spectra in the MDFT domain; a low-to-high spectrum mapping module for mapping a specific band of the low-frequency spectrum of the MDFT-domain audio signal of each channel to a specific band of its high-frequency spectrum, yielding the mapped high-frequency spectra of the left and right channels; an MDFT-domain boundary pre-processing module for performing boundary pre-processing on the mapped high-frequency spectra in the MDFT domain of the left and right channels and on the high-frequency spectra before the mapping, where the high-frequency spectra before the mapping are the MDFT-domain high-frequency spectra produced by the MDCT-to-MDFT conversion module; a high-frequency parameter coding module for computing, for each channel, from the boundary pre-processed high-frequency spectra before and after the mapping, high-frequency parameters used by the decoder to recover the high-frequency spectrum from the low-frequency spectrum of that channel, and for quantizing and coding these parameters to obtain the high-frequency parameter coded data of the left and right channels; and a bitstream multiplexing module for multiplexing the low-frequency stereo coded data and the high-frequency parameter coded data of the left and right channels to output an encoded audio bitstream.
According to a sixth aspect of the invention, a stereo encoding method is provided, comprising: mapping the digital audio signals of the left and right channels from the time domain to the Modified Discrete Cosine Transform (MDCT) domain to obtain the MDCT-domain digital audio signals of the two channels, and dividing the MDCT-domain audio signals of the left and right channels into low-frequency and high-frequency spectra; stereo-coding the low-frequency spectra in the MDCT domain of the left and right channels to obtain low-frequency stereo coded data; converting the low- and high-frequency spectra in the MDCT domain of the left and right channels into low- and high-frequency spectra in the Modified Discrete Fourier Transform (MDFT) domain; mapping a specific band of the low-frequency spectrum of the MDFT-domain audio signal of each channel to a specific band of its high-frequency spectrum to obtain the mapped high-frequency spectra of the left and right channels; performing boundary pre-processing on the mapped high-frequency spectra in the MDFT domain of the left and right channels and on the high-frequency spectra before the mapping, where the high-frequency spectra before the mapping are the MDFT-domain high-frequency spectra produced by the MDCT-to-MDFT conversion; computing, for each channel, from the boundary pre-processed high-frequency spectra before and after the mapping, high-frequency parameters used by the decoder to recover the high-frequency spectrum from the low-frequency spectrum of that channel, and quantizing and coding these parameters to obtain the high-frequency parameter coded data of the left and right channels; and multiplexing the low-frequency stereo coded data and the high-frequency parameter coded data of the left and right channels to output an encoded audio bitstream.
According to a seventh aspect of the invention, a stereo decoding apparatus is provided, comprising: a bitstream demultiplexing module for demultiplexing an encoded audio bitstream to obtain low-frequency stereo coded data and the high-frequency parameter coded data of the left and right channels; a low-frequency stereo decoding module for stereo-decoding the low-frequency stereo coded data to obtain the decoded low-frequency spectrum data of the audio signals in the Modified Discrete Cosine Transform (MDCT) domain of the left and right channels; an MDCT-to-Modified Discrete Fourier Transform (MDFT) conversion module for converting the decoded low-frequency spectrum data of the left and right channels from the MDCT domain to the MDFT domain, obtaining the decoded low-frequency spectrum data in the MDFT domain of the two channels; a low-to-high spectrum mapping module for mapping part of the decoded low-frequency spectrum data in the MDFT domain of each channel to the high-frequency portion, yielding the mapped high-frequency spectra of the left and right channels; an MDFT-domain boundary pre-processing module for performing boundary pre-processing on the mapped high-frequency spectra of the left and right channels; a high-frequency parameter decoding module for performing parametric decoding, according to the high-frequency parameter coded data of the left and right channels, on the boundary pre-processed mapped high-frequency spectra to obtain the decoded high-frequency spectrum data of the two channels; an MDFT-domain boundary post-processing module for performing boundary post-processing on the decoded high-frequency spectrum data of the left and right channels; and an inverse Modified Discrete Fourier Transform (IMDFT) module for combining the MDFT-domain decoded low-frequency spectrum data of the left and right channels with the boundary post-processed MDFT-domain decoded high-frequency spectrum data and applying the IMDFT to obtain the decoded stereo data in the time domain.
According to an eighth aspect of the invention, a stereo decoding method is provided, comprising: demultiplexing an encoded audio bitstream to obtain low-frequency stereo coded data and the high-frequency parameter coded data of the left and right channels; stereo-decoding the low-frequency stereo coded data to obtain the decoded low-frequency spectrum data of the audio signals in the Modified Discrete Cosine Transform (MDCT) domain of the left and right channels; converting the decoded low-frequency spectrum data of the left and right channels from the MDCT domain to the Modified Discrete Fourier Transform (MDFT) domain, obtaining the decoded low-frequency spectrum data in the MDFT domain of the two channels; mapping part of the decoded low-frequency spectrum data in the MDFT domain of each channel to the high-frequency portion to obtain the mapped high-frequency spectra of the left and right channels; performing boundary pre-processing on the mapped high-frequency spectra of the left and right channels; performing parametric decoding, according to the high-frequency parameter coded data of the left and right channels, on the boundary pre-processed mapped high-frequency spectra to obtain the decoded high-frequency spectrum data of the two channels; performing boundary post-processing on the decoded high-frequency spectrum data of the left and right channels; and combining the MDFT-domain decoded low-frequency spectrum data of the left and right channels with the boundary post-processed MDFT-domain decoded high-frequency spectrum data and applying the IMDFT to obtain the decoded stereo data in the time domain.
By mapping the digital audio signal from the time domain to the MDCT domain, converting the low- and high-frequency spectra in the MDCT domain to the MDFT domain, combining waveform coding of the low-frequency spectrum in the MDCT domain with parametric coding of the low- and high-frequency spectra in the MDFT domain, mapping a specific band of the low-frequency spectrum to a specific band of the high-frequency spectrum, applying MDFT-domain boundary pre-processing at the encoder to the high-frequency spectra before and after the spectrum mapping, applying MDFT-domain boundary pre-processing at the decoder to the high-frequency spectrum after the spectrum mapping, and applying MDFT-domain boundary post-processing to the parametrically decoded high-frequency spectrum, the present invention mitigates the problems caused by the band division in high-frequency parametric coding and by the low-to-high spectrum mapping, improves spectral continuity and the naturalness of the band signals, eliminates harmonic interference noise and the aliasing noise caused by side-lobe leakage, and further improves the quality of high-frequency parametric coding at lower bit rates.
Brief Description of the Drawings
Specific embodiments of the invention are described in detail below with reference to the accompanying drawings, in which:
Fig. 1 is a block diagram of a monaural audio encoding apparatus according to a preferred embodiment of the invention.
Fig. 2 is a block diagram of the resampling module shown in Fig. 1.
Fig. 3 is a block diagram of the low-frequency waveform coding module shown in Fig. 1.
Fig. 4 is a block diagram of the high-frequency parameter coding module shown in Fig. 1.
Fig. 5 illustrates the spectrum mapping performed by the low-to-high spectrum mapping module, where panel a) shows the original signal spectrum and panel b) shows the signal spectrum after mapping.
Fig. 6 shows the time-frequency plane after time-frequency mapping, where panel a) is the time-frequency plane of a slowly-varying signal and panel b) is the time-frequency plane of a transient signal.
Fig. 7 illustrates the selection of the processing range in the Modified Discrete Fourier Transform (MDFT) domain boundary pre-processing method, where a) shows the processing-range signal in the frequency-domain windowing method and b) shows the processing-range signal in the MDFT-domain combination method.
Fig. 8 illustrates the gain calculation of the high-frequency parameter coding module, where panel a) shows the transient position and modes and panel b) shows the region division and modes.
Fig. 9 is a block diagram of a monaural audio decoding apparatus according to a preferred embodiment of the invention.
Figure 10 is a block diagram of the low-frequency waveform decoding module shown in Fig. 9.
Figure 11 is a block diagram of the high-frequency parameter decoding module shown in Fig. 9.
Figure 12 is a block diagram of a stereo encoding apparatus according to a preferred embodiment of the invention;
Figure 13 is a model diagram of the sum-difference stereo coding mode of the invention;
Figure 14 is a model diagram of the parametric stereo coding mode of the invention;
Figure 15 is a model diagram of the parametric-error stereo coding mode of the invention; and
Figure 16 is a block diagram of a stereo decoding apparatus according to a preferred embodiment of the invention.
Embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in more detail below through embodiments and with reference to the accompanying drawings.
Fig. 1 is a block diagram of the monaural audio encoding apparatus according to preferred embodiment 1 of the present invention.
As shown in Fig. 1, the monaural audio encoding apparatus of preferred embodiment 1 comprises: a resampling module 101, a signal type decision module 102, a Modified Discrete Cosine Transform (MDCT) module 103, a low-frequency waveform coding module 104, an MDCT-to-Modified Discrete Fourier Transform (MDFT) conversion module 105, a low-to-high spectrum mapping module 106, an MDFT-domain boundary pre-processing module 107, a high-frequency parameter coding module 108, and a bitstream multiplexing module 109.
First, the connections and functions of the modules in Fig. 1 are summarized below:
The resampling module 101 converts the input digital audio signal from its original sampling rate to the target sampling rate and outputs the resampled signal, frame by frame, to the signal type decision module 102 and the MDCT module 103. Note that if the input digital audio signal already has the target sampling rate, an encoding apparatus in accordance with the principles of the invention may omit this module and feed the digital audio signal directly to the signal type decision module 102 and the MDCT module 103.
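The rate conversion performed by the resampling module can be illustrated with a minimal sketch. This is a naive linear-interpolation resampler, not the polyphase anti-aliasing filter bank a production coder would use, and the 48 kHz/32 kHz rates in the example are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def resample_linear(x, src_rate, dst_rate):
    """Naive resampler: linear interpolation onto the target time grid.

    A real encoder would use a polyphase low-pass filter bank to suppress
    aliasing when downsampling; this sketch only shows the rate-conversion
    bookkeeping between the original and target sampling rates.
    """
    n_out = int(round(len(x) * dst_rate / src_rate))
    # positions of the output samples expressed on the input time grid
    t_out = np.arange(n_out) * (src_rate / dst_rate)
    return np.interp(t_out, np.arange(len(x)), x)

# e.g. converting a 10 ms frame from 48 kHz to a 32 kHz target rate
frame_48k = np.sin(2 * np.pi * 440 * np.arange(480) / 48000)
frame_32k = resample_linear(frame_48k, 48000, 32000)
```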
The signal type decision module 102 performs frame-by-frame signal type analysis on the resampled audio signal and outputs the analysis result. Because of the complexity of the signal itself, the signal type is expressed in several ways: for example, if the frame is a slowly-varying signal, a flag indicating a slowly-varying frame is output directly; if it is a transient signal, the position of the transient must also be computed, and a flag indicating a transient frame is output together with the transient position. The analysis result is passed to the MDCT module 103 to control the order of the MDCT, and is also output to the bitstream multiplexing module 109. Note that if the signal type is instead determined by a closed-loop search, an audio encoder according to the invention need not include this module.
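As a toy illustration of the type decision, the sketch below flags a frame as transient when one sub-frame's energy jumps far above the average energy of the preceding sub-frames, and returns the sub-frame index as the transient position. The sub-frame count and threshold are invented for illustration; the patent does not specify this particular detector.

```python
import numpy as np

def classify_frame(frame, n_sub=8, ratio_thresh=8.0):
    """Toy energy-ratio transient detector (parameters are assumptions).

    Returns ('transient', sub_frame_index) when a sub-frame's energy
    exceeds `ratio_thresh` times the mean of the preceding sub-frames,
    otherwise ('stationary', None).
    """
    subs = frame.reshape(n_sub, -1)
    e = np.sum(subs ** 2, axis=1) + 1e-12   # per-sub-frame energies
    for i in range(1, n_sub):
        if e[i] / np.mean(e[:i]) > ratio_thresh:
            return 'transient', i
    return 'stationary', None
```

A steady tone yields roughly equal sub-frame energies and is classified as stationary; an attack confined to one sub-frame trips the ratio test and reports its position.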
Based on the signal type analysis result output by the signal type decision module 102, the MDCT module 103 applies an MDCT of the appropriate order to map the resampled audio signal into the MDCT domain, and outputs the MDCT-domain coefficients of the audio signal to the low-frequency waveform coding module 104 and the MDCT-to-MDFT conversion module 105. Specifically, for a slowly-varying frame, the MDCT is applied over the whole frame using a longer transform order; for a transient frame, the frame is divided into sub-frames and a shorter-order MDCT is applied per sub-frame. The MDCT-domain coefficients are divided into a low-frequency spectrum and a high-frequency spectrum; the low-frequency spectrum is output to the low-frequency waveform coding module 104, while the low-frequency spectrum, the high-frequency spectrum, and the signal type analysis result are output to the MDCT-to-MDFT conversion module 105.
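The long/short transform selection can be sketched as follows. The MDCT maps a windowed 2N-sample block to N coefficients; the direct O(N²) kernel below is for clarity only (fast implementations use an FFT), and the long/short orders, the sine window, and the non-overlapping sub-frame split are illustrative assumptions rather than values from the patent.

```python
import numpy as np

def sine_window(two_n):
    # Sine window; satisfies the Princen-Bradley condition needed for TDAC
    return np.sin(np.pi / two_n * (np.arange(two_n) + 0.5))

def mdct(frame):
    """Direct MDCT: 2N windowed time samples -> N spectral coefficients."""
    two_n = len(frame)
    n_half = two_n // 2
    x = frame * sine_window(two_n)
    n = np.arange(two_n)
    k = np.arange(n_half)
    basis = np.cos(np.pi / n_half
                   * (n[None, :] + 0.5 + n_half / 2) * (k[:, None] + 0.5))
    return basis @ x

N_LONG, N_SHORT = 1024, 128   # illustrative long/short orders

def transform_frame(frame, is_transient):
    if not is_transient:
        return [mdct(frame)]                  # one long transform per frame
    subs = frame.reshape(-1, 2 * N_SHORT)     # short transform per sub-frame
    return [mdct(s) for s in subs]
```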
The low-frequency waveform coding module 104 receives the low-frequency portion of the MDCT-domain coefficients from the MDCT module 103, removes redundancy from it, quantizes and codes the redundancy-reduced low-frequency spectrum to obtain low-frequency coded data, and outputs the result to the bitstream multiplexing module 109. Note that if the temporal redundancy of the low-frequency component already satisfies the coding requirements, the low-frequency waveform coding module 104 may skip the redundancy removal.
MDCT-to-MDFT conversion module 105 receives the MDCT-domain coefficients of the sound signal from MDCT transform module 103, converts them into MDFT-domain coefficients that carry phase information, and outputs the MDFT-domain coefficients to low-to-high spectrum mapping module 106 and MDFT-domain boundary preprocessing module 107.
Low-to-high spectrum mapping module 106 receives the low-frequency spectrum in the MDFT domain from MDCT-to-MDFT conversion module 105, maps specific frequency bands of the low-frequency spectrum onto specific frequency bands of the high-frequency spectrum to obtain the mapped high-frequency spectrum, and feeds the mapped high-frequency spectrum into MDFT-domain boundary preprocessing module 107. The time-frequency plane after mapping is identical to the original time-frequency plane, as shown in Figure 5.
MDFT-domain boundary preprocessing module 107 receives the original high-frequency spectrum in the MDFT domain from MDCT-to-MDFT conversion module 105 and the mapped high-frequency spectrum in the MDFT domain from low-to-high spectrum mapping module 106, applies MDFT-domain boundary preprocessing to both, and outputs the preprocessed original high-frequency spectrum and the preprocessed mapped high-frequency spectrum to high-frequency parameter coding module 108.
High-frequency parameter coding module 108 receives the preprocessed original and mapped high-frequency spectra in the MDFT domain from MDFT-domain boundary preprocessing module 107, extracts from them the required high-frequency parameters, such as gain parameters and tonality parameters, quantizes and encodes these parameters, and outputs them to bitstream multiplexing module 109.
Bitstream multiplexing module 109 multiplexes the coded data and side information output from signal type judgment module 102, low-frequency waveform coding module 104, and high-frequency parameter coding module 108 to form the sound coding bitstream.
Resampling module 101, low-frequency waveform coding module 104, MDCT-to-MDFT conversion module 105, MDFT-domain boundary preprocessing module 107, and high-frequency parameter coding module 108 of the monophonic sound encoding device above are explained in detail below.
Resampling module 101 resamples the input sound signal. Figure 2 is a block diagram of the resampling module shown in Figure 1; it comprises up-sampler 201, low-pass filter 202, and down-sampler 203. Up-sampler 201 up-samples the signal x(n), whose sampling frequency is Fs, by a factor of L to obtain the signal w(n) with sampling frequency L*Fs, and low-pass filter 202 filters w(n) to produce the filtered signal v(n). The role of low-pass filter 202 is to remove the images produced by up-sampler 201 and to avoid the aliasing that down-sampler 203 could otherwise introduce. Down-sampler 203 down-samples v(n) by a factor of M to obtain the signal y(n) with sampling frequency (L/M)*Fs. The resampled signal is output frame by frame to signal type judgment module 102 and MDCT transform module 103.
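The upsample-filter-decimate chain of modules 201-203 can be sketched as follows. This is a hedged illustration: the windowed-sinc FIR, its tap count, and the Hamming window are stand-ins, since the text does not specify the filter design.

```python
import numpy as np
from math import gcd

def resample(x, fs_in, fs_out, taps=127):
    """Rational L/M resampling: zero-stuff by L, low-pass filter, keep every
    M-th sample. The windowed-sinc FIR is an illustrative stand-in for
    low-pass filter 202."""
    g = gcd(fs_in, fs_out)
    L, M = fs_out // g, fs_in // g
    up = np.zeros(len(x) * L)
    up[::L] = x                                # up-sampler 201: insert L-1 zeros
    fc = 0.5 / max(L, M)                       # cutoff, normalized to L*fs_in
    t = np.arange(taps) - (taps - 1) / 2
    h = 2 * fc * np.sinc(2 * fc * t) * np.hamming(taps)
    v = L * np.convolve(up, h, mode="same")    # filter; gain L restores level
    return v[::M]                              # down-sampler 203

fs_in, fs_out = 48000, 32000                   # L/M = 2/3
x = np.sin(2 * np.pi * 440 * np.arange(fs_in) / fs_in)
y = resample(x, fs_in, fs_out)
assert len(y) == fs_out
assert 0.5 < np.max(np.abs(y)) < 1.5           # tone level roughly preserved
```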
Figure 3 is a block diagram of low-frequency waveform coding module 104 shown in Figure 1; the module comprises redundancy removal module 301 and quantization coding module 302. The low-frequency component output by MDCT transform module 103 is the more stable part of the signal, but its temporal or frequency-domain correlation (i.e., redundancy) may remain strong. Because of the complexity of the signal itself, an MDCT of fixed order cannot achieve optimal decorrelation in all cases. For example, when signal type judgment module 102 classifies the current frame as transient, a shorter-order MDCT is used, and the temporal and frequency-domain correlation (i.e., redundancy) of the low-frequency part in the MDCT domain may still be strong; when the frame is classified as steady-state, a longer-order MDCT is used, and the frequency-domain correlation (i.e., redundancy) of the low-frequency part may still be strong. The redundancy removal module 301 included in the sound encoder of the present invention is therefore optional; it can further remove the temporal or frequency-domain redundancy of the low-frequency component obtained by the MDCT.
Low-frequency redundancy processing can adopt many methods. For example, a shorter-order transform or a higher-order predictor can remove the temporal correlation of the MDCT-domain low-frequency part between two sub-frames or between two consecutive frames, e.g., the discrete cosine transform (DCT), discrete Fourier transform (DFT), modified discrete cosine transform (MDCT), or long-term prediction (LTP); a lower-order predictor, e.g., a linear predictor (LPC), can remove its frequency-domain correlation. In the sound encoder of the present invention, redundancy removal module 301 therefore evaluates several redundancy removal methods by computing the redundancy actually removed, i.e., the actual coding gain, then decides whether to apply low-frequency redundancy processing and which method to apply, and finally outputs the flag indicating whether redundancy removal module 301 was used, and which method, to bitstream multiplexing module 109 as side information.
Quantization coding module 302 quantizes and encodes the low-frequency data to obtain the coded low-frequency data. A scalar quantization plus Huffman coding scheme similar to that in MPEG AAC may be used, or a vector quantization scheme; in constant-bit-rate coding, a vector quantizer is a reasonable choice. The coded low-frequency data and the side information of the low-frequency redundancy processing selection are output to bitstream multiplexing module 109.
MDCT-to-MDFT conversion module 105 converts the MDCT-domain coefficients into MDFT-domain coefficients and outputs them to low-to-high spectrum mapping module 106 and MDFT-domain boundary preprocessing module 107. Several concrete conversion methods exist. One applies a modified discrete sine transform (MDST) to the sound signal to obtain MDST-domain coefficients, then combines them with the MDCT-domain coefficients to obtain the MDFT-domain coefficients; note that with this method, module 105 must also receive the resampled sound signal. Another reconstructs the time-domain signal from the MDCT-domain coefficients, applies an MDST to obtain MDST-domain coefficients, and combines them with the MDCT-domain coefficients to obtain the MDFT-domain coefficients. A third reconstructs the time-domain signal from the MDCT-domain coefficients and applies an MDFT directly to obtain the MDFT-domain coefficients. A fourth establishes the relation between the MDCT-domain coefficients of the current, previous, and next frames and the current frame's MDST-domain coefficients, determining three transition matrices that compute the current frame's MDST-domain coefficients from those three frames' MDCT-domain coefficients; the MDST-domain coefficients are thus obtained directly from the MDCT-domain coefficients and are finally combined with them into the MDFT coefficients.
MDFT-domain boundary preprocessing module 107 applies boundary preprocessing to the original MDFT-domain high-frequency spectrum and to the high-frequency spectrum mapped up from the low-frequency spectrum, and outputs both preprocessed spectra to high-frequency parameter coding module 108. Because the technique of recovering the high-frequency spectrum from the low-frequency spectrum changes the physical relations between the frequency bands of the original signal and their energy levels, it brings a series of problems that degrade the quality of high-frequency parameter coding. For example: the band division in high-frequency parameter coding truncates the association between the spectral lines of the original signal, and especially when the frequency resolution of the mapped-domain signal is very high, the transition band between adjacent bands becomes very narrow, destroying the continuity of the spectrum and the naturalness of the band signal; the low-to-high spectrum mapping may also superpose two harmonic signals at a splice point, producing harmonic interference noise; and at the splice points between bands after the mapping, side-lobe leakage caused by non-ideal prototype filter performance introduces aliasing noise.
MDFT-domain boundary preprocessing can adopt many methods. For example: for the frequency-domain truncation caused by band division, windowing in the frequency domain; for the harmonic interference noise introduced at the splice points by spectrum mapping, harmonic interference cancellation; for the side-lobe leakage and aliasing noise caused by non-ideal prototype filter performance, joint processing in the MDFT domain.
Figure 4 is a block diagram of high-frequency parameter coding module 108 shown in Figure 1; module 108 comprises tonality parameter extractor 401 and gain parameter extractor 402.
Tonality parameter extractor 401 receives from MDFT-domain boundary preprocessing module 107 the preprocessed original high-frequency spectrum in the MDFT domain and the preprocessed high-frequency spectrum mapped up from the low-frequency spectrum, and divides both into multiple sub-bands. It then computes the tonality of each original high-frequency band and of the corresponding band of the mapped high-frequency spectrum, derives the tonality parameters the decoding device will need to adjust the tonality of the mapped high-frequency spectrum, quantizes and encodes these parameters, and outputs them to bitstream multiplexing module 109. The tonality parameters may comprise an adjustment type and an adjustment amount.
Gain parameter extractor 402 likewise receives the preprocessed original and mapped high-frequency spectra in the MDFT domain from MDFT-domain boundary preprocessing module 107. According to the signal type and the position where the transient occurs, it divides the mapped high-frequency time-frequency plane and the original high-frequency time-frequency plane into multiple regions, computes for each region the ratio of the energy in the original time-frequency plane to the energy in the corresponding region of the mapped plane as the gain parameter, quantizes and encodes the gain parameters, and outputs them to bitstream multiplexing module 109.
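The gain-parameter computation of extractor 402 can be sketched as a per-region energy ratio. In this sketch the equal-width band split and the toy spectra are illustrative simplifications of the signal-type-dependent region division described above.

```python
import numpy as np

def gain_parameters(orig_hf, mapped_hf, n_bands=8):
    """Energy ratio of original HF regions to the corresponding mapped
    regions. Equal-width bands stand in for the signal-type-dependent
    region division of the text."""
    e_orig = np.sum(np.abs(orig_hf.reshape(n_bands, -1)) ** 2, axis=1)
    e_map = np.sum(np.abs(mapped_hf.reshape(n_bands, -1)) ** 2, axis=1)
    return e_orig / np.maximum(e_map, 1e-12)

rng = np.random.default_rng(2)
mapped = rng.standard_normal(256) + 1j * rng.standard_normal(256)  # MDFT-like
orig = 2.0 * mapped                         # original HF is 6 dB stronger
g = gain_parameters(orig, mapped)
assert np.allclose(g, 4.0)                  # amplitude x2 -> energy x4
```

At the decoder, multiplying each mapped region by the square root of its gain parameter would restore the original region energy.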
The monophonic sound encoding method according to the preferred embodiment of the invention is described in detail below; the method comprises the following steps:
Step 11: resample the input signal;
Step 12: perform signal type judgment on the resampled sound signal; if the frame is a steady-state signal, directly output the signal type; if it is a transient signal, additionally compute the position where the transient occurs, finally outputting both the signal type and the transient position;
Step 13: according to the signal type analysis result, apply an MDCT of the corresponding order to the resampled sound signal to obtain its MDCT-domain coefficients;
Step 14: divide the MDCT-domain coefficients into a low-frequency spectrum and a high-frequency spectrum;
Step 15: apply low-frequency waveform coding to the low-frequency spectrum to obtain the low-frequency waveform coded data;
Step 16: convert the MDCT-domain coefficients into MDFT-domain coefficients, obtaining the low-frequency and high-frequency spectra in the MDFT domain;
Step 17: map specific frequency bands of the low-frequency spectrum onto specific frequency bands of the high-frequency spectrum, forming the mapped high-frequency spectrum;
Step 18: apply MDFT-domain boundary preprocessing to the original high-frequency spectrum in the MDFT domain and to the mapped high-frequency spectrum, obtaining the boundary-preprocessed original and mapped high-frequency spectra;
Step 19: from the boundary-preprocessed spectra in the MDFT domain, extract the high-frequency parameters needed to recover the original high-frequency spectrum from the mapped high-frequency spectrum, and quantize and encode them to obtain the high-frequency parameter coded data;
Step 20: multiplex the coded data and the side information to obtain the sound coding bitstream.
In step 11, resampling proceeds as follows. First, from the sampling rate Fs of the input signal and the resampling target rate Fmax, compute the resampling ratio Fmax/Fs = L/M, where the target rate Fmax is the highest analysis frequency of the decoded signal and is generally determined by the coding bit rate. Then up-sample the input sound signal x(n) by the factor L; pass the up-sampled signal through a low-pass filter of length N (when N = ∞ the filter is an IIR filter) with cutoff frequency Fmax to obtain v(n); down-sampling v(n) by the factor M then yields y(n) = v(Mn). The sampling rate of the resampled sound signal y(n) is thus L/M times that of the original input x(n). Note that if the input digital sound signal already has the target sampling rate, step 11 need not be performed.
In step 12, signal type judgment is performed on the resampled digital sound signal. If the frame is steady-state, the signal type is output directly; if it is transient, the position where the transient occurs is also computed, and finally the signal type and the transient position are output.
Signal type judgment can adopt many methods, for example judging the type from the perceptual entropy of the signal, or from the energies of the signal's sub-frames. Preferably, judgment by sub-frame energy is used; its detailed procedure is as follows:
Step 12-1: high-pass filter one frame of the digital sound signal y(n) to remove the low-frequency part, e.g., frequencies below 500 Hz;
Step 12-2: divide the high-pass filtered signal into several sub-frames yi(n); for computational convenience a frame is usually divided into an integer number of sub-frames, e.g., sub-frames of 256 samples when the frame length is 2048;
Step 12-3: compute the energy Ei of each sub-frame yi(n), where i is the sub-frame index, and form the energy ratio of the current sub-frame to the previous sub-frame. If any ratio exceeds a threshold Te, the frame is judged transient; if the ratios of all sub-frames, including against the previous frame, stay below Te, the frame is judged steady-state. For a transient frame, continue with step 12-4; otherwise skip step 12-4 and take the steady-state type as the low-frequency sub-band-domain signal type analysis result. The threshold Te can be obtained by well-known signal processing techniques, e.g., by statistically measuring the average energy ratio of coded signals and multiplying it by a suitable constant;
Step 12-4: for a transient frame, take the sub-frame with the maximum energy as the position where the transient occurs. The transient signal type and the transient position form the low-frequency sub-band-domain signal type analysis result.
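Steps 12-2 through 12-4 can be sketched in a few lines. The sub-frame length and threshold below are illustrative values, not taken from the text, and the step 12-1 high-pass prefilter is omitted for brevity.

```python
import numpy as np

def detect_transient(frame, sub_len=256, thresh=8.0):
    """Classify one frame from sub-frame energies (illustrative thresholds)."""
    e = np.sum(frame.reshape(-1, sub_len) ** 2, axis=1) + 1e-12
    if np.any(e[1:] / e[:-1] > thresh):        # step 12-3: energy-ratio test
        return "transient", int(np.argmax(e))  # step 12-4: max-energy sub-frame
    return "steady", None

frame = np.zeros(2048)
frame[1200] = 1.0                              # an isolated click
kind, pos = detect_transient(frame)
assert kind == "transient" and pos == 4        # 1200 // 256 == 4
assert detect_transient(np.ones(2048))[0] == "steady"
```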
If the signal type need not be analyzed, step 12 may be skipped.
In step 13, according to the signal type analysis result, an MDCT of the corresponding order is applied to the resampled sound signal to obtain its MDCT-domain coefficients.
The modified discrete cosine transform (MDCT) is described in detail below.
Take the M time-domain samples of the previous frame and the M samples of the current frame, apply a window to these 2M samples, and then apply the MDCT to the windowed signal, yielding M spectral coefficients.
The impulse response of the MDCT analysis filter is:

$h_k(n) = w(n)\sqrt{\tfrac{2}{M}}\cos\left[\tfrac{(2n+M+1)(2k+1)\pi}{4M}\right]$

and the MDCT transform is:

$X(k) = \sum_{n=0}^{2M-1} x(n)\,h_k(n), \quad 0 \le k \le M-1$

where w(n) is the window function, x(n) is the input time-domain signal of the MDCT, and X(k) is its output frequency-domain signal.
To satisfy the perfect-reconstruction condition, the window function w(n) of the MDCT must meet two conditions:

$w(2M-1-n) = w(n)$ and $w^2(n) + w^2(n+M) = 1$.
In practice, a sine window may be chosen as the window function. The above restrictions on the window can also be relaxed by using a biorthogonal transform with specific analysis and synthesis filters.
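The window conditions and the transform's perfect-reconstruction property can be checked numerically. The sketch below uses the MDCT definition above without the $\sqrt{2/M}$ normalization; the 2/M factor in the inverse is then the implementation choice that makes overlap-add exact for a sine window, and the frame sizes are illustrative.

```python
import numpy as np

def mdct(frame, w):
    """MDCT per the text's phase: 2M windowed samples -> M coefficients."""
    M = len(frame) // 2
    n = np.arange(2 * M)[None, :]
    k = np.arange(M)[:, None]
    C = np.cos(np.pi / (4 * M) * (2 * n + 1 + M) * (2 * k + 1))
    return C @ (w * frame)

def imdct(X, w):
    """Windowed inverse; 2/M makes overlap-add exact for a sine window."""
    M = len(X)
    n = np.arange(2 * M)[None, :]
    k = np.arange(M)[:, None]
    C = np.cos(np.pi / (4 * M) * (2 * n + 1 + M) * (2 * k + 1))
    return (2.0 / M) * w * (C.T @ X)

M = 64
w = np.sin(np.pi * (np.arange(2 * M) + 0.5) / (2 * M))    # sine window:
assert np.allclose(w[::-1], w)                            # w(2M-1-n) = w(n)
assert np.allclose(w[:M] ** 2 + w[M:] ** 2, 1.0)          # w^2(n)+w^2(n+M) = 1

rng = np.random.default_rng(0)
x = rng.standard_normal(4 * M)
y = np.zeros_like(x)
for i in range(3):                                        # 50%-overlapped frames
    f = x[i * M:i * M + 2 * M]
    y[i * M:i * M + 2 * M] += imdct(mdct(f, w), w)
assert np.allclose(y[M:3 * M], x[M:3 * M])                # TDAC: aliasing cancels
```

The middle 2M samples are covered by two overlapping frames each, so the time-domain aliasing of adjacent frames cancels there and the input is recovered exactly.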
In this way, time-frequency transformation of the frame by MDCT yields different time-frequency plane maps depending on the signal type. For example, suppose the transform order is 2048 when the current frame is steady-state and 256 when it is transient; the resulting time-frequency planes are as shown in Figure 6, where Figure 6a is the time-frequency plane of a steady-state signal and Figure 6b that of a transient signal.
In step 14, the MDCT-domain coefficients obtained by the MDCT are divided into a low-frequency spectrum and a high-frequency spectrum. Because many combinations of sampling rate and coding bit rate are possible, the band division is adjustable; typically, the split between the low-frequency and high-frequency spectrum lies between 1/3 and 1 of the coded bandwidth. The coded bandwidth is not larger than the actual bandwidth of the signal to be coded, which, by the Nyquist sampling theorem, is half its sampling frequency. For example, when coding a 44.1 kHz monophonic sound signal at 16 kbps, one choice of coded bandwidth is 12 kHz.
In step 15, low-frequency waveform coding comprises two sub-steps: low-frequency redundancy processing and low-frequency quantization coding. Redundancy processing can adopt many methods, for example a shorter-order transform or a higher-order predictor to remove the temporal correlation of the MDCT-domain sound signal between two sub-frames or between two consecutive frames, such as the discrete cosine transform (DCT), discrete Fourier transform (DFT), modified discrete cosine transform (MDCT), or long-term prediction (LTP); or a lower-order predictor, such as a linear predictor (LPC), to remove its frequency-domain correlation.
Preferably, low-frequency redundancy processing is described below taking a shorter-order DCT and a lower-order LPC as examples.
First, redundancy processing with a shorter-order DCT. Here the low-frequency spectrum of a transient signal is processed in time order: the eight spectral coefficients at the same frequency position of the time-frequency plane are decorrelated with an 8×8 DCT, using the DCT-II basis functions.
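This step can be sketched as follows. The eight 64-bin sub-frame spectra are invented toy data, made identical across time so the decorrelation effect is visible: the 8-point DCT-II along the time axis compacts all energy into its first (DC) row.

```python
import numpy as np

def dct2_matrix(N):
    """Orthonormal DCT-II basis matrix."""
    n = np.arange(N)
    C = np.sqrt(2.0 / N) * np.cos(np.pi / N * n[:, None] * (n[None, :] + 0.5))
    C[0] /= np.sqrt(2.0)
    return C

rng = np.random.default_rng(3)
spectra = np.tile(rng.standard_normal(64), (8, 1))  # 8 sub-frame spectra
C = dct2_matrix(8)
residual = C @ spectra                    # 8x8 DCT at each frequency position
assert np.allclose(residual[1:], 0.0)     # identical columns -> only DC remains
assert np.allclose(C.T @ residual, spectra)   # orthonormal -> invertible
```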
Second, redundancy processing with a lower-order LPC. Here linear predictive coding is applied to the low-frequency spectrum: linear prediction analysis yields the predictor parameters and the low-frequency residual spectrum, and the predictor parameters are quantized.
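A minimal sketch of the lower-order predictor, under stated assumptions: the autocorrelation method with direct solution of the normal equations stands in for whatever LPC analysis the text intends, and the AR(2) "spectrum" is invented data with strong along-frequency correlation.

```python
import numpy as np

def lpc_coeffs(x, order):
    """Autocorrelation-method linear predictor (illustrative low order)."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])

rng = np.random.default_rng(4)
e = rng.standard_normal(512)
x = np.zeros(512)
for t in range(2, 512):                  # toy AR(2) "low-frequency spectrum"
    x[t] = 0.9 * x[t - 1] - 0.5 * x[t - 2] + e[t]
a = lpc_coeffs(x, order=2)
resid = x[2:] - (a[0] * x[1:-1] + a[1] * x[:-2])  # residual spectrum
assert np.var(resid) < 0.7 * np.var(x[2:])        # predictor removes redundancy
```

The residual is what would be quantized, together with the (quantized) predictor parameters.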
For low-frequency waveform quantization coding, a scalar quantization plus Huffman coding scheme similar to that in MPEG AAC may be used, or a vector quantization scheme; in constant-bit-rate coding, a vector quantizer is a reasonable choice.
In step 16, the MDCT-domain coefficients are converted into MDFT-domain coefficients.
The modified discrete Fourier transform (MDFT), the relation between the MDCT and the MDFT, and the methods for converting MDCT-domain coefficients into MDFT-domain coefficients are described below.
For the MDFT, as for the MDCT, take the M time-domain samples of the previous frame and the M samples of the current frame, window the 2M samples, and then transform the windowed signal. The MDFT is computed as:

$X(k) = \sum_{n=0}^{2M-1} s(n)\exp\left(j\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right), \quad k = 0, 1, \ldots, 2M-1$

where w(n) is the window function, s(n) is the (windowed) input time-domain signal of the MDFT, and X(k) is its output frequency-domain signal. The MDFT spectral coefficients X(k) have the property:
$X(k) = -\operatorname{conj}\big(X(2M-1-k)\big)$
Only the first M coefficients therefore need to be kept; the complete X(k) can be recovered from them.
To satisfy the perfect-reconstruction condition, the window function w(n) of the MDFT must meet two conditions:

$w(2M-1-n) = w(n)$ and $w^2(n) + w^2(n+M) = 1$.
In practice, a sine window may be chosen as the window function. The above restrictions on the window can also be relaxed by using a biorthogonal transform with specific analysis and synthesis filters.
Next, the relation between the MDCT transform and the MDFT transform is introduced.
For a time-domain signal s(n), its MDCT-domain coefficients X(k) are computed as:

$X(k) = \sum_{n=0}^{2M-1} s(n)\cos\left(\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right)$

where 2M is the frame length.
Similarly, the MDST-domain coefficients Y(k) are defined as:

$Y(k) = \sum_{n=0}^{2M-1} s(n)\sin\left(\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right)$
With the MDCT-domain coefficients X(k) as the real part and the MDST-domain coefficients Y(k) as the imaginary part, the MDFT-domain coefficients Z(k) are constructed as:

$Z(k) = X(k) + jY(k), \quad k = 0, 1, \ldots, M-1$

where j is the imaginary unit. Indeed,

$Z(k) = X(k) + jY(k) = \sum_{n=0}^{2M-1} s(n)\cos\left(\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right) + j\sum_{n=0}^{2M-1} s(n)\sin\left(\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right) = \sum_{n=0}^{2M-1} s(n)\exp\left(j\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right)$
The MDFT is thus a complex transform that carries phase information and conserves energy: the transform-domain and time-domain energies agree, up to the transform's scale factor. The real part of the MDFT-domain coefficients is exactly the MDCT-domain coefficients.
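The identities above can be checked numerically on an arbitrary windowed frame. M and the sine window are illustrative (M is even, which the conjugate symmetry requires with this phase), and energy conservation is verified up to the transform's 1/(2M) scale factor.

```python
import numpy as np

M = 32
n = np.arange(2 * M)
k = np.arange(2 * M)
phase = np.pi / (4 * M) * (2 * n[None, :] + 1 + M) * (2 * k[:, None] + 1)

rng = np.random.default_rng(1)
w = np.sin(np.pi * (n + 0.5) / (2 * M))      # sine window
s = w * rng.standard_normal(2 * M)           # windowed input frame

Z = np.exp(1j * phase) @ s                   # MDFT, k = 0..2M-1
X = np.cos(phase[:M]) @ s                    # MDCT, k = 0..M-1
Y = np.sin(phase[:M]) @ s                    # MDST, k = 0..M-1

assert np.allclose(Z[:M], X + 1j * Y)        # Z(k) = X(k) + jY(k)
assert np.allclose(Z, -np.conj(Z[::-1]))     # X(k) = -conj(X(2M-1-k))
assert np.isclose(np.sum(np.abs(Z) ** 2) / (2 * M), np.sum(s ** 2))  # energy
```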
Finally, several concrete conversion methods are given to explain the MDCT-to-MDFT conversion.
Conversion method 1: apply an MDST to the sound signal, and combine the MDST-domain coefficients with the MDCT-domain coefficients to obtain the MDFT-domain coefficients.
To convert MDCT-domain coefficients into MDFT-domain coefficients, the relation among the MDCT, MDST, and MDFT can be exploited: compute the MDST-domain coefficients and combine them with the MDCT-domain coefficients. This method has two steps: the MDST transform, and combining the MDCT- and MDST-domain coefficients into MDFT-domain coefficients.
Step a: MDST transform. To stay synchronized with the MDCT-domain coefficients, the MDST uses the same window function and window length as the current frame's MDCT. With current MDCT length 2M and window w(n), the MDST-domain coefficients Y(k) are:

$Y(k) = \sum_{n=0}^{2M-1} s(n)\sin\left(\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right), \quad k = 0, 1, \ldots, M-1$
Step b: combine the MDCT- and MDST-domain coefficients into MDFT-domain coefficients, with the MDCT-domain coefficients as the real part and the MDST-domain coefficients as the imaginary part:

$Z(k) = X(k) + jY(k), \quad k = 0, 1, \ldots, M-1$
Conversion method 2: reconstruct the time-domain signal from the MDCT-domain coefficients, then apply an MDST and combine with the MDCT-domain coefficients to obtain the MDFT-domain coefficients. The MDCT-domain coefficients are passed through the inverse MDCT (IMDCT) and overlap-add to obtain the time-domain reconstruction; an MDST of the reconstruction yields the MDST-domain coefficients, which are combined with the MDCT-domain coefficients into the MDFT-domain coefficients. This method has three steps: time-domain reconstruction, the MDST transform, and combining the MDCT- and MDST-domain coefficients into MDFT-domain coefficients.
Step a: time-domain reconstruction, identical to the inverse MDCT. The inverse modified discrete cosine transform (IMDCT) is given by:

$x_e(n) = \sum_{k=0}^{M-1} X(k)\,h_k(n)$

$h_k(n) = w(n)\sqrt{\tfrac{2}{M}}\cos\left[\tfrac{(2n+M+1)(2k+1)\pi}{4M}\right]$

where $x_e(n)$ is the output time-domain signal of the IMDCT, $h_k(n)$ is the impulse response of the MDCT synthesis filter, w(n) is the window function, and X(k) are the MDCT-domain coefficients.
Step b: MDST transform. To stay synchronized with the MDCT-domain coefficients, the MDST uses the same window function and window length as the current frame's MDCT. With current MDCT length 2M and window w(n), the MDST is:

$Y(k) = \sum_{n=0}^{2M-1} s(n)\sin\left(\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right), \quad k = 0, 1, \ldots, M-1$
Step c: combine the MDCT- and MDST-domain coefficients into MDFT-domain coefficients, with the MDCT-domain coefficients as the real part and the MDST-domain coefficients as the imaginary part:

$Z(k) = X(k) + jY(k), \quad k = 0, 1, \ldots, M-1$
Conversion method 3: reconstruct the time-domain signal from the MDCT-domain coefficients, then apply an MDFT to obtain the MDFT-domain coefficients. The MDCT-domain coefficients are passed through the IMDCT and overlap-add to obtain the time-domain reconstruction, and an MDFT of the reconstruction yields the MDFT-domain coefficients. This method has two steps: time-domain reconstruction and the MDFT transform.
Step a: time-domain reconstruction, identical to the IMDCT step above.
Step b: MDFT transform. To stay synchronized with the MDCT-domain coefficients, the MDFT uses the same window function and window length as the current frame's MDCT. With current MDCT length 2M and window w(n), the MDFT is:

$Z(k) = \sum_{n=0}^{2M-1} sr(n)\,w(n)\exp\left(j\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right), \quad k = 0, 1, \ldots, M-1$

where sr(n) is the time-domain reconstruction.
Conversion method 4: obtain the MDFT coefficients by processing the MDCT-domain coefficients directly. By establishing the relation between the MDCT-domain coefficients of the current, previous, and next frames and the current frame's MDST-domain coefficients, three transition matrices are determined that compute the current frame's MDST-domain coefficients from those three frames' MDCT-domain coefficients. The MDST-domain coefficients are thus obtained directly from MDCT-domain coefficients, and are finally combined with them into the MDFT coefficients. This method has two steps: MDST coefficient computation and MDFT coefficient combination.
Step a: MDST coefficient computation.
Let the current frame length be 2M with window function w(n); let the previous frame's MDCT coefficients be $S_{-1}(k)$, the current frame's $S_0(k)$, and the next frame's $S_{+1}(k)$. The MDST-domain coefficients Y(k) are then computed as:

$Y(k) = S_{-1}(k)\,T_{cs-1} + S_0(k)\,T_{cs0} + S_{+1}(k)\,T_{cs+1}$

where $T_{cs-1}$, $T_{cs0}$, and $T_{cs+1}$ are transition matrices representing the contributions of the previous-frame, current-frame, and next-frame MDCT coefficients, respectively, to the current frame's MDST coefficients.
$T_{cs-1}$, $T_{cs0}$, and $T_{cs+1}$ are all sparse matrices: only a minority of their entries are significant, while most are zero or close to zero. By approximating the near-zero entries as zero, the transition matrices can be simplified and the computational load reduced.
Step b: MDFT coefficient combination. The MDCT- and MDST-domain coefficients are combined into MDFT-domain coefficients, with the MDCT coefficients X(k) as the real part and the MDST coefficients Y(k) as the imaginary part: $Z(k) = X(k) + jY(k)$, $k = 0, 1, \ldots, M-1$.
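The three-matrix construction can be verified numerically. A hedged sketch: M = 16 and the sine windows are illustrative, and the synthesis constant 2/M is chosen so that overlap-add reconstruction is exact for this unnormalized MDCT (the text's normalization may differ by an overall scale). In exact arithmetic the matrices reproduce the MDST perfectly; the point of the method is that they are approximately sparse, so near-zero entries can be dropped.

```python
import numpy as np

M = 16
n = np.arange(2 * M)
k = np.arange(M)
ph = np.pi / (4 * M) * (2 * n[None, :] + 1 + M) * (2 * k[:, None] + 1)
Ccos, Csin = np.cos(ph), np.sin(ph)          # MDCT / MDST analysis bases
w = np.sin(np.pi * (n + 0.5) / (2 * M))      # sine window (all three frames)

def mdct(frame):
    return Ccos @ (w * frame)

# Windowed IMDCT of one frame as a 2M x M matrix: S -> time samples.
B = (2.0 / M) * (w[:, None] * Ccos.T)

# The current 2M-sample segment receives: the last M samples of the
# previous frame's IMDCT, the whole current IMDCT, and the first M
# samples of the next frame's IMDCT.
G_prev = np.vstack([B[M:], np.zeros((M, M))])
G_cur = B
G_next = np.vstack([np.zeros((M, M)), B[:M]])

# Transition matrices: MDST analysis (window, then sine basis) applied
# to each frame's contribution to the reconstructed segment.
T_prev, T_cur, T_next = (Csin @ (w[:, None] * G)
                         for G in (G_prev, G_cur, G_next))

rng = np.random.default_rng(5)
x = rng.standard_normal(4 * M)               # three 50%-overlapped frames
S_prev, S_cur, S_next = mdct(x[:2*M]), mdct(x[M:3*M]), mdct(x[2*M:])

Y_mat = T_prev @ S_prev + T_cur @ S_cur + T_next @ S_next
Y_direct = Csin @ (w * x[M:3*M])             # MDST computed from the signal
assert np.allclose(Y_mat, Y_direct)
```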
The transition matrices $T_{cs-1}$, $T_{cs0}$, $T_{cs+1}$ and their computation are illustrated below taking $T_{cs0}$ as an example. Let the window of the previous frame be $w_{-1}(n)$, of the current frame $w_0(n)$, and of the next frame $w_{+1}(n)$, each of length 2M; let the previous-frame MDCT coefficients be $S_{-1}(k)$, the current-frame $S_0(k)$, and the next-frame $S_{+1}(k)$. The current frame's MDST coefficients Y(k) are then:

$Y(k) = \sum_{n=0}^{2M-1} sr(n)\,w_0(n)\sin\left(\tfrac{\pi}{4M}(2n+1+M)(2k+1)\right)$

where sr(n) is the time-domain reconstruction of the current frame.
Be transition matrix, j is imaginary symbols.And have:
sr ( n ) = ( 1 2 M Σ k = 0 M - 1 S 0 ( k ) cos ( - π 4 M * ( 2 n + 1 + 2 M 2 ) * ( 2 k + 1 ) ) * w 0 ( n )
+ 1 2 M Σ k = 0 M - 1 S - 1 ( k ) cos ( - π 4 M * ( 2 ( n + M ) + 1 + 2 M 2 ) * ( 2 k + 1 ) ) * f - 1 ( n ) * w - 1 ( n + M )
+ 1 2 M Σ k = 0 M - 1 S + 1 ( k ) cos ( - π 4 M * ( 2 ( n - M ) + 1 + 2 M 2 ) * ( 2 k + 1 ) ) * f + 1 ( n ) ) * w + 1 ( n - M )
f - 1 ( n ) = 1 , n < M 0 , else
f + 1 ( n ) = 0 , n < M 1 , else
W -1(n), w 0(n), w + 1n () meets:
w -1(n+M) 2+w 0(n) 2=1
w 0(n+M) 2+w +1(n) 2=1
Substituting the above expressions into Y(k) yields:
Y(k) = Y_{-1}(k) + Y_0(k) + Y_{+1}(k)
where:
Y_{-1}(k) = (1/2M) Σ_{n=0}^{2M-1} Σ_{j=0}^{M-1} S_{-1}(j) · cos( -(π/4M)·(2(n+M)+1+M)·(2j+1) ) · f_{-1}(n) · w_{-1}(n+M) · w_0(n) · sin( (π/4M)·(2n+1+M)·(2k+1) )
Y_0(k) = (1/2M) Σ_{n=0}^{2M-1} Σ_{j=0}^{M-1} S_0(j) · cos( -(π/4M)·(2n+1+M)·(2j+1) ) · w_0(n)² · sin( (π/4M)·(2n+1+M)·(2k+1) )
Y_{+1}(k) = (1/2M) Σ_{n=0}^{2M-1} Σ_{j=0}^{M-1} S_{+1}(j) · cos( -(π/4M)·(2(n-M)+1+M)·(2j+1) ) · f_{+1}(n) · w_{+1}(n-M) · w_0(n) · sin( (π/4M)·(2n+1+M)·(2k+1) )
Taking Y_0(k) as an example:
Y_0(k) = (1/2M) Σ_{j=0}^{M-1} S_0(j) Σ_{n=0}^{2M-1} w_0(n)² · cos( -(π/4M)·(2n+1+M)·(2j+1) ) · sin( (π/4M)·(2n+1+M)·(2k+1) )
       = (1/4M) Σ_{j=0}^{M-1} S_0(j) Σ_{n=0}^{2M-1} w_0(n)² · [ sin( -(π/4M)·(2n+1+M)·(2j+1+2k+1) ) - sin( (π/4M)·(2n+1+M)·(2j-2k) ) ]
Let
G(k) = Σ_{n=0}^{2M-1} w_0(n)² · sin( -(π/4M)·(2n+1+M)·2k ),  k = -2M, ..., 2M
Then:
Y_0(k) = (1/4M) Σ_{j=0}^{M-1} S_0(j) · ( G(j+k+1) + G(j-k) )
That is, Y_0(k) can be expressed as a convolution-like combination of S_0(j) and G(k). Define the vector h(k) as:
h(k) = (1/4M)·( G(j+k+1) + G(j-k) ),  j = 0, 1, ..., M-1,  k = 0, 1, ..., 2M-1
Then:
Y_0(k) = S_0(k) · T_{cs0}
T_{cs0} = ( h(0)  h(1)  ...  h(2M-1) )
This shows that Y_0(k) can be expressed as a combination of S_0(k), with transition matrix T_{cs0}. Similarly, Y_{-1}(k) can be expressed as a combination of S_{-1}(k), with transition matrix T_{cs-1}, and Y_{+1}(k) as a combination of S_{+1}(k), with transition matrix T_{cs+1}. T_{cs-1}, T_{cs0} and T_{cs+1} are all sparse matrices: only a minority of their entries are non-zero, and most entries equal 0 or are close to 0. Approximating those entries as 0 simplifies the transition matrices and reduces the computational load.
In step 17, specific bands of the low-frequency spectrum are mapped to specific high-frequency bands, forming the mapped high-frequency spectrum. The low-to-high spectrum mapping can be implemented in many ways, for example folded mapping, linear mapping or frequency-multiplied mapping. Taking linear mapping as an example, suppose the low-frequency spectrum of the original signal spans [0, F_l] and the high-frequency spectrum spans [F_l, F_s], where 2×F_l < F_s < 3×F_l, as shown in Fig. 5a). After linear mapping, the spectrum shown in Fig. 5b) is obtained.
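One simple way to realise the low-to-high mapping, sketched below purely for illustration (the patent leaves the choice among folded, linear and frequency-multiplied mapping open, and the exact bin assignment is implementation-dependent), is to tile the low-band bins across the high band:

```python
import numpy as np

def map_low_to_high(low_spec, n_high):
    """Fill n_high high-band bins from the low-band spectrum by
    repeating it; one possible realisation of the low-to-high
    spectrum mapping (details are implementation-dependent)."""
    low_spec = np.asarray(low_spec)
    reps = -(-n_high // len(low_spec))  # ceiling division
    return np.tile(low_spec, reps)[:n_high]
```

Each repetition boundary in the output is a splice of the kind the boundary preprocessing of step 18 is designed to clean up.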
In step 18, the MDFT-domain boundary preprocessing can be implemented in several ways: for example, frequency-domain windowing can mitigate the frequency-domain truncation caused by band division; harmonic-interference elimination can remove the harmonic interference noise at the splices produced by the low-to-high spectrum mapping; and methods such as MDFT-domain combined processing, harmonic extraction, MPEG-2 Layer 3 butterfly-transform alias cancellation and deconvolution can reduce the side-lobe leakage and aliasing noise caused by a non-ideal prototype filter.
Preferably, the boundary preprocessing is described below for frequency-domain windowing, harmonic-interference elimination and MDFT-domain combined processing.
First, MDFT-domain boundary preprocessing by frequency-domain windowing is described.
The high-frequency parameter extraction in the high-frequency parameter coding module 108 requires band division of both the original MDFT-domain high-frequency spectrum and the mapped high-frequency spectrum, and band division introduces a band-truncation problem. The frequency-domain windowing method applies a window to the original high-frequency spectrum and to the mapped high-frequency spectrum respectively. It effectively alleviates the truncation caused by band division, yields a smooth frequency-domain transition, and helps preserve spectral continuity and the naturalness of the band signal. Taking the windowing of the original high-frequency spectrum as an example, the method comprises two steps: constructing the windowed bands and applying the frequency window.
Step 18-1a: construct the windowed bands. From the MDFT-domain high-frequency spectral coefficients S(k), k = 0...K, construct M bands of coefficients S_m(l) to be windowed, where m = 0...M and l = 0...L_m, such that adjacent bands S_m(l) and S_{m+1}(l) overlap by Q_m coefficients:
S_m(L_m - Q_m + l) = S_{m+1}(l),  l = 0...Q_m
as shown in Fig. 7a).
Step 18-1b: apply the frequency window. Window S_m(l) to obtain the windowed high-frequency spectral coefficients S'_m(l):
S'_m(l) = S_m(l) · w_m(l),  l = 0...L_m
Different window functions w(l) give smoothing of different characteristics; w(l) may for example be a sine window, rectangular window or KBD window. The window functions of adjacent bands must satisfy:
w_m(L_m - Q_m + l) · w_m(L_m - Q_m + l) + w_{m+1}(l) · w_{m+1}(l) = 1,  l = 0...Q_m
To optimise the window performance, a window function Wbandexp based on exponential operations was designed and developed for this patent. It is defined as follows:
Wbandexp(l) =
  ( 0.5 · α^(P/2 - l - 0.5) )^0.5,          0 ≤ l < P/2
  ( 1 - 0.5 · α^(l - P/2 + 0.5) )^0.5,      P/2 ≤ l < P
  1,                                         P ≤ l < L - Q
  ( 1 - 0.5 · α^(L - Q/2 - l - 0.5) )^0.5,  L - Q ≤ l < L - Q/2
  ( 0.5 · α^(l - L + Q/2 + 0.5) )^0.5,      L - Q/2 ≤ l < L
where L is the window length, P and Q are the lengths of the leading and trailing parts of the window, i.e. the lengths of the overlap regions with the two adjacent bands (as shown in Fig. 7a)), and α is a shape factor that determines the window's performance; α lies in (0, 1) and is set to 0.75 in this embodiment.
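A sketch of Wbandexp follows, using the parameter names defined above. The power-complementarity check at the end assumes the leading overlap P of one band equals the trailing overlap Q of its neighbour; the specific L, P, Q values are illustrative:

```python
import numpy as np

def wbandexp(L, P, Q, alpha=0.75):
    """Sketch of the exponent-based frequency window Wbandexp.
    L: window length; P, Q: leading/trailing overlap lengths;
    alpha: shape factor in (0, 1)."""
    w = np.ones(L)
    l = np.arange(L, dtype=float)
    # rising edge over [0, P)
    rise = l[:P]
    w[:P] = np.where(rise < P / 2,
                     np.sqrt(0.5 * alpha ** (P / 2 - rise - 0.5)),
                     np.sqrt(1 - 0.5 * alpha ** (rise - P / 2 + 0.5)))
    # falling edge over [L - Q, L)
    fall = l[L - Q:]
    w[L - Q:] = np.where(fall < L - Q / 2,
                         np.sqrt(1 - 0.5 * alpha ** (L - Q / 2 - fall - 0.5)),
                         np.sqrt(0.5 * alpha ** (fall - L + Q / 2 + 0.5)))
    return w

# Power complementarity of two adjacent identical bands with P = Q:
L, P, Q = 16, 4, 4
w = wbandexp(L, P, Q)
overlap = w[L - Q:] ** 2 + w[:Q] ** 2  # w_m(L-Q+l)^2 + w_{m+1}(l)^2
```

The `overlap` vector evaluating to all ones is exactly the adjacent-band condition stated above.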
Next, MDFT-domain boundary preprocessing by harmonic-interference elimination is described.
When a specific band of the low-frequency spectrum is mapped to a specific band of the high-frequency spectrum in the low-to-high spectrum mapping module 106, a splice between the two bands appears in the mapped high-frequency spectrum. If two harmonics end up too close together at a splice, harmonic interference noise results. The harmonic-interference elimination method processes the splice regions of the mapped high-frequency spectrum to remove the noise caused by such too-close harmonic pairs. It comprises three steps: harmonic detection, harmonic-interference decision and interference elimination.
Step 18-2a: harmonic detection. The low-frequency spectrum and the mapped high-frequency spectrum are combined into a full-band spectrum, and all candidate harmonics are located in this full spectrum as local maxima of the spectral energy.
Step 18-2b: harmonic-interference decision. Based on the detection result, each splice of the mapped high-frequency spectrum is examined. If the centre-frequency positions Sband_core(m) and Sband_core(m+1) of the two harmonics on either side of the i-th splice are closer than a threshold Δi, harmonic interference noise is deemed present and processing proceeds to step 18-2c; otherwise no processing is done.
Step 18-2c: interference elimination. The lower-energy harmonic of the pair is multiplied by a very small scale factor, set to 0.005 in this embodiment.
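A minimal sketch of steps 18-2a through 18-2c around a single splice bin follows; the `splice` and `delta` values are illustrative stand-ins for the splice position and the threshold Δi:

```python
import numpy as np

def suppress_splice_interference(spec, splice, delta=3, eps=0.005):
    """Find the nearest spectral peaks on either side of a splice bin;
    if they are closer than delta bins, scale the weaker one by the
    tiny factor eps (0.005 in the embodiment)."""
    spec = np.array(spec, dtype=float)
    mag = np.abs(spec)
    # step 18-2a: harmonics as local maxima of spectral energy
    peaks = [k for k in range(1, len(spec) - 1)
             if mag[k] > mag[k - 1] and mag[k] > mag[k + 1]]
    left = [k for k in peaks if k < splice]
    right = [k for k in peaks if k >= splice]
    if left and right:
        kl, kr = left[-1], right[0]
        if kr - kl < delta:                  # step 18-2b: pair too close
            weaker = kl if mag[kl] < mag[kr] else kr
            spec[weaker] *= eps              # step 18-2c: attenuate it
    return spec
```

A full implementation would run this decision at every splice produced by the mapping, not just one.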
Finally, MDFT-domain boundary preprocessing by MDFT-domain combined processing is described.
When a specific band of the low-frequency spectrum is mapped to a specific band of the high-frequency spectrum in the low-to-high spectrum mapping module 106, a splice between the two bands appears in the mapped high-frequency spectrum. At the splice, the side-lobe leakage caused by the non-ideal prototype filter underlying the MDFT transform becomes prominent and introduces aliasing noise. This method applies MDFT-domain combined processing at the splices of the mapped high-frequency spectrum to attenuate that side-lobe leakage and aliasing noise. It comprises three steps:
Step 18-3a: in Fig. 7b), frequency f_l is the boundary between the low- and high-frequency spectra, and frequencies (f_l + Δf), (f_l + 2Δf) and (f_l + 3Δf) correspond to the splices produced when the low-band range [f_c, f_l) is mapped to the high-band ranges [f_l, f_l + Δf), [f_l + Δf, f_l + 2Δf) and [f_l + 2Δf, f_l + 3Δf). MDFT-domain combined processing operates on the spectrum near the start and cutoff frequencies of each mapped band; for example, for the band [f_l + Δf, f_l + 2Δf), it processes frequency ranges of width δ centred on f_l + Δf and on f_l + 2Δf.
Step 18-3b: the combined processing near the start frequency (f_l + Δf) of a band is computed as:
S'(f_l + Δf + k) = Σ_{j=-3δ/2}^{3δ/2} S_{-1}(f_l + Δf + j) · Fx_{-1}(j, k)
                 + Σ_{j=-3δ/2}^{3δ/2} S_0(f_l + Δf + j) · Fx_0(j, k)
                 + Σ_{j=-3δ/2}^{3δ/2} S_{+1}(f_l + Δf + j) · Fx_{+1}(j, k),   k = -δ/2 ... δ/2
and near the cutoff frequency (f_l + 2Δf) as:
S'(f_l + 2Δf + k) = Σ_{j=-3δ/2}^{3δ/2} S_{-1}(f_l + 2Δf + j) · Fy_{-1}(j, k)
                  + Σ_{j=-3δ/2}^{3δ/2} S_0(f_l + 2Δf + j) · Fy_0(j, k)
                  + Σ_{j=-3δ/2}^{3δ/2} S_{+1}(f_l + 2Δf + j) · Fy_{+1}(j, k),   k = -δ/2 ... δ/2
where S_{-1}(k), S_0(k) and S_{+1}(k) are the spectral coefficients of the band in the previous, current and next frames; Fx_{-1}(j, k), Fx_0(j, k) and Fx_{+1}(j, k) are the combination parameters at the start-frequency position, giving the respective contributions of the previous-, current- and next-frame spectral coefficients to the current-frame MDFT-domain combined processing; and Fy_{-1}(j, k), Fy_0(j, k) and Fy_{+1}(j, k) are the corresponding combination parameters at the cutoff-frequency position.
Step 18-3c: the spectral coefficients S'(k) produced by step 18-3b are overlap-added at the splices, giving the mapped high-frequency spectrum after MDFT-domain combined processing; the overlap-add width between adjacent bands is δ.
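The per-edge combination of step 18-3b is just three weighted sums, which can be sketched as below. The shapes are assumptions for illustration: each `S_*` holds the 3δ+1 bins centred on the band edge, and each `F_*` is a (3δ+1, δ+1) combination-parameter matrix:

```python
import numpy as np

def splice_combine(S_m1, S_0, S_p1, F_m1, F_0, F_p1):
    """Sketch of step 18-3b: combined coefficients near one band edge
    as weighted sums of the previous/current/next-frame spectra around
    that edge, with F_* the precomputed combination parameters."""
    return S_m1 @ F_m1 + S_0 @ F_0 + S_p1 @ F_p1

delta = 2
rng = np.random.default_rng(1)
S = [rng.standard_normal(3 * delta + 1) for _ in range(3)]
F = [rng.standard_normal((3 * delta + 1, delta + 1)) for _ in range(3)]
S_comb = splice_combine(*S, *F)  # one value per k in -delta/2 .. delta/2
```

The real parameters Fx/Fy are derived from the prototype filter as described next, not drawn at random.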
Once determined, the combination parameters Fx_{-1}(j, k), Fx_0(j, k), Fx_{+1}(j, k), Fy_{-1}(j, k), Fy_0(j, k) and Fy_{+1}(j, k) need not be recomputed for every frame. When the previous, current and next frames are all slowly-varying signals, the computation of Fy_{-1}(j, k), Fy_0(j, k) and Fy_{+1}(j, k) can be divided into the following steps:
18-3-1: construct full-band MDFT-domain coefficients S(k):
S(k) = 1 if k = f_l + j_0, and 0 otherwise
where j_0 is an offset near the cutoff frequency f_l;
18-3-2: apply the inverse MDFT to S(k) to obtain the current-frame time-domain signal sr(n), 0 ≤ n < 2M;
18-3-3: construct sr'(n) from sr(n):
sr'(n) = 0 for 0 ≤ n < 2M;  sr(n - 2M) for 2M ≤ n < 4M;  0 for 4M ≤ n < 6M
18-3-4: low-pass filter sr'(n) with cutoff frequency f_l to obtain the filtered signal sr_l(n); the low-pass filter may be constructed from a pseudo-quadrature mirror filter (PQMF) prototype;
18-3-5: construct the time-domain signals sr_{-1}(n), sr_0(n) and sr_{+1}(n) from sr_l(n), then window each and apply the MDFT to obtain the MDFT-domain coefficients Sy_{-1}(k), Sy_0(k) and Sy_{+1}(k):
sr_{-1}(n) = sr_l(n + M),  0 ≤ n < 2M
sr_0(n) = sr_l(n + 2M),  0 ≤ n < 2M
sr_{+1}(n) = sr_l(n + 3M),  0 ≤ n < 2M
18-3-6: compute the MDFT-domain combination parameters Fy_{-1}(j_0, k), Fy_0(j_0, k) and Fy_{+1}(j_0, k) from Sy_{-1}(k), Sy_0(k) and Sy_{+1}(k):
Fy_{-1}(j_0, k) = Sy_{+1}(k)
Fy_{+1}(j_0, k) = Sy_{-1}(k)
Fy_0(j_0, k) = Sy_0(k)
with k ranging over -δ/2 ... δ/2, as above;
18-3-7: change the value of j_0 and return to 18-3-1, until the parameters Fy_{-1}(j_0, k), Fy_0(j_0, k) and Fy_{+1}(j_0, k) have been computed for all j_0 in the required range.
It should be noted that the MDFT-domain combined processing of this embodiment is equally applicable at the start and cutoff frequencies of the low-band range [f_c, f_l); in that case the processed low-frequency spectrum is then mapped to the high-frequency band.
In step 19, high-frequency parameter coding extracts, from the boundary-preprocessed MDFT-domain mapped high-frequency spectrum, the parameters used to recover the high-frequency spectrum. In the present invention, the high-frequency parameter coding method comprises the following steps.
Step 19-1: divide the time-frequency planes of the mapped high band and of the original high band into multiple regions according to the signal type and the positions of transients; compute the energy of each region of the original high band and of the corresponding region of the mapped high band, and from these the energy gain of each region; quantise the gains; and output the quantised gains as side information to the bit-stream multiplexing module 109.
The regions divided in step 19-1 are similar to the scale factor bands (ScaleFactor Band) in MPEG AAC, and the energy of a region is the sum of the energies of its spectral lines. Because the mapped high-frequency spectrum is obtained by mapping the low-frequency spectrum, its structure is consistent with that of the low-frequency spectrum, as shown in Fig. 11. When the low band is a slowly-varying frame, the high-frequency spectrum can be divided into regions along the frequency direction; when the low band is a fast-changing frame, higher temporal resolution is needed to suppress pre-echo and post-echo, so different region divisions are made along the time direction according to the transient position. If the transient positions are as shown in Fig. 8a), the corresponding region divisions are as shown in Fig. 8b). For example, if, when coding the low band, the signal-type decision module determines that the transient occurs at the 3rd window, then mode 3 is selected per Fig. 8a), and per Fig. 8b) the region division corresponding to mode 3 is (3,1,3,1). To reduce the number of side-information bits transmitted, the frequency resolution can be reduced for fast-changing frames. Note in particular that the region division of the original high-frequency spectrum must be consistent with that of the mapped high-frequency spectrum; the gain of a region is then the ratio of the energy of the original high-frequency spectrum in that region to the energy of the mapped high-frequency spectrum in the same region. Finally, the gains of all regions are quantised and output to the bit-stream multiplexing module 109.
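Along the frequency direction, the per-region gain computation of step 19-1 can be sketched as follows; the `band_edges` list plays the role of the scale-factor-band-like region boundaries, and quantisation is omitted:

```python
import numpy as np

def region_gains(orig_high, mapped_high, band_edges):
    """Per-region energy gains between the original high band and the
    mapped high band: gain = E_orig / E_mapped per region."""
    gains = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        e_orig = np.sum(np.abs(np.asarray(orig_high[lo:hi])) ** 2)
        e_map = np.sum(np.abs(np.asarray(mapped_high[lo:hi])) ** 2)
        gains.append(e_orig / max(e_map, 1e-12))  # guard is illustrative
    return np.array(gains)
```

For fast-changing frames the same ratio is taken per time-frequency region rather than per frequency band.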
Step 19-2: compute the tonality of each original high-frequency band and of the corresponding mapped high-frequency band, derive the side information for adjusting the band tonality (comprising the adjustment type and the adjustment parameter), and output this side information to the bit-stream multiplexing module 109. Several methods can be used to compute tonality, for example linear prediction in the time domain, spectral flatness, or the unpredictability measure of MPEG psychoacoustic model 2.
Taking MPEG psychoacoustic model 2 as an example, the tonality computation is as follows. Model 2 derives tonality from the amplitude and phase of the signal spectrum by computing the "unpredictability measure" of each spectral line; the spectrum is divided into bands, each containing at least one spectral line.
Let the complex spectrum of the current frame be:
X[k] = r[k] · e^{jφ[k]},  k = 1, ..., K
where r[k] is the amplitude and φ[k] the phase.
The energy of each band is:
e[b] = Σ_{k=k_l}^{k_h} r²[k]
where k_l and k_h are the lower and upper boundaries of band b.
The unpredictability measure of each spectral line is the relative distance between its current value and a prediction based on the previous two frames. The predicted amplitude and phase are:
r_pred[k] = r_{t-1}[k] + (r_{t-1}[k] - r_{t-2}[k])
φ_pred[k] = φ_{t-1}[k] + (φ_{t-1}[k] - φ_{t-2}[k])
The unpredictability measure c[k] is then defined as:
c[k] = dist(X[k], X_pred[k]) / (r[k] + |r_pred[k]|) = | r[k]·e^{jφ[k]} - r_pred[k]·e^{jφ_pred[k]} | / (r[k] + |r_pred[k]|)
The band unpredictability is the sum over the band of each line's energy multiplied by its unpredictability measure:
c[b] = Σ_{k=k_l}^{k_h} c[k] · r²[k]
The normalised partition unpredictability is defined as:
c_s[b] = c[b] / e[b]
and the partition tonality is computed from it as:
t[b] = -0.299 - 0.43 · log_e(c_s[b])
with t[b] limited to 0 ≤ t[b] ≤ 1; t[b] = 1 corresponds to a pure tone and t[b] = 0 to white noise. With the above computation, the tonality of the original high-frequency spectrum and the tonality of the mapped high-frequency spectrum can both be obtained. The tonality-adjustment parameter of the mapped high-frequency spectrum can then be computed as follows.
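The tonality chain above (two-frame prediction, per-line unpredictability c[k], band tonality t[b] with its [0, 1] clamp) can be sketched as follows; the band ranges and the tiny epsilon guards are illustrative additions, not part of the model:

```python
import numpy as np

def band_tonality(X, X1, X2, bands):
    """Sketch of the MPEG psychoacoustic model 2 tonality measure.
    X, X1, X2: complex spectra of the current and two previous frames;
    bands: list of (k_l, k_h) inclusive bin ranges."""
    r = np.abs(X)
    # amplitude/phase linearly extrapolated from the two past frames
    r_pred = 2.0 * np.abs(X1) - np.abs(X2)
    phi_pred = 2.0 * np.angle(X1) - np.angle(X2)
    X_pred = r_pred * np.exp(1j * phi_pred)
    # per-line unpredictability measure c[k]
    c = np.abs(X - X_pred) / (r + np.abs(r_pred) + 1e-12)
    t = []
    for kl, kh in bands:
        e = np.sum(r[kl:kh + 1] ** 2)                  # band energy e[b]
        cb = np.sum(c[kl:kh + 1] * r[kl:kh + 1] ** 2)  # band unpredictability
        cs = cb / max(e, 1e-12)                        # normalised c_s[b]
        tb = -0.299 - 0.43 * np.log(max(cs, 1e-12))
        t.append(float(min(max(tb, 0.0), 1.0)))        # clamp to [0, 1]
    return np.array(t)
```

A perfectly predictable (steady) sinusoid gives c[k] = 0 and hence t[b] clamped to 1, as expected for a pure tone.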
Let T_est be the tonality and E_est the energy of a band of the mapped high-frequency spectrum, and T_ref the tonality of the corresponding original high-frequency band, all obtained by the computation above. The tonality adjustment of the mapped spectrum is handled in the following cases:
Case 1: when the mapped-band tonality T_est and the original-band tonality T_ref are approximately equal, the adjustment type is coded as "no adjustment" and output to the bit-stream multiplexing module;
Case 2: when the mapped-band tonality T_est is less than the tonality T_ref of the corresponding original band, the adjustment type is "add tone". The energy ΔE_T of the tone to be added satisfies:
T_ref = ( E_est·T_est/(1+T_est) + ΔE_T ) / ( E_est/(1+T_est) ) = ( E_est·T_est + ΔE_T·(1+T_est) ) / E_est
Rearranging gives ΔE_T = E_est·(T_ref - T_est)/(1+T_est), which is quantised as the adjustment parameter and output, together with the adjustment-type code, to the bit-stream multiplexing module 109;
Case 3: when the mapped-band tonality T_est is greater than the tonality T_ref of the corresponding original band, the adjustment type is "add noise". The energy ΔE_N of the noise to be added satisfies:
1/T_ref = ( E_est/(1+T_est) + ΔE_N ) / ( E_est·T_est/(1+T_est) ) = ( E_est + ΔE_N·(1+T_est) ) / ( E_est·T_est )
Rearranging gives ΔE_N = E_est·(T_est - T_ref)/( T_ref·(1+T_est) ), which is quantised as the adjustment parameter and output, together with the adjustment-type code, to the bit-stream multiplexing module 109.
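Solving the two balance equations above for the added energies gives the closed forms used in this sketch (the rearranged forms are derived here, since the original text elides them); the test values are arbitrary:

```python
def tonality_adjust_energy(t_est, t_ref, e_est):
    """Energy of the tone (dE_T) or noise (dE_N) to add, per the
    case-2/case-3 balance equations rearranged for the added energy."""
    if t_est < t_ref:   # case 2: mapped band not tonal enough, add tone
        return 'tone', e_est * (t_ref - t_est) / (1.0 + t_est)
    if t_est > t_ref:   # case 3: mapped band too tonal, add noise
        return 'noise', e_est * (t_est - t_ref) / (t_ref * (1.0 + t_est))
    return 'none', 0.0  # case 1: tonalities already match
```

Adding the returned energy to the tonal (or noise) part of the mapped band makes its tone-to-noise ratio equal T_ref, which is the defining property of the adjustment.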
The monophonic sound decoding apparatus and method of preferred embodiment 1 of the present invention are introduced below. Since the decoding process is the inverse of the coding process, it is described only briefly.
Fig. 9 is a structural block diagram of the monophonic sound decoding apparatus according to the preferred embodiment of the invention.
As shown in Fig. 9, the monophonic sound decoding apparatus according to a preferred embodiment of the invention comprises: a bit-stream demultiplexing module 901, a low-frequency waveform decoding module 902, an MDCT-to-MDFT conversion module 903, a low-to-high spectrum mapping module 904, an MDFT-domain boundary preprocessing module 905, a high-frequency parameter decoding module 906, an MDFT-domain boundary postprocessing module 907, an inverse modified discrete Fourier transform (IMDFT) module 908 and a resampling module 909.
Below, the connections between the modules shown in Fig. 9 and their respective functions are summarised.
The bit-stream demultiplexing module 901 demultiplexes the received coded sound stream to obtain the coded data and side information of each data frame, outputs the corresponding coded data and side information to the low-frequency waveform decoding module 902, and outputs the corresponding side information to the high-frequency parameter decoding module 906 and the inverse modified discrete Fourier transform (IMDFT) module 908;
The low-frequency waveform decoding module 902 decodes the frame's low-frequency waveform coded data and applies the inverse redundancy processing to the decoded data according to the redundancy-processing side information, obtaining the decoded low-frequency spectrum;
The MDCT-to-MDFT conversion module 903 receives the output of the low-frequency waveform decoding module, converts the decoded low-frequency spectral coefficients from the MDCT domain to the MDFT domain, and outputs the MDFT-domain low-frequency spectral data to the low-to-high spectrum mapping module 904;
The low-to-high spectrum mapping module 904 receives the output of the MDCT-to-MDFT conversion module 903 and maps part of the frame's MDFT-domain low-frequency spectral data to the high-frequency part, obtaining the mapped high-frequency spectrum;
The MDFT-domain boundary preprocessing module 905 receives the output of the low-to-high spectrum mapping module 904, performs boundary preprocessing on the mapped high-frequency spectrum to improve the behaviour at the spectral boundaries, and outputs the boundary-preprocessed mapped high-frequency spectral data to the high-frequency parameter decoding module 906.
The high-frequency parameter decoding module 906 receives the boundary-preprocessed mapped high-frequency spectrum from the MDFT-domain boundary preprocessing module 905 and, according to the high-frequency parameter coded data output by the bit-stream demultiplexing module 901 (comprising the gain-adjustment and tonality-adjustment side information), adjusts its gain and tonality to obtain the decoded high-frequency spectral data;
The MDFT-domain boundary postprocessing module 907 receives the output of the high-frequency parameter decoding module 906, performs boundary postprocessing on the decoded high-frequency spectral data, and outputs the boundary-postprocessed high-frequency spectral data to the IMDFT module 908;
The IMDFT module 908 combines the low-frequency and high-frequency spectra and applies the IMDFT, selecting an IMDFT of appropriate length and order according to the signal-type side information, to obtain the frame's time-domain signal;
The resampling module 909 converts the sampling frequency of the frame's time-domain signal output by the IMDFT module 908 to a sampling frequency suitable for sound playback. Note that if the sampling frequency of the signal output by the IMDFT module 908 is already suitable for playback, the sound decoding apparatus of the present invention need not include this module.
Below, the low-frequency waveform decoding module 902 and the high-frequency parameter decoding module 906 of the monophonic sound decoding apparatus are explained in detail.
Fig. 10 is a structural block diagram of the low-frequency waveform decoding module shown in Fig. 9. As shown in Fig. 10, the low-frequency waveform decoding module 902 comprises an inverse quantisation module 1001 and an inverse redundancy processing module 1002. First, the inverse quantisation module 1001 performs inverse-quantisation decoding on the low-frequency coded data obtained from the bit-stream demultiplexing module 901, obtaining the inverse-quantised low-frequency spectrum; the inverse-quantisation decoding is the inverse of the quantisation coding used in the encoder's low-frequency waveform coding module. The inverse redundancy processing module 1002 then checks the flag side information indicating whether inverse low-frequency redundancy processing is required: if the flag indicates no inverse processing, the inverse-quantised low-frequency spectrum is left unchanged; otherwise, the inverse low-frequency redundancy processing is applied to the inverse-quantised low-frequency spectrum.
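Assuming the encoder used vector quantisation (as in the embodiment described later), the inverse quantisation module's core operation can be sketched as a codebook lookup; codebook contents and indices here are illustrative:

```python
import numpy as np

def inverse_quantize(indices, codebook):
    """Sketch of inverse vector quantisation: look up each codeword
    index in the fixed codebook and concatenate the vectors, in
    order, into the inverse-quantised low-frequency spectrum."""
    return np.concatenate([codebook[i] for i in indices])
```

The result is the spectrum handed to the inverse redundancy processing module 1002 when the redundancy flag is set.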
Fig. 11 is a structural block diagram of the high-frequency parameter decoding module 906 shown in Fig. 9. As shown in Fig. 11, the high-frequency parameter decoding module 906 comprises a tonality adjuster 1101 and a gain adjuster 1102.
The tonality adjuster 1101 divides the mapped high-frequency spectrum into multiple bands, using the same division as the tonality parameter extractor 401 in the encoder's high-frequency parameter coding module 108, and then acts according to the tonality-adjustment-type side information. If the adjustment type is "no adjustment", the mapped spectrum is left unchanged. If the adjustment type is "add noise", the adjustment-parameter side information is dequantised, the energy of the noise to be added is computed from the dequantised value, and noise of that energy is added to the corresponding band of the mapped spectrum. If the adjustment type is "add tone", the adjustment-parameter side information is dequantised, the energy of the tone to be added is computed from the dequantised value, and a tone of that energy is added at the centre of the corresponding band of the mapped spectrum; when adding a tone, its phase is kept continuous across consecutive frames. The gain adjuster 1102 divides the time-frequency plane into multiple regions according to the transient-position side information, using the same region division as the gain parameter extractor 402 in the high-frequency parameter coding module 108; it then obtains the target energy of each region from the gain-adjustment-parameter side information and adjusts the energy of each region to match that region's target energy.
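The gain adjuster's energy matching can be sketched as the decoder-side counterpart of the encoder's region-gain extraction: scaling each region by the square root of its transmitted gain multiplies the region energy by that gain. The function and edge list below are illustrative:

```python
import numpy as np

def apply_region_gains(mapped_high, band_edges, gains):
    """Sketch of the gain adjuster 1102: scale each region of the
    mapped high spectrum so its energy is multiplied by the
    transmitted gain, reaching the region's target energy."""
    out = np.array(mapped_high, dtype=float)
    for (lo, hi), g in zip(zip(band_edges[:-1], band_edges[1:]), gains):
        out[lo:hi] *= np.sqrt(g)
    return out
```

With the encoder computing gain = E_orig / E_mapped per region, this restores each region to the original band's energy.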
The monophonic sound decoding method according to the preferred embodiment of the invention is described in detail below. The method comprises the following steps:
Step 21: demultiplex the coded sound stream to obtain the low-frequency coded data, the high-frequency parameter coded data and all side information used in decoding;
Step 22: according to the low-frequency coded data and side information, inverse-quantise and decode the low-frequency coded data, then apply the inverse low-frequency redundancy processing to obtain the decoded MDCT-domain low-frequency spectrum;
Step 23: convert the inverse-quantised low-frequency spectrum from the MDCT domain to the MDFT domain, obtaining the MDFT-domain low-frequency spectrum;
Step 24: map specific bands of the MDFT-domain low-frequency spectrum to specific high-frequency bands;
Step 25: apply MDFT-domain boundary preprocessing to the mapped high-frequency spectrum, obtaining the boundary-preprocessed MDFT-domain mapped high-frequency spectrum;
Step 26: according to the boundary-preprocessed MDFT-domain mapped high-frequency spectrum, perform parameter decoding of the high-frequency parameters, obtaining the decoded MDFT-domain high-frequency spectrum;
Step 27: apply MDFT-domain boundary postprocessing to the decoded MDFT-domain high-frequency spectrum, obtaining the boundary-postprocessed MDFT-domain high-frequency spectrum;
Step 28: combine the decoded MDFT-domain low-frequency and high-frequency spectra and apply the IMDFT, obtaining the decoded time-domain signal;
Step 29: resample the decoded time-domain signal, converting its sampling rate to a sampling frequency suitable for sound playback.
In step 22, low frequency signal decoding comprises two stages: low frequency de-quantization and the inverse of the low frequency redundancy processing. First, the low frequency coded data is de-quantized and decoded, obtaining the de-quantized low frequency spectrum. The side information then indicates whether low frequency redundancy processing was applied to this frame at the encoder; if so, the inverse redundancy processing is applied to the de-quantized low frequency spectrum, otherwise the spectrum is left unchanged.
The low frequency de-quantization and inverse redundancy processing correspond to the inverses of the low frequency signal coding methods. If the low frequency coding embodiment uses vector quantization, the corresponding de-quantization obtains the codeword indices from the bit stream and looks up the corresponding vectors in the fixed codebook; the vectors are concatenated in order to form the de-quantized low frequency spectrum. The side information indicates whether the encoder applied low frequency redundancy processing. If not, no inverse redundancy processing is applied to the de-quantized low frequency spectrum. If so, the side information further indicates which redundancy processing method the encoder used: if the encoder used the DCT method, the decoder applies an 8×8 IDCT as the inverse redundancy processing; if the encoder used the LPC method, the decoder de-quantizes the LPC model parameters to obtain the linear prediction parameters and applies inverse filtering to the low frequency residual spectrum.
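The codebook lookup described above can be sketched as follows. This is an illustrative Python sketch only; the codebook contents, vector dimension and function names are toy assumptions, not values from the patent.

```python
import numpy as np

def vq_dequantize(indices, codebook):
    """Reconstruct the low frequency spectrum from codeword indices:
    each index selects one vector from the fixed codebook, and the
    vectors are concatenated in order."""
    return np.concatenate([codebook[i] for i in indices])

# Toy 4-entry codebook of 2-dimensional vectors (illustrative only).
codebook = np.array([[0.0, 0.0],
                     [1.0, -1.0],
                     [0.5, 0.5],
                     [-1.0, 1.0]])
low_spectrum = vq_dequantize([2, 1, 0], codebook)  # 6 de-quantized bins
```

In a real decoder the indices would come from the demultiplexed bit stream and the codebook would be the one fixed at design time.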
In step 23, several methods for the MDCT-to-MDFT conversion are known. For example: reconstruct the time-domain signal from the MDCT coefficients, apply an MDST to obtain the MDST coefficients, and combine the MDCT and MDST coefficients into MDFT coefficients; or reconstruct the time-domain signal from the MDCT coefficients and apply an MDFT transform directly to obtain the MDFT coefficients; or establish the relation between the MDCT coefficients of the current, previous and next frames and the MDST coefficients of the current frame, determine the three transform matrices that compute the current frame's MDST coefficients from these three frames of MDCT coefficients, thereby obtaining the MDST coefficients directly from the MDCT coefficients, and finally combine the MDCT and MDST coefficients into MDFT coefficients. The MDCT-to-MDFT conversion in this step uses the same method already introduced in the embodiment of the coding method of the monophonic coding device of the present invention, and is therefore not repeated here.
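To illustrate why the MDFT coefficients carry phase information, the following Python sketch evaluates a complex MDFT kernel on one time block and checks that its real part is the plain MDCT of the same block (the imaginary part is the negated MDST). The kernel phase pi/(4M)*(2n+1+M)*(2k+1) is a standard MDCT-type phase; block length and signal are illustrative assumptions.

```python
import numpy as np

def mdft(x):
    """Complex MDFT of one 2M-sample block: the real part is the MDCT,
    the imaginary part the negated MDST, so phase is preserved."""
    two_m = len(x)
    m = two_m // 2
    n = np.arange(two_m)
    k = np.arange(m)[:, None]
    phase = np.pi / (4 * m) * (2 * n + 1 + m) * (2 * k + 1)
    return np.sum(x * np.exp(-1j * phase), axis=1)

# Toy block: one sinusoid, 16 samples (M = 8).
x = np.sin(2 * np.pi * 5 * np.arange(16) / 16)
X = mdft(x)

# The real part equals the plain MDCT of the same block.
n = np.arange(16)
k = np.arange(8)[:, None]
mdct_ref = np.sum(x * np.cos(np.pi / 32 * (2 * n + 1 + 8) * (2 * k + 1)), axis=1)
assert np.allclose(X.real, mdct_ref)
```

The three-matrix method of the patent avoids the explicit time-domain reconstruction shown here, but produces the same kind of complex coefficients.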
In step 24, various low-to-high spectrum mapping methods are known, such as folded mapping, linear mapping and frequency-doubling mapping. Taking linear mapping as an example, the low-to-high spectrum mapping works as follows. Suppose the low frequency spectrum of the original signal covers the range [0, F_L] and the high frequency spectrum covers [F_L, F_S], where 2×F_L < F_S < 3×F_L, as shown in Fig. 5 a). The spectrum obtained by linear mapping is then as shown in Fig. 5 b).
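A minimal sketch of the mapping, under one plausible reading of "linear mapping": the low band is shifted up onto the high band, and repeated where the high band (F_S − F_L, which exceeds F_L when F_S > 2×F_L) is wider than the low band. The patent's exact mapping variant may differ; all names are illustrative.

```python
import numpy as np

def linear_map(low, n_high):
    """Fill n_high high-band bins by shifting the low band up,
    repeating the low band when the high band is wider (one plausible
    reading of linear mapping; not verbatim from the patent)."""
    n_low = len(low)
    reps = int(np.ceil(n_high / n_low))
    return np.tile(low, reps)[:n_high]

low = np.arange(8.0)        # bins covering [0, F_L)
high = linear_map(low, 12)  # bins covering [F_L, F_S) with F_S = 2.5*F_L
```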
In step 25, various MDFT-domain boundary preprocessing methods are known. For example, frequency-domain windowing and similar methods mitigate the spectral truncation problem caused by band division; harmonic interference elimination and similar methods mitigate the harmonic interference noise at the splice points introduced by the low-to-high spectrum mapping; and MDFT-domain joint processing, harmonic extraction, MPEG-2 Layer 3 butterfly-transform aliasing elimination, deconvolution and similar methods mitigate the side-lobe leakage and aliasing noise caused by non-ideal prototype filter performance. The MDFT-domain boundary preprocessing in this step uses the same method already introduced in the embodiment of the coding method of the monophonic coding device of the present invention, and is therefore not repeated here.
In step 26, the high-frequency parameter decoding method may comprise the following steps:
Step 26-1: compute the energy of each region of the time-frequency plane of the mapped high frequency spectrum; the region division is consistent with the encoder.
Step 26-2: obtain the tonality adjustment type from the bit stream demultiplexing module 901; if the adjustment type is "no adjustment", go to step 26-4, otherwise go to step 26-3.
Step 26-3: obtain the tonality adjustment parameter from the bit stream demultiplexing module 901 and de-quantize it, then apply tonality adjustment to the preprocessed mapped spectrum according to the de-quantized tonality adjustment parameter.
Step 26-4: obtain the quantized gain of each region of the time-frequency plane from the bit stream demultiplexing module 901, and after de-quantization adjust the gain of each region of the high frequency spectrum output by step 26-1 or step 26-3 so that the energy of each adjusted region equals its target energy, forming the high frequency spectrum of the signal.
In step 26-2, after obtaining the adjustment type of each high frequency region and the de-quantized adjustment parameter, the tonality of the mapped high frequency spectrum is adjusted. Let Eest be the energy of the mapped band; the adjustment is handled in two cases according to the de-quantized adjustment parameter:
Case 1: when the adjustment type is "add tone", a tone is added at the centre of the band, with energy computed from Eest and the de-quantized adjustment parameter, and the phase of the added tone is kept continuous across consecutive frames;
Case 2: when the adjustment type is "add noise", noise with energy computed from Eest and the de-quantized adjustment parameter is added, with random phase.
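The two adjustment cases can be sketched as follows. This is an illustrative Python sketch; it assumes the target energies have already been derived from the de-quantized side information (the patent's exact energy formulas are not reproduced here), and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def adjust_band(band, adj_type, energy, prev_phase=0.0):
    """Apply the decoded tonality adjustment to one mapped high band.

    "noise": add random-phase noise normalized to the given energy.
    "tone":  add a single complex tone of the given energy at the band
             centre, reusing the previous frame's phase so the tone
             stays phase-continuous across frames."""
    band = band.copy()
    if adj_type == "noise":
        noise = rng.standard_normal(len(band)) * \
            np.exp(1j * rng.uniform(0, 2 * np.pi, len(band)))
        noise *= np.sqrt(energy / np.sum(np.abs(noise) ** 2))
        band += noise
    elif adj_type == "tone":
        band[len(band) // 2] += np.sqrt(energy) * np.exp(1j * prev_phase)
    return band

toned = adjust_band(np.zeros(8, dtype=complex), "tone", 4.0)
```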
In step 27, the MDFT-domain boundary post-processing corresponds to the MDFT-domain boundary preprocessing of step 25. When frequency-domain windowing, deconvolution, MPEG-2 Layer 3 butterfly-transform aliasing elimination or harmonic extraction is used, post-processing corresponding to the preprocessing of step 25 is required; when harmonic interference elimination or MDFT-domain joint processing is used, no corresponding post-processing is required.
Corresponding to the specific implementation of the MDFT-domain boundary processing at the encoder in the present patent, the MDFT-domain boundary post-processing is described below taking frequency-domain windowing as an example.
The MDFT-domain boundary post-processing for frequency-domain windowing comprises two steps: frequency windowing and frequency-domain overlap reconstruction.
Step 27-1a: frequency windowing. Apply a window to S′_m(l), obtaining the windowed MDFT-domain high frequency spectrum coefficients S_m(l):
S_m(l) = S′_m(l) · w(l), l = 0, …, 2M
Step 27-1b: frequency-domain window reconstruction. Overlap-add adjacent S_m(l) to reconstruct the high frequency spectrum coefficients S(k) after MDFT-domain boundary post-processing.
In step 28, the IMDFT transform corresponds to the MDFT transform at the encoder. Taking the inverse modified discrete Fourier transform (IMDFT) as an example, the frequency-to-time mapping comprises three steps: the IMDFT transform, time-domain windowing and time-domain overlap-add.
First, the IMDFT is applied to the de-quantized spectrum, obtaining the transformed time-domain signal sr(n). The IMDFT is given by:
sr(n) = (1/(2M)) · Σ_{k=0}^{2M−1} S(k) · exp(−i · π/(4M) · (2n + 1 + M) · (2k + 1))
Before the IMDFT, S(k) must be extended to length 2M:
S(k) = −conj(S(2M−1−k)), k = M, …, 2M−1
where n is the sample index, 2M is the frame length in time-domain samples, taking the value 2048 or 256, k is the spectral index, and conj denotes complex conjugation.
Second, the time-domain signal obtained by the IMDFT is windowed in the time domain. To satisfy the perfect reconstruction condition of the filter bank, the window function w(n) must satisfy two conditions: w(2M−1−n) = w(n) and w^2(n) + w^2(n+M) = 1.
Typical window functions include the sine window and the KBD window. Alternatively, a biorthogonal transform can be used, in which specific analysis and synthesis filters relax the above constraints on the window function.
Finally, overlap-add is applied to the windowed time-domain signal, obtaining the time-domain audio signal. Specifically, the first M samples of the current frame's windowed signal are added to the last M samples of the previous frame's signal, producing the M output time-domain audio samples: timeSam_{i,n} = preSam_{i,n} + preSam_{i−1,n+M}, where i is the frame index and n is the sample index, 0 ≤ n < M.
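The three synthesis stages (inverse transform, sine windowing, 50% overlap-add) can be sketched as follows. This is an illustrative Python outline, not a verified perfect-reconstruction implementation of the patent's filter bank; the frame layout, the sine window choice and all names are assumptions.

```python
import numpy as np

def imdft_synthesize(frames_spec, m):
    """Inverse-transform, window and overlap-add a list of frames.

    Each entry of frames_spec holds 2M complex bins, already extended
    by S(k) = -conj(S(2M-1-k)). Output length is M*(num_frames + 1)."""
    n = np.arange(2 * m)
    k = np.arange(2 * m)[:, None]
    # Kernel with the analysis phase pi/(4M)*(2n+1+M)*(2k+1).
    kernel = np.exp(-1j * np.pi / (4 * m) * (2 * n + 1 + m) * (2 * k + 1))
    # Sine window: satisfies w(2M-1-n) = w(n) and w^2(n)+w^2(n+M) = 1.
    window = np.sin(np.pi * (n + 0.5) / (2 * m))
    out = np.zeros(m * (len(frames_spec) + 1))
    for i, spec in enumerate(frames_spec):
        frame = (spec[:, None] * kernel).sum(axis=0).real / (2 * m)
        frame *= window
        out[i * m:i * m + 2 * m] += frame  # 50% overlap-add
    return out

pcm = imdft_synthesize([np.zeros(16, dtype=complex)], 8)
```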
In step 29, the resampling is implemented in the same way as at the coding device. Note that if the sample rate of the time-domain signal after the IMDFT is already suitable for sound playback, the resampling operation may be omitted.
The stereo coding device and method of the preferred embodiment of the present invention are introduced below.
Figure 12 is a block diagram of the stereo coding device of the preferred embodiment of the present invention. The stereo coding device comprises: a resampling module 1201, a sum-signal type analysis module 1202, an MDCT module 1203, a low frequency stereo coding module 1204, an MDCT-to-MDFT conversion module 1205, a low-to-high spectrum mapping module 1206, an MDFT-domain boundary preprocessing module 1207, a high-frequency parameter coding module 1208 and a bit stream multiplexing module 1209.
First, the connections and functions of the modules in Figure 12 are summarized:
Resampling module 1201: converts the input two-channel digital audio signal from the original sample rate to the target sample rate, and outputs the resampled signals of the two channels frame by frame to the sum-signal type analysis module 1202 and the MDCT module 1203. Note that if the input two-channel digital audio signal already has the target sample rate, a coding device according to the principles of the present invention need not include this module, and the two-channel digital audio signal can be input directly to the sum-signal type analysis module 1202 and the MDCT module 1203.
Sum-signal type analysis module 1202: computes the sum signal from the left and right channels (L, R) of the resampled stereo signal, performs signal type analysis on this sum signal, and outputs the sum-signal type analysis result. Owing to the complexity of the signal itself, the sum-signal type can take several forms. For example, if the frame's sum signal is a slowly varying signal, a flag indicating a slowly varying signal is output directly; if it is a transient signal, the position of the transient is additionally computed, and a flag indicating a transient signal is output together with the transient position. The analysis result controls the order of the MDCT in the MDCT module 1203 and is also output to the bit stream multiplexing module 1209. Note that if a closed-loop search is used to determine the sum-signal type analysis result, a sound coding device according to the present invention need not include this module.
MDCT module 1203: according to the sum-signal type analysis result from the sum-signal type analysis module 1202, applies MDCTs of different orders to map the resampled sound signals of the two channels to the MDCT domain, and outputs the MDCT-domain coefficients of the two channels to the low frequency stereo coding module 1204 and the MDCT-to-MDFT conversion module 1205. If a stereo coding device according to the principles of the present invention does not include the sum-signal type analysis module 1202, the MDCT order is not controlled adaptively. Specifically, if the frame's sum signal is slowly varying, the sound signals of the two channels are each transformed frame by frame with a longer-order MDCT; if it is a transient signal, the sound signals of the two channels are divided into subframes and each transformed subframe by subframe with a shorter-order MDCT. The MDCT-domain coefficients of each channel are divided into a low frequency spectrum and a high frequency spectrum; the low frequency spectra of the two channels are output to the low frequency stereo coding module 1204, and the low and high frequency spectra of the two channels together with the sum-signal type analysis result are output to the MDCT-to-MDFT conversion module 1205.
Low frequency stereo coding module 1204: receives the MDCT-domain low frequency spectra of the two channels from the MDCT module 1203, divides the low frequency spectrum into several sub-bands, applies a stereo coding mode to each sub-band to obtain the low frequency stereo coded data, and outputs it to the bit stream multiplexing module 1209 as the sound coding data of the sound coding bit stream. The stereo coding modes comprise the sum/difference stereo coding mode, the parametric stereo coding mode and the parametric error stereo coding mode; for stereo coding, each sub-band selects one of these three modes. The coding mode selection information is simultaneously output to the bit stream multiplexing module 1209 as side information.
MDCT-to-MDFT conversion module 1205: receives the MDCT-domain coefficients of the two channels from the MDCT module 1203, converts them into MDFT-domain coefficients of the two channels that include phase information, and outputs the MDFT-domain coefficients of the two channels to the low-to-high spectrum mapping module 1206 and the MDFT-domain boundary preprocessing module 1207.
Low-to-high spectrum mapping module 1206: receives the low frequency spectra of the two channels from the MDCT-to-MDFT conversion module 1205, maps specific bands of the low frequency spectra of the two channels to specific bands of the high frequency spectra of the two channels, obtaining the mapped high frequency spectra of the two channels, and inputs the mapped high frequency spectra of the two channels to the MDFT-domain boundary preprocessing module 1207. The time-frequency plane after mapping is identical to the original time-frequency plane.
MDFT-domain boundary preprocessing module 1207: receives the MDFT-domain high frequency spectra of the two channels from the MDCT-to-MDFT conversion module 1205 and the mapped high frequency spectra of the two channels from the low-to-high spectrum mapping module 1206, performs boundary preprocessing on both, and outputs the boundary-preprocessed MDFT-domain high frequency spectra and mapped high frequency spectra of the two channels to the high-frequency parameter coding module 1208.
High-frequency parameter coding module 1208: receives the boundary-preprocessed MDFT-domain high frequency spectra and mapped high frequency spectra of the two channels from the MDFT-domain boundary preprocessing module 1207, and extracts from them the high-frequency parameters of the two channels, which are used to recover the high frequency spectra of the two channels from their low frequency spectra. The module then quantizes and codes the extracted high-frequency parameters of the two channels, obtaining the high-frequency parameter coded data of the two channels, which is output to the bit stream multiplexing module 1209 as side information.
Bit stream multiplexing module 1209: multiplexes the sound coding data and side information received from the sum-signal type analysis module 1202, the low frequency stereo coding module 1204 and the high-frequency parameter coding module 1208, forming the stereo sound coding bit stream.
In the present embodiment, the MDCT module 1203, the MDCT-to-MDFT conversion module 1205, the low-to-high spectrum mapping module 1206, the MDFT-domain boundary preprocessing module 1207 and the high-frequency parameter coding module 1208 process the left and right channels of the stereo signal separately, and their processing is identical to that of the modules of the same name in the monophonic sound coding device. Each of these modules can therefore be realized by combining the modules of the same name from two monophonic sound coding devices, thereby implementing the stereo processing.
It can thus be seen that the difference from the monophonic sound coding device of the preferred embodiment of the present invention is that, when generating the sound coding data of the sound coding bit stream, the monophonic device uses the low frequency waveform coding module 104, whereas the stereo coding device uses the low frequency stereo coding module 1204, which divides the low frequency spectrum into sub-bands and stereo-codes each sub-band.
The stereo coding method according to the preferred embodiment of the present invention is described in detail below. The method comprises the following steps:
Step 31: resample each of the input two-channel digital audio signals.
Step 32: compute the sum signal from the resampled sound signals of the two channels and perform signal type analysis on it; if it is a slowly varying signal, the signal type alone is taken as the sum-signal type analysis result; if it is a transient signal, additionally compute the position of the transient, and take the signal type together with the transient position as the sum-signal type analysis result.
Step 33: according to the sum-signal type analysis result, apply MDCTs of different orders to the resampled sound signals of the two channels, obtaining the MDCT-domain sound signals of the two channels.
Step 34: divide the MDCT-domain coefficients of each channel into a low frequency spectrum and a high frequency spectrum.
Step 35: divide the MDCT-domain low frequency spectrum of each channel into several sub-bands and stereo-code each sub-band, obtaining the low frequency stereo coded data.
Step 36: convert the MDCT-domain coefficients of the two channels into MDFT-domain coefficients, obtaining the MDFT-domain low frequency spectra and high frequency spectra of the two channels.
Step 37: map specific bands of the MDFT-domain low frequency spectra of the two channels to specific bands of the high frequencies of the two channels, forming the mapped high frequency spectra of the two channels.
Step 38: perform MDFT-domain boundary preprocessing on the high frequency spectra of the two channels and on the mapped high frequency spectra of the two channels, obtaining the boundary-preprocessed versions of both.
Step 39: from the boundary-preprocessed high frequency spectra and mapped high frequency spectra of the two channels, extract the high-frequency parameters used to recover the high frequency spectra of the two channels from their low frequency spectra, and quantize and code the high-frequency parameters of the two channels, obtaining the high-frequency parameter coded data of the two channels.
Step 40: multiplex the low frequency stereo coded data, the high-frequency parameter coded data of the two channels and the side information, obtaining the stereo sound coding bit stream.
The resampling method of step 31, the signal type analysis of step 32, the MDCT of step 33, the MDCT-to-MDFT conversion of step 36, the low-to-high spectrum mapping of step 37, the MDFT-domain boundary preprocessing of step 38 and the high-frequency parameter coding of step 39 were all introduced in the embodiment of the coding method of the monophonic coding device of the present invention; the coding method of the stereo coding device uses the same methods, which are therefore not repeated.
The low frequency stereo coding of step 35 proceeds as follows: first, the low frequency spectrum of each of the two channels is divided into several sub-bands; then, for each sub-band, one of three coding modes, namely the sum/difference stereo coding mode, the parametric stereo coding mode and the parametric error stereo coding mode, is selected, and the spectra of the two channels in that sub-band are coded. The sub-band division is applied alike to the low frequency spectra of both channels. Two implementation methods for the coding mode selection are given first:
Coding mode selection implementation method 1: code and decode the low frequency spectra of the two channels with each of the three coding modes, using the same number of bits; compute the error between the decoded low frequency spectra of the two channels and the low frequency spectra before coding; and select the mode with the smallest error as the stereo coding mode. The coding mode selection information is output to the bit stream multiplexing module 1209 as side information.
Coding mode selection implementation method 2: for lower sub-bands whose frequency lies below a fixed value, for example sub-bands below 1 kHz, code and decode with the sum/difference stereo coding mode and the parametric stereo coding mode respectively, compute the error between the recovered low frequency spectra of the two channels and the low frequency spectra before coding, select the mode with the smaller error, and output the coding mode selection information to the bit stream multiplexing module 1209 as side information; for higher sub-bands above this value, for example sub-bands above 1 kHz, use the parametric stereo coding mode. In this case, the selection information for the parametric stereo coding mode may or may not be output to the bit stream multiplexing module 1209.
Of course, a fixed stereo coding mode may also be used in practice, in which case no coding mode selection information needs to be output to the bit stream multiplexing module 1209 as side information.
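The closed-loop selection of implementation method 1 can be sketched as follows. This Python sketch uses toy round-trip coders as stand-ins for the three stereo modes; the function names and the toy coders are assumptions for illustration only.

```python
import numpy as np

def pick_mode(l, r, coders):
    """Closed-loop mode selection: run every mode's encode/decode
    round trip on the sub-band and keep the mode whose decoded
    spectra are closest (least squared error) to the input."""
    best, best_err = None, np.inf
    for name, roundtrip in coders.items():
        l_hat, r_hat = roundtrip(l, r)
        err = np.sum((l - l_hat) ** 2 + (r - r_hat) ** 2)
        if err < best_err:
            best, best_err = name, err
    return best

# Toy round trips: "ms" is lossless here, "parametric" keeps only means.
coders = {
    "ms": lambda l, r: (l.copy(), r.copy()),
    "parametric": lambda l, r: (np.full_like(l, l.mean()),
                                np.full_like(r, r.mean())),
}
l = np.array([1.0, 2.0, 3.0])
r = np.array([1.0, 0.0, -1.0])
chosen = pick_mode(l, r, coders)
```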
The implementations of the three stereo coding modes are described in detail below.
Figure 13 is the model of the sum/difference stereo coding mode. In this mode, a sum spectrum and a difference spectrum are computed in each sub-band from the low frequency spectra of the two channels in that sub-band. A specific implementation is as follows:
From the spectra L and R of the left and right channels, the corresponding sum spectrum M and difference spectrum S are computed; after waveform quantization and coding, M and S are output to the bit stream multiplexing module 1209 as low frequency stereo coded data. M and S are computed as:
M = (L + R)/2
S = (L − R)/2
The waveform quantization coding of M and S can use the method by which the low frequency waveform coding module 104 of the monophonic sound coding device quantizes and codes the low frequency spectrum.
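A minimal sketch of the sum/difference mode: M and S are formed as above, and the decoder inverts them exactly (the waveform quantization step is omitted; function names are illustrative).

```python
import numpy as np

def ms_encode(l, r):
    """Sum/difference stereo: M = (L+R)/2, S = (L-R)/2."""
    return (l + r) / 2, (l - r) / 2

def ms_decode(m, s):
    """Invert the sum/difference: L = M+S, R = M-S."""
    return m + s, m - s

l = np.array([0.8, -0.2, 0.5])
r = np.array([0.4, -0.2, -0.5])
m, s = ms_encode(l, r)
l2, r2 = ms_decode(m, s)
```

Without quantization the round trip is lossless, which is why this mode is preferred when the channels differ strongly and parametric modelling would be inaccurate.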
Figure 14 is the model of the parametric stereo coding mode. In this mode, a mono spectrum is computed in sub-band k from the low frequency spectra of the two channels in that sub-band, together with the parameters used to recover the low frequency spectra of the two channels in sub-band k from that mono spectrum. Two specific implementations of parametric stereo coding are given below.
Parametric stereo coding implementation method 1 comprises the following steps:
Step 35-1a: in sub-band k, for one channel, say the right channel R, compute the channel weighting parameter g_r(k) and the scaled spectrum R′ of this channel, such that R′ and L have equal energy after scaling. g_r(k) can be computed as:
g_r(k) = sqrt(E_R(k) / E_L(k))
where E_R(k) and E_L(k) are the energies of the right and left channels in sub-band k.
Step 35-1b: for each frequency point i in sub-band k, compute the weighted sum spectrum M′ and the weighted difference spectrum S′ of this frequency point. After scaling, the energy ratio of the left and right channels at each frequency point in sub-band k is statistically approximately the same, so L and R′ are approximately equal in energy, and the weighted sum spectrum M′ and weighted difference spectrum S′ are therefore nearly orthogonal. The computation is:
M′ = (L + R′)/2 = [L + (1/g_r(k)) · R]/2
S′ = (L − R′)/2
Step 35-1c: produce a quadrature spectrum D of equal amplitude and orthogonal to the weighted sum spectrum M′; from the quadrature spectrum D and the weighted difference spectrum S′, compute the quadrature spectrum weighting parameter g_d(k), such that the quadrature spectrum scaled by g_d(k) has the same energy as S′. g_d(k) can be computed as:
g_d(k) = sqrt(E_S(k) / E_D(k))
where E_S(k) and E_D(k) are the energies of the weighted difference spectrum S′ and the quadrature spectrum D in sub-band k.
Step 35-1d: the weighted sum spectrum M′ and the parameters g_r(k) and g_d(k) are quantized, coded and output to the bit stream multiplexing module 1209, the quantized M′ as low frequency stereo coded data and the quantized g_r(k) and g_d(k) as side information.
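The encoder side of implementation method 1 can be sketched as follows. The square-root gain formulas and the choice of j·M′ as the equal-amplitude quadrature spectrum are reconstructed readings of the patent's flattened formulas, not verbatim; quantization is omitted and all names are illustrative.

```python
import numpy as np

def ps_encode(l, r):
    """Parametric stereo (method 1): scale R to L's energy via
    g_r = sqrt(E_R/E_L), form weighted sum/difference spectra, and
    keep only M' plus the gains g_r and g_d, where g_d scales a
    spectrum orthogonal to M' to the energy of S'."""
    e_l = np.sum(np.abs(l) ** 2)
    e_r = np.sum(np.abs(r) ** 2)
    g_r = np.sqrt(e_r / e_l)
    r_scaled = r / g_r                 # now same energy as l
    m = (l + r_scaled) / 2             # weighted sum spectrum M'
    s = (l - r_scaled) / 2             # weighted difference spectrum S'
    d = 1j * m                         # equal-amplitude quadrature spectrum
    g_d = np.sqrt(np.sum(np.abs(s) ** 2) / np.sum(np.abs(d) ** 2))
    return m, g_r, g_d

l = np.array([1 + 1j, 2 + 0j])
r = np.array([0.5j, 1 + 0j])
m, g_r, g_d = ps_encode(l, r)
```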
Relative to specific implementation method 1, the parameter g in parameter stereo coding specific implementation method 2 r(k), g d(k) and weighted sum frequency spectrum obtain according to error minimum principle, comprise following steps:
Step 35-2a: for sub-band k, according to formula below, calculates first parameter g d(k):
g d ( k ) = - b ( k ) + b 2 ( k ) + a 2 ( k ) a ( k )
Wherein,
a ( k ) = &Sigma; i &Element; band ( k ) ( x r [ i , k ] y l [ i , k ] - x l [ i , k ] y r [ i , k ] ) ,
b ( k ) = &Sigma; i &Element; band ( k ) ( x l [ i , k ] x r [ i , k ] + y l [ i , k ] y r [ i , k ] )
Wherein, x land y lbe respectively real part and the imaginary part of L channel low frequency spectrum, x rand y rbe respectively real part and the imaginary part of R channel low frequency spectrum;
Step 35-2b: for sub-band k, according to formula below, calculates second parameter g r(k):
g_r(k) = [−(c(k) − d(k)) + sqrt((c(k) − d(k))² + g(k)·m²(k))] / (g(k)·m²(k))
where
c(k) = Σ_{i∈band(k)} ( x_l[i,k]² + y_l[i,k]² ),
d(k) = Σ_{i∈band(k)} ( x_r[i,k]² + y_r[i,k]² ),
m(k) = [2b(k)·(1 − g_d²(k)) + 2a(k)·g_d(k)] / (1 + g_d²(k))
Step 35-2c: for each frequency point i in sub-band k, calculate the weighted sum spectrum M′ according to the following formulas:
x_m[i,k] = [ x_l[i,k] + g_d(k)·y_l[i,k] + g(k)·g_r(k)·( x_r[i,k] − g_d(k)·y_r[i,k] ) ] / [ (1 + g_d²(k))·(1 + g(k)·g_r²(k)) ]
y_m[i,k] = [ −g_d(k)·x_l[i,k] + y_l[i,k] + g(k)·g_r(k)·( g_d(k)·x_r[i,k] + y_r[i,k] ) ] / [ (1 + g_d²(k))·(1 + g(k)·g_r²(k)) ]
where x_m and y_m are the real and imaginary parts of the weighted sum spectrum M′, and g(k) is the importance factor of parameter stereo coding within sub-band k, reflecting the distribution of the parameter stereo coding error between the left and right channels. It can be chosen according to the signal characteristics; for example, g(k) may equal the ratio of the left-channel energy to the right-channel energy in sub-band k, E_l(k)/E_r(k).
Step 35-2d: the weighted sum spectrum M′, g_r(k) and g_d(k) above are quantization-encoded and output to bit stream multiplexing module 1209. The quantization-encoded M′ constitutes the low-frequency stereo coding data, while the quantization-encoded g_r(k) and g_d(k) are side information.
Figure 15 is the model diagram of the parameter-error stereo coding mode. In this mode, a monaural spectrum and an error spectrum within a sub-band are calculated from the low frequency spectra of the two channels in that sub-band, together with the parameters needed to recover the low frequency spectra of the two channels in that sub-band from the monaural spectrum and the error spectrum.
Compared with the computation model of the parameter stereo coding mode, if higher coding precision is required, the parameter-error stereo coding mode can be adopted: the spectral error, i.e. the error spectrum E, is additionally calculated, and the error spectrum is also waveform-quantization-encoded. The specific implementation of the parameter-error stereo coding mode comprises the following steps:
Step 35-3a: for one channel in sub-band k, for example the right channel R, calculate the weighting parameter g_r(k) of that channel and obtain the scaled spectrum R′ of that channel. Because the energy ratio of the left and right channels at each frequency point i within the parameter-extraction band is statistically approximately the same, the energies of L and R′ are approximately equal, so the weighted sum spectrum M′ and the weighted difference spectrum S′ are nearly orthogonal. The computation of g_r(k) is identical to that in step 35-1a.
Step 35-3b: for each frequency point i in this sub-band, calculate the weighted sum spectrum M′ and the weighted difference spectrum S′ of that frequency point.
Step 35-3c: generate the quadrature spectrum D, which has the same magnitude as the weighted sum spectrum M′ and is perpendicular to it.
Step 35-3d: from the quadrature spectrum D and the weighted difference spectrum S′, calculate the weighting parameter g_d(k), and obtain the quadrature spectrum D′ scaled by g_d(k); the computation of g_d(k) is identical to that in step 35-1c.
Step 35-3e: obtain the error spectrum E as the difference between the weighted difference spectrum S′ and the scaled quadrature spectrum D′, i.e. E = S′ − D′.
Step 35-3f: the weighted sum spectrum M′, the error spectrum E, and the parameters g_r(k) and g_d(k) above are quantization-encoded and output to bit stream multiplexing module 1209. The quantization-encoded M′ and E constitute the low-frequency stereo coding data, while the quantization-encoded g_r(k) and g_d(k) are side information.
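A rough sketch of steps 35-3a through 35-3e follows, with a hypothetical function name and an assumed form of g_r(k) (sqrt(E_R/E_L), so that R/g_r(k) matches the left-channel energy). Note that, before quantization, transmitting E = S′ − D′ makes this mode exactly invertible:

```python
import math

def encode_subband_error(L, R):
    """Sketch of parameter-error stereo encoding (steps 35-3a..3e) for one sub-band.

    Returns the weighted sum spectrum M', the error spectrum E,
    and the parameters g_r(k), g_d(k).  g_r(k) = sqrt(E_R/E_L) is an assumption.
    """
    E_L = sum(abs(v) ** 2 for v in L)
    E_R = sum(abs(v) ** 2 for v in R)
    g_r = math.sqrt(E_R / E_L)                   # step 35-3a (assumed form of g_r(k))
    R_s = [v / g_r for v in R]                   # scaled right channel R'
    M = [(l + r) / 2 for l, r in zip(L, R_s)]    # step 35-3b: weighted sum spectrum M'
    S = [(l - r) / 2 for l, r in zip(L, R_s)]    #             weighted difference spectrum S'
    D = [1j * m for m in M]                      # step 35-3c: quadrature spectrum D
    g_d = math.sqrt(sum(abs(v) ** 2 for v in S) /
                    sum(abs(v) ** 2 for v in D)) # step 35-3d: g_d(k)
    E = [s - g_d * d for s, d in zip(S, D)]      # step 35-3e: error spectrum E = S' - D'
    return M, E, g_r, g_d
```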
The stereo decoding apparatus and method according to the preferred embodiment of the invention are introduced below.
Figure 16 is the structural block diagram of the stereo decoding apparatus according to the preferred embodiment of the invention. As shown in Figure 16, the stereo decoding apparatus of the preferred embodiment comprises: bit stream demultiplexing module 1601, low frequency stereo decoding module 1602, MDCT-to-MDFT conversion module 1603, low-to-high frequency spectrum mapping module 1604, MDFT-domain border preprocessing module 1605, high-frequency parameter decoding module 1606, MDFT-domain border post-processing module 1607, IMDFT transform module 1608 and resampling module 1609.
The connections and functions of the modules shown in Figure 16 are described in detail below.
Bit stream demultiplexing module 1601 demultiplexes the received sound coding stream to obtain the sound coding data and side information of the corresponding data frame. It outputs the corresponding coded data and side information to low frequency stereo decoding module 1602; this side information includes a flag indicating whether the inverse of the low frequency redundancy processing should be performed. The side information output to high-frequency parameter decoding module 1606 includes the tonality adjustment type, tonality adjustment parameters, gain adjustment parameters and the position at which a fast change occurs; the control signal output to IMDFT transform module 1608 is the signal type parameter. When the low frequency stereo coding module 1204 on the encoding side has output coding mode selection information, that coding mode selection information is also output, as side information, to low frequency stereo decoding module 1602 (not shown in Figure 16).
Low frequency stereo decoding module 1602 performs stereo decoding on the low frequency stereo coding data according to the coding mode selection information in the side information output by bit stream demultiplexing module 1601, obtains the low frequency spectra of the two channels, and sends them to IMDFT transform module 1608 and MDCT-to-MDFT conversion module 1603.
MDCT-to-MDFT conversion module 1603 receives the output of the low frequency stereo decoding module, converts the decoded low frequency spectral coefficients of the two channels from the MDCT domain to the MDFT domain, and outputs the MDFT-domain low frequency spectrum data of the two channels to low-to-high frequency spectrum mapping module 1604.
Low-to-high frequency spectrum mapping module 1604 maps part of the spectral data in the decoded MDFT-domain low frequency spectra of the two channels of this frame to the high-frequency parts of the two channels, obtaining the low-to-high mapped high frequency spectra of the two channels.
MDFT-domain border preprocessing module 1605 receives the output of low-to-high frequency spectrum mapping module 1604, performs border preprocessing on the low-to-high mapped MDFT-domain high frequency spectra of the two channels, and outputs the border-preprocessed mapped high frequency spectrum data of the two channels to high-frequency parameter decoding module 1606.
High-frequency parameter decoding module 1606 recovers the MDFT-domain high frequency spectra of the two channels from the mapped MDFT-domain high frequency spectra of the two channels received from MDFT-domain border preprocessing module 1605 and the high-frequency parameter coded data of the two channels output by bit stream demultiplexing module 1601.
MDFT-domain border post-processing module 1607 receives the output of high-frequency parameter decoding module 1606, performs border post-processing on the high frequency spectra of the two channels, and outputs the border-post-processed high frequency spectrum data of the two channels to IMDFT transform module 1608.
IMDFT transform module 1608 combines the MDFT-domain low frequency spectrum and high frequency spectrum of each of the two channels and performs the IMDFT transform; the IMDFT transform adopts different transform lengths according to the signal type side information, yielding the decoded stereo signal of this frame.
Resampling module 1609 converts the sampling frequency of the decoded stereo signal of this frame output by IMDFT transform module 1608 to a sampling frequency suitable for sound playback. Note that if the sampling frequency of the signal output by IMDFT transform module 1608 is already suitable for sound playback, this module need not be included in the sound decoding apparatus of the invention.
In the present embodiment, MDCT-to-MDFT conversion module 1603, low-to-high frequency spectrum mapping module 1604, MDFT-domain border preprocessing module 1605, high-frequency parameter decoding module 1606, MDFT-domain border post-processing module 1607, IMDFT transform module 1608 and resampling module 1609 each use two sets of the same-named modules of the monaural sound decoding apparatus to process the left-channel and right-channel signals respectively.
The stereo sound decoding method according to the preferred embodiment of the invention is described in detail below. The method comprises the following steps:
Step 41: demultiplex the sound coding bit stream to obtain the low frequency stereo coding data, the high-frequency parameter coded data of the two channels, and all side information used in decoding.
Step 42: perform stereo decoding on the low frequency stereo coding data according to the low frequency stereo coding mode selection information in the side information, obtaining the decoded MDCT-domain low frequency spectra of the two channels.
Step 43: convert the decoded low frequency spectra of the two channels from the MDCT domain to the MDFT domain, obtaining the MDFT-domain low frequency spectra of the two channels.
Step 44: map specific frequency bands of the decoded MDFT-domain low frequency spectra of the two channels to specific high-frequency bands of the two channels.
Step 45: perform MDFT-domain border preprocessing on the low-to-high mapped high frequency spectra of the two channels, obtaining the border-preprocessed mapped MDFT-domain high frequency spectra of the two channels.
Step 46: recover the MDFT-domain high frequency spectra of the two channels from the border-preprocessed mapped high frequency spectra and the high-frequency parameter coded data of the two channels, obtaining the decoded MDFT-domain high frequency spectra of the two channels.
Step 47: perform MDFT-domain border post-processing on the decoded MDFT-domain high frequency spectra of the two channels, obtaining the border-post-processed decoded high frequency spectra of the two channels.
Step 48: combine the decoded MDFT-domain low frequency spectrum and high frequency spectrum of each of the two channels and perform the IMDFT transform, obtaining the decoded stereo signal.
Step 49: resample the decoded stereo signal, converting its sampling rate to a sampling frequency suitable for sound playback.
The MDCT-to-MDFT conversion method in step 43, the low-to-high frequency spectrum mapping method in step 44, the MDFT-domain border preprocessing method in step 45, the high-frequency parameter decoding method in step 46, the MDFT-domain border post-processing method in step 47, the IMDFT transform method in step 48 and the resampling method in step 49 have all been introduced in the embodiment of the decoding method of the monaural decoding apparatus of the invention; the embodiment of the decoding method of the stereo decoding apparatus adopts the same methods, so they are not repeated here.
In step 42, stereo decoding is performed according to the coding mode selection information. Corresponding to coding mode selection implementation method 1, the decoding method decodes the low frequency stereo coding data of each sub-band according to the coding mode selection information. Corresponding to coding mode selection implementation method 2, the decoding method decodes the low frequency stereo coding data of each lower-frequency sub-band according to the coding mode selection information, while the parameter stereo decoding mode is adopted for the higher-frequency sub-bands. Low frequency stereo decoding comprises three stereo decoding modes.
In the sum and difference stereo decoding mode, the low frequency spectra of the two channels in a sub-band are recovered from the low-frequency sum spectrum and difference spectrum of that sub-band. The specific implementation is as follows:
Low frequency stereo decoding module 1602 inverse-quantizes and decodes the low frequency stereo coding data received from bit stream demultiplexing module 1601, obtaining the low-frequency sum spectrum M and difference spectrum S, and recovers the low frequency spectra of the left and right channels using the following formulas:
L = M + S
R = M − S
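The sum/difference decoding above amounts to a two-line reconstruction per sub-band; an illustrative sketch (function name assumed):

```python
def decode_subband_ms(M, S):
    """Sum/difference stereo decoding for one sub-band: L = M + S, R = M - S,
    where M and S are the inverse-quantized sum and difference spectra."""
    L = [m + s for m, s in zip(M, S)]
    R = [m - s for m, s in zip(M, S)]
    return L, R
```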
In the parameter stereo decoding mode, the low frequency spectra of the left and right channels in a sub-band are recovered from the weighted sum spectrum M′ in that sub-band received by low frequency stereo decoding module 1602 and the relevant parameters g_d(k) and g_r(k) in the side information. This corresponds to implementation methods 1 and 2 of the parameter stereo coding method on the encoding side; the decoding procedure for the two implementation methods is identical and comprises the following steps:
Step 42-1a: low frequency stereo decoding module 1602 inverse-quantizes and decodes the low frequency stereo coding data and relevant parameters received from bit stream demultiplexing module 1601, obtaining the weighted sum spectrum M′ and the parameters g_d(k) and g_r(k);
Step 42-1b: generate the quadrature spectrum D, which has the same magnitude as the weighted sum spectrum M′ and is perpendicular to it, where D[i,k] = −y_m[i,k] + j·x_m[i,k];
Step 42-1c: scale the quadrature spectrum D by the decoded parameter g_d(k), obtaining the scaled quadrature spectrum D′;
Step 42-1d: from the weighted sum spectrum M′ and the scaled quadrature spectrum D′, obtain the spectra of the left and right channels, where the spectrum of one channel (the right channel) is still in its scaled form. The computation formulas are as follows:
R′ = M′ + D′
L′ = M′ − D′
Step 42-1e: using the parameter g_r(k) obtained from the side information, scale the scaled channel back to its original magnitude, obtaining R.
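Steps 42-1b through 42-1e can be sketched as follows, keeping the channel assignment and signs exactly as printed above; the function name is an assumption, and the final rescaling assumes the encoder divided the right channel by g_r(k):

```python
def decode_subband_param(M, g_r, g_d):
    """Sketch of parameter stereo decoding (steps 42-1b..1e) for one sub-band.

    M: inverse-quantized weighted sum spectrum (complex values);
    g_r, g_d: inverse-quantized side-information parameters.
    """
    D = [-m.imag + 1j * m.real for m in M]  # 42-1b: D[i,k] = -y_m + j*x_m
    D_s = [g_d * d for d in D]              # 42-1c: scale by g_d(k)
    R_s = [m + d for m, d in zip(M, D_s)]   # 42-1d: scaled right channel R'
    L = [m - d for m, d in zip(M, D_s)]     #        left channel L'
    R = [g_r * v for v in R_s]              # 42-1e: undo the encoder's 1/g_r(k) scaling
    return L, R
```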
In the parameter-error stereo decoding mode, the left-channel and right-channel spectra in a sub-band are recovered from the weighted sum spectrum M′ and the error spectrum E of the sub-band obtained by low frequency stereo decoding module 1602, together with the corresponding parameters g_d(k) and g_r(k) in the side information. The specific implementation comprises the following steps:
Step 42-2a: low frequency stereo decoding module 1602 inverse-quantizes and decodes the low frequency stereo coding data and relevant parameters received from bit stream demultiplexing module 1601, obtaining the weighted sum spectrum M′, the error spectrum E and the parameters g_d(k) and g_r(k);
Step 42-2b: generate the quadrature spectrum D, which has the same magnitude as the weighted sum spectrum M′ and is perpendicular to it;
Step 42-2c: scale the quadrature spectrum D by the decoded parameter g_d(k), obtaining the scaled quadrature spectrum D′;
Step 42-2d: add the scaled quadrature spectrum D′ to the error spectrum E, recovering the weighted difference spectrum S′;
Step 42-2e: from the weighted sum spectrum M′ and the weighted difference spectrum S′, obtain the spectra of the left and right channels, where the spectrum of one channel (the right channel) is still in its scaled form;
Step 42-2f: using the parameter g_r(k), scale the scaled channel back to its original magnitude.
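Because the decoder regenerates the same quadrature spectrum from M′ that the encoder used, steps 42-2a through 42-2f invert the parameter-error encoding exactly when quantization is ignored. A sketch with an assumed function name; the sign convention L = M′ + S′, R′ = M′ − S′ matches the encoder-side definitions M′ = (L + R′)/2, S′ = (L − R′)/2:

```python
def decode_subband_error(M, E, g_r, g_d):
    """Sketch of parameter-error stereo decoding (steps 42-2b..2f) for one sub-band.

    M: inverse-quantized weighted sum spectrum; E: inverse-quantized error spectrum;
    g_r, g_d: inverse-quantized side-information parameters.
    """
    D = [1j * m for m in M]              # 42-2b: quadrature spectrum of M'
    D_s = [g_d * d for d in D]           # 42-2c: scale by g_d(k)
    S = [e + d for e, d in zip(E, D_s)]  # 42-2d: recovered weighted difference spectrum S'
    L = [m + s for m, s in zip(M, S)]    # 42-2e: L = M' + S'
    R_s = [m - s for m, s in zip(M, S)]  #        R' = M' - S'
    R = [g_r * v for v in R_s]           # 42-2f: undo the encoder's 1/g_r(k) scaling
    return L, R
```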
Obviously, many changes may be made to the invention described here without departing from its true spirit and scope. Therefore, all changes that are apparent to those skilled in the art should be included within the scope covered by the claims. The scope of protection claimed by the invention is limited only by the appended claims.

Claims (26)

1. A monaural sound encoding apparatus, comprising:
a modified discrete cosine transform (MDCT) transform module, for mapping a digital sound signal from the time domain to the MDCT domain to obtain the sound signal in the MDCT domain, and dividing the sound signal in the MDCT domain into a low frequency spectrum and a high frequency spectrum;
a low frequency waveform coding module, for quantization-encoding the low frequency spectrum of the sound signal in the MDCT domain to obtain low frequency waveform coded data;
an MDCT-to-modified-discrete-Fourier-transform (MDFT) conversion module, for converting the low frequency spectrum and high frequency spectrum of the sound signal in the MDCT domain into the low frequency spectrum and high frequency spectrum of the sound signal in the MDFT domain;
a low-to-high frequency spectrum mapping module, for mapping specific frequency bands of the low frequency spectrum of the sound signal in the MDFT domain to specific frequency bands of the high frequency spectrum, obtaining the mapped high frequency spectrum;
an MDFT-domain border preprocessing module, for performing border preprocessing on the mapped high frequency spectrum in the MDFT domain and on the pre-mapping high frequency spectrum, wherein the pre-mapping high frequency spectrum is the MDFT-domain high frequency spectrum obtained by the conversion of the MDCT-to-MDFT conversion module;
a high-frequency parameter coding module, for calculating, from the border-preprocessed pre-mapping high frequency spectrum and mapped high frequency spectrum, the high-frequency parameters used on the decoding side to recover the high frequency spectrum from the low frequency spectrum, and quantization-encoding the high-frequency parameters to obtain high-frequency parameter coded data; and
a bit stream multiplexing module, for multiplexing the low frequency waveform coded data and the high-frequency parameter coded data to output a sound coding bit stream.
2. The apparatus according to claim 1, further comprising:
a signal type analysis module, for performing, before the mapping by the MDCT transform module, signal type analysis on the digital sound signal to determine whether the digital sound signal is a fast-changing signal or a slowly-changing signal, and outputting the signal type analysis result to the MDCT transform module, the high-frequency parameter coding module and the bit stream multiplexing module, wherein
the MDCT transform module is further for adopting MDCT transforms of different lengths according to the signal type analysis result, the high-frequency parameter coding module is further for extracting the high-frequency parameters according to the signal type analysis result, and the bit stream multiplexing module is further for multiplexing the signal type analysis result together with the low frequency waveform coded data and the high-frequency parameter coded data.
3. The apparatus according to claim 1, wherein the low frequency waveform coding module further comprises a redundancy elimination processing module, for performing redundancy elimination processing on the low frequency spectrum of the sound signal in the MDCT domain before it is quantization-encoded.
4. The apparatus according to claim 1, wherein the high-frequency parameter coding module further comprises:
a tonality parameter extractor, for extracting, from the border-preprocessed pre-mapping high frequency spectrum and mapped high frequency spectrum, the tonality parameters needed on the decoding side to adjust the tonality of the high frequency spectrum; and
a gain parameter extractor, for extracting, from the border-preprocessed pre-mapping high frequency spectrum and mapped high frequency spectrum, the gain parameters needed on the decoding side to adjust the gain of the high frequency spectrum,
wherein the tonality parameters and the gain parameters are the high-frequency parameters used on the decoding side to recover the high frequency spectrum from the low frequency spectrum.
5. The apparatus according to claim 1, further comprising:
a resampling module, for converting the digital sound signal from its original sampling rate to a target sampling rate before the mapping by the MDCT transform module.
6. A monaural sound encoding method, comprising:
mapping a digital sound signal from the time domain to the modified discrete cosine transform (MDCT) domain to obtain the sound signal in the MDCT domain, and dividing the sound signal in the MDCT domain into a low frequency spectrum and a high frequency spectrum;
quantization-encoding the low frequency spectrum of the sound signal in the MDCT domain to obtain low frequency waveform coded data; converting the low frequency spectrum and high frequency spectrum of the sound signal in the MDCT domain into the low frequency spectrum and high frequency spectrum of the sound signal in the modified discrete Fourier transform (MDFT) domain; mapping specific frequency bands of the low frequency spectrum of the sound signal in the MDFT domain to specific frequency bands of the high frequency spectrum, obtaining the mapped high frequency spectrum; performing border preprocessing on the mapped high frequency spectrum in the MDFT domain and on the pre-mapping high frequency spectrum, wherein the pre-mapping high frequency spectrum is the MDFT-domain high frequency spectrum obtained by the MDCT-to-MDFT conversion; calculating, from the border-preprocessed pre-mapping high frequency spectrum and mapped high frequency spectrum, the high-frequency parameters used on the decoding side to recover the high frequency spectrum from the low frequency spectrum, and quantization-encoding the high-frequency parameters to obtain high-frequency parameter coded data; and
multiplexing the low frequency waveform coded data and the high-frequency parameter coded data to output a sound coding bit stream.
7. The method according to claim 6, further comprising:
before the digital sound signal is mapped to the MDCT domain, performing signal type analysis on the digital sound signal to determine whether it is a fast-changing signal or a slowly-changing signal, and outputting the signal type analysis result;
adopting MDCT transforms of different lengths according to the signal type analysis result;
extracting the high-frequency parameters according to the signal type analysis result; and
multiplexing the signal type analysis result together with the low frequency waveform coded data and the high-frequency parameter coded data.
8. The method according to claim 6, further comprising:
performing redundancy elimination processing on the low frequency spectrum of the sound signal in the MDCT domain before it is quantization-encoded.
9. The method according to claim 6, wherein the step of quantization-encoding the high-frequency parameters further comprises:
extracting, from the border-preprocessed pre-mapping high frequency spectrum and mapped high frequency spectrum, the tonality parameters needed on the decoding side to adjust the tonality of the high frequency spectrum; and
extracting, from the border-preprocessed pre-mapping high frequency spectrum and mapped high frequency spectrum, the gain parameters needed on the decoding side to adjust the gain of the high frequency spectrum,
wherein the tonality parameters and the gain parameters are the high-frequency parameters used on the decoding side to recover the high frequency spectrum from the low frequency spectrum.
10. The method according to claim 6, further comprising:
converting the digital sound signal from its original sampling rate to a target sampling rate before it is mapped from the time domain to the MDCT domain.
11. The method according to claim 6, wherein the MDCT-to-MDFT conversion comprises one of the following:
performing a modified discrete sine transform (MDST) on the time-domain sound signal to obtain MDST-domain coefficients, and combining the MDST-domain coefficients with the MDCT-domain coefficients to obtain MDFT-domain coefficients, wherein the MDCT-domain coefficients are obtained by mapping the sound signal to the MDCT domain;
reconstructing the time-domain signal from the MDCT-domain coefficients, performing an MDST transform on the reconstructed time-domain signal to obtain MDST-domain coefficients, and combining the MDCT-domain coefficients with the MDST-domain coefficients to obtain MDFT-domain coefficients;
reconstructing the time-domain signal from the MDCT-domain coefficients, and performing an MDFT transform on the reconstructed time-domain signal to obtain MDFT-domain coefficients; and
establishing the relation between the MDCT-domain coefficients of the current frame and of its preceding and following frames and the MDST-domain coefficients of the current frame, determining the three transition matrices by which the MDST-domain coefficients of the current frame are calculated from the MDCT-domain coefficients of those three frames, obtaining the MDST-domain coefficients from the MDCT-domain coefficients of the three frames and their corresponding transition matrices, and then combining the MDCT-domain coefficients with the MDST-domain coefficients to obtain MDFT coefficients.
12. The method according to claim 6, wherein the border preprocessing comprises one or more of the following:
constructing, from the high frequency spectral coefficients obtained when converting the high frequency spectrum in the MDCT domain into the high frequency spectrum in the MDFT domain and the high frequency spectral coefficients obtained when mapping specific frequency bands of the low frequency spectrum of the sound signal in the MDFT domain to specific frequency bands of the high frequency spectrum, multiple high frequency spectral bands to be windowed, applying frequency-window processing to each band to be windowed, and using the windowed high frequency spectral coefficients for the high-frequency parameter calculation;
performing harmonic detection on the low frequency spectrum in the MDFT domain and the mapped high frequency spectrum, performing harmonic-interference decision at the splicing points of the mapped high frequency spectrum based on the harmonic detection result, and eliminating harmonic interference according to the result of the harmonic-interference decision; and
performing MDFT-domain combination processing on certain frequency ranges centered on the start frequency and the cutoff frequency of the specific frequency bands of the mapped high frequency spectrum, and using the results for the high-frequency parameter calculation.
13. A monaural sound decoding apparatus, comprising:
a bit stream demultiplexing module, for demultiplexing a sound coding bit stream to obtain low frequency waveform coded data and high-frequency parameter coded data;
a low frequency waveform decoding module, for decoding the low frequency waveform coded data to obtain the decoded low frequency spectrum data of the sound signal in the modified discrete cosine transform (MDCT) domain;
an MDCT-to-modified-discrete-Fourier-transform (MDFT) conversion module, for converting the decoded low frequency spectrum data of the sound signal from the MDCT domain to the MDFT domain;
a low-to-high frequency spectrum mapping module, for mapping part of the spectral data in the decoded low frequency spectrum data in the MDFT domain to the high-frequency part, obtaining the mapped high frequency spectrum;
an MDFT-domain border preprocessing module, for performing border preprocessing on the mapped high frequency spectrum;
a high-frequency parameter decoding module, for performing parameter decoding on the border-preprocessed mapped high frequency spectrum according to the high-frequency parameter coded data, obtaining decoded high frequency spectrum data;
an MDFT-domain border post-processing module, for performing border post-processing on the decoded high frequency spectrum data; and
an inverse modified discrete Fourier transform (IMDFT) transform module, for combining the decoded low frequency spectrum data with the border-post-processed decoded high frequency spectrum data and performing the IMDFT transform, obtaining decoded sound data in the time domain.
14. The apparatus according to claim 13, wherein the low frequency waveform decoding module further comprises:
an inverse quantization module, for performing inverse-quantization decoding on the low frequency waveform coded data to obtain inverse-quantized low frequency spectrum data; and
a redundancy inverse processing module, for performing the inverse of the redundancy elimination processing on the inverse-quantized low frequency spectrum data to obtain the decoded low frequency spectrum data.
15. The device according to claim 13, further comprising:
a resampling module for converting the sampling frequency of the time-domain sound decoded data to a sampling frequency suitable for sound playback.
16. The device according to claim 13, wherein said high-frequency parameter decoding module further comprises:
a tonality adjustment module for performing tonality adjustment on the boundary pre-processed spectrum-mapped high-frequency spectrum; and
a gain adjustment module for performing gain adjustment on the tonality-adjusted high-frequency spectrum to obtain the high-frequency spectrum decoded data.
17. A monaural sound decoding method, comprising:
demultiplexing a sound coding bitstream to obtain low-frequency waveform coded data and high-frequency parameter coded data;
decoding said low-frequency waveform coded data to obtain low-frequency spectrum decoded data of the sound signal in the modified discrete cosine transform (MDCT) domain;
converting said low-frequency spectrum decoded data from the MDCT domain to the modified discrete Fourier transform (MDFT) domain to obtain low-frequency spectrum decoded data in the MDFT domain;
mapping part of the spectral data from said low-frequency spectrum decoded data in the MDFT domain to the high-frequency part, obtaining a spectrum-mapped high-frequency spectrum;
performing boundary pre-processing on said spectrum-mapped high-frequency spectrum;
performing parameter decoding on the boundary pre-processed spectrum-mapped high-frequency spectrum according to said high-frequency parameter coded data to obtain decoded high-frequency spectrum data in the MDFT domain;
performing boundary post-processing on said high-frequency spectrum decoded data; and
combining said low-frequency spectrum decoded data with the boundary post-processed high-frequency spectrum decoded data and performing an inverse modified discrete Fourier transform (IMDFT) to obtain a decoded time-domain digital sound signal.
18. The method according to claim 17, wherein the step of decoding said low-frequency waveform coded data further comprises:
performing inverse-quantization decoding on the low-frequency waveform coded data to obtain low-frequency spectrum decoded data; and
performing inverse redundancy-elimination processing on said low-frequency spectrum decoded data.
19. The method according to claim 17, further comprising:
converting the sampling frequency of said time-domain digital sound signal to a sampling frequency suitable for sound playback.
20. The method according to claim 17, wherein the MDCT-to-MDFT conversion comprises one of the following steps:
reconstructing the time-domain sound signal from said MDCT-domain coefficients, performing a modified discrete sine transform (MDST) on the reconstructed time-domain signal to obtain MDST-domain coefficients, and combining said MDCT-domain coefficients with the MDST-domain coefficients to obtain MDFT-domain coefficients, wherein said MDCT-domain coefficients are obtained by decoding said low-frequency waveform coded data;
reconstructing said time-domain signal from said MDCT-domain coefficients, and performing an MDFT on the reconstructed time-domain signal to obtain MDFT-domain coefficients; and
establishing the relation between the MDCT-domain coefficients of the current frame and of its preceding and following frames on the one hand and the MDST-domain coefficients of the current frame on the other, determining three transition matrices by which the current-frame MDST-domain coefficients are computed from these three frames of MDCT-domain coefficients, obtaining the MDST-domain coefficients from the three frames of MDCT-domain coefficients and their corresponding transition matrices, and then combining the MDCT-domain coefficients with the MDST-domain coefficients to obtain the MDFT coefficients.
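All three alternatives in claim 20 ultimately pair MDCT coefficients (the real part) with MDST coefficients (the imaginary part) to form complex MDFT coefficients. A toy numerical sketch of that pairing under one assumed sign and phase convention (conventions vary between implementations, and this is not the patent's exact transform definition):

```python
import numpy as np

def mdft(x, N):
    # Complex lapped transform over a 2N-sample frame whose real and
    # (negated) imaginary parts play the roles of MDCT and MDST.
    # The phase offset n0 follows the common MDCT convention; the
    # patent may use a different one.
    n = np.arange(2 * N)
    k = np.arange(N)[:, None]
    n0 = N / 2 + 0.5
    kernel = np.exp(-2j * np.pi / (2 * N) * (n + n0) * (k + 0.5))
    return kernel @ x

N = 8
x = np.random.default_rng(0).standard_normal(2 * N)
X = mdft(x, N)
X_mdct = X.real           # MDCT-domain coefficients
X_mdst = -X.imag          # MDST-domain coefficients
X_rebuilt = X_mdct - 1j * X_mdst   # recombine into MDFT coefficients
assert np.allclose(X_rebuilt, X)
```

The practical point of the third alternative in the claim is that the MDST part can be approximated directly from three frames of MDCT coefficients via fixed transition matrices, avoiding a full time-domain reconstruction.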
21. The method according to claim 17, wherein said boundary pre-processing comprises one or more of the following steps:
constructing, from the high-frequency spectrum coefficients obtained in the spectrum mapping, a plurality of high-frequency spectral bands to be windowed, applying a frequency window to each high-frequency spectral band to be windowed, and using the windowed high-frequency spectrum coefficients for said parameter decoding;
performing harmonic detection based on said low-frequency spectrum decoded data and the spectrum-mapped high-frequency spectrum, performing harmonic-interference judgment at the splicing region of the spectrum-mapped high-frequency spectrum based on the harmonic detection result, and eliminating harmonic interference according to the result of the harmonic-interference judgment; and
performing MDFT-domain joining processing, respectively, on a certain frequency range centered on the start frequency and on the cutoff frequency of the specific frequency band of the spectrum-mapped high-frequency spectrum, and using the result for said parameter decoding.
22. The method according to claim 17, wherein said boundary post-processing further comprises:
applying a frequency window to each high-frequency spectral band obtained in said parameter decoding, and performing splicing-and-overlap-add processing on all the windowed high-frequency spectral bands to obtain the boundary post-processed high-frequency spectrum decoded data for said spectrum mapping.
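The splicing-and-overlap-add step in claim 22 cross-fades adjacent high-frequency bands so their junctions do not introduce spectral discontinuities. A minimal sketch with complementary linear ramps standing in for the frequency window, which the claim leaves unspecified (band sizes, overlap width, and the helper name `splice_bands` are illustrative assumptions):

```python
import numpy as np

def splice_bands(bands, overlap):
    # Join spectral bands by cross-fading `overlap` bins at each seam:
    # the outgoing band fades out while the incoming band fades in.
    fade_in = np.linspace(0.0, 1.0, overlap)
    fade_out = 1.0 - fade_in
    out = bands[0].astype(float)
    for b in bands[1:]:
        head, tail = out[:-overlap], out[-overlap:]
        joined = tail * fade_out + b[:overlap] * fade_in
        out = np.concatenate([head, joined, b[overlap:]])
    return out

# Two flat bands splice into a flat spectrum because the two ramps sum to 1.
a = np.ones(16)
b = np.ones(16)
spliced = splice_bands([a, b], overlap=4)
assert np.allclose(spliced, 1.0)
assert spliced.size == 16 + 16 - 4
```

Because the ramps are complementary, a signal that is continuous across the seam passes through unchanged; only genuine band-to-band mismatches are smoothed.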
23. A stereo encoding device, comprising:
a modified discrete cosine transform (MDCT) module for mapping the digital sound signals from the time domain to the MDCT domain, respectively, to obtain the digital sound signals of the left and right channels in the MDCT domain, and dividing the sound signals of the left and right channels in the MDCT domain into a low-frequency spectrum and a high-frequency spectrum;
a low-frequency stereo encoding module for performing stereo encoding on the low-frequency spectra of the left and right channels in the MDCT domain to obtain low-frequency stereo coded data;
an MDCT-to-modified-discrete-Fourier-transform (MDFT) conversion module for converting the low-frequency and high-frequency spectra of the left and right channels from the MDCT domain into low-frequency and high-frequency spectra in the MDFT domain;
a low-frequency-to-high-frequency spectrum mapping module for mapping a specific frequency band of the low-frequency spectrum of the sound signals of the left and right channels in the MDFT domain to a specific frequency band of the high-frequency spectrum, obtaining the spectrum-mapped high-frequency spectra of the left and right channels;
an MDFT-domain boundary pre-processing module for performing boundary pre-processing on the spectrum-mapped high-frequency spectra of the left and right channels in the MDFT domain and on the high-frequency spectra before the spectrum mapping, wherein the high-frequency spectra before the spectrum mapping are the high-frequency spectra in the MDFT domain after conversion by said MDCT-to-MDFT conversion module;
a high-frequency parameter encoding module for calculating, from the boundary pre-processed high-frequency spectra of the left and right channels before and after the spectrum mapping, respectively, the high-frequency parameters used at the decoding end to recover the high-frequency spectra from the low-frequency spectra of said left and right channels, and performing quantization encoding on said high-frequency parameters to obtain the high-frequency parameter coded data of the left and right channels; and
a bitstream multiplexing module for multiplexing said low-frequency stereo coded data and the high-frequency parameter coded data of said left and right channels to output a sound coding bitstream.
24. A stereo encoding method, comprising:
mapping the digital sound signals from the time domain to the modified discrete cosine transform (MDCT) domain, respectively, to obtain the digital sound signals of the left and right channels in the MDCT domain, and dividing the sound signals of the left and right channels in the MDCT domain into a low-frequency spectrum and a high-frequency spectrum;
performing stereo encoding on the low-frequency spectra of the left and right channels in the MDCT domain to obtain low-frequency stereo coded data;
converting the low-frequency and high-frequency spectra of the left and right channels from the MDCT domain into low-frequency and high-frequency spectra in the modified discrete Fourier transform (MDFT) domain;
mapping a specific frequency band of the low-frequency spectrum of the sound signals of the left and right channels in the MDFT domain to a specific frequency band of the high-frequency spectrum, obtaining the spectrum-mapped high-frequency spectra of the left and right channels;
performing boundary pre-processing on the spectrum-mapped high-frequency spectra of the left and right channels in the MDFT domain and on the high-frequency spectra before the spectrum mapping, wherein the high-frequency spectra before the spectrum mapping are the high-frequency spectra in the MDFT domain after the MDCT-to-MDFT conversion;
calculating, from the boundary pre-processed high-frequency spectra of the left and right channels before and after the spectrum mapping, respectively, the high-frequency parameters used at the decoding end to recover the high-frequency spectra from the low-frequency spectra of said left and right channels, and performing quantization encoding on said high-frequency parameters to obtain the high-frequency parameter coded data of the left and right channels; and
multiplexing said low-frequency stereo coded data and the high-frequency parameter coded data of said left and right channels to output a sound coding bitstream.
25. A stereo decoding device, comprising:
a bitstream demultiplexing module for demultiplexing a sound coding bitstream to obtain low-frequency stereo coded data and the high-frequency parameter coded data of the left and right channels;
a low-frequency stereo decoding module for performing stereo decoding on said low-frequency stereo coded data to obtain the low-frequency spectrum decoded data of the sound signals of the left and right channels in the modified discrete cosine transform (MDCT) domain;
an MDCT-to-modified-discrete-Fourier-transform (MDFT) conversion module for converting the low-frequency spectrum decoded data of the sound signals of the left and right channels from the MDCT domain to the MDFT domain, obtaining the low-frequency spectrum decoded data of the left and right channels in the MDFT domain;
a low-frequency-to-high-frequency spectrum mapping module for mapping part of the spectral data from the low-frequency spectrum decoded data of the left and right channels in the MDFT domain to the high-frequency part, obtaining the spectrum-mapped high-frequency spectra of the left and right channels;
an MDFT-domain boundary pre-processing module for performing boundary pre-processing on the spectrum-mapped high-frequency spectra of said left and right channels;
a high-frequency parameter decoding module for performing parameter decoding on the boundary pre-processed spectrum-mapped high-frequency spectra according to the high-frequency parameter coded data of said left and right channels to obtain the high-frequency spectrum decoded data of the left and right channels;
an MDFT-domain boundary post-processing module for performing boundary post-processing on the high-frequency spectrum decoded data of said left and right channels; and
an inverse modified discrete Fourier transform (IMDFT) module for combining the low-frequency spectrum decoded data of said left and right channels in the MDFT domain with the boundary post-processed high-frequency spectrum decoded data of the left and right channels in the MDFT domain and performing an IMDFT to obtain time-domain stereo decoded data.
26. A stereo decoding method, comprising:
demultiplexing a sound coding bitstream to obtain low-frequency stereo coded data and the high-frequency parameter coded data of the left and right channels;
performing stereo decoding on said low-frequency stereo coded data to obtain the low-frequency spectrum decoded data of the sound signals of the left and right channels in the modified discrete cosine transform (MDCT) domain;
converting the low-frequency spectrum decoded data of the sound signals of the left and right channels from the MDCT domain to the modified discrete Fourier transform (MDFT) domain, obtaining the low-frequency spectrum decoded data of the left and right channels in the MDFT domain;
mapping part of the spectral data from the low-frequency spectrum decoded data of the left and right channels in the MDFT domain to the high-frequency part, obtaining the spectrum-mapped high-frequency spectra of the left and right channels;
performing boundary pre-processing on the spectrum-mapped high-frequency spectra of the left and right channels;
performing parameter decoding on the boundary pre-processed spectrum-mapped high-frequency spectra according to the high-frequency parameter coded data of the left and right channels to obtain the high-frequency spectrum decoded data of the left and right channels;
performing boundary post-processing on the high-frequency spectrum decoded data of the left and right channels; and
combining the low-frequency spectrum decoded data of the left and right channels in the MDFT domain with the boundary post-processed high-frequency spectrum decoded data of the left and right channels in the MDFT domain and performing an IMDFT to obtain time-domain stereo decoded data.
CN201210085257.2A 2012-03-28 2012-03-28 A kind of sound codec devices and methods therefor Active CN103366751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210085257.2A CN103366751B (en) 2012-03-28 2012-03-28 A kind of sound codec devices and methods therefor


Publications (2)

Publication Number Publication Date
CN103366751A CN103366751A (en) 2013-10-23
CN103366751B true CN103366751B (en) 2015-10-14

Family

ID=49367951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210085257.2A Active CN103366751B (en) 2012-03-28 2012-03-28 A kind of sound codec devices and methods therefor

Country Status (1)

Country Link
CN (1) CN103366751B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105336333B (en) * 2014-08-12 2019-07-05 北京天籁传音数字技术有限公司 Multi-channel sound signal coding method, coding/decoding method and device
WO2017050398A1 (en) * 2015-09-25 2017-03-30 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder, decoder and methods for signal-adaptive switching of the overlap ratio in audio transform coding
CA2987808C (en) * 2016-01-22 2020-03-10 Guillaume Fuchs Apparatus and method for encoding or decoding an audio multi-channel signal using spectral-domain resampling
ES2771200T3 (en) * 2016-02-17 2020-07-06 Fraunhofer Ges Forschung Postprocessor, preprocessor, audio encoder, audio decoder and related methods to improve transient processing
CN108010538B (en) * 2017-12-22 2021-08-24 北京奇虎科技有限公司 Audio data processing method and device and computing equipment
CN113948094A (en) * 2020-07-16 2022-01-18 华为技术有限公司 Audio encoding and decoding method and related device and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995028699A1 (en) * 1994-04-19 1995-10-26 Universite De Sherbrooke Differential-transform-coded excitation for speech and audio coding
EP0878790A1 (en) * 1997-05-15 1998-11-18 Hewlett-Packard Company Voice coding system and method
CN1527995A (en) * 2001-11-14 2004-09-08 ���µ�����ҵ��ʽ���� Encoding device and decoding device
WO2006049205A1 (en) * 2004-11-05 2006-05-11 Matsushita Electric Industrial Co., Ltd. Scalable decoding apparatus and scalable encoding apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE0004163D0 (en) * 2000-11-14 2000-11-14 Coding Technologies Sweden Ab Enhancing perceptual performance or high frequency reconstruction coding methods by adaptive filtering


Also Published As

Publication number Publication date
CN103366751A (en) 2013-10-23

Similar Documents

Publication Publication Date Title
CN103366749B (en) A kind of sound codec devices and methods therefor
CN101276587B (en) Audio encoding apparatus and method thereof, audio decoding device and method thereof
KR101589942B1 (en) Cross product enhanced harmonic transposition
CN103366750B (en) A kind of sound codec devices and methods therefor
JP4950210B2 (en) Audio compression
KR101435413B1 (en) Method and apparatus for decoding high frequency signal
JP2020170186A (en) Processing of audio signals during high frequency reconstruction
CN101086845B (en) Sound coding device and method and sound decoding device and method
RU2577195C2 (en) Audio encoder, audio decoder and related methods of processing multichannel audio signals using complex prediction
TWI545560B (en) Apparatus and method for encoding or decoding an audio signal with intelligent gap filling in the spectral domain
CN101067931B (en) Efficient configurable frequency domain parameter stereo-sound and multi-sound channel coding and decoding method and system
US8032362B2 (en) Audio signal encoding/decoding method and apparatus
CN103366751B (en) A kind of sound codec devices and methods therefor
EP1852851A1 (en) An enhanced audio encoding/decoding device and method
CN103370742B (en) Speech decoder, speech encoder, speech decoding method, speech encoding method
CN104103276A (en) Sound coding device, sound decoding device, sound coding method and sound decoding method
RU2409874C2 (en) Audio signal compression
WO2008114080A1 (en) Audio decoding
RU2778834C1 (en) Harmonic transformation improved by the cross product
CN104078048A (en) Acoustic decoding device and method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant